Facebook Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: Phone Interview
m log n, where m is number of unique ages, what if all ages are unique, then u will end up with n logn, which is violates the question constraint, and that you can easily have index out of bounds with count[input[begin]], not sure why this answer is given up votes
@notanexpoert because good luck finding an array of unique ages larger than 122. Which the poster clearly stated. This one and the top answer are asymptotically equivalent. Not sure what the point of your comment is.
time complexity: O(m log k)
m = number of unique data
k = number of times each data appears in any set
n = array size
if all data are unique then it would be O(m log 1) == O(m) == O(n)
This code is good but it doesn't check for empty arrays. The first line of "public static int[] count(int[] input)" should be a check to see if the length of input is 0... and if it is - it should simply return a copy of itself.
@CT I coded up your approach in Java (IntelliJ) and hit a stack overflow. Suggestions/comments?
What was the size of the Input? It is pretty trivial to rewrite recursion using explicit stack if the Input sizwe is too big for recursion.
Something with a binary search is almost certainly the expected answer. I think the question is misleading because the Big O of this will always be O(n), but there are solutions (binary search) which could get better results dependent on the input
This is indeed nice code, but the usage of the word "count" makes it ambiguous. It is confusing for readers of this code to disambiguate between count() (the method) and count[] (the array).
I would probably go with count() and counts[], which is probably still to ambiguous, but I digress.
This problem can be solved in less than O(n) using a modified binary search.
For each element in the array ages[] (starting from 0) we record the first index i where this age is present, then we search using binary search the last index where this age is present.
The number of people with the same age will be given by lastindex-firstindex for this age.
Then we retake the search from lastindex+1 for the next age.
1+(lastindex-startindex) to get the number of people with the same age for each key in the map.
public static Map<Integer,Integer> countAges(int[] ages) {
if(ages==null || ages.length==0) {
return new HashMap<Integer,Integer>();
}
int i = 0;
int end = 0;
Map<Integer,Integer> count = new HashMap<Integer,Integer>();
int from = 0;
int to = 0;
while(i<ages.length) {
from = i;
end=binSearchEnd(ages,i,ages.length);
to = end;
count.put(ages[i], 1+to-from);
i=end+1;
}
return count;
}
and this is the method to search for the last index of an age in the array of ages using binary search:
public static int binSearchEnd(int[] ages, int start, int end) {
if(start+1>ages.length-1 || ages[start]!=ages[start+1]) return start;
if(ages[start]==ages[ages.length-1]) return ages.length-1;
int i = start+1;
int k = ages[start];
while(start<i && i+1<ages.length) {
//System.out.println("i: "+i);
if(ages[i]==k && ages[i+1]!=k) return i;
else if(ages[i]>k) {
end=i;
i=(start+i)/2;
}
else { //ages[i]==k && ages[i+1]==k
start=i;
i=(i+end)/2;
}
if(i>=ages.length-1) return i;
}
return i;
}
And Here is the complete code to test the algorithm, you can use
int[] genAgesArray(int n)
to generate a sorted array of ages and:
void printArray(int[] a)
to print the array
here is the full code for testing:
import java.util.*;
public class CountAgesSortedArray {
public static Map<Integer,Integer> countAges(int[] ages) {
if(ages==null || ages.length==0) {
return new HashMap<Integer,Integer>();
}
int i = 0;
int end = 0;
Map<Integer,Integer> count = new HashMap<Integer,Integer>();
int from = 0;
int to = 0;
while(i<ages.length) {
from = i;
end=binSearchEnd(ages,i,ages.length);
to = end;
count.put(ages[i], 1+to-from);
i=end+1;
}
return count;
}
public static int binSearchEnd(int[] ages, int start, int end) {
if(start+1>ages.length-1 || ages[start]!=ages[start+1]) return start;
if(ages[start]==ages[ages.length-1]) return ages.length-1;
int i = start+1;
int k = ages[start];
while(start<i && i+1<ages.length) {
//System.out.println("i: "+i);
if(ages[i]==k && ages[i+1]!=k) return i;
else if(ages[i]>k) {
end=i;
i=(start+i)/2;
}
else { //ages[i]==k && ages[i+1]==k
start=i;
i=(i+end)/2;
}
if(i>=ages.length-1) return i;
}
return i;
}
public static int[] genAgesArray(int n) {
if(n<1) {
System.out.println("ERROR Generating Ages: N must be > 0");
return null;
}
Random r = new Random();
int[] ages = new int[n];
ages[0] = r.nextInt(2);
for(int i=1;i<n;i++) {
ages[i]=ages[i-1]+r.nextInt(r.nextInt(5)+1);
}
return ages;
}
public static void printArray(int[] a) {
if(a==null || a.length==0) {
System.out.println("ERROR Printing Array: Empty Array");
return;
}
for(int i=0;i<a.length-1;i++) {
System.out.print(a[i]+",");
}
System.out.println(a[a.length-1]);
}
public static void main(String[] args) {
int[] ages = genAgesArray(10);
//int[] ages = {0,1,1,1,1,1,1,1,1,1};
printArray(ages);
System.out.println(countAges(ages));
}
}
This is O(nlogn) in worst case, no? Think about the worst case scenario where each age is present only once, eg [8,9,11,12,15,16,18,20]
it is O(n) in your example ([8,9,11,12,15,16,18,20]), check the first control if in the binSearchEnd method, it immediately returns the lastindex if the next element is different (so lastindex will be equal to firstindex for that age) -> if there is just one element for that age, it won't perform binary search. O(n) is the best you can do in the example you provide, because we need to retrieve all the ages present in the input array, so if there is just one person for each age, we will have to retrieve n elements
First, let me preface this with saying that your approach is the best I can think of.
What about for the example [8, 8, 9, 9, 11, 11, 12, 12]?
At first it would appear that this, too, is O(n log n). But maybe this should be thought of in a different way. What about if we let 'k' represent the number of different ages in the array. Then the problem becomes O(k log n). And, the only reason that the discrepancy appears is when k approaches n. When k is larger, then the performance approaches O(n log n).
Sure, that is a good analysis, always good to think about extreme cases. In a real problem we can think of k much smaller than n (millions of people -> a lot of repetitions in ages). The largest is the array and the more repetitions we will have, the more the performance will improve with respect to n;
Another improvement we could make is:
before starting the binary search for each element, compare the element at startindex to the last element of the array, if they are equal we have already finished, just return array.length-startindex to get the number of people with that age. This will speed up the algorithm when we have cases like:
[13,13,13,13,13,13,13,13,13,13,13,13,13]
or
[1,4,6,15,37,37,37,37,37,37,37,37,37,37]
with lot of repetitions in the last part of the array, or array with all the same values.
Well it's not really an extreme case. This solution is faster than O(n) only when k*log(n) < n -> k < n/log(n), so it's not faster than O(n) in general.
@gigi84 - Solution proposed by you is definitely much faster than O(n). Initially it looks like nlogn but actually it is klogn where k is no. of different ages possible.
I think it can be safely assumed that ages are in the range 1 - 130 so K can be taken as 130. Now if there are a million ages (i.e. 2^20) in the sorted collection, this algorithm gives 130*log2^20 (i.e. O(nlogn) which is way better than 2^20 (i.e. O(n)).
#include <map>
const int SORTED_AGES[] = {8,8,8,9,9,11,15,16,16,16};
int getlastindex(int start, int arrsize);
int main()
{
std::map<int, int> ages;
int arrsize = sizeof(SORTED_AGES) / sizeof(SORTED_AGES[0]);
int first = 0, last = 0;
// Note: We assume the array has at least one element
do
{
last = getlastindex(first, arrsize);
ages[SORTED_AGES[first]] = last - first + 1;
first = last + 1;
} while (first < arrsize);
for (std::map<int, int>::const_iterator it = ages.begin(); it != ages.end(); ++it)
{
printf("%d: %d\n", it->first, it->second);
}
}
int getlastindex(int start, int arrsize)
{
int val = SORTED_AGES[start];
int end = arrsize;
int mid;
while (end - start > 1)
{
mid = start + (end - start) / 2;
if (SORTED_AGES[mid] <= val)
{
start = mid;
}
else
{
end = mid;
}
}
return start;
}
Update: This solution is not optimal for multiple reasons.
1) I used a map, which is typically implemented as a binary tree, so every age insertion is O(log n). I should have used an unordered_map, as the question did not say the output must be sorted, only that the input was.
2) I'm doing a binary search for every age barrier. In worst case, where none of the ages are repeated, that means there will be a binary search for every age. The overall algorithm is O(n log n). (Considering that the worst case would only lead to a little over 100 binary searches, that's not the end of the world, but still, CT's answer above is better.)
Here is CT's answer, replicated in C++ with an unordered_map:
#include <iostream>
#include <unordered_map>
using namespace std;
void countAges(const int ages[], int first, int last, unordered_map<int, int> &ageCounts)
{
if (ages[first] == ages[last])
{
if (ageCounts.find(ages[first]) == ageCounts.end())
{
ageCounts[ages[first]] = 0;
}
ageCounts[ages[first]] += last - first + 1;
}
else
{
countAges(ages, first, first + (last - first) / 2, ageCounts);
countAges(ages, first + (last - first) / 2 + 1, last, ageCounts);
}
}
void printAgeCounts(const int ages[], int size)
{
unordered_map<int, int> ageCounts;
countAges(ages, 0, size - 1, ageCounts);
for (unordered_map<int, int>::const_iterator it = ageCounts.begin(); it != ageCounts.end(); ++it)
{
cout << it->first << ": " << it->second << endl;
}
}
There is a trick to counting sort that you can apply here.
// Assumes array is sorted and that minElem and maxElem, if provided, are sane.
var dupBinSearch = function(array, target, minElem, maxElem) {
minElem = minElem !== undefined ? minElem : 0;
maxElem = maxElem !== undefined ? maxElem : array.length -1;
var guess = Math.floor((maxElem-minElem)/2) + minElem;
if (minElem > maxElem) {
return -1;
}
if (array[guess] === target) {
while(array[guess+1] === target) {
guess += 1;
}
return guess;
}
if (array[guess] > target) {
return dupBinSearch(array, target, minElem, guess-1);
} else {
return dupBinSearch(array, target, guess+1, maxElem);
}
};
//sorted array of ages. The ages assumption is key.
var ageCount = function(array) {
var maxAge = 150;
var index;
var lastIndex = 0;
var result = {};
for (var i = 0; i <= maxAge; i += 1) {
index = dupBinSearch(testArray, i);
if (index !== -1) {
result[i] = index - lastIndex;
lastIndex = index;
}
}
return result;
};
apply reverse binary search for each set
{8,8,8,9,9,11,15,16,16,16}
check indexes 1 2 4 8 16 from each set start value
consider these while checking
i=0, j= 1 (2 4 8 16 ....)
a[j] == a[i] and a[i]==a[j+1] then j = j*2
a[j] == a[i] and a[i]<a[j+1] then a[j-1] is last value of the set and a[j] will be the start value of next set
a[j] > a[i] then move left side
time complexity: O(m log k)
m = number of unique data
k = number of times each data appears in any set
void PrintAges(int* AgeList, int size)
{
if (AgeList == NULL || size < 1) return;
int previousAge = AgeList[0];
int Occurrence = 1;
for (int i = 1; i < Size; i++)
{
if (AgeList[i] == PreviousAge ) Occurrence++;
else
{
Cout << PreviousAge << ": " << Occurrence;
PreviousAge = AgeList[i] ;
Occurrence = 1;
}
}
Cout << PreviousAge << ": " << Occurrence;
}
I assume that the range of ages is from 0 to 120, which is relative low, so I have an array with the number of people for each age ( $ret ). I made a modified binary search, but the important point is that if the first element of the subarray is the same than the last, then I can update $ret in O(1). Otherwise I split the subarray in two halves and continue with the recursion. Overall the complexity is less than O(n), I think it is O(120 * log(n) ) = O(log(n) )
My implementation in Hack:
<?hh
function get($handle) :string{
return rtrim ( fgets ( $handle ) );
}
function syso($s) {
print $s . "\n";
}
function go(int $left, int $right, Vector<int> $vec, Vector<int> $ret):void{
if($left==$right)
return;
if ($vec [$left] == $vec [$right - 1]) {
$ret [$vec [$left]] += $right - $left;
return;
}
$mid = (int) (($left + $right) / 2) ;
go ( $left, $mid, $vec, $ret );
go ( $mid, $right, $vec, $ret);
}
$handle = fopen ( "php://stdin", "r" );
$n = intval ( get ( $handle ) );
$line = explode ( " ", get ( $handle ) );
$vec = new Vector< int > (array_map($s ==> intval($s), $line));
$MAX = 121;
$ret = new Vector<int>();
for($i=0;$i<$MAX;$i++)
$ret[] = 0;
go(0,$n,$vec,$ret);
foreach($ret as $k =>$v )
if($v!= 0)
syso($k . " " . $v);
Well I saw a couple of O(klogn) where k <<<< n becomes O(log(n)) solutions and some O(n)
The way I see it is that there are two options here. either O(n-1) where you go all the way till you find A[n] value and based on the difference on the array length is the count of the last age.
The second one is looking if there are no repetitions what is the most ages we will see.
You can determine that by looking at the first and last and age and assume that everything in between exist and if there is more in the array than expected those should be considered the number of duplicates that exists.
Do a binary search from start till end where:
start : current change A[i] < A[i+1]
end : min(Missing numbers to find + remaining duplicates, A.Length)
So the solution I see is of run time.
T(n) => { O(n-1) when: n < k*log(n)
{ O(k log n)
Also because these are ages smaller compare to n with goes to infinity meaning (k << n) so O(klogn) really becomes O(log(n)).
First Option of O(n-1) which is really O(n)
Dictionary<int, int> FindAges_N_minus_1_(int[] A)
{
Dictionary<int, int> result = new Dictinary<int, int();
int k = A[A.Length -1] - A[0]; // Difference of ages
int n = A.Length;
// Solution #1 for smaller sets with lots of age difference
if((n-1) < k*Math.Log(2, n))
{
int last = A[A.Length-1]
int i;
for(i = 0; A[i] != last; i++)
{
if(result.ContainsKey(i))
{
result[i]++;
}
else
{
result.Add(result[i], 1);
}
}
result.Add(last, A.Length - i);
}
else
{
int lastAge = A[A.Length-1]
int duplicates = A.Length - k;
int start = 0;
int end = 0;
for(int i = A[0]; i < lastAge;)
{
int age = A[start];
end = BinarySearchEnd(
A,
start,
A.Length) + 1;
int count = end - start;
int skipped = A[end] - age;
// Removing skipped numbers remaining numbers from the jump
k -= skipped ;
// Removing found duplicates and consider skipped as duplicates
duplicates = duplicates - count + skipped -1;
result.Add(age, count);
start = end;
}
}
return result;
}
int BinarySearchEnd(
int[] A,
int start,
int end)
{
if((start+1) > (A.Length - 1) || A[start] != A[start + 1])
{
return start;
}
if(A[start] == A[A.Length - 1])
{
return A.Length-1;
}
int i = start + 1;
int k = A[start];
while(start < i && (i + 1) < A.Length)
{
if(A[i] == k && A[i+1] != k)
return i;
else if(A[i] > k) {
end= 1;
i = (start + i) / 2;
}
else {
start = i;
i = (i + end) / 2;
}
if( i >= A.Length-1)
return i;
}
return i;
}
I have used LinkedList to store the result, but we can also use a HashMap. LinkedList would have been better if the worst case was O(log(n)). Implementation wise, HashMap would have been easier.
The worst case will always be O(n), but the average case would be O(log(n)). My assumption is that the native LinkedList.addAll implementation in Java is O(1). If it is not, we can write our own implementation.
import java.util.LinkedList;
class BinarySearchCount {
private int [] sortedArray;
class Count {
Count (int num, int count) {
this.num = num;
this.count = count;
}
int num;
int count;
}
public BinarySearchCount(int [] sortedArray) {
this.sortedArray = sortedArray;
}
public LinkedList<Count> getCount(int start, int end) {
LinkedList<Count> array;
if (this.sortedArray[start] == this.sortedArray[end]) {
array = new LinkedList<Count>();
Count count = new Count(this.sortedArray[start], end + 1 - start);
array.add(count);
}
else {
LinkedList<Count> leftList = this.getCount(start, start + (end-start)/2);
LinkedList<Count> rightList = this.getCount(start + (end-start)/2 + 1, end);
if (leftList.getLast().num == rightList.getFirst().num) {
leftList.getLast().count += rightList.getFirst().count;
rightList.remove();
}
leftList.addAll(rightList);
array = leftList;
}
return array;
}
public static void main (String [] args) {
// Testing the algorithm.
int array[] = {1, 1, 3, 3, 3, 9, 9, 9, 11, 11, 13};
BinarySearchCount bscTest = new BinarySearchCount(array);
LinkedList<Count> result = bscTest.getCount(0, array.length - 1);
for (Count count : result) {
System.out.println(count.num + ": " + count.count);
}
}
}
/* package whatever; // don't place package name! */
import java.util.*;
import java.lang.*;
import java.io.*;
/* Name of the class has to be "Main" only if the class is public. */
class Ideone
{
public static void binarysearch(int a[],int key,int i,int j)
{
int low =i;
int high = j-1;
while(low<=high)
{
int mid = (low+high)/2;
if(a[mid] == key)
{
int times;
if(high == low)
{
times = 1;
low = high+1;
}
else
{
times = a[low]== key? high-low+1:high-low;
low = high;
}
System.out.print(" "+key+" : "+times+" ");
high = a.length;
if(low == (a.length-1))
{
return;
}
binarysearch(a,a[low],low,high);
}
else if( a[mid] > key)
{
high = mid-1;
}
else
{
low = mid +1;
}
}
}
public static void main (String[] args) throws java.lang.Exception
{
// your code goes here
int[] a = {1,1,1,1,1,1,1,1,1};//{8,8,8,9,9,11,15,16,16,16};
binarysearch(a,a[0],0,a.length);
}
}
[code=java]
package frequency;
import java.util.Enumeration;
import java.util.Hashtable;
public class CountIntegers {
Hashtable<Integer,Counter> numberCount = new Hashtable<Integer,Counter>();
class Counter{
int count;
Counter(){
}
int countOccurrence(){
count++;
return count;
}
}
void countNumber(int[] array){
int size = array.length;
for(int i=0;i<size;i++) {
if(numberCount.containsKey(array[i])){
numberCount.get(array[i]).countOccurrence();
}else
numberCount.put(array[i],new Counter());
}
}
void printTable(int sizeOfArray){
Enumeration<Integer> e = numberCount.keys();
while(e.hasMoreElements()){
int value = (Integer)e.nextElement();
System.out.println("Number "+value+" occurred: "
+numberCount.get(value).countOccurrence()+" times");
}
}
public static void main(String[] args) {
int [] numberArray = {8,8,8,9,9,11,15,16,16,16};
CountIntegers count = new CountIntegers();
count.countNumber(numberArray);
count.printTable(numberArray.length);
}
}
[/code]
#include<stdio.h>
#include<stdlib.h>
int main()
{
int array[] = {8,8,8,9,9,11,15,16,16,16};
int len;
int i = 0;
int compare,count =0;
len = sizeof(array)/sizeof(int) ;
while(i < len)
{
compare = array[i];
printf("%d :" , compare);
while(compare == array[i])
{
i++;
count++;
}
printf("%d\n", count);
count = 0;
}
return 0;
}
private static void printOccurrences(int[] arr) {
int val = arr[0];
int i = 0;
int count = 1;
while(i < arr.length) {
if((i+2) >= arr.length) {
System.out.println(val+":"+count);
return;
}
if(val == arr[i+2]) {
count=count+2;
i=i+2;
val = arr[i];
} else if(val == arr[i+1]) {
count=count+1;
i=i+1;
val = arr[i];
} else {
System.out.println(val+":"+count);
i=i+1;
val = arr[i];
count=1;
}
}
}
public class CountOccuranceEachDup {
private static final Exception START = null;
//binary search : O(log n)
public int getCount(int[] arr,int key,boolean searchFirst){
int low=0,high= arr.length-1, result =-1;
while(low<=high){
int mid= (low+high)/2;
if(arr[mid]==key){
result =mid;
if(searchFirst){
high=mid-1;
}else{
low=mid+1;
}
}else if(key>arr[mid]){
low= mid+1;
}else{
high=mid-1;
}
}
return result;
}
public void printOccurance(int[] arr){
int len=0;
// called for each unique occurrence of the element
while(len<arr.length){
int key = arr[len];
int count;
// search first occurrence in : O(log n)
int firstIndex= getCount(arr,key,true);
// search last occurrence in : O(log n)
int lastIndex= getCount(arr,key,false);
count = lastIndex-firstIndex+1;
System.out.println(key +"->"+count);
len= lastIndex+1;
}
}
public static void main(String[] args){
CountOccuranceEachDup c = new CountOccuranceEachDup ();
int[] arr= {8,8,8,9,9,9,9,9,9,10,10,10,10,10,10,10,10};
c.printOccurance(arr);
}
}
i/p: {8,8,8,9,9,9,9,9,9,10,10,10,10,10,10,10,10};
o/p:
8->3
9->6
10->8
i/p: {8,8,8,8}
o/p : 8>4
#include <iostream>
#include <vector>
using namespace std;
void show_count(vector<int> vin) {
int step = 1;
int idx = 0;
if (vin.size() < 1) return;
int curr = vin[0];
int count = 1;
while (idx < vin.size()) {
while (idx + step < vin.size() && vin[idx + step] == curr) {
idx += step;
count += step;
step *= 2;
}
if (step == 1) {
cout << "NUM: " << curr << " COUNT: " << count << endl;
count = 1;
idx += 1;
if (idx < vin.size()) curr = vin[idx];
}
else {
step = 1;
}
}
}
int main() {
vector<int> aim;
for (int i = 0; i < 3; i++) aim.push_back(2);
for (int i = 0; i < 99; i++) aim.push_back(3);
for (int i = 0; i < 1000; i++) aim.push_back(10);
for (int i = 0; i < 1; i++) aim.push_back(20);
for (int i = 0; i < 10; i++) aim.push_back(500);
show_count(aim);
}
#include<iostream>
#include<list>
#include<utility>
using namespace std;
typedef list< pair<int,int> > * LP;
typedef list< pair<int,int> > L;
//merges the counts of the two halves.
//Since the array is sorted, just need to check the last element of list1 and first element of list2, O(1)
LP merge(LP list1, LP list2) {
if(list1->size() == 0) throw "List1 is empty";
if(list2->size() == 0) throw "List2 is empty";
pair<int,int> & lastList1 = list1->back();
pair<int,int> & firstList2 = list2->front();
if(lastList1.first == firstList2.first) {
lastList1.second += firstList2.second;
list2->pop_front();
list1->splice(list1->end(), *list2);
return list1;
} else {
list1->splice(list1->end(), *list2);
return list1;
}
}
LP count(const vector<int>& v, int i, int j) {
//Base case
if(i == j) {
pair<int,int> p;
p.first = v[i];
p.second = 1;
LP l = new L();
l->push_back(p);
return l;
}
//If first and last element are same
else if(v[i] == v[j]) {
pair<int,int> p;
p.first = v[i];
p.second = j - i + 1;
LP l = new L();
l->push_back(p);
return l;
}
//Divide and conquer
else {
int mid = (i + j)/2;
LP l1 = count(v,i,mid);
LP l2 = count(v,mid+1,j);
return merge(l1,l2);
}
}
int main() {
int N = 10;
vector<int> v(N);
v[0] = 8; v[1] = 8; v[2] = 8; v[3] = 9; v[4] = 9;
v[5] = 11; v[6] = 15; v[7] = 16; v[8] = 16; v[9] = 16;
LP l = count(v, 0, N-1);
for(L::const_iterator it = l->begin(); it != l->end(); it++) {
cout<<it->first<<"\t"<<it->second<<endl;
}
return 0;
}
C# implementation for a modified binary search:
public static void PrintAges(int[] ages)
{
printAges(ages, ages[0], 1, 1, ages.Length - 1, ages.Length - 1);
}
public static void printAges(int[] ages, int age, int count, int start, int end, int endIndex)
{
if (start >= end)
{
if (start == end && ages[start] == age)
{
count++;
}
Console.WriteLine(string.Format("Age:{0}, Times:{1}", age, count));
if (start == end && ages[start] != age)
{
printAges(ages, ages[start], 1, start + 1, endIndex, endIndex);
}
else if (end != endIndex)
{
printAges(ages, ages[end + 1], 0, end + 1, endIndex, endIndex);
}
}
else
{
int middleIndex = (start + end) / 2;
if (ages[middleIndex] == age)
{
printAges(ages, age, count + middleIndex - start+1, middleIndex + 1, end, endIndex);
}
else
{
printAges(ages, age, count, start, middleIndex - 1, endIndex);
}
}
}
If we are iterating till the middle of the array, we could have 2 pointers, one at the start of the array that will move forward, and one at the end of the array that will move backward.
It will give N/2 time complexity. But does this count as less than O(N)?
private static void printNumberOfOccurences(){
Map<Integer,Integer> map = new HashMap<Integer,Integer>();
int[] a = {8,8,8,9,9,11,15,16,16,16};
boolean isOdd = a.length%2!=0;
if(isOdd){
map.put(a[middle],1);
}
int middle = a.length/2;
int end = a.length-1;
for(int i=0; i<middle; i++){
check(a[i], map);
check(a[end--], map);
}
for(Map.Entry<Integer,Integer> entry:map.entrySet()){
System.out.println(entry.getKey()+": "+entry.getValue());
}
}
private static void check(int n, Map<Integer, Integer> map){
if(!map.containsKey(n)){
map.put(n, 1);
}else{
int counter = map.get(n);
map.put(n, ++counter);
}
}
Feedback is welcomed. Thank you.
my first cc comment :)
/*
The problem:
Given an array of ages (integers) sorted lowest to highest
output the number of occurrences for each age.
This should be done in less than O(n).
For instance:
[8,8,8,9,9,11,15,16,16,16]
should output something like:
8: 3
9: 2
11: 1
15: 1
16: 3
Analysis:
Obviously, in worst case scenario, the lower bound of the
generalized version of this problem is O(n);
because we have to check every element if no repetions in
the input array. Suppose this input: [1,2,3,4,5,6,7,8]
there is no way you can get all occurences
without checking all elements no matter what algorithm you use.
However, the problem here has a special background that the input array
is people's ages. According to "Pigeonhole principle", and given that
people's age varies within a tight range of 1 to about 120,
(and maybe the interview will tell you that there're millions of records),
there'll be many repetitions. Then the binary search method is
a much more interesting solution. I think the interview should be happy
to see it and hear you to explain why it works.
*/
//O(mlogn) solution
//where m is the # of unique elements
void printOccurencesSorted(int[] A){
int i=0;
while(i<A.length){
int l=i,r=A.length-1;
while(l<=r){
int m=l+(r-l)/2;
if(A[i]<A[m])
r=m-1;
else l=m+1;
}
System.out.println(A[i]+":"+(l-i));
i=l;
}
}
// simple O(n) solution
void printOccurencesSorted(int[] A){
int start=0;
for(int i=1;i<=A.length;i++){
if(i==A.length || A[i]!=A[start]){
System.out.println(A[start]+":"+(i-start));
start=i;
}
}
}
what's wrong with this solution ?
It's short and simple , and run in O(n)
public class Counter {
public static void main (String[] args) {
int[] a = {8,8,8,9,9,11,15,16,16,16};
ageCounter(a);
}
public static void ageCounter (int[] ages)
{
Hashtable<Integer, Integer> agesCounter = new Hashtable<Integer, Integer>();
for (int number : ages)
{
Integer counter = agesCounter.get(number);
if (counter != null)
{
agesCounter.put(number, ++counter);
}else
{
agesCounter.put(number, 1);
}
}
for (Integer number : agesCounter.keySet())
{
System.out.println(number + " : " + agesCounter.get(number));
}
}
Took me a while to get a O(log n) solution to this that I was happy with and could write from scratch with confidence.
1. lower bound binary search for the value
2. if no lower bound found return nil
3. otherwise return upper bound binary search value minus lower bound plus 1
You could a general version that does both upper and lower bound but I think it's much clearer and more understandable to have two explicit different versions that are only slightly different:
def bsearch_lower(a, v)
lo = 0
hi = a.count - 1
best = nil
loop do
break if lo > hi
mid = lo + ((hi - lo) / 2)
if a[mid] == v
best = mid
# lower bound = try for values lower than this
hi = mid - 1
elsif a[mid] > v
hi = mid - 1
else
lo = mid + 1
end
end
best
end
def bsearch_upper(a, v)
lo = 0
hi = a.count - 1
best = nil
loop do
break if lo > hi
mid = lo + ((hi - lo) / 2)
if a[mid] == v
best = mid
# upper bound = try for values larger than this
lo = mid + 1
elsif a[mid] > v
hi = mid - 1
else
lo = mid + 1
end
end
best
end
def bsearch_count(a, v)
lower = bsearch_lower(a, v)
unless lower.nil?
return bsearch_upper(a, v) - lower + 1
end
nil
end
package facebook;
public class GetAgeDistributionV2 {
public GetAgeDistributionV2() {
}
int[] getAgeDistribution(int[] a){
int n = a.length;
int maxAge = a[n-1];
int[] frequencies = new int[maxAge+1];
for(int i = 0; i <= maxAge; i++ ){
frequencies[i] = 0;
}
int pivot;
int pivotStart = 0;
while(pivotStart < n){
pivot = a[pivotStart];
int pivotEnd = getLastIndexAndUpdateFrequency( a, pivot, pivotStart, n-1, frequencies);
pivotStart = pivotEnd+1;
}
return frequencies;
}
/**
* assume a[start] = pivot
*/
int getLastIndexAndUpdateFrequency(int[] a, int pivot, int start, int end, int[] frequencies){
if(end==start){
frequencies[pivot]++;
return end;
} else if ( end == (start+1)){
if(a[end] == pivot){
frequencies[pivot] += 2;
return end;
} else {
frequencies[pivot]++;
return start;
}
} else {
int mid = (int) Math.floor((start + end+1)/2.0);
//System.out.println(start + " -> " + end + "\t" + mid);
if(a[mid] == pivot){
frequencies[pivot] += (mid - start);
return getLastIndexAndUpdateFrequency( a, pivot, mid, end, frequencies);
} else {
// a[mid] > pivot
return getLastIndexAndUpdateFrequency( a, pivot, start, mid, frequencies);
}
}
}
public static void main(String[] args){
int[] testcase = { 8 , 8 , 8 , 9 , 9 , 11 , 15 , 16, 16, 16 };
GetAgeDistributionV2 wrapper = new GetAgeDistributionV2();
int freq[] = wrapper.getAgeDistribution(testcase);
for(int i = 0; i < freq.length; i++) {
if(freq[i] != 0 ) {
System.out.println(i + "\t" + freq[i]);
}
}
}
}
solution using upper binary search
public static void arrayFreqLogn() {
int[] arr = {8,8,8,8,8,8,9,16,16,16,17};
int i=0;
while(i<arr.length) {
int end = findEndIdx(arr, arr[i], i, arr.length);
System.out.println(arr[i] +":" +(end-i+1));
i = end+1;
}
}
public static int findEndIdx(int[] a, int n, int l, int r) {
int idx = -1;
while(l<r) {
int mid = l+(r-l)/2;
if(n < a[mid]) {
r = mid - 1;
} else if(n > a[mid]) {
l = mid + 1;
} else {
idx = mid;
if(idx < a.length-1 && a[idx] == a[idx+1]) {
return findEndIdx(a, n, mid+1, r);
} else {
return idx;
}
}
}
return idx;
}
public static void arrayFreqLogn() {
int[] arr = {8,8,8,8,8,8,9,16,16,16,17};
int i=0;
while(i<arr.length) {
int end = findEndIdx(arr, arr[i], i, arr.length);
System.out.println(arr[i] +":" +(end-i+1));
i = end+1;
}
}
public static int findEndIdx(int[] a, int n, int l, int r) {
int idx = -1;
while(l<r) {
int mid = l+(r-l)/2;
if(n < a[mid]) {
r = mid - 1;
} else if(n > a[mid]) {
l = mid + 1;
} else {
idx = mid;
if(idx < a.length-1 && a[idx] == a[idx+1]) {
return findEndIdx(a, n, mid+1, r);
} else {
return idx;
}
}
}
return idx;
}
public class Solution {
int[] ages = null;
int[] cnt = null;
void count(int l , int h) {
if (l <= h) {
if (ages[l] == ages[h]) {
cnt[ages[l]] += h - l + 1;
}
else {
int m = (l + h) / 2;
count(l, m);
count(m + 1, h);
}
}
}
void countAges(int[] array) {
ages = array;
cnt = new int[ages[ages.length - 1] + 1];
int l = 0;
int h = ages.length - 1;
count(l, h);
for (int i = 0; i < cnt.length; i++) {
System.out.println(i + " " + cnt[i]);
}
}
}
In Java:
private static void countAges(Integer[] array) {
List<Integer> list = Arrays.asList(array);
int index = 0;
while (index < list.size()) {
int latestPos = list.lastIndexOf(array[index]);
System.out.println("Age: " + array[index] + " Occurences: " + ((latestPos - index) + 1));
index = latestPos + 1;
}
}
C++ solution with binary search.
Running time is O(M log N), where N is the max occurrence of numbers and M is number of unique numbers.
#include<iostream>
#include<map>
using namespace std;
void count(int *input, int s, int e, map<int,int> &cnt)
{
if (s>e)
return;
if (input[s] == input[e])
{
if (cnt.find(input[s]) != cnt.end())
cnt[input[s]] = cnt[input[s]]+ (e-s+1);
else
cnt[input[s]] = (e-s+1);
}else
{
int half = s+ (e-s)/2;
if (cnt.find(input[half]) != cnt.end())
cnt[input[half]] +=1;
else
cnt[input[half]] = 1;
count(input, s, half-1, cnt);
count(input, half+1, e, cnt);
}
}
void count_occurrence(int *input, int len)
{
map<int, int> cnt;
map<int, int>::iterator iter;
count(input, 0, len-1, cnt);
for (iter = cnt.begin(); iter != cnt.end(); iter++)
cout<<iter->first << ": "<<iter->second<<endl;
}
int main()
{
int input[10] = {8,8,8,9,9,11,15,16,16,16};
count_occurrence(input, 10);
return 1;
}
Here is the solution with O(MlogN) complexity.
public class CountInteger {
public static void main(String[] args) {
int[] numbers = {8,8,8,9,9,11,15,16,16,16,17,19,19,19,21,22,23};
for (int i = 0; i < numbers.length; i++) {
i = count(numbers, numbers[i], i, numbers.length - 1);
}
}
private static int count(int[] numbers, int key, int lo, int hi) {
//System.out.println(key + " " + lo + " " + hi);
if (numbers[lo] == numbers[hi]) {
System.out.println(numbers[lo] + ":" + ((hi - lo) + 1));
return hi;
}
if (key < numbers[(lo+hi)/2 + 1]) {
return count(numbers, key, lo, (lo+hi)/2);
} else if (key > numbers[(lo+hi)/2 + 1]) {
return count(numbers, key, (lo+hi)/2 + 1, hi);
} else {
System.out.println(numbers[lo] + ":" + ((lo+hi)/2 + 1));
return lo;
}
}
}
void occurences(int[] a) {
int count = 0;
int last = 0;
for (int i = 0; i < a.length; i++) {
if (i == 0) {
last = a[i];
count++;
} else if (a[i] == last) {
count++;
} else {
System.out.println(last + " : " + count);
count = 1;
last = a[i];
}
if (i == a.length - 1) System.out.println(last + " : " + count);
}
}
public void countAge2()//op- 9:2 4:3 7:1 1:1 8:1 3:1 2:2 6:1 10:1 11:1 0:0 0:0 0:0 13 count
{
int[] input = {9,4,7,1,8,3,2,6,4,10,9,4,2,11};
int[] unique = new int[input.length];
int[] count = new int[input.length];
int sum=0;
boolean found=false;
int whichOne;
for(int startindex =0;startindex<input.length;startindex++)
{
found=false;
whichOne=input[startindex];
count[whichOne]++;
for(int n=0;n<sum;n++)
{
if(unique[n]==input[startindex])
{
found=true;
break;
}
}
if(!found)
{
unique[sum++]=input[startindex];
}
}
for(int i = 0 ;i<input.length;i++)
{
System.out.println(unique[i] + " : " + count[unique[i]]);
}
}
{
public void countAge2()//op- 9:2 4:3 7:1 1:1 8:1 3:1 2:2 6:1 10:1 11:1 0:0 0:0 0:0 13 count
{
int[] input = {9,4,7,1,8,3,2,6,4,10,9,4,2,11};
int[] unique = new int[input.length];
int[] count = new int[input.length];
int sum=0;
boolean found=false;
int whichOne;
for(int startindex =0;startindex<input.length;startindex++)
{
found=false;
whichOne=input[startindex];
count[whichOne]++;
for(int n=0;n<sum;n++)
{
if(unique[n]==input[startindex])
{
found=true;
break;
}
}
if(!found)
{
unique[sum++]=input[startindex];
}
}
for(int i = 0 ;i<input.length;i++)
{
System.out.println(unique[i] + " : " + count[unique[i]]);
}
}
}
public void countAge2()//op- 9:2 4:3 7:1 1:1 8:1 3:1 2:2 6:1 10:1 11:1 0:0 0:0 0:0 13 count
{
int[] input = {9,4,7,1,8,3,2,6,4,10,9,4,2,11};
int[] unique = new int[input.length];
int[] count = new int[input.length];
int sum=0;
boolean found=false;
int whichOne;
for(int startindex =0;startindex<input.length;startindex++)
{
found=false;
whichOne=input[startindex];
count[whichOne]++;
for(int n=0;n<sum;n++)
{
if(unique[n]==input[startindex])
{
found=true;
break;
}
}
if(!found)
{
unique[sum++]=input[startindex];
}
}
for(int i = 0 ;i<input.length;i++)
{
System.out.println(unique[i] + " : " + count[unique[i]]);
}
}
It can be done in sublinear, you know that the array is sorted. With your argument binary search should be linear.
mmm.... Well the sublinear occurs when there is tons of repetition though if there is not a log of repetition it become nlogn.
I agree that it can be done sublinear but it is a specific case.
I though of a solution where it assumes that the ages from start till end every are represented in the array and i just look for all of those ages.
Well, if all the ages are different all algorithms will run in linear time. There are posted solutions using a modified binary search which are logarithmic with repetitions, but still O(n) in the worst case ( not nlog(n) ) .
Those modified binary search solutions mean nothing to the problem though.
Here is an O(1) best case solution, and O(n) worst case.
Check first element, check if last element is same as first element, if they are, answer is obvious (print ELEMENT:arraylength)
Else go to second element, and repeat (i.e., compare with last element)...
{i.e., the solution I had above, can be modified to make it's best case O(1) }
The interviewer is telling that they are ages so it is expected to have a lot of repetition (as n get larger). Although is true that the complexity of the previous binaries are O(n), is like saying bfs has complexity O(n^2) instead of O(m+n) where n are the nodes and m the edges.
It was an analogy. What I meant is that BFS has complexity O(n + m) where n is the number of nodes and m the number of edges, but the complexity is also O(n^2) because in a dense graph m is O(n^2). Though by definition both are correct, O(n+m) is more accurate. I think here happens something similar.
m log n, where m is number of unique ages and n is number of elements. Since m is practically a constant (unfortunately our age is bound), the time complexity is log n. If you look at this very simple and very short code, this is nothing else than binary search for the border point between two consecutive ages.
- CT January 01, 2015