Google Interview Question
Software Engineer InternsCountry: Brazil
Interview Type: In-Person
Could you expand on this please? How would you use a variant of binary search to count how many people are of each age?
public static int lastIndexVariantOfBinarySearch(int number, int[] array, int start, int end) {
if(end<start)
return -1;
int mid = (start+end)/2;
if(array[mid] == number && (mid == end || array[mid+1] != number))
return mid;
if(array[mid] <= number) {
return lastIndexVariantOfBinarySearch(number, array, mid+1, end);
}
else {
return lastIndexVariantOfBinarySearch(number, array, start, mid-1);
}
}
Full solution:
//return: index age, value count of this age
public static int[] count(int[] input) {
int[] count = new int[input[input.length-1]+1];
count (input, 0, input.length-1, count);
return count;
}
private static void count(int[] input, int begin, int end, int[] count) {
if (input[begin] == input[end]) {
count[input[begin]] += end-begin+1;
}
else {
count(input, begin, (begin+end)/2, count);
count(input, (begin+end)/2 + 1, end, count);
}
}
Need 2 binary search operations. The difference between the 2 returned indices is the count of the age.
int BinarySearchCountUpper(vector<long> source, int toSearch, int start, int end)
{
int mid = start + (end - start) / 2 + (end - start) % 2;
if (end < start)
return 0;
if (source[mid] == toSearch && (mid == end || source[mid + 1] != toSearch))
return mid;
else if (source[mid] <= toSearch)
return BinarySearchCountUpper(source, toSearch, mid + 1, end);
else
return BinarySearchCountUpper(source, toSearch, start, mid - 1);
}
int BinarySearchCountLower(vector<long> source, int toSearch, int start, int end)
{
int mid = start + (end - start) / 2 + (end - start) % 2;
if (end < start)
return 0;
if (source[mid] == toSearch && (mid == start || source[mid - 1] != toSearch))
return mid;
else if (source[mid] < toSearch)
return BinarySearchCountLower(source, toSearch, mid + 1, end);
else
return BinarySearchCountLower(source, toSearch, start, mid - 1);
}
Test:
count = BinarySearchCountUpper(ages, 20, 0, ages.size() - 1) - BinarySearchCountLower(ages, 20, 0, ages.size() - 1) + 1;
reverse binary search for each age.
let a[0] be an age then check the indexes of 1 2 4 8 16 32 64, if number is different at any two indexes then apply binary search in that range.
ex. a[0] = 0; and
a[0] == a[1] == a[2] == a[4] == a[8] == a[16] != a[32], here apply binary search between 16 and 32 indexes and find the last index value with 0
time complexity: O(log n) * number of unique age groups
Why not just use at HashTable, It will only take O(n) to create, no need to use anything like a binary search tree here.
HashTable<int, int> findAges(int[] ages){
HashTable<int, int> table = new HashTable<int, int>();
for(int i = 0; i < ages.Lentgh; i++){
if(table.hasHey(ages[i])){
table[ages[i]]++;
} else {
table.add(ages[i], 1);
}
}
return table;
}
Because if you "have the age of every person" going through every element is much, much more expensive than doing a bunch of logarithmic searches
This is correct code. Yet you didn't use it fully to your advantage... you should use the fact that the array is sorted. If it wasn't, i believe using an array where the index is the age and the data it has is the frequency of the index found is the optimal solution (Please correct me if i'm wrong). Knowing it is sorted, and the population of the earth is around 7 billion, and let's assume the oldest person is 125, it is fair to jump 7billion/125 steps each time and check the value, if its the same add 7B/125 to that value in the array if not, locate where it stopped by doing binary search and add the frequency which is the difference between the index before jumping, and the located index. From there we continue on finding the frequency of the next age.
U have to use kind of DP.If if you have to find number of ppl of each age then.Keep the rightmost index of just smaller no and find the rightmost index of current age.
then no of ppl of current age =(rightmost index of current no - rightmost index of prev no)
rightmost index of any number in a sorted array can be found in o(log n) time.so over all complexity is klog(n) where k is different age of ppl.
code is below
int binary_search(vector<int> v,int low,int high,int val)
{
if(low >= high)
return low;
int mid;
mid = low +(high-low)/2;
if(v[mid]<=val && v[mid+1]>val)
return mid+1;
else if(v[mid]<=val)
return binary_search(v,mid,high,val);
else
return binary_search(v,low,mid,val);
}
just write a function to call the above code for different value of age and keep the prev_index so that for current you can find diff from current_rightmost_index and prev_rightmost_index
Well, you're going to have to somehow check every person in the World but you also know that ages are roughly limited in the range [0..130]. So, you could theoretically use some sort of recursive halving sort to find the ages. This would give you answers in roughly O (log n * 130) :
public static int[] getCountOfAges(int[] ages){
//first compute the indices at which each age group will start
int[] ageCounts = new int[130];
ageCounts[0] = 0;
for(int i = 1; i < ageCounts.length; i++){
ageCounts[i] = findFirstInstance(ageCounts[i-1], ages, i);
}
//convert the indices to actual values:
for(int i = 0; i < ageCounts.length - 1; i++){
ageCounts[i] = ageCounts[i+1] - ageCounts[i];
}
ageCounts[ageCounts.length-1] = ages.length - ageCounts[ageCounts.length-1];
return ageCounts;
}
private static int findFirstInstance(int loValueIndex , int[] arr, int value){
//find an index where arr[index] = value
int valueIndex = findIndex(lo, arr, value);
//find where arr[index -1] < arr[index]
while(valueIndex - loValueIndex > 1){
int mid = (valueIndex + loValueIndex)>>1;
if(arr[mid] == arr[valueIndex]){
valueIndex = mid;
}
else{
loValueIndex = mid;
}
}
//return index
return valueIndex;
}
private static int findIndex(int lo, int[] arr, int value){
int hi = arr.length -1;
while(lo < hi){
int mid = (lo + hi)>>1;
if(arr[mid] < value){
lo = mid;
}
else if(arr[mid] > value){
hi = mid;
}
else{
return mid;
}
}
return arr[hi] == value ? hi : lo;
}
//This solution assumes the input array contains only ages 0-130 inclusive
//Time complexity BigO(130*logn) = O(logn)
public class agecounter {
static final int MAX_AGE = 130;
public static void main(String[] args){
int i = 0;
int[] ages = {0,0,0,0,0,1,1,1,1,2,4,4,4,4,4,4,4,4,9,9,9,9,10,66};
int[] display = ageCount(ages);
for (i = 0; i < display.length; i++){
System.out.printf("age=%d count=%d", i, display[i]);
System.out.println();
}
}
public static int[] ageCount(int[] src){
int[] result = new int[MAX_AGE+1];
int prev = 0;
int i = 0;
for(i = 0; i < MAX_AGE+1; i++){
System.out.println(prev);
if(prev == findIndex(src, prev, i)){
//if we are at the end of input array, fill rest of result with 0's
if(prev == src.length){
result[i]=0;
}else{ //else we saw a single value from findIndex
result[i] = 1;
prev++;
}
continue;
}else if(findIndex(src, prev, i) == -1){//no values found
result[i] = 0;
continue;
}//more than one value found
int count = findIndex(src, prev, i) - prev + 1;
prev = findIndex(src, prev, i);
result[i] = count;
prev++;
}
return result;
}
//recursive binary search to arrive at last index of **current** age, not the
//beginning of **next** age
public static int findIndex(int[] a, int startIndex, int age){
int ptr = 1;
if(startIndex+ptr > a.length-1) return startIndex;
if(a[startIndex] != age) return -1;
if(a[startIndex+1] != age){
return startIndex;
}
while(a[startIndex+ptr] == age){
ptr *= 2;
if(startIndex+ptr > a.length-1){
ptr = ptr/2;
while(startIndex+ptr < a.length-1){
ptr++;
}
return startIndex + ptr;
}
}
return findIndex(a, startIndex + (ptr/2), age);
}
}
output
age=0 count=5
age=1 count=4
age=2 count=1
age=3 count=0
age=4 count=8
age=5 count=0
age=6 count=0
age=7 count=0
age=8 count=0
age=9 count=4
age=10 count=1
age=11 count=1
age=12 count=0
age=13 count=0
...
age=66 count=1
if we have to count for all ages (as your output suggests) then we can't do better than O(n).
@Victor: we have to count number of people for EACH age, isn't it?
So, just count all the ages into a counter with O(n+MaxAge) time, O(MaxAge) space.
Can't do better than O(n) time!
int AgeCnt[150], MaxAge= -1;
for(int i = 0; i<n; i++){
MaxAge= max(MaxAge, Input[i]);
AgeCnt [Input[i]]++;
};
for(int i = 0; i<MaxAge; i++)
cout <<"There are "<<AgeCnt[i] <<" people have age "<<i<<endl;
We can use a Map for this:
{ Map<Integer,Integer> result = new HashMap<Integer,Integer>();
for (int i =0;i<array.length;i++){
Integer count = result.get(array[i]);
if(count == null){
result.put(array[i],1);
}else{
result.put(array[i],count+1);
}
}
}
I will use a usual binary search to find first occurrences of ages and then I will take the differences to find counts. Here is my Python code:
def binarySearch(arr, low, high, value):
# this is usual binary search algorithm
if high - low < 2:
if arr[low] == value:
return low
elif arr[high] == value:
return high
else:
return None
else:
mid = (low + high)/2
if arr[mid] < value:
return binarySearch(arr, mid, high, value)
else:
return binarySearch(arr, low, mid, value)
def countSorted(arr):
# using binary search I will find first occurrences of each age, I will run binary search on the rest of the array with age + 1
# I will assume there is no jump in the ages, I am not assuming any upper bound on age
firstOcc = [0]
while firstOcc[-1] != None:
firstOcc.append(binarySearch(arr,firstOcc[-1],len(arr) - 1,len(firstOcc)))
print firstOcc
return takeDiff(arr,firstOcc)
def takeDiff(arr,firstOcc):
counts = []
for i in range(len(firstOcc) - 2):
counts.append(firstOcc[i+1] - firstOcc[i])
counts.append(len(arr) - firstOcc[len(firstOcc) - 2])
return counts
arr = [0,0,1,1,1,2,3,3,3,4,4,4]
print countSorted(arr)
//return: index age, value count of this age
public static int[] count ( int[] input ) {
int[] count = new int[input[input.length-1]+1];
count ( input, 0, input.length-1, count);
return count;
}
private static void count ( int[] input, int begin, int end, int[] count ) {
if ( input[begin] == input[end] ) {
count[input[begin]] += end-begin+1;
}
else {
count( input, begin, (begin+end)/2, count);
count( input, (begin+end)/2 + 1, end, count);
}
}
Just count the items inside the list and append each counter to another list. O(n) for the iterations through the array.
def myListcount(lista):
listB = []
counter = 1
for i,j in enumerate(lista):
if i == len(lista) -1:
break
x = lista[i+1]
if x == j:
counter +=1
else:
listB.append(counter)
counter = 1
listB.append(counter)
print listB
Lazy C++ verson
std::vector<uint64_t> histo_people(const std::vector<int>& s) {
using namespace std;
vector<uint64_t> counts(s.back()+1);
auto cur_age = s.cbegin();
while (cur_age != s.end()) {
auto ub = upper_bound(cur_age,s.cend(),*cur_age);
counts[*cur_age] = ub-cur_age;
cur_age = ub;
}
return counts;
}
#include<iostream>
#include<unordered_map>
using namespace std;
class countAge{
public:
countAge(int* array) : ages(array){}
void countAges(int, int);
void displayAges();
private:
int* ages;
unordered_map<int, int> result;
};
void countAge :: countAges(int i, int j){
if(i<=j){
int mid = (i+j)/2;
if(ages[i] == ages[mid]){
result[ages[i]]+= mid-i+1;
} else {
countAges(i, mid);
}
countAges(mid+1, j);
}
};
void countAge :: displayAges(){
for(map<int, int> :: iterator it = result.begin(); it!=result.end(); ++it){
cout<<it->first<<"-"<<it->second<<endl;
}
};
int main(){
int ages[] = {0,0,0,0,0,1,1,1,1,1,4,4,4,4,4,7,7,7,8,8,10,10,10,10,10,10,10,15,15,15,15,30,30,30,30,30,49,49,49,49,49,49,49,49,49,49,49,100,100,100,100};
int length = sizeof(ages)/sizeof(ages[0]);
countAge c(ages);
c.countAges(0, length-1);
c.displayAges();
return 0;
}
#include <stdio.h>
/// C implementation
///
/// this problem can be solved by doing a binary search to
/// find the start and end of each age.
///
/// e.g. [0,0,0,1,1,1,9,9,9,11,13,13]
///
/// for a[0], we need to find where it starts and where it ends.
/// 1. to find where it starts. we need to find the element that
/// is smaller than it.
/// 2. to find where it ends. we need to find the element that
/// is bigger than it.
/// 3. once we have that we have the number of occurs and also we
/// have the next element to process, i.e. the bigger element.
///
static int find_lower(int *a, int s, int e, int elem) {
if (s+1==e) {
if (a[s] == elem)
return s-1;
return s;
}
int m=(s+e)/2;
if (a[m] < elem) {
/// if a[m] is smaller than elem. try to shrink left side.
return find_lower(a, m, e, elem);
} else if (a[m] > elem) {
/// if a[m] is bigger than elem. try to shrink right side.
return find_lower(a, s, m, elem);
} else {
/// if a[m] is the same as elem. try to shrink right side.
return find_lower(a, s, m, elem);
}
}
static int find_higher(int *a, int s, int e, int elem) {
if (s+1==e) {
if (a[e] == elem)
return e+1;
return e;
}
int m=(s+e)/2;
if (a[m] < elem) {
/// if a[m] is smaller than elem. try to shrink left side.
return find_higher(a, m, e, elem);
} else if (a[m] > elem) {
/// if a[m] is bigger than elem. try to shrink right side.
return find_higher(a, s, m, elem);
} else {
/// if a[m] is the same as elem. try to shrink right side.
return find_higher(a, m, e, elem);
}
}
int main() {
int a[] = {0,0,0,1,1,1,9,9,9,11,13,13};
int elem;
int l=0,h=0;
int count = sizeof(a)/sizeof(a[0]);
while(h<count) {
int elem = a[l];
l = find_lower(a, l, count-1, elem);
h = find_higher(a, l,count-1, elem);
printf("there are %d %ds\n", h-1-l, elem);
l = h;
}
return 0;
}
Could use a varient of binary search to look for samller than and greater than. The difference b/w the indices of lesser and smaller than element will give the count.
- Stuart November 21, 2014