Google Interview Question
Software EngineersCountry: United States
Interview Type: In-Person
As each element from the stream arrives it is added to a frequency table. The size of the frequency table = N if the input is restricted to range 0.. N-1. A total of the number of elements processed from the stream should also be maintained. The median can be read out from the frequency table by adding up frequencies from index=0 and moving to the right till the cumulative frequency = 1/2 (total)
import java.util.stream.Stream;
public class MedianOfStream {
public static void main(String[] args) {
// assume range of each input int to be in 0.. N-1
final int MAX = 100;
final int [] aux = new int [MAX];
final int total [] = new int [1];
Stream.of (0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12)
.forEach(x -> {total[0]++; aux [x]++; });
System.out.println(findMedian (aux, total[0]));
}
private static int findMedian(int[] aux, int tot) {
int i=0, j=0;
while ( i<aux.length ) {
j+=aux[i];
if ( j>=tot/2) return i;
i++;
}
return -1;
}
}
If the interviewer didn't specify that all values are within 1..n, then, I'd use the max and min heaps solution, which would have time complexity for adding N numbers: O(N log N), and time complexity for getting the median: O(1).
Since they mentioned 1..n, probably they'd like to see something like below, with time complexity for adding N numbers: O(N), and time complexity for getting the median: O(N).
class Median {
public:
void AddNumber(uint32_t n)
{
if (n >= data_.size()) {
data_.resize(max(n + 1, (uint32_t)(data_.size() * 2)));
}
++data_[n];
++size_;
}
double Get()
{
double median = 0;
if (size_ > 0) {
int idx = (size_ - 1) / 2;
uint32_t v1 = 0;
uint32_t v2 = 0;
GetVals(v1, v2, idx);
median = size_ % 2 == 0 ? (v1 + v2) / 2.0 : v1;
}
return median;
}
private:
void GetVals(uint32_t &v1, uint32_t &v2, int idx)
{
v1 = v2 = 0;
int count = 0;
for (uint32_t i = 0; i < data_.size(); ++i) {
count += data_[i];
if (count > idx) {
v1 = i;
v2 = count > idx + 1 ? i : i + 1;
return;
}
}
}
vector<int> data_;
int size_;
};
The following is an implementation in Java 8 (`n` is the upper bound of the range):
static int median(IntStream strm, int n) {
int[] save = new int[n];
strm.forEach(x -> save[x]++);
List<Integer> list = new LinkedList<>();
for(int i=0; i<n; i++) {
while (save[i] > 0) {
list.add(i);
save[i]--;
}
}
return list.get(list.size() / 2);
}
And it's called: with:
System.out.println(median(IntStream.of(1,2,3,4,5,6,7,8,9,10), 12));
There is an algorithm to find the k-th order element in an unsorted array in time O(k) - its much like quicksort() but we recursively operate on just side part. its running time is O(k) amortized time.
since the Median is the K/2 ordered element, we simply activate the above mentioned algorithm on the input array.
if the stream isnt inside an array, we consume it into an array which is O(n).
so its amortized O(n) - not O(n)
How about this. Have an auxilary array to hold the frequencies of each input seen in the stream. This array will have size N if the range of input was [0..N-1]. Each input will arrive and get added to the auxilary array. A total counter will also increment.
- Anonymous August 18, 2017{{
import java.util.stream.Stream;
public class MedianOfStream {
public static void main(String[] args) {
// assume range of each input int to be in 0.. N-1
final int MAX = 100;
final int [] aux = new int [MAX];
final int total [] = new int [1];
Stream.of (0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12)
.forEach(x -> {total[0]++; aux [x]++; });
System.out.println(findMedian (aux, total[0]));
}
private static int findMedian(int[] aux, int tot) {
int i=0, j=0;
while ( i<aux.length ) {
j+=aux[i];
if ( j>=tot/2) return i;
i++;
}
return -1;
}
}
}}