Google Interview Question
Software Engineer / DevelopersNote the qualifier: "large number". If Google says large number, that should be massive. That amount of data can only be stored in hard drive or tape etc, where IO dominates the efficiency. So, to speed up searching, B-tree is optimal.
For median, I guess we can search in the leaf nodes. B-tree is ordered. So, you can find the median half way through keys of the leaf nodes.
Note the qualifier: "large number". If Google says large number, that should be massive. That amount of data can only be stored in hard drive or tape etc, where IO dominates the efficiency. So, to speed up searching, B-tree is optimal.
For median, I guess we can search in the leaf nodes. B-tree is ordered. So, you can find the median half way through keys of the leaf nodes.
<pre lang="" line="1" title="CodeMonkey45093" class="run-this">Use two heaps
Max heap and Min heap
( The constraint is that the size difference between minheap and maxheap should be 0 or 1)
Read the elements from source:
median is represented by the root of maxheap ( if number of elements is odd)
median is represented by the average of the root of maxheap and minheap (if the number of elements is even)
if the incoming element is less than the median
if the size of maxheap is greater than minheap by 1. Delete the element from maxheap and insert to minheap.
insert the element to maxheap
else
if the size of minheap is greater than maxheap by 1. Delete the element from minheap and insert to maxheap.
insert the element to minheap
comments Welcome :-)</pre><pre title="CodeMonkey45093" input="yes">
</pre>
Here is my solution (C). For the even tree_size, traverse_inorder may be optimized for returning two nodes instead of one (using a struct)
/**
* Traverse inorder
* left-root-right
* @param N Nth number in the ordered list
*/
int traverse_inorder(Node_t *root, int N)
{
static int counter = 0;
if(!root) return;
traverse_inorder(root->left, N);
counter++;
if(counter == N) /*Nth number in list is found */
{
return root->data;
}
traverse_inorder(root->right, N);
}
find_median(Node_t *root, int tree_size)
{
int N = 0;
int lower = 0;
int higher = 0;
if(tree_size & 1)
{
N = (tree_size/2) + 1;
return traverse_inorder(root, N);
}
else
{
N = tree_size/2;
lower = traverse_inorder(root, N);
N++;
higher = traverse_inorder(root, N);
return (lower + higher)/2;
}
}
Your function traverse_inorder uses a static counter. The function won't return an accurate result more than once because the counter is not reset during subsequent calls.
Why can't we use arrays to store large number of integers
- abhimanipal February 02, 2010We can use the Select Algorithm to find the median. This algorithm also has time complexity of O(n)