Google Interview Question
Software Engineer / Developer
Country: China
Interview Type: Phone Interview
What should the distribution be? If it is not specified, we can start at the root and at each node randomly decide whether to continue left or right. If it must be uniform, we need to maintain the weights of the left and right subtrees at each node. Then, start at the root and decide at every node in the following way: if the weight of the left subtree is L and of the right subtree is R, let D = random(1, L+R); if D < L then go left; else go right.
That approach will never select the root node of the tree. I would do an inorder traversal, store the values in an array, and select a random value from the array; that guarantees an equal selection probability for every node.
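The inorder-plus-array idea can be sketched as below. This is a minimal illustration, not code from the thread; the `Node` class and its field names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class InorderRandom {
    static class Node {
        int val;
        Node left, right;
        Node(int val) { this.val = val; }
    }

    // Collect all values with an inorder traversal: O(n) time and O(n) space.
    static void inorder(Node node, List<Integer> out) {
        if (node == null) return;
        inorder(node.left, out);
        out.add(node.val);
        inorder(node.right, out);
    }

    // Every node (including the root) ends up in the list, so each value
    // is returned with probability exactly 1/n.
    static int randomNode(Node root, Random rnd) {
        List<Integer> values = new ArrayList<>();
        inorder(root, values);
        return values.get(rnd.nextInt(values.size()));
    }
}
```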
@vhmj1982: This is a reasonable solution. Not picking the root is a minor issue and can easily be corrected.
This way, for a balanced tree, insert/delete and getRandom all have O(log n) time.
Doing an inorder traversal will be O(n) time (by the way, if you do decide to do an inorder traversal, you can save the O(n) array space by using reservoir sampling).
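The reservoir-sampling variant mentioned here can be sketched as follows: one inorder pass, a reservoir of size one, and O(1) extra space besides the recursion stack. The `Node` class is an illustrative assumption.

```java
import java.util.Random;

public class ReservoirRandom {
    static class Node {
        int val;
        Node left, right;
        Node(int v) { val = v; }
    }

    // Visit nodes in order; when the i-th node (1-based) is seen, replace
    // the current pick with probability 1/i. After the full pass, every
    // node has been kept with probability exactly 1/n.
    static void visit(Node node, Random rnd, int[] seen, int[] pick) {
        if (node == null) return;
        visit(node.left, rnd, seen, pick);
        seen[0]++;
        if (rnd.nextInt(seen[0]) == 0) pick[0] = node.val;
        visit(node.right, rnd, seen, pick);
    }

    static int randomNode(Node root, Random rnd) {
        int[] seen = {0};   // nodes visited so far
        int[] pick = {0};   // current reservoir of size 1
        visit(root, rnd, seen, pick);
        return pick[0];
    }
}
```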
@ywdong8809: The problem seems to imply that the binary tree implementation is in your hands, so we expect the L/R sizes to be available as node fields. Did you clarify this with the interviewer?
@Loler: Yes, I clarified this with the interviewer. We can maintain some extra data in this BST, as long as we don't slow down the other operations.
But in this solution, when does it return? It decides randomly whether to continue left or right, so how do you know when to stop and return an element?
You don't need to store the array: first generate a random k, then do the traversal, stop when you've seen k nodes, and return the element you are at. No extra space needed.
@Anonymous: If you don't know the number of nodes in the tree, you cannot generate a random number and then do an inorder traversal stopping at that point.
@Anonymous: once again, it depends on what you are willing to trade. If you don't want extra space, you can count the nodes in one pass and then stop at the k-th node in a second pass. If you can use extra space, the array is perfectly valid.
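The two-pass idea (count first, then stop at the k-th node of a second inorder pass) might look like this. A hedged sketch: the `Node` class and helper names are assumptions, not from the thread.

```java
import java.util.Random;

public class TwoPassRandom {
    static class Node {
        int val;
        Node left, right;
        Node(int v) { val = v; }
    }

    // First pass: count the nodes.
    static int count(Node node) {
        if (node == null) return 0;
        return 1 + count(node.left) + count(node.right);
    }

    // Second pass: inorder traversal that stops at the k-th (0-based) node.
    // `remaining` is a one-element array so the counter survives recursion.
    static Integer kthHelper(Node node, int[] remaining) {
        if (node == null) return null;
        Integer left = kthHelper(node.left, remaining);
        if (left != null) return left;
        if (remaining[0]-- == 0) return node.val;
        return kthHelper(node.right, remaining);
    }

    static int kth(Node node, int k) {
        Integer result = kthHelper(node, new int[]{k});
        if (result == null) throw new IllegalArgumentException();
        return result;
    }

    // Two O(n) passes, no array: uniform over all n nodes.
    static int randomNode(Node root, Random rnd) {
        return kth(root, rnd.nextInt(count(root)));
    }
}
```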
Traverse the tree once and record the size (i.e. the number of nodes) of each subtree at each node. Then, starting from the root, do the following:
leftSize := size(left) or 0 if left == null
rightSize := size(right) or 0 if right == null
totalSize := leftSize + rightSize + 1
with probability 1 / totalSize, return current node
with probability leftSize / totalSize, go to the left subtree and repeat
with probability rightSize / totalSize, go to the right subtree and repeat
This should return each element with equal probability. However, I couldn't find a way to utilize the fact that the tree in question is a BST. This algorithm will work for any binary tree. Any ideas for that?
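The probabilistic descent described above can be sketched as below, assuming each node caches its subtree size in a `size` field (an assumption; the comment only says sizes are recorded in a first traversal). Drawing one integer in [0, totalSize) and comparing it against leftSize realizes exactly the three probabilities listed.

```java
import java.util.Random;

public class SizeDescentRandom {
    static class Node {
        int val, size;   // size = number of nodes in this subtree
        Node left, right;
        Node(int v) { val = v; size = 1; }
    }

    static int size(Node n) { return n == null ? 0 : n.size; }

    // One O(h) descent. At each node, draw d uniform in [0, total):
    //   d <  leftSize  -> descend left   (probability leftSize/total)
    //   d == leftSize  -> return current (probability 1/total)
    //   d >  leftSize  -> descend right  (probability rightSize/total)
    // so overall every node is returned with probability 1/n.
    static int randomNode(Node node, Random rnd) {
        while (true) {
            int left = size(node.left);
            int d = rnd.nextInt(node.size);
            if (d == left) return node.val;
            node = (d < left) ? node.left : node.right;
        }
    }
}
```

Note this is the same computation as selecting a uniformly random rank k and finding the k-th element, just folded into one draw per level.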
If we know the size of the subtree rooted at each node of the tree (and hence of the whole tree) we could get a random element in O(log n) time (assuming the tree is balanced and its height is log n, O(h) in the general case)
Check the root to see how many elements are in the tree, say n. Then generate a random int k between 1 and n. We just have to select the k-th element: we know how many elements there are in each subtree, so we can use a recursive function.
So if size(root->left) == k-1 we return the root, if size(root->left) < k - 1 we can return the (k - size(root->left) - 1)-th element of the right tree, otherwise we return the k-th element of the left subtree.
No more than O(h) calls are needed.
Warning: if k (the index of a node) starts from 0, then one must check that size(node->left) == k (and not k-1) at each step.
Here is the code:
public Value randomNode() throws IllegalStateException {
    if (root == null) {
        throw new IllegalStateException();
    }
    int k = generator.nextInt(root.N);
    return getKthNode(root, k);
}

private Value getKthNode(Node node, int k) throws IllegalArgumentException {
    if (node == null || k >= node.N) {
        throw new IllegalArgumentException();
    }
    // Size of the left subtree, or 0 if there is none.
    int lsize = (node.left == null) ? 0 : node.left.N;
    if (lsize == k) {
        return node.val;
    } else if (lsize > k) {
        return getKthNode(node.left, k);
    } else {
        // Invariant: node.right != null, because k <= node.N - 1 and
        // lsize <= k - 1 imply node.right.N = node.N - lsize - 1 >= 1.
        return getKthNode(node.right, k - lsize - 1);
    }
}
I tested it using this code:
public static void testRandomNodeDistribution() {
    int i, n = 10000, m = 100000000, s[] = randomSequence(n);
    BST tree = new BST();
    HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
    for (i = 0; i < n; i++) {
        tree.put(s[i]);
    }
    for (i = 0; i < m; i++) {
        int x = tree.randomNode();
        if (map.containsKey(x)) {
            map.put(x, map.get(x) + 1);
        } else {
            map.put(x, 1);
        }
    }
    double mean = 0, sdv = 0;
    for (i = 0; i < n; i++) {
        mean += map.get(i);
    }
    mean /= n;
    for (i = 0; i < n; i++) {
        double tmp = map.get(i) - mean;
        tmp *= tmp;
        sdv += tmp;
    }
    sdv = Math.sqrt(sdv / n);
    System.out.println(sdv / mean);
}
The resulting standard deviation is about 0.934% of the mean, so one can say the resulting distribution is actually pretty uniform.
How does this sound? Assume the height of the tree is always known to us. In that case we select a random int less than the height of the tree, then travel that deep, making a random decision at each level on whether to go left or right.
We might have to store the height of the subtree at each node if the tree is highly skewed, and take the subtree height into account whenever we make the left/right decision.
The issue with that approach is that the distribution would be skewed in favor of elements at depths that have fewer nodes. For example, at depth 0 we have only one element, the root; at depth 1 we have 2; and at depth N of a balanced BST we'd have 2^N elements. If we choose uniformly among depths [0, N], an individual element at depth N has a much smaller chance of being selected than anything above it.
This might or might not be a concern. But, if I was the interviewer, I'd definitely want to get a solution where there's an equal probability of selection for each element in the tree.
Here is my thought: all we need is an API that finds the k-th element in the tree, such as tree.find(int k). With a plain binary search tree we can find the k-th element in O(n) time. If we want better, we can store child counts in each node, so the delete and insert operations must update the counts of all ancestors. The performance of the search then depends on the shape of the tree at the time: if it is close to balanced, you get O(log n); if it is more like a list, it is O(n). Either way, it guarantees a uniform probability distribution over any snapshot of the tree.
- chenlc626 March 07, 2013