Google Interview Question
Software EngineersCountry: Switzerland
Interview Type: Written Test
tested.candidate solution should work. It's implementation of reservoir sampling. In this case we are implementing it for a tree rather than an array or linkedlist etc.
I think it's the best solution, when you are not allowed pre-processing or don't know the number of nodes in advance.
If you can pre-process, you can come up with O(logn).
c# implementation
public void ModifiedInOrderForRandom(RData<T> data)
{
ModifiedInOrderForRandomUtil(root, data);
}
public void ModifiedInOrderForRandomUtil(Node node, RData<T> data)
{
if (node == null)
return;
ModifiedInOrderForRandomUtil(node.left, data);
data.index++;
Random rnd = new Random();
int randomIndex = rnd.Next(0, data.index);
if (randomIndex == 0)
data.randomData = node.data;
ModifiedInOrderForRandomUtil(node.right, data);
}
Update the tree to have to know how many node under any node including parent.
Once we know that, all we have to do it generate a random number using random() function between 1 to N(noth inclusive), then go that element and return it.
Preprocessing time :
Space : O(N)
Time : O(N)
Per request time : O(Log(N)) time + O(1) space.
Put the tree into array form. 0 where there's no node, 1 where is one. Throw a random between 0 and array size, if found 1 return, if found 0 repeat.
Caveat: we are wasting space and time if the tree is hugely unbalanced.
If no previously visited nodes must be returned, we have to keep track of every cell visited and check if there is still one unvisited (number of all visited, filled or no has to be less then array) unless the algo never returns.
Sample solution in C++
#include <memory>
#include <random>
#include <iostream>
using namespace std;
template<typename T>
struct BTreeNode {
T data;
shared_ptr<BTreeNode> left;
shared_ptr<BTreeNode> right;
explicit BTreeNode(const T& data)
: left(nullptr), right(nullptr), data(data) {}
BTreeNode()
: BTreeNode(T()) {}
};
typedef BTreeNode<int> IntNode;
typedef shared_ptr<IntNode> IntNodePtr;
template<typename RandEng>
class RandomSelector {
int _count;
IntNodePtr _choice;
RandEng _randEng;
public:
RandomSelector(const RandEng& re);
IntNodePtr selectRandom(const IntNodePtr root);
protected:
void selectRandomInner(const IntNodePtr root);
};
template<typename RandEng>
RandomSelector<RandEng>::RandomSelector(const RandEng& re)
: _count(0), _choice(nullptr), _randEng(re) {}
template<typename RandEng>
IntNodePtr RandomSelector<RandEng>::selectRandom(const IntNodePtr root) {
_count = 0;
_choice = nullptr;
selectRandomInner(root);
return _choice;
}
template<typename RandEng>
void RandomSelector<RandEng>::selectRandomInner(const IntNodePtr root) {
if (root == nullptr) return;
++_count;
uniform_int_distribution<int> dist(1, _count);
int randInt = dist(_randEng);
if (randInt == 1) {
_choice = root;
}
selectRandomInner(root->left);
selectRandomInner(root->right);
}
int main() {
// make tree
IntNodePtr node1 = make_shared<IntNode>(1);
node1->left = make_shared<IntNode>(2);
node1->right = make_shared<IntNode>(3);
// setup random number generation
random_device randDev;
mt19937 randEng(randDev());
RandomSelector<mt19937> randSelector(randEng);
IntNodePtr selection = randSelector.selectRandom(node1);
cout << "Random choice: "<< selection->data << endl;
return 0;
}
If we're not keeping track of nodes already visited, here is a solution I propose:
Lets say the depth of tree is N i.e. number of levels is N.
1. Select a random number between 0-N-1 say P. So we will go down P levels
2. Go down P levels, one level at a time. At each level, select a random number between 0or1: if number = 0 go left else go right
Each node will have equal probability of being selected.
I don't think so it will work puneet. The reason is being a unbalance tree. so suppose you choose a number randomly from 0 to N, where N is depth of tree, now if you traverse and the probability of taking any path is equal , you might end of taking that sub tree which is lesser than the number we have selected from 0- N. In that particular case we are not doing it proper. you algo will work awesome when you have balanced tree. Hope i am clear with my writing.
Please leave comment.
The answer is to traverse the tree (DFS or BFS) while keeping track of the number of nodes we've seen so far (with the first one indexed 1). Each time we see a new node pick a random number from 0 to index-1. If the random number is equal to 0 then remember the node to be returned (replace the previous one if necessary).
- tested.candidate July 14, 2015