Google Interview Question for Software Engineer / Developers


Country: Israel
Interview Type: In-Person




Comment hidden because of low score. Click to expand.
8
of 10 vote

If there was no weight the problem would be to sample from a stream with equal probability: Sample the first input from the stream. Given the k-th input from the stream, generate random number p=1..k, if p=1 replace the sample.
Now, the problem is that selection is no longer uniform but driven by 'weight'. One can use very similar approach: (i) cumulate the weights seen so far and (ii) decide wheather to replace based on current weight and cumulated value.

A sample code is shown below:

public int getRandom(InputStream str) {
	Scanner sc = new Scanner(str);
	int sum = 0, sample;

	while (sc.hasNext()) {
		int val = sc.nextInt();
		int wgt = sc.nextInt();
		if (wgt == 0 && val == 0) 	break;
		sum += wgt;
		int p = Math.random()*(double) sum;
		if (p <= wgt)				sample = val;
	}
	
	return sample;
}

EDIT:
As some people requested, here is a proof of the simpler problem:

The goal is to find a random sample from the stream. After receiving the k-th input, the sample will be replaced with probability 1/k with this new input. Let's compute a probability that the first input is still the selected ranodm sample after k iterations. Several consecutive events must happen: (i) The first input is selected as a sample and (ii) the sample is never replaced afterwards.

The first input is selected with probability p1 = 1/k = 1/1, the j-th input does not replace the sample with probability not_pj = (j-1)/j. Hence, the probability of the first input is a sample after k-th iteration can be written as:
p1_isSample = p1 * not_p2 * not_p3 ... * not_pk= 1/1 * 1/2 * 2/3 ... (k-1)/k = 1/k

Notice that this result holds for the second, third, and any other n-th element
pn_isSample = 1/n * n/(n+1) * (n+1)/(n+2) ... (k-1)/k = 1/k

hence, each element can be selected as the sample with equal probability of 1/k.
Q.E.D.

This proof can be now generalized to the non-uniform distribution case with weights.

- autoboli January 27, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Shouldn`t we move random p * sum and condition outside the while loop and before the return?

- Sai January 27, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Got it......we have to replace the sample....so the return should be sample not val.

- Sai January 27, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

@autoboli, excellent approach to think about simpler problem first, but you don't even have solution for a simpler problem. We don't know number of elements ahead of time, how can we select k?

- CT January 27, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Regarding replacing - are you sure you are not giving preference based on order of appearance?

- CT January 27, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

@Sai, thx a lot for pointing the mistake at return statement. Fixed!

- autoboli January 27, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

@CT: For simple problem the 'k' is number of samples seen so far. One can compute the probability of each sample being replaced after 'k-th' sample appears. It can be proven that for each sample, the probability is equal. Now, the problem is essentially the same except that probability being sampled is driven by weight.

- autoboli January 27, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

@autoboli, let's just talk about simple version for the time being. It seems to me you are talking about extending selection from k-1 elements to selection from k elements. Each element has 1/k probability to be selected, including the last one, I got this part about the last one, it will be selected with probability 1/k. The other ones will be selected with probability (k-1/k)/(k-1) =1/k, Wow, it works ...

- CT January 27, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Excactly ;)

- autoboli January 27, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Incredible solution, autoboli. Probabilities are indeed equal. I'd never have thought it myself.

- Victor January 28, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Thank you Victor! You would, maybe I would take some time but finally you would :)

- autoboli January 28, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Any link to a theory of why this works? Meaning why samples have equal probability of being chosen (according to the weights) in this approach. I think I grasp how it works, but not quite why. Any theorem, rule, or Wikipedia article?

- valoris January 28, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

@Valoris: Samples with higher weights have better chances of being chosen, now consider a sample with some weight, if it's near the beginning, chances of being chosen is high, but chances of being replaced is high as well. if it's near the end, chances of being chosen is low, but chances of being replaced is low as well.
Bottom line, no matter where a sample is, it's chances of being chosen is proportional to its weight

- koosha.nejad January 28, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

I created a document which mathematically proves the answer given by autoboli is correct, If interested, please email me (koosha.nejad@gmail.com) and I'll send it to you.

- koosha.nejad January 28, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

@valoris, I put some proof sketch below the code above. I hope it will help to understand the concept :)

- autoboli January 28, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

That is a variation of reservoir sampling problem

- Dmitry April 08, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

Start from the beginning of the stream and keep the unique id with probability w[i] / sum[i], where w[i] is the weight of the i'th element in the stream and sum[i] is the sum of weights up to i'th element.

- anan January 27, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Currently the getRandom() function is O(n). We can reduce it to O(log n) if we sort the tuples in order of non-decreasing weights and then do binary search.

import java.util.Scanner;
import java.util.List;
import java.util.LinkedList;

class WeightedProbability {
	public static void main (String [] args) {
		Scanner s = new Scanner(System.in);
		List<Tuple> tuples = new LinkedList<Tuple>();
		int totalWeight = 0;
		Tuple tuple;
		do {
			tuple = readTuple(s);
			totalWeight += tuple.weight;
			tuples.add(tuple);
		} while (!tuple.isLastTuple());
		// Remove the last tuple from the set
		for (int i = 0; i < 100; i++) {
			getRandom(tuples, totalWeight);
		} 

	}
	
	public static Tuple readTuple(Scanner s) {
		Tuple t = new Tuple();
		t.num = s.nextInt();
		t.weight = s.nextInt();
		return t;
	}

	public static int getRandom(List<Tuple> tuples, int totalWeight) {
		int rand = (int)(Math.random() * totalWeight);
		System.out.println("rand = " + rand);
		int i = 0;
		Tuple t = null;
		while (rand >= 0) {
			t = tuples.get(i);
			rand -= t.weight;
			i++;
		}
		System.out.println("Number = " + t.num);
		return t.num;
	}
}

class Tuple {
	int num;
	int weight;

	boolean isLastTuple() {
		if (num == 0 && weight == 0) {
			return true;
		}
		return false;
	}
}

- Sumeet Singhal January 27, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Keep track of the total_weight;
At each iteration (id,weight)
• total_weight += weight
• x = generate a random number between 1 and total_weight
• if (total_weight - x < weight ) => selected_id = id
Trace:
1. (1,2) => with 2/2 chance selected_id = 1
2. (2,2) => with 2/4 chance selected_id = 2
3. (3,4) => with 4/8 chance selected_id = 3 => with 4/8 chance selected_id = 2
4. (4,10) => with 10/18 chance selected_id = 4 => with 8/18 chance selected_id = 2
Chances of "selected_id = 2" at the end: (2/4)*(4/8)*(8/18)=1/9

- koosha.nejad January 28, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I created a document which mathematically proves the answer given by autoboli is correct, If interested, please email me (koosha.nejad@gmail.com) and I'll send it to you.

- koosha.nejad January 28, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Assume the input from [0,W0], [1,W1],...[n,Wn], the selected index is i. Let totalWeight=sum(W0,...,Wn).
Now consider a new input [n+1,Wn+1], the probability to keep index I is totalWeight / (totalWeight + Wn+1), and probbability to have a new index n+1 is Wn+1 / (totalWeight + Wn+1).
See my analysis and code: allenlipeng47.com/PersonalPage/index/view/127/nkey

package feb;

public class ReturnWithProbabilityProportionalToItsWeight {

	public static void main(String[] args) {
		int[][] input = { { 1, 4 }, { 2, 2 }, { 3, 2 }, { 0, 0 } };
		int result = getRandom(input);
		System.out.println(result);
		/*
		 * Following runs for 1000 times to check the answer correctness.
		 * 
		 * int[] test = new int[3]; for (int i = 0; i < 1000; i++) {
		 * test[getRandom(input)]++; } for (int i = 0; i < test.length; i++) {
		 * System.out.print(test[i] + "\t"); }
		 */
	}

	public static int getRandom(int[][] input) {
		double totalWeight = input[0][1];
		int result = 0;
		for (int i = 1; i < input.length; i++) {
			if (input[i][0] == 0 && input[i][1] == 0) {
				break;
			}
			double newTotalWeight = totalWeight + input[i][1];
			double newChance = ((double) input[i][1]) / newTotalWeight;
			if (Math.random() <= newChance) {
				result = i;
			}
			totalWeight = newTotalWeight;
		}
		return result;
	}
}

- allen.lipeng47 February 01, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

This is the same as the algorithm "autoboli" suggested

- koosha.nejad February 02, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

Can you format better the question? #noob, #you

- wtf? March 01, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

ranjan

- ranjan April 17, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
-1
of 1 vote

Considering that we are allowed to modify the given list of pairs, then we can store the cumulative sum in the list itself.

Let list[i] store the cumulative sum till index i.
The weight of the given index can be retrieved as list[i] - list[i-1].

Once we have stored the cumulative sum (say, totalSum), we may call the random function as rand()*totalSum. As the cumulative sum array is an nondecreasing list we can do a binary search on the list to obtain the key corresponding to "rand()*totalSum".

Time complexity : O(n) to traverse the list + O(log n) to do the binary search.

Correct me if i am wrong.

- subrata.das.it.kgec January 27, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
1
of 1 vote

You can't store anything - not even members of the list itself - because of the O(1) memory limit and the fact your input is a stream (which, presumably, only offers some sort of getNext() API).

- demonix January 28, 2015 | Flag
Comment hidden because of low score. Click to expand.
-1
of 1 vote

class Program
{
static void Main(string[] args)
{
int size = Convert.ToInt32(Console.ReadLine());
int[] a = new int[size];
for (int i = 0; i < a.Length; i++)
{
a[i] = Convert.ToInt32(Console.ReadLine());
}

int unique = 0;
int x = a[0];
for (int i = 1; i < a.Length; i++)
x ^= a[i];

for (int i = 0; i < a.Length; i++)
{
if (Convert.ToBoolean(x) & Convert.ToBoolean(a[i]))
unique ^= a[i];

}

Console.WriteLine(unique);

}
}

- Anonymous January 28, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Note that Stream size and weights sum are unknown in advance, and you are allowed O(1) memory consumption. you using an array

- koosha.nejad January 28, 2015 | Flag
Comment hidden because of low score. Click to expand.
-1
of 1 vote

class Program
{
static void Main(string[] args)
{
int size = Convert.ToInt32(Console.ReadLine());
int[] a = new int[size];
for (int i = 0; i < a.Length; i++)
{
a[i] = Convert.ToInt32(Console.ReadLine());
}

int unique = 0;
int x = a[0];
for (int i = 1; i < a.Length; i++)
x ^= a[i];

for (int i = 0; i < a.Length; i++)
{
if (Convert.ToBoolean(x) & Convert.ToBoolean(a[i]))
unique ^= a[i];

}

Console.WriteLine(unique);

}
}

- vishal gupta January 28, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Note that Stream size and weights sum are unknown in advance, and you are allowed O(1) memory consumption. you cannot use an array

- koosha.nejad January 28, 2015 | Flag


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More