## Google Interview Question for Software Engineer / Developers

Country: Israel
Interview Type: In-Person

Comment hidden because of low score. Click to expand.
8
of 10 vote

If there was no weight the problem would be to sample from a stream with equal probability: Sample the first input from the stream. Given the k-th input from the stream, generate random number p=1..k, if p=1 replace the sample.
Now, the problem is that selection is no longer uniform but driven by 'weight'. One can use very similar approach: (i) cumulate the weights seen so far and (ii) decide wheather to replace based on current weight and cumulated value.

A sample code is shown below:

``````public int getRandom(InputStream str) {
Scanner sc = new Scanner(str);
int sum = 0, sample;

while (sc.hasNext()) {
int val = sc.nextInt();
int wgt = sc.nextInt();
if (wgt == 0 && val == 0) 	break;
sum += wgt;
int p = Math.random()*(double) sum;
if (p <= wgt)				sample = val;
}

return sample;
}``````

EDIT:
As some people requested, here is a proof of the simpler problem:

The goal is to find a random sample from the stream. After receiving the k-th input, the sample will be replaced with probability 1/k with this new input. Let's compute a probability that the first input is still the selected ranodm sample after k iterations. Several consecutive events must happen: (i) The first input is selected as a sample and (ii) the sample is never replaced afterwards.

The first input is selected with probability p1 = 1/k = 1/1, the j-th input does not replace the sample with probability not_pj = (j-1)/j. Hence, the probability of the first input is a sample after k-th iteration can be written as:
p1_isSample = p1 * not_p2 * not_p3 ... * not_pk= 1/1 * 1/2 * 2/3 ... (k-1)/k = 1/k

Notice that this result holds for the second, third, and any other n-th element
pn_isSample = 1/n * n/(n+1) * (n+1)/(n+2) ... (k-1)/k = 1/k

hence, each element can be selected as the sample with equal probability of 1/k.
Q.E.D.

This proof can be now generalized to the non-uniform distribution case with weights.

Comment hidden because of low score. Click to expand.
0

Shouldn`t we move random p * sum and condition outside the while loop and before the return?

Comment hidden because of low score. Click to expand.
0

Got it......we have to replace the sample....so the return should be sample not val.

Comment hidden because of low score. Click to expand.
0

@autoboli, excellent approach to think about simpler problem first, but you don't even have solution for a simpler problem. We don't know number of elements ahead of time, how can we select k?

Comment hidden because of low score. Click to expand.
0

Regarding replacing - are you sure you are not giving preference based on order of appearance?

Comment hidden because of low score. Click to expand.
0

@Sai, thx a lot for pointing the mistake at return statement. Fixed!

Comment hidden because of low score. Click to expand.
0

@CT: For simple problem the 'k' is number of samples seen so far. One can compute the probability of each sample being replaced after 'k-th' sample appears. It can be proven that for each sample, the probability is equal. Now, the problem is essentially the same except that probability being sampled is driven by weight.

Comment hidden because of low score. Click to expand.
0

@autoboli, let's just talk about simple version for the time being. It seems to me you are talking about extending selection from k-1 elements to selection from k elements. Each element has 1/k probability to be selected, including the last one, I got this part about the last one, it will be selected with probability 1/k. The other ones will be selected with probability (k-1/k)/(k-1) =1/k, Wow, it works ...

Comment hidden because of low score. Click to expand.
0

Excactly ;)

Comment hidden because of low score. Click to expand.
0

Incredible solution, autoboli. Probabilities are indeed equal. I'd never have thought it myself.

Comment hidden because of low score. Click to expand.
0

Thank you Victor! You would, maybe I would take some time but finally you would :)

Comment hidden because of low score. Click to expand.
0

Any link to a theory of why this works? Meaning why samples have equal probability of being chosen (according to the weights) in this approach. I think I grasp how it works, but not quite why. Any theorem, rule, or Wikipedia article?

Comment hidden because of low score. Click to expand.
0

@Valoris: Samples with higher weights have better chances of being chosen, now consider a sample with some weight, if it's near the beginning, chances of being chosen is high, but chances of being replaced is high as well. if it's near the end, chances of being chosen is low, but chances of being replaced is low as well.
Bottom line, no matter where a sample is, it's chances of being chosen is proportional to its weight

Comment hidden because of low score. Click to expand.
0

I created a document which mathematically proves the answer given by autoboli is correct, If interested, please email me (koosha.nejad@gmail.com) and I'll send it to you.

Comment hidden because of low score. Click to expand.
0

@valoris, I put some proof sketch below the code above. I hope it will help to understand the concept :)

Comment hidden because of low score. Click to expand.
0

That is a variation of reservoir sampling problem

Comment hidden because of low score. Click to expand.
0
of 0 vote

Start from the beginning of the stream and keep the unique id with probability w[i] / sum[i], where w[i] is the weight of the i'th element in the stream and sum[i] is the sum of weights up to i'th element.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Currently the getRandom() function is O(n). We can reduce it to O(log n) if we sort the tuples in order of non-decreasing weights and then do binary search.

``````import java.util.Scanner;
import java.util.List;

class WeightedProbability {
public static void main (String [] args) {
Scanner s = new Scanner(System.in);
int totalWeight = 0;
Tuple tuple;
do {
totalWeight += tuple.weight;
} while (!tuple.isLastTuple());
// Remove the last tuple from the set
for (int i = 0; i < 100; i++) {
getRandom(tuples, totalWeight);
}

}

public static Tuple readTuple(Scanner s) {
Tuple t = new Tuple();
t.num = s.nextInt();
t.weight = s.nextInt();
return t;
}

public static int getRandom(List<Tuple> tuples, int totalWeight) {
int rand = (int)(Math.random() * totalWeight);
System.out.println("rand = " + rand);
int i = 0;
Tuple t = null;
while (rand >= 0) {
t = tuples.get(i);
rand -= t.weight;
i++;
}
System.out.println("Number = " + t.num);
return t.num;
}
}

class Tuple {
int num;
int weight;

boolean isLastTuple() {
if (num == 0 && weight == 0) {
return true;
}
return false;
}
}``````

Comment hidden because of low score. Click to expand.
0
of 0 vote

Keep track of the total_weight;
At each iteration (id,weight)
• total_weight += weight
• x = generate a random number between 1 and total_weight
• if (total_weight - x < weight ) => selected_id = id
Trace:
1. (1,2) => with 2/2 chance selected_id = 1
2. (2,2) => with 2/4 chance selected_id = 2
3. (3,4) => with 4/8 chance selected_id = 3 => with 4/8 chance selected_id = 2
4. (4,10) => with 10/18 chance selected_id = 4 => with 8/18 chance selected_id = 2
Chances of "selected_id = 2" at the end: (2/4)*(4/8)*(8/18)=1/9

Comment hidden because of low score. Click to expand.
0
of 0 vote

I created a document which mathematically proves the answer given by autoboli is correct, If interested, please email me (koosha.nejad@gmail.com) and I'll send it to you.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Assume the input from [0,W0], [1,W1],...[n,Wn], the selected index is i. Let totalWeight=sum(W0,...,Wn).
Now consider a new input [n+1,Wn+1], the probability to keep index I is totalWeight / (totalWeight + Wn+1), and probbability to have a new index n+1 is Wn+1 / (totalWeight + Wn+1).
See my analysis and code: allenlipeng47.com/PersonalPage/index/view/127/nkey

``````package feb;

public class ReturnWithProbabilityProportionalToItsWeight {

public static void main(String[] args) {
int[][] input = { { 1, 4 }, { 2, 2 }, { 3, 2 }, { 0, 0 } };
int result = getRandom(input);
System.out.println(result);
/*
* Following runs for 1000 times to check the answer correctness.
*
* int[] test = new int[3]; for (int i = 0; i < 1000; i++) {
* test[getRandom(input)]++; } for (int i = 0; i < test.length; i++) {
* System.out.print(test[i] + "\t"); }
*/
}

public static int getRandom(int[][] input) {
double totalWeight = input[0][1];
int result = 0;
for (int i = 1; i < input.length; i++) {
if (input[i][0] == 0 && input[i][1] == 0) {
break;
}
double newTotalWeight = totalWeight + input[i][1];
double newChance = ((double) input[i][1]) / newTotalWeight;
if (Math.random() <= newChance) {
result = i;
}
totalWeight = newTotalWeight;
}
return result;
}
}``````

Comment hidden because of low score. Click to expand.
0

This is the same as the algorithm "autoboli" suggested

Comment hidden because of low score. Click to expand.
0
of 0 vote

Can you format better the question? #noob, #you

Comment hidden because of low score. Click to expand.
0
of 0 vote

ranjan

Comment hidden because of low score. Click to expand.
-1
of 1 vote

Considering that we are allowed to modify the given list of pairs, then we can store the cumulative sum in the list itself.

Let list[i] store the cumulative sum till index i.
The weight of the given index can be retrieved as list[i] - list[i-1].

Once we have stored the cumulative sum (say, totalSum), we may call the random function as rand()*totalSum. As the cumulative sum array is an nondecreasing list we can do a binary search on the list to obtain the key corresponding to "rand()*totalSum".

Time complexity : O(n) to traverse the list + O(log n) to do the binary search.

Correct me if i am wrong.

Comment hidden because of low score. Click to expand.
1
of 1 vote

You can't store anything - not even members of the list itself - because of the O(1) memory limit and the fact your input is a stream (which, presumably, only offers some sort of getNext() API).

Comment hidden because of low score. Click to expand.
-1
of 1 vote

class Program
{
static void Main(string[] args)
{
int[] a = new int[size];
for (int i = 0; i < a.Length; i++)
{
}

int unique = 0;
int x = a[0];
for (int i = 1; i < a.Length; i++)
x ^= a[i];

for (int i = 0; i < a.Length; i++)
{
if (Convert.ToBoolean(x) & Convert.ToBoolean(a[i]))
unique ^= a[i];

}

Console.WriteLine(unique);

}
}

Comment hidden because of low score. Click to expand.
0

Note that Stream size and weights sum are unknown in advance, and you are allowed O(1) memory consumption. you using an array

Comment hidden because of low score. Click to expand.
-1
of 1 vote

class Program
{
static void Main(string[] args)
{
int[] a = new int[size];
for (int i = 0; i < a.Length; i++)
{
}

int unique = 0;
int x = a[0];
for (int i = 1; i < a.Length; i++)
x ^= a[i];

for (int i = 0; i < a.Length; i++)
{
if (Convert.ToBoolean(x) & Convert.ToBoolean(a[i]))
unique ^= a[i];

}

Console.WriteLine(unique);

}
}

Comment hidden because of low score. Click to expand.
0

Note that Stream size and weights sum are unknown in advance, and you are allowed O(1) memory consumption. you cannot use an array

Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

### Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

### Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.