## Google Interview Question for Software Engineer / Developers

• 10

Country: United States
Interview Type: In-Person

Comment hidden because of low score. Click to expand.
3
of 11 vote

Why not to consider this as Bernoulli trial, where:
1/M - probability of single success (that a value goes to a single slot)
i - expected numbers of success (the success = the value goes exactly to the same slot)
N - numbers of tries
1 - 1/M - a probability of failure:

So then having a Bernoulli trial formula where (N i) stands for binomial coefficient:

P(i)=(N i) * (1/M)^i * (1 - 1/M) ^(N - i)

Comment hidden because of low score. Click to expand.
0

What if N=2*i? There is a case that 2 slots with i elements in each and you are counting this case twice. What if N=k*i?

Comment hidden because of low score. Click to expand.
5

But since there are M single slot for success, I think you should multiply it by M.

Comment hidden because of low score. Click to expand.
1
of 1 vote

I'm not sure what's posted is correct so far.

Looks like what joe kid posted gives the probability of this outcome:
(1) Exactly i items will land in one particular bucket b. All other N-i items land in any of the other buckets. To me, this seems incomplete because it considers just one special bucket b.

As a correction, Charlie suggests multiplication by M to make this the probability of exactly i items landing in *any* bucket. This is using the assumption that the outcomes defined by (1) are mutually exclusive for each bucket b, because you would need that assumption to use the sum all M of them to get the probability of any of them occuring. However, they are not mutually exclusive. You can have i in one bucket and also have i land in another bucket by chance.

Comment hidden because of low score. Click to expand.
0

you are correct, this is probably not correct solution. Because of that, I tried to come up with dynamic programming for this. Counting from the end, we can count multiple occurrence only once, by the most right occurrence only.

Comment hidden because of low score. Click to expand.
1
of 3 vote

Base case, i=0.
Probability of an element hashing in a slot = 1/M
Probability of an element not hashing in a particular slot (ending in remaining M-1 slots) = (M-1)/M
All N elements not hashing in that slot = (M-1)/M * (M-1)/M * (M-1)/M ...
= ((M-1)/M)^N

Let us take the case of i=1.
First element can hash at any possible slot, probability = 1.
Now, probability of remaining N-1 elements not hashing in that particular slot = ((M-1)/M)^(N-1)

Let us take the case of i=2.
First element can hash at any possible slot, probability = 1.
Probability of second element going into same slot = 1/M
Now, probability of remaining N-2 elements not hashing in that particular slot = ((M-1)/M)^(N-2)
Probability of a slot having 2 elements = (1/M)*((M-1)/M)^(N-2)

To make it generic:

P(i) = ((M-1)/M)^N, when i=0
P(i) = (1/M)^(i-1) * ((M-1)/M)^N, otherwise

Comment hidden because of low score. Click to expand.
0

Very clear explanation.
Sounds good to me.

Comment hidden because of low score. Click to expand.
3

I'm afraid it's wrong. You are calculating the probability that a particular slot has i elements, not "the probability that there will be a slot with i elements".

Comment hidden because of low score. Click to expand.
0

henryadam, you're thinking of (1/M)^i * ((M-1)/M)^N

Comment hidden because of low score. Click to expand.
1
of 3 vote

Hint: Inclusion-Exclusion Principle

Comment hidden because of low score. Click to expand.
1
of 1 vote

Here is the only correct answer on the page, you can find the details in my blog post: goo.gl/Ex9an1

``````private static double calculateKcollisions(int m, int n, int k) {
double probability = 0;

int sign = 1;

for (int i = 1; n - i * k >= 0 && i <= m; i++) {
double p = binomial(m, i) * pow(1.0 / m, i * k);
p *= pow((double) (m - i) / m, n - i * k);

for (int j = 0; j < i; j++) {
p *= binomial(n - j * k, k);
}

probability += sign * p;
sign *= -1;
}

return probability;
}``````

Comment hidden because of low score. Click to expand.
0
of 0 vote

(1/M)^i (M-1/M)^(N-i)

Comment hidden because of low score. Click to expand.
0

This is wrong. The above formula is for question "what is the probability that slot x(i.e. a particular slot) has i elements".

The question is asking what is the probability that THERE WILL BE A SLOT with i elements.

Comment hidden because of low score. Click to expand.
0

Do you mean (1/M)^i (1-1/M)^(N-i) ?

For example, say i=2 and suppose you have keys labeled K0, K1, K2... This would be the probability that a certain pair of keys, say K1 and K2 for example land in a particular spot (while all other keys land somewhere else). joe kid multilplies the above expression by N choose i and gets (for this example) the probably any possible pair landing in a particular bucket.

Comment hidden because of low score. Click to expand.
0
of 0 vote

This is statistics math problem.

I would think it this way.

(1)P(there is one slot with i elements) = 1- P(there is no slot with i elements)

(2)P(there is no slot with i elements) = P(the first slot does not have i elements)*P(the second slot does not have i elements)*...*P(the last slot does not have i elements)

Because the probability of any slot does not have i elements is the same, we get

(2*) P(there is no slot with i elements) = P(a slot does not have i elements)^N

(3) P(a slot does not have i elements) = 1- P(a slot has i elements)

(4) P(a slot has i elements) = [(1/N)^i] * [(1-1/N)^(N-i)]

Use (1), (2*), (3), (4) to solve the problem.

Comment hidden because of low score. Click to expand.
0

I'm not sure if (2*) is correct, because the probabilities of "a slot does not have i elements" are not independent, so can you simply multiply them?

Comment hidden because of low score. Click to expand.
0
of 0 vote

(c(N,i)*(M-1)^N-i)/M^N

Comment hidden because of low score. Click to expand.
1
of 1 vote

I might miss something, but you can choose the bucket in M ways, so you have to multiply numerator by M.

Comment hidden because of low score. Click to expand.
0
of 0 vote

The probability of an element being in one slot = 1/M
Consider i=2, probability that there will be a slot with i elements? will be 1/M*1/M

Since 1<i<N probability that there will be a slot with i elements = ∏ 1/M^i, where i=1 to N

Comment hidden because of low score. Click to expand.
0
of 0 vote

# of ways to drop N samples into M slots = M^N (denote this as denominator).
binding i samples together, dropping the bundle into one of M slots, and dropping the other samples into the remaining M-1 slots, the number of ways to do this = C(N, i) * M * (M-1) ^ (N-i) (denote this as numerator).
The probability is therefore numerator / denominator.

Comment hidden because of low score. Click to expand.
0

For example, N = 6, M =3, i =3. You can get the combination 123, 456, _ by binding 123 and 456 separately, thus you've counted it twice.
(123), 456, _
123, (456), _

Comment hidden because of low score. Click to expand.
0
of 0 vote

The probability that a particular slot has i elements = (1/M)^i * ((M-1)/M)^(N-i)
No. of ways that slot can be chosen = C(M,1) = M
Therefore probability that a slot has i elements = M * (1/M)^i * ((M-1)/M)^(N-i)

Note: There would be no C(N,i) because we do not get to choose which i elements go to that slot. It is already pre-decided by the hash function.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Probability of an event = No of occurences of an event / All possible event occurences

Total No of ways to put N objects in M slots = M^N
Total No of ways such that there is a slot with i objects = No of ways to put i objects in a slot * no of ways to put N-i objects in M-1 slots

No of ways to put i objects in 'a' slot = choose a slot * choose i objects * 1^i (since there is a singel way to put i objects into a single chosen slot)

No of ways to put i objects in 'a' slot = MC1 * NCi * 1
No of ways to put N-i objects in M-1 slots = (M-1) ^ (N-i)

Probability = MC1 * NCi * (M-1)^(N-i) / (M^N)

but I am worried about how to eliminate the duplicates.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Let any one of m slots contains exactly i element and remaining n-i elements distributed among rest (m-i) slots =
C(m,1)*C(n,i)*((1/m)^i)*((1-1/m)^n-1) ..1

But above formula counts multiple time if there can be more than 1 slot with i elements. Probability that more than 1 slot with i element =
Summation over k = 2 to [n/i], (k-1)*C(m, k)*C(n, k*i)* C(k*i, i)*C((k-1)*i, i)*...C(i,i) *((1/m)^k*i)*((1-1/m)^(n-k*i)) ..2

Substract ..2 from ..1, we get the answer:
C(m,1)*C(n,i)*((1/m)^i)*((1-1/m)^n-1) -
Summation over k = 2 to [n/i], (k-1)*C(m, k)*C(n, k*i)* C(k*i, i)*C((k-1)*i, i)*...C(i,i) *((1/m)^k*i)*((1-1/m)^(n-k*i))

Comment hidden because of low score. Click to expand.
0

Edit int

Let any one of m slots contains exactly i element and remaining n-i elements distributed among rest (m-i) slots =
C(m,1)*C(n,i)*((1/m)^i)*((1-1/m)^n-1) ..1

But above formula counts multiple time if there can be more than 1 slot with i elements. Probability that more than 1 slot with i element =
Summation over k = 1 to [n/2i], C(m, 2*k)*C(n, 2*k*i)* C(2*k*i, i)*C((2*k-1)*i, i)*...C(i,i) *((1/m)^2*k*i)*((1-1/m)^(n-2*k*i)) ..2
//Prob of odd no. of slots with i elements cancels out, so summation over 2k number of slots with i elements included

Substract ..2 from ..1, we get the answer:
C(m,1)*C(n,i)*((1/m)^i)*((1-1/m)^n-1) -
Summation over k = 1 to [n/2i], C(m, 2*k)*C(n, 2*k*i)* C(2*k*i, i)*C((2*k-1)*i, i)*...C(i,i) *((1/m)^2*k*i)*((1-1/m)^(n-2*k*i))

Comment hidden because of low score. Click to expand.
0
of 0 vote

This follows the binomial distribution with parameters (N, 1/M) and so P(X = i) = N Ci (1/M)^i(M-1/M)N-i. This is because all M slots were equally probable. If the probabilities of falling into a slot varies, that would be a multinomial distribution(with different probabilities for individual trial for each slot)

Comment hidden because of low score. Click to expand.
0
of 0 vote

My attempt, feels right but probably wrong.

What is the probability that there will be at least 1 of m slots with (exactly) k elements in a hash table with n inserted elements.

Assuming k < n.

Total ways to bucket n elements into m slots = m^n

Pr [ the first slot has exactly k elements ] =
= [number of ways n-k elements are in one of the other m - 1 buckets (which effectively enforces that k elements are in the first bucket)] / [number of ways n elements are in m slots]
= (m-1)^(n-k) / (m^n)

Pr [ exactly 0 slots have exactly k elements]
= product over i=1...m of Pr [ith slot does not have k elements]
= product over i=1...m {1 - Pr [ith slot has exactly k elements]}
= {1 - [(m-1)^(n-k) / m^n]}^m

Pr [ > 0 slots have exactly k elements] = 1 - Pr [ 0 slots have exactly k elements] = 1 - {1 - [(m-1)^(n-k) / m^n]}^m

Comment hidden because of low score. Click to expand.
0
of 0 vote

Pr("there exist a slot with exactly i items") = newtonSymbol(N,i) * M * (M-1)^(N-i) / M^N

Comment hidden because of low score. Click to expand.
0
of 0 vote

This is a problem of poisson approximation to the binomial distribution.

Total number of keys = N
In an average if two keys hash to the same bucket, then N = 2M

P(i) ==> Probability that i keys collide

P(i) = e ^ (-lambda) * (lamda^i) / i!

where lambda = np

where n = total number of keys = 2M
where p = probability that a bucket is chosen = 1/M

therefore lambda = 2.
Hence p(i) = e^-2 * (2^i)/i!

Comment hidden because of low score. Click to expand.
0
of 0 vote

try this:
(M+N-i-2)! / ((M-3)! * (N-i)!)

Comment hidden because of low score. Click to expand.
0
of 0 vote

The solutions using Bernoulli trial are not correct, because they count multiple occurrence of i multiple times. They are quite close though.

Let me introduce dynamic programming that may be helpful with this problem. Hopefully there are no bugs, but if they are, feel free to raise your hands.
Let P(M,N) be a function that counts probability that within N fields, there will be at least one with exactly i matches from the N numbers hashed inside.

``````P(M,n < i) =0
P(0,i) =0
P(M,N) = sum (for x <- 0..N:
if( x==i) :"probability that exactly x integers matches into Nth slot"
else: "probability that exactly x integers matches into Nth slot" * P(M-1,N-x)
)``````

The probability in quotes is (1/M)^x * ( (M-1)/M ^ (N-x) ) * (n choose x)

Comment hidden because of low score. Click to expand.
0
of 0 vote

We first calculate the probability P that none of the slots have i elements, then the answer should be 1 - P.

Calculating P is a traditional interview question:
find all possible M non-negative (<=N) numbers such that their sum is N and none of them == i.

This can be done by a recursion algorithm.

``````std::vector<std::size_t> v;
v.reserve(M);

double g() {
return (1/M)^N;
}

void f(std::vector<std::size_t> &v, int i, double &p, int sum) {
assert(i<=M);
if (i == M) {
p += g();
return;
}
for (int j=0; j<N; ++j) {
if (sum + j < N) {
v[i] = j;
f(i+1, v, p, sum+j);
}
}
}

f(0, v, 0, 0);``````

Comment hidden because of low score. Click to expand.
0
of 0 vote

Isn't the question "what the probability that there's at least one slot with i elements" ?

``````Let
N = 2
M = 4

Then….
0 0 0 2
0 0 2 0
0 2 0 0
2 0 0 0
0 0 1 1
0 1 0 1
1 0 0 1
0 1 1 0
1 0 1 0
1 1 0 0
=> Total 10 cases
(now......... "i" can be anywhere from 0 to N)
Assume i =  0?
100%: all cases have at least one slot with 0 element
Assume i =  1?
60%: 6 out of 10 cases have at least one slot with 1 element
Assume i =  2?
40%``````

Comment hidden because of low score. Click to expand.
-1
of 1 vote

A typo in my above anwser:

(2*) P(there is no slot with i elements) = P(a slot does not have i elements)^M

(4) P(a slot has i elements) = [(1/M)^i] * [(1-1/M)^(N-i)]

Comment hidden because of low score. Click to expand.
0

If M is 1, then (1-1/M)^(N-i) becomes 0, so the probability becomes 0, when it should be 1.

Comment hidden because of low score. Click to expand.
0

Alex, lets say we have only one slot: M = 1. The values are always hashed in the same slot. So, we have only 2 possible results: probability = 1 if i = N and 0 otherwise.

N = i, leads to (1-1/M)^(N-i) = 0^0 = 1 by convention
N != i leads to 0

the formula gives the expected values

Comment hidden because of low score. Click to expand.
0

I believe your answer is correct, except for one small thing in Step 4. The expression should be multiplied by c(N, i).

Comment hidden because of low score. Click to expand.
0

@Zoro, can you explain how did you reach the formula in step 4? I think, it should be (1/M)^(i-1) in the first part. See explanation below.

Comment hidden because of low score. Click to expand.
0

Let's re-sentence (2*) (3) (4) to avoid confusion here:

(2*) P(there is no slot with i elements) = P(the first slot does not have i elements)^M

(3) P(the first slot does not have i elements) = 1- P(the first slot has i elements)

(4) P(the first slot has i elements) = [(1/N)^i] * [(1-1/N)^(N-i)]

Comment hidden because of low score. Click to expand.
0

The reasoning starts from (1) and goes to (4), to solve the problem, solve the equations in reverse order to solve (1).

(1)P(there is one slot with i elements) = 1- P(there is no slot with i elements)

(2)P(there is no slot with i elements) = P(the first slot does not have i elements)*P(the second slot does not have i elements)*...*P(the last slot does not have i elements)

(2)->(2*) because the slots have equal chance of having any element fall into it

(2*) P(there is no slot with i elements) = P(the first slot does not have i elements)^M

(3) P(the first slot does not have i elements) = 1- P(the first slot has i elements)

(4) P(the first slot has i elements) = [(1/M)^i] * [(1-1/M)^(N-i)]

Comment hidden because of low score. Click to expand.
0

I just realize that Anish made a great point. To have i elements in the first slot, you are freely choosing any i from the N, therefore there's cases, each with the same probability.

(4) P(the first slot has i elements) = [(1/M)^i] * [(1-1/M)^(N-i)] * C(N,i)

Comment hidden because of low score. Click to expand.
0

step 1. (1)P(there is one slot with i elements) = 1- P(there is no slot with i elements)
should it be like (1)P(there is one slot with i elements) = 1- P(there is no slot with i elements) - P(there is two slots with i elements) ...etc?

Comment hidden because of low score. Click to expand.
0

This is incorrect. Suppose M = 10, N=1, and i=1. The formula does not yield the expected value of 1. The flaw in the analysis above is that (2) does not hold because the events are not independent.

Comment hidden because of low score. Click to expand.
-1
of 1 vote

Probability of 1 slot having i elements P(1)=

``(1/M)^i * ((M-1)/M)^N-i``

Probability of 2 slots having i elements P(2)=

``(1/M)^2i * ((M-1)/M)^N-2i``

Probability of 3 slots having i elements P(3)=

``(1/M)^3i * ((M-1)/M)^N-3i``

...
and so on, so the probability of having A slot with i elements is: P(1)+P(2)+....+P(M)
but I am not sure of a formula to generalize this equation.

Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

### Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

### Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.