Google Interview Question for Software Engineer / Developers


Country: United States
Interview Type: In-Person




Comment hidden because of low score. Click to expand.
3
of 11 vote

Why not to consider this as Bernoulli trial, where:
1/M - probability of single success (that a value goes to a single slot)
i - expected numbers of success (the success = the value goes exactly to the same slot)
N - numbers of tries
1 - 1/M - a probability of failure:

So then having a Bernoulli trial formula where (N i) stands for binomial coefficient:

P(i)=(N i) * (1/M)^i * (1 - 1/M) ^(N - i)

- joe kidd August 19, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

What if N=2*i? There is a case that 2 slots with i elements in each and you are counting this case twice. What if N=k*i?

- NersusZ September 17, 2013 | Flag
Comment hidden because of low score. Click to expand.
4
of 4 votes

But since there are M single slot for success, I think you should multiply it by M.

- Charlie September 18, 2013 | Flag
Comment hidden because of low score. Click to expand.
1
of 1 vote

I'm not sure what's posted is correct so far.

Looks like what joe kid posted gives the probability of this outcome:
(1) Exactly i items will land in one particular bucket b. All other N-i items land in any of the other buckets. To me, this seems incomplete because it considers just one special bucket b.

As a correction, Charlie suggests multiplication by M to make this the probability of exactly i items landing in *any* bucket. This is using the assumption that the outcomes defined by (1) are mutually exclusive for each bucket b, because you would need that assumption to use the sum all M of them to get the probability of any of them occuring. However, they are not mutually exclusive. You can have i in one bucket and also have i land in another bucket by chance.

- R September 26, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

you are correct, this is probably not correct solution. Because of that, I tried to come up with dynamic programming for this. Counting from the end, we can count multiple occurrence only once, by the most right occurrence only.

- mbriskar June 06, 2015 | Flag
Comment hidden because of low score. Click to expand.
1
of 3 vote

Base case, i=0.
Probability of an element hashing in a slot = 1/M
Probability of an element not hashing in a particular slot (ending in remaining M-1 slots) = (M-1)/M
All N elements not hashing in that slot = (M-1)/M * (M-1)/M * (M-1)/M ...
= ((M-1)/M)^N

Let us take the case of i=1.
First element can hash at any possible slot, probability = 1.
Now, probability of remaining N-1 elements not hashing in that particular slot = ((M-1)/M)^(N-1)


Let us take the case of i=2.
First element can hash at any possible slot, probability = 1.
Probability of second element going into same slot = 1/M
Now, probability of remaining N-2 elements not hashing in that particular slot = ((M-1)/M)^(N-2)
Probability of a slot having 2 elements = (1/M)*((M-1)/M)^(N-2)

To make it generic:

P(i) = ((M-1)/M)^N, when i=0
P(i) = (1/M)^(i-1) * ((M-1)/M)^N, otherwise

- Mukesh August 14, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Very clear explanation.
Sounds good to me.

- Anony1 November 30, 2013 | Flag
Comment hidden because of low score. Click to expand.
3
of 3 votes

I'm afraid it's wrong. You are calculating the probability that a particular slot has i elements, not "the probability that there will be a slot with i elements".

- henryadam December 09, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

henryadam, you're thinking of (1/M)^i * ((M-1)/M)^N

- Anonymous December 10, 2014 | Flag
Comment hidden because of low score. Click to expand.
1
of 3 vote

Hint: Inclusion-Exclusion Principle

- NotImplemented September 15, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
1
of 1 vote

Here is the only correct answer on the page, you can find the details in my blog post: goo.gl/Ex9an1

private static double calculateKcollisions(int m, int n, int k) {
    double probability = 0;

    int sign = 1;

    for (int i = 1; n - i * k >= 0 && i <= m; i++) {
      double p = binomial(m, i) * pow(1.0 / m, i * k);
      p *= pow((double) (m - i) / m, n - i * k);

      for (int j = 0; j < i; j++) {
        p *= binomial(n - j * k, k);
      }

      probability += sign * p;
      sign *= -1;
    }

    return probability;
  }

- baris evrim January 30, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

(1/M)^i (M-1/M)^(N-i)

- Den August 13, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

This is wrong. The above formula is for question "what is the probability that slot x(i.e. a particular slot) has i elements".

The question is asking what is the probability that THERE WILL BE A SLOT with i elements.

- Zoro August 13, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Do you mean (1/M)^i (1-1/M)^(N-i) ?

For example, say i=2 and suppose you have keys labeled K0, K1, K2... This would be the probability that a certain pair of keys, say K1 and K2 for example land in a particular spot (while all other keys land somewhere else). joe kid multilplies the above expression by N choose i and gets (for this example) the probably any possible pair landing in a particular bucket.

- R September 26, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

This is statistics math problem.

I would think it this way.

(1)P(there is one slot with i elements) = 1- P(there is no slot with i elements)

(2)P(there is no slot with i elements) = P(the first slot does not have i elements)*P(the second slot does not have i elements)*...*P(the last slot does not have i elements)

Because the probability of any slot does not have i elements is the same, we get

(2*) P(there is no slot with i elements) = P(a slot does not have i elements)^N

(3) P(a slot does not have i elements) = 1- P(a slot has i elements)

(4) P(a slot has i elements) = [(1/N)^i] * [(1-1/N)^(N-i)]

Use (1), (2*), (3), (4) to solve the problem.

- Zoro August 13, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

I'm not sure if (2*) is correct, because the probabilities of "a slot does not have i elements" are not independent, so can you simply multiply them?

- henryadam December 09, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

how about this:
(c(N,i)*(M-1)^N-i)/M^N

- ahmad August 13, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
1
of 1 vote

I might miss something, but you can choose the bucket in M ways, so you have to multiply numerator by M.

- pi3141592 September 02, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

The probability of an element being in one slot = 1/M
Consider i=2, probability that there will be a slot with i elements? will be 1/M*1/M

Since 1<i<N probability that there will be a slot with i elements = ∏ 1/M^i, where i=1 to N

- Cloudster August 14, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

# of ways to drop N samples into M slots = M^N (denote this as denominator).
binding i samples together, dropping the bundle into one of M slots, and dropping the other samples into the remaining M-1 slots, the number of ways to do this = C(N, i) * M * (M-1) ^ (N-i) (denote this as numerator).
The probability is therefore numerator / denominator.

- zf August 15, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Obviously your answer hasn't eliminated duplicates.
For example, N = 6, M =3, i =3. You can get the combination 123, 456, _ by binding 123 and 456 separately, thus you've counted it twice.
(123), 456, _
123, (456), _

- henryadam December 09, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

The probability that a particular slot has i elements = (1/M)^i * ((M-1)/M)^(N-i)
No. of ways that slot can be chosen = C(M,1) = M
Therefore probability that a slot has i elements = M * (1/M)^i * ((M-1)/M)^(N-i)

Note: There would be no C(N,i) because we do not get to choose which i elements go to that slot. It is already pre-decided by the hash function.

- karthikthehun August 23, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Probability of an event = No of occurences of an event / All possible event occurences

Total No of ways to put N objects in M slots = M^N
Total No of ways such that there is a slot with i objects = No of ways to put i objects in a slot * no of ways to put N-i objects in M-1 slots

No of ways to put i objects in 'a' slot = choose a slot * choose i objects * 1^i (since there is a singel way to put i objects into a single chosen slot)

No of ways to put i objects in 'a' slot = MC1 * NCi * 1
No of ways to put N-i objects in M-1 slots = (M-1) ^ (N-i)

Probability = MC1 * NCi * (M-1)^(N-i) / (M^N)


but I am worried about how to eliminate the duplicates.

- NiP P o August 27, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Let any one of m slots contains exactly i element and remaining n-i elements distributed among rest (m-i) slots =
C(m,1)*C(n,i)*((1/m)^i)*((1-1/m)^n-1) ..1

But above formula counts multiple time if there can be more than 1 slot with i elements. Probability that more than 1 slot with i element =
Summation over k = 2 to [n/i], (k-1)*C(m, k)*C(n, k*i)* C(k*i, i)*C((k-1)*i, i)*...C(i,i) *((1/m)^k*i)*((1-1/m)^(n-k*i)) ..2

Substract ..2 from ..1, we get the answer:
C(m,1)*C(n,i)*((1/m)^i)*((1-1/m)^n-1) -
Summation over k = 2 to [n/i], (k-1)*C(m, k)*C(n, k*i)* C(k*i, i)*C((k-1)*i, i)*...C(i,i) *((1/m)^k*i)*((1-1/m)^(n-k*i))

- Anonymous August 31, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Edit int

Let any one of m slots contains exactly i element and remaining n-i elements distributed among rest (m-i) slots =
C(m,1)*C(n,i)*((1/m)^i)*((1-1/m)^n-1) ..1

But above formula counts multiple time if there can be more than 1 slot with i elements. Probability that more than 1 slot with i element =
Summation over k = 1 to [n/2i], C(m, 2*k)*C(n, 2*k*i)* C(2*k*i, i)*C((2*k-1)*i, i)*...C(i,i) *((1/m)^2*k*i)*((1-1/m)^(n-2*k*i)) ..2
//Prob of odd no. of slots with i elements cancels out, so summation over 2k number of slots with i elements included

Substract ..2 from ..1, we get the answer:
C(m,1)*C(n,i)*((1/m)^i)*((1-1/m)^n-1) -
Summation over k = 1 to [n/2i], C(m, 2*k)*C(n, 2*k*i)* C(2*k*i, i)*C((2*k-1)*i, i)*...C(i,i) *((1/m)^2*k*i)*((1-1/m)^(n-2*k*i))

- Anonymous August 31, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

This follows the binomial distribution with parameters (N, 1/M) and so P(X = i) = N Ci (1/M)^i(M-1/M)N-i. This is because all M slots were equally probable. If the probabilities of falling into a slot varies, that would be a multinomial distribution(with different probabilities for individual trial for each slot)

- Anonymous September 01, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

My attempt, feels right but probably wrong.

What is the probability that there will be at least 1 of m slots with (exactly) k elements in a hash table with n inserted elements.

Assuming k < n.

Total ways to bucket n elements into m slots = m^n

Pr [ the first slot has exactly k elements ] =
= [number of ways n-k elements are in one of the other m - 1 buckets (which effectively enforces that k elements are in the first bucket)] / [number of ways n elements are in m slots]
= (m-1)^(n-k) / (m^n)

Pr [ exactly 0 slots have exactly k elements]
= product over i=1...m of Pr [ith slot does not have k elements]
= product over i=1...m {1 - Pr [ith slot has exactly k elements]}
= {1 - [(m-1)^(n-k) / m^n]}^m

Pr [ > 0 slots have exactly k elements] = 1 - Pr [ 0 slots have exactly k elements] = 1 - {1 - [(m-1)^(n-k) / m^n]}^m

- 6b72 September 02, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Pr("there exist a slot with exactly i items") = newtonSymbol(N,i) * M * (M-1)^(N-i) / M^N

- pi3141592 September 02, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

This is a problem of poisson approximation to the binomial distribution.

Total number of keys = N
In an average if two keys hash to the same bucket, then N = 2M

P(i) ==> Probability that i keys collide

P(i) = e ^ (-lambda) * (lamda^i) / i!

where lambda = np

where n = total number of keys = 2M
where p = probability that a bucket is chosen = 1/M

therefore lambda = 2.
Hence p(i) = e^-2 * (2^i)/i!

- dinesh October 03, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

try this:
(M+N-i-2)! / ((M-3)! * (N-i)!)

- jl9394 April 25, 2014 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

The solutions using Bernoulli trial are not correct, because they count multiple occurrence of i multiple times. They are quite close though.

Let me introduce dynamic programming that may be helpful with this problem. Hopefully there are no bugs, but if they are, feel free to raise your hands.
Let P(M,N) be a function that counts probability that within N fields, there will be at least one with exactly i matches from the N numbers hashed inside.

P(M,n < i) =0 
P(0,i) =0 
P(M,N) = sum (for x <- 0..N:
			if( x==i) :"probability that exactly x integers matches into Nth slot"
                        else: "probability that exactly x integers matches into Nth slot" * P(M-1,N-x)
                 )

The probability in quotes is (1/M)^x * ( (M-1)/M ^ (N-x) ) * (n choose x)

- mbriskar June 06, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

We first calculate the probability P that none of the slots have i elements, then the answer should be 1 - P.

Calculating P is a traditional interview question:
find all possible M non-negative (<=N) numbers such that their sum is N and none of them == i.

This can be done by a recursion algorithm.

std::vector<std::size_t> v;
    v.reserve(M);

    double g() {
       return (1/M)^N;
    }

    void f(std::vector<std::size_t> &v, int i, double &p, int sum) {
       assert(i<=M);
       if (i == M) {
          p += g();
          return;
       }
       for (int j=0; j<N; ++j) {
          if (sum + j < N) {
              v[i] = j;
              f(i+1, v, p, sum+j);
          }
       }
    }

    f(0, v, 0, 0);

- zjzzjz August 28, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
-1
of 1 vote

A typo in my above anwser:

(2*) P(there is no slot with i elements) = P(a slot does not have i elements)^M

(4) P(a slot has i elements) = [(1/M)^i] * [(1-1/M)^(N-i)]

- Zoro August 13, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

If M is 1, then (1-1/M)^(N-i) becomes 0, so the probability becomes 0, when it should be 1.

- Alex August 13, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Alex, lets say we have only one slot: M = 1. The values are always hashed in the same slot. So, we have only 2 possible results: probability = 1 if i = N and 0 otherwise.

N = i, leads to (1-1/M)^(N-i) = 0^0 = 1 by convention
N != i leads to 0

the formula gives the expected values

- Miguel Oliveira August 13, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

I believe your answer is correct, except for one small thing in Step 4. The expression should be multiplied by c(N, i).

- Anish August 14, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

@Zoro, can you explain how did you reach the formula in step 4? I think, it should be (1/M)^(i-1) in the first part. See explanation below.

- Mukesh August 14, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Let's re-sentence (2*) (3) (4) to avoid confusion here:

(2*) P(there is no slot with i elements) = P(the first slot does not have i elements)^M

(3) P(the first slot does not have i elements) = 1- P(the first slot has i elements)

(4) P(the first slot has i elements) = [(1/N)^i] * [(1-1/N)^(N-i)]

- Zoro August 14, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

The reasoning starts from (1) and goes to (4), to solve the problem, solve the equations in reverse order to solve (1).

(1)P(there is one slot with i elements) = 1- P(there is no slot with i elements)

(2)P(there is no slot with i elements) = P(the first slot does not have i elements)*P(the second slot does not have i elements)*...*P(the last slot does not have i elements)

(2)->(2*) because the slots have equal chance of having any element fall into it

(2*) P(there is no slot with i elements) = P(the first slot does not have i elements)^M

(3) P(the first slot does not have i elements) = 1- P(the first slot has i elements)

(4) P(the first slot has i elements) = [(1/M)^i] * [(1-1/M)^(N-i)]

- Zoro August 14, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

I just realize that Anish made a great point. To have i elements in the first slot, you are freely choosing any i from the N, therefore there's cases, each with the same probability.

(4) P(the first slot has i elements) = [(1/M)^i] * [(1-1/M)^(N-i)] * C(N,i)

- Zoro August 14, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

step 1. (1)P(there is one slot with i elements) = 1- P(there is no slot with i elements)
should it be like (1)P(there is one slot with i elements) = 1- P(there is no slot with i elements) - P(there is two slots with i elements) ...etc?

- zwd August 14, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 2 votes

This is incorrect. Suppose M = 10, N=1, and i=1. The formula does not yield the expected value of 1. The flaw in the analysis above is that (2) does not hold because the events are not independent.

- eugene.yarovoi August 15, 2013 | Flag
Comment hidden because of low score. Click to expand.
-1
of 1 vote

Probability of 1 slot having i elements P(1)=

(1/M)^i * ((M-1)/M)^N-i

Probability of 2 slots having i elements P(2)=

(1/M)^2i * ((M-1)/M)^N-2i

Probability of 3 slots having i elements P(3)=

(1/M)^3i * ((M-1)/M)^N-3i

...
and so on, so the probability of having A slot with i elements is: P(1)+P(2)+....+P(M)
but I am not sure of a formula to generalize this equation.

- IntwPrep August 19, 2013 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More