## Microsoft Interview Question Software Engineer / Developers

An array A[1...n] contains all the integers from 0 to n except one. In this problem, we cannot access an entire integer in A with a single operation. The elements of A are represented in binary, and the only operation we can use to access them is "fetch the jth bit of A[i]", which takes constant time. Find the missing integer in O(n) time.

Hi All. It seems none of you have figured out the correct solution.

Have a look:

If n is a k-bit binary number, we can view the array A as a k-by-n two-dimensional array such that A[i, j] is the ith most significant bit of the jth element of A. Examining an element of this 2D matrix has unit cost.

To begin with, assume n = 2^k - 1, so there are n + 1 = 2^k potential numbers. The basic idea is as follows:

half the potential numbers have a zero in their most significant bit position and half have a 1.

Thus we count the number of 0's and store this in Z. If Z equals 2^(k-1), then every number starting with a zero is present, and there are only n - 2^(k-1) = 2^(k-1) - 1 numbers starting with 1; so the missing number starts with a 1, and we can ignore all the numbers which start with a zero.

We now have a new problem of exactly the same form as the original: find the missing number among the 2^(k-1) - 1 numbers in our list which start with a 1, out of the 2^(k-1) possible numbers which start with a 1.

An auxiliary array B records the index of each number in A which starts with a 1.

Similarly, if Z = 2^(k-1) - 1, we know the missing value starts with a zero, and we focus on the locations in A whose first bit is zero.

We can then do exactly the same process on the next most significant bit, considering only the numbers that survived the first step.

We will again find whether our missing number has a zero or a one in its second most significant bit position, and will thus be able to eliminate half the "live" locations in A.

If there are m live positions in A, we can eliminate half of them in O(m) time, since we just need to check the current bit in each live position and record the index of each surviving number in our new list. Thus we get the recurrence T(n) = cn + T(n/2), which has solution T(n) = O(n)

(using the master method or just expanding it).

When n < 2^k - 1 we do not have an exactly even split at each step. However, we can easily calculate the ideal number of 0's and 1's at each step: if n is a k-bit number, there are exactly 2^(k-1) possible k-bit values which start with a 0, and (n + 1) - 2^(k-1) which start with a 1 (since there are n + 1 total values in the range 0..n). So we can tell which bit value is missing at each step.

Also, if we fix the values of the first b bits, the number of possible values remaining is always at most 2^(k-b), so this upper bound halves at each step.
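A minimal sketch of the elimination procedure above, in Python. `fetch_bit` stands in for the problem's constant-time "fetch the jth bit of A[i]" primitive, and the other names are illustrative, not from the original post:

```python
def fetch_bit(a, i, j):
    """Return bit j (0 = least significant) of a[i] in O(1)."""
    return (a[i] >> j) & 1

def find_missing(a, n):
    """Return the one value in 0..n that is absent from a (len(a) == n)."""
    k = n.bit_length()                 # bits needed to represent n
    live = list(range(len(a)))         # indices still consistent with the answer
    missing = 0                        # bits of the answer fixed so far
    for j in range(k - 1, -1, -1):     # MSB down to LSB
        zeros, ones = [], []
        for i in live:                 # one bit fetch per live index
            (ones if fetch_bit(a, i, j) else zeros).append(i)
        # Candidates still in play are missing .. min(missing + 2^(j+1) - 1, n);
        # those with bit j clear run from missing to missing + 2^j - 1 (capped at n).
        expected_zeros = min(missing + (1 << j) - 1, n) - missing + 1
        if len(zeros) < expected_zeros:
            live = zeros               # a zero-bit candidate is absent
        else:
            live = ones                # all zero-bit candidates are present
            missing |= 1 << j
    return missing
```

Each pass scans only the live indices, so the fetch count is n + n/2 + n/4 + ... = O(n), matching the recurrence above.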

Add all the integers and subtract the total from n(n+1)/2; the missing number is the difference.

Requires O(n) time.
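For what it's worth, the arithmetic version is a one-liner (the function name is made up here). Note that it assumes we can read whole integers, which the stated problem does not allow:

```python
def missing_by_sum(a, n):
    """Expected sum of 0..n minus the actual sum gives the missing value.
    Assumes whole-integer access, unlike the problem's bit-fetch model."""
    return n * (n + 1) // 2 - sum(a)
```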

This way takes 32 fetches to read a single integer,

so the total is O(32n), which is not so good.

@redroof

O(32n) is O(n); particularly when n is large, the constant hardly matters. Now, what is not good about it? Did you try to understand what is actually not good?

Each bit position has a predictable frequency of 1's among the numbers 0..n.

For n = 2^k - 1 that frequency is exactly 2^(k-1) at every position; for general n it can be computed directly from n.

Find the actual frequency of 1's at each bit position and compare it with the count that should be present. Each position where the count falls short by one gives a 1-bit of the missing number; the positions that match give 0-bits.

The solution that sums the numbers 0..n and subtracts the sum of the given numbers is not a good fit here. You are computing n(n+1)/2; if n is very large the sum will not fit in a 32-bit word, so the arithmetic cost grows with the number of bits needed to represent n, and in any case the problem only lets us fetch individual bits. The best solution is as follows:

Say the range is 0-7, so there are eight numbers and each needs 3 bits; write them all out in binary. Take the column holding the MSB. It should contain equal numbers of 1's and 0's, but since one element is missing it has one fewer of one of them. Count the 0's and 1's in the MSB column; say there are more 1's than 0's, which means the missing number has MSB 0. Repeat for the second column of bits, but now count 1's and 0's only over the numbers whose MSB is 0, so in this iteration you only scan n/2 elements. Continue until you are done with the LSB, so the total time taken is n + n/2 + n/4 + ... = O(n).

Aaron, as far as I know this technique only works when n + 1 is a power of 2, so the 0's and 1's split evenly. If it is not, this will fail.

Trying to extend Gaurav's method for n + 1 not a power of 2:

We know n, so for each bit we can work out how many 1's and 0's would have been present (beyond what we have now) if the array had actually contained all the numbers up to the next power of 2.

Say n is 10010111 (8 bits).

For the LSB: since n ends in 1, the numbers from n + 1 up to the all-ones value 11111111 contain equal numbers of 1's and 0's in that position. Had it ended in 0, there would have been extra 1's.

For the next bit: it is 1, and all bits to its right (in this case, only the one LSB) are 1, so again the 1's and 0's balance.

Same for the next bit.

The 4th bit is a 0, and all bits to its right are 1, so on the very next number this bit flips to 1 and stays 1 for 2^3 numbers (a block of eight consecutive 1's); after that there are equal numbers of 0's and 1's. So for this bit there are 8 extra 1's over 0's.

The 5th bit is a 1. The bits to its right are 0111. Let t = (2^4 - 1) - 0111, i.e. t = 8. As we progress from n + 1 up to the all-ones value, we first get t numbers where the 5th bit is still 1, and after that the 1's and 0's balance (because we have then reached the next aligned boundary).

So the GENERAL RESULT is: for any bit, if it is a 1 and the number formed by the q bits to its right is y, then that bit has 2^q - 1 - y extra 1's over 0's in the missing tail. If that bit is 0 instead, it stays 0 for only 2^q - 1 - y more numbers before flipping, then is 1 for a full block of 2^q numbers, leaving 2^q - (2^q - 1 - y) = y + 1 extra 1's over 0's. (Check against the 4th-bit example: q = 3, y = 7 gives 7 + 1 = 8 extra 1's.)

Using this general rule, we can compute the number of extra 1's or 0's that each bit would contribute had there not been a number missing. Then we can add up and find the actual difference in the number of 1's and 0's at every bit position in the array. Comparing the two, we know the value of each bit of the missing number.


varun's solution is good, but as was said, we can only access individual bits, not the entire number; we should use bit fetches to get the answer.

Yes, and n need not fit in a 32-bit integer. If you access every bit of every element, it is an O(n log n) solution.

@abe dhakkan & @chinni_chor : the following is the working solution for this problem,

Our convention is to call the least significant bit (LSB) the 0-th bit. Representing n requires at least ceil(lg(n + 1)) bits; write k = ceil(lg(n + 1)). The most significant bit (MSB) of n is then its (k - 1)-th bit. Among the integers between 0 and n, exactly 2^(k-1) numbers are < 2^(k-1), and looking at the (k - 1)-th bit of any non-negative number <= n tells us whether it is < 2^(k-1). We now look at the (k - 1)-th bit of all the numbers in A[1...n] and populate two lists of references (e.g. pointer, index, handle) during these first n lookups: references to the numbers having 0 in their (k - 1)-th bit go in one list, the rest in the other. The size of the first list is the count of numbers < 2^(k-1). The missing integer is < 2^(k-1) iff this count is 2^(k-1) - 1; in that case we look for the missing integer in the first list, and otherwise in the second. In this

process, we figure out the (k - 1)-th bit of the missing integer in n bit lookups.

We then use recursion to search the appropriate list. The fact that we now search a list of references to numbers instead of a list of numbers is handled as follows. The initial problem comes with the list of all the references. The shift of the search range in the subproblem would be handled by replacing all the (k - 1)-th bits with 0; however, since we never read the (k - 1)-th bits again, we do not have to make this modification explicitly.

Geez, it is just binary search.

Given n, you know how many numbers in 0..n should have a 1 or a 0 at each bit position,

so you just do a binary search:

check the first bit to see whether the missing number has a 0 or a 1 there,

then check the second bit, restricted to the group that contains the missing number, and find whether it is 0 or 1.

Keep going until you reach the last bit.

the time complexity is

n + n/2 + n/4 + ... ~= 2n

Guys correct me if i am wrong:

if the array is sorted, then the LSB will toggle every time:

```
bits | LSB
 00  |  0
 00  |  1
 01  |  0
```

Soln:

```
/* Assumes A is sorted. Then A[i] == i up to the gap, so the LSB of A[i]
 * matches the parity of i until the missing value; the first mismatch
 * marks the missing number, and if there is none the answer is n. */
long value = n;
for (long i = 0; i < n; i++) {
    if (fetch(0, A[i]) != (i & 1)) {  /* bit 0 of A[i] vs parity of i */
        value = i;
        break;
    }
}
printf("%ld", value);
```

```
// C# sketch: cyclic sort places each value v at index v - 1, then a scan
// finds the first index whose value is wrong. Assumes the values are
// 1..n+1 with one missing, and whole-integer access (not just bit fetches).
static int FindMissing(int[] a)
{
    int j;
    int k;
    // hash into same array
    for (int i = 0; i < a.Length; i++)
    {
        j = a[i];
        while (j >= 1 && j <= a.Length
               && j != i + 1 && a[j - 1] != j)
        {
            // swap a[i] and a[j-1]
            k = a[i];
            a[i] = a[j - 1];
            a[j - 1] = k;
            // repeat
            j = a[i];
        }
    }
    // find missing
    for (int i = 0; i < a.Length; i++)
    {
        if (i + 1 != a[i])
        {
            return i + 1;
        }
    }
    // none are missing
    return a.Length + 1;
}
```




Wouldn't it be possible to just construct the number by checking the bit counts at each position? For example, if we have the integers 0, 1, 2, 4, 5, 6, 7 (3 is missing), then at position 0 we see four 0's and three 1's, meaning we're missing a 1 at position 0. Position 1 is the same as position 0. At position 2 we see three 0's and four 1's, meaning we're missing a 0. So we get (011)b = (3)d.


If they are ordered integers, they should alternate even, odd, even, odd. You only need to check the last bit, and if it does not switch between 1 and 0, you know an integer is missing there. Take the index where the alternation breaks and you've found it.

People... am I missing something or are you confused?

The point is that we can't even read one integer completely into memory. All we can do is call a function fetch(i, j) which returns the jth bit of A[i], so at any moment we have access to ONE BIT OF ONE NUMBER in the array.

This cannot be done by n(n+1)/2-type methods.

Another one:

- varun_agg on June 12, 2008: If the numbers are too big, adding them may overflow.

Just take the XOR of all numbers in the array and then XOR the result with the numbers 0 to n.

The variable holding the running XOR then ends up equal to the missing number.
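A quick sketch of the XOR trick (function name illustrative; note it still assumes whole-integer reads, like the sum approach, but it cannot overflow):

```python
def missing_by_xor(a, n):
    """XOR every array value with every value in 0..n; each present
    value cancels itself, leaving only the missing number."""
    acc = 0
    for x in a:
        acc ^= x
    for v in range(n + 1):
        acc ^= v
    return acc
```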