## Microsoft Interview Question

Software Engineer / Developersint FindFirstDuplicate(int[] arr,int k)

{

Hashtable hash = new Hashtable();

int i = 0;

for (; i < k + 1; i++)

{

if (hash[arr[i]] == null)

{

hash.Add(arr[i], arr[i]);

}

else

break;

}

return i;

}

k=11 here

int a[]={2,5,3,7,3,9,8,10,1,7,5,6};

int bucket[]=new int[11];

for(int i=0;i<11;i++)

{

bucket[a[i]]++;

if(bucket[a[i]]==2)

{

System.out.println(a[i]);

break;

}

}

In this case all numbers can be different. O(k) can be reached only in case if all of them is positive

dont' understand what negative numbers have to do with this. So what if the list has negative numbers?

Suppose i gave you an array and said all numbers are > -k.

Will you be able to solve the problem now?

I guess as n>k and the maximum possible number is k. So either the first k elements will be distinct or some repeating elements will occur already. So at k+1 th item we are bound to find a repeating element. So O(k) elements have to be checked (as we encounter j-th item check the hash table for a 1 or 0).

credit goes to Musheka.

I will explain with an example.

Let n=5, k=4 and the array is: a[] = {2,3,2,1,4} (Let us consider array index starts from 1)

Now we can use the array elements as the pointer to the array itself. Like a[1] points to a[2], a[2] points to a[3] , a[3] points to a[2], a[4] points to a[1] and a[5] points to a[4]. Now consider this as a directed graph with a[1] to a[5] as nodes and the points to as directed edges. Start with a[5] and travel the directed graph. You are sure to find a loop. (Figure out why?)

So, its like tail and a cycle. Now the problem is to simply find the beginning node of cycle, which is a duplicate integer.

Pseudo code:

i = n;

j = n;

while (i != j)

{

i = a[i];

j = a[a[j]]; // This is like detecting loop in a linked list

}

j = n;

while (i != j)

{

i = a[i];

j = a[j];

} //This while loop finds the first node of the loop

return i;

I will explain with an example.

Let n=5, k=4 and the array is: a[] = {2,3,2,1,4} (Let us consider array index starts from 1)

Now we can use the array elements as the pointer to the array itself. Like a[1] points to a[2], a[2] points to a[3] , a[3] points to a[2], a[4] points to a[1] and a[5] points to a[4]. Now consider this as a directed graph with a[1] to a[5] as nodes and the points to as directed edges. Start with a[5] and travel the directed graph. You are sure to find a loop. (Figure out why?)

So, its like tail and a cycle. Now the problem is to simply find the beginning node of cycle, which is a duplicate integer.

Pseudo code:

i = n;

j = n;

while (i != j)

{

i = a[i];

j = a[a[j]]; // This is like detecting loop in a linked list

}

j = n;

while (i != j)

{

i = a[i];

j = a[j];

} //This while loop finds the first node of the loop

return i;

Order of this algo O(n) and space requirement is O(1).

how does this work? if the elements happened to be the same as in the indexes your loops would just be reoccuring and go nowhere.

I dont think ur algo works for this example:

{2,2,3,3}, n = 4; k =3 It will return 3 and not 2

Declare an bucket of size k. Then insert the element into appropriate bucket.

initialize bucket to zero

for(i = 0 ; i < n; i++)

{

bucket[inputarray[i]] ++;

if(bucket[inputarray[i]] == 2)

return the element, position and continue to find next repeating element

}

Time: Linear

let me rephrase the answer ... first create an array of k length.

now scan only till the first k+1 elements,(worst case), of the given array of length n whenever we get a new value , set the corresponding index in the array of length k to 1.(if it is zero). If the index already has value 1, display it , that is our answer.

assumption .

1) all nos. should be positive and within the range 1-k.

You can optimize your answer by replacing array of k length to an 32bit integer array of k/32. Set a bit to 1 corresponding to the value < k. And when you find a bit which already has been set to 1, then you have a first repeative value. Order is still same but memory usage is optimize in terms of array size reduced from k to k/32. Interviewers always look for your attempt to solve the question.

but our worst case is O(k+1). Interviwever is looking for O(k)?!?! Any answers??

Im not sure if we get credit if we modify the problem

code is -(considering array contains only positive numbers)

for(i=0;i<k+1;i++){

if(A[abs(A[i])])<0) return abs(A[i]); //1st duplicate, abs is absolute function

A[abs(A[i])]*=(-1);

}

let k=6 , k<n

A= 1,6,4,5,2,6,3,..........

A[0]-1st element, a[k]-k+1 element in array

now A[0]=1; and A[1]=6 now A[1]= -6

A[1]=6 and A[6]=3 now A[6]=-3

A[2]=4 and A[4]=2 now A[4]=-2

A[3]=5 and A[5]=6 now A[5]=-6

A[4]=-2 and A[abs(-2)]=4 now A[2]=-4

A[5]=-6 and A[abs(-6]<0 therefore return 6

time complexity=O(k+1) and very much inplace solution .

The only point is the first K+1 numbers may get negated and the array is modified. In that case after finding the duplicate we can traverse the array again for the first k+1 no.s and make the negative numbers positive.

code is -(considering array contains only positive numbers)

for(i=0;i<k+1;i++){

if(A[abs(A[i])])<0) return abs(A[i]); //1st duplicate, abs is absolute function

A[abs(A[i])]*=(-1);

}

let k=6 , k<n

A= 1,6,4,5,2,6,3,..........

A[0]-1st element, a[k]-k+1 element in array

now A[0]=1; and A[1]=6 now A[1]= -6

A[1]=6 and A[6]=3 now A[6]=-3

A[2]=4 and A[4]=2 now A[4]=-2

A[3]=5 and A[5]=6 now A[5]=-6

A[4]=-2 and A[abs(-2)]=4 now A[2]=-4

A[5]=-6 and A[abs(-6]<0 therefore return 6

time complexity=O(k+1) and very much inplace solution .

The only point is the first K+1 numbers may get negated and the array is modified. In that case after finding the duplicate we can traverse the array again for the first k+1 no.s and make the negative numbers positive.

Just a random thought! Instead of hashing why can we not xor each element? worst case would be that at the k+1th element we are bound to get repetition.so O(k). The moment you get answer 0 record the last number xored..it's the first repeated vale..save and exit

I thought of this one. Not sure how good it is.

The OP says that there are n elements. The n elements are less than k. k is the value of the number in the array. so it means that all the n elements in the array are less than the value k(this is my understanding of the problem).

Now to solve the problem.

Traverse through the first k-1 elements in the array.

1) you could have found a repeating element by then.

2) if you haven't jump to the position k+1(don't go to the kth element). you can find the match for any one of the element before(as all the elements are less than the value k and occur only once until then). if the k+1 th element is not a match from any of the elements before then you can say that the kth element is repeating.

I am not sure that i am really clear. but you can work this out with an example.

Since the range of numbers is less than k,say from zero to k,we need to check only atmost k+1 items in the array.the element at k+1 is sure to differ.So I think so we can obtain O(k)by using hashing method.

- Musheka January 14, 2010