how to find a duplicate elemen

Microsoft Interview Question for Software Engineer / Developers

1

of 1 vote

66
Answers
how to find a duplicate element in an array without using extra memory....do this in O(n)?
- vineetsetia009 November 20, 2011 in India | Report Duplicate | Flag | PURGE
Microsoft Software Engineer / Developer

Email me when people comment.

An error occurred in subscribing you.

Country: India
Interview Type: In-Person

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 5 vote

Hope it'll work!!
if the array size is n and the array elements are in the range 1 to n-1(with one element repeated) then we will traverse the array once from index(0) to index(n-1). If we have k at index(0) make the array value negative at index k(if positive) else return that index if the value at that index is already negative.Will do the same for complete loop(while accessing the index ignore the sign of array element) until we get the repeated element.

Thus no extra space is required and it'll take O(n) space. But we are loosing the original array here so no need to worry, just make all the negative elements of the array as positive in another pass.

ex. there is an array of size 5 with the values as-

A = [1,4,3,2,1]

for i = 0 A[0] = 1 so we will change A[1] which is +ve, now A = [1,-4,3,2,1]
for i = 1 A[1] = -4 so we will change A[4] which is +ve, now A = [1,-4,3,2,-1]
for i = 2 A[2] = 3 so we will change A[3] which is +ve, now A = [1,-4,3,-2,-1]
for i = 3 A[3] = -2 so we will change A[2] which is +ve, now A = [1,-4,-3,2,-1]
for i = 4 A[4] = -1 so we will change A[1] which is already -ve so report this index(1) as ans.

- Anonymous November 22, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Good way.

- Anonymous November 22, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Range is not defined.That is the main reason..

- Anonymous November 22, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Good solution

- Anonymous November 22, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

What if array consist of negative numbers as well? For eg -3 to 4

- Prateek Caire November 23, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

look, man if the array is of size 'n' and the numbers are in [1 to n-1], then the duplicate no is given by 'n(n+1)/2 - sum_of_elements'
that would by too simple

- Anonymous November 26, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

In the case of negative numbers, convert all the numbers to positive numbers. i.e., find the min number if it is negative, add mod(number)+1 to all numbers find the duplicate and decode it.

Will this work?

- Nani November 30, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Above algorithm works on Array A = [1,4,3,2,1]

This is the new Array

B = [5, 2, 3, 5, 1, 4];

Ans:- ?

I think this algorithm fails on array B

Can any one explain how this algorithm will run and produce result on array B?

- Anonymous December 01, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

A = [1,4,3,2,1]
if the array size is n and the array elements are in the range 1 to n-1(with one element repeated) then we will traverse the array once from index(0) to index(n-1). If we have k at index(0) make the array value negative at index k(if positive) else return that index if the value at that index is already negative.
for i = 0 A[0] = 1 so we will change A[1] which is +ve, now A = [1,-4,3,2,1]
for i = 1 A[1] = -4 so we will change A[4] which is +ve, now A = [1,-4,3,2,-1]
@Anonymous 1(top)

for i = 2 A[2] = 3 so we will change A[3] which is +ve, now A = [1,-4,3,-2,-1]
for i = 3 A[3] = -2 so we will change A[2] which is +ve, now A = [1,-4,-3,2,-1]
for i = 4 A[4] = -1 so we will change A[1] which is already -ve so report this index(1) as ans.

My Question is :-
This is the new Array

B = [5, 2, 3, 5, 1, 4];

Ans:- ?

I think this algorithm fails on array B

Can any one explain how this algorithm will run and produce result on array B?

- Anonymous December 03, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

So every element is unique except for a single element that is present twice?

- eugene.yarovoi November 20, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

yeah right..

- vineetsetia009 November 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Are the elements within a range, say from m to n and all the elements are present and only one is duplicated?

- Mango November 21, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

If it's what mango said, you could actually solve the problem...

- eugene.yarovoi November 22, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Yes. But the one who posted this question hasn't responded yet.

- Mango November 23, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

what is the range of data?

- Anonymous November 20, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

You are thinking about counting sort?

- Bruce November 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Counting sort will consume a lot of space and hash map will consume O(n) space...think apart from this..

- vineetsetia009 November 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

you cant use extra space....

- vineetsetia009 November 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

if the array size is n and the data is in between 1 to n-1 then we can do this is O(n) time and O(1) space... otherwise its not possible to find the duplicate in O(n) time and O(1) space.

- Anonymous November 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Still it requires space.

- Anonymous November 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Who asked you this question? It's impossible. Did you ask him for more clarification?

- Anonymous November 20, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

I solved this question by hash map, then they asked me try to solve this in O(1) extra memory.......

- vineetsetia009 November 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

To find 2 elements with same value in an arbitrary array will take O(nlogn) time. This is a research result. Infact it says if an element occurs k times in an array, searching duplicacy takes O(nlog(n/k)) asymptotically. That's why for majority element(k>n/2) we get O(n) solution.

There has to be some glitch, like elements are between 1 to n or something.

- Sanket November 20, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

ok...thanks

- vineetsetia009 November 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

if the array size is n and the data is in between 1 to n-1 then we can do this is O(n) time and O(1) space... otherwise its not possible to find the duplicate in O(n) time and O(1) space.

- Anonymous November 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

{ if the array size is n and the data is in between 1 to n-1 then we can do this is O(n) time and O(1) space... otherwise its not possible to find the duplicate in O(n) time and O(1) space. }

Can u explain how ?

- Anonymous December 01, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

And since its said find duplicate element, should we consider that the element is a number?

- Anonymous November 20, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

yeah...you can consider this..

- vineetsetia009 November 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

if we consider that the elements are number and the array of size n contains all elements from 1 to n with one element appearing twice.

Let the element appears twice is i,

∑x = n(n+1)/2 = 1+2+3+4+5+...+i+...j+...+n

∑x^2 = n(n+1)(2n+1)/6 = 1^2+2^2+3^2+4^2+5^2+...+i^2+...j^2+...+n^2

Let the sum of the array be S
S = 1+2+3+...+i+...+i+...+n

Let
S2 = 1^2+2^2+3^2+...+i^2+...+i^2+...+n^2
Let this i replace element j.

∑x - S = j - i = ∂

∑x^2 - S2 = j^2 - i^2 = ß

so (j+i) = ß / (j - i) = ß / ∂

so 2j = ∂ + (ß / ∂)

The missing number = j = (∂ / 2) + (ß / (2∂))

This way we can find the missing number in O(n) time by traversing the array once.

- Anonymous November 20, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

The data would be in the range of 100 * million... so addition and multiplication is not an elegant solution.

- Anonymous November 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

How to solve this then in O(n). For 100 million this cannot work, we have to think in terms of map reduce in that case, but that will be similar to hashing as we will be converting the problem into some key-value pair, what do u say?

- Anonymous November 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

first of all who said that numbers are sorted and range from 1 to n

- Anonymous November 23, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Above solution donot require array to be sorted.

- Anonymous November 30, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

If we can assume the array is of integers (or char), then we can use bit vector to find the duplicate in O(n) time and ~O(1) i.e. using just one extra integer.

- unemployed coder November 20, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Will you please explain with an example?

- vineetsetia009 November 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

No, this won't work. While economical, a bitmap is still O(n) extra space. If it were letters instead of integers, the number of unique letters is O(1), so you could do it in O(1) extra space. Here, however, we are assuming no bounds on the integers.

- eugene.yarovoi November 21, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Well we could use a bitset(256) to do this. This would cost 32bytes only. Essentially O(4)

- premsantosh November 21, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

I have come across this question before from Microsoft, we have decide some way to answer this impossible question.

- Dr.Sai November 21, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Well, you could say it can't be done. I think the question may have been misstated. The interviewer was probably looking first for the O(n) time, O(n) extra space solution and then was willing to drop the O(n) time requirement to get an O(1) space solution. It's also quite possible that the original question put additional constraints on the numbers in the array.

- eugene.yarovoi November 22, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Unemployed cannot be fired so easily ,say you have 500 terabytes of numbers? with 512 mb of space you can find out duplicates in O(n).

- Dr.Sai November 21, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Sure, I agree that there could be times when it would be a reasonable solution. But in this problem, if there are x bits per integer, this approach needs 2^x bits in the bitmap, and the maximum size of the array is 2^x+1 because we know all the elements except one to be unique. So even in the best case, we use about 1 bit per element, which is O(n). For any fixed x, the algorithm technically uses O(1) space, but the length of the array is technically O(1) too, so we could declare the whole algorithm to be O(1) time if we were to assume a bound on x.

- eugene.yarovoi November 22, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Guys for one I am sure this cannot be solved the way it is stated,I am suggesting we can ask Gayle Laakmann McDowell to settle this once and for all.

- Dr.Sai November 21, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Gayle doesn't post these questions. Given that this problem probably can't be solved, it's likely she wouldn't know either.

- eugene.yarovoi November 22, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Are the elements within a range, say from m to n and all the elements are present and only one is duplicated?

- Mango November 23, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Range lies between m and n
1. Swapping current number with its appropriate index so that number is at coorect place when sorted
2. if swapped index(computed k) itself contains same number. report such number duplicate
3. else swap numbers
4. move index if swapped index points to current index

i = 0
D()
	while(i <= n-m)
		k = a[i]-m
		if(k != i)
			if(a[k] == a[i])
				print a[i] as duplicate
				return
			else	
				swap(k, i)	
		if(k == i)
			i++

- Prateek Caire November 23, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

They never said not to change the array. So, why not sort the array and look for 2 adjacent elements that are the same?
Merge sort would do an in-place sort. The second phase is straight forward. Just a thought.

- smffap November 25, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Not O(n)... (which was a requirement)

- wdtvliveplus November 26, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

I assumed the array contained integers in the range {1..n} and one number was missing.
O(n) = n....

- Anonymous November 25, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

let a,b,c,d,a.

s=xnor(a,b,c,d,a) ------ O(n)

check s xnor a == 0 then a is the duplicate,
else check s xnor b == 0 and so on.

- Anonymous November 27, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

here Xnor doesn't depends on bits... the xnor operation is atomic... hence it is correct... it will take O(n) time..

- Anonymous November 27, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

here Xnor doesn't depends on bits... the xnor operation is atomic... hence it is correct... it will take O(n) time..

- Anonymous November 27, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

can you elaborate more on your soln ?

if by xnor(a,b) you understand ~(a ^ b), then I do not see
why applying xnor with a duplicate element should necessarily give 0 ? (I think this only works if the numbers in the array are consecutive)

i.e., consider: s = xnor[1,2,4,5,7,2] = -8
s xnor 1 = 6
s xnor 2 = 5
s xnor 4 = 3
s xnor 5 = 2
s xnor 7 = 0
s xnor 2 = 5

- Anonymous November 28, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Question is correct... solution using xnor is correct

- Anonymous November 27, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

If We are OK with the extra Space then we can choose BST for finding Duplicate Element.

- User November 29, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

The problem can only be solved in O(n) time when all numbers are within range [n, m] where n < m where n and m can be either negative or position, and (m - n) / 2 >= half of integer range (in Java, it's Integr.MINIMUM, Integer.MAXIMUN inclusive. When these two conditions are met, here are the steps to solve it:

1. Scan the array to find the n and m.
2. Scan the array again minus n from every number: array[i] -= n. After this scan, all numbers >= 0.
3. Scan the array again to do
array[array[i]]

- Anonymous December 01, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

This could be one solution ::
Make a BST of the given set {O(logn)}. Now we can perform sorting operation in BST {O(logn)}. can find all the duplicate numbers.

T(n)=O(logn)+O(logn)=O(logn)[which in the worst case would tend to O(n)];

- oLiV3r December 02, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

anonymous answer posted on nov. 20 1011 is good but do we have to do multiplication ? let there n numbers in the range of 1 to n and the number b occurs twice . the numers are stored in the array 'a'. then

sum1 = 0;
sum2 = n(n+1)/2;
for(int i=0; i<n; i++){
sum1 = sum1+a[i];
}
result = sum2-sum1;

if there is anything wrong then please notify. all constructive comments are welcome

- CAFEBABE December 28, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I think this should be correct program.
Given an array of n elements which contains elements from 0 to n-1, with any of these numbers appearing any number of times. Find these repeating numbers in O(n) and using only constant memory space.

For example, let n be 7 and array be {1, 2, 3, 1, 3, 6, 6}, the answer should be 1, 3 and 6.

Implementation:

#include <stdio.h>
#include <stdlib.h>
 
void printRepeating(int arr[], int size)
{
  int i;
  printf("The repeating elements are: \n");
  for (i = 0; i < size; i++)
  {
    if (arr[abs(arr[i])] >= 0)
      arr[abs(arr[i])] = -arr[abs(arr[i])];
    else
      printf(" %d ", abs(arr[i]));
  }
}
 
int main()
{
  int arr[] = {1, 2, 3, 1, 3, 6, 6};
  int arr_size = sizeof(arr)/sizeof(arr[0]);
  printRepeating(arr, arr_size);
  getchar();
  return 0;
}

- Gajanan Rudrawar January 26, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

This does not work if the element occurs more then two times
check with
1, 2, 3, 1, 3, 6, 6,1
elements

- Ajay Pathak August 30, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Looks quite easy: we just store 2 vars-bit-vectors ( for +ve an -ve values).
O(n) complexity and O(1) space

public static int Dublicate (int [] A)
{
	int positive = 0;
	int negative = 0;

	for (int i = 0; i < A.Length; i++)
	{
		if (A[i] < 0)
		{
			if (((negative >> Math.Abs(A[i])) & 1) == 1)
				return A[i];
			else
				negative |= 1 << Math.Abs(A[i]);
		}
		else
		{
			if (((positive >> A[i]) & 1) == 1)
				return A[i];
			else
				positive |= 1 << A[i];
		}
	}
	return int.MinValue;
}

- Konstantin February 20, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

The question is completely ...incomplete.
to answer this question in O(n) range must be specified.

The preson who posted this question is crazy and he is bit poor in algos.
dont waste our time by posting partial questions.
Thanks

- rider November 23, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

I admit I may not be that much strong in algos like you all but this question is asked in interviews and people agree with this....and whatever answers are posted to various questions in this site may not be efficient one.. And yeah Noone is perfect in this world...every question has an answer.

- Anonymous November 23, 2011 | Flag

Comment hidden because of low score. Click to expand.

-1

of 1 vote

make me wrong if i am??
we can use the property of XOR
if we have 4 number a,b,c,a
xor till three element sum = a^b^c
then xor this result with fourth element a^b^c^a which result = b^c
then again xor this by sum so let sum2 = sum^b^c = a
check if fourth element = sum2
then duplicate found
else
do next iteration

- Anonymous November 25, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

it will be same for all number..

- Anonymous November 25, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Nice solution. I think it works...

- sandeep.gs123 December 06, 2011 | Flag

Comment hidden because of low score. Click to expand.

-2

of 2 vote

- rider November 23, 2011 | Flag Reply

CareerCup

Microsoft Interview Question for Software Engineer / Developers

Books

Videos

Resume Review

Mock Interviews