Google Interview Question for Software Engineer / Developers







No, you can't get it done in less than O(n) in the given scenario.

- Anonymous September 25, 2010

Does it mean that one character can have more than one duplicate?

- Anonymous September 26, 2010

I think it can be done in O(log N * log N) if the number of distinct elements is log N.

- S September 26, 2010

Is this a joke?

- Mahesh September 29, 2010

Well, generally it can be done in O(log N) if the duplicates reside at the end of the array. Otherwise, locating them takes O(log N), and removal, with shifting in the array, takes O(N).

- Aleks September 27, 2010

Can you explain the algorithm/logic for locating duplicates in O(log N)?
I thought locating them would require O(N).

- andy September 27, 2010

If the data is raw, you have to go over it at least once. O(n).

- memo September 28, 2010

Suppose the sorted array with duplicates, e.g. 1 2 2 3 4 4 5 6 7 7 7 8 9, is given.
We would at least need to be told that the elements range from 1 to n and that every number between 1 and n is present. Then we could do a binary search: calculate what number to expect in the middle; if the actual number is less than expected, the left half needs correction. Based on this we could come up with some method to skip numbers, which would make it less than O(n).

- CodeBoyS September 28, 2010

@CodeBoyS
If we already know the range and that every number in it is present, why do we even need to remove duplicates? Just return an array with the values 1 to n.

- Anonymous September 29, 2010

CodeBoyS:
How do you calculate what number to expect in the middle?
Even if you are able to do that, your solution might not work for some cases.

- Arang September 29, 2010

Yeah, it looks like you will need to see every element once, so O(n). I suppose if you knew more about the nature of the input, you might be able to do slightly better. For example, if you know that a duplicated element always occurs 4 times or more, you could "hop", looking only at every other element, and be able to tell that an element is duplicated at least 3 times when you encounter it twice consecutively while hopping.

Plus, on a multi-core/multi-processor architecture you could divvy up the array and check in parallel (though that's cheating, of course, since we are still performing O(n) operations, just in less time) and do an "edge check" at the end.

- peanutballs September 29, 2010
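A minimal sketch of the "hopping" detection idea above (a sketch only, assuming, as the comment states, that any duplicated value occurs at least 4 times; `hasDuplicates` is a made-up helper name):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Probe only every other index of the sorted array.  A run of length >= 4
// must cover two even indices that are two positions apart, so comparing
// a[i] with a[i+2] at even i is guaranteed to catch it.
bool hasDuplicates(const std::vector<int>& a) {
    for (std::size_t i = 0; i + 2 < a.size(); i += 2)
        if (a[i] == a[i + 2])   // same value two positions apart => a run of >= 3
            return true;
    return false;
}
```

Note that this still inspects about n/2 elements, so it is O(n) asymptotically; it only halves the number of probes.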

Yes, it is possible. The algorithm goes as follows:

while (i < n) {   // n is the length of the array, i is the loop variable
    // compare a[i+2] and a[i]
    if (a[i+2] - a[i] == 2) { i += 2; continue; }                            // all three distinct
    if (a[i+2] - a[i] == 1) { i += 2; /* delete the last duplicate */ continue; }
    if (a[i+2] - a[i] == 0) { i += 2; /* delete the last two elements */ continue; }
}

So we are skipping numbers, doing approximately n/2 comparisons.

Limitation: it will only work for inputs where every number from 1 to n is present at least once,
like 1 2 2 3 4 5 5 6 6 6 6 6 6 7 8 9.
If numbers are skipped, like 1 4 5 5 7 8, it won't work.

- awsomist October 03, 2010

@awsomist: O(n/2) is O(n)!

- farkt February 27, 2013

I guess the "less than O(n)" trick is applicable when you have a long string of duplicates. In this case, supposing that you are now at the first element of a duplicated series, you conduct a binary search for the last appearance of that value. If you are lucky, you will get a position that is far away from the first element's position. Then you erase everything in between and jump to the next element, which will be the start of a new round.

If the average number of duplicates in a series is more than log_2 n, the above method is more desirable, and yes, it is less than O(n).

- Eric Xu October 04, 2010
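Eric Xu's run-skipping idea can be sketched as follows (a sketch only, using std::upper_bound for the binary search; the `dedup` helper name is made up):

```cpp
#include <algorithm>
#include <vector>

// From the first element of each run of equal values, binary-search for the
// position just past the run's last appearance, keep one copy, and jump there.
// With k distinct values this performs k binary searches, O(k log n), which is
// sublinear when the average run is longer than log n.
std::vector<int> dedup(const std::vector<int>& a) {
    std::vector<int> out;
    auto it = a.begin();
    while (it != a.end()) {
        out.push_back(*it);                       // keep one copy of this run
        it = std::upper_bound(it, a.end(), *it);  // skip past the whole run
    }
    return out;
}
```

On the thread's example input 1 2 2 3 4 4 5 6 7 7 7 8 9, this does one binary search per distinct value instead of touching every element.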

Nice :-)

Though I did not exactly get the analysis in the end ...

Assume that each distinct element in the array has about lg n dupes. Then we have n/lg n distinct elements. For each of them we have to run a binary search, O(lg n), so the running time is still O(n).

You may be able to make this bound a bit tighter (the upper bound on the binary search could be lg(lg n), since we really need to search only about lg n elements). Then the running time is O(n/lg n * lg(lg n)) < O(n).

Having even more than lg n duplicates per distinct element would make this algorithm even more efficient.
For example, if the number of distinct values is at most k, the running time would be O(k * log n), and if k is constant, that would be O(log n).

- czpete October 04, 2010

I thought it was pretty easy, but I might be missing something.
Please point out what I have missed.

Since the array is sorted, duplicated elements must be adjacent.
Using pointers, compare the current and previous elements. There is one more pointer, 'pWrite', which writes the non-duplicated elements while chasing the other pointers.
See my simple code (I assumed the array holds characters to keep the code simple, but we could handle numbers with one more parameter, the size of the array).

char *pCur, *pPrev, *pWrite;
	pPrev = value;                                 // value is the input string
	if (pPrev == NULL || *pPrev == '\0') return;   // null or empty input: nothing to do

	pWrite = pCur = value + 1;
	while (*pCur != '\0')
	{
		*pWrite = *pCur;                   // tentatively copy the current char
		if (*pPrev != *pCur) pWrite++;     // keep it only if it differs from the previous one
		pPrev++; pCur++;
	}
	*pWrite = '\0';
	cout << value << endl;

If I missed something, correct me.

- CorrectMe October 06, 2010

Yes, you have missed something :-( The question is: can you remove duplicates in time less than O(n)? We can all do it in O(n) and really don't need to clutter the page with O(n) code.

- czpete October 13, 2010

Will converting into a heap help? It takes O(log N) time.

- Anonymous October 17, 2010

I think we can do this by divide and conquer. Divide the numbers into smaller subproblems; the base case is when there are 2 numbers: compare them, and if they are equal, delete one. Then, to merge two subproblems A and B, all you need to consider is the last array element in A (the largest) and the first array element in B (the smallest); compare them and repeat the process. Essentially the algorithm removes duplicates within subproblems first, and then, while merging, removes duplicates across subproblems.

This takes less than O(n).

For a set of numbers of size 8, we will be doing 4+2+1 comparisons to remove all the duplicates.

- Anonymous October 18, 2010

nice:)

- Anonymous October 19, 2010

it takes O((logn)^2), right?

- Anonymous December 19, 2010

Clever trick!

Isn't this still O(n), though?

You are doing n/2 comparisons at the base level, then n/4, n/8, and so on. This adds up to O(n) asymptotically.

log n is achieved when you are able to eliminate half of the possibilities each time, such as when searching for an element in a binary search tree (each branch you go down lets you ignore the other branch).

Let me know if my reasoning is wrong.

- ggggizli December 27, 2010

It's still O(n). T(n) = 2T(n/2) + 1. The Master Theorem is your friend; use it.

- Bullocks July 18, 2011

I don't think this can be done in less than O(n). We need to read the element at least once to decide if it is a duplicate. Reading itself will cost O(n).

- limaye.nikhil November 14, 2010

I think you might manage O(m - n), where m is the length of the array and n is the total length of all the duplicate runs. The reason is that you can divide and conquer and get a lot of early terminations when you see that the beginning and end of a subarray are equal, and pass that fact up to a parent to use, sometimes, on a sibling. So if n is a large number, it could knock the runtime down a lot.

- wealthychef June 17, 2012

