Microsoft Interview Question
Software Engineer / Developers

I mentioned counting sort, but the interviewer said that since we assume the range of the numbers is unpredictable, counting sort doesn't work (counting sort requires all the numbers to fall within a finite range, which determines the amount of memory we need).
I am not sure about this one either. Here is another pigeonhole-style sorting approach.
Iterate through the entire array to find the minimum and maximum; this takes O(n). Now, instead of counting-sorting each element into an individual bucket, use buckets that each cover a range of values.
Now, say you have buckets such that
b[0]: 0-100000
b[1]: 100001-200000
b[2]: 200001-300000
....
b[9]: top of range
You can say that the topmost buckets, plus x items from some middle bucket, add up to 1,000,000 items, and those are the top 1 million numbers.
I guess my point is that even though the range might be unpredictable, reading through it once does not hurt your O(n) algorithm.
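The range-bucketing idea above can be sketched as follows. This is a minimal illustration, assuming the data fits in memory; the function name and bucket count are made up for the example:

```python
def top_m_by_buckets(nums, m, num_buckets=10):
    # One O(n) pass to find the range.
    lo, hi = min(nums), max(nums)
    width = (hi - lo) // num_buckets + 1
    buckets = [[] for _ in range(num_buckets)]
    for x in nums:  # second O(n) pass: distribute into range buckets
        buckets[(x - lo) // width].append(x)
    # Take whole buckets from the top until we have at least m items,
    # then sort only the last (partial) bucket to trim the excess.
    result = []
    for b in reversed(buckets):
        if len(result) + len(b) <= m:
            result.extend(b)
        else:
            b.sort(reverse=True)
            result.extend(b[: m - len(result)])
            break
    return result
```

Only one bucket ever needs sorting, so if the values are reasonably spread out the work stays close to O(n).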
Use a heap.
Constructing the heap is O(n).
Then it takes O(c * lg n) to find the largest c items.
So the total time is still
O(n + c * lg n) = O(n), since c is a constant.
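A quick sketch of the build-heap-then-pop idea using Python's heapq (which is a min-heap, so values are negated to simulate a max-heap); heapify is O(n) and each pop is O(log n):

```python
import heapq

def top_c_via_heap(nums, c):
    # Build a max-heap in O(n) by negating values (heapq is a min-heap).
    heap = [-x for x in nums]
    heapq.heapify(heap)  # O(n) bottom-up heap construction
    # Pop the c largest: O(c log n) total.
    return [-heapq.heappop(heap) for _ in range(c)]
```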
The construction of the heap does not take place in O(n) time; it takes O(n log n). The solution proposed would not work.
Construction of a heap takes O(n) time. It's a tighter bound. Refer to Cormen for more details.
Say we have to find the k largest elements in an array. We can use the logic in Cormen/Rivest for finding the kth largest element, partitioning the array around a pivot.
Here is the pseudocode (note the pivot at index i is already in its final position, so the left recursion ends at i - 1):
FindKLargest(arr, int start, int end, int k)
{
    i = Randomized_Partition(arr, start, end)
    if (i > k)
        FindKLargest(arr, start, i - 1, k)
    else if (i < k)
        FindKLargest(arr, i + 1, end, k)
}
The array will then contain the k largest elements at indices (n-k) through n-1.
What do you think?
Should the pseudocode be:
FindKLargest(arr, int start, int end, int k, int length)
{
    i = Randomized_Partition(arr, start, end)
    if ((length - i) < k)
        FindKLargest(arr, start, i, k - (length - i))
    else if ((length - i) > k)
        FindKLargest(arr, i + 1, end, k)
}
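A runnable version of the partition-based selection being discussed might look like the sketch below. `randomized_partition` follows the CLRS scheme, a base case terminates the loop, and the function names are mine:

```python
import random

def randomized_partition(arr, start, end):
    # Move a random pivot to the end, then partition (CLRS-style).
    p = random.randint(start, end)
    arr[p], arr[end] = arr[end], arr[p]
    pivot = arr[end]
    i = start - 1
    for j in range(start, end):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[end] = arr[end], arr[i + 1]
    return i + 1  # pivot's final index

def find_k_largest(arr, k):
    # Partition until index n-k holds its final value; everything at or
    # after that index is then one of the k largest. Expected O(n).
    n = len(arr)
    target = n - k
    start, end = 0, n - 1
    while start < end:
        i = randomized_partition(arr, start, end)
        if i == target:
            break
        elif i > target:
            end = i - 1
        else:
            start = i + 1
    return arr[target:]
```

The returned k elements are not themselves sorted, which matches the discussion above.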
The solution proposed by temp1 will probably work. We can also use partial sorting here. Take the first million numbers and keep them in a minimum priority queue. If the next number in the input list is greater than the minimum of the priority queue, we replace that minimum with it. This way, we get an O(n log m) solution, and since "m" is small, as it is in this case, we can say the solution is O(n).
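The streaming min-heap approach sketches out naturally with Python's heapq; the function name is illustrative:

```python
import heapq

def top_m_streaming(stream, m):
    # Keep a min-heap of the m largest values seen so far: O(n log m).
    heap = []
    for x in stream:
        if len(heap) < m:
            heapq.heappush(heap, x)
        elif x > heap[0]:  # new value beats the smallest of the top m
            heapq.heapreplace(heap, x)
    return heap
```

Because it only ever holds m values, this also works when the billion numbers arrive as a stream that does not fit in memory.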
temp1's solution is based on the Quick Sort algorithm...
One more solution...
Use a bit vector, initialized to zero.
The bit vector will contain 1 million bits, covering the values just below 1 billion. A number less than 1 billion is subtracted from 1 billion, and the resulting value is the index of that number's bit in the vector. Set that bit to 1.
Once all the numbers are scanned, walk the vector from the 0th location upward and print only the numbers whose bits are set; the number at a given location is 1 billion minus that index.
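A sketch of this bit-vector idea, with the caveat that it only works under strong assumptions: the values are distinct integers, and at least a million of them fall in the window just below the top of the range. The constants are shrunk here for illustration:

```python
def top_m_bitvector(nums, top_of_range, window):
    # Bit vector covering the 'window' values just below top_of_range.
    # Index of value v is top_of_range - v; values outside the window
    # are ignored. Assumes the input values are distinct.
    bits = bytearray(window // 8 + 1)
    for v in nums:
        idx = top_of_range - v
        if 0 <= idx < window:
            bits[idx // 8] |= 1 << (idx % 8)
    # Scan from index 0 (largest value) upward and emit set bits.
    out = []
    for idx in range(window):
        if bits[idx // 8] & (1 << (idx % 8)):
            out.append(top_of_range - idx)
    return out
```

If fewer than a million distinct values land in the window, the scheme misses part of the answer, which is the weakness of this approach.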
Since we have the internal memory to hold the 1 billion numbers, why can't we use radix sort or bucket sort?
Comments are appreciated.
What I initially thought of is that we construct a max heap, which takes O(n) time. At the first level of the heap we have only 1 element, at the second level we have 2, and at height h we have 2^h elements. We also know that elements at each level are smaller than the elements one level above. If we just want to pick the top 1 million numbers without caring that they are sorted, we can scan through the heap, printing elements until we have reached the 20th level. This is obtained as follows: 2^0 + 2^1 + 2^2 + ... + 2^h = 2^(h+1) - 1 ≈ 1 million = 2^20, so h = 19. So we only need to scan down to level 19 and we have the top 1 million elements (a little more than a million), obviously not sorted. Any comments?
Gaurav's solution will not work. A heap only guarantees that a parent is greater than its children, which means you can have a heap like
100
85 90
80 82 17 19
75 76 55 56
So in this case a level-by-level scan would include 17 and 19 but miss out on 75, 76, 55, and 56.
The kth-largest-element approach is the correct way to go.
Say we are supposed to find the k largest integers from n integers, where n >= k.
1) Sort the integers using any good sorting algorithm like merge sort or heap sort, which has O(n log n) complexity.
2) Now take a pointer from the beginning of the list and move it n-k positions forward.
Now we have all k largest integers from the n integers.
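The two steps above reduce to a one-liner in Python; a trivial sketch (function name is mine):

```python
def k_largest_by_sorting(nums, k):
    # O(n log n): sort ascending, then take the last k elements.
    return sorted(nums)[-k:]
```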
Use a bit vector. Keep setting bits as each number is visited. Once all billion numbers are recorded, output the largest one million numbers.
I guess we need to use the worst-case linear-time selection method (median of medians) to find the kth smallest/largest number.
Once we have gotten the (billion minus million)-th number using this approach in O(n) time, we do a scan of the entire list to output the numbers >= that number.
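The select-then-scan idea can be sketched like this; `select` is a textbook median-of-medians implementation (not in-place, for clarity), and with duplicates the final scan may return slightly more than k values, exactly as the ">=" in the comment above implies:

```python
def select(arr, k):
    # Worst-case O(n) selection (median of medians): returns the k-th
    # smallest element of arr (k is 0-based).
    if len(arr) <= 5:
        return sorted(arr)[k]
    # Median of each group of 5, then the median of those medians as pivot.
    medians = [sorted(arr[i:i + 5])[len(arr[i:i + 5]) // 2]
               for i in range(0, len(arr), 5)]
    pivot = select(medians, len(medians) // 2)
    lows = [x for x in arr if x < pivot]
    highs = [x for x in arr if x > pivot]
    pivots = [x for x in arr if x == pivot]
    if k < len(lows):
        return select(lows, k)
    elif k < len(lows) + len(pivots):
        return pivot
    else:
        return select(highs, k - len(lows) - len(pivots))

def k_largest_by_select(arr, k):
    # Find the (n-k)-th smallest as a threshold, then scan once: O(n).
    threshold = select(arr, len(arr) - k)
    return [x for x in arr if x >= threshold]
```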
Million is a constant. Billion is a constant. A million multiplied by a billion, raised to any power, etc., is still a constant. So any algorithm will do, because the result is O(1). It's only when the size of the input is a variable like n that you get O(n), O(n log n), O(n^2) complexities.
Look up counting sort. Using an approach similar to this would give you linear time.
- vodangkhoa June 13, 2007