Amazon Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: Phone Interview
@Anonymous its min heap not max heap here we need million smallest element not million largest element and max heap contain million largest element
pls let me know if i m wrong ......... thanks
I think Partition algorithm of Quick Sort would do.
Take one element as pivot and partition it recursively by taking pivot from small set
At one stage data would be partitioned such that first 1000 numbers are one left side of pivot.
What say???
Think of it this way - You push in the first million numbers in 'a heap' now when the (1million+1)th number arrives you'd wanna kick out the largest of the million numbers stored in your heap and accommodate this new number. so, you take a max-heap and compare the new number with the heap-top, hence if the new number is less than the 'max' of previously recorded million numbers you kick out the top element and insert the arriving number. Thus, a 'max heap' will filter out the smallest million numbers of the given billion numbers.
@Anonymous - yes u r right..... thanks for the correction ....
@pradeep.. yes pradeep u can solve this problem in this way also .. but in some case it will give some problem... ex .. suppose there r million num is there and u have to find 150 smallest element ...
but still the complexity point of view aprx the same......... also u don't need extra space.......
we need a sort algorithm here definitely
However billion numbers mean they would not fit the main memory.
I guess that what the interviewer wants to listen is a technique of external sort (where the input can be availed from a secondary staorage ie a tape, hard disk).
moreover the external sort which typically employs mergesort can be made parallel easily
Sorting is definitely worst idea. Question doesnt ask to order anything at all. It just asks to find 1 million smallest element in any order.
1. create a max heap.
2. push first 1 million +1 element in it.
3. Start a loop for 1billion - 1million +1 times and remove the top element, throw it away and push the next input element.
4. at the end of the loop the max heap will contain smallest 1 million numbers.
for solving this problem we can use min heap sort create an array of million size and iterate billion numbers and full that array and the data that are not fit into y=the array just ignore them as they are not in the set of million smallest element complexity is billionLOG(million).........
- raunak.gupta29 March 13, 2012