## Amazon Interview Question for SDE1s

Country: India
Interview Type: In-Person


Just adapting the K-th order statistic (selection) algorithm to work with files will solve this problem. Complexity: O(N).


It's not the K-th largest, it's the K largest.


Does it matter? :) If you find the K-th largest element by repeatedly partitioning the array, then once the K-th largest is in place, the elements in the array after position K are exactly the K largest. In this case the array is the file.


Yeah, you are right. It would contain the target elements (just not in sorted order, which is what confused me!).


But how can you find the

``k-th order statistic``

externally? Don't you have to keep the whole file in memory? I haven't seen the selection algorithm work externally, and I couldn't find such an implementation anywhere.
Thanks


Hi Gevorg.

Have you considered whether your suggested solution meets the constraint

``"You dont have RAM to store even k elements."``

The K-th order statistic does require at least "K * sizeof(object)" of allocated memory. Or are you suggesting some implementation that could use the file system?


Ashot, of course I'm suggesting modifying it to work with the file system. Something like:
1. Choose a pivot element.
2. Read the file, writing elements less than the pivot into one file and elements greater than the pivot into another file. Depending on the pivot's rank (less than or greater than K), recurse into one of the output files.

Working with files makes this really slow, but there is no other choice here. The only optimization we can make is to read the input file in blocks of the maximum size that fits in memory, instead of reading one element at a time.
So the idea is basically the same as in external sort, where you have to sort a file which doesn't fit in memory.
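A minimal Python sketch of this two-file partitioning (an illustration, not the commenter's code; it assumes a text file with one integer per line, and for testing it collects the k results into a list, where a truly external version would merge them into an output file instead):

```python
import os
import random
import tempfile

def _partition(path, pivot):
    """Split `path` into a file of values < pivot and a file of values > pivot."""
    lo = tempfile.NamedTemporaryFile("w", delete=False, suffix=".txt")
    hi = tempfile.NamedTemporaryFile("w", delete=False, suffix=".txt")
    n_lo = n_hi = n_eq = 0
    with open(path) as f:
        for line in f:          # one element in memory at a time
            x = int(line)
            if x > pivot:
                hi.write(f"{x}\n"); n_hi += 1
            elif x < pivot:
                lo.write(f"{x}\n"); n_lo += 1
            else:
                n_eq += 1
    lo.close(); hi.close()
    return lo.name, n_lo, hi.name, n_hi, n_eq

def k_largest_external(path, k):
    if k <= 0:
        return []
    # Reservoir-sample a random pivot in one sequential pass.
    pivot = None
    with open(path) as f:
        for i, line in enumerate(f):
            if random.randrange(i + 1) == 0:
                pivot = int(line)
    if pivot is None:           # empty file
        return []
    lo_path, n_lo, hi_path, n_hi, n_eq = _partition(path, pivot)
    try:
        if n_hi >= k:           # the k largest all sit in the "high" file
            return k_largest_external(hi_path, k)
        result = [int(line) for line in open(hi_path)]
        result += [pivot] * min(n_eq, k - len(result))
        if len(result) < k:     # still short: recurse into the "low" file
            result += k_largest_external(lo_path, k - len(result))
        return result
    finally:
        os.remove(lo_path)
        os.remove(hi_path)
```

With a random pivot this does an expected O(N) total amount of reading and writing, mirroring in-memory quickselect.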


The K-th order statistic works when all elements can be stored in main memory.
I am not sure whether we can use secondary storage to implement it.


Since we do not have enough memory to store the values, we need to use an external file to store the numbers.
This file will hold the k largest numbers.
Every time we read a number from the input file, we compare it with the numbers already in the output file and accordingly keep or reject it.
Complexity is O(k*n).
Can we improve the searching in the output file?
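A sketch of this O(k*n) scheme in Python (the function name and the one-number-per-line side-file format are assumptions; only O(1) values are held in memory at any time):

```python
import os

def k_largest_via_file(numbers, k, topk_path):
    """Keep the k largest values seen so far in a side file; check each
    incoming number against that file with two sequential scans."""
    open(topk_path, "w").close()              # start with an empty top-k file
    count = 0
    for x in numbers:
        if count < k:                         # first k numbers are kept as-is
            with open(topk_path, "a") as f:
                f.write(f"{x}\n")
            count += 1
            continue
        # pass 1: scan the file for the smallest stored value
        with open(topk_path) as f:
            smallest = min(int(line) for line in f)
        if x <= smallest:
            continue                          # x cannot be among the k largest
        # pass 2: rewrite the file, swapping one copy of the minimum for x
        tmp = topk_path + ".tmp"
        replaced = False
        with open(topk_path) as src, open(tmp, "w") as dst:
            for line in src:
                v = int(line)
                if not replaced and v == smallest:
                    dst.write(f"{x}\n")
                    replaced = True
                else:
                    dst.write(f"{v}\n")
        os.replace(tmp, topk_path)
```

Each input number costs up to two scans of the k-element file, which is where the O(k*n) comes from.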


In fact a linear solution is possible; see my answer below.


Since we can't hold even k elements in memory, another alternative is to use a min-heap of size k/2. In the first pass, keep the top k/2 elements in the min-heap and output them at the end of the pass.

In the second pass, collect the next top k/2 elements and output them.

Time complexity would be O(n log(k/2)). If k is large, it makes a difference compared to Subhajit's algorithm; otherwise the time complexity is nearly the same as his.
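One way this two-pass idea might look in Python (a sketch; `read_stream` is a made-up stand-in for re-reading the file, and distinct values are assumed, since duplicates equal to the pass-1 cutoff would need extra bookkeeping):

```python
import heapq

def _top(stream, m, below=None):
    """Min-heap of the m largest values in the stream (optionally only
    those strictly below `below`)."""
    heap = []
    for x in stream:
        if below is not None and x >= below:
            continue
        if len(heap) < m:
            heapq.heappush(heap, x)
        elif m and x > heap[0]:
            heapq.heapreplace(heap, x)
    return heap

def k_largest_two_pass(read_stream, k):
    """Two passes with heaps of size ~k/2. `read_stream` is a
    zero-argument callable returning a fresh iterator over the numbers,
    standing in for re-opening the file."""
    half = k // 2
    first = _top(read_stream(), half)                     # pass 1: top k/2
    cutoff = min(first) if first else None
    second = _top(read_stream(), k - half, below=cutoff)  # pass 2: next k/2
    return sorted(first + second, reverse=True)
```

The peak memory is max(k//2, k - k//2) heap entries instead of k, at the cost of reading the input twice.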


You can do it using the K-th order statistic, in O(n). While partitioning, store the partitions on disk instead of in memory. Once you find the K-th element, merge all the files holding values larger than a[k] and output the result.


N - size of input
K - size of required set
M - size of memory

- Build min-heaps of size M; we will have K/M of them.
- In memory, store only the smallest stored value (say MIN) and a pointer to the min-heap (say HP) that holds it.
- While reading the input, if you encounter a value greater than MIN (we keep the K largest, so smaller values are rejected), replace MIN in HP with the new element - time complexity log M.
- Search through the K/M heap roots to find the new MIN and HP - time complexity log (K/M) if the roots are indexed in a second small heap.

Time complexity will be N (log M + log (K/M))
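An in-memory simulation of this multi-heap scheme (a sketch under the stated assumptions; in the real setting the heaps would live on disk and only MIN and the heap index HP would stay in RAM):

```python
import heapq

def k_largest_multiheap(stream, k, m):
    """ceil(K/M) min-heaps of size <= M jointly hold the K largest
    values seen so far. MIN/HP names follow the comment above."""
    n_heaps = -(-k // m)                     # ceil(K / M)
    heaps = [[] for _ in range(n_heaps)]
    stored = 0
    for x in stream:
        if stored < k:                       # fill the heaps round-robin
            heapq.heappush(heaps[stored % n_heaps], x)
            stored += 1
            continue
        # Find HP, the heap whose root is the global MIN. A linear scan
        # is O(K/M); the O(log(K/M)) claimed above needs the roots to be
        # kept in a second small heap.
        hp = min(range(n_heaps), key=lambda i: heaps[i][0])
        if x > heaps[hp][0]:                 # evict the global minimum
            heapq.heapreplace(heaps[hp], x)  # O(log M)
    return sorted((v for h in heaps for v in h), reverse=True)
```

Evicting the global minimum on every replacement makes the union of the heaps behave exactly like one size-K min-heap, just split into M-sized pieces.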


I do this with MemoryMappedFile in C#, if I understand your question right.

``````using System.IO;
using System.IO.MemoryMappedFiles;
using System.Threading;

MemoryMappedFile mySequentialMMF = MemoryMappedFile.CreateFromFile(fileName, FileMode.Open, "MySequentialMap");
bool IsmutexCreated;
Mutex mymutex = new Mutex(true, "NonPersisterMemoryMappedFilemutex", out IsmutexCreated);

// sr was undeclared in the original snippet; read the mapping sequentially
StreamReader sr = new StreamReader(mySequentialMMF.CreateViewStream());
while (!sr.EndOfStream)
{
    string line = sr.ReadLine();
    // compare each parsed value against the running top-k here
}

sr.Close();
mymutex.ReleaseMutex();``````


And I can build a max-heap of k elements if we need the k max elements.


@rapirapp - I think a min-heap will help in finding the k largest, not a max-heap.


OK Amit, let's understand the difference between a min-heap and a max-heap. The first one uses ascending priority, where the smallest item is the first to be popped from the heap. A max-heap uses descending priority, where the largest item is the first to be popped.


@Amit is right. You'll need a min-heap for the k largest elements.
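A short sketch of why the min-heap is the right tool for the k largest (the standard technique, not any commenter's code):

```python
import heapq

def k_largest_minheap(stream, k):
    """The min-heap's root is the *smallest* of the k candidates, which
    is exactly the element to evict when a bigger value arrives. A
    max-heap's root is its largest stored value, which says nothing
    about whether a new value belongs in the top k."""
    heap = []
    for x in stream:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            heapq.heapreplace(heap, x)   # drop current min, keep x
    return sorted(heap, reverse=True)
```

Each element costs at most one O(log k) heap operation, so the whole pass is O(n log k) with O(k) memory.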


OK, I'm not saying that Amit is wrong, but I want to understand why Amit is right.


-1 to all commenters in this thread.
1. First of all, the solution posted on top does something weird, but doesn't solve the problem.
2. A max-heap will not help here, since the problem says even K elements wouldn't fit in memory.
3. If they did fit, the max-heap is the solution, not the min-heap.


Please do a dry run of your algorithm using a max-heap; you will find out why exactly you need a min-heap. And if you find nothing wrong, write your code here with a max-heap - I will give test cases where it will fail :)


Well, assuming that using a heap already violates the problem statement (remember, we can't store even K elements in memory?), my solution is as follows (it also ignores the constraint that the file cannot fit in memory):

1. Read all numbers into memory.
2. Create a MAX HEAP out of those elements (O(N)).
3. Attention... extract the max K times!

Total complexity: O(N) + O(K*logN); memory: O(N).

Using the min-heap you'll use less memory (O(K)), but the algorithm will be O(N*logK), which is greater than O(K*logN) if K < N (which it obviously is).
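The max-heap variant above, sketched with Python's heapq (values are negated since heapq is a min-heap; the function name is made up):

```python
import heapq

def k_largest_maxheap(values, k):
    """All-in-memory variant: heapify is O(N); popping the max K times
    is O(K log N). Memory is O(N), since everything is loaded."""
    heap = [-v for v in values]          # negate to simulate a max-heap
    heapq.heapify(heap)                  # O(N)
    return [-heapq.heappop(heap) for _ in range(min(k, len(heap)))]
```

As the comment says, this trades O(N) memory for a faster O(N) + O(K log N) running time, and it returns the results already sorted.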


