Amazon Interview Question
SDE1s
Country: United States
How about using external sorting instead of heap sort?
External Sort: en.wikipedia.org/wiki/External_sorting
Time complexity: stackoverflow.com/questions/10359661/time-complexity-cost-of-external-merge-sort
Insert the first k elements into a (balanced) BST: k log k.
Now, for the remaining n-k elements:
if the element is larger than the smallest element in the BST, delete the smallest element and insert the current element.
log k to find the smallest element + log k to insert the new element = 2 log k (worst case)
So overall it takes
k log k + (n-k) log k (find smallest) + (n-k) log k (insert)
= k log k + 2n log k - 2k log k = 2n log k - k log k = O(n log k)
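For what it's worth, here is a minimal sketch of the BST idea above, assuming std::multiset stands in for the balanced BST (the standard guarantees logarithmic insert, and begin() gives the smallest element). The function name k_largest_bst and its parameters are made up for illustration; they are not from the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: keeps the k largest values seen so far in an ordered tree.
// std::multiset is an assumption; it is used so that duplicate values are kept.
std::vector<int> k_largest_bst(const std::vector<int>& input, std::size_t k) {
    std::multiset<int> tree;
    if (k == 0) return {};
    for (int value : input) {
        if (tree.size() < k) {
            tree.insert(value);              // fill phase: O(log k) per insert
        } else if (value > *tree.begin()) {  // *begin() is the smallest kept value
            tree.erase(tree.begin());        // drop the smallest
            tree.insert(value);              // O(log k)
        }
    }
    assert(tree.size() <= k);                // never holds more than k values
    return std::vector<int>(tree.begin(), tree.end());  // ascending order
}
```

Calling k_largest_bst({10, 50, 20, 40, 30}, 2) should give 40 and 50. Memory stays at O(k) no matter how long the input is, which is the whole point when the list does not fit in memory.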
Why not just use a hash table? Always insert each number into the table, no matter what. If it's already in there, it doesn't matter, since you are replacing a number with itself. Then return the 100 highest entries in the table as your new hash table, so you don't run out of memory.
We don't need to sort it, since the hash does that. We don't care if we have a duplicate, since each bucket only holds one item. Finally, we don't care where it falls in relation to anything else, since we always return the top 100 entries.
Solution with heap implemented in C++
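#include <cstddef>     // std::size_t
#include <functional>  // std::greater
#include <iostream>
#include <queue>
#include <vector>

int main() {
    // Min-heap: top() is the smallest of the values currently kept.
    std::priority_queue<int, std::vector<int>, std::greater<int>> queue;
    std::vector<int> vec{10, 20, 30, 40, 50};
    const std::size_t k = 2;  // how many of the largest values to keep
    for (auto value : vec) {
        queue.push(value);
        // Evict the smallest value once the heap holds more than k.
        if (queue.size() > k)
            queue.pop();
    }
    // Prints the k largest values in ascending order (40, then 50).
    while (!queue.empty()) {
        std::cout << queue.top() << std::endl;
        queue.pop();
    }
    return 0;
}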
Use a min heap of size 100. Insert the first 100 elements of the list into the heap. From the 101st element, check if the current element in the list is greater than the min element in the heap. If yes, delete the min element and insert the current element into the min heap. Repeat this until the list is exhausted and in the end, the top 100 elements will be present in the heap.
- Murali Mohan January 11, 2014
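The min-heap steps described above (fill the heap first, then compare each new element against the minimum before replacing it) could be sketched roughly like this. The name top_k and the parameter k are assumptions for illustration; the answer itself fixes the heap size at 100.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper: returns the k largest values of input in ascending order.
std::vector<int> top_k(const std::vector<int>& input, std::size_t k) {
    // std::greater turns std::priority_queue into a min-heap.
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;
    if (k == 0) return {};
    for (int value : input) {
        if (heap.size() < k) {
            heap.push(value);             // first k elements go straight in
        } else if (value > heap.top()) {  // compare against the current minimum
            heap.pop();                   // delete the min element
            heap.push(value);             // insert the current element
        }
    }
    assert(heap.size() <= k);             // the heap never grows past k
    std::vector<int> result;
    result.reserve(heap.size());
    while (!heap.empty()) {               // drain smallest-first
        result.push_back(heap.top());
        heap.pop();
    }
    return result;
}
```

With k = 100 this matches the description: top_k(values, 100). Checking against heap.top() before touching the heap skips the push/pop pair entirely for elements that cannot make the cut, which is most of them on a long list.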