MAQ Interview Question
Software Analysts, Country: India
Why do you need the heap?
When you increase a key/count, are you going to run heapify(heap_root)? Why heapify (and not increase_key)?
Can you calculate all the complexities step by step?
In the end, what you are trying to do can be done in two separate stages:
1) Maintain the hash table of counts.
2) Sort (heap sort in your case) the internal table of the hash table based on the count column.
But for 2) we should be able to get away with a linear sort like counting/radix sort. So this should be doable in O(n) in practice.
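A minimal Python sketch of those two stages; the function name and the bucket layout are choices made here, not from the post:

```python
from collections import Counter

def words_by_frequency(words):
    # Stage 1: hash table of counts -- O(n).
    counts = Counter(words)
    if not counts:
        return []
    # Stage 2: counting sort on the count column. The largest
    # frequency bounds the bucket array, so this stage is
    # O(n + max_freq), linear for realistic inputs.
    max_freq = max(counts.values())
    buckets = [[] for _ in range(max_freq + 1)]   # one bucket per count value
    for word, freq in counts.items():
        buckets[freq].append(word)
    # Read the buckets from the highest frequency down.
    return [(w, f) for f in range(max_freq, 0, -1) for w in buckets[f]]

print(words_by_frequency("a b a c a b".split()))  # [('a', 3), ('b', 2), ('c', 1)]
```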
Why the downvote? Did you really try this algorithm?
1. The HashMap will store each word as a key and a pointer/reference to a heap node as the value.
2. A heap node will hold only the count.
3. When a word occurs and it already exists in the HashMap, increase its count; its node can be at any level in the heap, not necessarily at the root. After increasing a node's count, we need to maintain the max-heap property, so we run heapify.
After the input file is fully processed, we traverse the heap to store/print the elements in decreasing order.
Note: this is a generic algorithm for any size of frequency or number of words in the file.
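Steps 1-3 above can be sketched like this in Python. The array-based heap, the `FrequencyHeap` name, and the word-to-index map standing in for the node pointers are assumptions of this sketch, not details from the post; note that an increased count can only move a node toward the root, so a bubble-up from the node's own position suffices:

```python
class FrequencyHeap:
    """Array-based max-heap of [count, word] nodes plus a hash map from
    each word to its node's current index (the 'pointer' in step 1)."""

    def __init__(self):
        self.heap = []    # heap[i] = [count, word]
        self.index = {}   # word -> index of its node in self.heap

    def _swap(self, i, j):
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
        self.index[self.heap[i][1]] = i   # keep the word->index map in sync
        self.index[self.heap[j][1]] = j

    def _sift_up(self, i):
        # An increased count can only move the node toward the root.
        while i > 0:
            parent = (i - 1) // 2
            if self.heap[parent][0] >= self.heap[i][0]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left < n and self.heap[left][0] > self.heap[largest][0]:
                largest = left
            if right < n and self.heap[right][0] > self.heap[largest][0]:
                largest = right
            if largest == i:
                break
            self._swap(i, largest)
            i = largest

    def add_word(self, word):
        if word in self.index:            # step 3: existing word
            i = self.index[word]
            self.heap[i][0] += 1
            self._sift_up(i)              # restore the max-heap property
        else:                             # new word: a leaf with count 1 can
            self.heap.append([1, word])   # never exceed an ancestor's count,
            self.index[word] = len(self.heap) - 1   # so no sifting is needed

    def sorted_desc(self):
        """Pop the root repeatedly (this empties the heap) to get the
        (word, count) pairs in decreasing order of count."""
        out = []
        while self.heap:
            self._swap(0, len(self.heap) - 1)
            count, word = self.heap.pop()
            del self.index[word]
            if self.heap:
                self._sift_down(0)
            out.append((word, count))
        return out

h = FrequencyHeap()
for w in "to be or not to be".split():
    h.add_word(w)
print(h.sorted_desc())   # counts appear in decreasing order: 2, 2, 1, 1
```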
Counting sort: yes, we can use that, provided the largest frequency is small enough to fit in the system's integer size.
For counting sort you also have to keep track of the largest frequency.
Use two data structures - "Max heap" and "HashMap".
- Ajeet September 26, 2013
The HashMap will contain the word as a key and a pointer to a max-heap node as the value. A heap node will hold only a counter for the frequency.
Now just read a word from the file.
Check the HashMap: if the word is already there, then increase its frequency/count in the heap and run heapify to maintain the max heap.
If it is not in the HashMap, then add a new node to the heap with frequency/count 1.
At the end of the file, the heap will yield the frequencies in decreasing order.
Complexity:
Time O(N log M) - N = number of words in the file: each word costs an O(1) expected HashMap update plus O(log M) to restore the heap; this is O(N log N) in the worst case.
Space O(M) - M = number of unique words in the file.