Given a file, return the Top 5

Yahoo Interview Question for Software Engineer / Developers

Given a file, return the top 5 most frequently occurring words.
- vinodjayachandran June 21, 2011
Yahoo Software Engineer / Developer Algorithm



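~ Read the document word by word - O(n).
~ Maintain a hashmap keyed by word: insert the word with a count of 1 if it isn't present yet, otherwise increment its count - O(1) expected per word.
~ Sort the map entries by count in descending order and take the first 5 - O(m log m) for m distinct words, which is O(n log n) in the worst case.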
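
The steps above could be sketched in Python roughly as follows. This is only an illustration: the function name `top_words_by_sorting`, the regex used to split words, and the lowercase normalization are assumptions, since the question does not say what counts as a word.

```python
import re
from collections import defaultdict


def top_words_by_sorting(text, k=5):
    """Count words with a hash map, then sort all entries by count."""
    counts = defaultdict(int)
    # Assumption: a "word" is a run of letters or apostrophes, case-insensitive.
    for word in re.findall(r"[a-z']+", text.lower()):
        counts[word] += 1  # O(1) expected per word
    # Sort by descending count; break ties alphabetically for a stable result.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]
```

For a real file, the text could be obtained with something like `open(path).read()`, where `path` is a hypothetical file name. Reading line by line would avoid holding the whole file in memory.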

- son_of_a_thread August 03, 2011


Our map should be of the form <word, pair<count, pointer>>, where the pointer is an index into a heap of size k (5 here). We should maintain a min-heap keyed by count: every time the count of a word that is not already in the heap exceeds the count at the root, we remove the root (the minimum count among the current top k), replace it with the current word, and min-heapify. If the word is already in the heap, the pointer lets us find its entry and min-heapify from there after its count increases. The cost of min-heapify is lg k. Even if we have to min-heapify after reading every word, the complexity comes out to O(n lg k), whereas sorting costs O(n lg n).
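
A simplified sketch of the size-k min-heap idea, in Python. Note that this is a two-pass variant, not the indexed heap described above: it counts every word first and only then pushes the distinct words through a heap of size k, so it needs no pointer from the map into the heap. The function name `top_words_with_heap` and the word-splitting regex are assumptions made for illustration.

```python
import heapq
import re
from collections import Counter


def top_words_with_heap(text, k=5):
    """Keep only the k largest counts in a min-heap of (count, word) pairs."""
    # Assumption: a "word" is a run of letters or apostrophes, case-insensitive.
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    heap = []  # heap[0] is always the smallest count among the current top k
    for word, count in counts.items():
        if len(heap) < k:
            heapq.heappush(heap, (count, word))
        elif count > heap[0][0]:
            # Evict the current minimum and insert the new pair: O(lg k).
            heapq.heapreplace(heap, (count, word))
    # The heap holds at most k items, so this final sort costs O(k lg k).
    return sorted(heap, reverse=True)
```

With m distinct words, the heap pass costs O(m lg k) on top of the O(n) counting pass. The result is a list of (count, word) pairs in descending order of count.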

- Second Attempt February 17, 2013


The min-heap is a really smart approach. The map is just used to store how many times each word appears.

- brady May 01, 2013


1) Read the document and build a binary search tree (BST) of words, storing a count with each word.
2) The BST should keep the words in lexicographical order.
3) For each word read, increment its count if it is found in the tree; otherwise add it to the tree with a count of 1.
4) Sort the tree's entries by count and take the top 5.
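
One possible reading of these steps, sketched in Python. The names `Node`, `insert`, `in_order`, and `top_words_with_bst` are hypothetical, the word-splitting regex is an assumption, and the tree is deliberately left unbalanced to keep the example short, so a single insertion can cost O(n) in the worst case (for example, on already sorted input).

```python
import re


class Node:
    def __init__(self, word):
        self.word = word
        self.count = 1
        self.left = None
        self.right = None


def insert(root, word):
    """Add word to the BST or bump its count; returns the root."""
    if root is None:
        return Node(word)
    node = root
    while True:
        if word == node.word:
            node.count += 1
            return root
        side = "left" if word < node.word else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(word))
            return root
        node = child


def in_order(root):
    """Yield (word, count) pairs in lexicographical order."""
    stack, node = [], root
    while stack or node:
        while node:
            stack.append(node)
            node = node.left
        node = stack.pop()
        yield node.word, node.count
        node = node.right


def top_words_with_bst(text, k=5):
    root = None
    # Assumption: a "word" is a run of letters or apostrophes, case-insensitive.
    for word in re.findall(r"[a-z']+", text.lower()):
        root = insert(root, word)
    # Stable sort by descending count; ties keep their lexicographical order.
    return sorted(in_order(root), key=lambda item: -item[1])[:k]
```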

- Anonymous June 23, 2011


Novel approach, but inserting into and searching a tree takes O(log n) time per word (assuming the tree stays balanced). Isn't hashing preferable? Insertion and lookup take O(1) expected time.

- son_of_a_thread August 03, 2011


Hashing isn't the solution for everything. Can you imagine the space complexity behind hashing?
