This problem becomes difficult when the file is really big.
- Nava Davuluri, September 10, 2013

One solution: partition the file into multiple segments, compute a frequent-word list (using a hash table) for each segment, and then merge those per-segment lists.
This keeps the space usage small: if you split the file into n segments and store only the top k most frequent words per segment, you need about n*k entries, which is far less than holding every distinct word of the file in a single hash table.
One obvious danger: a word might fail to make the top-k list of any individual segment and therefore be dropped entirely, even though its total count across all segments would have placed it in the true global top-k list. The result is an approximation, not an exact answer.
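A minimal Python sketch of the segment-and-merge idea above (function names and the list-of-lists segment representation are my own choices for illustration, not from the original post):

```python
import heapq
from collections import Counter
from typing import Iterable, List, Tuple


def top_k_per_segment(segment_words: Iterable[str], k: int) -> Counter:
    """Count words in one segment and keep only its k most frequent."""
    counts = Counter(segment_words)
    return Counter(dict(counts.most_common(k)))


def approx_top_k(segments: Iterable[Iterable[str]],
                 k: int) -> List[Tuple[str, int]]:
    """Merge per-segment top-k lists into an approximate global top-k.

    Counts of words that miss a segment's top-k list are lost, so a
    word can be undercounted or omitted entirely -- this is exactly
    the danger discussed above.
    """
    merged = Counter()
    for seg in segments:
        merged.update(top_k_per_segment(seg, k))
    return heapq.nlargest(k, merged.items(), key=lambda kv: kv[1])


# Example: two segments, k = 2
segments = [["a", "a", "b"], ["a", "c", "c"]]
print(approx_top_k(segments, 2))  # [('a', 3), ('c', 2)]
```

In a real setting each "segment" would be a chunk of the big file streamed from disk rather than an in-memory list, but the merge logic stays the same.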