Interview Question
Team: SQL Server / DPG Group
Country: United States
Interview Type: In-Person
This is a typical question between memory and speed balance. In this case this space is precious.
- allocate 3 words with hits in the moemory
IF word IN top3words list
update the hit in top3words
ELSE IF top3words list HAS EMPTY SEAT
put this word into top3words and set hit as 1
ELSE
count the hit of this word from the beginning of the file
IF (hit > the lowest hit in top3words)
replace with this word and its hit
It requires constant memory but with O(n*n) computation.
Approach is to collect the words which start with same 'first 3 characters' into separate files. The size of these files will be in the range of MB's. Number of files = 26*26*26(assuming only case insensitive and only alphabets)
- bluesky October 15, 2012Then sort and collect top N words from each of the files and finalize on top N words.
Kind of 'Divide and Conquer' approach.
Worst case: if each word occurs only once!!.