Interview Question


Team: SQL Server / DPG Group
Country: United States
Interview Type: In-Person




Comment hidden because of low score. Click to expand.
2
of 2 vote

Approach is to collect the words which start with same 'first 3 characters' into separate files. The size of these files will be in the range of MB's. Number of files = 26*26*26(assuming only case insensitive and only alphabets)

Then sort and collect top N words from each of the files and finalize on top N words.

Kind of 'Divide and Conquer' approach.

Worst case: if each word occurs only once!!.

- bluesky October 15, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

@bluseky: why specifically 3 characters only?

- JustCoding October 15, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

to split of the file of size 1TB to files of size in the range of MBs.
Three characters means 26*26*26 different files. one file for strings starting with.. aaa, another for strings starting with 'aab' so on...

- bluesky October 16, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

read the amount of the file that can fit into memory, and create an in-memory map or heap to count the occurrence for each word... print the top "N" values from the map once the entire file has been read...

O(n) time and O(n) space...

- JustCoding October 14, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

use MapReduce :-)

- Anonymous October 14, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

This is a typical question between memory and speed balance. In this case this space is precious.
- allocate 3 words with hits in the moemory

IF word IN top3words list
update the hit in top3words
ELSE IF top3words list HAS EMPTY SEAT
put this word into top3words and set hit as 1
ELSE
count the hit of this word from the beginning of the file
IF (hit > the lowest hit in top3words)
replace with this word and its hit

It requires constant memory but with O(n*n) computation.

- peter tang October 15, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

We can maintain a Minheap of most occuring N words and a Hasmap for occurence of words.

- loveCoding October 16, 2012 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More