Groupon Interview Question for Senior Software Development Engineers


Country: United States
Interview Type: Phone Interview




Some stats on the given data:
RAM: 250 MB
Data size: 10 GB. We don't know the nature of this data: is it just text information about each movie, or does it include the movie files themselves? I am presuming the former.

How many movies are there in the world? Let's put an upper bound on this: 0.5 million?

That gives around 20 KB of information for each of the 0.5 million movies in the given 10 GB of data.

If we load these 20 KB records into RAM, we can hold only about 12.5K movies at a time (250 MB / 20 KB). In that case, loading partial data from disk and searching it for the movie of choice is one way to do the job; in the worst case we need 40 loads (10 GB / 250 MB) to finish searching for one movie.
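The brute-force chunked scan above can be sketched as follows. The record layout (one `title\tdescription` record per line) and the chunk size are illustrative assumptions, not part of the original answer:

```python
# Brute-force approach: scan the big data file in RAM-sized batches of
# records, checking each record for the requested title.
# ASSUMED format: one "title\tdescription" record per line.

def scan_for_title(path, title, chunk_records=12500):
    """Linear search over the data file, holding only a bounded number
    of records in memory at once (the 250 MB budget)."""
    with open(path, "r", encoding="utf-8") as f:
        batch = []
        for line in f:
            batch.append(line)
            if len(batch) >= chunk_records:
                hit = _search_batch(batch, title)
                if hit is not None:
                    return hit
                batch = []  # drop this chunk, reuse the memory
        return _search_batch(batch, title)

def _search_batch(batch, title):
    """Scan one in-memory batch for an exact title match."""
    for record in batch:
        name, _, info = record.rstrip("\n").partition("\t")
        if name == title:
            return info
    return None
```

Note that every query pays the full scan cost in the worst case, which is what motivates the hash-index alternative below.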

Alternatively, the movie title can be hashed: an MD5 hash uses 128 bits, i.e. 16 bytes (32 hex characters).
The movie info's location on disk can be represented by a path of roughly 20 words at ~10 bytes each. So each key (title hash) : value (location on disk) pair would take at most about 250 bytes.

Given 250 MB of RAM, we can then index up to 1 million movies (250 MB / 250 B), which is above our original guess at the number of movies.

So, when a user searches for a title: MD5-hash the title (MD5 runs at roughly 400 MB per second, so hashing short titles can comfortably keep up with thousands of requests per second), look it up in the in-memory hash map, and do one disk I/O to fetch the description and show it to the user.
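A minimal sketch of that index: MD5(title) mapped to the record's byte offset on disk. A 16-byte digest plus an offset is well under the ~250-byte-per-entry budget estimated above. The record layout (one `title\tdescription` per line, with the byte offset standing in for the "location on disk") is an assumption for illustration:

```python
import hashlib

def build_index(path):
    """One preprocessing pass over the data file. Only digests and byte
    offsets are kept in RAM, never the descriptions themselves."""
    index = {}
    with open(path, "rb") as f:
        while True:
            offset = f.tell()
            line = f.readline()
            if not line:
                break
            title = line.split(b"\t", 1)[0]
            index[hashlib.md5(title).digest()] = offset
    return index

def lookup(path, index, title):
    """Hash the query title, find its offset, then do a single disk
    seek + read to fetch the description."""
    offset = index.get(hashlib.md5(title.encode("utf-8")).digest())
    if offset is None:
        return None
    with open(path, "rb") as f:
        f.seek(offset)
        line = f.readline()
        return line.decode("utf-8").rstrip("\n").split("\t", 1)[1]
```

Each query is then one hash, one dictionary probe, and one disk read, instead of a 40-load scan.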

The trouble is disk I/O under concurrent requests.

The nature of this application is more read-oriented than write-oriented, so using multi-disk arrays (e.g. the more read-friendly RAID levels among 0-6) could help achieve faster IOPS.

One could also use a multicore machine to parallelize request processing.

- venkatvi November 28, 2015
Preprocess the DB and create files with a title -> movieInfo mapping: calculate the hash of each title and append that record to the corresponding file. Given a query, calculate the hash and load the corresponding file; if the file is larger than 250 MB, load it in parts and search the contents.
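This partitioning idea can be sketched as below. The bucket count, directory layout, and `title\tinfo` record format are illustrative choices, not part of the original answer:

```python
import hashlib
import os

# Chosen so each bucket of a 10 GB file averages ~160 MB, under 250 MB.
N_BUCKETS = 64

def bucket_of(title):
    """Map a title to a bucket number via its MD5 hash."""
    digest = hashlib.md5(title.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % N_BUCKETS

def preprocess(src_path, out_dir):
    """One pass over the DB: append each 'title\\tinfo' record to the
    bucket file its title hashes to."""
    os.makedirs(out_dir, exist_ok=True)
    with open(src_path, "r", encoding="utf-8") as src:
        for line in src:
            title = line.split("\t", 1)[0]
            bucket = os.path.join(out_dir, "bucket_%02d" % bucket_of(title))
            with open(bucket, "a", encoding="utf-8") as out:
                out.write(line)

def query(out_dir, title):
    """Hash the query title and scan only its bucket file."""
    bucket = os.path.join(out_dir, "bucket_%02d" % bucket_of(title))
    if not os.path.exists(bucket):
        return None
    with open(bucket, "r", encoding="utf-8") as f:
        for line in f:
            name, _, info = line.rstrip("\n").partition("\t")
            if name == title:
                return info
    return None
```

With enough buckets each one fits in RAM, so a query touches roughly 1/N of the data instead of all of it.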

- v May 25, 2015

Stats:

10GB data

250 MB RAM

How many movies are there in the world? 0.5 million?

Amount of information available per movie = 10 GB / 0.5 M = 20 KB. So this is most likely textual information and not the movie itself.

How much can 250 MB of RAM hold? 12.5K movies at a time, i.e. 40 loads of the data per query to search for a title. (Linear or binary search: either way, you still need to load each section of data just to confirm it is not what you are looking for!)

Alternative approach: MD5-hash the movie title = 16 bytes (128 bits). Add another ~200 bytes to store the movie's location on disk, for ~250 bytes per movie in a hash map, so 1 M movies can be looked up!

So, when a query comes in: MD5-hash the movie title, look it up in the hash map, go to disk (one I/O), fetch the description, and display it!

- Multidisk arrays? Store the disk location in the hash map.

- RAID for greater IOPS? (Prefer read-friendly RAID levels for lower cost.)

Use a multi-core machine to parallelize request processing.

- vaidehi.venkatesan November 28, 2015

