eBay Interview Question for SDE-2s


Team: Traffic
Country: United States
Interview Type: In-Person




4 of 4 votes

@juny, this is one of those open-ended questions where there are a few good approaches rather than one single correct answer.

Some points I can see right now:
1) There are lots of events per day: 1 billion events / 86,400 sec ≈ 1.2 × 10^4, i.e. O(10^4) events/sec on average. The system should be able to handle this much load.

2) Since nothing is said about the distribution of events in the whole day, there can be dull periods and then big spikes in the events coming in. We should be able to handle these spikes gracefully.

3) The data structure we choose has to perform well against the core requirement: returning the most recent events in a given time period (the last 60 seconds here).

4) Holding 1 billion events in a single machine's RAM may not be feasible if individual events are not small. Even if they are small, later processing may need to pull per-event data into RAM, and with this many events that can quickly exhaust it.

5) The system must have very little downtime as there are so many events coming in every second.

6) What about persistence of events? Do we need to store every event that comes in? We may need to run some analysis on the incoming events later.
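The load and memory estimates in points (1) and (4) above can be checked with quick back-of-envelope arithmetic; the 100-bytes-per-event figure below is an assumption for illustration, not something given in the question:

```python
# Back-of-envelope check of the numbers in points (1) and (4).
EVENTS_PER_DAY = 1_000_000_000
SECONDS_PER_DAY = 86_400

avg_rate = EVENTS_PER_DAY / SECONDS_PER_DAY
print(f"average load: ~{avg_rate:,.0f} events/sec")  # ~11,574, i.e. O(10^4)

# Assuming (hypothetically) ~100 bytes per event record, a full day
# in RAM on one machine would need roughly:
BYTES_PER_EVENT = 100
day_in_ram_gb = EVENTS_PER_DAY * BYTES_PER_EVENT / 1e9
print(f"one day in RAM: ~{day_in_ram_gb:.0f} GB")  # ~100 GB
```

Even under this modest per-event size, a single day does not fit comfortably on one commodity machine, which motivates the distributed approach below.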

Some quick ideas for the above points:
(1) and (2): Load balancers can help here, along with queues to buffer incoming requests so that spikes don't overwhelm the system.

(3) Queues or lists that hold items ordered by timestamp can do the job here.

(4), (5) and (6): We need distributed data structures to handle this. We could partition the huge list of events across machines, with each machine keeping its own list of events ordered by timestamp. When a query comes in, every machine returns its events from the last 60 seconds; these lists are merged and sent back to the user who queried. Frameworks like Hadoop can help with this kind of processing. For persistence you could use HBase, since a single hard disk would probably not hold all the event data after a few weeks.
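The merge step in idea (4)-(6) can be sketched in a few lines. This is a minimal single-process sketch, assuming each machine returns its recent events as a timestamp-sorted list of `(timestamp, payload)` tuples; `merge_recent_events` is a hypothetical name for illustration:

```python
import heapq

def merge_recent_events(per_machine_events, now, window=60):
    """Merge timestamp-sorted event lists from several machines and
    keep only events from the last `window` seconds.

    per_machine_events: list of lists of (timestamp, payload) tuples,
    each already sorted by timestamp (the order each machine maintains).
    """
    # heapq.merge does an O(total) k-way merge of already-sorted inputs.
    merged = heapq.merge(*per_machine_events)
    return [(ts, ev) for ts, ev in merged if ts > now - window]

# Toy example with three "machines":
m1 = [(100, "a"), (150, "b")]
m2 = [(120, "c"), (160, "d")]
m3 = [(155, "e")]
print(merge_recent_events([m1, m2, m3], now=170))
# events newer than t=110, in timestamp order
```

Because each machine already keeps its list sorted, the coordinator never has to re-sort anything; it only merges and filters.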

Obviously, this is not a complete solution and there are flaws. Hope this helps!

- DevGuy January 24, 2014 | Flag Reply
0 of 0 votes

That's a good start. It would be even better if you could walk through the end-to-end interaction (a sequence diagram) for both the put and the get scenarios.

- juny January 25, 2014 | Flag
0 of 0 votes

Thank you for your answer.

- suwei19870312 July 04, 2014 | Flag
0 of 0 votes

It was more about discussing the system design as well.

- juny January 23, 2014 | Flag Reply
-1 of 5 votes

Use a double-ended queue holding the data from the last 60 seconds.
When new data arrives, remove the old entries (those older than 60 seconds) from one end of the queue and add the new data at the other end.

- thelineofcode January 22, 2014 | Flag Reply
-2 of 2 votes

For the data structure, a stack sized for roughly 10^9 / (24 × 60) ≈ 7 × 10^5 events (about one minute's worth) could be used, where the most recent event is pushed onto the top of the stack.

- ashi January 22, 2014 | Flag Reply