Design YouTube view-counting f

Google Interview Question for Software Engineers

0

of 0 votes

6
Answers
Design YouTube view-counting feature
- tested.candidate July 13, 2015 in United States | Report Duplicate | Flag | PURGE
Google Software Engineer System Design

Email me when people comment.

An error occurred in subscribing you.

Country: United States
Interview Type: In-Person

More Questions from This Interview

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 3 vote

Always simplify things and start building up.

Assume that we have a small set of videos that we need to track the counts. Also assume that we are not handling any spam. A basic approach is using relational databases (video_id, count) that is being incremented whenever a user clicks on.

What is the problem with this approach? The database operation is very expensive for each click. A possible solution we can have Memcache or Redis layer above the DB that counts the number of clicks for each video. And having a background job that reads the cached counts and reflects that in the DB. Pros: much faster, Cons: more complex, can lose data in case of server failure and cached values have not populated in the DB yet and the values in the DB are not the most recent values.

Another side of improvement is bandwidth wise. Assuem that it is not necessary that the number of hits to be really accurate. We can convert the connection request for increment to be based on UDP rather than TCP.

Again discuss with the interview the pros and cons and let him lead which direction should the discussion goes.

- ahmad.m.bakr July 14, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

How should we cope up with the competition created on acquiring the lock on that Redis key-value pair entity that we are trying to increment.

- asen November 14, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

You can probably enqueue a video id , each time video is viewed and some background process can pick items from queue and can maintain dictionary which it increments every time a video is picked and can write back to DB when dictionary reaches a certain size.

Since counts are always increasing so if multiple copies of background processes can run

- smarthbehl July 14, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Do you think a general queue of video ids would be better idea ? because there is too much request to enqueue it and consumer which is your background process will keep waiting. For me as soon as some one will request for video, server should add the view count. So every time a Map which has key - value pair where key is video id, is accessed, it should increase the view count at same time.Value can be pointer to an object which should have static view count. As soon as it will call view function, it should increase the counter and provide the object with that view number. Once it has to update in Database, it will update with video id and count after specific time.

- Pheonix August 02, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

I'm not sure how technical I would get right away with a question like this. I feel like you have to tackle the more abstract questions first. Registering that somebody has requested the page a video is on is trivial on the surface. Server gets a request for a video id, finds that id in a DB, increments it. How these things are done can be expanded upon at the DB level, at the level of the code in the server, etc etc

The bigger questions:

- What is a view? If I refresh a page, certainly this doesn't seem like a view, except what if I view it again a week later? Or I don't refresh the page at all, but simply keep replaying the video?

Since we are 'designing' the view count, it's our call. I would only allow one view per day per ip, and ask the interviewer about what holes would exist in blocking by ip. I'd defend this design by saying it's more important to keep view counts overly-honest, because the error will be much lower than those trying to abuse it, where they might boost a video to 1 million rather than a video having 100 views instead of 300.

- narth August 07, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

@asen
We don't really want a hard lock in this situation. I'm assuming a discrepancy of say 10 counts is acceptable.

You can always let your current browser video view count increment and send the count update request to the server. The server can queue the requests and update the count when processing the requests. With this scheme, the difference in view counts can be greater than one when you refresh the video.

- confused_coder August 21, 2016 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I want to point out one observation that hasn't been bought up in the previous discussion. The data you need is simply in your web server's log. Have a reducer to aggregate the log and periodically send back the data to a key-value store. Record-level locking would be fine, because latency requirement is not real-time here.

- Jack September 05, 2016 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

It could be done by keeping distributed logs with youtube_id and user_id and then combining them and updating the count in the database.

The entire idea is discussed here https: // youtu.be / zSGRQDMfLkI

- LearningNinja January 25, 2018 | Flag Reply

CareerCup

Google Interview Question for Software Engineers

Books

Videos

Resume Review

Mock Interviews