Amazon Interview Question
SDE-2s
Country: India
Interview Type: In-Person
Avoid jumping straight into class diagrams and code-level implementation at the beginning. High-level design techniques are fairly straightforward, so having a good hold on them and starting with them gives you an upper hand. The best way to approach such open-ended questions in an interview is to divide the design process into two parts:
a) High-Level Design
- Start building bottom-up.
- Discuss whether your application is read-heavy or write-heavy. {Extract the various operations}
- Try to understand how your system will communicate with different components.
- Discuss which databases you will use and give reasons for them. {SQL/NoSQL}
- Discuss which two of the three CAP theorem properties the interviewer wants you to focus on.
- Present ways to scale the system in case of heavy traffic.
# Caching approach
# Sharding database
# Adding Read replicas
# Discuss whether you can use a CDN for fewer network hops
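To make the sharding point concrete, here is a minimal sketch of hash-based shard routing for video metadata. The names `shard_for` and `ShardedStore` are hypothetical, and the in-memory dicts only stand in for real database shards; this is an illustration under those assumptions, not a production design.

```python
import hashlib


def shard_for(video_id: str, num_shards: int) -> int:
    """Map a video ID to a shard index using a stable hash."""
    digest = hashlib.sha256(video_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy in-memory stand-in for a sharded metadata store (hypothetical)."""

    def __init__(self, num_shards: int) -> None:
        # One dict per shard; a real system would hold DB connections here.
        self.shards = [{} for _ in range(num_shards)]

    def _shard(self, video_id: str) -> dict:
        return self.shards[shard_for(video_id, len(self.shards))]

    def put(self, video_id: str, metadata: dict) -> None:
        self._shard(video_id)[video_id] = metadata

    def get(self, video_id: str):
        # Returns None when the video is unknown.
        return self._shard(video_id).get(video_id)
```

Plain modulo routing like this remaps most keys when the shard count changes, which is why consistent hashing is usually brought up as the follow-up in this discussion.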
b) Low-Level Design
- Select a component for which you want to discuss an LLD.
- Start with class diagrams.
- Discuss the various use cases. If possible, present a use case diagram.
- List a few APIs relevant to this service.
- Now start introducing design patterns. {Factory, Observer, and Chain of Responsibility are a few that apply to most systems}
- If possible, discuss the SOLID design principles.
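As one example of the patterns listed above, the Observer pattern could model subscribers being notified when a channel uploads a video. This is only a sketch; the `Channel` class and the callback signature are assumptions made for illustration.

```python
from typing import Callable, List


class Channel:
    """Subject in the Observer pattern: notifies subscribers on upload."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscribers: List[Callable[[str, str], None]] = []

    def subscribe(self, callback: Callable[[str, str], None]) -> None:
        # Each observer is a callable taking (channel_name, video_title).
        self._subscribers.append(callback)

    def upload(self, title: str) -> None:
        # Storing the video is out of scope here; only notification is shown.
        for notify in self._subscribers:
            notify(self.name, title)


inbox: List[str] = []
channel = Channel("demo")
channel.subscribe(lambda ch, title: inbox.append(f"{ch}: {title}"))
channel.upload("first video")
```

The channel knows nothing about what each subscriber does with the notification, so new notification types (email, push, feed update) can be added without touching the upload path.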
Video class - contains video attributes like time, author, type, and so on; media functions like start, stop, pause, resume, previous, next
YouTube class - upload, download, archive
Playlist class
SocialMedia class - for share and like
Comments class - to contain comments/user data
User class - to contain user information
UserStatus class
UserPlaylist class extending Playlist class
Search/filter class (search video, related video, etc.)
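A rough skeleton of part of this class list might look like the following. It is a sketch under assumptions: the attribute names and the string-valued `state` are invented for illustration, and previous/next are placed on Playlist rather than Video because they need to know the neighbouring videos, which is a design choice and not something the list above prescribes.

```python
class Video:
    """Video attributes plus basic playback controls."""

    def __init__(self, video_id: str, author: str, duration_s: int) -> None:
        self.video_id = video_id
        self.author = author
        self.duration_s = duration_s
        self.state = "stopped"

    def start(self) -> None:
        self.state = "playing"

    def pause(self) -> None:
        if self.state == "playing":
            self.state = "paused"

    def resume(self) -> None:
        if self.state == "paused":
            self.state = "playing"

    def stop(self) -> None:
        self.state = "stopped"


class Playlist:
    """Ordered list of videos with a cursor for previous/next."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.videos = []
        self._pos = 0

    def add(self, video: Video) -> None:
        self.videos.append(video)

    def current(self):
        return self.videos[self._pos] if self.videos else None

    def next(self):
        # Stays on the last video when already at the end.
        if self._pos + 1 < len(self.videos):
            self._pos += 1
        return self.current()

    def previous(self):
        # Stays on the first video when already at the start.
        if self._pos > 0:
            self._pos -= 1
        return self.current()


class UserPlaylist(Playlist):
    """A playlist owned by a specific user."""

    def __init__(self, name: str, owner_id: str) -> None:
        super().__init__(name)
        self.owner_id = owner_id
```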
My idea is as follows -
- rsk October 26, 2016
1. For the storage of videos, the video files themselves could be stored on the server (each with an absolute path ID); similarly, 'Reviews' can be stored in flat files. Then there can be a few DB tables such as 'Videos', 'Users', and 'Categories'. The Videos table primarily has columns such as 'video_id', 'Name', 'Absolute path', 'Creator', 'Number of likes', 'Review location', 'Category Ids', etc.
The idea is to use RDBMS tables to efficiently query all the info associated with a given video, while storing the video file and the set of reviews/comments in flat files, which the tables refer to through the corresponding 'path locations'.
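The table layout described above could be sketched with Python's built-in sqlite3 module as follows. The column types, the example paths, and the separate `video_categories` join table (used here in place of a 'Category Ids' column) are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE categories (
    category_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE videos (
    video_id        INTEGER PRIMARY KEY,
    name            TEXT NOT NULL,
    absolute_path   TEXT NOT NULL,      -- where the video file lives
    creator_id      INTEGER NOT NULL REFERENCES users(user_id),
    num_likes       INTEGER NOT NULL DEFAULT 0,
    review_location TEXT                -- path to the flat file of reviews
);
-- Assumed join table standing in for the 'Category Ids' column.
CREATE TABLE video_categories (
    video_id    INTEGER NOT NULL REFERENCES videos(video_id),
    category_id INTEGER NOT NULL REFERENCES categories(category_id),
    PRIMARY KEY (video_id, category_id)
);
""")

# Hypothetical sample rows; only paths are stored, not the media itself.
conn.execute("INSERT INTO users (user_id, name) VALUES (1, 'alice')")
conn.execute(
    "INSERT INTO videos (video_id, name, absolute_path, creator_id, review_location) "
    "VALUES (1, 'demo', '/videos/1.mp4', 1, '/reviews/1.txt')"
)

# One query returns everything needed to locate and describe a video.
row = conn.execute(
    "SELECT v.name, v.absolute_path, u.name, v.num_likes "
    "FROM videos v JOIN users u ON u.user_id = v.creator_id "
    "WHERE v.video_id = ?",
    (1,),
).fetchone()
```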
2. Coming to the Model layer design, one could use a bit of the 'strategy pattern' to identify the basic objects as 'Users', 'Videos', and 'Moderator/YouTube Controller'.
"User" class basically has all getters/setters for user info, uploads done, videos liked, categories subscribed to, etc.
"Video" class basically has all info regarding the video itself, such as video_id and creator, plus methods to fetch the number of likes, reviews, deletion logic, etc.
"Moderator" class then interacts with User and Video for upload, fetchNPlay, deletion, etc.
Here the idea is to decouple these objects so that features can be added or modified in one without impacting the others.
3. Coming to UI logic, scaling, performance, etc.: all of that would primarily reside in the Moderator class. When many users are expected to watch a video in parallel, a memcache could be implemented for videos belonging to 'TopWatched', 'TopRated', 'Trending Categories', etc. (technically these would be methods in the Moderator class). Also, for performance, a particular video could be stored redundantly across servers/locations to serve multiple parallel fetches (in this case, the logic for storing the location would change a little to include region-wise location IDs, primary/secondary locations, etc.). Again, this might be used not for all videos but only for the top-viewed ones. The Moderator can keep this list updated.
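The caching idea in point 3 could be sketched as a small cache-aside layer with least-recently-used eviction. The `VideoCache` class and the `load_from_store` callback are hypothetical stand-ins for a memcache client and the backing database; this is an illustration, not how any particular cache library works.

```python
from collections import OrderedDict
from typing import Callable


class VideoCache:
    """Cache-aside lookup with LRU eviction for hot video metadata."""

    def __init__(self, capacity: int, load_from_store: Callable[[str], str]) -> None:
        self.capacity = capacity
        self.load_from_store = load_from_store  # called only on a miss
        self._entries: "OrderedDict[str, str]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, video_id: str) -> str:
        if video_id in self._entries:
            # Mark as most recently used and serve from the cache.
            self._entries.move_to_end(video_id)
            self.hits += 1
            return self._entries[video_id]
        # Miss: fall back to the slower store, then remember the result.
        self.misses += 1
        value = self.load_from_store(video_id)
        self._entries[video_id] = value
        if len(self._entries) > self.capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return value
```

With a small capacity, frequently requested videos stay cached while rarely watched ones fall out, which roughly matches the intent of caching only the top-viewed list.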
Please let me know your opinion on this design.