Google Interview Question
SDE1sCountry: United States
This is Information retrieval question.
What you probably got to do is create a data structure which will capture the user input (unsubs / subs), and subsequently apply to the existing user data in certain scalable fashion which varies on what basis the data is being used.
Any descriptive question, would love to have a personal Interview then typing
Few clarifications needed :
1. When we say performant, are we saying number of users (of a website) can grow much beyond 2k ? or is it more of "Intra-website" performance needed as in website content is to be interacted with, in a faster way ?
2. Scalable means number of websites can potentially increase ? or overall number of users can go beyond 2k ?
Accordingly, one basic design could be to 'deploy' each website on two servers one acting as primary and other secondary, with a 'sync' mechanism in place. Each server(hosting a given site) stores its subscribed user list (above idea of userId is good one i guess).
Pros - Entire system could be fault-tolerant. Every primary is backed by secondary server.
Performance could be addressed using load-balancers installed on both servers, redirecting traffic appropriately.
Cons - If a user is subscribed to all site, his info is stored on all servers, redundant.
Thanks.
Taking a shot at this. We need to have a lookup table where they key is userId and value is a list of the websites that user is subscribed to. Let's assume we assign each user a 32-bit(4 byte) int user iD. Let's also assign 4 byte int iDs to the websites. If in the worst case a user is subscribed to all websites, a single key-value pair will take up 16Kb of space we have a total of 16Kb * 10^9 users = 1.6 * 10^12 bytes of data (1.6TB). Assume that a single server can store 8GB of data then 1.6 * 10^12 / 8 * 10^9 = 2 * 10^3 servers ( which is what we have). This allows us to store data on 500K users per server. Each server can store a range of User IDs (ex. server 1 stores IDs from 1...499 999). When a user unsubscribes based on his userId we can navigate to the correct server containing the corresponding look-up table. In the look up table delete the ID of the website he wishes to unsubscribe from.
- divm01986 November 26, 2016