ashish
So this is somewhat like designing a recommendation engine.
Here is my approach.
Let's say there are three users:
User A: products bought: A, B, C
User B: products bought: B, F
User C: products bought: D, A, C
Now when User D buys product C, we need to recommend A, D, and B to him.
Approach:
Create an undirected graph using an adjacency-list representation, plus a hashmap that gives us each node's reference in the graph.
For user A, the graph will have node A as one node, with node B and node C as its neighbour nodes.
For node B, node A and node C are its neighbours.
For node C, node A and node B are its neighbours.
Also keep inserting these node references into the hashmap as they are created.
Keep repeating these steps for each user.
Now for User D, we traverse one level from node C in BFS manner and give him the output.
We can improve this by maintaining, on each edge, a count of how often the two products were bought together, so recommendations can be ranked; a sketch of this is below.
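A minimal sketch of the above in Python, assuming the whole graph fits in memory. The adjacency map here doubles as the hashmap (product ID to neighbours), and the edge weights hold the co-purchase counts; add_purchase and recommend are names I've invented for the two operations described above.

```python
from collections import defaultdict

# Undirected co-purchase graph: product -> neighbour -> times bought together
graph = defaultdict(lambda: defaultdict(int))

def add_purchase(products):
    """Connect every pair of products bought together by one user."""
    for i, p in enumerate(products):
        for q in products[i + 1:]:
            graph[p][q] += 1
            graph[q][p] += 1

def recommend(product, k=3):
    """One-level BFS from the product's node: return its neighbours,
    ranked by how often they were bought with it."""
    neighbours = graph[product]
    return sorted(neighbours, key=neighbours.get, reverse=True)[:k]

# The worked example from above:
add_purchase(["A", "B", "C"])  # User A
add_purchase(["B", "F"])       # User B
add_purchase(["D", "A", "C"])  # User C
print(recommend("C"))          # ['A', 'B', 'D'] -- A twice, B and D once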
Now, how can we scale the above approach?
Since there could be millions of products, an in-process hashmap may not be an ideal solution.
We can move the hashmap and graph into an in-memory cache/DB such as Redis or Aerospike, which can be scaled accordingly.
In this DB the key is a product ID and the value is the set of adjacent nodes, i.e. the products that were bought together with the product in the key.
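One way to realize this, sketched with the redis-py client (assuming a reachable Redis instance; the copurchase: key prefix is an invented convention): a sorted set per product holds the adjacent products, ZINCRBY maintains the bought-together count, and ZREVRANGE returns the best-scoring neighbours.

```python
import redis

r = redis.Redis(decode_responses=True)  # assumes a local Redis instance

def add_purchase(products):
    """For each pair bought together, bump the co-purchase score in
    the other product's sorted set (key: product ID, member: neighbour)."""
    for i, p in enumerate(products):
        for q in products[i + 1:]:
            r.zincrby(f"copurchase:{p}", 1, q)
            r.zincrby(f"copurchase:{q}", 1, p)

def recommend(product, k=3):
    """Top-k neighbours by co-purchase count, highest score first."""
    return r.zrevrange(f"copurchase:{product}", 0, k - 1)
```

Since the adjacency lists live in Redis rather than in one process, the data can be sharded across a cluster by product ID.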
To scale it, we can deploy servers that serve requests regionally or according to domain suffix, e.g. for .in or .com: the requests coming to .in and .com will be served from different servers.
Region-wise servers are backed by a region-wise cache; each cache holds the trending news of a particular region for each topic.
A user subscribes to a number of topics. When the user logs in, the trending news for his/her subscribed topics is fetched from the region-wise cache, topic by topic, and served.
Now we need to update the region-wise cache frequently. A cron job keeps running in the background that fetches news from the news feeds, aggregates the news from the different feeds, and updates the region-wise cache.
By breaking the system up region-wise it is easy to scale, since the load is split across regions rather than landing on one cluster, and each region-wise cluster can be scaled easily.
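A toy sketch of that cache-plus-cron arrangement, assuming one such process per region-wise cluster (fetch_feed, the region list, and the topic list are all invented placeholders, and a real deployment would use an actual scheduler rather than a sleep loop):

```python
import threading, time
from collections import defaultdict

# cache[region][topic] -> trending headlines for that region and topic
cache = defaultdict(dict)

def fetch_feed(region, topic):
    """Invented placeholder for fetching and aggregating the news feeds."""
    return [f"{region}/{topic} story {i}" for i in range(3)]

def refresh_once(regions, topics):
    """One aggregation pass: rebuild the region-wise cache from the feeds."""
    for region in regions:
        for topic in topics:
            cache[region][topic] = fetch_feed(region, topic)

def cron_loop(regions, topics, interval=300):
    """The background 'cron job' described above."""
    while True:
        refresh_once(regions, topics)
        time.sleep(interval)

def serve(region, user_topics):
    """On login, serve the user's subscribed topics from the regional cache."""
    return {t: cache[region].get(t, []) for t in user_topics}

threading.Thread(target=cron_loop, args=(["in", "com"], ["tech"]),
                 daemon=True).start()
refresh_once(["in", "com"], ["tech"])  # warm the cache before first request
print(serve("in", ["tech"]))           # {'tech': ['in/tech story 0', ...]}
```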
Since there could be many servers, each server has its own separate log file.
Rather than processing the log files one by one, what we can do is have each server maintain a min-heap of its top requests, plus a hashmap that indexes each query and is used to update that server's min-heap.
Then combine the results of all the servers, i.e. take the min-heaps of all servers and consolidate them into one large result to produce the response.
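A sketch of both steps in Python, assuming each server's log fits in memory for counting (top_k_for_server and merge_servers are invented names):

```python
import heapq
from collections import Counter

def top_k_for_server(log_lines, k=10):
    """Per-server step: count each query with a hashmap, then keep a
    size-k min-heap of (count, query) so the smallest is evicted first."""
    counts = Counter(log_lines)
    heap = []
    for query, count in counts.items():
        heapq.heappush(heap, (count, query))
        if len(heap) > k:
            heapq.heappop(heap)  # drop the current minimum
    return heap

def merge_servers(per_server_heaps, k=10):
    """Combine step: re-aggregate counts for queries that appear in
    several servers' heaps, then take the overall top k."""
    total = Counter()
    for heap in per_server_heaps:
        for count, query in heap:
            total[query] += count
    return total.most_common(k)

server1 = top_k_for_server(["/home", "/cart", "/home"])
server2 = top_k_for_server(["/home", "/search"])
print(merge_servers([server1, server2]))  # ('/home', 3) ranks first
```

One caveat worth noting: merging per-server top-k lists is an approximation, since a query that sits just below the cut-off on every server can be missed even if its global count is large.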
Caching can always help.

- ashish July 03, 2016

Use a connection pool, which will solve the problem in well under 35 seconds.
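For illustration, a minimal generic pool (the ConnectionPool class and the sqlite3 factory are just invented stand-ins for whatever backend the real system talks to): connections are opened once up front and reused, instead of paying the connection-setup cost on every request.

```python
import queue
import sqlite3

class ConnectionPool:
    """Minimal pool: pre-open n connections and hand them out from a
    thread-safe queue instead of opening one per request."""
    def __init__(self, create_conn, size=8):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(create_conn())

    def acquire(self):
        return self._pool.get()  # blocks until a connection is free

    def release(self, conn):
        self._pool.put(conn)

pool = ConnectionPool(lambda: sqlite3.connect(":memory:"), size=4)
conn = pool.acquire()
conn.execute("SELECT 1")
pool.release(conn)
```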