Yahoo Interview Question
Software Engineer / Developers
Cases like what to do when one of the systems goes down, how you would deal with the loss of data, etc.
There can be various ways to deal with churn in distributed systems. In the case of DHT, the following methods are popular:
(1) Replication at different nodes in the system. Where to replicate is an interesting issue: the naive approach is to replicate at the successor/predecessor nodes (assuming the DHT nodes are organized in a ring, as in Chord, or in a multi-dimensional coordinate space, as in CAN). A better approach is to replicate dynamically, using content placement strategies that minimize the total search cost given how popular each item is in the query workload.
(2) Caching can be used as a solution: the content is cached at intermediate nodes along the query resolution path, which also reduces the search cost.
By default, the successor node takes over responsibility for the failed node's keys and answers future queries for them, which lets the system recover quickly from node or link failures.
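To make the successor-replication idea concrete, here is a minimal single-process sketch in Python. It is only an illustration under stated assumptions: the class name ReplicatedRing, the replicas parameter, and the in-memory alive set that stands in for failure detection are all made up for this example and do not come from Chord or any real library.

```python
import hashlib
from bisect import bisect_left


def ring_hash(value: str) -> int:
    """Map a string onto the ring using SHA-1 (160-bit identifier)."""
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest(), "big")


class ReplicatedRing:
    """Toy DHT ring: each key lives on its owner plus the next successors.

    Hypothetical illustration only; real systems also detect failures
    over the network and re-replicate data after a node is lost.
    """

    def __init__(self, node_names, replicas=3):
        self.replicas = replicas
        # Nodes sorted by their position on the ring.
        self.ring = sorted((ring_hash(n), n) for n in node_names)
        self.alive = set(node_names)             # stand-in for failure detection
        self.store = {n: {} for n in node_names}  # per-node local storage

    def preference_list(self, key):
        """Owner of the key followed by its successors (distinct nodes)."""
        ids = [node_id for node_id, _ in self.ring]
        # First node whose id is >= the key's hash; wrap around at the end.
        start = bisect_left(ids, ring_hash(key)) % len(self.ring)
        count = min(self.replicas, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def put(self, key, value):
        """Write the value to every live node in the preference list."""
        for node in self.preference_list(key):
            if node in self.alive:
                self.store[node][key] = value

    def get(self, key):
        """Read from the first live replica; successors cover a failed owner."""
        for node in self.preference_list(key):
            if node in self.alive and key in self.store[node]:
                return self.store[node][key]
        raise KeyError(key)

    def fail(self, node):
        """Simulate a node crash."""
        self.alive.discard(node)


ring = ReplicatedRing(["node-a", "node-b", "node-c", "node-d", "node-e"])
ring.put("user:42", "alice")
owner = ring.preference_list("user:42")[0]
ring.fail(owner)                   # the owner goes down
recovered = ring.get("user:42")    # served by the next live successor
```

Note that this sketch never re-replicates: if all replicas of a key fail before anything copies the data elsewhere, the key is lost, which is exactly the data-loss case the question asks about.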
We can use a distributed hash table (DHT), in which each node in the distributed system manages a predefined set of keys. Along with the hash table itself, there is functionality to route storage and retrieval requests to the other nodes in the DHT.
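A rough sketch of the routing part, in Python, assuming a ring layout where every node knows only its predecessor and successor and forwards a request until it reaches the node responsible for the key. The names Node, build_ring and route are hypothetical, and real DHTs such as Chord keep extra routing state to avoid the linear number of hops this naive version needs.

```python
import hashlib


def ring_id(value: str) -> int:
    """Map a node name or key onto the ring using SHA-1."""
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest(), "big")


def in_interval(x: int, start: int, end: int) -> bool:
    """True if x lies in the ring interval (start, end], allowing wrap-around."""
    if start < end:
        return start < x <= end
    # The interval wraps past zero, or covers the whole ring when start == end.
    return x > start or x <= end


class Node:
    """One DHT participant: stores the keys in (predecessor.id, self.id]."""

    def __init__(self, name):
        self.name = name
        self.id = ring_id(name)
        self.predecessor = self
        self.successor = self
        self.data = {}

    def owns(self, key_id):
        return in_interval(key_id, self.predecessor.id, self.id)

    def route(self, key_id):
        """Forward along successors until the responsible node is found."""
        node, hops = self, 0
        while not node.owns(key_id):
            node = node.successor
            hops += 1
        return node, hops

    def put(self, key, value):
        owner, _ = self.route(ring_id(key))
        owner.data[key] = value

    def get(self, key):
        owner, _ = self.route(ring_id(key))
        return owner.data[key]


def build_ring(names):
    """Link nodes into a ring ordered by identifier (static membership)."""
    nodes = sorted((Node(n) for n in names), key=lambda n: n.id)
    for i, node in enumerate(nodes):
        node.successor = nodes[(i + 1) % len(nodes)]
        node.predecessor = nodes[i - 1]
    return nodes


nodes = build_ring(["n1", "n2", "n3", "n4"])
nodes[0].put("user:42", "alice")    # the request may enter at any node
value = nodes[2].get("user:42")     # and is routed to the same owner
```

Because the request can enter at any node and still reach the same owner, clients do not need to know which machine holds a given key.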
- Dumbo February 15, 2009