Google Interview Question
Software Engineer / Developer
Country: United States
Interview Type: In-Person
This sounds promising. So all the nodes have their own linked list, but the primary node has both the underlying linked list and a local copy. Whenever a change is made, it goes to the local copy (locking the local copy in the meantime). When the local copy is updated, we update the underlying linked list on the primary node, then broadcast a message to all the other nodes to update their linked lists to the latest version? How should we use the changeID, though?
This can be done without distributed locks--however, coding and analyzing this approach in a short interview is really not realistic. Links aren't allowed here, so just google "Lock-Free Linked Lists and Skip Lists" by Fomitchev and Ruppert.
Oh, it looks very complicated. I believe the interviewer just wanted me to describe the approach, not actually code it. Is there a better, cleaner, and simpler way to do this?
One idea that might work is to take a "transactional" approach. Instead of trying to keep the linked lists directly synchronized, keep a log of operations performed on the list, and make this log the thing you synchronize. The idea is that the current list can be reconstructed by applying the log of operations, in sequence, to a neutral starting point (the empty list).
This has obvious downsides, but it's fairly simple, and it works because the log can be kept synchronized via something very well-known like the Paxos algorithm.
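A minimal sketch of that log-replay idea (the operation encoding here is made up for illustration; the point is only that any replica holding the same agreed-upon log derives the same list, so consensus like Paxos only ever has to run over log entries):

```python
# The list itself is never synchronized directly; only the append-only
# operation log is. Any replica can rebuild the current list state by
# replaying the log from the empty list.

def apply_op(lst, op):
    """Apply one logged operation to a plain Python list."""
    kind, *args = op
    if kind == "insert":        # ("insert", index, value)
        index, value = args
        lst.insert(index, value)
    elif kind == "delete":      # ("delete", index)
        lst.pop(args[0])
    return lst

def replay(log):
    """Reconstruct the list from the agreed-upon log of operations."""
    state = []
    for op in log:
        apply_op(state, op)
    return state

# Every replica that has the same log derives the same list:
log = [("insert", 0, "a"), ("insert", 1, "b"), ("delete", 0)]
print(replay(log))  # ['b']
```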
Others may have better ideas. I have no experience with distributed locks personally so I'm not sure how easy of a solution that is.
Please check the link below for the solution:
trydatastructure.blogspot.com/2014/01/technical-question-answer.html
It only explains how the lock system works; it doesn't specify how the changes are synchronized: using messages that contain a list of new commands, using a complete log file, or even just synchronizing the list directly.
Hi Kelvin,
Regarding synchronizing the linked list:
I have mentioned it in the blog, i.e.:
"Once exclusive lock holder done with its operation then it can ACK along with changed data as piggyback data."
That is, once a node has modified the data with exclusive permission (i.e., no one else can even read it), and the lock manager then requests a lock revoke from the exclusive lock holder, it means some other node is trying to read or write.
The node that previously modified the data (under the exclusive lock) will then send the modified data as "piggyback" data, and whoever is next granted the read or write lock can use this piggyback data to get the latest/updated information.
Please let me know if I am not clear (my mail ID is vishal.p.lad@gmail.com).
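A toy simulation of the piggyback idea described above: when the lock manager revokes an exclusive lock, the holder ACKs with its modified copy of the list attached, and the manager hands that copy to the next grantee. All class and method names here are invented for the sketch, not taken from the blog:

```python
# Single-process stand-in for a distributed lock manager: "revoke" and
# "ACK with piggyback data" are modeled as direct method calls.

class LockManager:
    def __init__(self, initial):
        self.latest = list(initial)  # piggybacked data from last holder
        self.holder = None

    def acquire(self, node):
        if self.holder is not None:
            # Revoke: the previous holder ACKs with its modified data.
            self.latest = self.holder.ack_revoke()
        self.holder = node
        return list(self.latest)     # grantee starts from latest data

class Node:
    def __init__(self, name):
        self.name = name
        self.copy = []

    def ack_revoke(self):
        return list(self.copy)       # piggyback modified data in the ACK

    def append_value(self, manager, value):
        self.copy = manager.acquire(self)
        self.copy.append(value)      # modify under the exclusive lock

mgr = LockManager([])
a, b = Node("A"), Node("B")
a.append_value(mgr, 1)   # A appends 1 under the lock
b.append_value(mgr, 2)   # B receives A's change via piggyback, appends 2
print(b.copy)            # [1, 2]
```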
We can tackle this problem using a master for synchronization and N nodes for actual storage of the linked list.
Every time a node wants to do something to the linked list (since we are using a linked list, we can assume the available operations are append on either end / remove from either end), it first has to acquire a lock whose state is controlled by the master.
This will work but won't allow for much concurrency as no parallel operations would be possible.
If we want to allow parallel operations to occur, we can have operations reported (marked as => below) to the master before they are actually performed on any node:
Operation 1 - Node i => Append 3 to the linked list
Operation 2 - Node j => Remove the top element of the linked list
The master would allow operation 1 and start propagating it to all N nodes.
Then it would see the operation 2 and would not allow it as it is not compatible with operation 1 (add/delete on the same side cannot be done concurrently).
Another sequence could be:
Operation 1 - Node i => Append 3 to the linked list
Operation 2 - Node j => Append 5 to the linked list
In that case, the master could allow each operation to be performed but would propagate operation 1 before it actually starts propagating operation 2.
In my opinion, the key points to ask here are:
- Should we allow parallel operations, and which ones? If not, then a master lock will do fine.
- Does it matter if the order of insertions/deletions is not respected (i.e., can we permute two consecutive appends or two consecutive deletes)? If not, then we can allow some level of concurrency.
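The master's compatibility check from the two example sequences above can be sketched as follows. The conflict rule (same end of the list conflicts unless both operations are appends, which the master simply serializes) is one plausible reading of this answer, not a definitive protocol:

```python
# An operation is ("append"|"remove", "head"|"tail"[, value]).
# The master tracks in-flight operations and admits a new one only if
# it is compatible with everything still being propagated.

def side(op):
    return op[1]

def compatible(in_flight, new_op):
    """Allow new_op unless a pending op touches the same end of the
    list, except when both are appends (propagated in arrival order)."""
    for pending in in_flight:
        if side(pending) == side(new_op):
            if pending[0] == "append" and new_op[0] == "append":
                continue  # two appends to the same end: serialized, OK
            return False
    return True

in_flight = [("append", "tail", 3)]            # Operation 1 in flight
print(compatible(in_flight, ("remove", "tail")))     # False (conflict)
print(compatible(in_flight, ("append", "tail", 5)))  # True (serialized)
print(compatible(in_flight, ("remove", "head")))     # True (other end)
```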
There can be two ways of doing this:
[1] A consistent approach, where the linked lists on different computers are guaranteed to be consistent. This, however, comes at the expense of availability and latency. When an update happens on one machine, it sends the update to the other machines and waits until a majority have applied it. When a read query comes in, it must likewise get a majority of machines to agree on the state of the linked list. Paxos may be used to do this more efficiently.
[2] An eventually consistent approach, where the linked lists may be out of sync for a short time but will converge eventually. Here, every machine sends its updates, with timestamps, to everyone else. When a machine receives the others' updates, it applies them together with its own in timestamp order.
This is just a brief sketch; there are more issues here, such as out-of-order updates, machine failures, network partitions, durability (commit log), etc.
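The eventually consistent variant [2] can be sketched like this: every update is tagged with a (timestamp, machine_id) pair, and each machine rebuilds its list from all updates it has seen, in timestamp order, so replicas that have exchanged the same updates converge to the same list. This ignores clock skew; a real system would use something like Lamport clocks. All names below are illustrative:

```python
# Updates are (timestamp, machine_id, op, value) tuples.

def merge(local_updates, remote_updates):
    """Deterministically order updates from all machines; the
    machine_id breaks ties between same-timestamp updates."""
    return sorted(local_updates + remote_updates,
                  key=lambda u: (u[0], u[1]))

def rebuild(updates):
    """Replay the merged updates to get the current list state."""
    state = []
    for _, _, op, value in updates:
        if op == "append":
            state.append(value)
        elif op == "remove" and state:
            state.pop()
    return state

m1 = [(1, "A", "append", "x"), (3, "A", "append", "z")]
m2 = [(2, "B", "append", "y")]
# Both machines converge once they have exchanged updates:
print(rebuild(merge(m1, m2)))  # ['x', 'y', 'z']
print(rebuild(merge(m2, m1)))  # ['x', 'y', 'z']
```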
Here is my solution for this problem:
1.) Every node (server) maintains two copies: a local copy (the current locally changed version) and a shadow copy (representing the copy it last received from the other servers).
2.) There is an operation stack that records all operations performed on the local list until the node receives an ACK back from the network for the changes it sent last time.
3.) When the node gets the ACK, it removes all operations from the stack up to that version.
4.) If, while merging received changes into the local copy, the server finds that its stored version and the received version are out of sync by more than one version, it discards its local changes, restores from the shadow copy, recomputes the diff, and sends it across the network again.
This is similar to the approach used by online shared document editors, and it scales really well.
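A rough Python sketch of steps 1-4 above, with the network reduced to direct method calls (everything here is illustrative, not a real protocol):

```python
class Server:
    def __init__(self):
        self.local = []        # current locally changed version
        self.shadow = []       # last version confirmed by the network
        self.pending = []      # (version, op) entries awaiting ACK
        self.version = 0

    def edit(self, op, value=None):
        """Record an operation locally and stack it until ACKed."""
        self.version += 1
        self.pending.append((self.version, (op, value)))
        if op == "append":
            self.local.append(value)
        elif op == "remove" and self.local:
            self.local.pop()

    def on_ack(self, acked_version):
        # Step 3: drop acknowledged operations and advance the shadow.
        self.pending = [(v, op) for v, op in self.pending
                        if v > acked_version]
        self.shadow = list(self.local)

    def on_conflict(self):
        # Step 4: versions diverged; roll back to the shadow copy.
        # A real implementation would recompute the diff from here
        # and resend it across the network.
        self.local = list(self.shadow)
        self.pending.clear()

s = Server()
s.edit("append", 1)
s.on_ack(1)          # change confirmed; shadow is now [1]
s.edit("append", 2)
s.on_conflict()      # unconfirmed change discarded
print(s.local)       # [1]
```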
We can use git's approach here. Assuming there's a single source of truth on one server, the other servers can try to send updates to the central one. The central server accepts or rejects updates received from others. It accepts if the requests are sequential (say based on a timestamp). It rejects a request if it is older than a timestamp already applied to the list on the central server. The requester can then pull the latest state from central server and then push its updates back to the central server, which may again accept or reject those changes. Arguably, if a lot of changes are happening in the system, then a lot of the peripheral servers will get rejected and will have to keep retrying (similar to the rebase problem of concurrent merge requests in github)
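A toy version of this git-style scheme: the central server keeps the authoritative list plus the timestamp of the last applied change, and rejects any push whose base timestamp is stale, forcing the client to pull and retry (the "rebase" analogy). All names are invented for the sketch:

```python
class CentralServer:
    def __init__(self):
        self.state = []
        self.last_ts = 0    # timestamp of the last applied change

    def push(self, base_ts, ops):
        """Accept ops only if the pusher is based on the latest state."""
        if base_ts < self.last_ts:
            return False    # stale: client must pull and retry
        for op, value in ops:
            if op == "append":
                self.state.append(value)
            elif op == "remove" and self.state:
                self.state.pop()
        self.last_ts += 1
        return True

    def pull(self):
        """Return the latest state and its timestamp."""
        return list(self.state), self.last_ts

central = CentralServer()
ok = central.push(0, [("append", "a")])      # accepted
stale = central.push(0, [("append", "b")])   # rejected: based on ts 0
state, ts = central.pull()                   # client catches up
retry = central.push(ts, [("append", "b")])  # accepted after the pull
print(ok, stale, retry, central.state)       # True False True ['a', 'b']
```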
One solution is to identify one node as the primary node. Every time a change needs to be made to a local copy: lock the linked list on the primary node, perform the change on the local copy, update the primary node to reflect it, release the lock, and broadcast a message with a ChangeID to every other node so they can sync to the latest version.
- Kiran Kumar January 22, 2014
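The primary-node scheme with a monotonically increasing ChangeID can be sketched as below: the primary serializes writes under a lock, stamps each change, and replicas apply broadcasts only in ChangeID order (a gap would trigger a full resync in a real system). The broadcast is modeled as direct method calls, and all names are illustrative:

```python
import threading

class Replica:
    def __init__(self):
        self.state = []
        self.last_seen = 0

    def receive(self, change_id, op, value):
        if change_id != self.last_seen + 1:
            return  # gap detected: would request a full resync here
        if op == "append":
            self.state.append(value)
        elif op == "remove" and self.state:
            self.state.pop()
        self.last_seen = change_id

class Primary:
    def __init__(self, replicas):
        self.lock = threading.Lock()
        self.state = []
        self.change_id = 0
        self.replicas = replicas

    def apply(self, op, value=None):
        with self.lock:                    # lock the primary's list
            if op == "append":
                self.state.append(value)
            elif op == "remove" and self.state:
                self.state.pop()
            self.change_id += 1
            # Broadcast the change with its ChangeID to every replica.
            for r in self.replicas:
                r.receive(self.change_id, op, value)

replicas = [Replica(), Replica()]
primary = Primary(replicas)
primary.apply("append", 1)
primary.apply("append", 2)
print([r.state for r in replicas])  # [[1, 2], [1, 2]]
```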