Amazon Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: In-Person
I would think that HashTable is much easier to support concurrency in many readers but only one writer, because we could lock or no lock the key-value pair when insertion or update(lock-free means we can delay-delete the node).Also for BST, I also think it's ok for insertion or delete, but for Red-Black tree or AVL tree, it is another matter for the rotations.
"BST gives good serialization when stored in PreOrder format but not in postorder or inorder."
I don't know why you say that.
"HashTable can be serialized only if one makes sure to overwrite the hash function with the keys in the has table during serialization."
I have no idea what you mean.
"I dont think BST basically supports much of concurrency stuff..Hashtables when moved to the concepts of concurrent hashmaps they do give concurrency support"
The Collections.synchronizedMap(...) method in Java, for instance, can make TreeMaps synchronized as easily as HashMaps. Whether a map is synchronized and whether it's backed by a tree or hash table don't have to be related.
"but as such the hash tables are synchronized so I dont think they would support concurrency"
I believe something like Collections.synchronizedMap(...) just locks the data structure every time it's doing a read or write, but more sophisticated implementations are definitely possible for both trees and hash tables.
In a hash table you could lock on a particular key value pair, so that no one can delete that key when one thread is using it. but anything can happen to the rest of the keys you don't care. But in a tree map you can't do that since it's implemented as a red black tree during an insertion/deletion the whole data structure can change. due to the rotations performed in it. so if you lock one key you are basically locking the whole data structure.
Except in my last paragraph, I was mostly talking about the case of basic synchronization: just making it thread-safe, as opposed to having it be concurrent in any particularly optimized way. Basic thread-safety is easily doable for both hashmaps and treemaps.
In my last paragraph, I was thinking about how there are at least some improvements for both trees and hashmaps. For one, we can have a strategy for both HashMaps and trees where we allow an arbitrary number of concurrent reads, and only one concurrent write (that blocks reads and is blocked by reads).
Hash map synchronization is complicated too. A hash map probably has collision resolution in the form of linked-list chaining, and the linked lists can be in the process of being manipulated when you're doing a search. You might have a lock for each bucket to solve this problem, but then you must still come up with a strategy for how to handle the resizing of the hash map in a thread-safe way.
You're right that supporting concurrent writes for trees is a lot harder, though.
I would say Hash table is not suitable if there are too many insertions or deletions happening. The question doesn't only to read data.
what about serialization?
- DashDash February 16, 2013