Microsoft Interview Question
Software Engineer / DevelopersTeam: Cloud
Country: United States
Interview Type: In-Person
concurrent hash map using multiple locks. I'm using global array for simplicity and omitted delete operation. A real hash map should not be like this.
#include <iostream>
#include <vector>
#include <mutex>
#include <string>
#include <algorithm>
using namespace std;
struct node {
int key;
string val;
node* next;
};
// Let' assume this hash function is a really good one :)
size_t my_hash(int val) {
return (unsigned)val % 1001;
}
mutex m[4];
node* my_map[1001];
// Let's assume that this is an internal API
static node* hash_search_impl(int key, unsigned h) {
auto c = my_map[h];
while (c) {
if (c->key == key) {
return c;
}
c = c->next;
}
return nullptr;
}
string hash_get(int key) {
auto h = my_hash(key);
int mi = min(h / 250, (unsigned)3);
{
unique_lock<mutex> lck(m[mi]);
auto n = hash_search_impl(key, h);
if (n) {
return n->val;
}
}
return string();
}
bool hash_put(int key, const string& val) {
auto h = my_hash(key);
int mi = min(h / 250, (unsigned)3);
{
unique_lock<mutex> lck(m[mi]);
auto n = hash_search_impl(key, h);
if (n) {
return false;
}
n = new node{ key, val, my_map[h] };
my_map[h] = n;
}
return true;
}
To avoid locks, we need to use atomic compare and swap operation. For example,
#include <iostream>
#include <string>
#include <algorithm>
#include <atomic>
using namespace std;
struct node {
int key;
string val;
node* next;
};
size_t my_hash(int val) {
return (unsigned)val % 1001;
}
atomic<node*> my_map[1001];
node* hash_search(int key) {
auto c = my_map[my_hash(key)].load();
while (c) {
if (c->key == key) {
return c;
}
c = c->next;
}
return nullptr;
}
string hash_get(int key) {
auto n = hash_search(key);
if (n) {
return n->val;
}
return string();
}
bool hash_put(int key, const string& val) {
auto n = new node{ key, val, nullptr };
auto h = my_hash(key);
do {
// Maybe other thread is likely to insert the same key
auto nn = hash_search(key);
if (nn) {
delete n;
return false;
}
n->next = my_map[h].load();
// using atomic compare and swap operation
} while (!my_map[h]->compare_exchange_strong(n->next, n));
return true;
}
This is a naive implementation, but can show a basic idea on how to implement lock-free hash map. If we need to support remove operation, then algorithm will be much more complex.
First answer
Lock the whole take when doing any update/insert/delete operaton, allow reads without any lock, reads will wait if we have any lock on the table, we need to keep a priority queue for the sequence of operations, which will give reads more priority over writes, and within reads or writes, the priority can change based upon requirements and requested time.
Second answer,
We will use row wise lock, instead of whole table lock, we will put lock on only those table which are/will be affected by the current write operations, rest all will be allowed to read.
now whether or not we allow uncommited reads is the question of requirements, we may allow based upon the lock, (we can define the type of lock, in case of delete lock, we can directly return not found kind of response for read query, without making it wait on the lock.
Please note that, each lock will have a priority queue associated with it, when lock is revocked, the queue will not just vanish until all its entries are delete, the read operations in the queue will create multiple thread, and allow them to finish together, until then write awaits, we will finish all the reads first, before we start the write operation, (we can associate a timer as well, which will force the write operation to happen.
Operations on a hash table - Insert,delete,find.
- JD September 05, 2014Insert, delete ->Write operations, Find -> Read Operation.
Now, if we use a single read write lock on the entire hash table it will not scale to high concurrency.
Soln - Striped locks.
Divide hash table into N slots, each slot having its own rw lock.