Google Interview Question
Software Engineer / Developer
Country: United States
Interview Type: Phone Interview
Is linked list + hash table more flexible? An array's size is fixed, and you can't free the memory of a deleted number.
If you do linked list + hash table, get_random() will take O(size) time instead of O(1) time, because you need to traverse the first k list elements to find the kth.
The array size being fixed is not a big issue - if you run out of space, you can move the array somewhere else where you have twice as much space. That means you get n O(1) time insertions before a single O(n) time insertion (because of an array move), which means that insertion/deletion is still O(1) time on average.
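The doubling scheme described above can be sketched directly; the class and method names below are illustrative, not from the thread:

```java
// Minimal sketch of the doubling trick: a full copy costs O(n), but it
// happens only after n cheap appends, so the amortized cost per append
// is O(1).
class GrowableArray {
    private int[] data = new int[1];
    private int size = 0;

    void append(int x) {
        if (size == data.length) {
            // "Move the array somewhere else with twice as much space."
            int[] bigger = new int[data.length * 2];
            System.arraycopy(data, 0, bigger, 0, size);
            data = bigger;
        }
        data[size++] = x;
    }

    int get(int i) { return data[i]; }
    int size() { return size; }
}
```

This is essentially what java.util.ArrayList does internally, though it grows by roughly 1.5x rather than 2x; either growth factor gives amortized O(1) appends.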
Instead of managing the array's size and allocation yourself, simply use an ArrayList; it handles all of the background grunt work for you and offers amortized constant-time insert and delete.
@skeptical.scientist why is get O(size)? A hashtable has O(1) search time. You can simply get that node from the hashtable instead of traversing.
zylo, get *which* node? You need to write a function which returns each element with equal probability. I'm not saying you can't do this in O(1) time with just a hash table, but I don't see how you would do it.
Got it. I was referring to the delete function. Yes, you are right: the random function will need O(n).
You could easily do this in O(1) time: get the size of the HashMap using its built-in size() method in Java, then copy all of the map's values into an array and choose a random index. That should be constant time.
@anonymous: if you are turning the hashmap into an array every time get_random() is called, that will take O(n) time. Unless you have the array stored all the time, and just update it when you add/remove an element from the set. That is exactly what my solution does.
Okay, two people have now said there's a problem with my delete function, which probably means there is... but what? I'm not seeing it.
In the delete function, instead of leaving a hole in the array, you can replace the element at p's index with the last element of the array (array[size-1]), update that element's index in the hash table, shrink the array by one, and then delete key p. This way, your array will not become sparse.
@skeptical.scientist: Consider the following test case:
add A, add B, add C, delete B, add D, delete C.
@letrysout: You have no clue how hashtables work, do you?
Ah, thanks anonymous. I was forgetting to update the hash table with the new location of the value formerly at the end of the array when I moved the value.
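Put together, the corrected swap-with-last delete looks roughly like this. This is a sketch; the class and field names are made up for illustration, and it includes the hash-table update for the moved element that was originally missing:

```java
import java.util.ArrayList;
import java.util.HashMap;

// Sketch of the swap-with-last delete discussed above.
class SwapDeleteSet {
    final ArrayList<Integer> arr = new ArrayList<>();
    final HashMap<Integer, Integer> index = new HashMap<>(); // value -> array slot

    void add(int x) {
        if (index.containsKey(x)) return;      // keep elements unique
        index.put(x, arr.size());
        arr.add(x);
    }

    boolean delete(int p) {
        Integer i = index.remove(p);
        if (i == null) return false;           // p was not in the set
        int last = arr.remove(arr.size() - 1); // pop the tail element: O(1)
        if (i < arr.size()) {                  // p itself was not the tail
            arr.set(i, last);                  // move the tail into p's old slot...
            index.put(last, i);                // ...and record its new index (the fix)
        }
        return true;
    }
}
```

Running the thread's test case (with 1, 2, 3, 4 standing in for A, B, C, D) leaves the array dense and the index map consistent after both deletes.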
Why not use a hashtable by itself? Just add the unique-element constraint while adding, and that should do. Why have an array + hash table?
Why can't we use just a simple linked list, or an array, or whatever data structure where we can add and delete elements, and keep track of its size for the random function?
If you have a linked list or array, you can't quickly insert/delete, because you can't quickly check whether the element is present and where in the list/array it is. You can do it, but it will be slower.
These operations will take O(n) time once your set grows to size n, whereas in the array+hashtable implementation both operations are O(1).
How about a balanced tree with the following node structure:

struct treeNode {
    treeNode* leftTree;
    treeNode* rightTree;
    int data;
    int nosNodesLT; // number of nodes in the left subtree
    int nosNodesRT; // number of nodes in the right subtree
};
Then obviously
Time Complexity of add = O(log n)
Time Complexity of delete = O(log n)
For the random return, we will use the fields nosNodesLT and nosNodesRT as follows:
1. Choose k to be a random integer between 0 and n−1 inclusive. Let A initially be the root node of the tree.
2. Let m be the number of nodes in the left subtree of A. (If A is a leaf or has only right children, let m = 0.)
3. If k = m, choose A as the node we want and stop.
4. Otherwise, if k < m, replace A with its left child node and repeat from step 2.
5. Otherwise (i.e. if k > m), subtract m+1 from k, replace A with its right child node and repeat from step 2.
Again, the time complexity is O(log n).
Space complexity: O(n)
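The rank-selection walk in steps 1–5 can be sketched as follows. This is a hedged sketch: balancing is omitted (so the worst-case height is O(n) rather than the O(log n) stated above, which a self-balancing variant would restore), elements are assumed distinct, and the class and method names are illustrative:

```java
import java.util.Random;

// BST whose nodes cache subtree sizes, so the k-th smallest element
// can be found by walking the counts (steps 2-5 above).
class CountedBst {
    static class Node {
        int data, leftCount, rightCount;
        Node left, right;
        Node(int d) { data = d; }
    }

    Node root;
    private final Random rng = new Random();

    int size(Node n) { return n == null ? 0 : n.leftCount + n.rightCount + 1; }

    void add(int x) { root = add(root, x); }

    private Node add(Node n, int x) {
        if (n == null) return new Node(x);
        // Assumes x is not already present deeper in the tree;
        // a duplicate insert would skew the cached counts.
        if (x < n.data)      { n.left = add(n.left, x);   n.leftCount++; }
        else if (x > n.data) { n.right = add(n.right, x); n.rightCount++; }
        return n;
    }

    // Return the k-th smallest element (0-based) by walking the counts.
    int kth(int k) {
        Node a = root;
        while (true) {
            int m = a.leftCount;                  // step 2: nodes left of A
            if (k == m) return a.data;            // step 3: A is the k-th
            if (k < m) a = a.left;                // step 4: descend left
            else { k -= m + 1; a = a.right; }     // step 5: skip A and its left subtree
        }
    }

    int getRandom() { return kth(rng.nextInt(size(root))); }
}
```

getRandom() picks a uniform rank and then selects that rank, so every element is returned with equal probability.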
The only problem seems to be with the random() function; otherwise an array and a hashtable would have sufficed.
Let's say we delete a number from the array and its corresponding entry from the hash table. To implement random(), we would then need to know exactly which positions in the array are vacant. This is required because if random() returns an index between 'a' and 'b' that happens to be a vacant position, we have to run random() again.
Use an array to store the elements of the set, and a hash table to store a list of (key,value) pairs, where the keys are elements of the set, and the values are the array indices for the keys. For convenience, we also store the size of the set.
The function randrange(x,y) is assumed to return a random integer in the interval [x,y).
This has amortized O(1) time for all three functions.
- skeptical.scientist, January 05, 2013
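A minimal Java sketch of the array + hash-table design described above. The class and method names are illustrative, randrange(0, size) becomes Random.nextInt(size), and the array-doubling from earlier in the thread handles growth:

```java
import java.util.HashMap;
import java.util.Random;

// Array holds the elements; the hash map maps each element to its
// array index; `size` tracks how many slots are in use.
class ConstantTimeSet {
    private int[] arr = new int[1];
    private int size = 0;
    private final HashMap<Integer, Integer> index = new HashMap<>();
    private final Random rng = new Random();

    boolean add(int x) {
        if (index.containsKey(x)) return false;      // uniqueness constraint
        if (size == arr.length) {                    // double when full: amortized O(1)
            int[] bigger = new int[arr.length * 2];
            System.arraycopy(arr, 0, bigger, 0, size);
            arr = bigger;
        }
        arr[size] = x;
        index.put(x, size++);
        return true;
    }

    boolean delete(int x) {
        Integer i = index.remove(x);
        if (i == null) return false;
        int last = arr[--size];
        arr[i] = last;                   // swap-with-last keeps the array dense
        if (last != x) index.put(last, i); // update the moved element's slot
        return true;
    }

    int getRandom() {
        // Uniform over the occupied prefix arr[0..size-1]: no vacant slots,
        // so no retries are ever needed.
        return arr[rng.nextInt(size)];
    }
}
```

Because delete moves the last element into the freed slot, the array never develops holes, which is what makes a single uniform index draw sufficient for getRandom().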