Google Interview Question
InternsCountry: India
Interview Type: Phone Interview
What if an item is deleted from the arraylist items? Do you need to restore all the indexes after the deleted item?
Instead of deleting from list rather mark that position null, then index restore will not require. This approach will leave many unoccupied space, we need rebuild to clean up such space. Similar technique is followed in database.
define class DoubleLinkedList<Item>
1. using DoubleLinkedList instead of ArrayList, which guarantee O(1) delete and Insertion.
2. TreeMap<KeyValue, List<DoubleLinkedListNode<Item>> for each key;
3. HashMap<KeyType, TreeMap<...> mapOfIndexes;
...
Insert / delete / search: O(log(N))
Iterate: O(N)
This could be further optimized by making few changes.
- Instead of maintaining the items in a separate arraylist and have the positions as the value of the TreeMap, we could actually have the LinkedList/ArrayList of items as the value the TreeMap, so that way, you don't need to readjust the positions every time something changes within the ArrayList, especially delete.
So it will be something like,
TreeMap<Key, ArrayList<items>>
HashMap<KeyType, TreeMap<Key, ArrayList<items>>
This problem is more like having an index on the keys - This is how it is done is databases.
Why do we need a seperate TreepMap (data structure #2)? Can't we represent the required data structure with just a HaspMap (data structure #3)? For example, a combination resulting from the answer and the suggestions from the comments by PUdayakumar and lekuocom:
HashMap<KeyType, TreeMap<Key, DoubleLinkedList<items>>>
you can have
Map<HashSet<keys> , itemUniqId> and
Map<HashSet<keys>, Item>
To insert, check if any of item got this key, get uniqId and retrieve the item, if nothing found, insert new
To delete, similarly, get uniqid, delete the item and entry from uniqId
To iterate, just iterate over 2nd map
Your solution has a very large complexity, since for deletion you will need to iterate over all the items to check if key exist or not.
Instead, one can have a separate map for each key as this will be more efficient.
However, I still need to think, how sorting on these keys will be efficient.
A better way would be to use HashMap with Balanced Binary Tree to improve overall efficiency.
Considering Java Data structure
HashMap<key, item> should solve the problem. [k1-Item1, K2-item1, k3-item1, k4-item2 ...]
LinkedHashMap should be good for insertion and deletion.
For sorting, we have to maintain a key list implement comparator. sorting performance will be nlog(n) with collecttion.sort(list)
Store the keys in a hash map,with a key made of hashing then together. For ex uid1:(k1,k2...). Use treemap/binary trees to store each key:value. If there are three keys, then three treemaps. Value in these treemaps will be the hash of the keys.
Insert:
First calculate hash of keys together and store it. Using individual keys store in the respective treemaps with the hash value.
Delete:
Use the key provided, go to the corresponding tree map, get the hash. Using the hash get the associated keys. Now delete this entries and other entries from the treemaps.
Sorted access:
Use the corresponding treemaps.
I didn't fully get the solution but i have a query. Isn't treemaps a concept of data visualization rather than storage.
We can create k + 1 hash maps - one for each key and the last one for the actual item.
To Insert an item we
1. create a unique id (hashing all keys, incremental index)
2. foreach k in k1,...kn in the item we map k to the unique id in the respective hash map
3. We map the unique id to the item in the last hash map
To search by any given key, we use one of the hash maps to get the unique id and then use the last hash map to retrieve the item
For deletion, we first retrieve the item (as described in the previous step), then delete the mapping in each hash map using the key in the retrieved item
For sorting according to any key, we can sort the keys in the relevant hashmap.
This approach enable the operations to be performed in parallel. A downside is the space required to store all the hashmaps.
First of all, we should take k1, k2, k3 as unrelated keys, because that's the usual case for database.
Second, any data structure with hashtable woudn't be efficient because it's super slow for iteration, eg mysql doesn't use hashtable for index.
Here is my take.
1. Since this is related to database, we should use b+ tree for k1 with leaves pointing to pages (disk page size) where array of k1-sorted items are stored.
// Make use of locality of reference with O(log n) -> super fast for all operation even iteration.
2. For k2 and k3, we should use b+ tree as well, but the leaves are pointing at pages where array of k2/k3 sorted references are stored. These references points to the page id and array index of the item from (1).
// Slower than k1 for iteration, but still make use of locality of reference on all nodes except leaves. The references are still on pages, which is fast, but access item data will be random access. Searching is same as k1.
If speed is the main concern for k2 and k3 iteration, then you can store k2 and k3 data as k1, but 3 times the data.
I suggest to keet three maps, all operations are 3*(operation for map) I think it would be not much, here is implementation.
#include "stdafx.h"
#include<iostream>
#include<string>
#include<map>
#include<vector>
#include<list>
using namespace std;
class MultipleKey
{
public:
void insert(vector<int> aKeyList,int aVal)
{
m_MK_1.insert(pair<int,int>(aKeyList[0],aVal));
m_Ptr ptr = m_MK_1.find(aKeyList[0]);
m_MK_2.insert(pair<int,m_Ptr>(aKeyList[1],ptr));
m_MK_2.insert(pair<int,m_Ptr>(aKeyList[2],ptr));
}
private:
typedef map<int,int>::iterator m_Ptr;
map<int,int> m_MK_1;
map<int,m_Ptr> m_MK_2;
map<int,m_Ptr> m_MK_3;
}
;/**/
int _tmain(int argc, _TCHAR* argv[])
{
MultipleKey obj;
vector<int> vec(3);
vec.push_back(1);
vec.push_back(2);
vec.push_back(3);
obj.insert(vec,999);
return 0;
}
The required data structure will compose of following standard data structures:
- kainm July 09, 2014{
1. ArrayList<ItemObject> list;
//ArrayList to store all items.
2. TreeMap<KeyValue, List<Position>> index;
//TreeMap for Key to Item Indexing and Sorted order iteration.
//KeyValue is value of any key in Item and Position is index number of Item in ArrayList. List Of Position is taken because multiple Items may have same key.
3. HashMap<KeyType, TreeMap<KeyValue, List<Position>>> mapOfIndexes;
//HashMap to manage indexes for each key type.
//KeyType is type of key and TreeMap is described above as index.
The required operations are as follow:
method add(item): void
1. add item in ArrayList and save its position.
2. For each key in itemObject, get TreeMap from HashMap and add itemPosition in TreeMap.
method search(key): List<Item>
1. Get TreeMap from HashMap for provided key
2. Get List of position from TreeMap for the key value
3. Prepare List Of item from List of position
4. return List Of Items
method delete(key): List<Item>
1. Using search method get List Of item
2. Delete items from ArrayList and also delete its mapping from all TreeMap
method sortedIterator(keytype): Iterator
1. Get TreeMap from HashMap for provided key
2. Get EntrySet from TreeMap
3. return EntrySet.iterator();
}
All operation are almost constant time.