## Amazon Interview Question: Software Engineer / Developer


Design a data structure with three methods, insert, delete, and getRandom, in a highly optimized way. The interviewer asked me to think of a combination of data structures to design a new one. Insert can be designed in any way, but for getRandom and delete I need to get the position of a specific element. He gave me a hint to think about the data structure which takes minimum time for sorting.

Country: India
Interview Type: Phone Interview

4 votes

Hashtable + vector.

On insert, append the element to the end of the vector, and add a hash entry with the element as the key and its index in the vector as the value.

On delete, follow the hash to get the index, swap the element with the last element of the vector, and delete the last element. Don't forget to update the hash entry of the swapped element.

On getRandom, pick a random element from the vector (generate a random index).

Can replace the hash with any other lookup structure.
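A minimal Python sketch of this scheme, using a dict as the hashtable and a list as the vector (class and method names are mine, not from the thread; this assumes distinct elements):

```python
import random

class RandomizedSet:
    """Hashtable + vector sketch: insert, delete, and getRandom
    all run in expected O(1) time, assuming distinct elements."""

    def __init__(self):
        self.index_of = {}   # element -> its position in self.items
        self.items = []      # the "vector"

    def insert(self, x):
        if x in self.index_of:
            return False
        self.index_of[x] = len(self.items)
        self.items.append(x)
        return True

    def delete(self, x):
        if x not in self.index_of:
            return False
        i = self.index_of.pop(x)
        last = self.items.pop()
        if i < len(self.items):      # x was not the last element:
            self.items[i] = last     # move the last element into the hole
            self.index_of[last] = i  # and update its recorded index
        return True

    def get_random(self):
        return random.choice(self.items)
```

The swap-with-last trick is what keeps delete O(1): the vector never develops holes, so a random index is always valid.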


There's just a little snag here. This works great for unique elements. How will you deal with duplicates?


@eugene: You can maintain a count along with the index in the vector. On delete you decrement the count and if it goes to zero, perform the swap.

Similar to reference counting...


A count is insufficient. You need information on where all the elements having a particular value are.


This was assuming getRandom returns each distinct element with equal probability, so allot only one entry in the vector for each distinct element.

You raise an interesting point though (albeit indirectly I suppose): what if we need the probability to be proportional to the count of the element?

I suppose you are alluding to that case?

A balanced binary tree with count information will work in that case (although run time becomes O(log n)).


@eugene: Added answer. Not too detailed, but I suppose too detailed for you :-)


My interpretation of the question was that the probability of each entry must be the same, not each distinct entry. In the distinct case it really is as simple as having a count. Having some sort of balanced order statistic tree seemed like the obvious solution, but the real question is whether we can implement it while keeping all operations in expected O(1) time.

It seems that we can. Instead of your HashMap mapping values to a single index, it could map values to a HashSet of indices. Then we need a getAny() operation on this HashSet, that returns and deletes any one value in the set (with no requirement on which value it should be). We need this operation to run in expected O(1) time. Though it's tricky to prove, I do think that picking a hash bucket within the set at random until we get one that has at least one index in it, and then just taking the first index we see, accomplishes this goal, provided that we rehash periodically to keep the number of buckets within a certain ratio of the number of elements in the set.

If we have this method for the HashSet, given a delete, we get our HashSet of indices for the value to be deleted, and we pick and remove one such index, call it w, in O(1). Now we go to the end of the array, delete the value there, call it x, and put it at index w. We search the Hash*Map* for x and get a set of values where x occurs. We delete array.size()-1 from that set, and we add w to the set.

I think this covers it.
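A Python sketch of this duplicate-handling variant, with a dict of index sets standing in for the HashMap of HashSets (Python's `set.pop()` removes an arbitrary element, which serves as the getAny() operation described above; names are mine):

```python
import random

class RandomizedCollection:
    """Duplicates allowed: the map stores a set of array positions per
    value, so getRandom is proportional to each value's multiplicity."""

    def __init__(self):
        self.indices = {}    # value -> set of positions in self.items
        self.items = []

    def insert(self, x):
        self.indices.setdefault(x, set()).add(len(self.items))
        self.items.append(x)

    def delete(self, x):
        if x not in self.indices:
            return False
        w = self.indices[x].pop()        # getAny(): remove some index of x
        if not self.indices[x]:
            del self.indices[x]
        last = self.items.pop()
        if w < len(self.items):          # x was not at the end:
            self.items[w] = last         # move the last value into slot w
            s = self.indices[last]
            s.discard(len(self.items))   # it no longer lives at the end
            s.add(w)
        return True

    def get_random(self):
        return random.choice(self.items)
```

The bookkeeping on delete mirrors the description above: remove one occurrence index w, move the array's last value into slot w, and fix up that value's index set.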


This seems the same as a simple hashtable(k, v) where k is the VALUE
and v is the count of the VALUE.
What's the difference between this and that?
P.S. I'm a novice, please explain, and sorry if I'm being silly.


I'm actually storing a set describing where the values occur. If you only store a count, you won't be able to delete the values efficiently.


I have a doubt about how you guys are implementing the hash table. What is your hash function? How are you mapping distinct elements to their indices in the vector in O(1) time?


The exact same way a hash table does anything in O(1) time. Our hash function? Whatever hash function you normally use for integers. For example, pick a random integer a and a large prime p, and compute ((a * element) % p) % numBuckets.
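A tiny sketch of that formula in Python (the constants and the function name are illustrative; any large prime bigger than the keys works for p):

```python
import random

# Toy multiply-mod hash: h(x) = ((a * x) % p) % num_buckets,
# with a random multiplier a fixed once per table.
p = 2_147_483_647            # a large prime (2^31 - 1)
a = random.randrange(1, p)   # random multiplier, chosen at table creation

def bucket(element, num_buckets):
    """Map an integer element to a bucket index in [0, num_buckets)."""
    return ((a * element) % p) % num_buckets
```

Because a is fixed when the table is built, the same element always lands in the same bucket, which is all the vector-index lookup needs.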

2 votes

@Eugene raised an interesting point about duplicate inserts, and about requiring the probability to be proportional to the count.

The earlier hashtable + vector can now be replaced with a

hashtable + binary search tree.

Instead of a vector we maintain a balanced binary tree, with the insertion order as the key, effectively simulating an array. We maintain a count of how many times a given value has been inserted.

We also maintain a cumulative count, where each node stores the sum of the counts of its descendants (including itself).

The root then holds the total number of elements present, say C.

To get a random element, generate a random number r in {1, 2, ..., C}.

Now search the tree using the cumulative counts as the key, with the additional step of decrementing r by a node's own count plus its left subtree's cumulative count whenever you descend into that node's right subtree.
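The search step can be sketched as follows. For brevity this toy keys the tree by value rather than insertion order and does no rebalancing, but the count-guided descent is the same idea (all names are mine):

```python
import random

class Node:
    def __init__(self, value):
        self.value, self.count = value, 1
        self.left = self.right = None
        self.total = 1               # cumulative count of this subtree

def total(node):
    return node.total if node else 0

def insert(node, value):
    """Plain (unbalanced) BST insert; a real solution would rebalance."""
    if node is None:
        return Node(value)
    if value == node.value:
        node.count += 1
    elif value < node.value:
        node.left = insert(node.left, value)
    else:
        node.right = insert(node.right, value)
    node.total = node.count + total(node.left) + total(node.right)
    return node

def select(node, r):
    """Return the r-th element (1-based), counting multiplicities."""
    if r <= total(node.left):
        return select(node.left, r)
    r -= total(node.left)            # skip past the left subtree...
    if r <= node.count:
        return node.value            # r falls inside this node's copies
    return select(node.right, r - node.count)   # ...and this node itself

def get_random(root):
    return select(root, random.randint(1, total(root)))
```

Each copy of a value is one "slot" in the cumulative counts, so a value inserted twice is returned twice as often, at O(log n) per call on a balanced tree.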


"Now do a search through the tree, using the cumulative counts as the key, with the additional step of decrement the count of a given node, when you go to its right subtree."

Hey will u please elaborate this part..I am not getting this.

0 votes

How about using a HashTable and an ArrayList?

Insertion is just adding the entry to the hashmap and appending the element to the ArrayList.
Deletion means removing the entry from the hashmap and the element from the ArrayList.
Searching means you look the value up in the hashtable and read the index stored in the entry.
Random search generates a random number less than the length of the ArrayList and returns the element at that index.


That's generally the correct approach, but how will you remove from the ArrayList quickly? See the top-voted solution.

0 votes

Binary Search Tree:

The complexity of inserting, deleting, and sorting is better than for an array or a linked list.

0 votes

Instead of the ArrayList and hashmap combo, a linked list and hash map is the best possible way, since a linked list takes O(1) for deletion but an ArrayList does not.

0 votes

HashTable with a BST (or a red-black tree for worst-case performance on the inserted values) to handle duplicates.

0 votes

Hash table + array.

Insertion: insert the element at the end of the array, hash the element, and store the element's array index in the hash table.
Deletion: get the index from the hash table, swap the element with the last element of the array, and update the hash table entry of the swapped element.

Random: pick a random index and return the element at that position in the array.


-1 vote

How about a combination of a doubly linked list and a hash map?

While inserting any element, insert it into the hash with the element as the key and its address in the doubly linked list as the value. With this, insertion/deletion/searching/randomRetrieval are all possible.

Inserting is just adding the entry to the hashmap and creating a node in the doubly linked list.

Deletion means removing the entry from the hashmap and the node from the doubly linked list.

Searching means you look the value up in the hashtable and read the node address stored in the entry.

RandomSearch: get a random number, go to the start of the doubly linked list, and traverse that many nodes, thus giving a random value each time.

Not sure if this is overkill, but if anyone can point out any inconsistencies or comment on this one, that would be great :).


The fundamental flaw of this approach is this:

"RandomSearch, get a random number and go to the start of the double linked list and traverse those many iterations, thus, giving a random value each time."

That's a linear time algorithm.


Thanks Eugene. So, if we modify the doubly linked list approach with, say, an ArrayList as suggested by Varun, we should be in good shape.


Varun's algorithm has a different flaw: the remove operation takes linear time. See the top-voted solution, where everything is O(1).

-2 votes

Hash table method is the appropriate one.


A hashtable may not be a good choice for sorting, right?


How will you provide getRandom() with just a hashtable?


what about sort in this approach?


What about it? Yes, let's just mention random algorithms without explanations of how you would apply them :)
