CareerCup

Amazon Interview Question for SDE1s

3

of 3 votes

30
Answers
How can you design a data structure that can do the following operations in O(1) time:
Insert, Delete, Search, Max which returns the maximum number

I know delete, search and insert can be done O(1) time in a hashmap with a proper hash function. But not sure Max is even possible in O(1) with the presence of delete operation?
- Masumbuet December 29, 2014 in United States for International expansion | Report Duplicate | Flag | PURGE
Amazon SDE1 Data Structures

Email me when people comment.

Email me when people comment.

Loading...

An error occurred in subscribing you.

Team: International expansion
Country: United States
Interview Type: In-Person

Email me when people comment.

Email me when people comment.

Loading...

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

5

of 5 vote

I like this question.
I assume here that data structure can hold arbitrary numbers (data), otherwise the problem isn't interesting.

First of all, your intuition is right:
> But not sure Max is even possible in O(1) with the presence of delete operation?
Such datastructure/algorithm is impossible on a turing machine.
Proof by contradiction:
suppose this data structure exists. If this is the case then we can build general purpose comparison sorting algorithm by pushing all data to this structure and then pulling one-by-one using max and delete operations with O(N) complexity. However we know, that no comparison sorting algorithm on a turing machine can give us better than O(NlogN). Q.E.A./Q.E.D.

Now, we're engineers let's solve the problem :)
Given the proof above, the only possible loophole here - is to avoid using turing machine model. Since question is asked at Amazon where interviewers are passionate about scale I'd propose to parallelize the computation and leverage O(N) processors/computers. In this model all requested operations (including search) can be implemented with time complexity O(1)

- 0xF4 December 30, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 votes

I like the thought in solution proposed. But scaling that will by a stupid attempt as with each addition, we are possibly creating another thread/processor performing possible reorganization in parallel.

- Victor December 30, 2014 | Flag

Comment hidden because of low score. Click to expand.

1

of 1 vote

Interviewer will be very happy with this response, especially if you clarify assumptions, show why it is impossible to do it in "standard" way and articulate when exactly parallelized solution will be effective Namely, when comparison operation is more expensive than processor allocation and inter-process communication. Obviously, the proposed communication scheme must be rational (e.g. there is no need for re-organization but it is enough to send each request to each node every time)

When asking questions like this interviewer clearly want to see whether candidate is able to think out of the box. You can argue whether this question is good or not. Usually interviewer provide enough hints. It is open-ended question, the discussion can even end up with designing your own hardware architecture for this problem.

- 0xF4 December 30, 2014 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 votes

Nice proof, it's all there.

For the parallel solution, 3-color problem would not be "hard" if we had 3^n boxes on demand ;)

- mdu December 30, 2014 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 votes

We can just have a background thread that searches for new max after deleting max (deleting non-max does not change it, inserting greater than max is trivial update)
If you are getting max and this thread is still running, let it finish. Also, if you are doing insert or delete operation and this thread is still running, let it finish so you can be certain whether you are deleting non-max or inserting greater than max.

It is obvious that worst case scenario is n when you do operation that has to wait for background thread while it is running. However, given this thread starts up only on deleting max we can be certain in average case of O(1)

P.S. When inserting, you don't have to wait for thread to finish as soon as it finds greater element than is being inserted. Also, when deleting, as soon as it finds greater element than is being deleted.

VERSION2: Stop (or pause) background thread on every insert and delete and then restart it. When the only one operation that may wait for it is get max. If insert/delete call frequency never let to complete it, it will eventually complete on next getting max and will not start until next delete max.

- CT December 31, 2014 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 votes

The proof doesn't cover non-comparison based models. Simply quoting the NlogN lower bound doesn't say much because we were using a hash table (a data structure that's not comparison-based) from the beginning.

Still, I like this answer and it's probably what an interviewer would look for.

- Anonymous January 05, 2015 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 votes

how can N processors find out max element in O(1)?

- alpha January 05, 2015 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 votes

>The proof doesn't cover non-comparison based models
it doesn't need to. We're within comparison model, since data structure implements MAX() operation. Finding maximum of two (or more) numbers is, by definition, a comparison operation, so building sorting algorithm with the help of series of MAX() yields a comparison sorting algorithm.
In fact I don't even need to prove that it must be a comparison sorting.

- 0xF4 January 05, 2015 | Flag

Comment hidden because of low score. Click to expand.

1

of 1 vote

> how can N processors find out max element in O(1)?
Easy. For instance, we create a distributed "linked list" using N processors, where each processor stores single number and we keep all numbers/processors sorted. In order to organize a linked list, each processor might keep an index of the peer. Finding MAX() in this case is single message round-trip with the last processor, since it holds the maximum.
insert, delete and search operations are implemented by broadcasting to all processors, and in worst case the processor might forward the data the peer at most once.
Insert is the most complex operation here. It can be implemented in O(1) time complexity in following way. Let's say processor receives a broadcasted message with instruction to insert number X. If it is the last processor then new processor is allocated and becomes the last one. If it is not he last processor, but X greater than the number which is stored locally, it asks the processor on the right whether its number is equal or less than X - if reply is positive then new processor is allocated and "inserted" in between of those two, otherwise nothing happens.

- 0xF4 January 05, 2015 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 votes

Why can't we build a Max-Stack?
stackoverflow.com/questions/7134129/stack-with-find-min-find-max-more-efficient-than-on

- codejakk07 February 01, 2015 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 votes

> Why can't we build a Max-Stack?
Because of the presence of delete operation

- 0xF4 February 01, 2015 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 vote

Yeah, the presence of the delete operation wouldn't allow things to be as simple as saving the max value in an extra variable.
Appears that the interviewer may have been looking for the same argument.

- teli.vaibhav December 30, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 votes

For the operations other than max, a hashtable seems like the best fit like you assessed.

- teli.vaibhav December 30, 2014 | Flag

Comment hidden because of low score. Click to expand.

0

of 4 vote

You should use a hashmap to save all your numbers and a stack to track the current maximum value; Whenever the number being inserted to greater than the top of the stack, you should push the number to the stack. Whenever the number being deleted is greater than the top of the stack, you pop the stack. The top of the stack will always be the maximum value.

- Anonymous December 30, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

3

of 3 votes

I don't think that it is possible that way. For example, the top of the stack contains 200 and you insert 100, then 150. just 200 will be in the stack. If I delete 200, how are you going to find out the current max after deletion (which is 150 and not in the stack)?

- Masumbuet December 30, 2014 | Flag

Comment hidden because of low score. Click to expand.

1

of 1 vote

Your hashmap can contain a pointer back into stack and remove elements in the middle.

- Anonymous January 07, 2015 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 vote

What if I develop my own hashCode() which is unique and represent the order of value of my bucket , now If I need fetch max, i just need to fetch element at bucket size -1.
I understand hard to develop but not impossible

- Vi December 30, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 vote

Can't we use a max-heap? In the case of the previous poster's scenario (add 100 -> add 200 -> add 150), after heapifying the data-structure we would end up with max 200, then when we execute the final condition (remove 200), the node with 150 will swim up and become max. The sink and swim operations will be O(logn) but extracting max will be O(1). We can check during delete operation if we are deleting max of heap, then heapify the remaining nodes accordingly.

- Anonymous December 30, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 votes

Yeah, max heap is a solution but as mentioned in the question, all operations should be O(1).
I think the interviewer wanted a discussion to see if was possible.

- Victor December 30, 2014 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 vote

dffd

- Anonymous December 31, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 vote

If such a thing was possible, a Binary Max Heap would never have all the craze it has.

- Urik Lagnes January 01, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 vote

a data structure with a hash map and max heap.
insert, delete and search is O(1) with hash map.
Insert and delete also trigger a background thread to update the heap. This thread is running in background, not affecting the time for insert and delete.
Max return the top of heap, which is the max to our knowledge so far.

- ABCD January 25, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 vote

what about adding hashmap for the insert,delete,search operation and in the data field of value part stored last max element. Also stored max element value in this data structure for getting max in O(1).
For eg. at first max=0, when new element is added max will be set that value.After this say new element is added having value > max, then add new value<value of second element,max> in hashmap and update max.

- Fx January 26, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 votes

This won't work. It'll be interesting exercise for you to find a counter-example for your own proposal.

- 0xF4 January 26, 2015 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 vote

Can we not try TreeMap?

- sg1234 March 01, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

Well this type of data structure is really interesting
What I can think of is:
For Insert/Delete(a particular element)/Search: O(1)
Hashmap is the best solution
If want to have max operation as well
We can have Hashmap having hashing algorithm implemented in such a way that takes comparison into account (means greater element will map to greater hash key). (Though empty keys could be a problem but that can be solved using parallelism or other useful data structure for keys management)

Hope it will help!

- Bharat Kumar Arya January 09, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

Why can't we build a Max-Stack?
"stackoverflow.com/questions/7134129/stack-with-find-min-find-max-more-efficient-than-on"

- codejakk07 February 01, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

You should use a hashmap to save all your numbers and a stack to track the current maximum value; Whenever the number being inserted to greater than the top of the stack, you should push the number to the stack. Whenever the number being deleted is equal to the top of the stack, you pop the stack. The top of the stack will always be the maximum value.

- Anonymous February 07, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

-2

of 2 vote

You should use a hashmap to save all your numbers and a stack to track the current maximum value; Whenever the number being inserted to greater than the top of the stack, you should push the number to the stack. Whenever the number being deleted is greater than the top of the stack, you pop the stack. The top of the stack will always be the maximum value.

- Lu December 30, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

1

of 1 vote

Consider the following sequence:
Add 100 --> add 200 --> Add 150 --> remove 200
At the end, the stack will only have 100, but the max is 150. Next max call will return 100 not 150. Hence keeping a stack won't help.

- Victor December 30, 2014 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 votes

Ok, but you will be missing lots from current maximum to previous top.

- CT January 01, 2015 | Flag

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.