Amazon Interview Question
Software Engineer / Developer. Country: United States
Interview Type: In-Person
How about a linked list with each node containing a contiguous range of numbers?
initial state [1,INF]
checkout [2,INF]
checkout [3,INF]
checkout [4,INF]
checkout [5,INF]
checkin(1) [1,1]->[5,INF]
checkin(3) [1,1]->[3,3]->[5,INF]
checkin(4) [1,1]->[3,INF](perform merging of nodes)
Checkout returns the first number of the first node, again doing the necessary merging/trimming of nodes if required.
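A minimal Java sketch of this interval list (class and field names are mine; `Long.MAX_VALUE` stands in for INF, and checkin assumes the number being returned is genuinely checked out):

```java
// Singly linked list of available [lo, hi] ranges, kept sorted and disjoint.
class IntervalPool {
    private static class Node {
        long lo, hi;
        Node next;
        Node(long lo, long hi, Node next) { this.lo = lo; this.hi = hi; this.next = next; }
    }
    private Node head = new Node(1, Long.MAX_VALUE, null); // initial state [1, INF]

    // Smallest available number: the first number of the first node.
    long checkout() {
        long n = head.lo++;
        if (head.lo > head.hi) head = head.next; // node exhausted, drop it
        return n;
    }

    // Put n back, merging with neighbouring ranges where possible.
    void checkin(long n) {
        if (head == null || n + 1 < head.lo) { head = new Node(n, n, head); return; }
        if (n + 1 == head.lo) { head.lo = n; return; }  // extend first node downwards
        Node cur = head;
        while (cur.next != null && cur.next.hi < n) cur = cur.next;
        if (n == cur.hi + 1) {                           // extend cur upwards
            cur.hi = n;
            if (cur.next != null && cur.next.lo == n + 1) { // n bridges two nodes: merge
                cur.hi = cur.next.hi;
                cur.next = cur.next.next;
            }
        } else if (cur.next != null && cur.next.lo == n + 1) {
            cur.next.lo = n;                             // extend the next node downwards
        } else {
            cur.next = new Node(n, n, cur.next);         // isolated singleton range
        }
    }
}
```

Running the example from the post reproduces the states shown: after four checkouts and checkin(1), checkin(3), checkin(4), the list is [1,1] -> [3,INF].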
I can consider it an O(1) solution. One way: allocate a 128-element bit array with every bit set to 1 (bit set = number available). On checkout(), binary search over elements 1 to 128 to find the first non-zero element, then within that 32-bit element isolate the first set bit (using n & -n; note that n & (n-1) clears the lowest set bit rather than isolating it). That bit position is the minimum available number; return it, and since we are checking it out, clear the bit to 0. On checkin(n), verify the number is actually checked out; if so, set its bit back to 1.
space complexity O(n)
time complexity O(log 128) = O(7) = O(1)
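A sketch of the word-scan part in Java (names are mine; a plain linear scan over the words stands in for the search step, since the bit-twiddling is the point here):

```java
class BitmapScan {
    // Find the smallest number whose bit is 1 in an array of 32-bit words.
    // Bit i of words[w] represents the number w * 32 + i + 1.
    static int firstAvailable(int[] words) {
        for (int w = 0; w < words.length; w++) {
            if (words[w] != 0) {
                int lowest = words[w] & -words[w];  // isolate the lowest set bit
                return w * 32 + Integer.numberOfTrailingZeros(lowest) + 1;
            }
        }
        return -1; // nothing available
    }
}
```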
Usually with problems like these, you should talk about the expected frequency of checkout and checkin, and the interviewer might give you an assumption which might simplify the problem.
In any case, here is a solution for the case when the pattern is not known.
I believe we can do
checkout: O(log^2 n) time
checkin: O(log n) time
where n is the number of currently checked out numbers.
This is based on the following:
Suppose you had an array A[1...n] which contains a sorted list of numbers from {1, 2 ....}.
Find the least number in {1,2, ...} which is not in A[1....n].
A binary search works for this: compare A[current] with current (A[current] >= current always holds, since the entries are distinct numbers from {1, 2, ...}).
If A[current] == current, the prefix 1..current is intact, so go to the right half; otherwise the gap is at or before current, so go to the left half.
Using a balanced binary tree with order statistics to simulate the array makes retrieving A[current] an O(log n) operation, so the binary search costs O(log^2 n).
Checkin becomes O(log n). (With a plain array, checkin would take O(n) time; hence the tree.)
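The sorted-array binary search itself can be sketched as follows (a plain array here for clarity; the order-statistics tree would replace the direct indexing, and the names are mine):

```java
class LeastMissing {
    // a is sorted, strictly increasing, drawn from {1, 2, ...}.
    // Returns the least positive integer not present in a.
    // Invariant: a[i] >= i + 1 always; equality means 1..i+1 are all present.
    static int leastAbsent(int[] a) {
        int lo = 0, hi = a.length;
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (a[mid] == mid + 1) lo = mid + 1;  // prefix intact: gap is to the right
            else hi = mid;                        // gap is at or before mid
        }
        return lo + 1;
    }
}
```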
We can probably optimize this (perhaps only by a constant factor, not asymptotically) by maintaining intervals rather than individual numbers.
Guys, any algorithm that stores the numbers 1...infinity at initialization was rejected right away... and the frequency of checkout and checkin is not defined...
Two sets and a stack are enough to solve this.
The sets hold removed (checked-out) numbers and reinserted numbers (a checkin is accepted based on a membership test against the former set).
The stack holds the word-sized integer chunks of an effectively unbounded counter that records where we left off on the walk toward infinity, so checkouts can resume once the second set (the reinserted numbers) is depleted.
Time complexity is O(1) while checkouts outnumber checkins (only the counter is incremented); otherwise it is the cost of a basic set insert/search.
Build a binary heap of all the numbers from 1 to n. This makes sure the minimum number always stays at the root.
Checkout and checkin would each be done in O(log n) time.
OK, then here is the walkthrough for my previous explanation, which seems to have been obscure for most:
#1 First of all, think of the case where an infinite number of checkouts is performed without a single checkin. Here we need a stack of MAX_INT-sized chunks, with no regard to space complexity: as soon as our int- or long-typed counter wraps, we push it onto the stack and continue incrementing the checkout counter for subsequent operations. This is where the stack comes into play, and it is better to encapsulate this logic behind it (an infinite-sized integer).
#2 Gather all the checked-out numbers in a set; this is the first set.
#3 Now assume that at your Nth checkout they attempt to check in a number. Use the first set to see whether it is OK to accept it; if it is, remove it from the first set and insert it into the second set.
#4 If the second set is not empty, it is the one that services checkout requests until it is depleted. Each number removed from it takes its slot back in the first set.
#5 If the second set gets depleted, meaning the first set again holds all the previously checked-out numbers, we go back to our stack-based virtual INTEGER counter to serve checkout requests.
As you can see, we don't need any preallocated infinite buffers; instead, memory consumption expands with the difference between the total number of checkouts and checkins, which stays modest when they arrive evenly. Still, the flexible incremental logic can satisfy an insane difference between checkouts and checkins.
I hope that is clear.
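A minimal Java sketch of steps #2 to #5 (names are mine; the stack-backed infinite counter of #1 is simplified to a plain long, and the smallest reinserted number is found by a linear scan, since the poster leaves the sets' ordering unspecified):

```java
import java.util.Collections;
import java.util.HashSet;

class TwoSetPool {
    private final HashSet<Long> out = new HashSet<>(); // #2: first set, checked-out numbers
    private final HashSet<Long> in  = new HashSet<>(); // second set, reinserted numbers
    private long counter = 1;                          // stands in for the stack-backed counter of #1

    long checkout() {
        if (!in.isEmpty()) {               // #4: drain the reinserted set first
            long n = Collections.min(in);  // smallest returned number (linear scan)
            in.remove(n);
            out.add(n);                    // it takes its slot back in the first set
            return n;
        }
        out.add(counter);                  // #5: fall back to the counter
        return counter++;
    }

    void checkin(long n) {
        if (out.remove(n)) in.add(n);      // #3: accept only genuinely checked-out numbers
    }
}
```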
Some comments
1) You don't need two sets; just the checkin set is enough. During checkin, if the number is greater than the counter you ignore it; otherwise you add it to the checkin set. Why waste space?
2) Using a stack for the counter is just too weird! Why not use a vector, or a linked list?
Your memory consumption will always be proportional to the peak number of outstanding checkouts, which is probably not that good...
Nonetheless, it is a reasonable idea.
I was thinking we could choose a "dense" data structure that maps well to sparse spaces and is disk-backed. Disk => approximately infinite storage.
So, let's say we choose a B-tree. If we check out an index that is not yet populated, just insert the number into the B-tree (a new node insert). If you check out an already existing number, use a sentinel to mark it as checked out. If you check it back in, delete the number from the tree.
The lowest is always the leftmost.
But the infinite-checkout case is not addressed, as in Yemrus's solution... still trying to grok it.
Use a min-heap and an int MIN initialized to 1.
checkin(int a) {
    if (a < MIN) {
        heap.insert(a);
    }
}
checkout() {
    if (!heap.empty()) {
        return heap.extract();
    } else {
        MIN += 1;
        return MIN - 1;
    }
}
The idea is to use the heap to store any checked-in elements less than MIN. On checkout, return the min from the heap if the heap is not empty; otherwise return MIN and increment MIN.
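A self-contained, runnable version of this pseudocode, using java.util.PriorityQueue as the min-heap (the class name is mine; it assumes each checked-out number is checked in at most once):

```java
import java.util.PriorityQueue;

class HeapPool {
    // Checked-in numbers that are smaller than min, i.e. available for reuse.
    private final PriorityQueue<Integer> heap = new PriorityQueue<>();
    private int min = 1; // smallest number never handed out yet

    int checkout() {
        if (!heap.isEmpty()) {
            return heap.poll();   // smallest previously returned number
        }
        return min++;             // otherwise issue a fresh number
    }

    void checkin(int a) {
        if (a < min) {
            heap.add(a);          // ignore numbers that were never issued
        }
    }
}
```

Checkout is O(log k) in the number k of reusable numbers, and checkin is O(log k) as well.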
import java.util.HashMap;

class Number_Pool {
    private HashMap<Integer, Boolean> invalidMap; // keys are the checked-out numbers

    public Number_Pool() {
        this.invalidMap = new HashMap<Integer, Boolean>();
    }

    public int checkout() {
        // Linear scan for the smallest number not currently checked out.
        for (int i = 1; i != Integer.MAX_VALUE; i++) {
            if (!this.invalidMap.containsKey(i)) {
                this.invalidMap.put(i, true);
                return i;
            }
        }
        return Integer.MAX_VALUE;
    }

    public void checkin(int newItem) {
        if (this.invalidMap.containsKey(newItem)) {
            this.invalidMap.remove(newItem);
        }
    }
}
Hash table where:
1. Addition is O(1), keeping track of the minimum as you insert
2. Removal is O(1)
I feel it can be done as follows:
1. Create a bitmap where each bit represents an integer.
2. Initialize the bitmap.
3. On checkin, if the bit is not set, set it to true.
4. On checkout, if the bit is set, unset it.
This can be done in constant time. The only concern here is maintaining the bitmap for an infinite stream.
There are 3 operations: check-in (could be any nth element), check-out, and get the least checked-in number. If we use a BitSet (an array-backed structure that stores everything as bits), the time to set/get the bit at the nth location is constant, O(1), since it uses an array underneath and random access into an array is O(1). So check-in and check-out can be done with bitset.set(n) or bitset.get(n) in O(1). To get the least checked-in number we can do two things: a sequential search, O(n) (if, based on the application, the least numbers are most likely to be checked in first, we can hope for O(1) in most cases; the goal here is to bet on the best case), or a binary search, O(log n), for the worst case of finding the least checked-in number.
In terms of space complexity, a BitSet stores 8 flags per byte, an eightfold saving over one byte per number. Which is pretty good.
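For illustration, a sketch with java.util.BitSet (the class name is mine; here a set bit means "checked out", and nextClearBit does the find-least scan, which is sequential under the hood, so this shows the API rather than the O(log n) claim):

```java
import java.util.BitSet;

class BitSetPool {
    private final BitSet out = new BitSet(); // bit n set => number n is checked out

    int checkout() {
        int n = out.nextClearBit(1);  // least index >= 1 whose bit is clear
        out.set(n);
        return n;
    }

    void checkin(int n) {
        out.clear(n);
    }
}
```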
Maintain two things:
- The largest checked-out number (L), and a min-heap of numbers that have been checked back in.
On checkout, if the min-heap is empty, increment L and return L.
If the min-heap is not empty, extract-min from the heap and return the min.
On checkin, insert into the heap. You can try doing some optimizations when the heap size reaches L, etc.
Initially L = 0 and the min-heap is empty (so the first checkout returns 1).
For this problem, I guess the memory usage will be the key factor in deciding the approach, rather than the frequencies (which will affect it, no doubt, but not as much).