Amazon Interview Question
Software Engineer / DevelopersCountry: India
Interview Type: In-Person
@rotinom thanks for the response.
Few questions my union solution is not o (1) it is o (n).
Coming back to your solutions
We don't have any clue what the nature of the numbers could be . It could be just 11111111 and 2222222 so keeping that much space may not be ideal.
Let's say we have two arrays saving bits of the two inputs.
Even though bit operation will be faster than string comparison the fact that u have to read over the arrays will make all the solutions o (n)
I would attempt something like this
Let's assume Initially both A and B are null sets A = {} and B = {}
1) Create 3 data structures Hash Table, List_Union, List_Intersection
2) Whenever u get a new number in A or B add it to the hash table
2a) if the number already exists in hash table, add the number to List_union
2b) If number does not exist in hash table, add it to hash table and add that number to List_Intersection
3) you can maintain these 3 data structures and whenever u have to find the A-B or B-A just do AUnionB - AIntersectionB
I think you can take care of it with single hashmap.
e.g.
Given inputs A = {1,2,3,4} , B={1,2,5}
Writing to the hashMap will be something like..
H = {1: 'AB', 2: "AB", 3: "A", 4: "A", 5: "B" }
A Union B = H.keys()
A Intersection B = [key for key in H.keys() if H.get(key) == "AB"]
A - B = [key for key in H.keys() if h.get(key) == "A"]
B - A = [key for key in H.keys() if h.get(key) == "B"]
Following your idea:
import java.util.HashMap;
import java.util.Iterator;
public class NumbersStore {
static HashMap<Integer, String> tmpTable = new HashMap<Integer,String>();
private static void printByBuffer(String bufName){
String InterAcumContainer = " ";
Iterator<Integer> keySetIterator = tmpTable.keySet().iterator();
while(keySetIterator.hasNext()){
Integer key = keySetIterator.next();
if(tmpTable.get(key).equals(bufName)){
if(InterAcumContainer.equals(" ")){
InterAcumContainer = ""+key;
}else{
InterAcumContainer = InterAcumContainer+","+key;
}
}
}
System.out.println(bufName+": "+InterAcumContainer);
}
private static void printUnion(){
String unionAcumContainer = " ";
Iterator<Integer> keySetIterator = tmpTable.keySet().iterator();
while(keySetIterator.hasNext()){
Integer key = keySetIterator.next();
if(unionAcumContainer.equals(" ")){
unionAcumContainer = ""+key;
}else{
unionAcumContainer = unionAcumContainer+","+key;
}
}
System.out.println("Union: "+unionAcumContainer);
}
private static void addValue(int keyIndex, String bufName){
if(tmpTable.containsKey(keyIndex)){
String tmpBufName = tmpTable.get(keyIndex);
if(!tmpBufName.equals(bufName)){
tmpTable.put(keyIndex, "AB");
}
}else{
tmpTable.put(keyIndex, bufName);
}
}
public static void main(String[] args){
//add some values
addValue(1, "A");
addValue(1, "B");
addValue(2, "A");
addValue(3, "B");
addValue(4, "B");
addValue(5, "A");
printUnion();
printByBuffer("AB");
printByBuffer("A");
printByBuffer("B");
}
}
Assuming numbers are only added
Define a struct with an 2 bit value field and a next pointer, prev pointer
now make a hash_table of int, the above struct node
let 2 bit field be such that
if == 01 number is in B
if == 10 number is in A
if == 11 number is in A and B
Maintain three lists
1. that goes through nodes with bit value = 11 (intersection list)
2. that goes through nodes with bit value = 01 (B-A)
3. that goes through nodes with bit value = 10 (A-B)
4. for Union return all elements that are in hash
operations
adding number, if not present already add to either lis 2 or list 3 based on whether it is in set A or set B
if present remove the current from either 2 or 3 and move to head of 1.
All operations are O(1)
I'd like to solve this problem like the following.
Thread A: will manage two data structures for set A
- hash_map with key = number and value = true if A intersection B
- hash_set for A - B
similarly
Thread B: will manage two data structures for set B
- hash_map with key = number and value = true if A intersection B
- hash_set for B - A
Thread C: will manage two data structures for A U B and A intersection B
- A U B: hash_map with key = number and value = bit fields(true if number contained in A, true if number contained in B)
- A intersection B: hash_set
If a number N is about to come to A, the producer notifies Thread A and Thread C
Thread A:
- updates hash_map with N
- waits for Thread C is done with N
Thread C:
- updates hash_map with N
- if N is contained in A and B, update hash_set for A intersection B and notifies Thread A with updated N, otherwise, just notifies Thread A
Thread A:
- wakes up
- if A intersection B is updated with N, it updates its hash_set. otherwise, just finish.
Similar procedure can be applied to a new number to set B.
@estif None of those is "fast" except union, which is O(1). The others take O(N) time.
- rotinom April 08, 2014If the numbers are coming in fast, then you can do some things to mitigate that, like put in a circular buffer, and pass it off to a separate thread. It's the typical producer-consumer pattern.
Secondly, what is interesting about this problem, is that you can do your typical tradeoff of space for speed. So, if you're getting numbers off the wire, then you have fixed set of numbers you can support, namely 2^32.
So, create two 2^32 bit arrays (only 536Mb each), labeled A & B. Number comes in, it's O(1) to flag the correct bit. Accessing the correct bit is simple arithmetic.
Union is A | B on all the bits.
Intersection is A & B.
A-B is A^ (A & B)
B-A is B^ (A & B)
All these operations will be fast-as-hell bit operations, and they're all O(1), since the array sizes are fixed.
You'd need to profile this for the actual performance gains, because the approach is a bit silly.
Otherwise, you could perform your various set constructions on the fly, as numbers are coming in.
The words "real time" are imprecise. What are the requirements? Is it 10ms? 20ms? 5ms?