Amazon Interview Question
Software Engineer in Testshow do you calculate the product...
Finding the k largest elements is easy and it can be easily done with bit arrays.. but the key point here is how do you do the calculation of k largest elements??
@swathi if i am not wrong finally our heap will contain the k largest element then we can easily multiple them isn't it ?? you can do it using any tree traversal :)
Create Int32 array in size of 134217728, to get a bit for each possible number might be represented as Int32 (this array initiated as default to all 0).
Read numbers from file one after the other (assuming there is a single number per line)
For each number read set its corresponding bit (in case it wasn't set already) in the array (calculate appropriate location of corresponding int within the array and set the bit within)
At all times maintain the maximum number you facing
after finishing reading the all file go over all int numbers in arrays (starting from the int represents the bit of the largest number backwards), for each int != 0, print corresponding number represented by each set bit, until K bits where observed (or you finished searching the all array).
I can foresee 2 probs
1. U r making an assumption that storing K integer in memory will take < 1G . K can be greater than 1G.
2. If i understand your algo properly . The heap may not contain K largest elements . Look at this order let say k=3 and you have 10 ,3,4 in the heap. Next element that comes in is 5. You will not add it to heap
The file size is 4GB. i.e. 2^32 bytes. The number of integers would be [on 32 bit machine] 2^32/4=2^30.
The main memory size is 1GB i.e. 2^30 bytes.
So, The size is exactly the same.
We can follow one out ofthe two below approaches:
Approach 1:
Read the file in chunks say 4 chunks. Maintain a min heap of size k & update it as said by the @wgpshashank. After the file scanning is over, we are left with k largest elements.
Now, it depends whether we are using signed int or unsigned int, In simple words, it is a good point to discuss with the interviewer because overflow may occur. We will have to handle it accordingly.
Approach 2:
Instead of reading file in chunks, we can create a bit vector & mask it accordingly where each bit will represent one integer. So, the space complexity reduces by a factor of 8.
Follow the above approach of min heap afterwards.
we can do it using minHeap :) lest say file size is s=2^32 byte so number of integer it can have will be N=s/32
- WgpShashank July 07, 2011Algorithm
1.create min-heap of size K
2. for remaining elements read from file one by one compare it with root of heap if its greatre then root then replace it with root & call heapify
3. repeat whole algo until EOF
our heap will contain the k largest elements , return product of all the elemnmts in heap will give us answer
Time Complexity (O(k)+O(n-k)logk)
Do Notify me if i am wrong or any other algo ??
Shashank