Amazon Interview Question for Software Engineer / Developers


Country: United States
Interview Type: In-Person




Comment hidden because of low score. Click to expand.
2
of 2 vote

My understanding of top 10 product ids are about statistics (purchase counts, sale income, and etc) associated with 10 product ids.

The minimum heap uses the statistics as node value, instead of the product id itself. Therefore, there should be an auxilary data structure to be used together with the heap so that product id can be mapped to heap node. For this purpose, hash table is perfect.

To get product history in the past hour, I assume that a queue is needed. Each queue element contains product id, properties and timestamp. Periodically, out-dated product records are removed from the queue and recent records are added to the queue.

All the "deltas" will be updated to the heap:
1) Lookup product id in hash table
2) If there is existing heap node, update heap node value using the delta, and heapify to maintain heap property.
3) If there is new product id, add it to the heap.

We can't really keep only 10 nodes in the heap - we need to keep all product ids as long as its property value >0. As a safe approximation, we may keep less nodes, for instance, the smaller value of 256 or number of all relevant product ids. Keeping only 10 could cause nontrivial errors.

- jzhu May 18, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
1
of 1 vote

My solution is to keep a heap and a list, the head will keep the elements and should have a "update" operation. the list keeps all the elements arrived in past 1 hour. when a new element arrived, we push it to the back of the list, and check fro the front of the list to delete out-of-time elements; while deleting these elements we need to update the "request number count" of that element in the heap, and delete the element if the count goes to 0 (this prevents the heap becoming too large). when we do the top 10 query just extract heap for 10 times.

- srnttt November 06, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

"add" also need to update heap, sometimes do insert

- srnttt November 06, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 2 vote

Keep a heap of 10 elements and keep calling heapify with each new prod id recieved. Preserve top 10(all heap elements) at the hour boundaries. And create a new heap per hour based on timestamp.

- kiran April 16, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 2 vote

I think it is easier just maintain a min heap with a size of 10, for every new value that is larger, just pop the old value and push the new value. Inspect the value after 1 hour.

- Anonymous April 16, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

outer loop bubble sort 10 times: Complexity O(10n)

- Anonymous April 25, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Bubble sort on the date collected in one hour will blow up. Heaps provide a way of getting approximately instantaneous result for top 10 retrieval.

- Learner May 03, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

Past 1 hour = Current time to cur_min-60 mins.

We need 2 data structures.
1. For each minute maintain map of - {product_id, current_min_frequency}
2. Heap(priority queue) of all product_ids in the last hour, priority by current_hour_frequency

How to maintain:-

As each new request/product_id is read in,
1. Add to current min - prod_id, freq in current min [if already present, increment current_min_freq]
2. Increment current_hour_frequency in priority queue, and re-heapify

After each min,
1. remove the pair for (curre min - 60 mins)
2. decrement current_hour_freq by cur_min_frequency of that oldest mint in Priority Queue, and re-heapify. If cur_hour_freq reaches 0, delete that product_id's node from min_heap.

When queried, get the top 10 elements from heap.

- X July 31, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Itcecsa, the question is not clear. It can have many meanings. Did they ask to keep a running list of top 10 items purchased in last one hour with new purchases constantly being received through stream? Or is the list already stored in a file or database and you are asked to write a query to list top 10 items. Please add more details.

- jatin085 April 20, 2014 | Flag Reply
Comment hidden because of low score. Click to expand.
-1
of 1 vote

This was blazing fast on my i7 laptop with 6 GB RAM and 64 bt OS.. so just have an int matrix of size 60 x 1_000_000.. use some global variable to denote what index is currently used e.g. 35 if its 35th minute.


all function e.g. sum, top10 take less then 1 second..

total memory used was less than 1 GB.. i checked using

Runtime.getRuntime().totalMemory()

in reality process method will also be passed index of min for which largest data is provided. right now i am just using it for all 60 min just to check speed which turned out as around sec.. so actually it will be 60 times faster.

public class Top10 {

	public static void main(String[] args) {
		int[][] store = new int[60][1_000_000];
		int[] lastMinuteData = new int[1_000_000];

		// Main algo
		process(store, lastMinuteData);
		int[] sum = sum(store);
		int[] top10 = top10(sum);
	}

	public static void process(int[][] store, int lastMinuteData[]) {
		for (int i = 0; i < 60; i++) { // for each minute
			for (int j = 0; j < 1_000_000; j++) { // for each record
				store[i][j] = lastMinuteData[j];
			}
		}
	}

	public static int[] sum(int[][] store) {
		int k[] = new int[1_000_000];
		for (int j = 0; j < 1_000_000; j++) {
			for (int i = 0; i < 60; i++) {
				k[j] += store[i][j];
			}
		}
		return k;
	}

	public static int[] top10(int[] productCounter) {

		int index[] = new int[10];
		int count[] = new int[10];

		for (int j = 0; j < 1_000_000; j++) {
			compare(j, productCounter[j], index, count);
		}
		return index;
	}

	private static void compare(int i, int c, int[] index, int[] count) {
		for (int j = 0; j < count.length; j++) {
			if (c >= count[j]) {
				index[j] = i;
				count[j] = c;
				return;
			}
		}
	}
}

- asim.ghaffar August 15, 2013 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More