I have 10 million 10-bit integ

Amazon Interview Question for Software Engineer / Developers

2

of 2 votes

27
Answers
I have 10 million 10-bit integers to sort, how would you sort them and what's the time complexity?

Follow-on question: Instead of sorting integers, I now have 10 million pairs to sort. Each pair consists of a 10-bit integer and an object, the sort order is determined by the 10-bit integer. Will your original sort algorithm hold or do you need sort it differently?

(A word of advice: Ask as many questions as you want during the interview, but you MUST be quick. Also, don't mention anything until you've thought it through clearly, otherwise you're just inviting more questions. Time is of essence, you're too slow if this question takes you more than 15 minutes to come up with the optimal solution, because remember, you have to leave time for explanations and other questions)
- chriscareercup November 12, 2012 in United States | Report Duplicate | Flag | PURGE
Amazon Software Engineer / Developer Sorting

Email me when people comment.

An error occurred in subscribing you.

Country: United States

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 18 vote

part 1:

int[1024] sort;
for (i in list) {
  sort[i]++;
}

which is O(n).

part 2: if the pairs are really just integer and object, I would probably just have an array of linked lists of some sort, which will still be O(n) (get the first item in the appropriate list and add the new item before that)

- Mike Lewis November 12, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

I am not getting how this code works

- Suyash November 17, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Imagine that you need to sort and store the following one-digit numbers:

098723459681623612093851092837561827349878210

now, you could sort all of those and store them in order, or you could just have an int[10] and for each number, add to that cell in the array. so for the first digit, you do sort[0]++, for the second digit you do sort[9]++, etc.

At the end, rather than actually having the list, you have a summary of the list. You know that the list starts with four zeroes, but instead of storing that as 0000, you just have sort[0] = 4. To reproduce the list in sorted order, you would need to loop through the entire list and for each index, print that index a number of times equal to the value in that cell.

- Mike Lewis November 17, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Could you explain how the second part works?

- ~Viv~ November 20, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Assuming that there's no sub-ordering within the pairs, so it doesn't matter which of the objects associated with 0 comes first, what we're going to do is instead of having sort[0] = the number of zeroes, we're going to have sort[0] = ArrayList<Object> where that list is all of the objects associated with 0. Adding to the front of a linked list is O(1) so that's nice and fast. Does that make sense?

- Mike Lewis November 21, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Nice. Once you realize that 10 bit integer is nothing but 1024, it is a simple count sort.

- OliWtist November 28, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Thanks Mike for elegant solution and simple straightforward explanation

- oslearner December 11, 2012 | Flag

Comment hidden because of low score. Click to expand.

-1

of 1 vote

Sorting usually implies that all the data from the input exist in the output.
The problem says that there are 10 million integers but the solution Mike provided gives 1024 integers, so it just loses the data and therefore seems to be incorrect. Am I missing something?
Also, consider the input that consists of 5 million of number 512 and 5 millions of number 3. The proposed solution does not produce the correct answer, doesn't it?

- S.Abakumoff December 27, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

But yes you are sweetie...the knowledge of "Count Sort" :P

- The Artist December 27, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

okay. counting sort is not in place algorithm, it needs O(N) extra memory. shouldn't it be considered as the stop sign?

- S.Abakumoff December 27, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

1) why wouldn't it get the correct answer for the case of 5 million 2s and 5 million 3s?

2) as far as needing extra memory, it doesn't require O(N), it only requires 1024 extra ints, so 4096 bytes plus minimal overhead for the definition of the array, and that doesn't need to grow unless N gets larger than an int (and even then it just doubles in size when you use longs, and technically it could be an unsigned int in languages that support that so N needs to be twice as big). That doesn't seem like a big concession to make in return for sorting in O(N) time, and if there isn't anything important in the original data's ordering, you could actually look at this as a really efficient form of compression and throw out the original list.

- Mike Lewis December 27, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Okay, the code you have posted seems to be the 1st part of the counting sort(the code below copied from wikipedia article)

# allocate an array Count[0..k] ; THEN
# initialize each array cell to zero ; THEN
for each input item x:
    increment Count[key(x)]

You didn't complete the code, so strictly speaking it IS invalid

Then, the counting algorithm analysis clearly stays that because it uses arrays of length k + 1 and n, the total space usage of the algorithm is also O(n + k).

- S.Abakumoff December 27, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I would be shocked if my initial answer wasn't sufficient in an interview, but if they actually needed you to re-expand the compressed data into the original sorted list, it's pretty straightforward:

ArrayList<Integer> output;
for (int i = 0; i < 1024; i++) {
  for (int j = 0; j < sort[i]; j++) {
    output.add(i);
  }
}

- Mike Lewis December 27, 2012 | Flag

Comment hidden because of low score. Click to expand.

-1

of 1 vote

Strictly speaking this code is invalid as well. The output should be the array, not the array list because the input is array. At least you should call output.toArray() or something like that.

Then, this code is ignoring the fact that all the elements of the input array are 10 bit integers and is using Integer type(32 bit), so 40Mb of extra memory instead of 10Mb(or 20Mb for 16-bit integers) is being used.
Finally your code is 1) invalid 2) not-optimal. I would be shocked if you were hired with this level of answer :P

- S.Abakumoff December 28, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

now you are just trying to make yourself feel good by arguing on unimportant points. number of upvoting of Mike's solution is a proof enough for his ingenuity specially for the second part. i would have walked out of the interviewer if the interviewer argued the way you are doing. Looking back at all your comment i am damn sure that until yesterday you didn't even know about Count Sort. Looking at youur comment about the O(n) extra space makes you look even more stupid. sweetie there's nothing wrong in accepting the fact that you did not know something and someone else helped you learn it. demeaning/destroyin someone's work would not make you greater than the creator of that work.

- The Artist December 28, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Thanks for your valuable comment, Mr. Artist, I am glad that someone is tracking my messages :) So good to not feel lonely.

- S.Abakumoff December 28, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

if seeking attention is your intent behind all your fuss over nothing, then you are at the wrong forum buddy

- The Artist December 28, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Yeah maybe u are right, thanks again!

- S.Abakumoff December 28, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

agree to mike above...10 bit integer means just 1024 distinct values and count sort is the best way. mike's suggestion to the second part can be achieved with an array of structures with struct definition as follows

typedaf struct array_element
{
object obj; //origianal original def
array_element next;
}

define an array like below
array_element a[1024];

where a[i] = first pair in the given list with integer value i. all following pairs with int i should be stored in as the next node of a[i] and so on

- The Artist November 12, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Agree, Counting sort is the best way to sort integers where digits repeat frequently.

- Manish Pathak November 24, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

as the range of words is 0-1024(10 bit number), we can use counting sort

- Anonymous December 03, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

@Mike Lewis is this so called counting sort?

- Summer January 27, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

yes. good luck with your interviews!

- Mike Lewis January 27, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Hi Mark,
thanx for easiest solution, but i have a question
do we really need int sort[1024] ?
since they are numbers between 0-9 ie 10 values.

- vin February 07, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Can't we use Radix Sort Using (Counting Sort or Insertion Sort)????

- Aman March 02, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

great solution, but a negative integer in the list would throw ArrayIndexOutOfBound error. Right, as it is not mentioned positive integers? Which would mean the number range from -1024 to 1024. A slight modification would suffice though
int[1024] sort;
int[1024] sortNegative;
for (i in list) {
if(i < 0) {
sortNegative[i*(-1)]++;
} else {
sort[i]++;
}
}

- Anonymous April 15, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

@Mike Lewis, your solution will work if the algorithm need not be stable. ie none of the given elements have any satellite data, that must be preserved after the sort. In case of repeating numbers, this algorithm will break.
Counting sort requires O(n) space in order to be stable. For this problem that means O(1000000) extra space.
Radix sort would be a good choice in this problem. It is stable and requires constant extra space. Running time is O(nk) where k is the max number of digits.
In this particular case, it beats comparison sort algorithms as k(4) is much smaller than logn(1000).
logn>>>k

- Aparna September 15, 2013 | Flag Reply

CareerCup

Amazon Interview Question for Software Engineer / Developers

Books

Videos

Resume Review

Mock Interviews