## Amazon Interview Question for Software Engineer / Developers

• 0

Country: United States

Comment hidden because of low score. Click to expand.
3
of 5 vote

What is the meaning of file size is too large ? Too large to fit into memory ? If yes, then I will go with external merge sort.

Comment hidden because of low score. Click to expand.
0

I do not know who has downvoted king's answer, but I would try to do it too - use the external merge sort if we are limited in the memory.

Comment hidden because of low score. Click to expand.
0

Burstsort and its variants are cache-efficient algorithms for sorting strings and are faster than quicksort for large data sets, first published in 2003.[1]

Burstsort algorithms use a trie to store prefixes of strings, with growable arrays of pointers as end nodes containing sorted, unique, suffixes (referred to as buckets). Some variants copy the string tails into the buckets. As the buckets grow beyond a predetermined threshold, the buckets are "burst", giving the sort its name. A more recent variant uses a bucket index with smaller sub-buckets to reduce memory usage. Most implementations delegate to multikey quicksort, an extension of three-way radix quicksort, to sort the contents of the buckets. By dividing the input into buckets with common prefixes, the sorting can be done in a cache-efficient manner.

Comment hidden because of low score. Click to expand.
1
of 1 vote

There are several part of the question

1. Fastest sorting algorithm for string depend on the input
- if all the input string has equal size: radix sorting is the best choice with complexity of O(k*n) with k is the size of the string. Space complexity O(n) since it can not be done in-place
- if the input has very different size: and the longest string can be large > 20, quick sort is generally favored . The space complexity is O(1), it can be done in-place

2. Sorting when the number of string is large. When he mean "large", it should be inferred as large than memory can handle (for both in-place or out-of-place)

The answer for the case that we can sort in the memory I have mentioned above.
Now I assume we need to do external sorting.

One potential solution (that can combine with the other in-memory method that I mentioned above)

- Sort all string based on the length of the string
- you need only a linear scan to collect the length
- you need another linear scan to put string on to the group based on the length

- Based on the counting histogram, divide the pre-sorted string into group that fit onto the memory then sort
- Chose radix sorting if the length in the group uniform
- Chose quicksort if the length is non-uniform or the number of string is small log N is a small number

- At this point the only extra step is to merge between sorting string elements of the same length in consecutive groups.

Comment hidden because of low score. Click to expand.
0

very clean and good solution as i can think

Comment hidden because of low score. Click to expand.
0
of 0 vote

I think we can use heap sort where we will sort all the strings in O(nlogn) and it is an inplace algorithm so auxillary space will be 0 in this case.

Comment hidden because of low score. Click to expand.
0
of 0 vote

I think we can use burstsort .
here is the wikipedia link h**p://en.wikipedia.org/wiki/Burstsort

Comment hidden because of low score. Click to expand.
0
of 0 vote

In this sort with radix sort mixed with count sort. If k is size of biggest string and n is the total no of string then order will be O(k*n). First start sorting from the last character of strings(kth element) and then move towards left.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Strings set (Alphabet) is small in size so counting sort is best for string sorting linear time.

Comment hidden because of low score. Click to expand.
0
of 0 vote

3-way radix quicksort is a good start

``````public static void quick3Way(String[] a) {
int lo = 0;				// the initial index
int hi = a.length - 1;	// the last index
int d = 0;

quick3Way(a, lo, hi, d);
}

private static void quick3Way(String[] a, int lo, int hi, int d) {

if ( hi < lo ) return;

int i = lo + 1;
int lt = lo; int gt = hi;
// partitioning character
int v = charAt(a[lo], d);

while ( i <= gt ) {
int t = charAt(a[i], d);
if ( t < v ) exch( a, lt++, i++ );
else if ( t > v ) exch( a, i, gt-- );
else ++i;
}

quick3Way(a, lo, lt-1, d);
if ( v >= 0 ) quick3Way(a, lt, gt, d+1);
quick3Way(a, gt+1, hi, d);
}

private static void exch(String[] a, int i, int j) {
String t = a[i];
a[i] = a[j];
a[j] = t;
}``````

Comment hidden because of low score. Click to expand.
0

Uses approx 2N lg N character compares.

``````private static int charAt(String str, int d) {
if ( d >= str.length()  ) return -1;
return str.charAt(d);
}``````

Comment hidden because of low score. Click to expand.
0
of 0 vote

bucket sort is the best option.quicksort would actually take o(nklogn) whre k:avd length of the strings

Comment hidden because of low score. Click to expand.
0
of 0 vote

Comment hidden because of low score. Click to expand.
0

How can you sort using ternary search tree ??? There is no way to sort using ternary search tree.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Burstsort and its variants are cache-efficient algorithms for sorting strings and are faster than quicksort for large data sets, first published in 2003.[1]

Burstsort algorithms use a trie to store prefixes of strings, with growable arrays of pointers as end nodes containing sorted, unique, suffixes (referred to as buckets). Some variants copy the string tails into the buckets. As the buckets grow beyond a predetermined threshold, the buckets are "burst", giving the sort its name. A more recent variant uses a bucket index with smaller sub-buckets to reduce memory usage. Most implementations delegate to multikey quicksort, an extension of three-way radix quicksort, to sort the contents of the buckets. By dividing the input into buckets with common prefixes, the sorting can be done in a cache-efficient manner.

Comment hidden because of low score. Click to expand.
-1
of 1 vote

What about radix sort on MSD (most significant digit)? Assume strings will be small in size, then time is O(k*n) << O(nlogn)

Comment hidden because of low score. Click to expand.
-1
of 1 vote

In this sort with radix sort mixed with count sort. If k is size of biggest string and n is the total no of string then order will be O(k*n).

Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

### Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

### Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.