Amazon Interview Question for SDE1s


Country: United States




Comment hidden because of low score. Click to expand.
1
of 1 vote

For small "k" this can be done in O(k N log(N)) using divide and conquer.
Start by sorting the list of points with respect to "x" and "y". Let's form two sorted arrays this way, namely Ax and Ay.

1- Take Ax and divide it into lower and upper halves (roughly N / 2 length). That is, divide the array around its median (say Mx).
2- Run the algorithm for the first half and then second half, independently (divide and conquer: this is a recursive call).
3- Lets assume by solving it over the two subarrays, "delta" is the minimum length found.
NOW We want to see if the two closest points are in different subarrays.
4- Now choose all points where "|x - Mx| < delta".
5- Go through all such points, (at most N). For a given point (x, y), look for the next "7" points on the array sorted with respect to "y". This will take O(N) time. At the end, if you find a pair with distance less than "delta", return it as solution, otherwise, return "delta".

I saw this algorithms in some course notes from Algorithm I in Stanford.

- Ehsan January 12, 2014 | Flag Reply
Comment hidden because of low score. Click to expand.
Comment hidden because of low score. Click to expand.
0
of 0 votes

The algorithm is correct. When you compare across the "y" axis for seven consecutive points, it makes sure that any point with distance smaller than that of sub-problems is detected.

For a proof of correctness, I suggest you refer to the Algorithms course in Stanford by "Tim Roughgarden". He solved it as an example for divide and conquer (I took it on coursera).

- Ehsan January 14, 2014 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Yes, but this solves only the case when k is 2, What about the others? I think this algorithm is NP hard and only an approximate algorithm would work in the general case

- Anonymous January 14, 2014 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

The problem is NP hard, not the algorithm. Sorry

- Anonymous January 14, 2014 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

I guess the question is finding "k" closest pair of nodes. That is, we consider n choose 2 cases and sort them based on distance and then return the "k" smallest. Otherwise, depending on the definition of "k closest points", the problem can be much harder.

For the former case, the algorithm above is better than listing all pairs since it is O(knlogn). If k = n, the complexity will be O(n2 logn) which is that of sorting all pairs.

- Ehsan January 14, 2014 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

Clarification Required:

K closest point among each other?
OR
K closest point from origin?
OR
K closest point from some given point?

- PKT January 12, 2014 | Flag Reply
Comment hidden because of low score. Click to expand.
-2
of 2 votes

I think its about first option.

- thelineofcode January 12, 2014 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

It was an online test and the question above is all. But I think it's the first one.

- 2013renting2013 January 12, 2014 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

Just a random thought:

If the problem were on 1D, the solution would be to sort the points in ascending order, then you would take the first k items and consider that your k-set of closest points with a cost equal to the difference of the k point minus the first one, then you'd visit k + 1 point and see if the distance between the k + 1 point and second one is smaller than the current minimum cost, if so, shift the k-item one point to the right, otherwise start a new k window with its first index located at the current point and repeat the procedure until you cannot create a k window anymore. By that time you'd have the k-set with the closest items, though, of course, you have to consider the edge case where you have less than k items in your original set.

That algorithm takes O(nlgn + n)

If we take close notice, the invariant we need to hold for the algorithm to work is that for all 1 <= i <= n - 1 point it is true that the euclidean distance d(i - 1, i) <= d(i - 1, i + 1), so can we still hold this property on a 2D scenario? I think so.

If you sort the items by its euclidean distance to the origin (in the scenario where all points are positive), then by triangulation the property holds. In the edge case where you have negative values for x or y axis, the property may break, so to fix that you'd have to locate the new origin with its x-axis equal to the smallest x value from the list of points and similarly, with the y-axis equal to the smallest y value from the collection.

This algorithm takes O(n) to find the smallest x and y values, O(lgn) to sort them in ascending order by its distance to the new origin and finally O(n) to get the k-set of closest points (using the euclidean distance as a way to calculate the cost of the current k-window).

- donkey code January 14, 2014 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

It would not. Take {(0, 1), (1, 0), (1.1, 1)} which sorted in the order of distance from origin, but the first two are pretty far away.

IMHO, It is not trivial to define a meaningful total-ordering (if exists) on a set of vectors.

- Ehsan January 14, 2014 | Flag
Comment hidden because of low score. Click to expand.
-2
of 2 vote

You can use k means algorithm.
For example n = 12 and k = 3. So we have to create 4 clusters (call it 'nc' - number of clusters) which contain 3 points.
1) Choose randomly 'nc' - 4 points as an initial cluster's centers.
2) Assign remaining points to the centers (closest points to the center's center are assigned to it). in this way we have just created clusters.
3) Recompute a center of each cluster. New cluster's center is the average of coordinates of all points in the cluster.
4) Repeat 2,3 till there is no changes in the clusters. In other words we found nc clusters which contain k closest points.
5) Check the total distance between points in each cluster and choose the one for which the total distance is minimum.

- thelineofcode January 12, 2014 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

You're good. But is there a faster way to do this?

- 2013renting2013 January 12, 2014 | Flag
Comment hidden because of low score. Click to expand.
-2
of 2 votes

If I come up with a better idea I'll post it. I'm not sure if it can be done faster.

- thelineofcode January 12, 2014 | Flag


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More