Amazon Interview Question
Country: United States
Interview Type: Phone Interview
Your analysis is incorrect. This is an O(n^2) solution in the worst case. In the best case your solution is O(n), but only if you get lucky and choose the pivot correctly on the first try.
On average this solution should be O(n log(n)).
If the pivot is chosen using the right algorithm, quickselect gives linear time complexity in the average case, since we always discard one of the two halves. Quick Sort is known as an O(n log n) algorithm because of its average case; by the same reasoning this should be known as an O(n) algorithm, although its worst case is still O(n^2).
This is the algorithm that I gave him:
1) Go through all the points once to calculate the distance of each point from (0, 0).
2) Copy the first 100 points into an array.
3) Sort the array.
4) Starting from point number 101, compare each point to the last element of the array.
If the new point is smaller, swap it with the last element, then swap the new last element with the elements before it until the array is sorted again.
5) Repeat step 4 until the end.
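The sorted-array bookkeeping above can be sketched as follows. This is a minimal sketch over precomputed distances; the helper name is mine, not from the thread:

```cpp
#include <algorithm>
#include <vector>

// Keep the k smallest distances seen so far in a sorted array.
// Each qualifying insertion shifts at most k elements, so the whole
// run is O(n * k); a max-heap would bring this down to O(n log k).
std::vector<double> kSmallestSorted(const std::vector<double> &dists, size_t k) {
    std::vector<double> window(dists.begin(),
                               dists.begin() + std::min(k, dists.size()));
    if (window.empty()) return window;
    std::sort(window.begin(), window.end());                // sort the first k
    for (size_t i = window.size(); i < dists.size(); ++i) { // remaining points
        if (dists[i] < window.back()) {
            window.back() = dists[i];                       // replace the largest
            // bubble the new value left until the array is sorted again
            for (size_t j = window.size() - 1;
                 j > 0 && window[j] < window[j - 1]; --j)
                std::swap(window[j], window[j - 1]);
        }
    }
    return window;
}
```

For example, kSmallestSorted({5, 1, 9, 3, 7, 2}, 3) returns {1, 2, 3}.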
For any element that fits into the smallest 100, you need to rearrange the heap/array to keep it in sorted order, which takes log 100. You do this for all n elements, so the time complexity is O(n log 100) and the space complexity is O(100).
O(n log 100) = O(n * K), since log 100 is a constant K, which is O(n).
So the solution has time complexity O(n).
A max heap of size 100 is better than the array implementation: the array's per-insertion cost is on the order of 100 (shifting elements to keep it sorted), while for a max heap it is just log 100.
Nope, it's a max heap. Insert the first 100 distances into the max heap, so the root node holds the maximum of those 100 distances. For every following distance, check whether it is less than the root (max) distance in the heap. If it is less, delete the root and insert the new distance.
What if the next point is greater than the root: do we discard it? And when the root node is deleted, is the heap rebalanced again? Could you please elaborate the answer a bit.
I would use a min heap, computing distances as I go. Retrieval of the minimum is then O(1). The complexity will be O(N) + O(N log N). But when N is a million, log N can be smaller than the constants in the previous solutions.
Here is my C++ solution:

#include <iostream>
#include <vector>
#include <limits>
#include <algorithm>
#include <cmath>   // for sqrt

using namespace std;

struct dist_ind {
    float dist;
    int index;
    dist_ind(float dist_, int i): dist(dist_), index(i) {}
};

struct pos {
    int x;
    int y;
    pos(int x_, int y_): x(x_), y(y_) {}
};

vector<pos> find_k_clostest_points(vector<pos> vec, int x, int y, int k) {
    vector<dist_ind> v;
    vector<pos> out;
    if((int)vec.size() <= k) {   // fewer points than requested: return them all
        return vec;
    }
    for(int i = 0; i < (int)vec.size(); i++) {
        // Euclidean distance from (x, y); sqrt is monotonic, so the squared
        // distance would rank points identically and avoid the sqrt.
        float dx = (float)(x - vec[i].x);
        float dy = (float)(y - vec[i].y);
        float distance = sqrt(dx * dx + dy * dy);
        v.push_back(dist_ind(distance, i));
    }
    // Selection: k full passes over the distances, O(n * k) overall.
    for(int i = 0; i < k; i++) {
        float min = numeric_limits<float>::max();
        int min_index = 0;
        for(int j = 0; j < (int)v.size(); j++) {
            if(v[j].dist < min) {
                min = v[j].dist;
                min_index = j;
            }
        }
        out.push_back(vec[min_index]);
        v[min_index].dist = numeric_limits<float>::max(); // mark as taken
    }
    return out;
}

int main() {
    vector<pos> vec = {{1,2}, {3,4}, {5,6}, {20,30}, {2,2}, {10,11}, {4,4}};
    vector<pos> k_points = find_k_clostest_points(vec, 0, 0, 3);
    for_each(k_points.begin(), k_points.end(),
             [](pos p) { cout << p.x << ", " << p.y << endl; });
    return 0;
}
Here is the improved version with a heap:

#include <iostream>
#include <vector>
#include <limits>
#include <algorithm>
#include <queue>
#include <cmath>   // for sqrt

using namespace std;

struct dist_ind {
    float dist;
    int index;
    dist_ind(float dist_, int i): dist(dist_), index(i) {}
};

struct pos {
    int x;
    int y;
    pos(int x_, int y_): x(x_), y(y_) {}
};

// Max-heap ordering: the largest distance sits on top of the priority queue.
struct comparison {
    bool operator() (const dist_ind &lhs, const dist_ind &rhs) const {
        return lhs.dist < rhs.dist;
    }
};

vector<pos> find_k_clostest_points2(vector<pos> vec, int x, int y, int k) {
    priority_queue<dist_ind, vector<dist_ind>, comparison> q;
    vector<pos> out;
    if((int)vec.size() <= k) {   // fewer points than requested: return them all
        return vec;
    }
    for(int i = 0; i < (int)vec.size(); i++) {
        float dx = (float)(x - vec[i].x);
        float dy = (float)(y - vec[i].y);
        float distance = sqrt(dx * dx + dy * dy);
        if((int)q.size() == k) {
            // Heap is full: keep the new point only if it beats the current worst.
            if(q.top().dist > distance) {
                q.pop();
                q.push(dist_ind(distance, i));
            }
        }
        else {
            q.push(dist_ind(distance, i));
        }
    }
    // Drain the heap. Overall: O(n log k) time, O(k) extra space.
    while(!q.empty()) {
        out.push_back(vec[q.top().index]);
        q.pop();
    }
    return out;
}

int main() {
    vector<pos> vec = {{1,2}, {3,4}, {5,6}, {20,30}, {2,2}, {10,11}, {4,4}};
    vector<pos> k_points = find_k_clostest_points2(vec, 0, 0, 3);
    for_each(k_points.begin(), k_points.end(),
             [](pos p) { cout << p.x << ", " << p.y << endl; });
    return 0;
}
(1) Find the distances of the points from (0,0): O(N)
(2) Find the minimum value in the list, store that point, and remove it from the list: O(N)
(3) Repeat step (2) until 100 points are found.
Complexity:
O(N) + O(100 * N) = O(N)
Yes, but what if we are given N/3 in place of 100, or some other really big number? It becomes O(n^2) then, right?
The top solution using quickselect has better odds than this one in that case...
This ranks points from (0,0) by |x| + |y|, i.e. the Manhattan distance. (Note that this does not always match the Euclidean order: (5,0) has a smaller |x| + |y| than (3,3) but is farther from the origin; comparing x^2 + y^2 ranks points correctly.)
I use two stacks for this algorithm.
1. If the main stack is empty, insert (x, y);
else
2. If |x| + |y| < the value for Main Stack.top(), pop the top and put it in the second stack. Keep popping until the top's value is less than or equal to |x| + |y|.
3. Insert (x, y) into the main stack.
4. Pop all elements from the second stack and put them back into the main stack.
At any time the main stack holds the points sorted by distance from (0,0), with the closest at the bottom.
Let me know if it can be improved
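The insertion step described above can be sketched as follows. I use squared Euclidean distance x^2 + y^2 rather than |x| + |y|, since the latter is the Manhattan distance and can order points differently; the names are mine. Note that each insertion is O(n), so building the whole stack is O(n^2), and it keeps all n points rather than only the closest k:

```cpp
#include <stack>
#include <utility>

using Pt = std::pair<int, int>;

// Squared Euclidean distance from (0,0). Ranking by |x| + |y| (Manhattan)
// can disagree with this, e.g. for (5,0) vs (3,3).
static long long d2(const Pt &p) {
    return (long long)p.first * p.first + (long long)p.second * p.second;
}

// Insert p so that mainStk stays sorted with the farthest point on top.
void sortedInsert(std::stack<Pt> &mainStk, const Pt &p) {
    std::stack<Pt> aux;                       // the second stack
    // Step 2: pop points farther than p into the auxiliary stack.
    while (!mainStk.empty() && d2(mainStk.top()) > d2(p)) {
        aux.push(mainStk.top());
        mainStk.pop();
    }
    mainStk.push(p);                          // step 3: insert p in place
    while (!aux.empty()) {                    // step 4: restore the rest
        mainStk.push(aux.top());
        aux.pop();
    }
}
```

After inserting (3,4), (1,1), and (2,2) into an empty stack, the top is (3,4) and the bottom is (1,1), i.e. closest at the bottom as the post describes.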
Can be done in O(n) without additional memory. Just modify quickselect (the selection problem) a bit. Here is a tested implementation in Java:
- Anshul December 02, 2012