Facebook Interview Question
iOS DevelopersCountry: United States
Interview Type: Phone Interview
It is as simple as taking a k size array and populating it with first k points.
Now
if next element < arr[0]
then arr[0] = element
Call heapify i.e
check arr[0] with its children i.e. 2i and 2i+1 where i is the root.
Let me know if there is a problem, I will write a c++ code
here is my idea of that:
-(NSArray*)closestPointsFrom:(NSArray *)points forK:(int)k
{
NSArray *sorted = [points sortedArrayUsingComparator:^(id obj1, id obj2) {
FBPoint *p1 = (FBPoint*)obj1;
FBPoint *p2 = (FBPoint*)obj2;
float dist1 = sqrt(p1.X*p1.X+p1.Y*p1.Y);
float dist2 = sqrt(p2.X*p2.X+p2.Y*p2.Y);
if (dist1> dist2)
return NSOrderedDescending;
else if (dist1 < dist2)
return NSOrderedAscending;
else
return NSOrderedSame;
}];
NSMutableArray *closest = [[NSMutableArray alloc] init];
for (int i = 0; i < k; i++)
[closest addObject:[sorted objectAtIndex:i]];
return closest;
}
it looks like O(N*log(N))
heapify using which key? x coordinate or y? or are you going to calculate distance from all the numbers?
In fact you don't need a heap or a binary search tree here, because all the array is already loaded into memory. So you can use some modification of Quick-Select algorithm, to achieve O(n) speed and O(1) extra-memory (because tail recursion can be effectively eliminated) in case you don't need your initial array later on
See wikipedia for "Selection_algorithm" and "Partial_sorting"
Actually, one would need a heap, if
1) points are not yet loaded, and you receive them one-by-one (i.e. it's not a set of points, but a stream)
2) or maybe you are out of memory, and you have to preserve the initial array (quickselect-based algoritm would modify it) - then you can use heap with O(k) memory instead of O(n) copy of array
So here is my C++ solution of finding k largest nums in the array - one can easily modify it to work with points and distances. To eliminate worst case one should also randomize selecting pivot
#include <vector>
#include <iostream>
#include <algorithm>
#include <ctime>
using namespace std;
// partition descending
size_t Partition(vector<int>& v, size_t start, size_t end) {
int i = start;
for (size_t j = start; j < end; ++j) {
if (v[j] > v[end]) {
swap(v[j], v[i]);
++i;
}
}
swap(v[i], v[end]);
return i;
}
vector<int> FindKLargestInArray(vector<int>& v, size_t start, size_t end, int k) {
if (end <= start) {
if (k == 0) {
return vector<int>(1, v[start]);
} else {
return vector<int>();
}
}
size_t pivot = Partition(v, start, end);
size_t pivotRelative = pivot - start;
if (pivotRelative > k) {
return FindKLargestInArray(v, start, pivot - 1, k);
} else if (pivotRelative < k) {
vector<int> res(v.begin() + start, v.begin() + pivot + 1);
vector<int> rightRes = FindKLargestInArray(v, pivot + 1, end, k - pivotRelative - 1);
res.insert(res.end(), rightRes.begin(), rightRes.end());
return res;
} else {
return vector<int>(v.begin() + start, v.begin() + pivot + 1);
}
}
/***********************************************/
ostream& operator<<(ostream& o, const vector<int>& v) {
for (size_t i = 0; i < v.size(); ++i) {
o << v[i] << " ";
}
return o;
}
vector<int>& operator+=(vector<int>& v, int x) {
v.push_back(x);
return v;
}
vector<int>& operator,(vector<int>& v, int x) {
v.push_back(x);
return v;
}
int main() {
srand(time(NULL));
vector<int> v;
v += 0, 1, 2, 2, 3, 4, 4, 6, 8, 8, 8, 9, 10,
11, 12, 12, 14, 15, 17, 19, 20;
random_shuffle(v.begin(), v.end());
int k = 4; // in interface count k from 1, but in implementation count k from 0
cout << v << endl;
vector<int> largest = FindKLargestInArray(v, 0, v.size() - 1, k - 1);
sort(largest.begin(), largest.end()); // optional
cout << largest << endl; // 15, 17, 19, 20
return 0;
}
Nice! Esentially,
1) Find the kth smallest element in O(n) time (en.wikipedia.org/wiki/Selection_algorithm#Linear_general_selection_algorithm_-_Median_of_Medians_algorithm)
2) Partition the elements to less than and greater than k in O(n) time (en.wikipedia.org/wiki/Partial_sorting#Direct_application_of_the_linear-time_selection_algorithm)
Essentially O(n) overall..!
Aaha! Nice!!! :-)
We can get linear time by using selection rank and then returning the first k elements. Essentially we find a pivot, shift larger items to the right, smaller or equal items to the left. If the final position of our pivot point is k then we're done, if our pivot is greater than k then we repeat to the left otherwise we repeat to the right. Finally copy and return the first k elements.
O(n) time complexity since the first partition will cost n, subsequent partitions will (on the average case) be half of the previous partition (the randomized selection of the pivot point locks in the expected time), so O(n) + O(n/2) + O(n/4)... then one final O(k) to copy the k items for a total of O(n) time. This solution assumes you can modify the input array, if you can't you could copy it initially and bump up the space complexity to O(n) but the algorithm would still be linear.
package com.my.tst;
import java.awt.Point;
import java.util.Arrays;
import java.util.Comparator;
public class ClosestPoints {
private static final Point root = new Point();
Comparator<Point> c = new Comparator<Point>(){
double pointDistance(Point p1, Point p2){
return Math.sqrt(Math.pow(2, (p1.getX() - p2.getX())) + Math.pow(2, (p1.getY() - p2.getY())));
}
@Override
public int compare(Point p1, Point p2) {
double d1 = pointDistance(p1, root);
double d2 = pointDistance(p2, root);
if(d1 < d2){
return -1;
}
if(d1 == d2){
return 0;
}
return 1;
}
};
public Point[] getClosestToRoot(Point[] input, int k){
if(k <= 0) return new Point[0];
int l = 0; int h = input.length -1; int target = k-1;
int p = partition(input, l, h);
while(p != target){
if(p < target){
p = partition(input, p+1, h);
} else{
p = partition(input, l, p-1);
}
}
return Arrays.copyOf(input, k);
}
int partition(Point[] input, int l, int h){
int range = h-l+1;
int p = l + (int)Math.random()*range;
swap(input, p, h);
Point pivot = input[h];
int firstHigh = l;
for(int i = l; i < h; i++){
if(c.compare(input[i], pivot) < 1){
swap(input, i, firstHigh);
firstHigh++;
}
}
swap(input, firstHigh, h);
return firstHigh;
}
void swap(Object[] input, int i, int j){
Object tmp = input[i];
input[i] = input[j];
input[j] = tmp;
}
}
// this is max heap problem
#include <iostream>
#include <cstdio>
#include <cmath>
#include <queue>
#include <cstdlib>
#include <vector>
using namespace std;
struct point{
int x;
int y;
double dist;
};
bool operator<(const point& a, const point& b) {
return a.dist > b.dist;
}
int main(){
priority_queue<point> pq;
int n;
cin>>n;
vector<point> v;
for(int i=0;i<n;i++){
point p;
int xco,yco;
cin>>xco>>yco;
p.x = xco;
p.y = yco;
p.dist = sqrt((xco*xco)+(yco*yco));
v.push_back(p);
}
int k;
cin>>k;
for(int i=0;i<k;i++)
pq.push(v[i]);
for(int i=k;i<n;i++){
point temp = pq.top();
if(temp.dist > v[i].dist){
pq.pop();
pq.push(v[i]);
}
}
while(!pq.empty()){
point temp = pq.top();
pq.pop();
cout<<temp.x<<" "<<temp.y<<" "<<temp.dist<<endl;
}
return 0;
}
Max(klogk , (n-k)logk)
static class Solution{
static int[] array={11,32,2,4,1,9,11};
static int k=3;
public static void main(String[] args){
for(int i=0;i<k;i++){//klog(k)
HeapifyUp(i);
}
for(int i=k;i<array.length;i++){//(n-k)logk
if(array[i]<array[0]){
int temp = array[i];
array[i] = array[0];
array[0]=temp;
HeapifyDown(0);
}
}
for(int i=0;i<k;i++)
System.out.print(array[i]+" ");
}
public static void HeapifyUp(int last){
if(last==0)
return;
int parentIndex = (last-1)/2;
if(array[parentIndex]<array[last]){
int temp = array[parentIndex];
array[parentIndex] = array[last];
array[last]=temp;
HeapifyUp(parentIndex);
}
}
public static void HeapifyDown(int first){
int leftChildIndex = first*2+1;
int rightChildIndex = first*2+2;
if(leftChildIndex>=k)
return;
int maxChildIndex = leftChildIndex;
if(array[rightChildIndex]>array[leftChildIndex] && rightChildIndex<k)
maxChildIndex = rightChildIndex;
if(array[maxChildIndex]>array[first]){
int temp = array[maxChildIndex];
array[maxChildIndex] = array[first];
array[first]=temp;
HeapifyDown(maxChildIndex);
}
}
}
@interface KTPoint : NSObject
@property (nonatomic, assign) float x;
@property (nonatomic, assign) float y;
@property (nonatomic, assign) float distance;
@end
@implementation KTPoint
- (NSString *)description
{
return [NSString stringWithFormat:@"x:%f y:%f distance:%f",self.x,self.y,self.distance];
}
@end
- (NSArray *)pointsClosestToOrigin:(NSArray *)pts size:(NSUInteger)k
{
NSMutableArray *points = [pts mutableCopy];
for(KTPoint *point in points)
{
point.distance = (point.x*point.x) + (point.y*point.y);
}
int from = 0;
int to = (int)[points count]-1;
// if from == to we reached the kth element
while (from < to) {
int r = from, w = to;
float mid = [(KTPoint *)points[(r + w) / 2] distance];
// stop if the reader and writer meets
while (r < w) {
if ([(KTPoint *)points[r] distance] >= mid) { // put the large values at the end
id tmp = points[w];
points[w] = points[r];
points[r] = tmp;
w--;
} else { // the value is smaller than the pivot, skip
r++;
}
}
// if we stepped up (r++) we need to step one down
if ([(KTPoint *)points[r] distance] > mid)
r--;
// the r pointer is on the end of the first k elements
if (k <= r) {
to = r;
} else {
from = r + 1;
}
}
NSMutableArray *results = [NSMutableArray arrayWithCapacity:k];
for (int i = 0; i < k; i++) {
[results addObject:[points objectAtIndex:i]];
}
return [results copy];
}
This question tells - given the set of points meaning the points are already available and also it says find the first K points and not Kth point closest to 0,0. My approach to this is to calculate the distances of every point to the 0,0 by creating distanceSquared array. Then this array gets sorted by distance from 0,0 by using quick sort algorithm O(n log n). As a result first K points are printed out. Space complexity is O(2n) and time complexity is O(2n log n).
<?php
$points = array(
array(10,10),
array(15,12),
array(20,250),
array(1,5),
array(56, 4)
);
function partition(&$distancesSq, $left, $right, $pivotIndex)
{
$storeIndex = $left;
$pivotValue = $distancesSq[$pivotIndex][1];
// swap pivot to right
$tmp = $distancesSq[$pivotIndex]; $distancesSq[$pivotIndex] = $distancesSq[$right]; $distancesSq[$right] = $tmp;
for($i = $left; $i < $right; $i++)
if ($distancesSq[$i][1] < $pivotValue)
{
$tmp = $distancesSq[$storeIndex];
$distancesSq[$storeIndex] = $distancesSq[$i];
$distancesSq[$i] = $tmp;
$storeIndex++;
}
// move pivot to storeIndex
$tmp = $distancesSq[$right];
$distancesSq[$right] = $distancesSq[$storeIndex];
$distancesSq[$storeIndex] = $tmp;
return $storeIndex;
}
// Quick sort distances.
function sortDistances(&$distancesSq, $left, $right)
{
if ($left < $right)
{
$pivotIndex = (int)($left + ($right-$left)/2);
$newPivotIndex = partition($distancesSq, $left, $right, $pivotIndex);
sortDistances($distancesSq, $left, $newPivotIndex-1);
sortDistances($distancesSq, $newPivotIndex+1, $right);
}
}
function findClosest($k, $points)
{
// Find distances from 0,0
$distancesSq = array();
for($i = 0; $i < count($points); $i++)
{
$distancesSq[] = array($i, $points[$i][0]*$points[$i][0] + $points[$i][1]*$points[$i][1]);
}
// Now find the first K points close to 0,0.
// Sort all distances using quick sort O(n log n)
sortDistances($distancesSq, 0, count($distancesSq)-1);
// Display the first $k points
for($i = 0; $i < $k; $i++)
echo "{" . $points[$distancesSq[$i][0]][0] . "," . $points[$distancesSq[$i][0]][1] . "}\n";
}
findClosest(2, $points);
?>
- (NSArray *)closeToOriginPoints:(NSArray *)points forK:(int)k {
[points sortedArrayUsingComparator:^NSComparisonResult(id obj1, id obj2) {
CGPoint p1 = ((NSValue *)obj1).CGPointValue;
CGPoint p2 = ((NSValue *)obj2).CGPointValue;
float dist1 = sqrt(p1.x*p1.x + p1.y*p1.y);
float dist2 = sqrt(p2.x*p2.x + p2.y*p2.y);
return (dist1 > dist2) ? NSOrderedDescending : (dist1 == dist2) ? NSOrderedSame : NSOrderedAscending;
return [points objectsAtIndexes:[NSIndexSet indexSetWithIndexesInRange:NSMakeRange(0, k)]];
}
Likewise, std::algorithm's partial_sort works too here.
static CGFloat CGPointLengthSq(CGPoint a)
{
return (a.x * a.x + a.y * a.y);
}
std::vector<CGPoint> closestPointsToOrigin(std::vector<CGPoint> points,
size_t k)
{
std::partial_sort(points.begin(),
points.begin() + k,
points.end(),
[](CGPoint a, CGPoint b)
{
return CGPointLengthSq(a) < CGPointLengthSq(b);
});
return std::vector<CGPoint>(points.begin(), points.begin() + k);
}
What about this solution in python?
import sys
def distance(point):
x, y = point.split()
return float(x) + float(y)
def main(points):
if len(out) <= K:
if not len(out):
out.append(points)
else:
for i in range(len(out)):
if distance(points) < distance(out[i]):
out.insert(i, points)
break
elif len(out) < K:
out.append(points)
if __name__ == '__main__':
K = int(sys.stdin.readline().strip())
out = []
while True:
data = sys.stdin.readline().strip()
if not data:
break
main(data)
print out[:K]
A linear time solution using quickselect to find the k-th euclidean distance, and then iterating through the list, first to find all the points that are closer than this k-th point, and then second to find all the points that have the same euclidean distance as the k-th point, until we reach k elements, at which point we stop and return those points.
# Given a set of 2D points, some integer k, find the k points closest to the origin, (0,0).
import math
import random
def get_euclidean_distance(point):
return math.pow(math.pow(point[0], 2) + math.pow(point[1], 2), 0.5)
def partition(arr):
if len(arr) == 1:
return ([], arr[0], [])
if len(arr) == 2:
if arr[0] <= arr[1]:
return ([], arr[0], arr[1])
else:
return ([], arr[1], arr[0])
pivot_index = random.randint(0, len(arr) - 1)
pivot = arr[pivot_index]
left = []
right = []
for i in range(len(arr)):
if i != pivot_index:
if get_euclidean_distance(arr[i]) > get_euclidean_distance(pivot):
right.append(arr[i])
else:
left.append(arr[i])
return (left, pivot, right)
def quickselect(arr, k):
(left, pivot, right) = partition(arr)
if len(left) == k - 1:
return pivot
elif len(left) > k - 1:
return quickselect(left, k)
else:
return quickselect(right, k - len(left) - 1)
def k_closest_points(arr, k):
kth_closest_distance = get_euclidean_distance(quickselect(arr, k))
count = 0
output = []
for i in range(len(arr)):
if get_euclidean_distance(arr[i]) < kth_closest_distance:
output.append(arr[i])
count += 1
for i in range(len(arr)):
if get_euclidean_distance(arr[i]) == kth_closest_distance and count < k:
output.append(arr[i])
count += 1
return output
arr = [[1,2], [4,5], [3,1], [5,5], [5,1], [-1, -2], [-1,-3]]
print(k_closest_points(arr, 4)) # [[1, 2], [-1, -2], [3, 1], [-1, -3]]
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
class Solution {
static void Main(String[] args)
{
Point[] arr = { new Point(1, 1), new Point(4, 4), new Point(5, 5), new Point(3, 3), new Point(2, 2), new Point(6, 6), new Point(9, 9),
new Point(15, 15), new Point(11, 11), new Point(10, 10), new Point(13, 13), new Point(12, 12), new Point(14, 14), new Point(7, 7),};
for (int i = 0; i < arr.Length; i++)
{
Console.Write("(" + arr[i].X + ", " + arr[i].Y + "), ");
}
Console.WriteLine();
QuickSelectK( arr, 0, arr.Length,2);
for (int i = 0; i < arr.Length; i++)
{
Console.Write("(" + arr[i].X + ", " + arr[i].Y + "), ");
}
Console.WriteLine();
Console.In.ReadLine();
}
public struct Point {
public Point(int x, int y)
{
this.x = x;
this.y = y;
}
private int x;
private int y;
public int X { get { return x; } set{ x= value; }}
public int Y { get { return y; } set { y = value; }}
public static bool operator <(Point p1, Point p2)
{
return (p1.X * p1.X + p1.Y * p1.Y) < (p2.X * p2.X + p2.Y * p2.Y);
}
public static bool operator >(Point p1, Point p2)
{
return (p1.X * p1.X + p1.Y * p1.Y) > (p2.X * p2.X + p2.Y * p2.Y);
}
}
public static int QuickSelectionPartition(Point[] list, int left, int right, int pivotIndex)
{
Point pivotValue = list[pivotIndex];
//move pivot to right
list[pivotIndex] = list[right-1];
list[right-1] = pivotValue;
int storeIndex = left;
for(int i=left; i<right-1; i++)
{
if(list[i]<pivotValue)
{
Point temp = list[i];
list[i] = list[storeIndex];
list[storeIndex] = temp;
storeIndex++;
}
}
Point tempo = list[right-1];
list[right-1] = list[storeIndex];
list[storeIndex] = tempo;
return storeIndex;
}
public static void QuickSelectK(Point[]list, int left, int right, int k)
{
if (left == right) return;
int pivotIndex = (left + right) / 2;
pivotIndex = QuickSelectionPartition(list, left, right, pivotIndex);
if (pivotIndex == k)
return ;
else if (k < pivotIndex)
QuickSelectK(list, left, pivotIndex - 1, k);
else
QuickSelectK(list, pivotIndex + 1, right, k);
}
}
this code is in C# and according to Quickselect algorithm suggested in en.wikipedia.org//wiki//Quickselect
/* Using the euclidean distance we can calculate the difference between two points in 2D
*/
func getEuclideanDistance(_ x : Int, _ y: Int) -> Double {
return sqrt(pow(Double(x), 2) + pow(Double(y), 2))
}
func kClosetPoint(_ p : inout [[Int]], _ k : Int) -> [[Int]]{
let m = p.sorted{
getEuclideanDistance($0[0], $0[1]) < getEuclideanDistance($1[0], $1[1])
}
return Array(m[..<k])
}
var points = [[3, 3], [5, -1], [-2, 4]]
let m = kClosetPoint(&points, 2) // [[3, 3], [-2, 4]]
Build a max heap of first k elements. Now for every element left, check if it is smaller than the root of max heap. If it is, then replace the root and call heapify. Do it till the end
- DashDash March 09, 2013Time complexity : klogk