## Facebook Interview Question

SDE-2s**Country:**United States

A really easy way of lowering the time complexity would precomputing the distances for every pair of index at the start, resulting in O(k*n^2 + n^3)

Ah the distance precomputing idea is nice. Sounds like it can indeed be done in O(n^3) then (since k<=n).

regarding minDist(), it looks like that the center of the group will be in the middle, so the following implementation should work:

```
private static int minDist(int[] arr, int start, int end) {
int mid = (start + end) / 2;
int sum = 0;
for (int i = start; i <= end; i++) {
sum += Math.abs(arr[i] - arr[mid]);
}
return sum;
}
```

@anonymous no your idea is not correct it can be middle(very likely) but not true in general..

Actually anonymous might be right here, even though I also repeal the idea at first. Let me try proving it.

Let m be the index of the middle element. Suppose n>m. We will show dist[n] has to be greater than dist[m]. Vice versa when proving the same for n<m.

Imagine the array being split into 3 sections by m & n:

(1) Left section has elements <= m

(2) Right section has elements >= n

(3) Middle section has elements between m & n

Let d=a[n]-a[m]. We know d>=0 when the numbers are sorted.

The elements in each section is now either closer or farther away:

(1) Each element in left section is now a distance "d" farther

(2) Each element in right section is now a distance "d" closer

(3) Each element in middle section is now either a "distance < d" farther or closer

Since m is in the middle, the left section comprise half the elements, while the right & middle sections comprise the other half. As such, the total increased distance for the left section must be greater or equal to the decreased distance from the right & middle section combined. So dist[n] >= dist[m].

yes anonymous is correct a more straight proof will be

let j be the position of the minimum and j!=mid

then if(j>mid=n/2) moving j to j-1 results in k*(n-j)- k *j change of the value

where k = A[j]-A[j-1] this change is k*(n-2*j) as u can see it is negative till j!=n/2

similarly for j<mid=n/2 which will result in k*j -k*(n-j) = k *(2*j-n) which is negative till j!=n/2

since this is one D array use DP

F(n+1,k) = min(F(n,k-1), F(n-1,k-1)+dist(a(n),a(n+1)), F(n-2,k-1)+dist(a(n+1),a(n),a(n-1) ...................)

also F(j,k) = 0 for j<=k

complexity.. O(k*n^2)

How do you compute each dist(a(n), a(n+1)...) in only O(1)?

I think that's at least O(n), making the overall complexity O(k*n^3)...

yeah i missed that... this dist can be calculate in O(log n) with some fancy work as the its only 1-D.. same thing for 2D will be O(n)

using mean and binary search for number closest to mean

a very basic bugg ridden code using recursion :D and O(n) for dist

```
int distance(int a[],int N,int J)
{
double mean=0;
int near=INT_MAX,near_id;
for(int i=0;i<J;i++)
{
mean+=a[N-i-1];
}
mean/=J;
for(int i=0;i<J;i++)
if(std::abs(a[N-i-1]-(int)mean) < near)
{
near = std::abs(a[N-i-1]-(int)mean);
near_id = i;
}
int sum = 0;
for(int i=0;i<J;i++)
sum+= std::abs(a[N-i-1]-a[N-near_id-1]);
return sum;
}
int calc_distance(int a[],int N,int K)
{
int min1=INT_MAX;
if(K<1)
return min1;
if(N<0)
return 0;
if(N<=K)
return 0;
if(K>1)
min1 = calc_distance(a,N-1,K-1);
else
min1 = distance(a,N,N);
if(K>1)
for(int i=1;i<=N-K;i++)
{
min1 = std::min(min1,calc_distance(a,N-i,K-1)+distance(a,N,i));
}
return min1;
}
```

Sort the array so that x_1 <= x_2 <= …. <= x_n. Now the problem is reduced to partitioning number N into K positive numbers a_1, a_2, …., a_K. From the partition we can construct a grouping: First we pick up a_1 numbers from x_1 to x_a for the first group, next we pick another a_2 numbers from x_(a1+1) ….x_(a2+a1), and this process continues. The problem can be reduced into a DP problem:

```
F(i, N, K) = 0 if K>=N, or K==0
F(i, N, K) = min_j (max(x[i+j] - x[i], F(i+j+1, N-j, K-1))) for 0<= j < N - K
```

This algorithmic problem is called

Jenks natural Breaks for coloring, see following post

"Oren Gal GIS Blog: Calculating Jenks Natural Breaks"

And it has name of its creator :-)

JavaScript implementation

1. sort array

2. find the best split of array into 2 intervals. Split second interval recursively in order to find the best split for K groups

```
//var items = [1, 1, 2, 3, 4, 4, 5, 6, 7, 9, 15];
var items = [1, 2, 5, 7];
var k = 2;
var mins = [];
for(var index =0; index < items.length; index+=1) {
mins[index] = {};
}
function getDistance(items, start, end) {
var result = 0;
var total = 0;
for (var index = start; index <= end; index += 1) {
var item = items[index];
total += item;
}
var mean = total / (end - start + 1);
/* it is better to use binary search here */
var bestPosition = null;
var bestOffset = null;
for (var index = start; index <= end; index += 1) {
var item = items[index];
if (bestPosition == null || bestOffset > Math.abs(item - mean)) {
bestOffset = Math.abs(item - mean);
bestPosition = index;
}
}
for (var index = start; index <= end; index += 1) {
var item = items[index];
result += Math.abs(items[bestPosition] - item);
}
return result;
};
function getMinDistance(items, pos, k) {
var result = null;
if (pos >= items.length - k) {
return 0;
}
if (k == 1) {
result = getDistance(items, pos, items.length - k);
} else {
for (var end = pos; end < items.length - k; end += 1) {
/* var distance = getDistance(items, pos, end) + getMinDistance(items, end + 1, k - 1); */
var posDistance = null;
if (!mins[end + 1].hasOwnProperty(k - 1)) {
posDistance = getMinDistance(items, end + 1, k - 1);
mins[end + 1][k - 1] = posDistance;
} else {
posDistance = mins[end + 1][k - 1];
}
var distance = getDistance(items, pos, end) + posDistance;
result = (result == null) ? distance : Math.min(result, distance);
}
}
return result;
}
var result = getMinDistance(items, 0, k);
```

please help explain clue. what means

```
For the first case, there is no need to move any object.
For the second case, group objects 1 and 2 together by moving the first object to position 2.
For the third case, group objects 1 and 2 together by moving the first object to position 2 and group objects 3 and 4 together by moving object 3 to position 7. Thus the answer is 1 + 2 = 3.
```

please show use numbers example in answer for first case and second case and third case?

is first case: 3, no move?

is second case: 3 3, first 3 move to second 3?

but what is object 3 and 4 in third case?

thanks patience and help.

This is very similar to the DP proposed above but implemented using memoization

```
class Solver:
def getmindist(self , start , end):
#print "mindist",start,end
if end-start<=1:
return 0
if self.distlookup[start][end] != -1:
return self.distlookup[start][end]
target = self.pos[start]
dist = 0
for x in self.pos[start:end]:
dist += abs(target-x)
mindist = dist
for i in xrange(start+1 , end):
delta = abs(self.pos[i]-target)
dist += (i-start)*delta
dist -= (end-i)*delta
mindist = min(dist , mindist)
#print "mindist",start,end,":",mindist
self.distlookup[start][end] = mindist
return mindist
def evalcost(self,index,k ,total):
#print "eval",index,k,total
if index<0:
return 0
if k==1:
cost = self.getmindist(0,index+1) + total
self.mincost = min(self.mincost , cost)
return cost
if self.lookup[index][k]!=-1:
return self.lookup[index][k]+total
mincost = 100000000
for i in xrange(index+1):
dist = self.getmindist(i,index+1)
cost = self.evalcost(i-1 , k-1 , total+dist)
mincost = min(cost-total , mincost)
self.lookup[index][k] = mincost
return mincost+total
def getmincost(self , pos , k):
print pos,
self.pos = pos
self.mincost = 100000000
n = len(pos)
self.distlookup = [[-1 for j in xrange(n+1)] for i in xrange(n+1)]
self.lookup = [[-1 for i in xrange(k+1)] for j in xrange(n+1)]
self.evalcost(n-1 , k , 0)
return self.mincost
s = Solver()
print s.getmincost([1,1,3] , 3)
print s.getmincost([1,2,4] , 2)
print s.getmincost([1,2,5,7] , 2)
```

```
private static int solve(List<Integer> grp, int center, int gCost, int tCost, int rGrp, int idx, int[] pos) {
if(idx == pos.length) {
if(rGrp > 0) {
return Integer.MAX_VALUE;
} else {
return tCost + gCost;
}
} else {
if(rGrp == 0 && grp.size() == 0) {
return Integer.MAX_VALUE;
}
}
int mCost = Integer.MAX_VALUE;
int newCtrGrpCost = computeCost(grp, pos[idx]);
if(rGrp > 0) {
if(grp.size() > 0) {
// case 1: Old center remains, terminate group
int newCost = solve(new ArrayList<Integer>(), -1, 0, tCost + gCost + Math.abs(center - pos[idx]),
rGrp - 1, idx + 1, pos);
mCost = Math.min(newCost, mCost);
}
// case 2: This is the new center, terminate group
int newCost = solve(new ArrayList<Integer>(), -1, 0, tCost + newCtrGrpCost, rGrp - 1, idx + 1, pos);
mCost = Math.min(newCost, mCost);
}
//case 3: Old center remains, same group continues
grp.add(pos[idx]);
if(grp.size() > 1) {
int newCost = solve(grp, center, gCost + Math.abs(center - pos[idx]), tCost, rGrp, idx + 1, pos);
mCost = Math.min(newCost, mCost);
}
// case 4: This is new center, same group continues
int newCost = solve(grp, pos[idx], newCtrGrpCost, tCost, rGrp, idx + 1, pos);
mCost = Math.min(newCost, mCost);
grp.remove(grp.size() - 1);
return mCost;
}
private static int computeCost(List<Integer> grp, int center) {
int cost = 0;
for(int i = 0; i < grp.size(); i++) {
cost += Math.abs(center - grp.get(i));
}
return cost;
}
```

Same DP approach as kkr.ashish above:

F(n+1,k) = min(F(n,k-1), F(n-1,k-1)+dist(a(n),a(n+1)), F(n-2,k-1)+dist(a(n+1),a(n),a(n-1)

where dist(a(n),a(n+1)...) represents the minimum distance to group a(n),a(n+1)... into a partition.

Overall complexity is O(k*n^3) since each dist() takes O(n).

- Sunny February 02, 2014