## Google Interview Question

SDE-3s**Country:**United States

**Interview Type:**Written Test

Since there's a constraint that all points in a cluster has to be within 5 units, that means they have to fit in a 5x5 square.

Then the problem is finding the square that contains the most points.

```
public void FindMaxClusters(int[,] plot, int clusterSize){
int xMax = plot.GetLength(0);
int yMax = plot.GetLength(1);
int maxPoints = 0;
// loop through the plot and all possible squares
for(int i=clusterSize -1; i<xMax; I++){
for(int j=clusterSize -1; j<yMax; j++){
// get number of points in this square
// or we can return a list of points instead of an int
int numberOfPoints = GetNumberOfPoints(I -clusterSize -1, I, j - clusterSize - 1, j);
if(numberOfPoints > maxPoints)
maxPoints = numberOfPoints;
}
}
}
```

If the points are very sparse, and we know the location of each point, then we can only check the areas around each point.

1) Sort array by x coordinate. First, merge all the points in sequential order as long as diff x < 5. compare i th with i+1th, it is sorted. If not then start a new cluster from the next element.

2) Now you have clusters that satisfy the first condition.

3) Now for every cluster sort by Y coordinate, and do the step 1 for Y co-ordinate. Split the cluster if necessary along y.

```
1) Sort array by x coordinate. First, merge all the points in sequential order as long as diff x < 5. compare i th with i+1th, it is sorted. If not then start a new cluster from the next element.
2) Now you have clusters that satisfy the first condition.
3) Now for every cluster sort by Y coordinate, and do the step 1 for Y co-ordinate. Split the cluster if necessary along y.
```

1) Sort array by x coordinate. First, merge all the points in sequential order as long as diff x < 5. compare i th with i+1th, it is sorted. If not then start a new cluster from the next element.

2) Now you have clusters that satisfy the first condition.

3) Now for every cluster sort by Y coordinate, and do the step 1 for Y co-ordinate. Split the cluster if necessary along y.

I find the problem description a little confusing. It follows from the example that the distance relationship does not have to hold pairwise for all points of a cluster (as someone mentioned that means finding a square that contains the most points which is a harder problem) but rather a simple transitive relationship is defined where a cluster could even be made from points arranged in straight line every 5 units apart in other words points A and B are in the same cluster if they are connected directly (within a square of 5 units of each other) or indirectly via x1 .. xn where (A,x1), (x1,x2), ... (xn, B) are within a square of 5 units of each other. In this case the problem boils down to a simple union-find.

Union find and dfs

- code reviewer May 30, 2019