National Instruments Interview Question
Software Engineer / Developer

struct Point
{
    int x;
    int y;
    // int z; // if this is a point in 3D coordinate space
};
int nearestPoints(const Point &target, const vector<Point> &points, const int distance)
{
    int count = 0;
    for (size_t i = 0; i < points.size(); ++i)
    {
        // Compare squared distances to avoid a sqrt call.
        if ((target.x - points[i].x) * (target.x - points[i].x)
            + (target.y - points[i].y) * (target.y - points[i].y) <= distance * distance)
            ++count;
    }
    return count;
}
Case 1: Sparse points
Scan the entire linked list and keep every point for which (x - x0)^2 + (y - y0)^2 <= D^2.

Case 2: Dense points
Consider a square of side 2D centered on the point P and ignore every point outside it.
For each point inside the square, compute its distance from P and select it if that distance is at most D.
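The dense case above amounts to a bounding-square prefilter followed by the exact distance test. A minimal sketch of that idea (withinD is a hypothetical helper name, not from the original post):

```cpp
#include <cstdlib>
#include <vector>

struct Point { int x; int y; };

// Count the points within distance d of p: first discard everything
// outside the 2d x 2d square centered on p (two cheap axis comparisons),
// then apply the exact squared-distance test to the survivors.
int withinD(const Point &p, const std::vector<Point> &points, int d)
{
    int count = 0;
    for (const Point &q : points)
    {
        if (std::abs(q.x - p.x) > d || std::abs(q.y - p.y) > d)
            continue; // outside the bounding square, cannot be within d
        int dx = q.x - p.x, dy = q.y - p.y;
        if (dx * dx + dy * dy <= d * d)
            ++count;
    }
    return count;
}
```

The axis comparisons reject most far-away points without any multiplication, which is the whole benefit of the square prefilter in the dense case.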
Rationale for the choice of data structure
1. Sparse: If there are few points spread widely across the 2D space, allocating memory for a 2D array is wasteful, since most of its cells would be empty. Hence a linked list is the best choice.
2. Dense: If there are many points and they are concentrated around the point P, storing them all in a linked list is wasteful, since every node also has to allocate memory for a pointer, and that overhead adds up when the point count is huge.
If the query is going to be repetitive, then a one-time precomputation can fetch the results in O(1) time: for every point i, create a hash <distance(i, j), point j> and check whether a value exists for the given node at the given distance d.
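One way to make this precomputation answer "within d" rather than "exactly d" queries is to use an ordered multimap keyed by squared distance, so a range query becomes an upper_bound scan. This is a sketch under that assumption; buildIndex and countWithin are illustrative names, not from the original post:

```cpp
#include <map>
#include <vector>

struct Point { int x; int y; };

using DistIndex = std::multimap<long long, Point>; // squared distance -> point

// Precompute, for a fixed query point p, every point keyed by its squared
// distance from p. Building the index costs O(n log n) once.
DistIndex buildIndex(const Point &p, const std::vector<Point> &points)
{
    DistIndex index;
    for (const Point &q : points)
    {
        long long dx = q.x - p.x, dy = q.y - p.y;
        index.insert({dx * dx + dy * dy, q});
    }
    return index;
}

// Each "within d" query is then an ordered scan up to d*d:
// O(log n + k), where k is the number of matching points.
int countWithin(const DistIndex &index, int d)
{
    long long limit = static_cast<long long>(d) * d;
    int count = 0;
    for (auto it = index.begin(); it != index.upper_bound(limit); ++it)
        ++count;
    return count;
}
```

Note that the index is tied to one query point; precomputing it for every point, as the comment suggests, costs O(n^2) space.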
The key question is whether the query is repetitive or not. If it is, a k-d tree can be effective.
The tree takes O(n log n) time to build and O(n) space to store.
Once built, a range query runs in O(sqrt(n) + k) in the worst case, where k is the number of points satisfying the query. The worst case looks pretty bad, but the average case is a good O(log n + k).
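A minimal 2D k-d tree sketch of this idea. build and countInRange are illustrative names, and the nodes are deliberately leaked to keep the sketch short:

```cpp
#include <algorithm>
#include <vector>

struct Point { int x; int y; };

struct KdNode
{
    Point pt;
    KdNode *left = nullptr;
    KdNode *right = nullptr;
};

// Build a 2D k-d tree over pts[lo, hi) by recursively splitting on the
// median, alternating between the x axis (0) and y axis (1). O(n log n).
KdNode *build(std::vector<Point> &pts, int lo, int hi, int axis)
{
    if (lo >= hi)
        return nullptr;
    int mid = (lo + hi) / 2;
    std::nth_element(pts.begin() + lo, pts.begin() + mid, pts.begin() + hi,
                     [axis](const Point &a, const Point &b)
                     { return axis == 0 ? a.x < b.x : a.y < b.y; });
    KdNode *node = new KdNode{pts[mid]};
    node->left = build(pts, lo, mid, 1 - axis);
    node->right = build(pts, mid + 1, hi, 1 - axis);
    return node;
}

// Count the points within distance d of target, pruning any subtree whose
// splitting plane lies farther than d from the target on the current axis.
int countInRange(const KdNode *node, const Point &target, int d, int axis)
{
    if (!node)
        return 0;
    long long dx = node->pt.x - target.x, dy = node->pt.y - target.y;
    int count = (dx * dx + dy * dy <= (long long)d * d) ? 1 : 0;
    long long delta = (axis == 0) ? dx : dy; // signed offset along split axis
    if (delta >= -d) // the query circle may reach the low half-plane
        count += countInRange(node->left, target, d, 1 - axis);
    if (delta <= d)  // the query circle may reach the high half-plane
        count += countInRange(node->right, target, d, 1 - axis);
    return count;
}
```

The pruning is conservative (a subtree is skipped only when the query circle cannot intersect its half-plane), so the count is exact while most of the tree goes unvisited on average.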
The data structure used doesn't matter unless we know how the function is used and how frequently. If the data structure is fixed and the function could be queried at any point, then we could think about a data structure that optimizes the computation time.
- Rajesh Konda June 29, 2009

The question doesn't specify which dimension the point is in. If it is an n-dimensional space, a regular grid of n-dimensional cubes of space could be represented with an n-dimensional array. Each array element, or n-tuple, uniquely identifies an n-dimensional cube. In one pass, each point could be linked to its corresponding cube. The points could be linked as a list or as an array: the structure is easier to build with a list, and it takes less space to store with an array.
Given a point, the cube in which it lies is easily identified. Then, depending on the distance, one can narrow down the set of candidate points by narrowing down the surrounding set of cubes.
Furthermore, if we know the distance D in advance, or the range of values D can take, the n-dimensional cube size could be optimized. As previously mentioned, a side of 2D is a good cube size if D is fixed: that way you make sure you are looking at most at the 8 surrounding cubes (in 2D). Given a range, I guess you take the average and use it for D.
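A 2D specialization of this grid idea can be sketched as follows. With a cell side of 2D, a query only ever has to examine the query point's own cell and its 8 neighbors. GridIndex and its members are hypothetical names for illustration:

```cpp
#include <unordered_map>
#include <vector>

struct Point { int x; int y; };

// Hash-backed grid of square cells with side 2*d. Each point is bucketed
// into the cell containing it; a "within d" query then only has to scan
// the query point's cell and its 8 neighbors.
class GridIndex
{
public:
    explicit GridIndex(int d) : d_(d), side_(2 * d) {}

    void insert(const Point &p)
    {
        cells_[key(cellOf(p.x), cellOf(p.y))].push_back(p);
    }

    int countWithin(const Point &p) const
    {
        int cx = cellOf(p.x), cy = cellOf(p.y);
        int count = 0;
        for (int gx = cx - 1; gx <= cx + 1; ++gx)
            for (int gy = cy - 1; gy <= cy + 1; ++gy)
            {
                auto it = cells_.find(key(gx, gy));
                if (it == cells_.end())
                    continue;
                for (const Point &q : it->second)
                {
                    long long dx = q.x - p.x, dy = q.y - p.y;
                    if (dx * dx + dy * dy <= (long long)d_ * d_)
                        ++count;
                }
            }
        return count;
    }

private:
    // Floor division so negative coordinates map to the correct cell.
    int cellOf(int coord) const
    {
        return coord >= 0 ? coord / side_ : (coord - side_ + 1) / side_;
    }
    static long long key(int gx, int gy)
    {
        return (long long)gx * 1000003 + gy; // simple 2D -> 1D cell key
    }

    int d_, side_;
    std::unordered_map<long long, std::vector<Point>> cells_;
};
```

Because the cell side equals 2D, any point within distance D of the query is at most one cell away along each axis, so the 3x3 neighborhood is sufficient and the exact distance test filters out the rest.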
Also, if the number of points is huge, the data could be stored on a distributed network, so the computation could be optimized and parallelized.