Microsoft Interview Question
SDE1sCountry: India
Interview Type: In-Person
Am just confused, why is this not the problem equal to finding minimum value of x in the set? If we have a set of records with say 2 columns, then between ymin and y max every entry should come. So simply finding x min should give us the answer.
My solution was on the lines of Balanced BST. Each Node in BST will hold following values
1. leftYMin (minimum Y value in the left subtree)
2. leftYMax (maximum Y value in the left subtree)
3. rightYMin (minimum Y value in the right subtree)
4. rightYMax (maximum Y value in the right subtree)
5. leftXMin (minimum X value in the left subtree)
5. rightXMin (minimum X value in the right subtree)
6,7. pointers to left , right child
Now, given two Ymin and Ymax, we start at the root node. First we examine the left subtree. Following are the possibilities
1. Ymin and Ymax are within the interval [leftYMin, leftYMax]. In this case we straightaway hold the value of leftXMin in XLeft, i.e. XLeft = leftXMin
2. Ymin and Ymax partially overlap with interval [leftYMin, leftYMax]. In this case we recursively travel down the left child and look for value of XLeft
3. Ymin and Ymax and interval [leftYMin, leftYMax] are disjoint. In this case we discard the left subtree.
Similarly, we examine the right subtree and find the value of XRight.
Our min(X) is min(XLeft, XRight)
At the outset, this solution looks something like interval trees/augmented trees/Range Indexed Trees but I am not sure and I will have to look if that is the case.
PS: For the uninitiated folks who think we just need to find min(X) in the given set, I apologize for not being clear enough. Let me take an example.
Given: an unordered list of (x,y)= (1,28), (5,6), (11,13), (24,37), (8,1), (56,10), (78,19)
and given following value of Ymin, Ymax, the output would be something like
1. Ymin=13, Ymax=19, min(X) = 11
2. Ymin=6, Ymax=37, min(X) = 1
The solution is pretty easy, unless there is a typo in the question: The condition in the SELECT clause is "Y BETWEEN MIN(Y) AND MAX(Y)", i.e. any X regardless of Y, because all the Y values are between Ymin and Ymax inclusive. Then the problem becomes finding the MIN(X) coordinates. Note that SQL BETWEEN condition is inclusive of the boundary values.
Then you can find the Xmin in O(n) time, just linearly search through the values and then save the Xmin,Y pairs. The problems mentions the duplicates, so instead of a single Xmin,Y coordinate, we need to save all of them. Here is the code in C++11:
vector<const Point> findMinX(vector<Point>& points) {
unsorted_map<int, vector<Point> > x_points;
int min = INT_MAX;
for (const Point & point: points ) {
if (p.X < min) min = p.X;
x_points[pX].emplace_back(point);
}
return x_points[min];
}
@Vifi you can find Ymin and Y max in O(n) time, but it doesn't matter, because the BETWEEN condition includes all the Y values so it's irrelevant.
Ymin, Ymax is an input given, and we have to find Min(X) among all those points whose Y coordinate is between Ymin,Ymax. Moreover, naive solution of this is first scan through the list and find all points between Ymin,Ymax and then find the min(X) amongst them. This is a O(n) solution.
Can we do better? O(lgn)?
@Amm Sokun, Based on your question that you've posted, this problem is simply finding the min(X) for a given list of coordinates.
We cannot make better than linear time for finding the minimum in an _unorganized_ list, because we have to examine every element, otherwise we might miss it. Therefore, the optimal algorithm for finding the minimum element is Ω(n) (that is big Theta). If the elements were in a balanced tree then finding the element would take O(log n) time. If the list were sorted then finding the minimum is trivial and could have been done in O(1), but these are not stated in the problem.
This problem can be solved by segment tree and ST(dynamic programming solution).
- AangsEmail May 26, 2013