b2010
BAN USER- 1of 1 vote
AnswersIt was a pretty interesting question.
- b2010 in United States
Assume that you are given a fixed set of floating point numbers. Now given a new floating point number 'x', the goal is to efficiently find the number that is closest to 'x' from the fixed set. Question is: what data structure will you actually use for storing the fixed set of floating point numbers to achieve this?
Edit:
I missed to add. The interviewer further mentioned that I can not sort and that I can use any amount of time for creating the data structure (meaning this need not be efficient).
Since, I am not allowed to 'sort', I assumed that I can not use BST as I will have to compare numbers while populating the tree. But I didn't clarify it with him; I should have in hindsight.| Report Duplicate | Flag | PURGE
Amazon Research Scientist Data Structures
@successboy123, Good point. However, all I have to do to avoid this issue is to maintain multiple search paths until each one of them can be *safely* deleted.
Thus, in the present case, I'll include paths "1-4-." and "1-5-." and when I reach the next set of nodes I can delete the path "1-4-.-0" and only retain "1-5-.0". And if there was as 13.99, I would also retain "1-3-." until it reaches the next digit, for the given query 14.99. Isn't this correct?
@Itanium, Well I really can't argue with him in the interview as to, what the ideal solution *should* be. My job in the interview is to try and come up with some solution that also satisfies the constraints. And I believe he was definitely looking for an efficient solution that would have smaller time complexity.
- b2010 September 25, 2013Of course this is brute force search, and it has the worst case complexity of O(n) i.e. linear in the total floating point numbers in the set. In contrast BST would be O(log n) and the trie approach would be O(m) where m is the length of the 'longest float in n'.
- b2010 September 25, 2013Thanks for the answers. Even the k-d tree or for that matter any space partitioning method requires comparison to identify their order as to which goes to the left and which one to the right. Correct my if I am wrong.
My own answer during the interview was bit unconventional (I am not sure if the interviewer was convinced though).
I opted for a trie data structure for storing the set of floating point numbers, where each node would either store a digit or a decimal point. Thus the height of the trie would be length of the longest floating point number. The traversal at each node would involve finding the node that matches the corresponding digit in the number being tested or finding nodes that are at same distance. I will have to maintain multiple paths during the search until I can safely eliminate all but the closest path.
The problem in this is that in order to get the individual digits, I need to convert the number to string and back to number. While this is not an issue for populating the trie as I don't have to worry about the efficiency here, it will be an issue during retrieval.
Any thoughts about this approach?
Of course, BST or k-d tree would have been straight forward answers if 'sorting' constraint was not there.
BST would be O(log n), not O(n log n). Also pls read the previous answers and discussions.
- b2010 September 26, 2013