Facebook Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: In-Person
how about hashtable with link-list chaining.
For instance use first 1-3 characters as KEY. And in each entry link-list is used to chaining the words.
Christ, chrome, Christian, ......
use "Chr" as the hash key, and all the word in linked by a list.
Typo checking: not entry found in list, then it's a typo
auto typing: show some of words in this list. (depending on the rules)
- The database here consists of frequently misspelled words and associated correct spellings .
- The best in-memory DS to represent a large number of wrong strings and associated correct spellings is a trie / TST
- but such a structure would not scale well in case there are a vast number of such words .
Use a trie. For example if the user types seperate instead of separate then using trie traverse till there is a match i.e, till sep. Then get all the words starting with sep. use dynamic programming to get the cost to change seperate to each of the found words. Sort the words based on this cost and get the top 5 or 10 based on the application needs and display them to the user.
1) Have both prefix and suffix trie.
When a user enters a word try to match using prefix trie till there is no mismatch. Get all the words starting from that. Then do the same for suffix. Now run the DP algo to get the closest set of words with minimum modification to transform the typed word to all of these words selected. Display the closest matching words.
2)You can also have a separate table for most common misspellings for the words and display from both this and the above.
In the trie mainatin the count for each word based on its usage.
Firstly, consider the scenarios:
- Pankaj Gadge October 14, 20121) When user types a particular word
if(word == appropriate vocabulary word)
no suggestion;
else
suggestions with the closet matching words from the given dictionary
Assumptions:
1) There a should be a dictionary of correct vocabulary words, a) you can create you own b) several API's can be found online.
2) The dictionary should be stored as a form of sorted hashmap with no duplicate values.
Proceesing
1) When user enters the word, compare that word with the value of dictionary sorted hashmap,
if match is found then proceed
else compare the word with the closest matching word in the dictionary, pull out the closest matching words for the user to display.