Design an algorithm to search

Microsoft Interview Question for SDE1s

0

of 0 votes

28
Answers
Design an algorithm to search for Anagrams of a word in a dictionary. The interviewer stressed on not processing all the words in the dictionary.
- CodeNameEagle May 10, 2013 in India | Report Duplicate | Flag | PURGE
Microsoft SDE1 Algorithm Data Structures

Email me when people comment.

An error occurred in subscribing you.

Country: India
Interview Type: Phone Interview

More Questions from This Interview

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 8 vote

Actually you need to find all permutations of a given word, which exists in given dictionary.

- googlybhai May 10, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Brilliant! This should work.

- Anonymous May 10, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

+1 for this.

- Prithvi May 11, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Correct but the sentence is not very clear. I think what it means is:

1. Create a list of all the possible anagrams of a word
2. For each item in the list, check in the dictionary and see if it is a real word

- Anonymous May 11, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I described this option but the interviewer was not that happy about it. But this is a good approach. I think he wanted me to pre process the dictionary may be. Not sure though.

- CodeNameEagle May 13, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Create a hashmap<String><List>
iterate over the words in the dictionary
Sort the words and store it in the hashmap as a key (if the key didnt exist)
add the word to the current hashmap.

Similarly, when you encounter the second word and sort it, check if there is already an entry for that sorted word in hasmap, if yes, then simply call hashmap.get(keyname), and add the word to the list that is returned by hashmap.

After this process, you can sort the word to be searched and get the list from the hashmap

- Suchita June 27, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 3 vote

Pre-process the dictionary: Store the whole dictionary in a trie. This will give search O(1).

Now generate all the permutations of the given word and check for the generated word in dictionary.

Total time complexity: O(n!) where n is the length of the word.

- Samarth Gupta May 13, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

the only way i can think of is:
1. compare length. not length match means no anagrams. this will reduce a lots of words.
2. Split the remaining words in group of characters. god-> g,o,d
3. Sort the letters. and compare hash.

- Prithvi May 10, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

I like this, but step #3 requires a sort of each word and then the hash? If we could get rid of that sort, we could save that nlogn time per word. What if we could do a hash that was insensitive to the order of the letters? Then we wouldn't need to sort them, saving that nlogn time. What if the sort were as silly as adding up the values of each letter. We could get false hits and then have to check each possible hit against other possible hits, maybe we would come out ahead here.

In any case, this is processing all the words in the dictionary, which would seem to be against the "not every words should be processed..." stipulation.

- dn May 11, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

add a step 0 -
- Search only the words whose letter starts with a letters in given word.

Some modification that can be done after can be done after step 2 but i don't think these are elegant:
step 2.1 -store the original words(split-ed) in a array and in a set.
step 2.2 -add the letters in the new word in the set. If the length of the set changes its a miss. those words which pass the tests will require further processing

A note before step 3:-
I am not aware of any hash insensitive to the order. We might take sum of hash of each letters and compare the aggregate hash. a miss means no anagrams. i seriously doubt vice-versa is true.
Finally step3:

- Prithvi May 11, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

To find length of each word in dictionary, you have to access it, and thus is not a good solution.

Plus say the word is "Anagram"
Possible permutations of this word = 7!/3! = 7*6*5*4 = 840 only

While the dictionary will have a lot of words in comparison.
Therefore to decrease the number of comparisons, we should check for the permutations rather than the dictionary itself.

- divinexsoul May 14, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

did u ask how the words in a dictionary are stored ?

- bbarodia May 10, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Assuming its large enough. Not sure what the interviewer was expecting though. I told him about length of words, Starting alphabet matching etc but he was not satisfied.

He stressed about the fact that not every word should be processed from the dictionary.

- CodeNameEagle May 10, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Starting character matching will not work.
eg: army, mary

- Anonymous May 10, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

First sort you string...
Find anagrams of this string (using permutation) ( ;) they will be generated in sorted order) (I think permutation algorithms do like this)
Then finding them in order in the dictionary will be easier...

- coding.arya May 11, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

#1. process only the words in the dictionary whose length equals to given words length;
#2. sort the given word's letters, sort the selected word's letters form #1;
#3. compare the result from #2, if equal, then fine one

- Sylla May 12, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 2 vote

Assuming that you can pre-process dictionary.

Pre-process dictionary.
1) create new dict
2) for each word w1 in dictionary
sorte_w1 = sort(w1)
dict[ sorte_w1].append(w1)

Query: find all anagrams for word q
1) sorted_q = sort(q)
2) return dict[ sorted_q]

- Anonymous May 12, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Can we decide how dictionary itself is stored?

- Anonymous May 12, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

I think you might be onto something. I thought about this but forgot to ask my interviewer. He did mention doing some computations initially so that its easier to find anagrams.

- CodeNameEagle May 13, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

1. Generate permutations of the word.
2. Binary search through the "dictionary" (e.g. if given as a string array, or file directories - basically, avoid loading the entire dictionary into memory).
O(N! * log(P)), where N is the length of the word, and P is the number of words in the dictionary.

- shine.wang.ws May 12, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Dictionary Creation:

Data Structure: HashMap(AnagramHash key) contains HashMap(uniqueHash key)

hash1: anagramHash() - > Returns same hashcodes for all anagrams and not for non-anagram words
hash2: unique for each word

search:

Find hash 1 and hash 2 for any word and get the info

Time: O(1) i.e constant time

Anagram search:

find hash1 and return all objects in the inner hashmap

Time: O(1)

- asenthilkumargce May 13, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

1) from the given word, separate all the alphabets and take only those sets from the dictionary starting with those alphabets
eg. given word "god", then tk the set of words starting with 'g','o' and 'd'
2) find the length off the given word and make a group of words with the same length.
3) that's it...

- Amit Lopes May 14, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

sorry
after step 2 find the words with wit all the alphabets from the given word

- Anonymous May 14, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

"interviewer stressed on not processing all the words in the dictionary."
That's the key point in this problem. I read algo for this problem long back somewhere, it was like this :

Map each alphabet with a prime number and product of prime values of letters in a word will act as hash. Now search in the given dictionary this hash value. If dictionary has been implemented as trie then search in the trie and break from a particular word if its prime product exceeds the corresponding value for the given word.

You need to take care of oveflow by carefully chosing prime values for letters, so higher frequency letters should be given lower prime value.

- Cerberuz May 14, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

sort the word...find all anagrams of the word...
store the anagrams in lexicographically greater to smaller order!

find first word....in dict..
next word will b present from beginning till where first word was found...
next word will b present from beginning till where second word was found...
hence reducing search space on evry iteration!

- Anonymous June 15, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Build a anagram HashMap of the dictionary itself. Key is the sorted string of characters of the words and value is a list of all the anagrams. Given a word, sort it and get the value of the sorted string from the HashMap. The list should have all the anagrams of the given words.

You only build the HashMap once. Sorting the input word has a complexity based on the sorting algorithm used (mostly O(nlogn)). Searching for list of anagrams is O(1) and looping through the list should also take O(m). Assuming the size of the word is small enough and the anagram list is mostly small, complexity should be O(1) once the HashMap of dictionary is created.

- gsumanthkr June 16, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

This seems the best :)

Pre process dictionary:
      While inserting each word, sort the word & add it to trie according to this sorted word. & store the actual word.
      When anagram for a word is to be find, just sort the word & lookup into trie. You will find list of words there.
      Complexity will be O(nlogn) in worst case which will be of sorting word to be searched but as the words can't be too long, this should not be the issue.

- SK June 27, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

- Search only the words whose letter starts with a letters in given word.
- Eliminate the dictionary words.length != given word.length.

- Dhass May 10, 2013 | Flag Reply

CareerCup

Microsoft Interview Question for SDE1s

Books

Videos

Resume Review

Mock Interviews