Microsoft Interview Question
Software Engineer / DevelopersCountry: India
Interview Type: Phone Interview
You can use modified version of Rabin-Karp algorithm to solve this. Basically if your hashing function is incremental and sorts the input (for small strings this is really O(1)) before calculating the hash then basically this problem reduces to just scanning the document and matching the hash.
Hi,
I did not get you. Can you please provide a bit detailed explanation. Sort the input. In the above example 'imt' is already sorted now if you match this with the content of the file. It will not get any match. But the word 'submitted' contains the match 'mit' which is one of the permutaion of imt.
Scan the input, keep track of last word index. Since the match is of 3 letters, For every next 3 letters compare its hash with that of the match. If they are equal return the last word index.
Not sure whether this is correct solution ? or high on time
to avoid checking whether next three letters r permutation of the match which might be expensive op.
Hashing saves comparison time. Have a look at Rabin Karp algorithm. It uses incremental hash functions that can accommodate for sliding window problem in pattern matching. If you just modify the hash function so order of the characters do not matter then you are done.
There's a way to do this easily without hashing. Why not just, for each word, keep a sliding window of letter counts (use a map). You know the length of the search string (call it K). So start with that many letters in the word (count them using a map), then for each consecutive letter, add the next letter and subtract the letter K characters before in the string. If the map ever contains the exact characters in your search string, you've just matched the pattern. You can even avoid the O(K) map comparison each time by starting the map not empty but with the negative of the count of letters you're looking for. Then you just check if the map is empty each time (O(1)).
Solution of this problem using Rabin-Karp algorithm is very simple and optimistic. It only differs in one way from conventional brute force approach. In conventional brute force approach we will compare every substring with search string and find out the matches. But in Rabin-karp approach we just compare hash of both search string and substring and hash is nothing but assume just a sum of ascii values of characters of string.
So brute force approach produces complexity around O(nm) and with Rabin karp we can achieve O(n+m) which is significantly better.
One important optimization in Rabin Karp approach is computation of hash of a substring of length m in constant time. This is the only factor which let it achieve complexity of O(n+m) instead of O(nm).
@Vikas: I doubt if a plain Rabin-Karp algorithm will work. The question does talk about finding the permutation of the sequence in the given sentence. Don't we have to sort the sentence and the sequence in the alphabetical order before applying Rabin-Karp algorithm?
This qns is very similar to the one which requires finding the smallest substring containing all the characters of a given string (very well known qns). So, just scan the whole doc take each word and perform the above operation.
@ Nscent wrong approach here is a counter example.
the string is abcdefghi and the word to be searched is cef
now ur approach give the cdef which is the smallest substring containing cef which is wrong.
@ Kamal, the solution provided by Nascent just needs a little improvement.
Improvement: In the end if length of searching pattern matches that of a substring containing all the characters then you can say that you found the relevant word.
One more thing to add, that algo was on a word to search, here we have to search in sentence on words.... so more improvement required.
@ Kamal, the solution provided by Nascent just needs a little improvement.
Improvement: In the end if length of searching pattern matches that of a substring containing all the characters then you can say that you found the relevant word.
One more thing to add, that algo was on a word to search, here we have to search in sentence on words.... so more improvement required.
Assuming the sentence consists of 26 alphabets(lowercase),take an array of size 26(bool)
- Anonymous November 30, 2011-Set all the elements in array to 1 that are in the pattern (e.g. for x set 'x'-'a')
-Start parsing the sentence. If we get contiguous elements in the sentence till strlen(pattern) for which the corresponding element in array is set, return the starting location.
-else continue looking from the next element where we haven't found its corresponding element set in the array.
Time: O(n) Space: O(1) [Const]