Amazon Interview Question Software Engineer / Developers

  • amazon-interview-questions
    0
    of 0 votes
    24
    Answers

    You have a list of 1 million distinct English words. Each word is between 1 to 40 letters long and contains only alphabets, no space or special characters. The list is already sorted alphabetically.

    Given a word shorter than 40 letters, find all words in this list that are only 1 letter different from this word, spelling order is not important.

    There are 3 types of matches: 1. Swap one letter with another and you have an exact match 2. Remove one letter and you have an exact match and 3 Add one letter and you'll have an exact match. For example: Given the word "coverage", these are valid matches:

    1. "converge". Swap 'n' with 'a'
    2. "coverages". Remove 's'.
    3. "overage". Add a 'c'.

    What's the time complexity of your algorithm?

    Can your algorithm handle the request to find words that are 2 letters different from the given word?

    - chriscareercup on November 12, 2012 in United States Report Duplicate | Flag
    Amazon Software Engineer / Developer Algorithm

Country: United States


Comment hidden because of low score. Click to expand.
10
of 10 vote

here's a O(n*m) (n=no of words in the dictionary, m = max length of the word)complex solution with space O(52). Perform the following steps:
1. Take all the words from the lis with length same as the given word or +/- 1
2. Create a letter count array/map of the given word (for eg if the word is "coverage" then a=1, c=1, e=2, g=1, r=1, v=1) (say a[26] contains the letter count from a to z)
3. take each word from step 1. create its letter count array (say b[26]
4 for i = 1 to 26 sum += abs(a[i]-b[i]). each word with sum value = 1 or 2 will be the required word.

- The Artist on November 12, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

This looks like a cool solution. I was trying to work something out using 3 regular expressions, but I think your method is the way to go.

However, it will fail for terms which are anagrams. As an example, if we search for "Dictionary", the word 'Indicators' will be counted as a word which needs only 1 letter changed to get an exact match.

How can we change your technique to account for this?

- Michael Howard on November 12, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

@Michael - it is supposed to find "indicators" if input is "dictionary"

In the question, it is clearly mentioned that "spelling order is not important."

see my C# implementation below

- siva on November 12, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

I see now. In that case, the method which you and The Artist have posted is definitely the correct solution.

- Michael Howard on November 12, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

There is a small issue in the solution. If you are accepting words with sum = 2, you might end up taking words which are different by 2 letters also.

- Anjana on November 13, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

I think your algorithm does not consider "The list is already sorted alphabetically."
Sorted list must benefit your algorithm.

- Eric on November 13, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

@Anjana: sum=2 should be considered when the size of the dictionary word is same as the given string

@Eric: that could be a decoy. I did not find much help from the sorted list.

- The Artist on November 14, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Couldn't get the part where sum=2...i mean if we find exact match, sum=0, differ by 1 then sum=1. How does sum=2 fit in. Please explain.

This might be a little dumb question but nonetheless, would be great if you would explain.

- Jas on November 14, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

@Jas: consider the example given in the question where you convert 'coverage' to 'converge'. In that case the sum will be two. Basically here you are taking one character of the given string and replacing it with any other alphabet.

- The Artist on November 15, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Step 4 is wrong.
Suppose we have words "a" and "b".
The arrays would be a[][]=[1][0] and b[][]=[0][1].
The result would be sum=0 which means that "a" and "b" are completely equals.

- sm.pavlenko on January 29, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

I am not sure why this question would be asked(as you have to traverse the whole dictionary of million words). The better question to ask would be how would you design the dictionary so that you can print the words efficiently.
The answer to that would be :
Have your dictionary stored in key , value pair : key : count[26]array, value list of all strings that match (basically all anagrams will be in this string)
Now given word with count array G[]
There are the following three cases :
1) anagram of Given(G) + An extra alphabet
2) anagram - 1 alpha
3) change one alphabet in G
Now solution for above 3 cases :

1) for i = 0 to 25 : {G[i]++ ;  printlist(hash(G)); G[i]--; }
2) Take old G, not from 1) . for i = 0 to 25 : {if(G[i]) {G[i]-- ;  printlist(hash(G)); G[i]++; }
3) for i=0 to 25
{
	G[]= given;
	if(G[i])
	{
	G[i]--;
	for j=0 to 25 -> skip j=i;
	{
		G[j]++;
		printlist(hash(G))
		G[j]--;
	}
	G[i]++;
}

- Hill Billy on March 09, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

and i think my above solution will work for any difference just the sum value need to be modified

- The Artist on November 12, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

public List<string> FindWord(List<string> dictionary, string word)
{
List<string> output = new List<string>();

int[] charCount = new int[26];
char[] input = word.ToCharArray();

for (int i = 0; i < input.Length; i++)
{
charCount[input[i] - 'a']++;
}

int[] temp = new int[26];

foreach (string _word in dictionary)
{
if (Math.Abs(_word.Length - input.Length) == 1 || Math.Abs(_word.Length - input.Length) == 0)
{
charCount.CopyTo(temp, 0);

char[] _input = _word.ToCharArray();

for (int i = 0; i < _input.Length; i++)
{
temp[_input[i] - 'a']--;
}

int sum = 0;

for (int i = 0; i < temp.Length; i++)
{
sum += Math.Abs(temp[i]);
}

if (sum <= 2)
{
output.Add(_word);
}
}
}

return output;
}

- siva on November 12, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

How about a backtracking solution? Lets say we build a trie of the one million words. lets say input word is of length n. We check all the branches and keep track of number of mismatches and if mismatch count is less than permitted we allow traversing the trie else we exit the branch and check another branch using backtracking.

This approach however checks for words with only same length. I am checking how this can be modified to words with different length.

backtracking(root, "coverage", 1);

backtracking(trie node,string input_word, int mismatch_allowed)
{
    if (mismatch_allowed < 0)
        return;
    if (input_word.length == 0)
    {
        print current node details (i.e. complete word till this node);
        return;
    }

    for character_branch in (all_branches_at_root)
    {
        if (character_branch == input_word[0])
        {
             // first char matches, check next chars
             backtracking(character_branch_node, "overage", mismatch_allowed);
         }
         else {
             // character mismatch reduce the count
             backtracking(character_branch_node, "overage", --mismatch_allowed);
         }
    }
}

- Anonymous on November 12, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I think the key to solving this problem is to use the information that the list of words is sorted (i.e.: you already have the word list preprocessed to help you with the query).

Consider (for complexity computation):
N - number of words in the word list (max. 1 million)
L - size of a word (max. 40)
A - size of the alphabet (= 26)

For 1 letter distance you can use the following algorithm:
1. Generate all the possible words that are 1 letter distance away from the query word
- this has the complexity O(L*A)
2. Look up each of the generated words in the word list using a binary search:
- the look up of an individual word is O(L*logN)
- we have O(L*A) lookups => final complexity is O(L^2*A*logN) which is roughly (not considering the hidden constant) about 832.000 operations which is better than O(L*N) which is roughly 40 million operations.

For distance = 2, this algorithm performs worse than the O(L*N) version.

- cristian.botau on November 13, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

i think you missed the point in the question where it says that the order does not match which can result upto a max of (40!)/(2^14) possible combinations which is humungous.

- The Artist on November 14, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Yeah, my bad :)

Although, if the order doesn't matter, I don't see how the fact that the list is sorted may be helpful.

- cristian.botau on November 14, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

The question is more interesting and of practical use if the ordering of letters is important.

- Vikas on November 19, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

@anjana

In the algo presented by The Artist

a minor modification is needed such that it doesnt take into account of the words such as (sumit with sumitas)

what we can do is make the changes in that space i.e O(26)
for example the word sumit has to be searched
so s-1,u-1,m-1,i-1,t-1

when the word sumita is encountered all the fields in O(26) is zero except a which is -1,
so this is our word.

now if we take sumi , then only value corresponding to field t is -1.

but if we take a word such as samit then field corresponding to u would be 1 and a would be -1,

in a iteration corresponding to the word length there can be only a pair of 1 & -1;
we wont consider if there is 1&1 or -1 & -1 which might arise in the example of (umi or sumitaf) which would be the case for difference in length corresponding to 2.

- kumsaha on November 21, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
-1
of 1 vote

We can solve the problem by the approach artist suggested. But we can reduce the time complexity from O(mn) to O((number of words with size+number of words with size-1+number of words with+1))m.

Simply apply the binary search on the list given to find the first occurance of the word with size-1, It would take O(logn)(i.e. log one million= approximately 20). From that index onwards you can traverse the list to until you reach a word with size+2. Since the list is already sorted, We don't need to traverse the entire list.

- Eswar on November 13, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Sorry for my ignorance. It doesn't work. I just confused of alphabetical sorting

- Eswar on November 13, 2012 | Flag
Comment hidden because of low score. Click to expand.
-1
of 1 vote

Couldn't get the part where sum=2...i mean if we find exact match, sum=0, differ by 1 then sum=1. How does sum=2 fit in. Please explain.

This might be a little dumb question but nonetheless, would be great if you would explain.

- fallen_angel on November 14, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
-1
of 1 vote

Couldn't get the part where sum=2...i mean if we find exact match, sum=0, differ by 1 then sum=1. How does sum=2 fit in. Please explain.

This might be a little dumb question but nonetheless, would be great if you would explain.

- The Artist on November 14, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

@fallen_angel: see the comments to my original response.

- The Artist on November 15, 2012 | Flag


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book walking you through getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More