Amazon Interview Question
Software Engineer / DevelopersCountry: India
Interview Type: Written Test
Take a HashMap<String, Boolean> put all the word in the wordlist into this HashMap with initial default value true.
Start from the given word put it in a Queue and then find all its friends by calling a method editdistance to find all the words from the wordlist that are at editdistance 1 and putting them int the Queue. Mark the found words in HashMap as false.
Do this till the Queue is empty or all words are exhausted in the wordlist present in HashMap.
<?php
class Levenshtein {
public function LevenshteinDist( $srcStr, $tarStr) {
$srcLen = strlen($srcStr);
$tarLen = strlen($tarStr);
$disrArr = array();
for ( $i = 1; $i < $srcLen; ++$i) $disrArr[$i][0] = $i;
for ( $j = 1; $j < $tarLen; ++$j) $disrArr[0][$j] = $j;
for ( $j = 1; $j < $tarLen; ++$j) {
for ( $i = 1; $i < $srcLen; ++$i) {
if( $srcStr[$i] == $tarStr[$j] ) $disrArr[$i][$j] = $disrArr[$i - 1][$j - 1]; // no operation required
else $disrArr[$i][$j] = min($disrArr[$i - 1][$j] + 1, $disrArr[$i][$j - 1] + 1, $disrArr[$i - 1][$j - 1] + 1); //deletion, insertion, substitution repectively
}
}
//print_r($disrArr);
return $disrArr[$srcLen - 1][$tarLen - 1] + 1;
}
}
$obj = new Levenshtein();
echo $obj->LevenshteinDist( 'kitten', 'sitting');
?>
Create a trie of given list of words. It will take exactly o(n).
Now for the given word search for all the words whose distance is 1. As for the word "hello", first substitue each letter from 'a' to 'z' except that letter.
e.g aello, bello, ...hallo, hbllo, etc. and keep searching in the trie. if found increment the count.
Do the same after deleting & adding the one letter in each position of the given word. Count will be the answer.
well the above suggested algorithm does work but it takes a lot of time. Is their anyway i could improve upon that. I want to get this completed within 5 seconds or so. Below is my code.
public class Main {
private static Queue<String> qAlreadyVisited = new LinkedList<String>();
public static void main (String[] args) {
/*assume that the file is having the list of all words against which i check the distance*/
File file = new File("../CodeEval_Copy/src/levenshtein_Distance/Test");
BufferedReader in = null;
try {
in = new BufferedReader(new FileReader(file));
} catch (FileNotFoundException e) {
e.printStackTrace();
}
String line;
ArrayList<String> listOfWords = new ArrayList<String>();
Queue<String> qFriends = new LinkedList<String>();
String socialNetworkForWord = "abcde";
try {
while ((line = in.readLine()) != null) {
if(line.trim() != null && line.trim() != "")
listOfWords.add(line);
if (!(Math.abs((line.length() - socialNetworkForWord.length())) > 2)){
if ((levenshteinDistance(socialNetworkForWord, line)) == 1){
//System.out.println(line);
qFriends.add(line);
}
}
}
test(qFriends, listOfWords);
} catch (IOException e) {
e.printStackTrace();
}
System.out.println(qAlreadyVisited.size());
}
public static void test (Queue<String> qFriends, ArrayList<String> listOfWords){
while (!qFriends.isEmpty()){
getAllFriends(qFriends, listOfWords);
}
}
public static void getAllFriends(Queue<String> qFriends, ArrayList<String> listOfWords){
String findFriendFor = qFriends.poll();
qAlreadyVisited.add(findFriendFor);
for (int i = 0; i< listOfWords.size(); i++){
String str = listOfWords.get(i);
if (!(str.length() - findFriendFor.length() > 2)){
if ((levenshteinDistance(findFriendFor, str) == 1)){
if (!(qFriends.contains(str)) && !(qAlreadyVisited.contains(str))){
qFriends.add(str);
}
}
}
}
}
public static int levenshteinDistance(String s, String t){
// for all i and j, d[i,j] will hold the Levenshtein distance between
// the first i characters of s and the first j characters of t;
// note that d has (m+1)*(n+1) values
int[][] arr = new int[s.length()+1][t.length()+1];
// source prefixes can be transformed into empty string by
// dropping all characters
for (int i=1; i<= s.length(); i++){
arr[i][0] = i;
}
// target prefixes can be reached from empty source prefix
// by inserting every characters
for (int i=1; i<= t.length(); i++){
arr[0][i] = i;
}
arr[0][0] = 0;
for (int i=1; i<= s.length(); i++){
for (int j=1; j<= t.length(); j++){
if (s.charAt(i-1)== t.charAt(j-1)){
arr[i][j] = arr[i-1][j-1];
}
else{
arr[i][j] = Math.min(Math.min(arr[i-1][j] + 1, arr[i][j-1] + 1),
arr[i-1][j-1] + 1);
}
}
}
return arr[s.length()][t.length()];
}
}
As a preprocessing step, we need to any way build the complete social graph from the wordlist in O(N^2) time. Two nodes in the social graph would be connected if the corresponding words are separated by a Levenshtein distance of 1.
- random May 09, 2012Once the graph is constructed, we can fire a query to get the largest connected subgraph for any word , that would internally execute as a DFS starting at the queried word.
The Levenshtein distance or edit distance between any two words can be computed using Dynamic Programming (Refer to the problem of computing edit distances between two words. Feeling lazy to google the resource out on the web)