Given a dictionary, and a list

Google Interview Question for Software Engineer / Developers

2

of 4 votes

41
Answers
Given a dictionary, and a list of letters ( or consider as a string), find the longest word that only uses letters from the string. [I didn't meet this question, what's the best solution?]
- Obiwana March 11, 2014 in United States | Report Duplicate | Flag | PURGE
Google Software Engineer / Developer

Email me when people comment.

An error occurred in subscribing you.

Country: United States
Interview Type: Phone Interview

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 1 vote

Algo:
1. Put all chars of string in a hashmap
2. traverse word by word from dictionary and for each word check all letter presents in hastable.
3. keep track of longest string traversed.

- Amit. March 12, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

This is the first thing I thought off and I am pretty sure this is the simplest and the most efficient way.
Let's think about it's complexity:

Let length of input string = m
Then 1. is O(m) time and O(m) space

Let the number of words in the dictionary be = n
Let the length of the longest word in the dictionary be l
2. is O(n * l)

Total O(m + n*l) = O(n) Linear time

Let me know if you need more clarification or find any problems with this approach.

- sushant.singh1987 March 14, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Wouldn't using binary search reduce the time complexity further?

- srikantaggarwal March 18, 2014 | Flag

Comment hidden because of low score. Click to expand.

-1

of 1 vote

Initially we need to preprocess the dictionary. For every string in dictionary we need to sort it and store it as a key and value as original string in Hash Table.

Sort the given characters.

Let's assume that you have a function isWord(w), which checks if w is a word using a dictionary. Let's for simplicity also assume for now that you only want to know whether for some word w such a splitting is possible. This can be easily done with dynamic programming.

Let S[1..length(w)] be a table with Boolean entries. S[i] is true if the word w[1..i] can be split. Then set S[1] = isWord(w[1]) and for i=2 to length(w) calculate

S[i] = (isWord[w[1..i] or for any j in {2..i}: S[j-1] and isWord[j..i]).

Finally return longest string.

Time complexity nlogn

- Anonymous March 19, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Correct me If I am wrong.
Find set of the input alphabet.
Find set of dictionary word.If both are same then check length of the word. Doesn't not solve the problem.

I assume that all characters from input must be in dictionary word for this to work.

- Praveen August 18, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Couple comments though:
1. it is possible that original string might contain same character several times so hashtable should better store number of occurrences of a symbol
2. sort original word dictionary by length in reverse order thus first match will be the answer

- Dmitry May 30, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

If the dictionary is given as a set of Strings, then we can just iterate over the dictionary and check whether that particular word can be formed with the given letters. If it is longer than the currently longest such word, then we store the word. At the end we can then return the longest such word.

Runtime is O(n + m*l) where n is the length of the whole dictionary (as if it were one string), m is the number of words in the dictionary and l is the length of the letters string.
Space complexity is O(w) where w is the length of the longest word in the dictionary.

public String getLongest(Set<String> dict, String letters) {
	String result = "";
	for (String s : dict)
		if (s.length() > result.length() && isWordOfLetters(s, letters))
			result = s;
	return result.length() == 0 ? null : result;
}

private boolean isWordOfLetters(String s, String letters) {
	Map<Character, Integer> occ = new HashMap<Character, Integer>();
	for (char c : s.toCharArray())
		if (!occ.containsKey(c))
			occ.put(c, 1);
		else
			occ.put(c, occ.get(c) + 1);
	for (char c : letters.toCharArray())
		if (occ.containsKey(c))
			occ.put(c, occ.get(c) - 1);
	for (int i : occ.values())
		if (i > 0)
			return false;
	return true;
}

- Zoli March 12, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

You sure your code works ?
You might want to loop over letters before looping over s.

Or better yet (optimization), you should do the loop over letters once in the outter function, and reuse it in the inner function:

public String getLongest(Set<String> dict, String letters) {

	// get letter counts for the input/letters string
	Map<Character, Integer> letocc = new HashMap<Character, Integer>();
	for (char c : letters.toCharArray())
		if (!letocc.containsKey(c))
			letocc.put(c, 1);
		else
			letocc.put(c, letocc.get(c) + 1);

	// check if each word is longer than previous longest valid word 
	String result = "";
	for (String s : dict)
		if (s.length() > result.length() && isWordOfLetters(s, letters, letocc))
			result = s;

	return result.length() == 0 ? null : result;
}

private boolean isWordOfLetters(String s, String letters, Map<Character, Integer> letocc) {
	Map<Character, Integer> occ = new HashMap<Character, Integer>();

	//count occurrences of letters in word 's' from dictionary
	for (char c : s.toCharArray())
		if (!occ.containsKey(c))
			occ.put(c, 1);
		else
			occ.put(c, occ.get(c) + 1);

	//if any letter in input string doesn't exist or exists in less number
	//in the word s, then the word doesn't pass our test
	for (char c : letters.toCharArray())
		if (!occ.containsKey(c) || occ.get(c)<letocc.get(c))
			return false;

	return true;  //word has all the characters of input string (letters)
}

- S O U N D W A V E March 12, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

S O U N D W A V E, thanks for the comment! Yes, I'm sure my code works, I've tried it with several inputs and it always returned the correct result.

You are right, the optimization is worth doing. Although the problem with your approach is that you actually lose all of the optimization you're doing. If you look at your code, you precompute the occurrences of the letters, yet you still iterate over all the letters in the isWordOfLetters function. If you want the optimization to actually take effect, you should only iterate over the EntrySet of letocc.

Also, I think your code would not work this way, because if there is a letter in letters, which does not appear in the given word, then you return false... In this case you shouldn't return false, because the word we are looking at does not necessarily contain all the letters of the given letters. In fact, you even return false if a letter occurs in the word but less times than in the given letters. This would also be a valid case, yet you return false.
The question asks: "find the longest word that only uses letters from the string", but it does not say that it has to use all letters.

- Zoli March 13, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Oops. I my interpretation of the question was different :P

- S O U N D W A V E March 13, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

haha, thats me, at every interview!

- puneet.sohi March 13, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

@ Zoli，
I think u may make a mistake in your code. When you check the hashtable value:
for (int i : occ.values())
if (i > 0)
return false;
return true;
I think you should return false when the hashtable value is smaller than 0. That means the word in dictionary contain the char which is not in the string letters

- Anonymous May 03, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Please let me know if this approach is possible:

Let's say the dictionary is a hashmap.
Permute all the combinations of words possible using the characters in the string.
Sort by length and start checking if the word exists in the dictionary.

- wolfengineer March 11, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Permutating through all combinations of letters of different length will be a very high number. I am sure there will be a better way to do it. Lets wait.

- kr.neerav March 12, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

Its a possible solution. However, it would be quite costly. I think approaching the dictionary as a trie would give a quicker solution

- puneet.sohi March 12, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Assuming #letters in the given string is n

Approach 1:
1. Find and put all sustrings of string in array A. Run time : O(n^2)
2. Sort this array. Run time O(n^2 * log(n^2)
3. Starting from longest entry in this array, check if present in dictionary. A match will give the ans. Run time O(n^2) assuming lookup time is O(1)

so total run time = O(n^2 * log n^2)

- puneet.sohi March 12, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

Approach 2:
Assuming dictionary is implemented as trie.

For each char in the string S, walk its trie entry in dictionary, checking to see if the path we're walking in trie has characters present from the string.

E.g. Say input string = "abzd"

Dictionary looks something like this:
b
a b c ... z

so we know 'b' is present in string and in dictionary, we follow this entry
'ba' is present in dictionary, as seen from above trie, its also present in string and so on.
Run time: O(n^2)

- puneet.sohi March 12, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@puneet - the letters in the substring can be in any order (since given is a list of letters, and we need to find words that uses only those letters irrespective of the order). So for you approach 1, finding all possible substrings would take more that O(n^2) in this case, I think.

- ss March 12, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

@SS: You're right. It would be much costlier.
I think it would be
nP0 + np1 + np2 .... npn -> A combinatorics problem( I don't know the answer to this)

So this approach will be quite time intensive..

- puneet.sohi March 13, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 2 vote

dude,
sort the string ,and do backtracking

- rishi March 12, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

m kk

- Anonymous March 12, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

I assume the dictionary is given with trie data structure.
Backtrack/BFS/DFS would give you the best result. You go from one state to another state if there is a letter that can be appended to the current string.

- mahdi.oraei March 12, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

we take 2-stage method:
1) break the input long string into substring by removing all letters not in the given letters. O(n) time complexity
2) now the task become: for each one of substrings, what's the longest valid word inside it.
this step should o(n^2) time complexity, any better way?

- samuel March 12, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

Actually, for a given string, and a dictionary, we could find all words inside the string very efficiently using Aho–Corasick string matching algorithm. It's O(n) algorithm with enhanced prefix tree, but, if I didn't know that in advance and can not figure it out my own during an interview, nobody will complain me, right:)

- samuel March 12, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I just discovered Aho–Corasick string matching algo, fascinating.. Thanks for the answer. thumbs up..

- puneet.sohi March 13, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Initially we need to preprocess the dictionary. For every string in dictionary we need to sort it and store it as a key and value as original string in Hash Table.

Sort the given characters. And generate power set of given characters(We get 2 power n strings). Check each string in power set is available in hash table. if it available note it down string and it's length.

Finally return longest string.

Time complexity 2 power n

- Coder March 12, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Its a good method. Only caveat I see is, with a large # of entries in dictionary, your hash table would take up a lot of space..

- puneet.sohi March 12, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

This is nice solution. I doubt anyone can better do o(2 power n)

- Anonymous March 13, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

You should have asked for dictionary API.
1) If it is a HashMap<String, String>, then there is not much to do except for checking all pairs.
2) If it is a Binary Search Tree, and it supports ceil and floor, OR, it is a more sophisticated structure that can tell you given a string "abc" if there are any words having "abc" as prefix, then you could do a backtracking and search through the words which exactly contain your letters.

Rough idea:

static int max_length;
static String longest_word;
BacktrackingFunc(String prefix) {
	if (!dictionary.containsPrefix(prefix))
		return;
	if (prefix.length() > max_length) {
		longest_word = prefix;
		max_length = prefix.length();
	}
	for (int i = 0; i < given_str.length(); i++) {
		BackTrackingFunct(prefix + given_str.charAt(i));
	}
}

By taking the second approach, you will at worst search for all the words in the dictionary. But that would be the case where the dictionary only contains words made of given letters.

To get an idea, consider the case where you have letters "a, b, c" and a very large dictionary.
Then, to find a word made of "a,b,c" and length "N" you will at worst call the dictionary 3^N times, but that is the case where all such combinations exists in the dictionary.

- Ehsan March 12, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I have a linear algorithm.
Let's say we have a set of letter, S, and a list of dictionary, dict.

step
1 define a type freq, which is a array of 26 letters; counting the frequency of each letters in S and store them into S_freq; counting the frequency of each letter of each word in dict(i.e. store in dicct_Freq);

2 compare function: bool compare(freq a, freq b); return true if b is in a; this algorithm is constant with worse case 26 comparisons.

3 compare every freq in dict_freq with S_freq and stop when compare(S_freq, dict_freq[i]) returns true; then max = dict[i].size(); store the index in max_index;

4 go over the rest of list and compare when dict.size() > max; track the index of max;

5 return dict[max_index]

- RandomGuy March 14, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

This can be done in O(nlogn) where n is the length of the search string ( or the characters to use)

The key is to store the dictionary that makes it amenable to do O(1) lookups. Dictionary should be stores as a HashMap where key is sorted unique set of characters and the value is a linked list of all words in the dictionary that uses all those characters. The linked list must always be in descending order of word lengths.

Lookup: Just sort input characters, lookup in the HashMap and return the 1st word in the linkedlist - O( time to sort chars )

Inserting a new word in the dictionary: Sort chars of the word, remove redundant chars, lookup the linked list in the HashMap and insert the new word at the right spot in the LinkedList (so that we preserve the descending world length criteria) - O (time to sort chars) + O(number of words in that bucket)

Code

public class SortedDictionary {

	private final Map<String, List<String>> dictionary;
	
	public SortedDictionary() {
		dictionary = new HashMap<String, List<String>>();
	}
	
	public static SortedDictionary create(String[] words) {
		SortedDictionary dictionary = new SortedDictionary();
		for (String word : words) {
			dictionary.add(word);
		}
		return dictionary;
	}
	
	public void add(String word) {
		String sortedString = getSortedString(word);
		List<String> words = dictionary.get(sortedString);
		if (words == null) {
			words = new LinkedList<String>();
			words.add(word);
		} else {
			insertAtSortedLocation(word, words);
		}
		dictionary.put(sortedString, words);
	}
	
	private void insertAtSortedLocation(String word, List<String> words) {
		int index = 0;
		for (String str : words) {
			if (str.length() <= word.length()) {
				words.add(index, word);
				return;
			}
			index++;
		}
		words.add(word);
	}
	
	private String getSortedString(String str) {
		char[] sorted = str.toLowerCase().toCharArray(); 
		Arrays.sort(sorted);
		StringBuffer result = new StringBuffer();
		boolean firstChar = true;
		char lastChar = ' ';
		for (char c : sorted) {
			if (firstChar || c != lastChar) {
				lastChar = c;
				firstChar = false;
				result.append(c);
			}
		}
		return result.toString();
	}
	public String search(String letters) {
		String sortedString = getSortedString(letters);
		List<String> list = dictionary.get(sortedString);
		if (list == null) return null;
		return list.get(0);
	}
}

- Looper March 16, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Saving some space from a normal hash map implementation, but pretty much the same idea:

# assume some dictionary and string such as:
# dict = ["abbad", "dandelion", "z", "daabbadd", "qqqqqqqqqqqqq"]
# str = "dab"
longest = ""
key = longest_size = 0
str.split("").each {|s| key |= (1<<(s.ord - "a".ord))}
dict.each do |r|
        v = size = 0
        r.split("").each {|s| (v |= (1<<(s.ord - "a".ord))) && size+=1}
        ((v ^ key) == 0 && longest_size < size) ? ((longest = r) && (longest_size = size)) : next
end
puts longest

O(n) complexity without making a hash map. Exploits the fact that characters incrementally increase and the given characters in a string can therefore map to a 26 bit bit-mask.

- Anonymous April 04, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Why is everybody talking about inserting things into hash tables? Maybe I don't understand the problem. Hash tables will give you a false negative when your target string has extra characters. Also, testing all of the permutations of the target string will be very expensive even at a cost of O(1) for testing them against the dictionary.

Maybe the key is that a dictionary usually doesn't have that many words in it compared to the number of permutations of a character set. 12! is about 479 million, but there probably aren't more than thousands of 12-letter words in the English language. Therefore, you probably want to test the dictionary against the target string, NOT permutations of the target string against the dictionary.

In fact, given today's L1/L2 cache architectures, I wouldn't be surprised if the brute force method is a very efficient way to do this - iterate through the dictionary, checking letters against the target string. On paper, sorting both dictionary and target strings will reduce comparison time from polynomial to linearithmic (where n is the size in characters of the dictionary). In reality, I'm not sure how much it helps with small strings.

To get to linear time, we could just construct a bit vector with a bit for each code point in the character set, set to true if the character exists in the target string. Things get even more interesting if you can assume a small alphabet - that bit vector fits in a register. Then a bitwise AND operation will tell you if the dictionary word can get its characters from the target string. This is still technically a linear solution, but I bet it's orders of magnitude faster. If you want to run comparisons against many target strings, you could preprocess the dictionary into a bit vector per word as well.

- Sir CodesALot April 11, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

And while we're talking about preprocessing the dictionary, might as well order the terms in descending order of length. That way, you can just stop when you find a match.

- Sir CodesALot April 11, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

What about the following idea:

Use a HashMap that contains an Integer as a key and a Heap of Strings as the value. The key represents the used letters of the word, e.g.:
a = 1
b = 2
c = 4
...
abcaabcabc = 7

public class LongestWordOfString {

	private static final Map<Character, Integer> _idMap = new HashMap<Character, Integer>();
	private final Map<Integer, PriorityQueue<String>> _dict = new HashMap<Integer, PriorityQueue<String>>();

	Comparator<String> stringLenComparator = new Comparator<String>() {
		@Override
		public int compare(final String o1, final String o2) {
			
			if (o1 != null && o2 != null) {
				// if the length is equal it doesn't matter which word will be returned
				return o1.length() > o2.length() ? -1 : 1; 
			} else if (o1 != null) {
				return -1;
			} else if (o2 != null) {
				return 1;
			} else {
				return 0;
			}
		}
	};
	
	static {
		// support only letters a...z
		final String tmp = "abcdefghijklmnopqrstuvwxyz";
		for (int i = 0; i < tmp.length(); ++i){
			_idMap.put(tmp.charAt(i), (int)Math.pow(2, i));
		}
	}
	
	
	public LongestWordOfString(final String... dictInput) {

		if (dictInput != null) {
			for (int i = 0; i < dictInput.length; ++i) {
				int id = determineWordId(dictInput[i]);
				if (_dict.containsKey(id)) {
					_dict.get(id).add(dictInput[i]);
				} else {
					final PriorityQueue<String> queue = new PriorityQueue<String>(10, stringLenComparator);
					queue.add(dictInput[i]);
					_dict.put(id, queue);
				}
			}
		}
	}

	private final int determineWordId(String word) {
		int retVal = -1;
		if (word != null && word.length() > 0) {
			word = word.toLowerCase(); // to support also upper letters for the determination of the ID
			retVal = 0;
			for (int i = 0; i < word.length(); ++i) {
				// assuming only aA...zZ exists
				retVal |= _idMap.get(word.charAt(i));
			}
		}
		return retVal;
	}

	
	public String determineLongestWordOfStringLetters(final String inputString) {
		
		String retVal = null;
		int id = determineWordId(inputString);
		if (_dict.containsKey(id)) {
			retVal = _dict.get(id).peek();
		}
		return retVal;
	}
}
...
	@Test
	public void test1() {
		LongestWordOfString lwos = new LongestWordOfString("abcabc", "abc", "AAAAAAAAAAAAAAABC", "xzasdf");
		System.out.println(lwos.determineLongestWordOfStringLetters("abc")); 
// AAAAAAAAAAAAAAABC
	}

The Heap is also removable if you only have to keep track of the longest String to a certain ID.
With this solution you can determine the word in constant time.

- Scavi April 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

1) use the low 26 bit in an int to denote if a letter exists
2) pre calculate such an int for all words in the dict
3) use xor to check if a dict word matches the given word

- zjzzjz September 11, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

static List getLongest(String [] arr, String dictionary){
        int [] dicCount = new int[256];
        HashMap<Integer,List<String>> res = new HashMap<>();
        
        int max = Integer.MIN_VALUE;
        for(char c: dictionary.toCharArray()){
            dicCount[c]++;
        }
        for(String s: arr){
        if(s.length()<=dictionary.length()&&matches(dicCount,s)){
            if(s.length()>max){
                if(max!=Integer.MIN_VALUE) res.remove(max);
                max=s.length();
                res.put(s.length(),new ArrayList<String>());
                res.get(max).add(s);
            }
           else if(s.length()==max){
                res.get(max).add(s);
            }
        }
        }
        return res.get(max);
    }
    
    static boolean matches(int [] a,String target ){
        int [] dicCount = Arrays.copyOfRange(a, 0, a.length);
        for(int i =0 ;i<target.length(); i++){
            --dicCount[target.charAt(i)];
            if(dicCount[target.charAt(i)]<0) return false;
        }
     System.out.println(target);

        return true;
    }

- Anonymous June 28, 2020 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

static List getLongest(String [] arr, String dictionary){
        int [] dicCount = new int[256];
        HashMap<Integer,List<String>> res = new HashMap<>();
        
        int max = Integer.MIN_VALUE;
        for(char c: dictionary.toCharArray()){
            dicCount[c]++;
        }
        for(String s: arr){
        if(s.length()<=dictionary.length()&&matches(dicCount,s)){
            if(s.length()>max){
                if(max!=Integer.MIN_VALUE) res.remove(max);
                max=s.length();
                res.put(s.length(),new ArrayList<String>());
                res.get(max).add(s);
            }
           else if(s.length()==max){
                res.get(max).add(s);
            }
        }
        }
        return res.get(max);
    }
    
    static boolean matches(int [] a,String target ){
        int [] dicCount = Arrays.copyOfRange(a, 0, a.length);
        for(int i =0 ;i<target.length(); i++){
            --dicCount[target.charAt(i)];
            if(dicCount[target.charAt(i)]<0) return false;
        }
     System.out.println(target);

        return true;
    }

- yo June 28, 2020 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

The issue with many of the explanations in this thread is that they assume you have access to the dictionary. For this question, the dictionary (list of strings) is wrapped in a class that contains only one public method, "Contains(string s)" to check if the string made up of some combination of letters exists in the dictionary. You can't preprocess the dictionary because you don't have access to it.

- anonymous January 08, 2021 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 5 vote

Post study questions in the forum, not here.

- Anonymous March 11, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

-2

of 2 votes

It is a question asked in Google interview.

- Obiwana March 11, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Yes, it is. Obiwana is right.

- Anonymous March 12, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

If this is a Google question, yay, I have a prayer :)

- JeffD March 18, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@JeffD: Good luck!

- Anonymous March 21, 2014 | Flag

CareerCup

Google Interview Question for Software Engineer / Developers

Books

Videos

Resume Review

Mock Interviews