given a dictionary of wrods,fi

Google Interview Question for Software Engineer / Developers

3

of 5 votes

20
Answers
given a dictionary of wrods,find the pair of word with following property:
1,the two word don't have same letter.
2,the multiple of the two word's length is maximum.
i give a simple O(n*n*k)(k is the average length of word) method.but i think there will be better one .
- haozyname November 19, 2013 in china | Report Duplicate | Flag | PURGE
Google Software Engineer / Developer Algorithm

Email me when people comment.

An error occurred in subscribing you.

Country: china
Interview Type: In-Person

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 9 vote

Assuming the word is A-Z/a-z only, use a bitmap to set which letters it contains.
e.g. ca => 000....101
bb => 000...010
Iterate over the words in decreasing order of length
for each pair of words, AND the bitmaps. Return the first pair that gives a 0 result.

This should be n*k + n*n

- Anonymous November 19, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Some optimizations depending on how the dictionary is stored:
- Keep track of the max word length on a page of dictionary (internal node in a trie like structure)
- Bloom filter for the intersection of letters that occur on a page of dictionary (internal node in a trie). If the given word and bloom filter have commonality, then skip the subtree.

- Anonymous November 19, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

excellent way, per word we'll however need uint64_t which should be fine i guess but what about point 2 of the question?

- confused_banda November 19, 2013 | Flag

Comment hidden because of low score. Click to expand.

-1

of 1 vote

For point 2, I was thinking of sorting the dictionary by length (n*n). Then traverse it in a nested loop (n*n) in decreasing order of length to find the max length pair...

- Anonymous November 19, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 5 votes

the 'overlap' detection using bitmask is nice.

however, for finding max product length, it is incorrect to traverse the dictionary by decreasing order of length and return the first pair that has no overlap.

By starting out with the longest words, you are assuming that both the final 2 words have long length. But it could be the case that your 'found' pair has one long word and one short word. If that is the case, a pair of 2 average-length words can beat a pair of a long word and a short word. For example, 5 * 5 > 10 * 2

- hugh.hn November 20, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 2 votes

my attempt:

since a lot of English words have common characters between them (e.g., a majority of English words have letter 'e'), it is too expensive to loop through the entire dictionary every time you try to find a non-overlapping pair.

Rather, we could pre-compute the dictionary and divide words into 'buckets'. The English alphabet has 26 characters, so each word can belong to one or more buckets (eg: 'red' belongs to bucket R, bucket E and bucket D).

When searching for non-overlapping pair, the outer for() loop can be all words in the entire dictionary, so O(N).

For each word in the dictionary, we run a second inner for() loop to try to pair it with the longest non-overlapping word. For this inner for() loop, we only have to consider words that do not belong to any of the buckets of the first word. For example, if first word is 'red', we look at words that do not belong to buckets R, E and D. That will reduce significantly time spent in the inner for() loop from O(n) to O(k), where k is the number of words that have no overlap with first word.

Repeat for all words in dictionary. At all time, keep track of max multiple and update it every time we find a pair's multiple that beats current max.

- hugh.hn November 20, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Really nice to use bits to identify common characters.

- srikantaggarwal January 12, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Assuming you have an Dictionary technically a map
For 1st case :

a. Take out all the keys.
b. for each key get the value from map. Recursively iterate through the key and value to find the pairs with no common letter.

For 2nd Case:

a. Compute totalLength = key.length * value.length;
b. Maintain one more map with key as totalLength and value as List of keys.
c. at the end dump the values of largest key.

- kedarsdixit November 19, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Assume all words are stored in a trie tree, otherwise preprocessing all words.

For each word w, search w following the trie tree, if the character is w, return in current branch. Otherwise, keep searching. Return the maximum length word found in the trie tree.

- Jason November 19, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

For the love of god guys, big O notation is often a funny topic. It's cool to talk about, but it sometimes means nothing. For example, if you have a container of size 100 and you need some answer with it that can be done in O(n), it is sometimes fast enough (or even FASTER at times) to do a O(n^2).

This is a dictionary... 200k^2 isn't that big. I coded up a disgusting piece of code. It found an answer in a real dictionary in 0.9 seconds. Here is the answer:
Dictionary size: 235882
Words: chromochalcography uniquisitiveness
Product: 306

The code (of course):
bool valid_words(string const &w1, string const &w2)
{
if(w1.find_first_of(w2) == string::npos)
return true;
return false;
}

pair<string, string> find_two_words(vector<string> dict)
{
// sort copy of dictionary in order of descending size (biggest first)
sort(begin(dict), end(dict), [](string const &w1, string const &w2)
{
return w1.size() > w2.size();
});

// Find an answer
pair<unsigned, unsigned> ans; // indices into dict
int maxProduct = 0;
for(int i = 0; i < dict.size(); ++i)
{
if(dict[i].size()*dict[0].size() <= maxProduct)
return make_pair(dict[ans.first], dict[ans.second]);

for(int j = 0; j < dict.size(); ++j)
{
unsigned product = dict[i].size()*dict[j].size();

if(product <= maxProduct)
j = dict.size();
else if(valid_words(dict[i], dict[j]))
{
ans = make_pair(i, j);
maxProduct = product;
}
}
}

return make_pair(dict[ans.first], dict[ans.second]);
}

int main()
{
ifstream fin("dict.txt");
string word;
vector<string> dict;
while(fin >> word)
dict.push_back(word);

cout << "Dictionary size: " << dict.size() << endl;
auto biggest = find_two_words(dict);
cout << "Words: " << biggest.first << " " << biggest.second << endl;
cout << "Product: " << biggest.first.size()*biggest.second.size() << endl;
}

- TheRightAnswer November 20, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

This is the most optimal solution at O(n * k). The assumption I take into account is that only one word is the longest value without specific characters (If there are two words then technically more than a single pair can be returned)

private static String calculate(List<String> dictionary) {
		//a little bit more efficient to make this ahead of time
		Character[] alphabet = new Character[26];
		for (int i = 0; i < 26; i++) {
			alphabet[i] = (char)('a' + i);
		}
		
		String[] longestWord = new String[26];
		
		//Build the longest word records
		for (String word: dictionary) {
			HashSet<Character> characters = new HashSet<Character>();
			for (char c: word.toCharArray())
				characters.add(Character.toLowerCase(c));
			
			int length = word.length();
			
			for (int i = 0; i < alphabet.length; i++) {
				if (!characters.contains(alphabet[i])) {
					int longest = longestWord[i].length();
					//Assumption there is one 1 word that is the longest value with out that character
					if (length > longest) {
						longestWord[i] = word;
					}
				}
			}
		}
		
		String pair = "";
		int maxProduct = 0;
		//Find the pair:
		for (int i = 0; i < longestWord.length; i++) {
			String word = longestWord[i];
			int length = word.length();
			
			HashSet<Character> characters = new HashSet<Character>();
			for (char c: word.toCharArray())
				characters.add(Character.toLowerCase(c));
			
			for (int j = i + 1; j < alphabet.length; j++) {
				if (!characters.contains(alphabet[i])) {
					String wordTwo = longestWord[j];
					int product = length * wordTwo.length();
					if (product > maxProduct) {
						maxProduct = product;
						pair = word + ":" + wordTwo;
					}
				}
			}
		}

		return pair;
	}

- Anonymous November 25, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

I am sorry. I made a mistake in my original code. I guess I should've verified it before submitting it. The issue I did was I looped through the values I calculated as the longest and then took the product together.

THIS IS INCORRECT.

What I should've done was loop through the dictionary again and take the product from the letters not present and calculate this out.

This still results in O(n), but it is really O(2 * n * k) instead of O(n*k).

Sorry next time I'll think that through.

- RAS - Original Poster November 25, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Also since I didn't test this (just throwing it together for people who want a coding example) I should point out the above method would throw a null pointer exception because I did not initialize the indexes for the longestWord.

- RAS - Original Poster November 25, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I think this fails for:
["bcde", "acde", "abde", "abce", "abcd", "ab", "cd"]

- Anonymous November 30, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

1. Create a trie of the dictionary words
	For each word, find the list of letters that is not in the word, find the longest word which contains only these letters traversing the trie.
3. Output the maximum product out of all the words.

- commentor November 26, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

1. Sort words
2. Build a boolean 2D table where row corresponds to a letter and column to a word in the dictrionary. Mark an element[c][w] true if the word contains the word w contain letter c.
3. Iterate over each row of the 2D table to contruct, for each letter, a set of words that do not contain the letter.
4. Iterate over the disctionary words in decreasing order. For each word, find the intersections of the sets computed in step 3 that correspond to the letters of this word. Then, multiply the length of the current word with the largest word in the intersection. Remember the result if is greater than previous maximum.
5. Repeat step 4 until the length of the current word is smaller than the square root of the previous maximum.

The overall complexity is roughly O(nlogn + m * S), where m is the overall number of letters in the dictionary and S is the average size of a set computed in step 3.

import java.util.Arrays;
import java.util.HashSet;



public class GetBiggestNonSharedWordMultiple {
	private final static int NUM_LETTERS = 26;

	private static int charToIndex(char ch){
		return Character.toLowerCase(ch)-'a';

	}



	public static int getBiggestMultiple (String [] words){
		int n = words.length;
		Arrays.sort(words); // O(n log n) ascending

		boolean [][] table = new boolean[NUM_LETTERS][];
		for(int i = 0; i < NUM_LETTERS; i ++) table[i] = new boolean[n];

		// O(m); m overall num of letters in input
		for(int w = 0; w < n; w++){
			String word = words[w];
			for(int i = word.length()-1; i >=0 ; i--){
				table[charToIndex(word.charAt(i))][w] = true;
			}

		}

		// O(k*n)
		HashSet<String> [] absenceSets = new HashSet[NUM_LETTERS];
		for(int i = 0; i < NUM_LETTERS; i ++) {
			HashSet<String> abSet = new HashSet<>();
			for(int w = 0; w < n; w++) {
				if(!table[i][w]) abSet.add(words[w]);
			}
			absenceSets[i] = abSet;
		}


		// O(m * AvgSetSize)
		int max = -1;
		for(int w = n-1; w >=0 ; w--){
			String word = words[w];	
			int wLen = word.length();
			if(wLen*wLen < max) break;
			if(wLen == 0){
				if(max < 0) max = 0;
				continue;
			}
			HashSet<String> currentSet = new HashSet<>();
			currentSet.addAll(absenceSets[charToIndex(word.charAt(0))]);

			for(int i =1 ; i < wLen ; i++){
				currentSet.retainAll(absenceSets[charToIndex(word.charAt(i)) ]);
			}


			int largestSize = -1;
			for(String otherWord: currentSet){
				if(otherWord.length() > largestSize) largestSize = otherWord.length();	
			}
			int mult = wLen * largestSize;
			if(mult > max) max = mult;


		}
		return max;

	}


	public static void main(String[] args) {
		String [] arr = { "hello", "world" ,"superb", "my", "mercury" };
		System.out.println(getBiggestMultiple(arr));
		
		
		
	}
	
	
}

- konst June 03, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

This is my O(n*k) where n is number of words and k is longest length possible in the dictionary.

1. Make an array of 26 strings for each character with name longestWord[ ] .
2. Iterate through the entire dictionary and check for each character in that word if the string corresponding to that character(stored in array) is actually null or length is smaller than the current string. If yes, then update it to the current String.
3. Keep a variable max and Iterate through the dictionary again:
3a. Add all the characters of the current word to a HashSet <Character> restricted. Also, keep a variable currentMax.
3b. Iterate from char c='a' to 'z' and check if it that c is not in restricted. If it is not then get the String from longestWord corresponding to that char.
3c. Check if it is the biggest non-overlapping then update our currentMax.
3d. Now, check if the max<currentMax*currentWord.length() and if it is update our max and save the pair of max product.
4. Return this pair.

This works because for current string all the non-overlapping strings would not have the same character as the current and the product for the current string would be maximized only when the other string length is maximized. O(n*K+n*26)

- Zero2 July 26, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

Can you clarify on second point? "2. the multiple of the two word's length is maximum. "

- confused_banda November 19, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

And by the way, that .9 seconds includes reading in the dictionary. I can only imagine that is the majority of the time used. My first return statement (as I predicted) executes rather quickly. The theory doesn't matter. Even if this were an exponential complexity, you pretty much know from domain knowledge that it won't do more than a few hundred thousand loops.

- therightanswer November 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

CareerCup

Google Interview Question for Software Engineer / Developers

Books

Videos

Resume Review

Mock Interviews