Given a string and a dictionar

Google Interview Question for Software Engineer / Developers

3

of 3 votes

14
Answers
Given a string and a dictionary. Break the string into meaningful words.
- RXH February 21, 2013 in United States | Report Duplicate | Flag | PURGE
Google Software Engineer / Developer Algorithm

Email me when people comment.

An error occurred in subscribing you.

Country: United States
Interview Type: Phone Interview

More Questions from This Interview

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 3 vote

can it be done with DP? maybe a[i][j] keeps all lists of possible breaking points?

- adam2008 February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I think there can be words of size 1 to n. So big Oh of this is polynomial time.
Algorithm/sudo code,

for i in 1 to 10
{
//search all words of size i in the entire string
find_words(int size,String inputString)
}

Followup question could be how would you optimize it.
Ans : We can run find_words across n threads OR n map-reduce jobs. I am not saying map reduce just because its google, but this is perfect candidate for map reduce.

- kasi.beck February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

we can start from the dictionary. Basically, we would build a word tree that consists of all the dictionary words, and then go from our string to find whether there is a path in the word tree and the end of our string is actually an end of a dictionary word.

- shengzhc March 09, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Let say A[k] is a boolean that tells us if string{0..k} can be broken into a set of valid words. Here A[k] would be true iff there is any j < k exists such that A[j] is true and string{j+1, k} is a valid word(validity can be check by keeping all the valid words in a hashtable).

A[n] will just give us if we can break the string into valid words but to know the words we can keep the starting point of the string along with the boolean in each A[k] and by back tracing we can find all the words. However this will give only one possible sequence of words, we can keep record of all the valid starting points of the string along with boolean in each A[k].

A[k] = OR { A[i] AND IsValid(String {i+1...k}) } for all i < k & i > 0

- Anonymous March 11, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

python code

def break_recursive(s):
    for i in range(len(s)):
        str1 = s[i:]
        str2 = s[:i]
        if str1 in words and str2 in words:
            print str1, str2

- EK MACHCHAR May 10, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

The method name is misleading. Should have named it break_iterative
When searching for just 2 substrings to be scanned in dictionary, the iterative is more intuitive than recursive.

- EK MACHCHAR May 13, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

bool is_word(char* s, int len)
{
    //assume provided.
    return true;
}

void split_str_to_words(char* s, int len)
{
    int i;

    if (s == NULL || len <= 0) {
        return;
    }

    for(i = 1; i <= len; i++) {
        if (is_word(s, i)) {
            //printstring(s, i);
        }
        if (i < len && is_word(s + i, len - i)) {
            //printstring(s + i, len - i);
        }
    }
    split_str_to_words(s + 1, len - 2);
}

- c0der August 15, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

solution using dynamic programming

boolean wordBreak(String s) //check if word can be broken into words
{
 int size = s.length();
 if (size==0) return true;
 boolean []dp = new boolean[size+1];
 for (int i=1; i<=size; i++)
 {
   for (int j=0; j<i; j++)
   {
	if (dp[j]==true && dictionary.contains(s.substring(j,i)))
	{
		dp[i]=true;
		break;
	}
    }
 }
 return dp[size];
}

- alex January 30, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 5 vote

this can be done in n^2 time... 

ex: googlesearchengine

1. for i=0 to n, look for all words till i to n i.e..
		start, end indexes
g		0, 0
go		0, 1
goo		0, 2
goog	0, 3
googl	0, 4
google	0, 5
googles	0, 6			... do this till end of sentence
.
.
o		1, 1
oo		1, 2
oog		1, 3
oogle	1, 4
oogle	1, 5		... do this till end of sentence
.
.
s		6, 6 
se		6, 7
sea		6, 8
sear		6, 9
searc	6, 10
search	6, 11
searche	6, 12	... do this till end of sentence
.
.
.
e		12, 12
en		12, 13
eng		12, 14
engi		12, 15
engin	12, 16
engine	12, 17
---------------
here existing words in dictionary are
go		0, 1
google	0, 5
sea		6, 8
search	6, 11
ear		7, 9
a		8, 8
arc		8, 10
hen		11, 13
engine	12, 17

here for existing words in dictionary, hash all the starting indexes as keys and ending indexes as data in hash table.

1. now start from 0 key in hash table
hash key		hash data
0			1		
so there exists word from 0-1 index, if this is valid word then next word should start from index 2 that is previous hash data+1
now look for key 2 in hash table, this is not exist so previous word is not valid

2. now try for hash key 0 and hash data 5, there exists word (0-5),
now look for 6 key in hash, this exists and data is 11, there exists word (6-11)
now look for 12 key in hash, this exists and data is 17, there exists word (12-17) and end of sentence
-----------------------

this also can be done with 2D array of nxn... n is sentence length

- algos February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Thanks very much for your post! However, I am still lost. for example, why start from index 2 if this is valid word in 0-1 index.

Could you post the code?

- RXH February 22, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

index 0-1 is valid word then next valid word should start from index 2.
ex..
higoogle
here first valid word is "hi" index from 0-1
next valid word should start from index 2 and end at index 3(go) / 7(google)

question is to break the sentence into meaningful words, means to insert spaces in sentence that should form words exists in dictionary

- algos February 22, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

So how about "hinttt" which hi can be a valid word and in can also be valid word. Hi is 0-1and in is 1-2 which we cannot start from index 2.

- RXH February 22, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

So how about "hinttt" which hi can be a valid word and in can also be valid word. Hi is 0-1and in is 1-2 which we cannot start from index 2.

- RXH February 22, 2013 | Flag

Comment hidden because of low score. Click to expand.

-2

of 2 vote

google "google interview word break"

- bebo February 26, 2013 | Flag Reply

CareerCup

Google Interview Question for Software Engineer / Developers

Books

Videos

Resume Review

Mock Interviews