Google Interview Question for Software Engineer / Developers






Comment hidden because of low score. Click to expand.
2
of 2 vote

Use trie for matching dictionary.

Recursively call when matched one word.

Depending on the dictionary, complexity varies, in the worse case, for dictionary with most similar words, such as {a, aa, aaa, aaaa, aaaaa, ...}, every char there is a recursive call and there may be 2^n possibilities. But you only need to output one solution, so the running time may not be to bad.

- Junxian.Huang February 21, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

A trie is much more efficient than a general hash table. Pseudocode for this solution:

match(A)
  n = len(A)
  if n == 0
    return true
  for i in [0..n-1]
    if trie.has_key(A[0..i]) and match(A[i+1..n-1])
      return true
    else if !trie.has_prefix(A[0..i]
      break
  return false

- yaojingguo December 26, 2013 | Flag
Comment hidden because of low score. Click to expand.
1
of 1 vote

I solved it using DP. I think the time complexity is O(n^2 + number of valid phrases).
My Algorithm:
1: check if a substring was searched, if so, return the cached result.
2: iterates though all substring starting at zero. If they are valid, try to form a phrase with the rest. If it has any valid frases, or it used all the letters, append it to the result array.
3: caches the result got in 2 and return.

for input : thisisawesome
and dictionary: [this, is, a, awe, we, some, awesome]
got: [this is a we some, this is awe some, this is awesome]

Code:

public class PutSpaces {
	private List<String>[] found;
	private Set<String> dictionary;
	public List<String> putSpaces(Set<String> dictionary, String str) {
		this.dictionary = dictionary;
		found = new List[str.length()];
		return putSpaces(str, 0);
	}
	private List<String> putSpaces(String str, int start) {
		if (found[start] != null)
			return found[start];		
		List<String> result = new ArrayList<String>();
		for (int i=0; i<str.length();++i) {
			String word = str.substring(0, i+1);
			if (dictionary.contains(word)) {
				if (word.equals(str)) {
					result.add(word);
				} else{
					List<String> next = putSpaces(str.substring(i+1), start+i);
					for (String phrase : next) {
						result.add(word + " " + phrase);
					}
				}
			}
		}
		return found[start] = result;
	}
}
public class PutSpacesTest {
	private Set<String> dictionary = new HashSet<String>();
	{
		dictionary.addAll(Arrays.asList(new String[]{"this", "is", "a", "awe", "we", "some", "awesome"}));
	}	
	@Test
	public void testPutSpaces() {
		String str = "thisisawesome";
		PutSpaces putSpaces = new PutSpaces();
		List<String> phrases = putSpaces.putSpaces(dictionary, str);
		System.out.println(phrases);
	}
}

- lucas.eustaquio August 02, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Your code doesn't look correct. You might add "i" and "saw" at the beginning of your dictionary; and let me know what output you get. In first look, it seems you are doing a greedy-like (not DP at all) approach. You might check your solution against few examples put here (as buried posted above): ideone.com/jgQGK

It'd be better, if you post your entire code to ideone.com. They support programs in 30+ languages.It'd be easier for all to play with your code, and catch any flaws.

- anonymous August 02, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Posted in there (ideone.com/tTV3W). Tested and got the output: [this is a we some, this is awe some, this is awesome]
About being DP or not, it is the DP top bottom approach. The high level logic is: Do i already have a solution for this subproblem? yes, then return. no? calculate it! So the first time the code needs a subproblem, it will solve it.
There was actually a minor error, but it wasn't in the logic (a +1 missing) in List<String> next = putSpaces(str.substring(i+1), start+i+ 1);

- lucas.eustaquio August 02, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

The code i posted did not compiled. Posted again in (ideone.com/yFwkN)

- lucas.eustaquio August 02, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Thanks for posting the entire code. However, it reported "compliation error".

Now I get your idea. But, it seems the complexity would be O(n^2 m) where n is size of string, and m is size of dictionary. You skipped complexity of O(m) which would incurred due to dictionary.contains(). Anyway, this is neat and correct solution to me. Thanks.

- anonymous August 02, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

no really. I used a hashset for dictionary (hashset is a hashhmap with key=value) so it is o n^2. Actually it also depends on the number of valid phrases found. And about the code i posted again on ideone.com/yFwkN and now it compiled. i should have put that public in the first time.

- lucas.eustaquio August 02, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Thanks for the clarification.

But, I can't agree on "true" O(n^2) complexity analysis. How could a hash function be O(1) when the key-size is variable? For example, how could a deterministic Turing machine compute a hash function on 1 million-chars string in constant time? IMHO, for variable sized key, you should not assume O(1) complexity for computing hash value. As, in this case, key length is O(n), then it would lead O(n^3).

- anonymous August 02, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

If i would model the class string certainly i would cache the hash value (it actualy happens in java) in this case, then amortized cost would be O(1). But I agree that it isnt a true O(n^2). But that is because the .substring part is O(n). Making it O(n^3), being n the number of chracters.

- lucas.eustaquio August 02, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

#include <iostream>
#include <vector>
#include <set>

using namespace std;

vector<string> cache[1000];
int state[1000];

void split(const char* s, int start, int end, set<string> dic)
{
if (start > end)
{
return;
}
if (state[start])
{
return;
}
string ss;
vector<string> result;
for (int i = start; i <= end; ++i)
{
ss += s[i];
if (dic.find(ss) != dic.end())
{
split(s, i + 1, end, dic);
if (i == end)
{
result.push_back(ss);
}
else
{
vector<string>::iterator iter;
for (iter = cache[i + 1].begin(); iter != cache[i + 1].end(); ++iter)
{
result.push_back(ss + " " + *iter);
}
}
}
}
state[start] = 1;
cache[start] = result;
}

int main()
{
set<string> dic;
dic.insert("this");
dic.insert("is");
dic.insert("a");
dic.insert("awe");
dic.insert("we");
dic.insert("some");
dic.insert("awesome");
string s("thisisawesome");
split(s.c_str(), 0, s.length() - 1, dic);
vector<string>::iterator iter;
for (iter = cache[0].begin(); iter != cache[0].end(); ++iter)
{
cout << *iter << endl;
}
return 0;
}

- wenlei.zhouwl May 15, 2012 | Flag
Comment hidden because of low score. Click to expand.
1
of 1 vote

There is a recursive way to find all the solutions:

void findSentences(string sentence,int position,string &result){	
	if(position==sentence.length()) {
		cout<<result<<endl;
		return;
	}
	TrieNode *dictIndex=&dictionary.topNode;

	for(int i=0;i+position<sentence.length();i++){
		dictIndex=dictIndex->moveForward(sentence[position+i]);
		if(dictIndex==NULL) return;
		if(dictIndex->wordMark){
			int resultLength=result.length();
                        //add new word 
			result+=" "+sentence.substr(position,i+1);
			findSentences(sentence,position+i+1,result);
                        //remove new word
			result.erase(resultLength,i+2);
		};
	}

}

TrieNode is a node in a prefix tree. Node has a wordMark field that denotes whether set of
characters formed by visiting all the parent nodes in a tree constitute a valid word.
Method call newNode=oldNode.moveForward(c) returns new trie node reachable from oldNode
using character c.
Complexity of finding one solution is by O(N*C) (N-sentence length, C-lenght of the longest
dictionary word). Complexity of moveForward(c) is O(1).
Number of solutions may vary greatly depending on a sentence length and size of the
dictionary. For instance if one uses Morse alphabet the number of valid sentences may easily
become exponential.

- igor grudenic December 08, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Keep scanning the input and appending characters to the word.
Ex:
t Check in Dict not present
th Check in Dict not present
thi Check in Dict not present
this Check in Dict present
add space
reset word to ""
start with next char.

This is a very straight forward solution solution. Can anyone suggest something faster ?

- AJ July 31, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
1
of 1 vote

this is not guaranteed to find a solution.
For example, dictionary = {the, there, is}.
And input is "thereis".
If you add space after "the", there is no match for the remaining "reis". You have to consider "there is"

- Junxian.Huang February 21, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

Hmm, that doesn't work though. lets say we go through your algo, following are the words that come up - "this" "i" "saw" "esome" where esome is not a word. For this particular algo, lets say you dont have to care for grammar and they accept the break down as "this" "is" "awe" "some".

- BA July 31, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

AJ, your method will require Backtracking to choose correct words from dictionary.
We should look for some other solution as well.

- PQ July 31, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Once, I was reading book in store. I don't remember, how it was called, something about Artifical Intelligence.
And in this book was example about excactly such task.
But the one thing I remember, that this task can be relatively easy solved using statistical methods, based on word coombination frequence.

- Andrii July 31, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

tries ?

- raped July 31, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

A simple BFS backtracking algo would work, and would yield:

This is a we some

Kind of like a threesome :-P

- memo July 31, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

solution like keep scanning the word till match found in dictionary is wrong.e.g. isolation. scanner will stop after reading is and could not find olation

- Anonymous July 31, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Thanks guys. Can someone elaborate on the correct solution

- AJ July 31, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Well, I would say DP will help here. To be sure that at any point when we find a match in dictionary and we put a space, remaining letters can also be grouped into some meaningful words, we have to depend on next letters.

word(i, j) = ( dict(i,j)? (word(i, j+1) && word(i+1, j+1)) :  word(i, j+1))

word is a function which decides if the letters between position i and j (inclusive of i and j) can be a word, i.e., can have spaces at its ends.
dict(i,j) returns true if letters form i to j position finds a match in dictionary.

As we do not want number of valid combinations possible so we can exit the procedure when dict(i, N) is reached and is true.

- Anonymous August 01, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Well, I would say DP will help here. To be sure that at any point when we find a match in dictionary and we put a space, remaining letters can also be grouped into some meaningful words, we have to depend on next letters.

word(i, j) = ( dict(i,j)? (word(i, j+1) && word(i+1, j+1)) :  word(i, j+1))

word is a function which decides if the letters between position i and j (inclusive of i and j) can be a word, i.e., can have spaces at its ends.
dict(i,j) returns true if letters form i to j position finds a match in dictionary.

As we do not want number of valid combinations possible so we can exit the procedure when dict(i, N) is reached and is true.

Forgot to add in my previous post. Its complexity would be O(n^2).

- mindless monk August 01, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I think either checking from forward or backward would yield same problem. Also, at the end we might get stuck without being able to form a word like anonymous said. (isolation) So, I am thinking either way, forward or backward,lets say forward, we can combine two individual words and see if it can form a bigger word. If it does,then we can split the sentence in more than 1 way. If we get stuck then we have to combine with previously occuring characters and see if a word forms.

Ex - we have "thisisawesome"
If start from left, we get - this i saw "esome". Now there is no "esome". So, combine that with prev char and check again, we get "wesome", no wrong again. Now check "awesome", yes we got the word. But "s" from "saw" is left. So, again do "is" and we found this too. So, finally we have "this is awesome". In some cases we can have multiple choices and may not always get exact sentence.

Now, this can lead to challenge of forming a sentence with a meaning. The ques doesn't have enough info to deduce this. We need some kind of info to help check if we have a meaningful sentence.

Anyway, my solution is also variant of backtracking. Any flaws/suggestions on my approach would be much appreciated. Thanks guys !!!

- avatarz August 01, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

@singhashish.osu ......... I am trying to understand your approach.
Say you have "thisisawesome". Now i=0,j=0 So,check for (t && h), returns false. Check (th)- returns false. Finally when this forms, i=0 , j=3...now what happens? I am sorry,I don't get it.Can you give a detailed explanation. Its been ages since I've done DP.

- avatarz August 01, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

DP is basically a tree based approach. So till i=0 and j=3, we will have one branch which will have "the". Now DP will shoot out a branch for handling the case where we start making a word from scratch, while the original branch will go on appending the letters to "the" to handle the case where the should not have been chosen as word.
The same rule applies on each branch, and at the end we will have all possible combination of words from the letters.

- mindless monk August 01, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

Also using trie or prefix tree will be a solution here, but coming up with trie will again require DP.

- mindless monk August 01, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

If the dictionary was a trie then we can do this effectively in linear time. The trie should also address words that for prefix of larger words.

- Kumar August 01, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 2 vote

@singhashish.osu: I couldn't comprehend your recursive definition. Therefore, I doubt if it's possible in O(n^2).

I solved it using DP that takes O(n^3). The approach is to solve subproblem (i.e. substring) of size 2, then size 3, ... up to n. Every step uses information of previous computed results.

Let, dict[i][j] is "true" if substring[i...j] is a word in given dictionary
else, dict[i][j] is "false". Computing dict[i][j] for all possible i/j values, takes O(n^3) if we store the dictionary in trie.

Let, word[i][j] is "true" if substring[i....j] can be partitioned into 1 or more parts such that each part is either (a) a word, or (b) comprises 2 or more words.

Initally, word[i][j] = dict[i][j]

Now, by recursive definition, word[i][j] is "true" if and only if there exists an index k such that word[i][k] and word[k+1][j] both are true, for i <= k < j. Use a path[][] matrix to keep the partition point to place "space".

You can check the implementation here : ideone.com/EIGjJ

Modify the dict[] array to test with other inputs. Welcome any catch in this code. Thanks.

- buried.shopno August 01, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Modified code is here (with special cases handling):
ideone.com/jgQGK

- buried.shopno August 01, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

dp[i][j] = Best split for string from chars i to j (inclusive)
dp[i][j] = dp[i][k] + " " + dp[k+1][j] if dp[i][k] is in dict i<=k<j

Ideally, each tokenization should be scored.
Eg: awesome = a + we + some is valid and so is awesome.. assuming the words (a, we, some, awesome) are present in dict.
Without scoring a tokenization, heuristics such as
1. longest token in tokenization => will favor "awesome"
2. Max number of tokens => will favor "a + we + some"

- Sriram S August 01, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

I think your comment is a good starting point in clarifying what we are supposed to do here. I find the question itself ambiguous. It doesn't specify whether we should find just 1 or all of the possible ways to add spaces to the input. It doesn't even really say each tokenized word needs to be present in the dictionary. Not to mention if there are multiple interpretations then how should we pick the best like you suggested.

Having said that, most people seem to assume that we are supposed to find just 1 interpretation such that each tokenized word must be in the dictionary.

- Sunny December 08, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

None better then Trie Will Help Here :)

- WgpShashank August 01, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

You are way too smart dude! How Trie would say deterministically a partition point?

If you dare to challenge, post your solution. Let us see how smart are you truly. I'm pretty sure to find bugs in your crappy/incorrect solution.

- anon August 01, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

And You Are Not Smart isn't it ? Have You Ever Implemented Trie . if You would have done it then you could not have been asked this question. only hint for you is that write insert function for trie , we know when we insert word in trie we make string a valid or invalid on the basis of termination point e.g. where we should end & make string valid , make end of word true its your choice, we will use same approach hear , whats the problem in this ?? if you are looking for the code of trie you can refer my blog & notify me if are able to find out bug there , solution of this problem already posted below (not accurate but has enough detail)

Shashank

- WgpShashank August 03, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

1.First sort the words of the dictionary.(say :net,ten,nte etc all will sort to ent which forms the key)
2.once all the words are sorted, check for each letter given in the phrase wether that letter exists in the sorted dictionary order. if exists add space else add a new letter and check the dictionary.

- DNS August 01, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Trie is the simplest solution for this I think , eow val will be set for every word , check for eow(end of word) flag , when found add space.

- Badri August 01, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

How Trie would say deterministically a partition point?
If you dare to challenge, post your solution. I'm pretty sure to find bugs in your solution.

- anon August 01, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

what you are saying is true.... We can't do partiality in Tries.....

- swordfish August 02, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

This problem can be solved using Trie datastructure. You can check the code in my blog.
Here is the code

#include <string>
#include <map>
#include <fstream>
#include <iostream>
#include <vector>
#include <algorithm>
#include <deque>
//#include "Timer.h"

typedef std::vector<size_t>      RowType;
typedef RowType::iterator        RowTypeItr;
typedef std::deque<std::string> ResultType;
typedef ResultType::iterator     ResultTypeItr;
typedef std::string              String;
typedef String::iterator         StringItr;


struct TrieNode {
	typedef std::map<char, TrieNode *> ChildType;
	typedef ChildType::iterator        ChildTypeItr;

	String    m_word;
	ChildType m_childMap;
	bool      m_visited;

	TrieNode() :m_childMap(std::map<char, TrieNode*>()), m_visited(false) {} 

	void insert( String& word ) {
		TrieNode *pNode = this;
		for ( StringItr itr = word.begin(); itr != word.end(); ++itr) {
			char letter = *itr;
			if ( pNode->m_childMap.find(letter) == pNode->m_childMap.end()){
				pNode->m_childMap[letter] = new TrieNode();
			}
			pNode = pNode->m_childMap[letter];
		}
		pNode->m_word = word;
	}
	void search_r(TrieNode *pTrie, char letter, String& word, RowType& previousRow, ResultType &results, size_t maxCost ) {
		size_t columns = word.size() + 1;
		RowType currentRow;
		currentRow.push_back(previousRow[0] + 1);
		size_t prevcol = 0;
		for (size_t column = 1; column < columns; column++) {
			if ( word[prevcol] == letter ) 
				currentRow.push_back(previousRow[prevcol]);
			else {
				size_t min_elem = 0, temp = 0;
				temp      = (currentRow[prevcol] < previousRow[column]) ? currentRow[prevcol]: previousRow[column];
				min_elem  =  (previousRow[prevcol] < temp) ? previousRow[prevcol] : temp;
				currentRow.push_back(min_elem + 1);
			}
			prevcol = column;
		}
		if (currentRow[currentRow.size() - 1 ] <= maxCost && pTrie->m_word != "" ) {
			if ( ! pTrie->m_visited ) {
				results.push_back(pTrie->m_word);
				pTrie->m_visited = true;
			}
		}
		if((*min_element(currentRow.begin(), currentRow.end())) <= maxCost) {
			for ( ChildTypeItr itr = pTrie->m_childMap.begin(); itr != pTrie->m_childMap.end(); ++itr) {
				search_r(itr->second, itr->first, word, currentRow, results, maxCost );
			}
		}
	}
	void search( String& word, size_t maxCost, ResultType &results ) {
		RowType currentRow ( word.size() + 1 );
		int i = 0;
		for ( RowTypeItr itr = currentRow.begin(); itr != currentRow.end(); ++itr)
			*itr = i++;
		for ( ChildTypeItr itr = m_childMap.begin(); itr != m_childMap.end(); ++itr) 
			search_r(itr->second, itr->first, word, currentRow, results, maxCost );
	}
};

int main(int argc, char **argv) {
	//Timer t1("Time taken by Program : " );
	TrieNode trie;
    std::ifstream dictFile(argv[1]);
    String line; 
    if (dictFile.is_open()) {
        while (! dictFile.eof() ) {               
            std::getline (dictFile,line);
            if ( line.size()) {
				trie.insert(line);
            }
        }
        dictFile.close();
    }
	String inputString(argv[2]);
	String outputString("");
	String tempWord;
	for ( String::iterator itr = inputString.begin(); itr != inputString.end(); ++itr){		
		ResultType results;
		tempWord.push_back(*itr);
		trie.search(tempWord,0,results);
		if ( results.size() == 0 ){
			continue;
		} else {
			outputString += tempWord;
			outputString += " ";
			tempWord.clear();
		}
		results.clear();
	}
	outputString += tempWord;
	std::cout << "Output : " << outputString << std::endl;
}

- dongre.avinash August 02, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Apparently your code is buggy! It'd fail when one word is prefix of another word. To convince you (not me) that it's WRONG, I took your code, and ran with few inputs (got from buried's posted link), and it failed. Check out here: ideone.com/WYFaH

A suggestion for you - before posting a code (without even a single explanation), first work on paper with few examples. Your program failed for given example (with original post). Then why the hell you should post your crappy solution here. By posting something incorrect, your just confuse all people, and waste their time. And above all, in this way you won't learn ANYTHING!

I copy the output for impatient readers:

input: isawesomesomesomeis
output: i saw esomesomesomeis
correct output (checked with buried's solution): is awesome some some is 

input: thisisawesome
output: this i saw esome
correct output: this is awesome

- anonymous August 02, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

hmmm I see where is the problem, I will fix it.

- Anonymous August 03, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Fixed the issue:

#include <string>
#include <map>
#include <fstream>
#include <iostream>
#include <vector>
#include <algorithm>
#include <deque>

typedef std::vector<size_t> RowType;
typedef RowType::iterator RowTypeItr;
typedef std::deque<std::string> ResultType;
typedef ResultType::iterator ResultTypeItr;
typedef std::string String;
typedef String::iterator StringItr;


struct TrieNode {
	typedef std::map<char, TrieNode *> ChildType;
	typedef ChildType::iterator ChildTypeItr;
	String m_word;	
	ChildType m_childMap;
	bool m_visited;
	TrieNode() :m_childMap(std::map<char, TrieNode*>()), m_visited(false) {}
	void insert( String word ) {
		TrieNode *pNode = this;
		for ( StringItr itr = word.begin(); itr != word.end(); ++itr) {
			char letter = *itr;
			if ( pNode->m_childMap.find(letter) == pNode->m_childMap.end()){
				pNode->m_childMap[letter] = new TrieNode();
			}
			pNode = pNode->m_childMap[letter];
		}
		pNode->m_word = word;
	}
	void search_r(TrieNode *pTrie, char letter, String& word, RowType& previousRow, ResultType &results, size_t maxCost ) {
		size_t columns = word.size() + 1;
		RowType currentRow;
		currentRow.push_back(previousRow[0] + 1);
		size_t prevcol = 0;
		for (size_t column = 1; column < columns; column++) {
			if ( word[prevcol] == letter )
				currentRow.push_back(previousRow[prevcol]);
			else {
				size_t min_elem = 0, temp = 0;
				temp = (currentRow[prevcol] < previousRow[column]) ? currentRow[prevcol]: previousRow[column];
				min_elem = (previousRow[prevcol] < temp) ? previousRow[prevcol] : temp;
				currentRow.push_back(min_elem + 1);
		}
		prevcol = column;
	}
	if (currentRow[currentRow.size() - 1 ] <= maxCost && pTrie->m_word != "" ) {
		//if ( ! pTrie->m_visited ) {
			results.push_back(pTrie->m_word);
		//	pTrie->m_visited = true;
		//}
	}
	if((*min_element(currentRow.begin(), currentRow.end())) <= maxCost) {
		for ( ChildTypeItr itr = pTrie->m_childMap.begin(); itr != pTrie->m_childMap.end(); ++itr) {
			search_r(itr->second, itr->first, word, currentRow, results, maxCost );
		}
	}
}
void search( String& word, size_t maxCost, ResultType &results ) {
	RowType currentRow ( word.size() + 1 );
	int i = 0;
	for ( RowTypeItr itr = currentRow.begin(); itr != currentRow.end(); ++itr)
		*itr = i++;
	for ( ChildTypeItr itr = m_childMap.begin(); itr != m_childMap.end(); ++itr)
		search_r(itr->second, itr->first, word, currentRow, results, maxCost );
	}
};

bool stringCompare( const String &left, const String &right ){
   if( left.size() < right.size() )
      return true;
   if ( left.size() > right.size() )
	   return false;
   return false;
}

TrieNode trie;
std::string findWord( size_t &left, size_t &right , String &inWord){
	ResultType vecResult;
	String word = "";
	for ( size_t i = right; i <= inWord.length(); i++) {
		String w("");
		w.append( inWord.begin() + left, inWord.begin() + i );
		trie.search(w,0,vecResult);
	}
	if ( vecResult.size() != 0 ) {
		std::sort( vecResult.begin(), vecResult.end(), stringCompare);
		left = left + vecResult[vecResult.size() - 1 ].size();
		right = left + 1;
		word = vecResult[vecResult.size() - 1 ];
	}
	return word;
}

int main(int argc, char **argv) {
	
	trie.insert("i");
	trie.insert("this");
	trie.insert("saw");
	trie.insert("is");
	trie.insert("a");
	trie.insert("some");
	trie.insert("awesome");

	String inputString(argv[1]);
	String outputString("");
	size_t left  = 0;
	size_t right = 1;

	while ( right <= inputString.size()) {
		String w = findWord ( left, right, inputString);
		if ( !w.empty()) {
			outputString += w;
			outputString += " ";
		}
	}
	std::cout << outputString << std::endl;
}

D:\personal\Trie\Release>Trie.exe isawesomesomesomeis
is awesome some some is

D:\personal\Trie\Release>Trie.exe awesome
awesome

D:\personal\Trie\Release>Trie.exe isawesomesomesomeis
is awesome some some is

D:\personal\Trie\Release>Trie.exe thisisawesome
this is awesome

- Anonymous August 03, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Why you post such big code that you won't be able to write down at first sitting in a 30-min interview? So next time write manageable-size code.

Besides, you should give brief explanation / commented your code so that others can understand. No one needs an explanation-less solution. And finally, when you write a code, always mention what's the complexity you claim for it. All interviewers ask this to candidates.

- hmm August 03, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

I don't think DP works for the fundamental reason that the optimal solution does not consist of optimal solutions of subproblems - by adding a letter you create new words that would never have been considered before.

The simple linear approach described above - building a string until you come to a valid word, then starting over - works well, but what if you want to find all valid solutions? My answer was to use recursion. If the current string was not a valid word simply recurse deeper, appending the next character. If it is a valid word, recurse deeper in 2 ways: 1 as before, and again having added the valid word to your 'keeping track' string and having reset the current string.

Although in theory this could be 2^n, in reality this won't happen for any natural language - there simply won't be that many valid word boundaries.

An optimisation that a few people have mentioned is to use a trie rather than a hashset or similar for the dictionary. Trie accesses are linear in the length of the key, and because this is a natural language question there is a reasonable upper bound on this length of around 15 chars in the worst case (average more like 7), so lookup is essentially constant time. The added bonus is that tries provide a build in prefix function. This would allow early pruning of the recursive tree - if you current string is not a prefix for any word (which can be checked in constant time) that branch can be abandoned

- anon August 02, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

You probably have not good understanding on DP. When you start up with problem size (i.e. substring size) of length 1, and then add a letter to substring, you get a problem of size 2. This 2-letter problem could be thought of as 2 sub-problems each of size 1. For example, "a" and "i" both are valid 1-letter word. For "ai" which is a 2-letter substring, we can re-use the information we computed previously.

Now, consider another substring "zoo". If we add another letter 'm', "zoom" becomes a 4-letter substring. Now, there is no valid partitioning of zoom (i.e. zoo+m, zo+om, z+oom); hence, "zoom" is looked up in dictionary for a valid match. As this is valid word, we get single partition for word "zoom". However, if it were something else, like "zool", then surely no valid way to partition it. Then, we have to take following letters to make up valid partitioning, for example "zoology".

If you are still not convinced, you could check my DP solution posted above. Take this as a challenge to break this code with some counter example, where DP fails. Modify the dictionary as you wish.

- buried.shopno August 02, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

i think its just similar like the lexical analysis in the compiler.
we have dictionary then just using the linear traversal and store the latest varified word and then go deeper and deeper and as not any valid solution exist terminate for that string

- ducks August 02, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

your solution has some problems.

example. history if words [hi, his, story] in dictory

- zombie August 22, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

<pre lang="" line="1" title="CodeMonkey2950" class="run-this">import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PutSpaces {
private List<String>[] found;
private Set<String> dictionary;

@SuppressWarnings("unchecked")
public List<String> putSpaces(Set<String> dictionary, String str) {
this.dictionary = dictionary;
found = new List[str.length()];
return putSpaces(str, 0);
}

private List<String> putSpaces(String str, int start) {
if (found[start] != null)
return found[start];

List<String> result = new ArrayList<String>();
for (int i=0; i<str.length();++i) {
String word = str.substring(0, i+1);
if (dictionary.contains(word)) {
if (word.equals(str)) {
result.add(word);
} else{
List<String> next = putSpaces(str.substring(i+1), start+i+1);
for (String phrase : next) {
result.add(word + " " + phrase);
}
}
}
}
return found[start] = result;
}

public static void main(String[] args) {
Set<String> dictionary = new HashSet<String>();
dictionary.addAll(Arrays.asList(new String[]{"this", "i", "saw", "is", "a", "awe", "we", "some", "awesome"}));
String str = "thisisawesome";
PutSpaces putSpaces = new PutSpaces();
List<String> phrases = putSpaces.putSpaces(dictionary, str);
System.out.println(phrases);
}

}
</pre>

- lucas.eustaquio August 02, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Since your implementation relies on building the table _found_ for memoization, you spend time building the O(n^2) entries in the table. But since your implementation is recursive, how do you argue that the running time is still O(n^2) or O(n^3) as described above? I mean even though the table size is O(n^2), can you still have function calls that reads from the table _found_ more than O(n^2) times?

- ForLucas August 20, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

A second call to a recursive path alerady found will be O(1), so it won`t alter the overall complexity.Just look at the big picture: all this method does is build a found array with n entries at most. At it does n comparations to build each entries, therefore it is O(n^2). If you doubt, run the code doubling the string length and see how time grows.
If this wasn't dynamic programming, you would be correct, and the simple recursive method would have O(n!) complexity.

- lucas.eustaquio August 21, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

Check out the paper titled "learning to tokenize web domains"
I have worked on this problem extensively. DP + Machine Learning is thus far the best you can do.

- Sriram S August 03, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Here is my simpler version
1. Build a set of all the words ( from a dictionary )
2. Have one pointer left, which points to start of the string
3. have another pointer right which pointers to the end of the string
4. try finding the word ( left .. right ) in the set
5. If found you got the word
6. if not, reduce right pointer by 1
7. goto step 4
8. if you find the word, move left equal to left + size of the word found
9. continue this till left is not equal to right.

#include <set>
#include <fstream>
#include <string>
#include <iostream>

std::set<std::string> pDict;

int BuildDictionary(const char *filename) {
    std::ifstream dictFile(filename);
    if ( dictFile.peek() == EOF ) {
        return -1;
    } 
    std::string line; 
    if (dictFile.is_open()) {
        while (! dictFile.eof() ) {               
            std::getline (dictFile,line);
            pDict.insert(line);
        }
        dictFile.close();
    } else {
        return EXIT_FAILURE;
    }
    return 0;
}
int main ( int argc, char **argv) {
	//BuildDictionary(argv[1]);
	pDict.insert("i");
	pDict.insert("this");
	pDict.insert("saw");
	pDict.insert("is");
	pDict.insert("a");
	pDict.insert("some");
	pDict.insert("awesome");

	std::string word(argv[1]);

	std::string::iterator left = word.begin();
	std::string::iterator right = word.end();
	std::string::iterator rightMoving = left + 1;
	while ( left != right ) {		
		std::string tempWord(left, right);
		if ( pDict.find(tempWord) != pDict.end()) {
			std::cout << tempWord << " ";
			left = left + tempWord.size();
			right = word.end();
			continue;
		} else  {
			right--;
		}		
	}
	return 0;
}

- dongre.avinash August 11, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

boolean InsertCorrectSpace(Char * S){
Char * temp;
int length = strlen(S);
for(int i=0; i<length; i++){
temp = strncpy(temp, S, i + 1);
if(IsInDictionary(temp)){
if(i == length-1){
return true;
}else if(InsertCorrectSpace(&S[i+1])){
print(temp);
return true;
}
}
}
return false;
}

- C code for this problem August 16, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Go to hell with your crappy backgracking solution. run the program for 1000 chars.

- anonymous August 20, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class Test
{
	public static void main(String[] args)
	{
		Set<String> dictionary = new HashSet<String>();
		dictionary.addAll(Arrays.asList(new String[] { "this", "is", "a", "awe", "we", "some", "awesome" }));
		String str = "thisisawesome";
		
		List<String> phrases = new ArrayList<String>();
		putSpaces(str, 0, phrases, dictionary, new StringBuffer());
		System.out.println(phrases);
//		[this is a we some, this is awe some, this is awawesome]

	}
	
	public static void putSpaces(String str, int start, List<String> list,Set<String> dictionary,  StringBuffer sb)
	{
		String word = "";
		for (int i=start; i< str.length(); i++)
		{
			word += str.charAt(i);
			if (dictionary.contains(word))
			{
				sb.append(word);
				if (i == str.length()-1)
				{
					list.add(sb.toString());
				}
				else
				{
					sb.append(" ");
					putSpaces(str, i+1, list, dictionary, sb);
					sb.setLength(i+2);
				}
			}
		}
	}
}

- Ted June 05, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

The solution can be done in N^2 with DP. The key is to find all the occurences of words in the dictionary in the string. This can be done in linear time with the Aho-Corasick algorithm.

- Vlad November 02, 2012 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

This is interval DP problem. ie. use dp[i][j] = -1 if str(i....j) can't be split and dp[i][j] = p, if it can be split as dp[i][p], dp[p+1][j]. and apply this logic recursively .

Overall order is O(N^3).
Variation of problem : Split such that min number of spaces is used

- Anonymous July 25, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

This is interval DP problem. ie. use dp[i][j] = -1 if str(i....j) can't be split and dp[i][j] = p, if it can be split as dp[i][p], dp[p+1][j]. and apply this logic recursively .

Overall order is O(N^3).
Variation of problem : Split such that min number of spaces is used

- Anonymous July 25, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

use trie data structure and everything will be fine.aaal is well

- Ajay August 31, 2014 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

thanks ajay for the explanation.

- UjjwalBengu August 31, 2014 | Flag


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More