Goldman Sachs Interview Question
Software Engineer / Developer
Country: United States
Based on Prince's algorithm:
public static String findMostCommonString(String longString) {
    if (longString.isEmpty()) return "";
    String[] words = longString.split(" ");
    Map<String, Integer> countMap = new HashMap<>();
    for (String word : words) {
        if (countMap.containsKey(word)) {
            countMap.put(word, countMap.get(word) + 1);
        } else {
            countMap.put(word, 1);
        }
    }
    int max = 0;
    String maxWord = "";
    for (Map.Entry<String, Integer> entry : countMap.entrySet()) {
        int count = entry.getValue();
        if (count > max) {
            maxWord = entry.getKey();
            max = count;
        }
    }
    return maxWord;
}
Yes, you can also use a hash-table-based approach, as Prince described above.
1. Hashtable<String, Integer> ht = new Hashtable<String, Integer>();
2. Tokenize each word and check whether it is an existing key in the table; if it is, increment the Integer value.
3. Finally, scan each hash slot and find the entry whose Integer value is the maximum; that is the word whose frequency of occurrence in the sentence is the highest.
What about this code?
private static void countMostCommonlyUsedWord(String line) {
    Map<String, Integer> map = new LinkedHashMap<String, Integer>();
    StringTokenizer token = new StringTokenizer(line);
    int maxFrequency = 0;
    String mostCommon = "";
    while (token.hasMoreTokens()) {
        String strToken = token.nextToken();
        int count = map.containsKey(strToken) ? map.get(strToken) + 1 : 1;
        map.put(strToken, count);
        // Check the running maximum on every token, so a word seen only
        // once is still a candidate (e.g. when all words are unique).
        if (count > maxFrequency) {
            maxFrequency = count;
            mostCommon = strToken;
        }
    }
    System.out.println(map);
    System.out.println(mostCommon);
}
The ideal solution would describe how search engines work. What you need is:
1. A tokenizer
2. A stemming library
3. A list of stop words
4. A hash map
Method:
1. Tokenize the input text and filter the tokens using the stop-word list
2. Use the stemmer to find the stem of each word
3. If the stemmed word is already in the hash map, increment its frequency count; otherwise add it with frequency 1
4. Along with each word, also store its position in the text
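The pipeline above can be sketched as follows. The stop-word set and the suffix-stripping stem() method here are placeholders of my own; a real system would load a full stop-word list and use a proper stemming library.

```java
import java.util.*;

public class StemmedCounter {
    // Hypothetical stop-word set; a real system would load a larger list.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "is", "to", "of"));

    // Placeholder stemmer: crudely strips a couple of common suffixes.
    private static String stem(String word) {
        if (word.endsWith("ing") && word.length() > 5) return word.substring(0, word.length() - 3);
        if (word.endsWith("s") && word.length() > 3) return word.substring(0, word.length() - 1);
        return word;
    }

    // Returns stem -> positions in the text; the list size is the frequency.
    public static Map<String, List<Integer>> index(String text) {
        Map<String, List<Integer>> index = new HashMap<>();
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            String token = tokens[pos].replaceAll("\\W", "");
            if (token.isEmpty() || STOP_WORDS.contains(token)) continue;
            index.computeIfAbsent(stem(token), k -> new ArrayList<>()).add(pos);
        }
        return index;
    }
}
```

Storing positions rather than a bare counter covers step 4: the frequency is just the size of the position list.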
This is a solution to the initial question in code:
public static String mostCommonWord(String longString) throws InvalidDataException {
    if (longString == null) {
        throw new InvalidDataException();
    }
    HashMap<String, Integer> hashMap = new HashMap<String, Integer>();
    String mostCommon = null;
    int mostTimes = 0;
    StringTokenizer wholeString = new StringTokenizer(longString);
    while (wholeString.hasMoreTokens()) {
        String subString = wholeString.nextToken();
        Integer count = hashMap.get(subString);
        count = (count == null) ? 1 : count + 1;
        hashMap.put(subString, count);
        // Keep the running maximum up to date; mostTimes must be
        // updated together with mostCommon, and even a count of 1
        // must be considered in case every word is unique.
        if (count > mostTimes) {
            mostTimes = count;
            mostCommon = subString;
        }
    }
    return mostCommon;
}
Here is the algorithm:
- Prince, August 12, 2014
1. Create a map of <word, frequency>, i.e. <String, Integer>.
2. For every word, look it up in the map; if the lookup returns null, insert <word, 1>, otherwise read the value and increment it by 1.
3. After reading the string, iterate over the map and keep the word with the highest frequency.
Time complexity is O(n), where n is the number of words in the string.
Space complexity is O(m), where m is the number of unique words in the string.
Yes, having a dictionary will help, because we can remove invalid words by checking them against it. This reduces both space and time complexity. We can further normalize the data by collapsing words to their stems.
Instead of loading the input as one string, I would stream the data to avoid a memory spike. Note also that our map grows as new words are added. Additionally, we could use a map-reduce style job where we build a map<word, frequency> for each play in parallel and later join/collapse the maps; this reduces the time it takes to get a result.
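A minimal sketch of the streaming idea and the reduce step, assuming the input arrives as a Reader (class and method names are my own):

```java
import java.io.*;
import java.util.*;

public class StreamingCounter {
    // Count words one line at a time instead of holding the whole
    // text in memory as a single String.
    public static Map<String, Integer> count(Reader source) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                for (String word : line.split("\\s+")) {
                    if (!word.isEmpty()) counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // The "reduce" step: collapse per-partition maps (e.g. one per play)
    // that were produced in parallel into a single total.
    public static Map<String, Integer> mergeMaps(List<Map<String, Integer>> parts) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> part : parts) {
            part.forEach((word, n) -> total.merge(word, n, Integer::sum));
        }
        return total;
    }
}
```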
Finding the most common phrase should be no different from the algorithm above; however, we need to rebuild our index as <phrase, frequency>.
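The phrase variant can be sketched by sliding an n-word window over the text and counting each window as a key (a hypothetical helper, shown here for any phrase length n):

```java
import java.util.*;

public class PhraseCounter {
    // Count every n-word phrase and return the most frequent one.
    // For n = 2 this counts bigrams; for n = 1 it reduces to the
    // single-word algorithm above.
    public static String mostCommonPhrase(String text, int n) {
        String[] words = text.split("\\s+");
        Map<String, Integer> counts = new HashMap<>();
        String best = "";
        int bestCount = 0;
        for (int i = 0; i + n <= words.length; i++) {
            // Join the current window into one map key.
            String phrase = String.join(" ", Arrays.copyOfRange(words, i, i + n));
            int c = counts.merge(phrase, 1, Integer::sum);
            if (c > bestCount) {
                bestCount = c;
                best = phrase;
            }
        }
        return best;
    }
}
```

For example, mostCommonPhrase("to be or not to be", 2) returns "to be".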