Interview Question
Senior Software Development EngineersCountry: United States
Interview Type: Written Test
1. Hash table: Because of the possibility of collisions, each insertion could take O(m) time, where m is the number of tokens or substrings, which is O(n). So complexity of inserting all tokens is worst-case O(m^2).
2. As for selection, selecting the kth element without sorting is O(n) in the worst case, but you have to do that O(m) times, m being the number of words/tokens.
I decided that an approach using suffix trees is the best for this problem. Suffix trees can be constructed in O(n) time, n being the size of the text string. One doesn't have to tokenize it beforehand - it is done for free when inserting the whole text into the tree. At the end node of each word, store pointers to the ends of the word in the text and a count of the current number of occurrences. At the end collect all these <word, # of occurrences> pairs, and do a radix sort. We thus end up with an O(n) algorithm.
Any comments are welcome.
I dont think Alex solution is so bad, he is asking to maintain count not insert in hash table, which should be constant theoretically, practically unless I insert same set of keys, collision will hardly occur so this works as O(n). Coming to selection you can use max heap of size k, to get max k elements in a constant time. nlog(k) is complexity, in a worst case of k=n(or approaching n), heap insertion becomes constant.
By the way, what is the interviewer response to your solution
Code is as below-
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
/**
* @author kedar dixit
*
*/
public class WordCount {
public static void main(String[] args) {
Scanner scanner = new Scanner(System.in);
System.err.println("Enter A String");
String string = scanner.nextLine();
Map<String, Integer> map = new HashMap<>();
String[] tokens = string.split(" ");
int maxVal = 0;
for (String token : tokens) {
map.put(token, (map.get(token) == null ? 0 : map.get(token)) + 1);
if (maxVal < map.get(token)) {
maxVal = map.get(token);
}
}
System.err.println(map);
System.out.println(maxVal);
}
}
Sample Input and Output-
Enter A String
Enter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A StringEnter A String
{A=28, String=1, Enter=1, StringEnter=27}
28
If by tokens you mean words, keep a hash table to maintain count. If you mean characters, keep an array of size 256 to maintain count.
- alex July 16, 2014