Facebook Interview Question
Front-end Software EngineersCountry: United States
Interview Type: Phone Interview
IMO they want you to create an inverted index.
We have an array of words, and an array of strings. Here is a simple approach:
1. Create an hash table where key = word, and values = array of indices of strings which contain that word.
2. Parse each of the strings in the strings array and update the hash table.
Search:
Break down the search query into words and return the indices of strings that contain that each of those words.
The best match is the string which is returned maximum number of times.
karanAThealthtapDOTcom
Please can you clarify the quiestion? You've got array of strings, how would you search string in dictionary?
Check whether string contains only words from dict or not?
That's the problem, that's how they asked me. And got me rejected :-(
Assume, dictionary to be a array of words.
Ok,so you've got a lot of patterns that you know in advance and you receive strings, you need to answer whether that string are represented in dictionary or not. I think you should preproceed dict, build something like trie (Aho-Corasik algo). Second, to be sure that all words from string are in dict Aho Corasik will be preferable. Also i suggest you to read about suffix trie and Ukkonen algo.
You don't need to use trie data Structure, is too complex for the solution.
Considering that the dictionary of strings is something like:
var validWords = {"You": true, "have": true, "a": true, "dictionary": true, "which": true, "is": true, "an": true, "array": true, "of": true, "words": true, "and": true, "array": true, "of": true, "strings": true};
And that the array of strings is something like:
var stringArrayToCheck = [
"You have a dictionary which is an array of words and array of strings",
"Write two functions",
"Prepare the array of strings to be searched in the dictionary",
"Check if the string contains all valid words or not"
];
To prepare the array of strings, you just need to iterate on it and create an array of arrays by splitting each string by space. So you will have each word as an element in the array.
And to check if the string is valid, you just need to check if each element in the created array exists in the dictionary.
Unfortunately, this is what I did and faced the rejection. The interviewer hinted me about prefixes so I believe they were indeed looking for TRIE
Too complex? What does that even mean?
Implementing a good "dictionary", say with a hashtable is complex too. Just because your favorite programming language does not have support for tries does not mean it is too complex, or off-the-shelf solutions won't exist.
"To prepare the array of strings, you just need to iterate on it and create an array of arrays by splitting each string by space" - I think array of sets would be better because if two words are the same then they are both valid or invalid.
Aside from this, I absolutely agree with the rest. That's a linear time algorithm. Asymptotically there is nothing better than this. No need for trie or whatever...
struct TrieTrieNode {
TrieNode() : isWord(false) {
for(int i = 0;i < 26;++i)
children[i] = 0;
}
~TrieNode() {
for(int i = 0;i < 26;++i)
if(children[i])
delete children[i];
}
TrieNode*& child(char c) {
//For Simplicity, converts all characters to lowercase
return children[tolower(c)-'a'];
}
TrieNode* children[26];
bool isWord ;
};
class Trie {
public:
Trie() {
root = new TrieNode;
}
~Trie() {
delete root;
}
void addWord(string word) {
TrieNode* curr = root;
for(int i = 0;i < word.length();++i){
char& c = word[i];
if(!isalpha(c))
continue;
if(!curr->child(c))
curr->child(c) = new TrieNode;
curr = curr->child(c);
}
curr->isWord = true;
}
bool searchWord(string word) {
TrieNode* curr = root;
for(int i = 0;i < word.size();++i) {
char& c = word[i];
if(!isalpha(c))
continue;
if(!curr->child(c))
return false;
curr = curr->child(c);
}
return curr->isWord;
}
private:
TrieNode* root;
};
To Build the Dictionary of words from a txt file:
Trie trie;
string temp;
ifstream fdDict("/path/to/dictionary/words/file");
while(!fdDict.eof()) {
fdDict >> temp;
trie.addWord(temp);
}
public class trieNode{
public char key;
public Hashtable< char, trieNode> children;
public static trieNode root;
public boolean isWord;
public trieNode( char k ){
key = k;
children = new Hashtable< char, trieNode> ();
isWord = false;
}
public static void constructTrie( ArrayList<String> input ){
for( int i = 0; i < input.size(); i ++ ){
String st = input.at( i );
n = root;
for( int j = 0; j < st.size(); j ++ ){
trieNode newN;
if( ( newN = n.children.get( st.charAt( j ) ) )== null ){
newN = new trieNode( st.charAt( j ) );
n.children.add( newN );
}
n = newN;
}
n.isWord = true;
}
}
public static boolean search( String word ){
trieNode n = root;
trieNode newN;
for( int i = 0; i < word.size(); i ++ ){
if( ( newN = root.children.get( word.charAt( i ) ) ) == null )
return false;
n = newN;
}
if( n.isWord )
return true;
return false;
}
}
A simple trie implementation in Python, using nested dictionary objects.
def trie(*words):
root = {}
for word in words:
curr_node = root
for letter in word:
curr_node = curr_node.setdefault(letter, {})
curr_node.setdefault(None, None)
return root
def trie_contains(trie, word):
curr_node = trie
for letter in word:
if letter in curr_node:
curr_node = curr_node[letter]
else:
return False
if None in curr_node:
return True
return False
A simple trie implementation in Java assuming the charset as alphabets:
/**
*
* @author neerajks
*/
public class Trie {
public TrieNode root = null;
private class TrieNode {
public boolean isLeafnode = false;
public TrieNode[] childNodes = new TrieNode[26];
}
public Trie() {
this.root = new TrieNode();
}
public void insertString(String s) {
int index=0;
s = s.toUpperCase();
TrieNode cur = root;
while(index<s.length()) {
int i = s.charAt(index)-'A';
if(cur.childNodes[i] == null) {
cur.childNodes[i] = new TrieNode();
}
//Check if it is leaf node
if(index==s.length()-1) {
cur.childNodes[i].isLeafnode = true;
break;
}
cur = cur.childNodes[i];
index++;
}
}
public boolean searchString(String s) {
TrieNode cur = root;
s = s.toUpperCase();
int index=0;
while(index<s.length()) {
int i = s.charAt(index) - 'A';
if(cur.childNodes[i] == null)
return false;
if(index == s.length()-1) {
if(cur.childNodes[i].isLeafnode)
return true;
else
return false;
}
cur = cur.childNodes[i];
index++;
}
return false;
}
}
$arr = array("abc","jl");
$strings = array("hey hey","no go");
$trie = new Trie($arr);
foreach($strings as $string) {
var_dump($trie->validateSentence($string));
}
class Trie {
public $root;
function __construct($wordArr) {
$this->root = array('val'=>'','children'=>array());
foreach($wordArr as $word)
$this->pushWord($this->root,$word);
}
public function validateSentence($string) {
$res = true;
$words = explode(" ",$string);
foreach($words as $word) {
$res &= $this->validateWord($this->root,$word);
}
return $res;
}
public function validateWord($node,$str) {
if (empty($str)) {
return true;
}
foreach($node['children'] as $index=>$child) {
if ($child['val'] == $str[0]) {
return $this->validateWord($node['children'][$index],substr($str,1));
}
}
return false;
}
public function pushWord(&$node,$str) {
if (!empty($str)) {
$found = false;
foreach($node['children'] as $index=>$child) {
if ($child['val'] == $str[0]) {
$found = true;
$this->pushWord($node['children'][$index],substr($str,1));
}
}
if (!$found) {
$newNode = array('val'=>$str[0],'children'=>array());
$node['children'] []= $newNode;
$this->pushWord($node['children'][0],substr($str,1));
}
}
}
}
If dictionary is too big to use hashtable, you can use binary search tree.
Definitely, to use in production you should either unsure that the tree is balanced(shuffle input for example) or implement balanced tree.
function BinaryNode(key, value){
this.key = key;
this.value = value;
this.left = null;
this.right = null;
}
BinaryNode.prototype.add = function(key, value){
var node = this;
if(node.key == null ){
node.key = key;
node.value = value;
return;
}
while(node){
if(node.key <= key){
if(node.right == null){
node.right = new BinaryNode(key, value);
return;
}else{
node = node.right;
}
}else{
if(node.left == null){
node.left = new BinaryNode(key, value);
return;
}else{
node = node.left;
}
}
}
}
BinaryNode.prototype.find = function(key){
var root = this;
while(root){
if(key == root.key)
return root.value;
if(key < root.key){
root = root.left;
}
else{
root = root.right;
}
}
return -1;
}
var obj = { mama: "мама", papa: "тато", word: "слово", job:"робота", tree: "дерево" };
var tree = new BinaryNode(null, null);
for(prop in obj){
tree.add(prop, obj[prop]);
}
console.log(tree);
console.log(tree.find("mama"));
Is this the inverted index Karan talked about?
var query = ['facebook', 'apple', 'google']
var strings = [
'facebook is a company',
'google is good',
'facebook apple google'
]
function prepare(strings) {
var hash = {}
strings.forEach(function(item, i) {
var words = item.split(' ')
words.forEach(function(word) {
hash[word] = hash[word] || []
hash[word].push(i)
})
})
return hash
}
function fullsearch(strings, query) {
var hash = prepare(strings)
var intersects, indexes
for (var i = 0, l = query.length; i < l; i++) {
indexes = hash[query[i]]
if (!indexes) return null
if (!intersects) {
intersects = indexes
continue
}
if (intersects.length === 0) {
return null
}
var n = intersects.length, seen, m
while (n--) {
m = indexes.length
seen = false
while (m--) {
if (indexes[m] === intersects[n]) {
seen = true
break
}
}
if (!seen) {
intersects.splice(n, 1)
}
}
}
// return the index of strings that contains all words
return intersects
}
console.log(fullsearch(strings, query));
I think you should use trie data Structure in this.
- Adi December 11, 2013