## Microsoft Interview Question

Software Engineer / Developers**Country:**United States

**Interview Type:**In-Person

So, according to this approach, there will be n array of 10[for digits-0 to 9]probably an array of linked list. Whenever a phone number needs to be inserted, we will take out each digit & associate its pointer[through which we can display the phone number] to the linked list. Good thinking, But what to do when the phone number contains duplicates.

Also, don't you think that we will be storing the same phone numbers multiple times?

Duplicates: The inverted index should have tf scores associated with each phone number. So for example, phone# 898765432 will have tf score of 2 for index 8.

Yes, the same phone number will be stored at most 10 times and this will make the inverted index very large. We can avert this by increasing the size of the dictionary from 10 to say 100. Now the indices will be from 00 to 99 instead of 0 to 9.

I have thought one approach.More like as suggested by @eugene.

Let me know whether it is efficient.

Use TRIE data structure where phone numbers will be inserted in sorted order. In the leaf nodes, we will have pointers to the original phone number[the phone numbers will be in the form of strings right!].

In order to handle collisions[two different phone numbers after sorting may represent the same leaf node],

we can have linked list at the leaf nodes. Whenever a new phone number is being inserted & the leaf node is already having pointer to some other phone number, we will chain this phone number[create a new node in the linked list & insert its pointer there].

Now, During search operation, lets say 321, first sort the string to be searched[say 123 in this case].

After reaching 123, an inorder traversal would be sufficient.Whenever a leaf node is encountered, all the strings[phone numbers] can be accessed through single linked list.

Comments are welcome.

I think you are right..we will have to first sort all the numbers and storing actual numbers in the leaf nodes.

As far as I understand, trie is used for kind of pattern matching. If I understand your proposal correctly, it means that we can reach a list of numbers (stored in leaf node) following the digit pattern of phone number (sorted) from trie root.

Perfectly fine till now. But, what if the number is 1234567890 and we are supposed to find number which include 456. There is no path from root for digit '4'. In trie, you have to have a path from root for each key to reach to key's value (which is our case is a list of phone numbers containing key digits).

In case I have misunderstood, please help to clarify.

We will ignore all other digits and will keep searching till we find required digits or we reach the leaf node.

If we find all the digits till we reach the leaf or before, we will give all the numbers stored in that leaf node.

Since all numbers are sorted , we will have to search in all branches which have number less than the first digit of pattern..(pattern is also sorted.)

As we have 10 numbers in phone number. What's the point of making trie for it . All phone number will appear in leaf node of single link of trie i.e. 0->1->2->3....->9 -> all nodes of numbers. so we can display all numbers.

yes it is a right way.Also we can have each node in the trie such that they contain the position last matched number from the sequence and the subsequent nodes can either contain the position in the parent node or a next position depending on whether at that node we dont find the matching number of the required sequence or we do find matching number

@shondik

Can you please elaborate your answer.

Because you have told different phone numbers will have same leaf.. how is it possible? we can have linked list at the leaf nodes. Whenever a new phone number is being inserted & the leaf node is already having pointer to some other phone number, we will chain this phone number[create a new node in the linked list & insert its pointer there].? i am not clear with this..

suppose phone number is 9840404040 then 44 , 04, 84 etc all shud give this number..

it would be great if u can explain it with an example.

For every phone number create 10 bit value. Set in "1" those bits whose digit presents in the phone number.

For example number 9456 will have value 1001110000

For searching match numbers check following condition:

(A & B) == B

A - value for number from phone book

B - value for typed number

What about duplicates digits? Phone numbers can also contain duplicates,right.

Also , user may want to type number like 2324

to shondik:

Good question :-)

It is not clear in desctiption about duplicates.

But anyway I don't like this my solution. The second my solutions in this thread is better. It work fast and work good. But somebody vote "-1" without any explanation.

I think this could work if there is a restriction on size of input............

For Ex, if the input is assumed to be only 4 digit long.....Then your method can be modified by now using 2 bit instead of 1 bit for each digit....

2207

(00)(00)(01)(00)(00)(00)(00)(10)(00)(01)

I suppose a Ternary Search Tree would give a fitting solution, but I am not confident about its implementation.

@eugene, if the phone numbers be sorted then how the TRIE is going to be handy.

Say numbers are 9657 & 9756.

After sorting both reduce to same number. How you are going to store them in TRIE.

There is no need to sort the mobile number strings. We make an array of TrieNodes of size 10. The indices of this array would represent numbers from 0-9. Now each of these 9 slots in the array would have another array of TrieNodes of size 10 and so on.

So if 6 is entered then everything under the first array[5] would be displayed.

@eugene You mean to say that the mobile numbers are stored somewhere else as strings and the those mobile numbers are stored in the trie in sorted way.

And since every mobile number is 10 digits we can have the pointer to the string (actual mobile no.) in the last node itself.

@eugene I think u are correct since order is not important.......But I don't understand why u are storing the pointers to the actual strings......Instead u can just store a count in each node of tries

What do you mean by "any order " here ??

Do 69234567... also appear in the search result ??

string f = "926";

HashSet<string> data = new HashSet<string>();

data.Add("2916");

data.Add("932678");

data.Add("92678");

data.Add("222222");

data.Add("722629");

data.Add("9777726");

data.Add("926926");

bool isToDisplay;

int lastInx;

foreach (var d in data)

{

lastInx = 0;

isToDisplay = true;

foreach (var i in f.Select((v, i) => new { Value = v, Index = i }))

{

if (!d.Contains(i.Value) || (lastInx > d.IndexOf(i.Value))) { isToDisplay = false; break; }

lastInx = d.IndexOf(i.Value);

}

if (isToDisplay) Debug.WriteLine(d);

}

String NumbersList = "9934176,993424,9923206,9933736,9837236,9344636,9639236";

String SearchList = "926";

foreach (string match in NumbersList.Split(','))

{

bool matched = false;

foreach (char hasnum in SearchList.ToCharArray())

{

if (match.Contains(hasnum))

{

matched = true;

}

else

{

matched = false;

}

if (matched == false)

{

break;

}

}

if (matched == false)

{

continue;

}

Console.WriteLine("matched with " + match);

}

Console.Read();

1. A 10 bit array indicating whether digit is there in number or not

2. A 100 bit array indicating two consecutive digits. For convenience this can be shrunk to 50 bits by binning arbitrarily

So now we have 60 bits to represent the phone number, great it fits in a long int.

For input number we find what the first array of 10 bits should be, and what the second array should be, let's call this bit pattern B. AND with the input and check if identical to B, this is a candidate - now we can verify if the input number is there or not. For 1 digit no need to verify. For 2 digits take account of binning if using 50 bits. For more digits need to check each match which shouldn't be so difficult because potential number of matches will be small.

This problem can be solved using prime numbers

Let's say A is an array which contains first 10 prime numbers

A = { 2,3,5,7,11,13,17,19,23,29}

unsigned long int multnum =1, multsearchnum =1 ;

foreach digit of number

{

multnum *= A[digit];

}

foreach digit of (searchNumber)

{

multsearchnum *= A[digit];

}

if { multnum % multsearchnum == 0} {

print ( "all digits of search string are present in the number");

}

I didn't find any failing case till now. Please let me know if you find any failing case.

Create 10 linked lists of pointers at phone numbers.

Phone number 9456 will be presented in "9" and "4" and "5" and "6" linked lists at the same time.

When user press number "9" we print all items from "9" linked list.

Then user press number 6 we print all items from "9" and "6" linked lists.

For protect from printing the same number many times we can make variable "checked" for every phone number and don't print checked numbers. It work if linked list have only pointers at phone numbers.

use a inverted hash table. This is similar to how search engine works. for every digit(0-9) associate each digit with a set of phone nos(which have that digit). once the query is fired, just take the intersection of the sets.

- Nishanth June 26, 2012