Interview Question
Country: United States
Interview Type: Written Test
as any algorithm can not avoid to read through the file, I think the only item we can improve is the operation we do to record the repeat things.
we can use the bit the record the repeat history.
take a simpler number for example: the number is 1-10.
so we can using memery block 0000 0000 00 to record the repeat history,
if 1 is appear , then the first position of the block will be set to 1, so the history record block will be 0000 0000 01,
then anytime new number read in , check the block ,if the bit is 1, then it is repeated.
we can using another bit block to record the results...
We may want to use String-like hash function to calculate hash code of a number (omitting all dashes etc). In the first pass populate the map of hash code to a number of repeats. On the second pass find out the numbers matching unique hash codes. With low memory this can be done on multiple machines by creating a map on each machine which covers MAX_INTEGER%num_of_machines hashes only.
Solution-1: As described above, we can store the phone number in a hashmap that contains mapping as key=phoneNumber and value=boolean. Where boolean value represents whether we have exactly 1 (or) more than 1 occurrences of phone number.
-----
Solution-2: We can keep reading the phone numbers from the file and we can prepare a trie (or) prefix tree. For each phone number we can store its frequency in trie at the leaf level.
Maybe it's not best, but it is work - find each needed file and read it to string variables, then using regular expressions extract phone numbers and save to array. You need save in one of the formats(xxxxxxxxxx, xxxxx-xxxxx, xxx-xxx-xxxx). And if You choose xxxxxxxxxx, You need delete symbols "-". If You want I can write code using c#.net, send me examples only.
Better idea - all symbols(and 2 empty) that not number or "-" You replace with " "(empty)
then string[] strs = STRING_HTML.Split(" ");
then remove if(length != 10, 11, 12) and analyze.
I suspect they are looking for grep! :-)
- mag December 06, 2012