Kyle
BAN USERAlso worth noting you should get at least a warning in most compilers for implementation 2 for the conversion from a const to non const value.
Decided to verify:
populate.cpp: In function 'void populate(char**)':
populate.cpp:23:11: warning: deprecated conversion from string constant to 'char *' [Wwritestrings]
Here's what I came up with before looking up the mathematical direction to solve this problem:
This solution isn't perfect as it requires a sanity check when doing the final XOR checks. But unordered set lookup has a constant average time complexity on average.
C++11 Solution:
pair<int, int> findMissingTwo(const unordered_set<int> &numbers)
{
/*
Note:
1) XOR all the elements, let the result of XOR be X1.
2) XOR all numbers from 1 to n, let XOR be X2.
3) XOR of X1 and X2 gives the missing number.
*/
auto it = numbers.begin();
int n = 1; // Keep track of n.
int t1 = *it; // Running Total.
int x1 = *(it++); // XOR all elements.
int x2 = n++; // XOR 1 to n.
int missing = 0; // Total missing
while (it != numbers.end())
{
t1 += *it;
x1 ^= *(it++);
x2 ^= n++;
}
// Two numbers missing from set, XOR with x2.
x2 ^= n++;
x2 ^= n;
// Sum of missing values = Known total  running total.
// Note: We do have a potential for overflow here.
missing = (n * (n + 1) / 2)  t1;
// XOR from 1 to N and see if X1^X2 is the missing number.
for (int cur_test = 1; cur_test <= n; cur_test++)
{
int x1_t = x1^cur_test;
int needed_value = missing  cur_test;
// if value isn't larger than n
// and value isn't itself
// and XOR of X1 and X2 gives the missing number
if (
needed_value <= n &&
needed_value != cur_test &&
needed_value == ((x1_t)^x2)
)
// Way to eleminate check? Note, unordered_set lookup is very quick.
if (numbers.count(cur_test)  numbers.count(needed_value))
cout << "Bad Value Guessed (" << cur_test << "," << needed_value << ") Keep Looking..." << endl;
else
return {cur_test, (needed_value)};
}
// This should not hit, but need to return something.
return {1,1};
}

Kyle
April 06, 2013 I didn't see a solution in this direction.
Python:
def getDigit (index):
digits = 0
min = None
max = 1
while (True):
digits += 1
min = max + 1
max = int ('9' * digits)
size = (max  min + 1) * digits
if (size > index):
break
index = size
number = min + (index / digits)
digit = index % digits
return int (str(number)[digit])
Unit tested every index from 0 to 38890. aka numbers 0 to 9999 and it seems to be good.
 Kyle February 05, 2013This can probably be optimized, here is my first attempt:
bool isMatching (const string &str, const string &pattern)
{
vector <string> search;
size_t cur_pos = 0;
size_t prev_pos = 1;
while ((cur_pos = pattern.find ("*", cur_pos)) != string::npos)
{
if (prev_pos != 1 && prev_pos != cur_pos)
{
search.push_back (pattern.substr (prev_pos, cur_pos  prev_pos));
}
prev_pos = ++cur_pos;
}
cur_pos = 0;
for (const auto &x : search)
{
cur_pos = str.find (x, cur_pos);
if (cur_pos == string::npos)
return false;
cur_pos++;
}
return true;
}
Note: Code written in C++11.
 Kyle November 24, 2012@Mani  Correct powers of 10 would work also but, I chose base 2 because it allows you to use the shift operator instead, and if you have to worry about overflow you'll be able to index the largest number possible.
Slight adjustment should probably be 2^11, 2^21, ..., 2^321, ... This way you'll be able to check the top end, otherwise you'll over flow and miss the last 2^xx range.
Syed: The example only works if there is one odd number 2 and 3 both occur an odd number of times so it wouldn't work in this case.
This works because when a number xor's with it self the result is zero.
1^1 = 0, 2^2=0,...
So if all values occur in pairs except for 1, the result is the odd one out.
Since the array is 'unlimited' in size we need to find a valid 'end' before we start searching.
1. Find an end (check 2^1,2^1,2^2,2^3...) if it's bigger than the number searched or you get an exception that's your end.
2. Do a binary search range 2^(i1)  2^i for your number when you find the value save the index but keep searching via binary search. Cover examples like [1,2,(100,000 2's), 3,4,5,..................]
3. When your finished with all binary searches in the found range the last seen index is the largest location of that number.
Python Example using RegEx:
import re
all_words = ["HELL", "HILL", "HELP", "HALF", "NINE", "HELLO"] # Sample Words
def findMatches (search_string):
return [x for x in all_words if re.match (re.sub ("_", ".", search_string) + "$", x)]
matches = findMatches ("H_L_")
print matches

Kyle
November 20, 2012 I recently had an interview with Amazon, and was asked one simple question while we were trying to get the code site link working (about 23 mins to explain, very simple). Then one 'real' question close to the one asked in this discussion. I'm not sure how you would get more than two technical questions in an hour interview.
I was already 2030 minutes into the interview after I was done talking about my myself and what I have worked on in the past. Then after the technical questions we talked for another 15+ mins about the company, city, etc, etc etc.
So 1 hour, 15, 2 , 30, +12 (went over the hour a little bit), so I had around 25mins to come up with a solution, code the solution, talk about other directions that could be taken to solve the problem, and coding a 2nd solution to the problem.
If I was perfect (or knew the answer before hand) I could see getting more than one question, but it sounded like the phone interview was meant to have one question asked.
I'll found out in a couple days if I made it to the next step, good luck to you too.
Open Chat in New Window
1. I would ask the interviewer if they meant a binary search tree, and not a basic binary tree. A simple binary tree isn't used for searching it is simply a way to store data with left and right nodes, so really has no comparison with the other two types of trees.
 Kyle April 06, 2013I'll assume they clarified after my question that it is in fact a binary search tree to compare against.
Binary Search trees do not need to be balanced, so depending on the order of insertions the tree can become unbalanced quickly and search times could potentially vary a lot. This is the most basic search tree, and inserts and deletes have very simple logic.
AVL trees are always at the optimal balance point, but can slow down inserts and deletes because they never allow that tree height to be outside the 1 to 1 range. But you will have the fastest look times.
Red black trees are also self balancing but can become slightly imbalanced to improve insert and delete times, with a small potential hit to search times.
So if a system updates the tree often I would lean toward using a red black tree, and if the tree is updated very rarely then a AVL tree would be the better direction.
If the data will come in a "controlled manner" that we will not likely have a large imbalance than a BST could be an option but it would still be good to look at one of the self balancing tree structures.