## Microsoft Interview Question

Software Engineer / Developersif the root of T2, let say r is unique in T1, then this problem becomes simple. But since there could be thousands of r in T1, so we cannot afford to check every sub tree rooted at those r's. My algorithm given above first locate the only possible candidate and then check it. which is describe again below:

find the predecessor(with a different value) of r in T2, let's denote it s, the same way, find the successor(with a different value) of r in T2, let's denote it b. Then if T2 is a subtree of T1, then s and b must appear in T1, so we search s and b in T1, although there might be hundreds of s or b in T1, but that doesn't matter, because duplicates huddle togethor. So if we find s and b in T1, no matter which copy we find, their least common ancestor in T1 should be the same, let's denote it a, then if T1 contains T2, a must equals r, and T2 must be identical to the subtree rooted at a in T1.

Another way to restrict the possible candidates for r (and hence the subtrees in T1 to search) is to count the number of nodes in the subtree.

So, for tree T1, we compute the number of nodes in the subtree for each node in T1. This could be done by DFS. Then, we could find the number of nodes in T2, again using DFS. In T1, for each occurrence of r (the root value of T2), we check if it the counts computed for that node in T1 equals the that computed for the root of T2. If equals, then we need to perform a complete check; else, ignore.

Possibly, this could be combined with the filtering mechanism described in the first post to further cull down the candidate set.

Since, (inorder+preorder)or (inorder+postorder) traversals uniquely define a tree, we can use it to simplify the problem.

So, compute inorder(T1), preorder(T1), inorder(T2) and preorder(T2). Now, the problem is reduced to string matching i.e. we test to see if inorder(T2) is a substring of inorder(T1) and similarly for preorder traversal.

This should work for duplicates in the tree. But, there is a glitch, considering the following trees:

inorder(T1) - 3, 2, 4, 1

preorder(T1) - 1, 2, 3, 4

inorder(T2) - 3, 2

preorder(T2) - 2, 3

The method described says that T2 is a subtree of T1, however, the tree looks like:

T1-

1

2

3 4

T2-

2

3

Clearly T2 is not a subtree of T1!! So, to overcome this scenario, we could count the number of nodes in the subtree for each node in the tree T1 and T2, compare these values when a match is obtained from the above procedure.

Any comments, suggestions??

I don't agree with you. Since T1 has millions of nodes, it is impoosible to store or print the full path.

find the string representation of T2 ("flatten"). use a special character to represent a missing child - in this way, it will be unique. traverse T1. the idea is to flatten T1 similar to that of T2, but in this case, retain only the last hundred characters (size of T2) of the string representation of T1. when u hit a match, then u got the answer.

since we are doing partial string match, it is possible that the representation of a set of leaf nodes + another subtree to be equal to T2, when its really not. so use some other special characters after every leaf node in our "flattening" scheme. In that way, the leaf nodes + other partial subtree will not be equal to T2. Hope its not confusing

bool IssubTree(node *root1,node *root2)

{

if(root1 == NULL || root2 == NULL)

return false;

bool success = false;

if(root1->data == root2->data)

success = CheckNodes(root1,root2);

if(success)

return true;

success = IssubTree(root1->left,root2);

if(success)

return true;

return IssubTree(root1->right,root2);

}

bool CheckNodes(node *root1, node* root2)

{

if(root1 == NULL || root2 == NULL || root1->data != root2->data)

return false;

CheckNodes(root1->left,root2->data);

CheckNodes(root1->right,root2->right);

}

why are you writing the code? with out atlease giving a hint of what it is doing? Bcoz, even u dont know what it is doing right? see how dumb ppl are !

he is actually pretty clear. he is using a recursive method to check equality of nodes.

also the CheckNodes is wrong as pointed by the return conditions here is what works. Also not sure if your functions handle cases when a tree T2 is bigger than a subtree T1 which match upto certain nodes

T1.

X

\

13

/ \

11 14

T2

13

/ \

11 14

\

15

My implementation uses a FindTreeNoePtr function and a LevelOrder checking. I will post the solution after the break.

//0. Assume Result is true by default then you setup cases where it fails

//1. If my current NodePtrs are null exit don't even check anything or you get memory corruption for obvious reasons

//2. If my NodePtrs Data and Duplicate cnts do not match then your subtree fails so your result = FALSE.

//3. If i did not have any data mismatches at this point, now I need to recursivly check my children but before that I need to do the following.

// a. Need to Check if T1 has the left child when T2 has the left child, and also similarly for the right child

// If T1's subtree is missing a child that T2 has then it is an automatic fails so you result = FALSE

//4. If you result is still true keep going and check the children left followed by right. Again here if your

// left child fails then exit out, no need to check the right.

//5. Finally return the result down the recursive function call stack till it is sent back to the caller.

bool CheckNodesTrees(TreeNode *NodePtrT1, TreeNode *NodePtrT2){

bool Result = true;

if((NodePtrT1!=0 && NodePtrT2!=0)){

if((NodePtrT1->Data!=NodePtrT2->Data)||

(NodePtrT1->Duplicates!=NodePtrT2->Duplicates)){

Result = false;

}

if(Result){

if((NodePtrT1->Left==0 && NodePtrT2->Left!=0)||(NodePtrT1->Right==0 && NodePtrT2->Right!=0)){

Result = false;

}

if(Result){

Result = CheckNodesTrees(NodePtrT1->Left,NodePtrT2->Left);

if(Result){

Result = CheckNodesTree(NodePtrT1->Right,NodePtrT2->Right);

}

}

}

}

return Result;

}

first traverse T2 to find how many level and nodes in the tree; then traverse T1 for each node, calculate how many level and nodes the subtree rooted at this node has. if both match, then check if all the nodes in both tree (or subtree) are the same. If yes, one is found; otherwise, continue.

evaluate(root,root2,false,root2);

bool evaluate(struct node * root, struct node * root2, bool found, struct node * original_root2)

{

if(root == NULL && root2 == NULL) return true;

if(root == NULL && root2 != NULL) return false;

if(root != NULL && root2 == NULL) return true;

if(root->data == root2->data)

{

return ( evaluate(root->lptr,root2->rptr,true,original_root2) && evaluate(root->rptr,root2->rptr,true,originat_root2));

}

else if(found)

{

root2 = original_root2;

}

return (evaluate(root->lptr,root2,false,original_root2) || evaluate(root->rptr,root2,false,original_root2));

}

bool evaluate(struct node * root, struct node * root2, bool found, struct node * original_root2)

{

if(root == NULL && root2 == NULL) return true;

if(root == NULL && root2 != NULL) return false;

if(root != NULL && root2 == NULL) return true;

if(root->data == root2->data)

{

return ( evaluate(root->lptr,root2->lptr,true,original_root2) && evaluate(root->rptr,root2->rptr,true,original_root2));

}

else if(root->data == original_root2->data)

{

return ( evaluate(root->lptr,original_root2->lptr,true,original_root2) && evaluate(root->rptr,original_root2->rptr,true,original_root2));

}

else if(found)

{

root2 = original_root2;

}

return (evaluate(root->lptr,root2,false,original_root2) || evaluate(root->rptr,root2,false,original_root2));

}

one small question..

if T2 is sub tree of T1 then why do we need to compare data which is costly.. why not see if T2 is in T1.. simple.. use some traversal and check if T2 is in T1.

I don't understand your question ? if T2 < T1, how do you know this, other than acutally traversing the T1 with values of T2. That is what Swati's code above does, there are couple of cases she does not take into account which I pointed out.

I have a solution. If it is a repetition of one of the previous algos, please forgive.

1. Flatten T1, Flatten T2, (some traversal algo).

2. Now you have a big string and a small string.

3. Precompute skip array for the smaller string and do a string comparison. This is almost O(strlen(T2));

Step 3 is actually Knuth Morris Pratt algorithm. You can find it on Wikipedia.

Suggestions/comments/criticisms welcome.

Thanks,

Rohith Menon.

I think this is the solution: Traverse both the trees in-order and then check if the sequence generated by the smaller tree is a part of that generated the bigger tree. If that is the case, then the smaller one can be considered to be a part of the bigger tree.

Please correct me if I am wrong.

The linear data structure (string, list or array) obtained by visiting the tree in-order does not describe the tree unambiguously, i.e. two different trees can yield the same sequence.

But I think that the pair of sequences obtained from visiting in-order and then in pre-order uniquely describes the tree (the tree can be unambiguously reconstructed from the 2 sequences).

First of all, is it binary search tree?

I think we need to use recursive algo

bool isPartOf(Tree* tree1,Tree* tree2)

{

if(!tree2)

{

return 1;

}

if(!tree1)

{

return 0;

}

if(tree1->data=tree2->data)

{

if(isPartOf(tree1->left,tree2->left)&&(isPartOf(tree1->right,tree2->right))

return true;

}

else

return (isPartOf(tree1->left,tree2)||isPartOf(tree1->right,tree2));

}

Compute the Merkle hash tree of T1. This is a one time operation (I assume many T2s subtrees will be looked to see if they are subtrees of T1). Also, for each node in T1, compute the height of the subtree having the node as root. Build a hashtable which maps heigths of subtrees to a linked list which contains pointers to the roots of the subtrees of T1 which have the same height.

For each T2, compute the Merkle hash tree and store the hash of the root. Also, compute its height. Get the linked list which contains pointers to the trees of height height(T2). Iterate though the list, get the nodes in T1 which have the same height as T2, and compare the Merkle hash values with the Merkle hash value of the root of T2.

If the hash codes match, then T2 is a subtree of T1, with high probability. More precisely, if the hash values do not match, then T2 is not a subtree of T1. If the hash codes match, then you may check that the trees overlap to avoid false positives.

Since question says "character data", so considering 8-bit data (0-255). For each T1 and T2 we can construct a two hash table with keys from (0 - 255) using inorder(). Now compare the values of both hash tables such that hash value of T1 is always >= T2.

Value of Hash->key is the number of times a character appears in a given tree.

Check as you do a simultaneous preorder traversal.

Time: O(n)

Space: O(log(n))

Where n is the size of T1.

```
public static boolean isSubtree(TreeNode t1, TreeNode t2) {
Stack<TreeNode> pre1 = new Stack<>();
Stack<TreeNode> pre2 = new Stack<>();
pre1.push(t1);
pre2.push(t2);
while(!pre1.isEmpty()) {
TreeNode current1 = pre1.pop();
if(current1 != null) {
pre1.push(current1.left);
pre1.push(current1.right);
}
TreeNode current2 = pre2.pop();
if(match(current1, current2)) {
if(current2 != null) {
pre2.push(current2.left);
pre2.push(current2.right);
}
}else{
pre2.clear();
if(!match(current1, t2)) {
pre2.push(t2);
}else if(t2 != null) {
pre2.push(t2.left);
pre2.push(t2.right);
}
}
if(pre2.isEmpty()) return true;
}
return false;
}
public static boolean match(TreeNode t1, TreeNode t2) {
return t1 == null && t2 == null || (t1 != null && t2 != null && t1.val == t2.val);
}
```

This is the most time & space efficient solution I could come up with. Can anyone think of a way to solve it using constant space?

let r be the root of T2, from the root, we keep following the left child pointer downward, and find the first node which is not a duplicate of r, let it be s, the same way, from the root we keep following the right child pointer downward, and find the first node which is not a duplicate of r, let it be b. Then find the least common ancestor of s and b in T1, let it be a, then check the subtree rooted at a and T2 to tell whether they are identical.

- Lee December 20, 2007