Google Interview Question
Software Engineer / DevelopersCountry: United States
Nice solutions!
Minor point being your solution would fail if last node of the tree has the max frequency. In this case, if(curFreqVal != node->val) condition will not be hit and hence the maxFreqVal will not be updated.
Hi, Sid.
Thank you for your pointing!!
Yes, you are right. The solution didn't control the special case.
Thus I modified 'saving the result' part in the 'inorder' method.
And I added more Node to test this case. You can test it! Thanks!!
Space is definitely not O(1) - you are doing recursive calls there.
If you really want a O(1) space solution, then it cant be achieved in O(n) time it will be n**2
You may even think in terms of augmented data structures, recursion, hashtable, there is no o(n) + o(1) solution
it's O(n) time and O(logn) space given that it's a balanced BST. Recursion's depth is the height of BST
You could pass in a hashmap which counts up the occurrences of each element, and then select the element with the largest count, or you could just get the in-order traversal and then count the longest run.
Once you simplify the complexity function it is the same amount of space, but the hashmap takes up twice the amount of space as the recursive inorder traversal. If you are dealing with a tree which is tens of thousands of nodes, the space complexity can cause an issue.
I think the problem needs some clarification - if there are 2 or more elements with the biggest frequency do you need to return them all or any? Do you need to return the element(s) or just the frequency? If you just need to return the frequency, an in order traversal, and keep track of the highest frequency at any given step would be enough and would be O(N) time, O(1) memory.
If this is one time process then the any of above solution would suffice , but if its recurring thing then we can augment the BST with node storing the frequency count for that node element , the frequency count would be incremented the decremented during the insertion and deletion in the BST. To get frequency count just do the BST traversal , find the element and output the frequency count from the node.
Your alternate approach will be good if we need to find the frequency of given node. The reason being a node can be present in both left or right sub-tree. So you need to add both the frequencies to get the total count. To find the node with max frequency dont you think this approach will be too complex to handle
Finding the most frequent item in a sorted list is O(n) in time and O(1) in space. (We just iterate through the list and keep track of the current most-frequent number & its frequency.)
An in-order traversal of a BST will give a sorted list.
So, at worst we should have O(n) time and O(1) space...
void list::Count_Occuerence_BST(node *temp)
{
if(temp==NULL)
return;
if(temp->lchild ==NULL && temp->rchild ==NULL)
{
return;
}
if(temp->rchild ==NULL && temp->lchild!=NULL)
{
if(temp->lchild->data ==temp->data)
{
Increment(temp->data);
Count_Occuerence_BST(temp->lchild);
}
}
if(temp->lchild==NULL && temp->rchild !=NULL)
{
if(temp->rchild ->data==temp->data)
{
Increment(temp->data);
Count_Occuerence_BST(temp->rchild);
}
}
if(temp->lchild && temp->rchild)
{
if((temp->lchild->data< temp->data) && ( temp->rchild ->data > temp->data))
{
}
if(temp->lchild->data ==temp->data && temp->rchild->data >temp->data)
{
Increment(temp->data);
}
if(temp->lchild->data < temp->data && temp->rchild->data ==temp->data)
{
Increment(temp->data);
}
if(temp->lchild->data ==temp->data && temp->rchild->data ==temp->data)
{
Increment(temp->data);
Increment(temp->data);
}
Count_Occuerence_BST(temp->lchild);
Count_Occuerence_BST(temp->rchild);
}
}
void list::Increment(int val)
{
int arr[20][2];
if(len==0)
{
arr[0][0]=val;
arr[0][1]=2;
return;
}
for(int i=0;i<len;i++)
{
if(arr[i][0]==val)
{
++arr[i][1];
}
}
arr[++len][0]=val;
arr[len][1]=2;
}
do a in-order traversal of the given tree, each node in the BST will also contain the frequency of occurence. We can create a max heap heap out of BST elements and give the required k most frequently elements of BST tree.
Time complexity (nlogn)
Space complexity O(n)
As the question said, "in this BST we can have leftnode<=rootnode<=rightnode.". It implies that one node in this BST may have nodes of same value in its right child node or left child node or both.
So, I use post-order traversal for this BST.
In each steps in recursion, we first count the most frequently value of nodes in left child nodes and right child nodes of current node. Also, count the node which has the same value of current node. After that, comparing the counts of most frequently values in both left and right child nodes with the counts of current node value.
After comparison, this node return an array to his parent node which contains three elements:
result=[counts of value in parent node, count of most frequently value in this sub BST, count of most frequently value in this sub BST]
class Element(object):
def __init__(self, value, lChild,rChild):
self.value = value
self.lChild = lChild
self.rChild = rChild
def findMostFrequentlyElement(node,parentValue):
value = node.value
count = 1
#assume 0 is not in array. If not, use other element instead.
subLResult = [0,0,0]
subRResult = [0,0,0]
if(node.lChild == None and node.rChild == None):
if(parentValue == node.value):
return [1,0,0]
else:
return [0,1,node.value]
if(node.lChild != None):
subLResult = findMostFrequentlyElement(node.lChild,node.value)
if(node.rChild != None):
subRResult = findMostFrequentlyElement(node.rChild,node.value)
result=[subLResult[0] + count + subRResult[0], subLResult[1],subLResult[2], subRResult[1], subRResult[2]]
parentValueCount = 0
if(parentValue == node.value):
parentValueCount = result[0]
if(result[1] >= result[3]):
return [parentValueCount, result[1],result[2]]
else:
return [parentValueCount, result[3],result[4]]
elif(parentValue == result[2]):
parentValueCount == result[1]
if(result[0] >= result[3]):
return [parentValueCount, result[0],node.value]
else:
return [parentValueCount, result[3], result[4]]
elif(parentValue == result[4]):
parentValueCount == result[3]
if(result[1] >= result[0]):
return [parentValueCount, result[1],result[2]]
else:
return [parentValueCount, result[0], node.value]
else:
if(result[0] >= result[1]):
if(result[0] >= result[3]):
return [0, result[0],node.value]
else:
return [0, result[3],result[4]]
else:
if(result[1] >= result[3]):
return [0, result[1],result[2]]
else:
return [0, result[3],result[4]]
You can use inorder traverse first and store all of the nodes in an array, and then traverse the array again from left to right. Since we have a BST, I think when we use the inorder traverse we can create an sorted array. Then we just need to count the elements and find out the most frequently occurring one. I think time complexity is O(n) since we need to traverse twice. Space is O(n) since we need to use an array to store all elements. For the better algorthim we can use probably O(1) space, which means we can check and count the element while we traversing the tree.
Code in C#:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace G2013041003
{
class Program
{
static void Main(string[] args)
{
int k=3;
int[] array0 = { 4, 10, 15, 24, 26 };
int[] array1 = { 0, 9, 12, 20 };
int[] array2 = { 5, 18, 22, 30 };
LinkedList<int>[] list = new LinkedList<int>[k];
for(int i=0;i<k;i++)
{
if(i==0)
{
list[i] = new LinkedList<int>(array0);
}
if(i==1)
{
list[i] = new LinkedList<int>(array1);
}
if(i==2)
{
list[i] = new LinkedList<int>(array2);
}
}
LinkedListNode<int>[] p = new LinkedListNode<int>[k];
for(int i=0;i<k;i++)
{
p[i] = list[i].First;
}
int minrange = -1;
int minindex= -1;
int maxindex = -1;
do
{
minindex = -1;
maxindex = -1;
for (int i = 0; i < k; i++)
{
if (minindex == -1 || p[i].Value < p[minindex].Value)
{
minindex = i;
}
if (maxindex == -1 || p[i].Value > p[maxindex].Value)
{
maxindex = i;
}
}
if (minrange == -1 || minrange > p[maxindex].Value - p[minindex].Value)
{
minrange = p[maxindex].Value - p[minindex].Value;
}
p[minindex] = p[minindex].Next;
}
while (p[minindex] != null);
Console.WriteLine(minrange);
Console.Read();
}
}
}
A simpler and more efficient recursive solution, written in Python (but any OO language would do, with a few tricks to wrap multiple return values).
This solution is O(n) for time and O(1) for space [O(h), where h is max the height of the tree, if we consider the space required by the stack].
Even if both left and right subtrees might contain the same value as their parent, for a given node either its left [right] child holds the same value, or that value won't be found in its left subtree.
So for each node the number of occurrencies of its value in the subtree rooted in that node is equal to 1 plus the sum of the occurrencies of that value in the left subtree (either 0 if node->left->value != node->value or the number of occurrencies for left subtree otherwise) plus the occurrencies of the same value in the right subtree:
freq(node) = 1 + (node->left->value == node->value ? freq(node->left) : 0) + (node->right->value == node->value ? freq(node->right) : 0)
Using recursion when we compute freq(node) we take advantage of the values of freq() already computed for left and right subtrees.
def most_frequent_element(self):
def rec_mFE(node):
if node is None:
return None, 0, 0
freq_node = 1
left_child = node.get_left()
if left_child is None:
mFE_left, freq_mFE_left, freq_child_left = None, 0, 0
else:
mFE_left, freq_mFE_left, freq_child_left = rec_mFE(node.get_left())
if left_child.get_value() == node.get_value():
freq_node += freq_child_left
right_child = node.get_right()
if right_child is None:
mFE_right, freq_mFE_right, freq_child_right = None, 0, 0
else:
mFE_right, freq_mFE_right, freq_child_right = rec_mFE(node.get_right())
if right_child.get_value() == node.get_value():
freq_node += freq_child_right
if freq_mFE_left > freq_mFE_right:
mFE = mFE_left
freq_mFE = freq_mFE_left
else:
mFE = mFE_right
freq_mFE = freq_mFE_right
if freq_node > freq_mFE:
return node.get_value(), freq_node, freq_node
else:
return mFE, freq_mFE, freq_node
return rec_mFE(self.__root)
int maxfreq,maxcount=0,count=0,freq;
void freqNode(node *root){
if(!root)
return;
static node *inpred=NULL;
freqNode(root->left);
if(!inpred){
freq=root->data;
count=1;
}
else{
if(inpred->data==root->data)
count++;
else{
if(maxcount<count){
maxcount=count;
maxfreq=freq;
}
freq=root->data;
count=1;
}
}
inpred=root;
freqNode(root->right);
}
Minor Doubt:
most frequently occurring number means same number is repeating in Binary Search Tree.
Here my doubt is if we have 10,10,10 to insert in BST
we will insert first 10 as a root
then will process second 10 and will match with first 10
10 not< 10
10 not>10
so we can not insert second 10 in left or right as per these 2 above rule
Is there any rule for 10==10 to insert in BST ???
I agree with brighama's saying. If we traverse this BST with Inorder, we can consider this is sorted array. Then we can get the most frequent node in [ Time : O(n), Space : O(1) ]. This is the code using C++.
- jinil April 18, 2013