Find the most frequently occur

Google Interview Question for Software Engineer / Developers

0

of 0 votes

27
Answers
Find the most frequently occurring element in a BST. In this BST we can have leftnode<=rootnode<=rightnode.
- iwanna April 09, 2013 in United States | Report Duplicate | Flag | PURGE
Google Software Engineer / Developer Algorithm

Email me when people comment.

An error occurred in subscribing you.

Country: United States

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 9 vote

I agree with brighama's saying. If we traverse this BST with Inorder, we can consider this is sorted array. Then we can get the most frequent node in [ Time : O(n), Space : O(1) ]. This is the code using C++.

#include <iostream>
using namespace std;

struct TreeNode{
    int val;
    TreeNode *left;
    TreeNode *right;
    TreeNode(int val_)
    :val(val_), left(NULL), right(NULL){}
};

int maxFreqVal;
int maxFreqCount;

void inorder(TreeNode *node, int &curFreqVal, int &curFreqCount){
    if(node == NULL)
	return;

    inorder(node->left, curFreqVal, curFreqCount);
    if(curFreqVal != node->val){
	curFreqVal = node->val;
	curFreqCount = 1;
    }else{
	curFreqCount++;
	if(curFreqCount > maxFreqCount){
	    maxFreqVal = curFreqVal;
	    maxFreqCount = curFreqCount;
	}
    }
    inorder(node->right, curFreqVal, curFreqCount);
}

int getFreq(TreeNode *root){
    if(root == NULL)
	return -1;
	
    maxFreqVal = root->val;
    maxFreqCount = 1;
    int curFreqVal = root->val;
    int curFreqCount = 1;
    inorder(root, curFreqVal, curFreqCount);
    return maxFreqVal;
}

int main(){
    TreeNode *root = new TreeNode(10);
    root->right = new TreeNode(12);
    root->right->left = new TreeNode(12);
    root->right->left->left = new TreeNode(12);
    root->right->left->right = new TreeNode(12);
    root->right->right = new TreeNode(12);
    root->right->right->left = new TreeNode(12);
    root->right->right->right = new TreeNode(12);
    root->left = new TreeNode(6);
    root->left->left = new TreeNode(6);
    root->left->left->left = new TreeNode(4);
    root->left->left->right = new TreeNode(6);
    root->left->right = new TreeNode(6);
    root->left->right->left = new TreeNode(6);
    root->left->right->right = new TreeNode(6);
    root->left->right->right->right = new TreeNode(7);
    int result = getFreq(root);
    cout << "result : " << result << ", frequency : " << maxFreqCount << endl;
    return 0;
}

- jinil April 18, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Nice solutions!
Minor point being your solution would fail if last node of the tree has the max frequency. In this case, if(curFreqVal != node->val) condition will not be hit and hence the maxFreqVal will not be updated.

- Sid April 28, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Hi, Sid.
Thank you for your pointing!!
Yes, you are right. The solution didn't control the special case.
Thus I modified 'saving the result' part in the 'inorder' method.
And I added more Node to test this case. You can test it! Thanks!!

- jinil April 28, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Space is definitely not O(1) - you are doing recursive calls there.
If you really want a O(1) space solution, then it cant be achieved in O(n) time it will be n**2

You may even think in terms of augmented data structures, recursion, hashtable, there is no o(n) + o(1) solution

- EK MACHCHAR May 10, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

it's O(n) time and O(logn) space given that it's a balanced BST. Recursion's depth is the height of BST

- Anonymous May 25, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Could you tell me why the most frequent element 6 in your output has frequency 3? 6 appears 7 times.

- chandeepsingh85 October 20, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 7 vote

You could pass in a hashmap which counts up the occurrences of each element, and then select the element with the largest count, or you could just get the in-order traversal and then count the longest run.

- Anonymous April 09, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Using hash map will consume too much additional space.

- Anonymous April 12, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

hashmap will consume the same space as your recursive inorder

- EK MACHCHAR May 10, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Once you simplify the complexity function it is the same amount of space, but the hashmap takes up twice the amount of space as the recursive inorder traversal. If you are dealing with a tree which is tens of thousands of nodes, the space complexity can cause an issue.

- Andrew Jones May 28, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I think the problem needs some clarification - if there are 2 or more elements with the biggest frequency do you need to return them all or any? Do you need to return the element(s) or just the frequency? If you just need to return the frequency, an in order traversal, and keep track of the highest frequency at any given step would be enough and would be O(N) time, O(1) memory.

- EugenDu May 28, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

If this is one time process then the any of above solution would suffice , but if its recurring thing then we can augment the BST with node storing the frequency count for that node element , the frequency count would be incremented the decremented during the insertion and deletion in the BST. To get frequency count just do the BST traversal , find the element and output the frequency count from the node.

- gags April 09, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Your alternate approach will be good if we need to find the frequency of given node. The reason being a node can be present in both left or right sub-tree. So you need to add both the frequencies to get the total count. To find the node with max frequency dont you think this approach will be too complex to handle

- spider April 10, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

Finding the most frequent item in a sorted list is O(n) in time and O(1) in space. (We just iterate through the list and keep track of the current most-frequent number & its frequency.)

An in-order traversal of a BST will give a sorted list.
So, at worst we should have O(n) time and O(1) space...

- brighama April 10, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Again, there is no O(n) in time and O(1) in space algo.
You are not tracking lets say second most frequent item and then suddenly you have the second most frequent item a lot of times.
You dont know now how many times it has occurred in the past, so now what will your algo do?

- EK MACHCHAR May 10, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

void list::Count_Occuerence_BST(node *temp)

{



	if(temp==NULL)

		return;

		if(temp->lchild ==NULL && temp->rchild ==NULL)
		{
			return;
		}
		if(temp->rchild ==NULL && temp->lchild!=NULL)
		{
			if(temp->lchild->data ==temp->data)
			{
			Increment(temp->data);
			Count_Occuerence_BST(temp->lchild);

			}
		}
		if(temp->lchild==NULL && temp->rchild !=NULL)
		{
			
				if(temp->rchild ->data==temp->data)
				{
				Increment(temp->data);
				Count_Occuerence_BST(temp->rchild);

				}
			

		}
		
				
		if(temp->lchild && temp->rchild)
		{


	if((temp->lchild->data< temp->data) && ( temp->rchild ->data > temp->data))
			{
				

			}	
	if(temp->lchild->data ==temp->data && temp->rchild->data >temp->data)
		{
			Increment(temp->data);
		}
		
	if(temp->lchild->data < temp->data && temp->rchild->data ==temp->data)
		{
			Increment(temp->data);
			
		}	
		
	if(temp->lchild->data ==temp->data && temp->rchild->data ==temp->data)
		{
			Increment(temp->data);
			Increment(temp->data);
			
		}

		   Count_Occuerence_BST(temp->lchild);
			Count_Occuerence_BST(temp->rchild);
		}
	
}


void list::Increment(int val)
{
	
	int arr[20][2];
	
	if(len==0)
	{
		arr[0][0]=val;
		arr[0][1]=2;
		return;
	}
	for(int i=0;i<len;i++)
	{
		if(arr[i][0]==val)
		{
			++arr[i][1];
		}

	}
	arr[++len][0]=val;
	arr[len][1]=2;
}

- Vidya April 09, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

@vidya: what is the logic?

- aka April 09, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

It's wrong.

- Vidya April 12, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

do a in-order traversal of the given tree, each node in the BST will also contain the frequency of occurence. We can create a max heap heap out of BST elements and give the required k most frequently elements of BST tree.

Time complexity (nlogn)
Space complexity O(n)

- googlybhai April 09, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

True only if we are allowed to modify the data structure.

- Anonymous April 12, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

As the question said, "in this BST we can have leftnode<=rootnode<=rightnode.". It implies that one node in this BST may have nodes of same value in its right child node or left child node or both.

So, I use post-order traversal for this BST.
In each steps in recursion, we first count the most frequently value of nodes in left child nodes and right child nodes of current node. Also, count the node which has the same value of current node. After that, comparing the counts of most frequently values in both left and right child nodes with the counts of current node value.
After comparison, this node return an array to his parent node which contains three elements:
result=[counts of value in parent node, count of most frequently value in this sub BST, count of most frequently value in this sub BST]

class Element(object):
    def __init__(self, value, lChild,rChild):
        self.value = value
        self.lChild = lChild
        self.rChild = rChild

def findMostFrequentlyElement(node,parentValue):
    value = node.value
    count = 1
    #assume 0 is not in array. If not, use other element instead.
    subLResult = [0,0,0]
    subRResult = [0,0,0]
    if(node.lChild == None and node.rChild == None):
        if(parentValue == node.value):
            return [1,0,0]
        else:
            return [0,1,node.value]
    if(node.lChild != None):
        subLResult = findMostFrequentlyElement(node.lChild,node.value)
    if(node.rChild != None):
        subRResult = findMostFrequentlyElement(node.rChild,node.value)
    result=[subLResult[0] + count + subRResult[0], subLResult[1],subLResult[2], subRResult[1], subRResult[2]]
    parentValueCount = 0
    if(parentValue == node.value):
        parentValueCount = result[0]
        if(result[1] >= result[3]):
            return [parentValueCount, result[1],result[2]]
        else:
            return [parentValueCount, result[3],result[4]]
    elif(parentValue == result[2]):
        parentValueCount == result[1]
        if(result[0] >= result[3]):
            return [parentValueCount, result[0],node.value]
        else:
            return [parentValueCount, result[3], result[4]]
    elif(parentValue == result[4]):
        parentValueCount == result[3]
        if(result[1] >= result[0]):
            return [parentValueCount, result[1],result[2]]
        else:
            return [parentValueCount, result[0], node.value]
    else:
        if(result[0] >= result[1]):
            if(result[0] >= result[3]):
                return [0, result[0],node.value]
            else:
                return [0, result[3],result[4]]
        else:
            if(result[1] >= result[3]):
                return [0, result[1],result[2]]
            else:
                return [0, result[3],result[4]]

- Paragonlight April 09, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

You can use inorder traverse first and store all of the nodes in an array, and then traverse the array again from left to right. Since we have a BST, I think when we use the inorder traverse we can create an sorted array. Then we just need to count the elements and find out the most frequently occurring one. I think time complexity is O(n) since we need to traverse twice. Space is O(n) since we need to use an array to store all elements. For the better algorthim we can use probably O(1) space, which means we can check and count the element while we traversing the tree.
Code in C#:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace G2013041003
{
    class Program
    {
        static void Main(string[] args)
        {
            int k=3;

            int[] array0 = { 4, 10, 15, 24, 26 };
            int[] array1 = { 0, 9, 12, 20 };
            int[] array2 = { 5, 18, 22, 30 };

            LinkedList<int>[] list = new LinkedList<int>[k];

            for(int i=0;i<k;i++)
            {
                if(i==0)
                {
                    list[i] = new LinkedList<int>(array0);
                }
                if(i==1)
                {
                    list[i] = new LinkedList<int>(array1);
                }
                if(i==2)
                {
                    list[i] = new LinkedList<int>(array2);
                }
            }

            LinkedListNode<int>[] p = new LinkedListNode<int>[k];

            for(int i=0;i<k;i++)
            {
                p[i] = list[i].First;
            }

            int minrange = -1;
            int minindex= -1;
            int maxindex = -1;
            do
            {
                minindex = -1;
                maxindex = -1;
                for (int i = 0; i < k; i++)
                {
                    if (minindex == -1 || p[i].Value < p[minindex].Value)
                    {
                        minindex = i;
                    }
                    if (maxindex == -1 || p[i].Value > p[maxindex].Value)
                    {
                        maxindex = i;
                    }
                }
                if (minrange == -1 || minrange > p[maxindex].Value - p[minindex].Value)
                {
                    minrange = p[maxindex].Value - p[minindex].Value;
                }
                p[minindex] = p[minindex].Next;
            }
            while (p[minindex] != null);
            Console.WriteLine(minrange);
            Console.Read();
        }
    }
}

- Akira6 April 11, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

{{{ int CountSame (TreeNode *node) { }}} {{{ int count = 1; }}} {{{ if (node->Left && node->Left->Data == node->Data) { }}} {{{ count += CountSame(node->Left); }}} {{{ } }}} {{{ if (node->Right && node->Right->Data == node->Data) { }}} {{{ count += CountSame(node->Right); }}} {{{ } }}} {{{ node->Visited = true; }}} {{{ } }}} {{{}}} {{{ int MaxFreq(TreeNode *root) { }}} {{{ int maxCount = CountSame(root); }}} {{{ int maxVal = root->Data; }}} {{{ queue <TreeNode *> nodeQ; }}} {{{ if (root->Left) { }}} {{{ nodeQ.push(root->Left); }}} {{{ } }}} {{{ if (root->right) { }}} {{{ nodeQ.push(root->Right); }}} {{{ } }}} {{{ while (!nodeQ.Empty()){ }}} {{{ TreeNode *curr = nodeQ.pop(); }}} {{{ if (!curr->Visited) { }}} {{{ int count = CountSame(curr); }}} {{{ if (count > maxCount) { }}} {{{ maxCount = count; }}} {{{ maxVal = curr->Data; }}} {{{ } }}} {{{ } }}} {{{ curr->Visited = false; }}} {{{ if (curr->Left) { }}} {{{ nodeQ.push(curr->Left); {{{ } }}} {{{ if (curr->Right) { }}} {{{ nodeQ.push(curr->Right); }}} {{{ } }}} {{{ } }}} {{{ return maxVal; }}} - Peevee April 16, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

A simpler and more efficient recursive solution, written in Python (but any OO language would do, with a few tricks to wrap multiple return values).
This solution is O(n) for time and O(1) for space [O(h), where h is max the height of the tree, if we consider the space required by the stack].

Even if both left and right subtrees might contain the same value as their parent, for a given node either its left [right] child holds the same value, or that value won't be found in its left subtree.
So for each node the number of occurrencies of its value in the subtree rooted in that node is equal to 1 plus the sum of the occurrencies of that value in the left subtree (either 0 if node->left->value != node->value or the number of occurrencies for left subtree otherwise) plus the occurrencies of the same value in the right subtree:

freq(node) = 1 + (node->left->value == node->value ? freq(node->left) : 0) + (node->right->value == node->value ? freq(node->right) : 0)

Using recursion when we compute freq(node) we take advantage of the values of freq() already computed for left and right subtrees.

def most_frequent_element(self):
    def rec_mFE(node):
      if node is None:
        return None, 0, 0
      
      freq_node = 1
      left_child = node.get_left()
      if left_child is None:
        mFE_left, freq_mFE_left, freq_child_left = None, 0, 0
      else:
        mFE_left, freq_mFE_left, freq_child_left = rec_mFE(node.get_left())
        if left_child.get_value() == node.get_value():
          freq_node += freq_child_left
      
      right_child = node.get_right()
      if right_child is None:
        mFE_right, freq_mFE_right, freq_child_right = None, 0, 0
      else:
        mFE_right, freq_mFE_right, freq_child_right = rec_mFE(node.get_right())
        if right_child.get_value() == node.get_value():
          freq_node += freq_child_right   
          
      if freq_mFE_left > freq_mFE_right:
        mFE = mFE_left
        freq_mFE = freq_mFE_left
      else:
        mFE = mFE_right
        freq_mFE = freq_mFE_right
        
      if freq_node > freq_mFE:
        return node.get_value(), freq_node, freq_node
      else:
        return mFE, freq_mFE, freq_node
    
    return rec_mFE(self.__root)

- Anonymous April 24, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Choices - O(1) space + O(n**2) time OR O(n) in both

- EK MACHCHAR May 10, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

int maxfreq,maxcount=0,count=0,freq;
void freqNode(node *root){
     if(!root)
              return;
     static node *inpred=NULL;
     freqNode(root->left);
     if(!inpred){
                freq=root->data;
                count=1;
     }
     else{
          if(inpred->data==root->data)
                                      count++;
          else{
               if(maxcount<count){
                                 maxcount=count;
                                 maxfreq=freq;
               }
               freq=root->data;
               count=1;
          }
     }
     inpred=root;
     freqNode(root->right);
}

- Anonymous May 25, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Minor Doubt:

most frequently occurring number means same number is repeating in Binary Search Tree.
Here my doubt is if we have 10,10,10 to insert in BST
we will insert first 10 as a root
then will process second 10 and will match with first 10
10 not< 10
10 not>10
so we can not insert second 10 in left or right as per these 2 above rule
Is there any rule for 10==10 to insert in BST ???

- PKT June 13, 2013 | Flag Reply

CareerCup

Google Interview Question for Software Engineer / Developers

Books

Videos

Resume Review

Mock Interviews