Need to write a efficient algo

Interview Question

1

of 1 vote

11
Answers
Need to write a efficient algo. how to find Max repeated word from a Given String for Example "How do you do" the result would be "2 times repeated word is "do""
- sunny gupta July 22, 2013 in India | Report Duplicate | Flag | PURGE
Algorithm

Email me when people comment.

An error occurred in subscribing you.

Country: India
Interview Type: Written Test

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 2 vote

Use a hashtable with words as keys and occurrence word-count as values. Also make use of 2 running variables 1) max_repeated_word and 2)max_count.

0. Every time a word is read from the input, check if it is already present in the hash table. If not add to it with word-count =1.
1. If the word is already present, increment the word-count. Now check this word-count against max_count. If word-count > max_count, max_count := word-count and max_repeated_word := current word.
2. Repeat steps 1 and 2 until all the input is read.
3. print out max_repeated_word

- Murali Mohan July 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Hi Ayahuasca ,
Is there any possibility to put the entire string in array and then we can write an algo.

- Anonymous July 22, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@Anonymous,

Without using extra space, or by using a constant amount of extra space, I can think of a naive algorithm, that given a word, compares it with all other words to find out the one with maximum occurrence count. Time complexity for this would be O(n^2).

Alternatively, in order to be more space-efficient, we can use a trie to save the words and maintain a count variable at each node. This saves space in the sense, all the words that have a common prefix will have the prefix characters specified only once in the trie.

While building the trie, keep track of the word that has maximum occurrence count in the list of words seen so far.

- Murali Mohan July 22, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Here is the code to solve this problem using array with O(n) time complexity:
a. First for all the words calculate the key value and make an array of the key value. Here key value is calculated by multiplying the integer value of character by 10. Also if the character is uppercase then it is converted to lower case as ARe and are are same words.
b. Now if the words are repeated then they will have the same key value. So now problem reduces to find the number in the array repeated maximum number of times.
c. This is done by taking the count array and initializing it to 0 and further incrementing the count value for every key value of the array taken as index.
d. Then take that key value and then using that key find the corresponding index in the string array. (here it is found by using the space count)
e. Then take that start index and then using that start index print the word till the space is not encountered.

#include <stdio.h>
#include <conio.h>
#include <string.h>
#include <math.h>
#include <limits.h>
#define MAX_SIZE 32768
void findRepeatedWord(char str[],int n)
{
	int i,k=0;int num=0,numArray[MAX_SIZE];
	for(i=0;str[i]!='\0';i++)
	{
		if(str[i]>=65&&str[i]<=90)
		{
			str[i]=str[i]+32;
		}
		if(str[i]!=' ')
		{
			num=num+10*str[i]-str[i+1];
		}
		else
		{
			numArray[k]=num;
			k=k+1;
			num=0;
		}
	}
	numArray[k]=num;
	k=k+1;
	int count[MAX_SIZE];
	int max_count=0;
	for(i=0;i<MAX_SIZE;i++)
	{
		count[i]=0;
	}
	for(i=0;i<k;i++)
	{
		count[numArray[i]]++;
	}
	int max=count[0],max_index,idx;
	for(i=0;i<MAX_SIZE;i++)
	{
		if(count[i]>max)
		{
			max=count[i];
			max_index=i;	
		}
	}
	for(i=0;i<k;i++)
	{
		if(max_index==numArray[i])
		{
			idx=i;
			break;
		}
	}
	int start_index,count_space=0;
	for(i=0;i<n;i++)
	{
		if(idx==count_space)
		{
			start_index=i;
			break;
		}
		if(str[i]==' ')
		{
			count_space=count_space+1;
		}
	}
	printf(" Maximum word repeated is ");
	for(i=start_index;str[i]!=' ';i++)
		printf("%c",str[i]);
	printf(" with %d number of count ",max);
}
int main()
{
	char str[]="Are are you are my boy are";
	int n=strlen(str);
	findRepeatedWord(str,n);
}

- vgeek July 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

>> a. First for all the words calculate the key value and make an array of the key value. Here key value is calculated by multiplying the integer value of character by 10. Also if the character is uppercase then it is converted to lower case as ARe and are are same words.

Your method will not work if, along with "are", there are words like "ear" and "era" in the sentence, as they all compute to the same value. Even if you change the computation of key to something like multiplying each character by 10^x(or by using some radix(r), r^x) based on the position of the character in the word, the chances of collision are high.

For that matter, no matter what mapping function you use to map the strings to (a finite set of) numbers, there are going to collisions and to resolve them, you will have to go for a data-structure like a hashtable. A hashtable uses hash-functions(which do better distribution of numbers) and readily provides you a collision resolution mechanism.

- Murali Mohan July 22, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Thanks for pointing this out. I made a foolish mistake by not considering the repeated values. But now i have updated the code. It will work for every string. Please tell me where it will not work. And the updation is done on the key value.

- vgeek July 22, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Use a Trie , store the count of the words at the leaves, keep track of max count.

- EOF July 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Storing count at the leaves may not work. How do you track the count if the sentence contains words as "bet"and "better"?

- Murali Mohan July 22, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

sort the words... and count the repeated words by comparing adjacent words... time complexity O(nlogn)...

- Anonymous July 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

Build a suffix tree in linear time with respect to length of string.

Check for words starting at the root beginning with a non alpha character. The word ends when you hit another alpha char. Check for special cases obviously. If you detect a word, count how many end chars there are further down the tree. The branch starting with a complete word and has the most end leaves is your winner.

In your example, the tree will have a branch that goes
<space> <d> <o> <end>
<space><y><o><u><space><d><o><end>

CareerCup

Interview Question

Books

Videos

Resume Review

Mock Interviews