Given a string and print the w

Microsoft Interview Question for Software Engineer in Tests

0

of 0 votes

27
Answers
Given a string and print the words which have the same exact letters. Space is the only seperator.
For example, abc bbe bca derwe eeb
it will print abc bca.
abc bbe abcd derwe eeb will print nothing.
- Nina November 18, 2008 | Report Duplicate | Flag | PURGE
Microsoft Software Engineer in Test Algorithm

Email me when people comment.

An error occurred in subscribing you.

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 0 vote

Basically, this question is to find all anagrams in a given sentence.
I'll present only the idea here:
(1) for each word generate a key- for eg., for eeb, the key is "bee", i.e., sorted. The sorting takes o(n) time, where n is the length of the word. This can be achieved by using a count array of size 26 (assuming only a-z), where each slot gives how many times an alphabet occurs.

(2) After computing the key, add it to a hash map, where the key is the above key, and value is the list of words (or a pointer to it). Add the word to the value list for this key.

(3) After repeating (2) for each word in the sentence, iterate the hash map, and for each key such that the value list has more than 1 element, print the list.

SPACE: O(len) where len is the number of characters. Basically this space is needed for the hash map. But typically hash maps from library preallocate space, so the actual space used depends on the library characteristics.

TIME: O(len). this is because keys are generated on the fly while traversing the sentency from left to right. The count array is reset at the start of the next word, which is after a space.

- acoader@gmail.com November 19, 2008 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Group all words of equal size and add the elements and determine the total. Words with equal total will have the same distribution of words. ASCII value of a=97, b=98, c=99, d=100, e=101
abc => a + b + c => 97 + 98 + 99 => 294
bbe => b + b + e => 98 + 98 + 101 => 298
bca => b + c + a => 98 + 99 + 97 => 294
eeb => e + e + b => 101 + 101 + 99 = 301

Since can store the above in a table and all those with the same total will have the same words.

- Krish November 20, 2008 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

This technique might not work as the total could be easily recounted for some other set of alphabets ... for example, ada/aad(97+97+100) would yield the same total as that of abc, which do not contain same letters. So I guess this approach might not work.

- Anonymous November 20, 2008 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Can you please explain your code?

- Anonymous November 20, 2008 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

1) Associate a Prime number with each letter.
2) Hash by product of the word letter.
3) Maintain a list of words for each key
4) Iterate and print all lists having more than one word

- Anonymous November 20, 2008 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Great idea!

- XYZ November 20, 2008 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

however, will not work if the word is very long. The product of all characters of this word will overflow easily.

- XYZ November 20, 2008 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

For each word, compute an array letter[26]={}.
For example abc 's array is { 1, 1, 1, 0, ..... 0}.
Then view each array as a binary number and compute the corresponding decimal number.
Sort all the decimal result and print out the words have the same decimal numbers.

- jordan November 21, 2008 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

what if the word as repeating alphabets? The binary notation may fail.

- Anonymous November 28, 2008 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Ze code:

#include <map>
#include <string>
#include <vector>
#include <iostream>
using namespace std;

void Tokenize(const string& str,
vector<string>& tokens,
const string& delimiters = " ")
{
// Skip delimiters at beginning.
string::size_type lastPos = str.find_first_not_of(delimiters, 0);
// Find first "non-delimiter".
string::size_type pos = str.find_first_of(delimiters, lastPos);

while (string::npos != pos || string::npos != lastPos)
{
// Found a token, add it to the vector.
tokens.push_back(str.substr(lastPos, pos - lastPos));
// Skip delimiters. Note the "not_of"
lastPos = str.find_first_not_of(delimiters, pos);
// Find next "non-delimiter"
pos = str.find_first_of(delimiters, lastPos);
}
}

void printSame(string sent) {
map<string, vector<string> > anagrams;
string hash_key;
vector<string> words;
Tokenize(sent, words, " ");
for (int i = 0; i < words.size(); i++) {
hash_key = words[i];
sort(hash_key.begin(), hash_key.end());
anagrams[hash_key].push_back(words[i]);
}
map<string, vector<string> >::iterator itr;
vector<string> curAnas;
for (itr = anagrams.begin(); itr != anagrams.end(); itr++) {
curAnas = itr->second;
if (curAnas.size() > 1) {
for (int i = 0; i < curAnas.size(); i++) {
cout << curAnas[i] << ' ';
}
}
}
}

int main(int argc, char** argv) {
string sentence(argv[1]);
printSame(sentence);
return 0;
}

- Anonymous November 21, 2008 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

A better solution
for each letter assign a digit > 0 e.g. a=1, b=2 ...

now compute the sum,multiplication and length triplet for each word

e.g. abc => (6, 6, 3)
cba => (6, 6, 3)
aad => (6, 8, 3)

use triplet as a key in Hashmap and let list of string be the value stored against it.

Traverse the Hashmap and print all the values.

- Anonymous November 21, 2008 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

multiplication - can cause overflow, so this solution will not work in those cases..(even addition can cause overflow).

- acoader November 22, 2008 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Algorithm:
struct word
{
int len; // length of word;
int alpha[26]; // count a to z numbers
int iStart; // start char index in string
}

1. go through the whole string; find every word and fill the data of every word.
O(n) where n is the length of the string.
Space: sizeOf(word) x m, where m is the number of word.
2. Sort on m word by word.len, O(m);
Ignore those lens that do duplicates. Compare their alpha[i], i=0 to 25 for those have the same length. If same, print out.
This is the tough part.

- kulang December 02, 2008 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

void sortstr(char *str)
{
int i=0,j=0;
char ch;
int len=strlen(str);
for(i=0;i<len;i++)
{
for(j=0;j<len-1-i;j++)
if(str[j]<str[j+1])
{
ch=str[j];
str[j]=str[j+1];
str[j+1]=ch;
}
}
}

void anagram(char *str)
{
char *tmp;
char arr[100][100]={'\0'};
int i=0,j=0,k=0,len=0,spaces=0,flag=0;
while(str[i]!='\0')
{
if(str[i++]==' ')
spaces++;
}
len=strlen(str);
if(len>0)
{
tmp=(char *)malloc(len);
memset(tmp,'\0',len);
for(i=0,k=0;str[i];i++)
{
j=0;
while(str[i]!=' '&&str[i]!='\0')
{
tmp[j++]=str[i++];
}
sortstr(tmp);
strcpy(arr[k],tmp);
k++;
memset(tmp,'\0',len);
}
for(i=0;i<spaces+1;i++)
{
for(j=i+1;j<spaces+1;j++)
{
if(!strcmp(arr[i],arr[j]))
flag=1;
}
if(flag)
{
flag=0;
printf("%s\n",arr[i]);
}

}
}
}

int main()
{
char str[] = "abc hello bca elloh dell till hir";
anagram(str);
getch();
return 0;
}

- santosh December 03, 2008 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

O(n) algo:
=========use std::set<string>=====
1. traverse the sentence. For each word, reform the word as follows
: "aab"->"a2b1". "ccaa"->"a2c2".

For each char c in the every word, Use int alphabet[26]: alphabe[c-'a']++. Then convert alphabet[26] to string like "a2c2". Put the modified string into the set.

O(n) traversal in total.
==========
2. for the next word, if set.find() is true, then it is an anagram.
The set is usually implemented by binary tree, so O(logN) for find. You can use hashtable as well, it would be O(1).

- XYZ December 04, 2008 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Sort all words in string. Then use trie to find matches.

- Anonymous January 19, 2009 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

How about this?
1. go through the sentence, for each word, sort it to the standard form. That is, abc->abc, bca->abc, eeb->bee, etc. But always keep a pointer to its original form. For each word, we need O(m) by using counting sort. Totally we need O(mn), n is # of words.
2. We let a->1, b->2, etc., then we convert each word to a decimal number. For example, abc->123. aad->114. It takes O(mn) in total.
3. Sort the array. For repeated numbers, print the original form. O(nlogn).

- summer February 17, 2009 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

probably this is a better "step 3" in your algorithm. Once we mapped every word to a decimal number at "step 2", we just need to scan the whole list, and use a hash table to check whether there are duplicates.

for instance,
input: abz eeb bza
step1: abz eeb abz
step2: 1126 552 1126

- XW December 25, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Can it be done in more simpler way: O(n * 2)
Read all words in array 'inputWords'

for i = 0 to inputWords.length {
    for j = 0 to inputWords.length {
        if(i == j || inputWords[i].length != inputWords[j].length) 
           continue;
        String lhs = inputWordsp[i];
        String rhs = inputWordsp[j];
        // Fetch each character of lhs, search the first occurence 
        // and replace it with blank
        // When all characters of lhs are exhausted, if rhs is blank
        // then lhs and rhs are anagrams.

    }
}

- Anonymous March 03, 2009 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

sort the words .. i.e. acb --> abc etc..

then do aX (101)^1 + bX(101)^2 + cC(101)^3 and so on...
put the products into hashmaps..

if any two have a hash collision they match..
this something similar to Rabin Karp algo
http://en.wikipedia.org/wiki/Rabin-Karp_string_search_algorithm

- Anonymous April 15, 2009 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

sort the words .. i.e. acb --> abc etc..

then do aX (101)^1 + bX(101)^2 + cX(101)^3 and so on...
put the products into hashmaps..

if any two have a hash collision they match..
this something similar to Rabin Karp algo
http://en.wikipedia.org/wiki/Rabin-Karp_string_search_algorithm

- Anonymous April 15, 2009 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

There are some improvements that can be made.. like conversion between stringstream to string once etc..

string finddups(string in){
	string out="";
	map<string,string> sign,matches;
	
	if(in.size()==0)
		return out;
	
	stringstream ss(in);
	string word;
	while(ss >> word){
		string sorted(word);
		sort(sorted.begin(),sorted.end());
		if(sign.find(sorted) != sign.end())
			matches.insert(pair<string,string>(sorted,word));
		else
			sign.insert(pair<string,string>(sorted,word));
	}
	if(matches.size()==0)
		return out;
	
	word="";
	stringstream ss1(in);
	while(ss1 >> word){
		string sorted(word);
		sort(sorted.begin(),sorted.end());
		if(matches.find(sorted) != matches.end())
			out +=word +" ";		
	}
   
	
	return out;
}

- prolificcoder April 23, 2009 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

The idea is to use two hashes(kind of expensive but in interview this is straight forward to code) one- 'sign' to hold the sortedorder -> word. Another checks with the sortedorder hash and if present inserts into 'match' hash. Now again iterate over the string and if the sorted order of that string is present in match spit it out.

Idea is based on non-repeated chars from PIE

- prolificcoder April 23, 2009 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

1.sort the letters in each words...n-words...max m-letters...O(nm)
2.sort the words with sorted letters lexicographic order...O(nlogn)
3.scan the sorted words...find the repeated words.O(n)..
total...O(nm+nlogn)..

- sd December 05, 2009 | Flag Reply

Comment hidden because of low score. Click to expand.

-2

of 0 vote

// Solved this way. Is there a better approach (without STL and other libraries)?

#define INPUT_STR "abc sjhd oahuf iad p pofdf ohdc pod oijsdf cba oidsf dai oif p iodf d oijsdf upls osofs psdf pjscp sfdo psj cpojd sdfop sjiof pncdis adi siojf k psfj o oisjdf e osdjf p ojsdf lkioodf"

template <class T>
void qsort(T *ap_arr, int a_start, int a_end) // Quick sorting algorithm
{
if (a_start < a_end)
{
int i, j, m;
T c;
for (j = a_end, i = a_start, m = 1; i < j; m > 0 ? j-- : i++)
{
if (ap_arr[i] > ap_arr[j])
{
c = ap_arr[i];
ap_arr[i] = ap_arr[j];
ap_arr[j] = c;
m = -m;
}
}
qsort(ap_arr, a_start, i - 1);
qsort(ap_arr, i + 1, a_end);
}
}

struct WordDesc // A struct to remember the found words
{
WordDesc *mp_nextDesc;
unsigned int m_checkSum; // A chack sum of a word
char *mp_start; // Points to the beginning of the word.
unsigned int m_len; // The lenght of the word (stored for the sake of speed).
};

int _tmain(int argc, _TCHAR* argv[])
{
char inputStr[] = INPUT_STR;
char *strBeg = inputStr;
char *strIdx = strBeg;
WordDesc *p_listHead = NULL, *p_listTail = NULL; // List pointers
while (' ' == *strIdx) strIdx++; // Skip spaces in the beginning

while (*strIdx != 0)
{
strBeg = strIdx;
unsigned int len = 0;
while ((' ' != *strIdx) && (0 != *strIdx)) // Get the length of the first word
{
strIdx++;
len++;
}

char *p_tmpStr = new char[len]; // Since we can not change words, we have to create a copy for sorting
if (p_tmpStr)
{
for (int i = 0; i < len; i++) p_tmpStr[i] = strBeg[i];
qsort(p_tmpStr, 0, len - 1); // Sort it

unsigned int checkSum = 0;
for (int i = 0; i < len; i++)
{
checkSum += p_tmpStr[i] * (i + 1); // "Unique" check sum
}
checkSum += len;

delete [] p_tmpStr;

WordDesc *p_newItm = new WordDesc; // Create a struct
if (p_newItm)
{
if (NULL == p_listTail)
{
p_listHead = p_newItm; // Init the head
}
else
{
p_listTail->mp_nextDesc = p_newItm; // Add to the list
}

p_listTail = p_newItm; // Init the struct
p_newItm->mp_nextDesc = NULL;
p_newItm->m_len = len;
p_newItm->m_checkSum = checkSum;
p_newItm->mp_start = strBeg;
}
else
{
break; // The result is likely to be wrong due to no memory for proper analysis
}
}
else
{
break; // The result is likely to be wrong due to no memory for proper analysis
}
while (' ' == *strIdx) strIdx++;
}
// Now we have to analyse the result and while we traverse it, we also clean it
while (p_listHead)
{
unsigned int doPrint = 0;
p_listTail = p_listHead;

while (p_listTail) // Is there a struct with the same check sum as the first one in the list?
{
p_listTail = p_listTail->mp_nextDesc;
if (p_listTail)
{
if (p_listHead->m_checkSum == p_listTail->m_checkSum)
{
doPrint = p_listHead->m_checkSum; // Yes, we found one.
break;
}
}
}

p_listTail = p_listHead;
if (doPrint) // There are structs with equal check summs, so we print those and delete
{
WordDesc *p_listPrev = NULL;
while (p_listTail)
{
if (p_listTail->m_checkSum == doPrint)
{
for (int i = 0; i < p_listTail->m_len; i++) printf("%c", p_listTail->mp_start[i]);
printf("\n");

if (p_listPrev)
{
p_listPrev->mp_nextDesc = p_listTail->mp_nextDesc;
delete p_listTail;
p_listTail = p_listPrev->mp_nextDesc;
}
else
{
p_listHead = p_listTail->mp_nextDesc;
delete p_listTail;
p_listTail = p_listHead;
}
}
else
{
p_listPrev = p_listTail;
p_listTail = p_listTail->mp_nextDesc;
}
}
}
else
{
p_listHead = p_listHead->mp_nextDesc; // The chack summ in the first structure of the list is unique, so we delete it.
delete p_listTail;
p_listTail = p_listHead;
}
}
return 0;
}

- AL November 20, 2008 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

do you really expect people to go through this?

- NK June 01, 2009 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@AL whats wrong with you ?

who will read this code? you could have posted the idea alone.

- peace November 12, 2009 | Flag

CareerCup

Microsoft Interview Question for Software Engineer in Tests

Books

Videos

Resume Review

Mock Interviews