Microsoft Interview Question
Software Engineer / DevelopersNot necessary to use KMP;
From my ponint of view, just traverse two strings from the beginning,
if not match, jump to the first char of next string in objective string and set the position of pattern string back to 0 and then compare again.
Multistring is composed of strings, so each of them should start after '\0' except the first one and end before next '\0'. That's my understanding.
It will find the required string in minimum no of iterations.
void getStringFromMultiString(char *source, char* dest)
{
char *temp = source;
int index = 0;
while(*temp != '\0')
{
if(strcmp(temp,dest))
{
printf("%s\n",temp);
index += strlen(temp) + 1;
temp = (temp + strlen(temp) + 1);
}
else
{
printf("%s found at Index : %d\n",dest, index);
return;
}
}
}
This is not a KMP problem, and applying it will give you the wrong answer.
If you are searching for 'dog', you would not accept 'catdog' --- KMP is a substring search, and it would.
This does assume that there are no zero-length strings, as their representation in this language would be ambiguous (an array of two zero-length strings would be \0\0\0)
char *position(char *multi, char *search) {
cursor = 0;
while(multi[cursor] != '\0') {
if(multi[cursor] != search[cursor]) {
while(multi++ != '\0');
cursor = 0;
}
else cursor++;
}
if(cursor > 0) return multi;
return -1;
}
In English:
cursor represents the length of the longest prefix matched. When we encounter a non-matching character, skip to the first character of the next string. Otherwise, increment that cursor by one. When we get a null, we've either reached the end of both strings (and cursor is nonzero) or we reached a zero-length string (and cursor is zero).
I haven't tested this yet, but it seems sound.
This solution is somewhat more efficient than the previous version that relied on str...() functions. Both run in linear time.
I have programmed the following just for fun.
// This function finds the first occurrence of "string" in
// multistring "mstr";
// An example is "mstr = cat\0dog\0monkey\0\0";
char *Find_In_MultiString( char *mstr, char *string )
{
int pos1 = 0; // 'pos1' is initialized to the start of the multistring.
int pos2 = 0; // 'pos2' is initialized to the start of the string;
int index = 0; // 'index' records the position of the first occurrence of 'string' in 'mstr';
while ( 1 ){
while ( '\0' != mstr[ pos1 ] ){
if ( mstr[ pos1 ] == string[ pos2 ]){
++pos1;
++pos2;
}
else{
break;
}
}
if ( mstr[ pos1 ] == '\0' && string[ pos2 ] == '\0' )
return &mstr[ index ];
else{
while( '\0' != mstr[ pos1++ ] );
if ( mstr[ pos1 ] != '\0' ){
index = pos1;
pos2 = 0;
}
else
return 0;
}
}
}
I don't quite get the above algo so I came up with my own:
int Find_In_MultiString( char *mstr, char *str )
{
int i=0, j=0;
while (!((mstr[i]=='\0')&&(mstr[i+1]=='\0'))){//not end of mstr
while(str[j] != '\0'){ //search for str in mstr
if (mstr[i+j] != str[j])
break;
}
if (str[j] != '\0') return i; //found
}
j=0;
i++;
}
There is a typo and alignment problem in my previous post.
int Find_In_MultiString( char *mstr, char *str )
{
..int i=0, j=0;
..while (!((mstr[i]=='\0')&&(mstr[i+1]=='\0'))){//not end of mstr
....while(str[j] != '\0'){ //search for str in mstr
......if (mstr[i+j] != str[j])
........break;
....}
....if (str[j] == '\0') return i; //found
..}
..j=0;
..i++;
}
int FindInMultiString(const char* mstr, const char* str)
{
int i = 0, j = 0;
while (true) {
while ((str[j] != '\0') && (mstr[i + j] != '\0')) {
if (mstr[i + j] != str[j]) {
break;
}
++j;
}
if (str[j] == '\0') {
return i;
}
if (!(mstr[i + j] == '\0' && mstr[i + j + 1] == '\0') {
i = i + j + 1;
j = 0;
} else {
return -1;
}
}
return -1;
}
I have tested it and it has passed all the test cases
}
One can use Rabin-Karp algorithm or Knuth-Morris-Pratt algorithm for string matching.
- srihrari January 06, 2006Let m be the length of the pattern and let n be the length of the searchable text.
Rabin-Karp O(m+n) average
O(mn) worst
KMP O(m+n) worst
Please visit
http://groups.google.com/group/cse-interview-questions/browse_frm/thread/d57c93fe9977155c
for more information