Facebook Interview Question
Software Engineer InternsCountry: United States
Interview Type: Phone Interview
your algorithm returns match for string that doesn't have "*" or "." ex, sA="abcdg" and sB="aaag",it returns match,because you only check the last char for the strings that dont have "*" and ".".
Simple recursive approach :
static bool isMatch(string regex, string s)
{
return IsMatch(regex, s, regex.Length - 1, s.Length - 1);
}
static bool IsMatch(string regex, string s, int i, int j)
{
if (i < 0)
{
return j < 0;
}
bool result = false;
if (j >= 0)
{
if (regex[i] == '.' || regex[i] == s[j]) result |= IsMatch(regex, s, i - 1, j - 1);
}
if (regex[i] == '*') result |= IsMatch(regex, s, i - 2, j) || IsMatch(regex, s, i - 1, j);
return result;
}
Does * also mean any repetitions of the character before it ? So for instance
isMatch("a*", "aaa") //True
+ (BOOL)isMatch:(NSString *)expression :(NSString *)string;
{
if(expression==nil || string==nil) return NO;
if(!expression.length && !string.length) return YES;
if([expression hasPrefix:@"*"] || [expression hasPrefix:@"."]) return NO;
NSMutableString *expreCopy = [NSMutableString stringWithString:expression];
for(int i=0;i<expression.length;i++){
unichar c = [expression characterAtIndex:i];
unichar ch = '*';
if (c == ch) {
[expreCopy replaceCharactersInRange:NSMakeRange(i-1, 2) withString:@"**"];
}
}
expression = [expreCopy stringByReplacingOccurrencesOfString:@"**" withString:@""];
if(expression.length != string.length)return NO;
for(int i=0;i<expression.length;i++){
BOOL match = ([expression characterAtIndex:i] == '.') || ([expression characterAtIndex:i] == [string characterAtIndex:i]);
if(!match) return NO;
}
return YES;
}
s1 contains special chars, s2 is only a-z.
Iterative:
boolean isMatch(String s1, String s2) {
// i belongs to s1, j belongs to s2
int i, j;
for (i = s1.length()-1, j = s2.length-1; i >= 0 && j >= 0; i--, j--) {
// first check if it is *
if (s1.charAt(i).equals('*')) {
i--;
j++; // after next condition j will be same but i will be - 2 times
} else if (!s1.charAt(i).equals('.'))
if (!s1.charAt(i).equals(s2.charAt(j))
return false;
}
// If i and j are both not zero at the end, it is not a match, else match
return i != j;
}
Recursive:
boolean isMatch(String s1, String s2) {
// recursive implementation keeps shortening both strings from the right to left
if (s1.length() == 0 && s2.length() == 0) return true;
if (s1.length() != 0 && s2.length() == 0) return false;
if (s1.length() == 0 && s2.length() != 0) return false;
String s1next = s1.substring(0,s1.length()-1);
String s2next = s2.substring(0,s2.length()-1);
boolean result = true;
// check '*'
if (s1.charAt(s.length()-1).equals('*')) {
s2next = s2;
s1next = s1.substring(0,s1.length()-2);
} else { // if not '*', evaluate current result
result = s1.charAt(s1.length()-1).equals(s2.charAt(s2.length()-1);
}
return result && isMatch(s1next, s2next);
}
public class RegularExpression {
// s1 regex s2 match
public static boolean isMatch(String s1, String s2) {
// recursive implementation keeps shortening both strings from the right to left
if (s1.length() == 0 && s2.length() == 0) return true;
int i, j;
boolean point = false;
boolean star = false;
for(i = s1.length() - 1, j = s2.length() - 1;i >= 0 && j >= 0;) {
if(s1.charAt(i) == s2.charAt(j)) {
i--;
j--;
continue;
}
if(s1.charAt(i) == '.') {
i--;
j--;
point = true;
continue;
}
if(s1.charAt(i) == '*') {
j -= 2;
star = true;
continue;
}
if(s1.charAt(i) != s2.charAt(j)) {
if(star) {
if(s1.charAt(i) == s2.charAt(j + 1)) {
i--;
star = false;
continue;
}else
return false;
}
return false;
}
}
return true;
}
public static void main(String []argc) {
System.out.println(isMatch("a*", ""));
System.out.println(isMatch("Co.s*", "Cos"));
System.out.println(isMatch("Co.s", "Coas"));
System.out.println(isMatch("Co.sb*", "Coss"));
System.out.println(isMatch("a*abc", "abc"));
}
}
Otherwise this would be simple edit distance with minor modifications : substitutions not allowed, if one of the characters is * then deletion cost is 0 otherwise 1, if one of the characters is . then insertion cost is 0 otherwise 1, in the end if the edit distance returns a non-zero distance then false, otherwise true.
yep, but I'm looking for some linear solution, because with edit distance I only know a dp solution, or is the case that exists a non-dp approach?
IsMatch(string s1, string s2)
{
int n1 = s1.Length;
int n2 = s2.Length;
int j = 0;
while(i < n1 && j <n2 )
{
if (i < n1-1 && s1[i+1] == '*')
{
i +=2;
if (i == n1) break;
}
if (s2[j] == s1[i] || s1[i] == '.')
j++;
else
break;
}
if (i == n1 && j == n2) return true;
return false;
}
I'm not sure if your algorithm works correctly, in the comparison between s2[j] and s1[i], why don't you increase i?. And why do you always skip the character before *?, some other examples:
isMatch("a*abc", "abc") = true;
isMatch("a*a*a*", "a") = true;
isMatch("a*bd*c","abc") = true;
Adjust code, this should work
bool IsMatch(string s1, string s2)
{
int n1 = s1.Length;
int n2 = s2.Length;
int j = 0;
while(i < n1 && j <n2 )
{
if (s1[i] == s[j] || s[i] == '.')
{
i++;
j++;
}
else if (i < n-1 && s1[i+1] == '*')
{
i+=2;
}
else if (s1[i] == '*')
{
i++;
}
else
{
return false;
}
}
if (j < n2) return false;
if (i < n1)
{
if (s1[i] == '*') i++;
while (i < n1-1 && s1[i+1] == '*')
{
i+=2;
}
}
if (i == n1 && j == n2) return true;
return false
}
FYI: Qus filler has commented missed clause in qus:
[I forgot to clarify that * can only delete the character before it, but * can be skipped. So
"a*" is "a" or is an empty string ""]
Approach:
-Traverse both of the strings from right side
-In case found char [a-z] in str_1 just match this particular char in str_2
-In case found [.] in str_1 skip one char in str_2
-In case found [*] in str_1, check the char before [*] and check also no. of time it is repeating consecutively say N.
[For ex: abbb*c : Here before [*] char 'b' is repeating 3 times]
-Now check in str_2 it should have either (N) or (N-1) no. of time char 'b'
[As in case of abbb*c valid str are: {abbc} & {abbbc}]
-Repeat above steps until anything unmatch not occurs or string ends.
#include<stdio.h>
#include<string.h>
void ismatch(char str1[], char str2[])
{
int i, j, len= strlen(str1), flag=0;
for(i=0; i<len; i++)
{
if(str1[i]=='*')
{
str1[i-1]= str1[i+1];
}
for(j=0; j<len; j++)
{ str1[i+2] = str1[i];
}
len= len-2;
}
for(i=0; i<len; i++)
{
if(str1[i]== str2[i])
flag=0;
else if(str1[i]=='.')
flag=0;
else
{
flag=1;
printf("False");
break;
}
}
if (flag==0)
printf("true!!");
}
int main()
{char str1[10], str2[10];
printf("Enter two strings... \n");
scanf("%s", str1);
scanf("%s", str2);
ismatch(str1, str2);
return 0;
}
javacode:
public class MatchStr {
public static void main(String[] args) {
// String s1 = "a*abc";
// String s2 = "abc";
// String s1 = "a*a*a";
// String s2 = "a";
// String s1 = "a.";
// String s2 = "ab";
// String s1 = "a*bd*c";
// String s2 = "abdc";
String s1 = "ada*d*a";
String s2 = "ada";
System.out.println(isMatch(s1,s2,0,0));
}
public static boolean isMatch(String s1,String s2,int n1,int n2){
int l1 = s1.length();
int l2 = s2.length();
System.out.println("n1:"+n1+"-----n2:"+n2);
if(n1 > l1 || n2 > l2 ){
return false;
}else if(n1 == l1 && n2 == l2){
return true;
}else{
if(n2 < l2 && n1 < l1 && ((s1.charAt(n1) == s2.charAt(n2)) || s1.charAt(n1) == '.')){
return isMatch( s1, s2, n1+1, n2+1);
}else if( n1 < l1 && s1.charAt(n1) == '*'){
if(n2>0){
return isMatch( s1, s2, n1+1, n2-1) || isMatch( s1, s2, n1+1, n2);
}else{
return isMatch( s1, s2, n1+1, n2);
}
}else{
if(n1+1 < l1 && s1.charAt(n1+1) == '*'){
return isMatch( s1, s2, n1+2, n2);
}else{
return false;
}
}
}
}
}
public static boolean match(String s1, String s2) {
if (s1.equals(s2))
return true;
if (isHeadWithAsterisk(s1) && match(s1.substring(2), s2))
return true;
if (isHeadWithAsterisk(s2) && match(s1, s2.substring(2)))
return true;
if (!s1.isEmpty() && !s2.isEmpty()) {
if (s1.charAt(0) == '*') return match(s1.substring(1), s2);
if (s2.charAt(0) == '*') return match(s1, s2.substring(1));
return matchChars(s1.charAt(0), s2.charAt(0)) &&
match(s1.substring(1), s2.substring(1));
} else {
return false;
}
}
private static boolean isHeadWithAsterisk(String s1) {
return s1.length() > 1 && s1.charAt(1) == '*';
}
private static boolean matchChars(char c1, char c2) {
return c1 == c2 || c1 == '.' || c2 == '.';
}
I think you missed the case where its like b*dc and bdc, the person who posted the question forgot to mention this in the question and then specified it later,in pne of the below comments
how about 2 counters i&j going from m and n.
while(i<m&&j<n){
if (str1.charAt(i)==str2.charAt(j) || str1.charAt(i)=='.') i++,j++
else(str1.charAt(i+1)=='*') i+=2, j++;
else return false;
}
if(str1.length()==i and str2.length==j)
this is O(m+n)
-(BOOL)isTheMatch:(NSMutableString *)first withString:(NSString *)second{
BOOL result = YES;
for (NSUInteger i = [second length]-1; i<= 0; i--) {
NSString *firstChar = [first substringFromIndex:i];
NSString *secondChar = [second substringFromIndex:i];
if ([firstChar isEqualToString:@"*"]) {
[first deleteCharactersInRange:NSMakeRange(i-1, 1)];
continue;
}
else if ([firstChar isEqualToString:secondChar]){
continue;
}
else if ([firstChar isEqualToString:@"."]){
continue;
}
else{
result = NO;
break;
}
}
return result;
}
Is recursion an inefficient way to handle this? Here's a fully functional php script which reverses both strings for comparison and then recurses only if the it finds a successful "*" match, in order to determine whether to skip or match it. I also assume that ".*" is not legal.
<?php ;
$testCases = array(
array('abc', 'abc'),
array('a.c', 'abc'),
array('a*bc', 'abc'),
array('a*bc', 'abcd'),
array('a*a*abc', 'abc'),
array('a*a*bc', 'abc'),
array('abc*', 'abc'),
array('abc*def', 'abcdef'),
);
function re($re, $str) {
return re_rev(strrev($re), strrev($str));
}
function re_rev($re, $str) {
$ri = 0;
$si = 0;
while (true) {
$rs = $ri < strlen($re) ? $re[$ri] : NULL;
$ss = $si < strlen($str) ? $str[$si] : NULL;
//Are either empty?
if (! ($rs && $ss)) {
//Yep. is $str left?
if ($ss) return false;
//Clear out remaining '*'
while ($rs == '*') {
$ri += 2;
$rs = $ri < strlen($re) ? $re[$ri] : NULL;
}
//$re left?
if ($rs) return false;
//Nope. Good!
return true;
}
//simple match?
if (($rs == '.') || $rs == $ss) {
$ri += 1;
$si += 1;
continue;
}
if ($rs == '*') {
//Match?
if ($re[$ri+1] != $ss) {
//No. Skip.
$ri += 1;
continue;
}
//A match. Try skipping it first.
$newRe = substr($re, 1);
$newStr = substr($str, 1);
if (re($newRe, $newStr)) {
//Matched fully with skip. Good!
return true;
}
//Skip didn't match. Continue with this matched.
$ri += 2;
$si += 1;
continue;
}
//Fail.
return false;
}
}
foreach ($testCases as $arr) {
$ret = re($arr[0], $arr[1]);
echo ($ret ? 'y' : 'n') . " -- " . $arr[0] . ' / ' . $arr[1] . "\n";
}
die("DONE");
?>
Finite state machine. Sudo code:
{{
fun IsMatch(s1, s2):
pos1 = 0; pos2 = 0;
while (pos1 < s1.Length && pos2 < s2.Length)
match = false;
if (s1[pos1] == s2[pos2] || s1[pos1]=='.') //A charachter match
match = true;
pos1++; pos2 ++;
if (s1[pos1] == '*') //Ignorable *
match = true;
pos1++; //pos2 doesn't change
if (!match && pos < s1.Length - 1 && s1[pos+1] == '*')
//Deleted charachter match
match = true;
pos1+=2; pos2 ++;
if (!match)
break;
if (match) print "Match"
}}
I have never been on a technical interview, so please don't laugh at me... But how do I understand if I can use something already built-in in the language or whether I have to write everything from scratch?
For example, in this situation, why can't we just write something like this:
public boolean isMatch(String s1, String s2){
s1 = s1.replaceAll("\\*", "?");
Pattern p = Pattern.compile(s1);
Matcher m = p.matcher(s2);
return m.matches();
}
?
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
/*
Given a regular expression with characters a-z, ' * ', ' . '
the task was to find if that string could match another string with characters from: a-z
where ' * ' can delete the character before it, and ' . ' could match whatever character.
* ' * ' always appear after a a-z character.
Example:
isMatch("a*", "") = true;
isMatch(".", "") = false;
isMatch("ab*", "a") = true;
isMatch("a.", "ab") = true;
isMatch("a", "a") = true;
*/
namespace fb_regex
{
class Program
{
static bool isMatch(string pattern, string input)
{
if (pattern == null)
throw new ArgumentNullException("pattern");
if (input == null)
throw new ArgumentNullException("input");
if (pattern.Length >= input.Length)
{
int i = 0, j = 0;
while (i < pattern.Length && j < input.Length)
{
if (pattern[i] != '.' && pattern[i] != input[j])
{
// Ignore the current pattern character
if (i + 1 < pattern.Length && pattern[i + 1] == '*')
{
i += 2;
}
else
{
// The pattern and the input just don't match
return false;
}
}
// characters match
else
{
i++;
j++;
// pattern: a*ab (same character before and after star)
if (i+1 < pattern.Length && pattern[i] == '*' &&
pattern[i-1] == pattern[i+1])
{
// try matching with 1) consuming the * 2) ignoring the *
return isMatch(pattern.Substring(i + 1), input.Substring(j - 1)) ||
isMatch(pattern.Substring(i + 1), input.Substring(j));
}
// character match, ignore the *
else if (i < pattern.Length && pattern[i] == '*')
{
i++;
}
}
}
// pattern that matches the empty string
for (; i < pattern.Length; i += 2)
{
if (i + 1 == pattern.Length || pattern[i + 1] != '*')
return false;
}
// no more pattern, but still some input left
if (j < input.Length)
{
return false;
}
return true;
}
else
{
return false;
}
}
static void Main(string[] args)
{
Debug.Assert(isMatch("abcd", "abcd"));
Debug.Assert(isMatch("a*", ""));
Debug.Assert(isMatch("a", "a"));
Debug.Assert(isMatch("a*", "a"));
Debug.Assert(isMatch("", ""));
Debug.Assert(isMatch("aa*b", "aab"));
Debug.Assert(isMatch("aa*b", "ab"));
Debug.Assert(isMatch("a*a*a*.", "b"));
Debug.Assert(!isMatch("a*a*a*.c", "b"));
Debug.Assert(isMatch("a*a*a*.c", "bc"));
Debug.Assert(isMatch("a*a*a*.c", "abc"));
Debug.Assert(isMatch("a*a*a*.c", "aaabc"));
Debug.Assert(!isMatch("a*a*a*.c", "aaaabc"));
Debug.Assert(isMatch("a*b*c*d*", "c"));
Debug.Assert(isMatch("a.", "ab"));
Debug.Assert(!isMatch("", "a"));
Debug.Assert(!isMatch(".", ""));
Debug.Assert(isMatch("b*.b*", "bab"));
Debug.Assert(isMatch("b*.b*", "ab"));
Debug.Assert(isMatch("b*.b*", "ba"));
Debug.Assert(!isMatch("b.b*", "babc"));
Debug.Assert(isMatch("a*abc", "abc"));
Debug.Assert(isMatch("ab*c*d.", "abdg"));
Debug.Assert(isMatch("abc*def", "abcdef"));
}
}
}
+ (BOOL)isMatch:(NSString *)expression :(NSString *)string;
{
if(expression==nil || string==nil) return NO;
if(!expression.length && !string.length) return YES;
if([expression hasPrefix:@"*"] || [expression hasPrefix:@"."]) return NO;
NSMutableString *expreCopy = [NSMutableString stringWithString:expression];
for(int i=0;i<expression.length;i++){
unichar c = [expression characterAtIndex:i];
unichar ch = '*';
if (c == ch) {
[expreCopy replaceCharactersInRange:NSMakeRange(i-1, 2) withString:@"**"];
}
}
expression = [expreCopy stringByReplacingOccurrencesOfString:@"**" withString:@""];
if(expression.length != string.length)return NO;
for(int i=0;i<expression.length;i++){
BOOL match = ([expression characterAtIndex:i] == '.') || ([expression characterAtIndex:i] == [string characterAtIndex:i]);
if(!match) return NO;
}
return YES;
}
We can achieve the O(n) solution without using recursion
def isMatch(reg, str):
if not reg and not str:##both reg and str are NULL
return True
if len(reg) == 2 and reg[1]=='*':##reg='a*' and b=''
return str==''
i = 0
j = 0
while i < len(reg) and j < len(str):
if i+2<len(reg) and reg[i+2]=='*':
if reg[i] == str[j]:
i = i + 3
j = j + 1
else:
return False
elif reg[i] == '.':
i += 1
j += 1
else:
if reg[i]==str[j]:
i += 1
j += 1
else:
return False
return (i == len(reg) and j==len(str))
Run over the chars in the regexp and try find matches in the string itself.
Pseudo code:
boolean match (String regexp, String s) {
if (regexp.equals("") && s.eqauls("")) return true;
if (regexp.equals("") && !s.eqauls("")) return false;
if (regexp.size() > 1 && regexp.getChar(1) == '*') { // Astrix
boolean match = match(regexp.substring(2), s);
if (match) return true;
boolean match = match(regexp.getChar(0), s);
if (!match) return false;
return match(regexp.substring(2), s.substring(1));
}
boolean match = match(regexp.getChar(0), s);
if (!match) return false;
return match(regexp.substring(1), s.substring(1))
}
boolean match(char regexpChar, String s) {
if (s.equals("")) return false;
char c = s.getChar(0);
if (c == '.') return true;
if (c == regexpChar) return true;
return false;
}
If the usage of Java's Pattern object is allowed, this is a simple Regular Expression question - each 'a*' turns into "(a)?" regex string, and each '.' turns into '.' regex string.
For example, "ab*..d" turns into "a(b)?..d" regex string. Compile the pattern and match away.
static boolean isMatch(String pat, String match) {
StringBuilder sb = new StringBuilder();
sb.append("^");
for (int i = 0; i < pat.length(); i++) {
if (i + 1 < pat.length() && pat.charAt(i + 1) == '*') {
sb.append('(').append(pat.charAt(i)).append(")?");
i++;
}
else {
sb.append(pat.charAt(i));
}
}
sb.append("$");
Pattern r = Pattern.compile(sb.toString());
return r.matcher(match).matches();
}
public class RegularExpression {
// s1 regex s2 match
public static boolean isMatch(String s1, String s2) {
// recursive implementation keeps shortening both strings from the right to left
if (s1.length() == 0 && s2.length() == 0) return true;
int i, j;
boolean point = false;
boolean star = false;
for(i = s1.length() - 1, j = s2.length() - 1;i >= 0 && j >= 0;) {
if(s1.charAt(i) == s2.charAt(j)) {
i--;
j--;
continue;
}
if(s1.charAt(i) == '.') {
i--;
j--;
point = true;
continue;
}
if(s1.charAt(i) == '*') {
j -= 2;
star = true;
continue;
}
if(s1.charAt(i) != s2.charAt(j)) {
if(star) {
if(s1.charAt(i) == s2.charAt(j + 1)) {
i--;
star = false;
continue;
}else
return false;
}
return false;
}
}
return true;
}
public static void main(String []argc) {
System.out.println(isMatch("a*", ""));
System.out.println(isMatch("Co.s*", "Cos"));
System.out.println(isMatch("Co.s", "Coas"));
System.out.println(isMatch("Co.sb*", "Coss"));
System.out.println(isMatch("a*abc", "abc"));
}
}
Is this ok?
It should be better than the recursive solution.
#include<iostream>
#include<string>
using namespace std;
bool isMatch(string s1, string s2)
{
if(s1.length()==0 && s2.length()==0)
return true;
string s1_next = s1.substr(0, s1.length()-1);
string s2_next = s2.substr(0, s2.length()-1);
if(s1[s1.length()-1]=='*')
{
string s1_next_next = s1.substr(0, s1.length()-2);
return isMatch(s1_next, s2) || isMatch(s1_next_next, s2);
}
else if(s1[s1.length()-1]=='.' && s2[s2.length()-1]!='\0')
{
return isMatch(s1_next, s2_next);
}
else
{
return s1[s1.length()-1]==s2[s2.length()-1] && isMatch(s1_next, s2_next);
}
}
DP solution.
use array to record previous match.
when s[i] == p[j] or p[j] is '.' current match is true only when p's previous match s's previous
but when p[j] is '*', there are two conditions:
1) ignore the '*', current match is true only when p's previous match current s
2) delete previous p[j-1], current match is true only when p's previous'previous match with current s
bool isMatch(string s, string p)
{
int n = s.length();
int m = p.length();
vector<bool> pv(n+1, false); // row -2
vector<bool> v(n+1, false); // row - 1
v[0] = true;
for(int i=0; i<m; i++)
{
vector<bool> nv(n+1, false);
for(int j=0; j<n; j++)
{
if (p[j] == s[i] || p[j] == '.') // 'a' == 'a'
{
nv[j+1] = v[j]; // only when previous match
}
else if (p[j] == '*')
{
nv[j+1] = v[j+1] || pv[j+1]; // ignore * || delete previous c
}
}
pv = v; // row i-2 for next round
v = nv; // row i-1 for next round
}
return v[n];
}
bool isMatch(const char* s, const char* p){
if(*p == '\0') return *s == '\0';
if(*(p+1) == '*'){
if(*p == *s || (*s != '\0' && *p == '.')) return isMatch(s+1, p+2) || isMatch(s, p+2);
else return isMatch(s, p+2);
}
else{
if(*p == *s || (*s != '\0' && *p == '.')) return isMatch(s+1, p+1);
else return false;
}
}
Ruby 2.0.0 implementation
def isMatch string, test
string.each_char.with_index do |char, index|
if char == '*'
string.slice! index
string.slice! index-1 if index-1 >=0
end
end
regex = string.gsub('.', '[a-z]')
if test.match regex
return true
else
return false
end
end
puts isMatch("a*", "") #true
puts isMatch(".", "") #false
puts isMatch("ab*", "a") #true
puts isMatch("a.", "ab") #true
puts isMatch("a", "a") #true
puts isMatch("a*a*a*.c", "b") #false
puts isMatch("a*a*a*.c", "abc") #true
puts isMatch("b*.b*", "bab") #true
puts isMatch("ab*c*d.", "abdg") #true
boolean isMatch(String pattern, String word)
{
int i=0,j=0;
while(i < pattern.length())
{
if(i+1 < pattern.length() && pattern.charAt(i+1)=='*')
i+=2;
else{
if(j==word.length())
return false;
if(pattern.charAt(i)==word.charAt(j))
{
i++;j++;
}else
return false;
}
}
if(j==word.length())
return true;
return false;
}
bool isMatch(const char *s, const char *p) {
if (*p == '\0') return *s == '\0';
// next char is not '*', then must match current character
if (*(p + 1) != '*') {
if (*p == *s || (*p == '.' && *s != '\0'))
return isMatch(s + 1, p + 1);
else
return false;
}
else {
if(*p == *s || (*p == '.' && *s != '\0'))
return isMatch(s, p + 2) || isMatch(s+1,p+2);
else return false;
}
}
recursive solution.
public static boolean isMatch(String regex, String sentence) {
if (regex.isEmpty()) {
return true;
}
if (regex.charAt(0) == '.') {
// if sentence is not empty match and continue to next character
return (!sentence.isEmpty() && isMatch(regex.substring(1),
sentence.substring(1)));
}
// it must be a letter since the code skips *
if (regex.length() > 1 && regex.charAt(1) == '*') {
// lets try to skip this char
if (isMatch(regex.substring(2), sentence)) {
return true;
}
}
// no match with out the current letter lets try with it
if (sentence.isEmpty() || !(regex.charAt(0) == sentence.charAt(0))) {
return false;
}
return isMatch(regex.substring(1), sentence.substring(1));
}
Below is code for "+","*",. in C++
// Regex.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
using namespace std;
class StackNode
{
public:
void *item;
StackNode *next;
StackNode(void *item, StackNode *next)
{
this->item = item;
this->next = next;
}
};
class Stack
{
private:
StackNode *top;
public:
Stack()
{
top = NULL;
}
~Stack()
{
if (!isStackEmpty())
{
cout << "Stack is Not Empty" << endl;
}
}
void push(void *item)
{
top = new StackNode(item, top);
}
void* pop()
{
void *retval = NULL;
StackNode *tmp;
// If top is non NULL...
if (tmp = top)
{
retval = top->item;
top = top->next;
delete tmp;
}
return retval;
}
bool isStackEmpty()
{
return (top == NULL);
}
};
class RegexNode
{
public:
char *str;
int strlen1;
char *regex;
int regexlen;
bool plusasstar;
RegexNode(char *str, int strlen1, char *regex, int regexlen, bool plusasstar)
{
this->str = str;
this->strlen1 = strlen1;
this->regex = regex;
this->regexlen = regexlen;
this->plusasstar = plusasstar;
}
};
bool strmatchregex(char *string, int strlen1, char *regex, int regexlen)
{
//Use stack for checking all the options that needs to be checked.
Stack *stack = new Stack();
char curregexchar, nextregexchar, curchar;
bool popstack = true, nextregexcharvalid = false;
RegexNode *tmp = NULL;
stack->push(new RegexNode(string, strlen1, regex, regexlen, false));
//Add initial call to stack as a regexnode.
while (!stack->isStackEmpty())
{
if (tmp)
delete tmp;
nextregexcharvalid = false;
tmp = (RegexNode *)stack->pop();
if ((tmp->strlen1) > 0)
{
curchar = tmp->str[0];
}
else
{
if (tmp->regexlen == 0)
{
//complete match.
goto clearandreturntrue;
}
continue;
}
if (tmp->regexlen > 0)
{
curregexchar = tmp->regex[0];
if (tmp->regexlen > 1)
{
nextregexcharvalid = true;
nextregexchar = tmp->regex[1];
}
}
else
{
continue;
}
popstack = false;
if (nextregexcharvalid)
{
//handle + or *
if (nextregexchar == '+')
{
if (curchar == '.')
{
cout << "Unsupported regex" << endl;
}
if (tmp->plusasstar)
{
nextregexchar = '*';
goto regexstar;
}
if (curregexchar == curchar)
{
//It is a match.
//We count that character out.
stack->push(new RegexNode(tmp->str + 1, tmp->strlen1 - 1, tmp->regex, tmp->regexlen, true));
continue;
}
else
{
//the string doesnot follow this rule.
//end of this path
continue;
}
}
regexstar:
if (nextregexchar == '*')
{
//Add a possibility that this char is not part of our current "character*".
stack->push(new RegexNode(tmp->str, tmp->strlen1, tmp->regex + 2, tmp->regexlen - 2, false));
if (curregexchar == curchar)
{
//It is a match.
stack->push(new RegexNode(tmp->str + 1, tmp->strlen1 - 1, tmp->regex, tmp->regexlen, false));
continue;
}
else
{
//it is not a match.
//we already added that probability
continue;
}
}
}
if ((curregexchar == '.') || (curchar == curregexchar))
{
stack->push(new RegexNode(tmp->str + 1, tmp->strlen1 - 1, tmp->regex + 1, tmp->regexlen - 1, false));
}
}
return false;
clearandreturntrue:
if (tmp)
delete tmp;
while (tmp = (RegexNode *)stack->pop())
{
delete tmp;
}
delete stack;
return true;
}
int main()
{
char str1[100];
char regex[100];
while (1)
{
cout << "Please enter regular expression" << endl;
cin >> regex;
cout << "Please enter string to check to match " << endl;
cin >> str1;
if (strmatchregex(str1, strlen(str1), regex, strlen(regex)))
{
cout << "String " << str1 << "Matches regex " << regex << endl;
}
else
{
cout << " NO match found" << endl;
}
}
return 0;
}
Code for checking regex support of '+','*" and '.'
// RegexCheck.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
using namespace std;
class StackNode
{
public:
void *item;
StackNode *next;
StackNode(void *item, StackNode *next)
{
this->item = item;
this->next = next;
}
};
class Stack
{
private:
StackNode *top;
public:
Stack()
{
top = NULL;
}
~Stack()
{
if (!isStackEmpty())
{
cout << "Stack is Not Empty" << endl;
}
}
void push(void *item)
{
top = new StackNode(item, top);
}
void* pop()
{
void *retval = NULL;
StackNode *tmp;
// If top is non NULL...
if (tmp = top)
{
retval = top->item;
top = top->next;
delete tmp;
}
return retval;
}
bool isStackEmpty()
{
return (top == NULL);
}
};
class RegexNode
{
public:
char *str;
int strlen1;
char *regex;
int regexlen;
bool plusasstar;
RegexNode(char *str, int strlen1, char *regex, int regexlen, bool plusasstar)
{
this->str = str;
this->strlen1 = strlen1;
this->regex = regex;
this->regexlen = regexlen;
this->plusasstar = plusasstar;
}
};
bool strmatchregex(char *string, int strlen1, char *regex, int regexlen)
{
//Use stack for checking all the options that needs to be checked.
Stack *stack = new Stack();
char curregexchar, nextregexchar, curchar;
bool popstack = true, nextregexcharvalid = false;
RegexNode *tmp = NULL;
stack->push(new RegexNode(string, strlen1, regex, regexlen, false));
//Add initial call to stack as a regexnode.
while (!stack->isStackEmpty())
{
if (tmp)
delete tmp;
nextregexcharvalid = false;
tmp = (RegexNode *)stack->pop();
if ((tmp->strlen1) > 0)
{
curchar = tmp->str[0];
}
else
{
if (tmp->regexlen == 0)
{
//complete match.
goto clearandreturntrue;
}
continue;
}
if (tmp->regexlen > 0)
{
curregexchar = tmp->regex[0];
if (tmp->regexlen > 1)
{
nextregexcharvalid = true;
nextregexchar = tmp->regex[1];
}
}
else
{
continue;
}
popstack = false;
if (nextregexcharvalid)
{
//handle + or *
if (nextregexchar == '+')
{
if (curchar == '.')
{
cout << "Unsupported regex" << endl;
}
if (tmp->plusasstar)
{
nextregexchar = '*';
goto regexstar;
}
if (curregexchar == curchar)
{
//It is a match.
//We count that character out.
stack->push(new RegexNode(tmp->str + 1, tmp->strlen1 - 1, tmp->regex, tmp->regexlen, true));
continue;
}
else
{
//the string doesnot follow this rule.
//end of this path
continue;
}
}
regexstar:
if (nextregexchar == '*')
{
//Add a possibility that this char is not part of our current "character*".
stack->push(new RegexNode(tmp->str, tmp->strlen1, tmp->regex + 2, tmp->regexlen - 2, false));
if (curregexchar == curchar)
{
//It is a match.
stack->push(new RegexNode(tmp->str + 1, tmp->strlen1 - 1, tmp->regex, tmp->regexlen, false));
continue;
}
else
{
//it is not a match.
//we already added that probability
continue;
}
}
}
if ((curregexchar == '.') || (curchar == curregexchar))
{
stack->push(new RegexNode(tmp->str + 1, tmp->strlen1 - 1, tmp->regex + 1, tmp->regexlen - 1, false));
}
}
return false;
clearandreturntrue:
if (tmp)
delete tmp;
while (tmp = (RegexNode *)stack->pop())
{
delete tmp;
}
delete stack;
return true;
}
int main()
{
char str1[100];
char regex[100];
while (1)
{
cout << "Please enter regular expression" << endl;
cin >> regex;
cout << "Please enter string to check to match " << endl;
cin >> str1;
if (strmatchregex(str1, strlen(str1), regex, strlen(regex)))
{
cout << "String " << str1 << "Matches regex " << regex << endl;
}
else
{
cout << " NO match found" << endl;
}
}
return 0;
}
I had that question on FB interview. I had two alternative ideas on how to do it, and I did it using regular substring matching. The other idea was about splitting regex and matching nodes separately is below.
#include <assert.h>
#include <regex> // for testing
#include <string>
using std::vector;
using std::string;
struct rx_node
{
int count_min, count_max;
char c;
};
typedef vector<rx_node> regex_t;
static bool match(const regex_t& rx, int i, const string &s, int pos);
static bool match_next(const regex_t& rx, int i, const string &s, int pos)
{
i++;
pos++;
if (i < rx.size())
return match(rx, i, s, pos);
else
return pos == s.size();
}
static bool match(const regex_t& rx, int i, const string &s, int pos)
{
const rx_node &r = rx[i];
if (r.count_min==0 && match_next(rx, i, s, pos-1))
return true;
int pos_max = r.count_max==-1 ? s.size() : pos+r.count_max;
if (pos_max>s.size())
pos_max = s.size();
for (int count=1; pos<s.size(); ++pos, ++count)
{
if (r.c!='.' && s[pos]!=r.c)
return false;
if (count>=r.count_min && match_next(rx, i, s, pos))
return true;
}
return false;
}
void regex_compile(const string &rx, regex_t &r)
{
for (int i=0; i<rx.size(); ++i)
{
rx_node n = {1, 1, rx[i]};
if (i<rx.size()-1)
{
if (rx[i+1]=='*')
{
n.count_min = 0;
n.count_max = -1;
i++;
}
else if (rx[i+1]=='?')
{
n.count_min = 0;
n.count_max = 1;
i++;
}
}
r.push_back(n);
}
}
bool regex_match(const string &str, const string& rx)
{
regex_t r;
regex_compile(rx, r);
bool match = match_next(r, -1, str, -1);
assert(std::regex_match(str, std::regex(rx)) == match);
return match;
}
How about comparing the string and the regex from the end.
The non-trivial case of '*', we need to follow both paths of deleting and not deleting the previous character.
Can be solved using recursion.
#include <iostream>
#include <string>
using namespace std;
bool isMatch(string regex, string input){
for (int i = regex.length()-1, j=input.length()-1; i >= 0;) {
if(regex[i] == input[j]) {i--; j--; continue;}
else if(regex[i]=='.') {i--; j--;continue;}
else if(regex[i] =='*') {
if(i >= 0 && isMatch(regex.substr(0,i), j >= 0 ? input.substr(0,j+1) : ""))
return true;
if(i-1 >=0)
return isMatch(regex.substr(0,i-1), j >=0 ? input.substr(0,j+1): "");
else return false;
}
else return false;
}
return true;
}
int main(){
string regex, pattern;
cin >> regex;
cin >> pattern;
cout << "Match = " << isMatch(regex, pattern) << endl;
return 0;
}
public bool isMatch(char* text, char* pattern){
if(text == NULL && pattern == NULL){
return true;
}
if(pattern == NULL){
return false;
}
if(text == NULL && *(pattern + 1) != '*'){
return false;
}
if(*(pattern + 1) == '*'){
return isMatch(text, pattern + 2)
|| (*text == *pattern || *pattern == '.') && isMatch(text + 1, pattern);
}
return (*text == *pattern || *pattern == '.') && isMatch(text + 1, pattern + 1);
}
try this?
inspired the the recursive approach.Processing the string from the end.
- godricly.li October 14, 2013