Google Interview Question
Software Engineer InternsCountry: Europe
Interview Type: Phone Interview
step-1 : Use some pattern matching alogo like KMP
output of step-1:
short position_t1[strlen(str)];
short position_t2[strlen(str)];
short position_t3[strlen(str)];
step-2:
short sliding_window[3];
int final_answer = INT_MAX;
for (i = 0 ; i < strlen(str); ++i) {
if (pos_t1[i]) {
sliding_window[0] = i;
}
if (pos_t2[i]) {
sliding_window[1] = i;
}
if (pos_t3[i]) {
sliding_window[2] = i;
}
if (sliding_window[0] != 0 && sliding_window[1] != 0 && sliding_window[2] != 0) {
possible_answer = cal_width(sliding_window);
if (possible_answer < final_answer) {
final_answer = possible_answer;
}
sliding_window[lesser_position_of_3(sliding_window)] = 0;
}
}
printf("%d\n", final_answer);
could you provide testable solution?
What you wrote here it is unclear, whether it works or not.
Hi zr.zoman,
Thanks showing interest in my solution. Its simply sliding window algo...
1) (pos_t1[i] = 1) => string t1 starts at pos-i in original string
2) sliding_window[0] => Latest starting position of t1 in original string when scanning from left to right
sliding_window[1] => Latest starting position of t2 in original string when scanning from left to right
sliding_window[2] => Latest starting position of t3 in original string when scanning from left to right
3) cal_width(sliding_window):
Example:
sliding_window = { 12, 17, 19} and len(t1) = 3, len(t2) = 9 and len(t3) = 7;
then width of current window = 12 to max (12 + 3, 17 + 9, 19 + 7}
4) sliding_window[lesser_position_of_3(sliding_window)] = 0;
Remove leftmost substring from window
Its SLIDING WINDOW algorithm => Hope it clears
For Example, let us say starting positions are
t1 t1 t2 t2 t3 t2 t1 t3
0 1 2 3 4 5 6 7
first sliding window {1, 3, 4} => width = 4 - 1 + 1 = 4
second sliding window {6, 5, 4} => width = 6 - 4 + 1 = 3
third sliding window {6, 5, 7} => width = 7 - 5 + 1 = 3
sliding window is quite a broad range of algorithms depending on the case you program for.
Let's assume that we already given 3 arrays of start indexes.
Could you provide the based on sliding window solution that takes these 3 arrays + initial string and outputs the desired substring ?
why do you remove the leftmost substring from window? wouldn't this happen automatically as you continue?
In the example you gave above, after you find the first sliding window, you'll set sliding_window[0] = 0 (because 1 is the smallest value in the array, representing the leftmost substring). However, when you find t1 again in the long string at index 6, you write over the value at sliding_window[0] anyway. It doesn't seem necessary to set this value equal to zero after finding a new sliding window.
Why do you remove the leftmost sliding window after finding a new sliding window? It seems like this would happen automatically.
Removing leftmost ti is equivalent to moving window to right so that window does not contain all t-i s and now search for next window that contains all t-i s....so in.next iterations condition
if (sliding_window[0] != 0 && sliding_window[1] != 0 && sliding_window[2] != 0)
Becomes false
I agree merlam ...setting leftmost to 0 is just optimization...without it also we should get right answer
For my own understanding I coded up a working example of sameer's code in java. Nice solution sameer, very clever use of arrays.
public class TextMatching {
public static void main(String[] args) {
//Got distance of 12
String text = "00as0de00000as00bp1000de000bp000as000de120bp";
String p1 = "as";
String p2 = "bp1";
String p3 = "de";
//Got distance of 4.
// String text = "Mississippi";
// String p1 = "is";
// String p2 = "si";
// String p3 = "ssi";
int[] vals = calc_size(text, p1, p2, p3);
System.out.println("distance: " + vals[0] + ", min loc: " + vals[1] + ", max loc: " + vals[2]);
}
static int[] calc_size(String text, String p1, String p2, String p3)
{
int[] retVals = new int[3];
int[] match_locations = new int[3];
int[] p1_array = slow_matcher(text, p1);
int[] p2_array = slow_matcher(text, p2);
int[] p3_array = slow_matcher(text, p3);
char[] text_array = text.toCharArray();
int small_dist = Integer.MAX_VALUE;
for(int i = 0; i < text_array.length; i++)
{
if(p1_array[i] == 1)
match_locations[0] = i;
if(p2_array[i] == 1)
match_locations[1] = i;
if(p3_array[i] == 1)
match_locations[2] = i;
if(match_locations[0] != 0 && match_locations[1] != 0 && match_locations[2] != 0)
{
int min = Math.min(match_locations[0], Math.min(match_locations[1], match_locations[2]));
int max = Math.max(match_locations[0]+ p1.length(), Math.max(match_locations[1]+ p2.length(), match_locations[2]+ p3.length()));
if(small_dist > max-min)
{
small_dist = max-min;
retVals[0] = small_dist;
retVals[1] = min;
retVals[2] = max;
}
}
}
return retVals;
}
//O(nm): This can definitely be improved with KMP algorithm.
static int[] slow_matcher(String text, String pattern)
{
int[] ret = new int[text.length()];
char[] text_array = text.toCharArray();
for(int i = 0; i < text_array.length; i++)
{
String temp = new String(Arrays.copyOfRange(text_array, i, i+pattern.length()));
if(temp.equalsIgnoreCase(pattern))
{
ret[i] = 1;
i = i+pattern.length()-1;
}
}
return ret;
}
}
Found a small bug in the slow_matcher method. Instead of
i = i+pattern.length();
it should read:
i = i+pattern.length()-1;
Here is my solution based on Joe's solution and with KMP algo for step 1
/* package whatever; // don't place package name! */
import java.util.Arrays;
class TextMatching {
public static void main(String[] args) {
//Got distance of 12
//String text = "00as0de00000as00bp1000de000bp000as000de120bp";
//String p1 = "as";
//String p2 = "bp1";
//String p3 = "de";
//Got distance of 4.
String text = "Mississippi";
String p1 = "is";
String p2 = "si";
String p3 = "ssi";
int[] vals = calc_size_kmp(text, p1, p2, p3);
System.out.println("distance: " + vals[0] + ", min loc: " + vals[1] + ", max loc: " + vals[2]);
}
private static void print_array(int[] p1_array) {
for (int k = 0; k < p1_array.length; k++) {
System.out.print(p1_array[k]);
System.out.print(" ");
}
System.out.println(" ");
}
static int[] calc_size_kmp(String text, String p1, String p2, String p3) {
int[] retVals = new int[3];
int[] match_locations = new int[3];
char[] text_chrs = text.toCharArray();
int[] p1_array = slow_matcher_kmp(text_chrs, p1.toCharArray());
int[] p2_array = slow_matcher_kmp(text_chrs, p2.toCharArray());
int[] p3_array = slow_matcher_kmp(text_chrs, p3.toCharArray());
int small_dist = Integer.MAX_VALUE;
for (int i = 0; i < text_chrs.length; i++) {
if (p1_array[i] == 1)
match_locations[0] = i;
if (p2_array[i] == 1)
match_locations[1] = i;
if (p3_array[i] == 1)
match_locations[2] = i;
if (match_locations[0] != 0 && match_locations[1] != 0 && match_locations[2] != 0) {
int min = Math.min(match_locations[0], Math.min(match_locations[1], match_locations[2]));
int max = Math.max(match_locations[0] + p1.length(), Math.max(match_locations[1] + p2.length(), match_locations[2] + p3.length()));
if (small_dist > max - min) {
small_dist = max - min;
retVals[0] = small_dist;
retVals[1] = min;
retVals[2] = max;
}
}
}
return retVals;
}
private static int[] get_prefix(char[] text) {
int[] r = new int[text.length];
int index = 0;
for (int i = 1; i < r.length; i++) {
while (index >= 0 && text[index] != text[i])
index--;
index++;
r[i] = index;
}
return r;
}
static int[] slow_matcher_kmp(char[] text, char[] pattern) {
int[] res = new int[text.length];
int[] pf = get_prefix(pattern);
int index = 0;
for (int i = 0; i < text.length; i++) {
while (index > 0 && text[i] != pattern[index])
index = pf[index - 1];
if (text[i] == pattern[index]) index++;
if (index == pattern.length)
{
res[i - index + 1] = 1;
index = 0;
}
}
return res;
}
}
Step 2 of the original solution may produce a wrong answer when occurrences of a substring t_i overlap. In below example, answer should be {0,9} but it will produce {4, 16} instead.
t_1: {0, 9} {7, 16}
t_2: {4, 5}
t_3: {8, 9}
This still can be solved in O(n) by having two pointers left/right and track if the current [left, right] covers at least one occurrence of each of t_i or not.
import java.util.ArrayList;
import java.util.Scanner;
public class FindShortestSubstring
{
public static void main(String[] args)
{
@SuppressWarnings("resource")
Scanner sc = new Scanner(System.in);
System.out.println("Enter the long string s");
String longString = sc.nextLine();
System.out.println("Enter the shorter strings t1, t2 and t3");
String t1 = sc.nextLine();
String t2 = sc.nextLine();
String t3 = sc.nextLine();
ArrayList<Integer> pos1 = new ArrayList<Integer>();
ArrayList<Integer> pos2 = new ArrayList<Integer>();
ArrayList<Integer> pos3 = new ArrayList<Integer>();
int pos = -1;
pos = longString.indexOf(t1);
while (pos != -1)
{
System.out.println("pos is " + pos);
pos1.add(pos);
pos = longString.indexOf(t1, pos + 1);
}
System.out.println();
pos = longString.indexOf(t2);
while (pos != -1)
{
System.out.println("pos is " + pos);
pos2.add(pos);
pos = longString.indexOf(t2, pos + 1);
}
System.out.println();
pos = longString.indexOf(t3);
while (pos != -1)
{
System.out.println("pos is " + pos);
pos3.add(pos);
pos = longString.indexOf(t3, pos + 1);
}
System.out.println();
if ((pos1.size() != 0) && (pos2.size() != 0) && (pos3.size() != 0))
{
int min_dist = Integer.MAX_VALUE;
int min_t1 = -1;
int min_t2 = -1;
int min_t3 = -1;
for (int i = 0; i < pos1.size(); i++)
{
for (int j = 0; j < pos2.size(); j++)
{
if (pos2.get(j) > pos1.get(i)) // sequence is t1,t2
{
for (int k = 0; k < pos3.size(); k++)
{
int dist = -1;
if (pos3.get(k) > pos2.get(j)) // sequence is t1,t2,t3
{
dist = pos3.get(k) + t3.length() - pos1.get(i);
}
else if (pos3.get(k) > pos1.get(i)) // sequence is t1,t3,t2
{
dist = pos2.get(j) + t2.length() - pos1.get(i);
}
else
// sequence is t3,t1,t2
{
dist = pos2.get(j) + t2.length() - pos3.get(k);
}
if (dist <= min_dist)
{
min_dist = dist;
min_t1 = pos1.get(i);
min_t2 = pos2.get(j);
min_t3 = pos3.get(k);
}
}
}
else
// sequence is t2,t1
{
for (int k = 0; k < pos3.size(); k++)
{
int dist = -1;
if (pos3.get(k) > pos1.get(i)) // sequence is t2,t1,t3
{
dist = pos3.get(k) + t3.length() - pos2.get(j);
}
else if (pos3.get(k) > pos2.get(j)) // sequence is t2,t1,t3
{
dist = pos3.get(k) + t3.length() - pos2.get(j);
}
else
// sequence is t3,t2,t1
{
dist = pos1.get(i) + t1.length() - pos3.get(k);
}
if (dist <= min_dist)
{
min_dist = dist;
min_t1 = pos1.get(i);
min_t2 = pos2.get(j);
min_t3 = pos3.get(k);
}
}
}
}
}
if (min_t1 > min_t2)
{
if (min_t2 > min_t3)
{
System.out.println("Shortest substring: " + longString.substring(min_t3, min_t1 + t1.length()));
}
else if (min_t3 < min_t1)
{
System.out.println("Shortest substring: " + longString.substring(min_t2, min_t1 + t1.length()));
}
else
{
System.out.println("Shortest substring: " + longString.substring(min_t2, min_t3 + t3.length()));
}
}
else
{
if (min_t1 > min_t3)
{
System.out.println("Shortest substring: " + longString.substring(min_t3, min_t2 + t2.length()));
}
else if (min_t3 > min_t2)
{
System.out.println("Shortest substring: " + longString.substring(min_t1, min_t3 + t3.length()));
}
else
{
System.out.println("Shortest substring: " + longString.substring(min_t1, min_t2 + t2.length()));
}
}
System.out.println("min_t1: " + min_t1);
System.out.println("min_t2: " + min_t2);
System.out.println("min_t3: " + min_t3);
}
else
{
System.out.println("no such substring");
}
}
}
c# implementation.
O(n).
using System;
using System.Collections.Generic;
namespace OneSubstringOfThree {
class Program {
static private string GetSubstr( string s, string t1, string t2, string t3 ) {
int[] arrT1; int[] arrT2; int[] arrT3;
GetListsOfStartIndexes(out arrT1, out arrT2, out arrT3, s, t1, t2, t3 );
Array.Sort(arrT1); Array.Sort(arrT2); Array.Sort(arrT3);
int index1 = 0; int index2 = 0; int index3 = 0;
int[] target = new int[3];
int minSum = int.MaxValue;
int length = 0;
while ( index1 < arrT1.Length && index2 < arrT2.Length && index3 < arrT3.Length ) {
int sum = Math.Abs(arrT1[index1] - arrT2[index2])
+ Math.Abs(arrT1[index1] - arrT3[index3])
+ Math.Abs(arrT3[index3] - arrT2[index2]);
var i1 = arrT1[index1]; var i2 = arrT2[index2]; var i3 = arrT3[index3];
if (sum < minSum) {
minSum = sum;
target[0] = i1;
target[1] = i2;
target[2] = i3;
var minVal = Math.Min(i1, Math.Min(i2, i3));
int maxVal = 0;
int tLen = 0;
GetMaxValAndTLen( ref maxVal, ref tLen, i1, i2, i3, t1, t2, t3 );
length = maxVal + tLen - minVal;
}
if (arrT1[index1] > arrT2[index2]) { index2++; continue; }
if (arrT1[index1] > arrT3[index3]) { index3++; continue; }
index1++;
}
var res = s.Substring( Math.Min(target[0], Math.Min(target[1], target[2])), length );
return res;
}
static private void GetMaxValAndTLen(ref int maxVal , ref int tLen, int i1, int i2, int i3, string t1, string t2, string t3 ) {
if (i1 > i2 && i1 > i3) { tLen = t1.Length; maxVal = i1; return; }
if (i2 > i1 && i2 > i3) { tLen = t2.Length; maxVal = i2; return; }
if (i3 > i1 && i3 > i2) { tLen = t3.Length; maxVal = i3; return; }
if (i1 == i2 && i1 == i3) { maxVal = i1; tLen = Math.Max(t1.Length, Math.Max(t2.Length, t3.Length)); return; }
if (i1 == i2) { maxVal = i1; tLen = Math.Max(t1.Length, t2.Length); return; }
if (i1 == i3) { maxVal = i1; tLen = Math.Max(t1.Length, t3.Length); return; }
if (i2 == i3) { maxVal = i2; tLen = Math.Max(t2.Length, t3.Length); }
}
// use KMP instead, but this func is still O(n)
private static void GetListsOfStartIndexes(out int[] arrT1 , out int[] arrT2, out int[] arrT3, string s, string t1, string t2, string t3 ) {
var listT1 = new List<int>();
var listT2 = new List<int>();
var listT3 = new List<int>();
for ( int i = 0; i < s.Length; i++ ) {
if (s[i] == t1[0]) { CheckAndAdd( s, t1, i, listT1 ); }
if (s[i] == t2[0]) { CheckAndAdd( s, t2, i, listT2 ); }
if (s[i] == t3[0]) { CheckAndAdd( s, t3, i, listT3 ); }
}
arrT1 = listT1.ToArray();
arrT2 = listT2.ToArray();
arrT3 = listT3.ToArray();
}
private static void CheckAndAdd(string s, string t, int i, List<int> _list ) {
if (t.Length <= s.Length - i && s.Substring(i, t.Length) == t) {
_list.Add(i);
}
}
static void Main(string[] args) {
var res = GetSubstr("00as0de00000as00bp1000de000bp000as000de120bp", "as", "bp1", "de");
Console.WriteLine( res );
Console.ReadLine();
}
}
}
int main() {
string s = "moss wasgreatman a great mawasgreatmann";
string t[3] = { "was", "great", "manxxx" }; // assumed input was or could be easily converted to an int array
bool blnMatch = true;
int len = sizeof(t) / sizeof(*t);
for (int i = 0; i < len; i++) //iterate through array of substrings
{
for (int j = 0; j < s.length(); j++) //check for start in long string
{
if (t[i][0] == s[j])
{
blnMatch = true; // initialize boolean
for (int x = 1; x < t[i].length(); x++) //Loop through substring, matching letters in both strings.
{
if ((s[j + x] != t[i][x]))
blnMatch = false; //sets flag to false when no match.
}
}
if ((j >= (s.length() - t[i].length()) - 1) && (blnMatch == false)) // if end of string and no match, prints.
{
cout << t[i] << ": substring not found" << endl;
break;
}
}
}
}
solution does not work.
string s = "00as0de00000as00bp1000de000bp000as000de120bp";
string t[3] = { "bp1", "de", "as" };
It says : "bp1: substring not found".
I forgot to short circuit when found match:
int main() {
string s = "00as0de00000as00bp1000de000bp000as000de120bp";
string t[3] = { "bp1", "de", "as" }; // assumed input was or could be easily converted to an int array
bool blnMatch = true;
int len = sizeof(t) / sizeof(*t);
for (int i = 0; i < len; i++) //iterate through array of substrings
{
for (int j = 0; j < s.length(); j++) //check for start in long string
{
if (t[i][0] == s[j])
{
blnMatch = true; // initialize boolean
for (int x = 0; x < t[i].length(); x++) //Loop through substring, matching letters in both strings.
{
char v = t[i][x];
if ((s[(j + x)] != t[i][x]))
blnMatch = false; //sets flag to false when no match.
}
}
if ((j >= (s.length() - t[i].length()) - 1) && (blnMatch == false)) // if end of string and no match, prints.
{
cout << t[i] << ": substring not found" << endl;
break;
}
else if (blnMatch == true)
{
break;
}
}
}
}
#include <iostream>
#include <string>
std::string findShortestSubString(std::string longString, std::string * shortStrings, int lengthOfShortStrings) {
std::string::size_type start, end;
start = longString.length() - 1;
end = 0;
for (int i = 0; i < lengthOfShortStrings; i++) {
if (longString.find(shortStrings[i]) != std::string::npos) {
if (longString.find(shortStrings[i]) < start) {
start = longString.find(shortStrings[i]);
}
if (longString.find(shortStrings[i]) + shortStrings[i].length() - 1 > end) {
end = longString.find(shortStrings[i]) + shortStrings[i].length() - 1;
}
} else {
start = std::string::npos;
end = std::string::npos;
break;
}
}
if (start != std::string::npos && end != std::string::npos) {
return longString.substr(start, end - start + 1);
} else {
return std::string("There is no sub string containing all short strings.");
}
}
int main(int argc, const char * argv[]) {
std::string longString;
std::string shortStrings[3];
longString = std::string("Mississippi");
shortStrings[0] = std::string("is");
shortStrings[1] = std::string("si");
shortStrings[2] = std::string("ssi");
std::cout << findShortestSubString(longString, shortStrings, sizeof(shortStrings) / sizeof(std::string)) << std::endl;
return 0;
}
there maybe more than just on possible shortest string, maybe i should keep their indexes
/**
* this algorithm will find one shortest solution,but maybe more candidates
* @param text
* @param t1
* @param t2
* @param t3
* @return
*/
public List<String> findShortestString(String text, String t1, String t2, String t3){
int start = 0, slen = text.length();
List<String> ans = new LinkedList<>();
int [] index1 = getSubIndex(text, t1);
int [] index2 = getSubIndex(text, t2);
int [] index3 = getSubIndex(text, t3);
int matched[] = new int[]{-1, -1, -1};
int len = text.length();
for(int i = 0; i < len; ++ i){
if(index1[i] == 1) matched[0] = i;
if(index2[i] == 1) matched[1] = i;
if(index3[i] == 1) matched[2] = i;
if(matched[0] > -1 && matched[1] > -1 && matched[2] > -1){
int maxEnd = Math.max(matched[0] + t1.length() - 1, matched[1] + t2.length() - 1);
maxEnd = Math.max(maxEnd, matched[2] + t3.length() - 1);
int minStart = Math.min(matched[0], matched[1]);
minStart = Math.min(matched[2], minStart);
int currLen = maxEnd - minStart + 1;
if(currLen <= slen){
slen = currLen;
start = minStart;
if(currLen < currLen) {
if (!ans.isEmpty()) {
ans.clear();
}
}
ans.add(text.substring(start, start + slen));
}
}
}
return ans;
}
/**
* find all possible start index for t in text
* @param text
* @param t
* @return
*/
public int [] getSubIndex(String text, String t){
int ret[] = new int[text.length()];
for(int i = 0; i < text.length();){
int k = text.indexOf(t, i);
if(k == -1) break;
ret[k] = 1;
i = k + 1;
}
return ret;
}
This is a single pass O(n*m) where n is the lengths of the big string and m is the combined length of the small strings
import java.util.*;
public class MinStringSpan
{
static class SmallString
{
public SmallString(String t)
{
this.t=t;
}
public String t;
public int matchIndex=-1;
public LinkedList<Integer> indexes=new LinkedList<Integer>();
public void checkChar(int i,char ch)
{
if(t.length()==1)
{
if(t.charAt(i)==ch)
matchIndex=i;
return;
}
Iterator<Integer> it=indexes.iterator();
while(it.hasNext())
{
int ind=it.next();
if(t.charAt(i-ind)==ch)
{
if(i-ind==t.length()-1)
{
matchIndex=ind;
it.remove();
}
}
else
it.remove();
}
if(ch==t.charAt(0))
{
indexes.add(i);
}
}
}
static public String minSpan(String s,String t1,String t2,String t3)
{
ArrayList<SmallString> small=new ArrayList<SmallString>();
for(String t:new String[]{t1,t2,t3})
{
if(t.length()>0)
small.add(new SmallString(t));
}
int n=s.length();
int minIndex1=-1;
int minIndex2=-1;
for(int i=0;i<n;i++)
{
char ch=s.charAt(i);
boolean found=true;
int thisi1=-1;
int thisi2=-1;
for(SmallString ss:small)
{
ss.checkChar(i,ch);
if(ss.matchIndex==-1)
found=false;
else if(found)
{
int i1=ss.matchIndex;
int i2=ss.matchIndex+ss.t.length();
if(thisi1==-1 || i1<thisi1)
thisi1=i1;
if(thisi2==-1 || i2>thisi2)
thisi2=i2;
}
}
if(found)
{
if(minIndex1==-1 || (thisi2-thisi1<minIndex2-minIndex1))
{
minIndex1=thisi1;
minIndex2=thisi2;
}
}
}
if(minIndex1==-1)
return "";
return s.substring(minIndex1,minIndex2);
} public static void main(String [] args)
{
String x=MinStringSpan.minSpan("abbcdefbc","bc","fbc","ef");
System.out.println(x);
}
}
// A simple implementation in golang.
// Comment your improvements.
package main
import (
"fmt"
)
// Return a slice of starting indices of the substring sub in string s
func getIndices(s, sub string) []int {
indices := []int{}
for i := 0; i < len(s); {
if s[i] == sub[0] {
matched := false
index := i
for j := 1; j < len(sub); j++ {
i++
if i >= len(s) {
break
}
if s[i] == sub[j] {
matched = true
} else {
matched = false
}
}
if matched == true {
indices = append(indices, index)
}
} else {
i++
}
}
return indices
}
// Return min of 3 ints
func min3(a, b, c int) int{
min := a
if b < min {
min = b
}
if c < min {
min = c
}
return min
}
// Return max of 3 ints
func max3 (a, b, c int) int {
max := a
if b > max {
max = b
}
if c > max {
max = c
}
return max
}
func main() {
s := "stored in a map and persisted in a gob. All you need to do is replace the gob wit"
t1, t2, t3 := "to", "in", "gob"
indices1 := getIndices(s, t1)
indices2 := getIndices(s, t2)
indices3 := getIndices(s, t3)
// Assumes the substring is present atleast once
min := min3(indices1[0], indices2[0], indices3[0])
max := max3(indices1[0]+len(t1), indices2[0]+len(t2), indices3[0]+len(t3))
fmt.Println(s[min:max])
}
// Output:
// tored in a map and persisted in a gob
// What about this C# Code .. It seems to work
struct found
{
public int loc;
public string str;
public found(int l, string s)
{
loc = l;
str = s;
}
}
public static void FindShortesSubstring(string S, string t1, string t2, string t3)
{
System.Collections.Generic.Dictionary<string,int> tbl = new Dictionary<string,int> ();
tbl.Add(t1, 0);
tbl.Add(t2, 0);
tbl.Add(t3, 0);
System.Collections.Queue Q = new System.Collections.Queue();
int MinWindow = 0;
int MaxWindow = S.Length;
int MaxEnd = 0;
for (int i = 0; i < S.Length; i++)
{
if (i+t1.Length <= S.Length && S.Substring(i, t1.Length) == t1)
{
Q.Enqueue(new found(i, t1));
tbl[t1]++;
MaxEnd = (MaxEnd < i + t1.Length) ? i + t1.Length : MaxEnd;
}
if (i + t2.Length <= S.Length && S.Substring(i, t2.Length) == t2)
{
Q.Enqueue(new found(i, t2));
tbl[t2]++;
MaxEnd = (MaxEnd < i + t2.Length) ? i + t2.Length : MaxEnd;
}
if (i + t3.Length <= S.Length && S.Substring(i, t3.Length) == t3)
{
Q.Enqueue(new found(i, t3));
tbl[t3]++;
MaxEnd = (MaxEnd < i + t3.Length) ? i + t3.Length : MaxEnd;
}
if (tbl[t1] > 0 && tbl[t2] > 0 && tbl[t3] > 0)
{
found f = (found) Q.Dequeue();
if (MaxEnd - f.loc < MaxWindow - MinWindow)
{
MinWindow = f.loc;
MaxWindow = MaxEnd;
}
tbl[f.str]--;
}
}
System.Console.WriteLine( S + "\n" + t1 + "\n" + t2 + "\n" + t3 + " \nMin Window : " + S.Substring(MinWindow, MaxWindow - MinWindow ) + " \nStart at: " + MinWindow + " End at: " + MaxWindow );
}
struct found
{
public int loc;
public string str;
public found(int l, string s)
{
loc = l;
str = s;
}
}
public static void FindShortesSubstring(string S, string t1, string t2, string t3)
{
System.Collections.Generic.Dictionary<string,int> tbl = new Dictionary<string,int> ();
tbl.Add(t1, 0);
tbl.Add(t2, 0);
tbl.Add(t3, 0);
System.Collections.Queue Q = new System.Collections.Queue();
int MinWindow = 0;
int MaxWindow = S.Length;
int MaxEnd = 0;
for (int i = 0; i < S.Length; i++)
{
if (i+t1.Length <= S.Length && S.Substring(i, t1.Length) == t1)
{
Q.Enqueue(new found(i, t1));
tbl[t1]++;
MaxEnd = (MaxEnd < i + t1.Length) ? i + t1.Length : MaxEnd;
}
if (i + t2.Length <= S.Length && S.Substring(i, t2.Length) == t2)
{
Q.Enqueue(new found(i, t2));
tbl[t2]++;
MaxEnd = (MaxEnd < i + t2.Length) ? i + t2.Length : MaxEnd;
}
if (i + t3.Length <= S.Length && S.Substring(i, t3.Length) == t3)
{
Q.Enqueue(new found(i, t3));
tbl[t3]++;
MaxEnd = (MaxEnd < i + t3.Length) ? i + t3.Length : MaxEnd;
}
if (tbl[t1] > 0 && tbl[t2] > 0 && tbl[t3] > 0)
{
found f = (found) Q.Dequeue();
if (MaxEnd - f.loc < MaxWindow - MinWindow)
{
MinWindow = f.loc;
MaxWindow = MaxEnd;
}
tbl[f.str]--;
}
}
System.Console.WriteLine( S + "\n" + t1 + "\n" + t2 + "\n" + t3 + " \nMin Window : " + S.Substring(MinWindow, MaxWindow - MinWindow ) + " \nStart at: " + MinWindow + " End at: " + MaxWindow );
}
struct found
{
public int loc;
public string str;
public found(int l, string s)
{
loc = l;
str = s;
}
}
public static void FindShortesSubstring(string S, string t1, string t2, string t3)
{
System.Collections.Generic.Dictionary<string,int> tbl = new Dictionary<string,int> ();
tbl.Add(t1, 0);
tbl.Add(t2, 0);
tbl.Add(t3, 0);
System.Collections.Queue Q = new System.Collections.Queue();
int MinWindow = 0;
int MaxWindow = S.Length;
int MaxEnd = 0;
for (int i = 0; i < S.Length; i++)
{
if (i+t1.Length <= S.Length && S.Substring(i, t1.Length) == t1)
{
Q.Enqueue(new found(i, t1));
tbl[t1]++;
MaxEnd = (MaxEnd < i + t1.Length) ? i + t1.Length : MaxEnd;
}
if (i + t2.Length <= S.Length && S.Substring(i, t2.Length) == t2)
{
Q.Enqueue(new found(i, t2));
tbl[t2]++;
MaxEnd = (MaxEnd < i + t2.Length) ? i + t2.Length : MaxEnd;
}
if (i + t3.Length <= S.Length && S.Substring(i, t3.Length) == t3)
{
Q.Enqueue(new found(i, t3));
tbl[t3]++;
MaxEnd = (MaxEnd < i + t3.Length) ? i + t3.Length : MaxEnd;
}
if (tbl[t1] > 0 && tbl[t2] > 0 && tbl[t3] > 0)
{
found f = (found) Q.Dequeue();
if (MaxEnd - f.loc < MaxWindow - MinWindow)
{
MinWindow = f.loc;
MaxWindow = MaxEnd;
}
tbl[f.str]--;
}
}
System.Console.WriteLine( S + "\n" + t1 + "\n" + t2 + "\n" + t3 + " \nMin Window : " + S.Substring(MinWindow, MaxWindow - MinWindow ) + " \nStart at: " + MinWindow + " End at: " + MaxWindow );
}
struct found
{
public int loc;
public string str;
public found(int l, string s)
{
loc = l;
str = s;
}
}
public static void FindShortesSubstring(string S, string t1, string t2, string t3)
{
System.Collections.Generic.Dictionary<string,int> tbl = new Dictionary<string,int> ();
tbl.Add(t1, 0);
tbl.Add(t2, 0);
tbl.Add(t3, 0);
System.Collections.Queue Q = new System.Collections.Queue();
int MinWindow = 0;
int MaxWindow = S.Length;
int MaxEnd = 0;
for (int i = 0; i < S.Length; i++)
{
if (i+t1.Length <= S.Length && S.Substring(i, t1.Length) == t1)
{
Q.Enqueue(new found(i, t1));
tbl[t1]++;
MaxEnd = (MaxEnd < i + t1.Length) ? i + t1.Length : MaxEnd;
}
if (i + t2.Length <= S.Length && S.Substring(i, t2.Length) == t2)
{
Q.Enqueue(new found(i, t2));
tbl[t2]++;
MaxEnd = (MaxEnd < i + t2.Length) ? i + t2.Length : MaxEnd;
}
if (i + t3.Length <= S.Length && S.Substring(i, t3.Length) == t3)
{
Q.Enqueue(new found(i, t3));
tbl[t3]++;
MaxEnd = (MaxEnd < i + t3.Length) ? i + t3.Length : MaxEnd;
}
if (tbl[t1] > 0 && tbl[t2] > 0 && tbl[t3] > 0)
{
found f = (found) Q.Dequeue();
if (MaxEnd - f.loc < MaxWindow - MinWindow)
{
MinWindow = f.loc;
MaxWindow = MaxEnd;
}
tbl[f.str]--;
}
}
System.Console.WriteLine( S + "\n" + t1 + "\n" + t2 + "\n" + t3 + " \nMin Window : " + S.Substring(MinWindow, MaxWindow - MinWindow ) + " \nStart at: " + MinWindow + " End at: " + MaxWindow );
}
public class G1One {
public static void main(String[] args) {
String s = "abcdefghijklmnop";
String t1 = "cde";
String t2 = "mn";
String t3 = "efg";
findSortest(s, t1, t2, t3);
}
public static void findSortest(String s, String t1, String t2, String t3) {
int index4eacht[] = { s.indexOf(t1.charAt(0)), s.indexOf(t2.charAt(0)), s.indexOf(t3.charAt(0)) };
Arrays.sort(index4eacht);
int lastIndex4eacht[] = { s.lastIndexOf(t1.charAt(t1.length() - 1)), s.lastIndexOf(t2.charAt(t2.length() - 1)),
s.lastIndexOf(t3.charAt(t3.length() - 1)) };
Arrays.sort(lastIndex4eacht);
System.out.println(s.substring(index4eacht[0], lastIndex4eacht[2] + 1));
}
}
public class G1One {
public static void main(String[] args) {
String s = "abcdefghijklmnop";
String t1 = "cde";
String t2 = "mn";
String t3 = "efg";
findSortest(s, t1, t2, t3);
}
public static void findSortest(String s, String t1, String t2, String t3) {
int index4eacht[] = { s.indexOf(t1.charAt(0)), s.indexOf(t2.charAt(0)), s.indexOf(t3.charAt(0)) };
Arrays.sort(index4eacht);
int lastIndex4eacht[] = { s.lastIndexOf(t1.charAt(t1.length() - 1)), s.lastIndexOf(t2.charAt(t2.length() - 1)),
s.lastIndexOf(t3.charAt(t3.length() - 1)) };
Arrays.sort(lastIndex4eacht);
System.out.println(s.substring(index4eacht[0], lastIndex4eacht[2] + 1));
}
}}
Generified version that can take any number of substrings to look for. Time complexity becomes O(nm), where n is the length of the text and m is the number of substrings.
public static String findSmallestSubstring(String s, String... subs) {
if (s == null || subs == null || subs.length == 0) {
return null;
}
if (subs.length == 1) {
return subs[0] == null || !s.contains(subs[0]) ? null : subs[0];
}
List<String> subList = Lists.newArrayList();
for (String subStr : subs) {
Optional.ofNullable(subStr).ifPresent(subList::add);
}
Map<String, int[]> startPositions = Maps.newHashMap();
for (String sub : subList) {
startPositions.put(sub, findMatchPositions(s, sub));
}
int smallest = Integer.MAX_VALUE;
int startPos = 0, endPos = 0;
Map<String, Integer> matchPositions = new HashMap<>(subList.size());
Set<Map.Entry<String, int[]>> entries = startPositions.entrySet();
for (int i = 0; i < s.length(); i++) {
final int index = i;
if (entries.stream().noneMatch(entry -> entry.getValue()[index] == 1)) {
continue;
}
entries.stream().filter(entry -> entry.getValue()[index] == 1).forEach(
entry -> matchPositions.put(entry.getKey(), index));
if (matchPositions.size() == subList.size()) {
Integer minStart = matchPositions
.values()
.stream()
.min(Comparator.comparingInt(value -> value))
.orElse(Integer.MIN_VALUE);
Integer maxEnd = matchPositions
.entrySet()
.stream()
.max(Comparator.comparingInt(ShortestSubstring::getEndOfSub))
.map(ShortestSubstring::getEndOfSub)
.orElse(Integer.MAX_VALUE);
if (maxEnd - minStart < smallest) {
smallest = maxEnd - minStart;
startPos = minStart;
endPos = maxEnd;
}
}
}
if (matchPositions.size() < subList.size()) {
return null;
}
return s.substring(startPos, endPos);
}
O(N) computation complexity and O(1) space complexity solution.
Details: cpluspluslearning-petert.blogspot.co.uk/2015/12/find-shortest-sub-string-containing-s1.html
Test
void TestFindTheShortestSubStringContainS1S2S3()
{
{
const std::string input("abc0123456789aksdfjasd");
const std::string s1("0123");
const std::string s2("3456");
const std::string s3("");
std::string result = FindTheShortestSubStringContainS123(input, s1, s2, s3);
assert(result.empty() == true);
}
{
const std::string input("abc0123456789aksdfjasd");
const std::string s1("0123456");
const std::string s2("3456");
const std::string s3("1234");
std::string result = FindTheShortestSubStringContainS123(input, s1, s2, s3);
assert(result == std::string("0123456"));
}
{
const std::string input("abc0123456789aksdfjasd");
const std::string s1("0123");
const std::string s2("3456");
const std::string s3("6789");
std::string result = FindTheShortestSubStringContainS123(input, s1, s2, s3);
assert(result == std::string("0123456789"));
}
{
const std::string input("sdfa01234ad23456dfad6789abc0123456789aksdfjasd");
const std::string s1("0123");
const std::string s2("3456");
const std::string s3("6789");
std::string result = FindTheShortestSubStringContainS123(input, s1, s2, s3);
assert(result == std::string("0123456789"));
}
{
const std::string input("sdfa01234ad23456dfad6789abc0123456789aksdfjasd0123skd3456kjsd6789jhs");
const std::string s1("0123");
const std::string s2("3456");
const std::string s3("6789");
std::string result = FindTheShortestSubStringContainS123(input, s1, s2, s3);
assert(result == std::string("0123456789"));
}
}
package interview.datastruc.strings;
import java.util.ArrayList;
import java.util.Arrays;
public class Question1
{
public void computeIndexes(String str, String s, ArrayList<Integer> indList)
{
int ind = -1;
while(true)
{
ind = str.indexOf(s,ind+1);
if (ind!=-1)
indList.add(ind);
else
break;
}
}
public int[] getMinIndxRange(String str, String[] substr)
{
ArrayList<ArrayList<Integer>> indexes = new ArrayList<ArrayList<Integer>>();
for(String s:substr)
{
ArrayList<Integer> ind = new ArrayList<Integer>();
computeIndexes(str, s, ind);
indexes.add(ind);
}
int[] slidingWindows = new int[substr.length];
Arrays.fill(slidingWindows,-1);
boolean newVals = false;
int[] result = new int[2];
result[0]=result[1]=-1;
int resLen=Integer.MAX_VALUE;
for(int i=0; i<str.length(); i++)
{
int nonZeros=0;
for(int j=0; j<indexes.size(); j++)
{
ArrayList<Integer> ind = indexes.get(j);
if (ind.contains(i))
{
slidingWindows[j]=i;
newVals=true;
}
if (slidingWindows[j]>=0)
nonZeros++;
}
if (nonZeros==indexes.size()&& newVals==true)
{
int min=slidingWindows[0];
int max=slidingWindows[0];
int len = substr[0].length()-1;
for(int tempi=1; tempi<slidingWindows.length;tempi++)
{
if (slidingWindows[tempi]<min)
min=slidingWindows[tempi];
if (slidingWindows[tempi]>max)
{
max=slidingWindows[tempi];
len=substr[tempi].length()-1;
}
}
int fullLen = (max-min+1+len);
if (fullLen<resLen)
{
result[0]=min;
result[1]=max+len;
resLen=fullLen;
}
newVals=false;
}
}
return result;
}
public static void main(String[] args)
{
int[] res=new Question1().getMinIndxRange("00as0de00000as00bp1000de000bp000asde120bp1", new String[]{"as","bp1","de"});
System.out.println(res[0]+" to "+res[1]);
res=new Question1().getMinIndxRange("Mississippi", new String[]{"is","si"});
System.out.println(res[0]+" to "+res[1]);
}
}
The complexity is O(n), where n is the lenght of s.
def max_substr(s, t1, t2, t3):
res = ''
for i in s:
res += i
if (not(t1 in res or t2 in res or t3 in res or
res in t1 or res in t2 or res in t3)):
res = res[1::] #remove the first element
if (t1 in res and t2 in res and t3 in res):
return res
return res
Use Two passes:
- sameer November 28, 20151. In first pass find starting position of all occurences of t1, t2, t3 in string in O(n) time
2. Use sliding window to find minimum window using input from step-(1) - O(n) time.
I will provide code below