Microsoft Interview Question
Team: Application insight
Country: Israel
Interview Type: In-Person
I think it can be solved with hash[26].
Bigger string: T
Shorter string: P
Make a hash of P: hash[26] -> {no.ofA,no.ofB, no.ofC,.., no.ofZ}
for example: mimic=>
0,0,c=1,0,0,0, 0,0,i=2,0,0,0, m=2,0,0,0,0,0, 0,0,0,0,0,0, 0,0
Now,
matchcnt = 0;
for i=0 to strlen(T)
{ if (hash[T[i] == 0) { matchcnt = 0; continue; }
else { hash[T[i]] --; matchcnt++;
if (matchcnt == strlen(P) return (i-l);
}
}
C#
following sample returns 1, 2, 3, 4 for "BANANA" and "BA"
public void Main1()
{
String str1 = "BANANA";
String str2 = "AN";
//get indices of all permutations to a list
List<int> perms = GetPerms(str1, str2);
foreach(int i in perms)
{
Console.WriteLine(i);
}
Console.ReadLine();
}
private List<int> GetPerms(String s1, String s2)
{
List<int> matchIndices = new List<int>();
String small, large;
int smallLen, largeLen;
if ((s1 != null) && (s2 != null))
{
//1. Identify which string is small and which is large
smallLen = s1.Length;
largeLen = s2.Length;
if (smallLen > largeLen)
{
//swap
int tmp = smallLen;
smallLen = largeLen;
largeLen = tmp;
small = s2;
large = s1;
}
else
{
small = s1;
large = s2;
}
Int16[] smallFreq = new Int16[Int16.MaxValue];
Int16[] largeFreq = new Int16[Int16.MaxValue];
//get char frequency of small String
for(int i = 0; i < smallLen; i++)
{
smallFreq[small[i]] = (Int16) (smallFreq[small[i]] + 1);
}
int charCount = 0;// Count of chars in small
int matchStartIndex = -1; // The index where a match was found
//get char frequency of large String
for (int i = 0; i < largeLen; i++)
{
if (smallFreq[large[i]] > largeFreq[large[i]]) //a character match found
{
if (matchStartIndex == -1)
{
matchStartIndex = i; //remember the start index of possible permutation
}
largeFreq[large[i]] = (Int16)(largeFreq[large[i]] + 1);
charCount++;
if (charCount == smallLen) //permutation found
{
matchIndices.Add(matchStartIndex);
i = matchStartIndex; //start from the next location
matchStartIndex = -1; //flag that we are not within a match
charCount = 0;
Array.Clear(largeFreq, 0, largeFreq.Length);
}
}
else if ((smallFreq[large[i]] > 0) && (smallFreq[large[i]] == largeFreq[large[i]])) //too many instances of a character found within a possible permutation
{
//skip one index and try again
i = matchStartIndex; //start from the next location
matchStartIndex = -1; //flag that we are not within a match
charCount = 0;
Array.Clear(largeFreq, 0, largeFreq.Length);
}
else if (matchStartIndex != -1)
{
//mismatch found while in the middle of a possible permutation
matchStartIndex = -1;
charCount = 0;
Array.Clear(largeFreq, 0, largeFreq.Length);
}
}
}
return matchIndices;
}
This is java version of the code with sliding window approach. Feel free to add comments on how to optimize this code.
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
private static int matchPermutedSubString(String str, String substr){
if(str == null || substr == null){
return 0;
}
if(str.length() < substr.length() || str.length() == 0 || substr.length() == 0){
return 0;
}
Map<Character, Integer> mapSubStr = buildHashMap(substr);
Map<Character, Integer> mapWindow = null;
int windowLen = substr.length();
for(int i = 0; i <= str.length(); i++){
if(i + windowLen <= str.length()){
int startIndex = i;
String pat = str.substring(i, i + windowLen);
mapWindow = buildHashMap(pat);
if(areMapsEqual(mapSubStr, mapWindow)){
return startIndex;
}
else{
mapWindow = null;
}
}
else{
return 0;
}
}
return 0;
}
private static boolean areMapsEqual(Map<Character, Integer> map1, Map<Character, Integer> map2){
Set<Entry<Character, Integer>> entry1 = map1.entrySet();
Iterator<Entry<Character, Integer>> itr1 = entry1.iterator();
Iterator<Entry<Character, Integer>> itr2 = map2.entrySet().iterator();
while(itr1.hasNext() && itr2.hasNext()){
if(!itr1.next().equals(itr2.next())){
return false;
}
}
return true;
}
private static Map<Character, Integer> buildHashMap(String str){
Map<Character, Integer> map = new HashMap();
for(char c: str.toCharArray()){
if(map.containsKey(c)){
map.put(c, map.get(c) + 1);
}
else{
map.put(c, 1);
}
}
return map;
}
Java Solution with 2 maps
a short str-O( N)
b long strO( M)
each letter in b can be accessed 2 times at most ( 1- add to map 2- delete from map)
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
public class Sol {
public static void main (String [] args){
String a= "sis";
String b = "mississippi";
ArrayList<Integer> ans = getFirstIndPerm(a,b);
System.out.println(ans);
}
private static ArrayList<Integer> getFirstIndPerm(String a, String b) {
ArrayList<Integer> ans = new ArrayList<Integer>();
HashMap<Character,Integer > charMap = new HashMap<Character,Integer > ();
HashMap<Character,Integer > checkMap = new HashMap<Character,Integer > ();
int start = -1;
int l=0;
int shortLen=a.length();
updateMap(a.toCharArray(),charMap);
for(int i=0; i< b.length();++i){
char cur = b.charAt(i);
if(!charMap.containsKey(cur))
{
l=0;
checkMap.clear();
start=-1;
continue;
}
if(charMap.get(cur)==checkMap.get(cur)){
for(int j=0;j<l;j++){
char ch= b.charAt(start+j);
checkMap.put(ch, checkMap.get(ch)-1);
l--;
if(l==0)
{
start=-1;
break;
}
if(ch==cur){
start+=j+1;
break;
}
}
}
if(start==-1)
start=i;
if(!checkMap.containsKey(cur))
checkMap.put(cur, 1);
else
checkMap.put(cur,checkMap.get(cur)+1);
l++;
if(l==shortLen){
ans.add(start);
char ch= b.charAt(start);
checkMap.put(ch, checkMap.get(ch)-1);
l--;
start++;
}
}
return ans;
}
private static void updateMap(char[] cs, HashMap<Character, Integer> charMap) {
for(char ch : cs){
if(charMap.containsKey(ch))
charMap.put(ch,charMap.get(ch)+1);
else
charMap.put(ch,1);
}
}
}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
public class Sol {
public static void main (String [] args){
String a= "sis";
String b = "mississippi";
ArrayList<Integer> ans = getFirstIndPerm(a,b);
System.out.println(ans);
}
private static ArrayList<Integer> getFirstIndPerm(String a, String b) {
ArrayList<Integer> ans = new ArrayList<Integer>();
HashMap<Character,Integer > charMap = new HashMap<Character,Integer > ();
HashMap<Character,Integer > checkMap = new HashMap<Character,Integer > ();
int start = -1;
int l=0;
int shortLen=a.length();
updateMap(a.toCharArray(),charMap);
for(int i=0; i< b.length();++i){
char cur = b.charAt(i);
if(!charMap.containsKey(cur))
{
l=0;
checkMap.clear();
start=-1;
continue;
}
if(charMap.get(cur)==checkMap.get(cur)){
for(int j=0;j<l;j++){
char ch= b.charAt(start+j);
checkMap.put(ch, checkMap.get(ch)-1);
l--;
if(l==0)
{
start=-1;
break;
}
if(ch==cur){
start+=j+1;
break;
}
}
}
if(start==-1)
start=i;
if(!checkMap.containsKey(cur))
checkMap.put(cur, 1);
else
checkMap.put(cur,checkMap.get(cur)+1);
l++;
if(l==shortLen){
ans.add(start);
char ch= b.charAt(start);
checkMap.put(ch, checkMap.get(ch)-1);
l--;
start++;
}
}
return ans;
}
private static void updateMap(char[] cs, HashMap<Character, Integer> charMap) {
for(char ch : cs){
if(charMap.containsKey(ch))
charMap.put(ch,charMap.get(ch)+1);
else
charMap.put(ch,1);
}
}
}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
public class Sol {
public static void main (String [] args){
String a= "sis";
String b = "mississippi";
ArrayList<Integer> ans = getFirstIndPerm(a,b);
System.out.println(ans);
}
private static ArrayList<Integer> getFirstIndPerm(String a, String b) {
ArrayList<Integer> ans = new ArrayList<Integer>();
HashMap<Character,Integer > charMap = new HashMap<Character,Integer > ();
HashMap<Character,Integer > checkMap = new HashMap<Character,Integer > ();
int start = -1;
int l=0;
int shortLen=a.length();
updateMap(a.toCharArray(),charMap);
for(int i=0; i< b.length();++i){
char cur = b.charAt(i);
if(!charMap.containsKey(cur))
{
l=0;
checkMap.clear();
start=-1;
continue;
}
if(charMap.get(cur)==checkMap.get(cur)){
for(int j=0;j<l;j++){
char ch= b.charAt(start+j);
checkMap.put(ch, checkMap.get(ch)-1);
l--;
if(l==0)
{
start=-1;
break;
}
if(ch==cur){
start+=j+1;
break;
}
}
}
if(start==-1)
start=i;
if(!checkMap.containsKey(cur))
checkMap.put(cur, 1);
else
checkMap.put(cur,checkMap.get(cur)+1);
l++;
if(l==shortLen){
ans.add(start);
char ch= b.charAt(start);
checkMap.put(ch, checkMap.get(ch)-1);
l--;
start++;
}
}
return ans;
}
private static void updateMap(char[] cs, HashMap<Character, Integer> charMap) {
for(char ch : cs){
if(charMap.containsKey(ch))
charMap.put(ch,charMap.get(ch)+1);
else
charMap.put(ch,1);
}
}
}
Time complexity is O(N), and does not need extra space. It is possible that long string could repeat short string for several times (e.g.: 1abc2abc -> abc), so i use a int[] to store all start indexes.
var shortarr = shortstr.ToArray();
var longarr = longstr.ToArray();
List<int> output = new List<int>();
int si = 0, li = 0;
while (li < longarr.Length)
{
while (li < longarr.Length && si < shortarr.Length && longarr[li] == shortarr[si])
{
li++;
si++;
}
if (si == shortarr.Length)
{
output.Add(li - si);
si = 0;
}
li++;
}
return output.ToArray();
Interesting question. It can be solved in linear time, or, to be more precise, O(N+M) where N and M is the length of each string.
Assume the larger string has size N, and the shorted has size M. A naive approach is to iterate through each index i in the large string, compute the character frequency table of large_str[i..i+M-1], and compare it to the short string's character frequency table - if they are the same, then we report i as an index.
However, computing the character frequency table of each substring from scratch and comparing it against the short string's table is O(M), so this would be O(NM).
We can improve on this and make it O(N+M). First, compute the character frequency table for the short string. Then, compute the character frequency table for large_str[0..M-1]. As you build it for the larger string, keep track of how many unique positions were filled, and how many of those matched with positions in the short string frequency table.
Then it works as if you had a sliding window on the larger string. When you move it right one place, remove the character on the left, and add the new character on the right, updating the number of matched positions so far, along with the number of total positions.
If you implement it correctly, then at any given time you know how many positions you have matched so far and how many unique positions are set in the character frequency table. So you can test if both tables match in O(1) - they are the same it the number of matched positions on the large table is equal to the number of positions filled in the small table, and if the number of total filled positions in the large stable is equal to the number of positions filled in the small table.
Implementation in C below; ran a few test cases and seems to be good. One interesting testcase is "mississippi" with "sis". It correctly finds all overlapping matches.
- 010010.bin August 03, 2015