Microsoft Interview Question


Team: Application insight
Country: Israel
Interview Type: In-Person




Comment hidden because of low score. Click to expand.
4
of 4 vote

Interesting question. It can be solved in linear time, or, to be more precise, O(N+M) where N and M is the length of each string.

Assume the larger string has size N, and the shorted has size M. A naive approach is to iterate through each index i in the large string, compute the character frequency table of large_str[i..i+M-1], and compare it to the short string's character frequency table - if they are the same, then we report i as an index.

However, computing the character frequency table of each substring from scratch and comparing it against the short string's table is O(M), so this would be O(NM).

We can improve on this and make it O(N+M). First, compute the character frequency table for the short string. Then, compute the character frequency table for large_str[0..M-1]. As you build it for the larger string, keep track of how many unique positions were filled, and how many of those matched with positions in the short string frequency table.

Then it works as if you had a sliding window on the larger string. When you move it right one place, remove the character on the left, and add the new character on the right, updating the number of matched positions so far, along with the number of total positions.

If you implement it correctly, then at any given time you know how many positions you have matched so far and how many unique positions are set in the character frequency table. So you can test if both tables match in O(1) - they are the same it the number of matched positions on the large table is equal to the number of positions filled in the small table, and if the number of total filled positions in the large stable is equal to the number of positions filled in the small table.

Implementation in C below; ran a few test cases and seems to be good. One interesting testcase is "mississippi" with "sis". It correctly finds all overlapping matches.

#include <stdio.h>
#include <assert.h>
#include <string.h>
#include <limits.h>

void match_permutations_aux(const char *big_str, const char *short_str, size_t short_len,
			    const size_t short_freq[UCHAR_MAX+1], size_t nfreq) {

	static size_t big_freq[UCHAR_MAX+1];
	memset(big_freq, 0, sizeof(big_freq));

	size_t total_big = 0;
	size_t matched = 0;
	size_t i;
	for (i = 0; i < short_len; i++) {
		unsigned char idx = big_str[i];
		if (big_freq[idx] == short_freq[idx] && big_freq[idx] != 0)
			matched--;

		if (big_freq[idx] == 0)
			total_big++;

		big_freq[idx]++;

		if (big_freq[idx] == short_freq[idx])
			matched++;
	}

	if (matched == nfreq && total_big == nfreq)
		printf("0 ");

	for (i = short_len; big_str[i] != '\0'; i++) {
		unsigned char idx_remove = big_str[i-short_len];
		unsigned char idx_add = big_str[i];

		assert(big_freq[idx_remove] > 0);

		/* Remove leftmost character */
		if (big_freq[idx_remove] == short_freq[idx_remove])
			matched--;
		big_freq[idx_remove]--;
		if (big_freq[idx_remove] == short_freq[idx_remove] && big_freq[idx_remove] != 0)
			matched++;
		if (big_freq[idx_remove] == 0)
			total_big--;

		/* Add rightmost character */
		if (big_freq[idx_add] == 0)
			total_big++;
		if (big_freq[idx_add] == short_freq[idx_add] && big_freq[idx_add] != 0)
			matched--;
		big_freq[idx_add]++;
		if (big_freq[idx_add] == short_freq[idx_add])
			matched++;

		if (matched == nfreq && total_big == nfreq)
			printf("%zu ", i-short_len+1);

	}
}

void match_permutations(const char *str_a, const char *str_b) {
	if (str_a == NULL || str_b == NULL)
		return;

	const char *big_str = str_a;
	const char *short_str = str_b;
	size_t short_len;

	if (strlen(str_b) > strlen(str_a)) {
		big_str = str_b;
		short_str = str_a;
	}

	short_len = strlen(short_str);

	static size_t short_freq[UCHAR_MAX+1];
	memset(short_freq, 0, sizeof(short_freq));

	size_t total_short = 0;
	size_t i;
	for (i = 0; i < short_len; i++) {
		if (short_freq[(unsigned char) short_str[i]] == 0)
			total_short++;
		short_freq[(unsigned char) short_str[i]]++;
	}

	match_permutations_aux(big_str, short_str, short_len, short_freq, total_short);
	putchar('\n');
}

static char str_a_buff[1024];
static char str_b_buff[1024];

int main(void) {
	printf("Enter string A and B.\n");
	printf("> ");

	while (scanf("%s%s", str_a_buff, str_b_buff) == 2) {
		match_permutations(str_a_buff, str_b_buff);
		printf("> ");
	}

	return 0;
}

- 010010.bin August 03, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

I think there is no use of this line in the first for loop, correct me if I am wrong :

if (big_freq[idx] == short_freq[idx] && big_freq[idx] != 0)
			matched--;

- Phantom September 07, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

Can we build a suffix tree with the long string and find the short string permutation in it ??

- ram.prasad.prabu August 04, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

sort both strings and then find the shorter one in the longer.

- shabgard August 13, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I think it can be solved with hash[26].
Bigger string: T
Shorter string: P
Make a hash of P: hash[26] -> {no.ofA,no.ofB, no.ofC,.., no.ofZ}
for example: mimic=>

0,0,c=1,0,0,0, 0,0,i=2,0,0,0, m=2,0,0,0,0,0, 0,0,0,0,0,0, 0,0

Now,

matchcnt = 0;
 for i=0 to strlen(T)
	{ if (hash[T[i] == 0) { matchcnt = 0; continue; }
	  else { hash[T[i]] --; matchcnt++; 
		     if (matchcnt == strlen(P) return (i-l);
		 }
	}

- anonymous August 26, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

C#
following sample returns 1, 2, 3, 4 for "BANANA" and "BA"

public void Main1()
		{
			String str1 = "BANANA";
			String str2 = "AN";
			//get indices of all permutations to a list
			List<int> perms = GetPerms(str1, str2);

			foreach(int i in perms)
			{
				Console.WriteLine(i);
			}
			Console.ReadLine();
		}

		private List<int> GetPerms(String s1, String s2)
		{
			List<int> matchIndices = new List<int>();
			String small, large;

			int smallLen, largeLen;

			if ((s1 != null) && (s2 != null))
			{
				//1. Identify which string is small and which is large
				smallLen = s1.Length;
				largeLen = s2.Length;

				if (smallLen > largeLen)
				{
					//swap
					int tmp = smallLen;
					smallLen = largeLen;
					largeLen = tmp;
					small = s2;
					large = s1;
				}
				else
				{
					small = s1;
					large = s2;
				}

				Int16[] smallFreq = new Int16[Int16.MaxValue];
				Int16[] largeFreq = new Int16[Int16.MaxValue];

				//get char frequency of small String
				for(int i = 0; i < smallLen; i++)
				{
					smallFreq[small[i]] = (Int16) (smallFreq[small[i]] + 1);
				}

				int charCount = 0;// Count of chars in small 
				int matchStartIndex = -1; // The index where a match was found

				//get char frequency of large String
				for (int i = 0; i < largeLen; i++)
				{
					if (smallFreq[large[i]] > largeFreq[large[i]]) //a character match found
					{
						if (matchStartIndex == -1)
						{
							matchStartIndex = i; //remember the start index of possible permutation
						}
						largeFreq[large[i]] = (Int16)(largeFreq[large[i]] + 1);
						charCount++;
						if (charCount == smallLen) //permutation found
						{
							matchIndices.Add(matchStartIndex);
							i = matchStartIndex; //start from the next location
							matchStartIndex = -1; //flag that we are not within a match
							charCount = 0;
							Array.Clear(largeFreq, 0, largeFreq.Length);
						}
					}
					else if ((smallFreq[large[i]] > 0) && (smallFreq[large[i]] == largeFreq[large[i]])) //too many instances of a character found within a possible permutation
					{
						//skip one index and try again
						i = matchStartIndex; //start from the next location
						matchStartIndex = -1; //flag that we are not within a match
						charCount = 0;
						Array.Clear(largeFreq, 0, largeFreq.Length);
					}
					else if (matchStartIndex != -1)
					{
						//mismatch found while in the middle of a possible permutation
						matchStartIndex = -1;
						charCount = 0;
						Array.Clear(largeFreq, 0, largeFreq.Length);
					}
				}

			}
			return matchIndices;
		}

- IC August 28, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

what do you think it should return for string1= "aacbcabc" and string2= "abc" ?

- popeye123 January 27, 2016 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

This is java version of the code with sliding window approach. Feel free to add comments on how to optimize this code.

import java.util.HashMap;
	import java.util.HashSet;
	import java.util.Iterator;
	import java.util.Map;
	import java.util.Map.Entry;
	import java.util.Set;
	
	private static int matchPermutedSubString(String str, String substr){
		
		if(str == null || substr == null){
			return 0;
		}
		if(str.length() < substr.length() || str.length() == 0 || substr.length() == 0){
			return 0;
		}
		
		Map<Character, Integer> mapSubStr = buildHashMap(substr);
		Map<Character, Integer> mapWindow = null;
		int windowLen = substr.length();
		
		for(int i = 0; i <= str.length(); i++){
			
			if(i + windowLen <= str.length()){
				int startIndex = i;
				String pat = str.substring(i, i + windowLen);
				mapWindow = buildHashMap(pat);
				
				if(areMapsEqual(mapSubStr, mapWindow)){
					return startIndex;
				}
				else{
					mapWindow = null;
				}
			}
			else{
				return 0;
			}
		}
		
		return 0;
	}
	
	private static boolean areMapsEqual(Map<Character, Integer> map1, Map<Character, Integer> map2){
		
		Set<Entry<Character, Integer>> entry1 = map1.entrySet();
		Iterator<Entry<Character, Integer>> itr1 =  entry1.iterator();
		Iterator<Entry<Character, Integer>> itr2 =  map2.entrySet().iterator();
		
		while(itr1.hasNext() && itr2.hasNext()){
			
			if(!itr1.next().equals(itr2.next())){
				return false;
			}
		}
		
		return true;
	}
	
	private static Map<Character, Integer> buildHashMap(String str){
		
		Map<Character, Integer> map = new HashMap();
		
		for(char c: str.toCharArray()){
			if(map.containsKey(c)){
				
				map.put(c, map.get(c) + 1);
			}
			else{
				map.put(c, 1);
			}
		}
		
		return map;
	}

- ASHISH12021 September 17, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Simple anagram question. String first = LongStr.substr(0,shortstr.length()); and second shortstr. Check if both are anagram to each other

- VJ October 01, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Java Solution with 2 maps
a short str-O( N)
b long strO( M)
each letter in b can be accessed 2 times at most ( 1- add to map 2- delete from map)

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class Sol {
public static void main (String [] args){
		String a= "sis";
		String b = "mississippi";
		 ArrayList<Integer>  ans = getFirstIndPerm(a,b);
		System.out.println(ans);
		}

private static ArrayList<Integer> getFirstIndPerm(String a, String b) {
	ArrayList<Integer>  ans = new ArrayList<Integer>();
	HashMap<Character,Integer > charMap = new HashMap<Character,Integer > ();
	HashMap<Character,Integer > checkMap = new HashMap<Character,Integer > ();
	int start = -1;
	int l=0;
	int shortLen=a.length();
	updateMap(a.toCharArray(),charMap);
	for(int i=0; i< b.length();++i){
		char cur = b.charAt(i);
		if(!charMap.containsKey(cur))
		{	
			l=0;
			checkMap.clear();
			start=-1;
			continue;
		}
		if(charMap.get(cur)==checkMap.get(cur)){
			for(int j=0;j<l;j++){
				char ch= b.charAt(start+j);
				checkMap.put(ch, checkMap.get(ch)-1);
				l--;
				if(l==0)
				{
					start=-1;
					break;
				}
				if(ch==cur){
					start+=j+1;
					break;
				}
			}
		}
		if(start==-1)
			start=i;
		if(!checkMap.containsKey(cur))
			checkMap.put(cur, 1);
		else
			checkMap.put(cur,checkMap.get(cur)+1);
		l++;
		if(l==shortLen){
			ans.add(start);
			char ch= b.charAt(start);
			checkMap.put(ch, checkMap.get(ch)-1);
			l--;
			start++;
			
		}
			
		
		
	}
	return ans;
}

private static void updateMap(char[] cs, HashMap<Character, Integer> charMap) {
	for(char ch : cs){
		if(charMap.containsKey(ch))
			charMap.put(ch,charMap.get(ch)+1);
		else
			charMap.put(ch,1);
	}
}
		
}

- Idan October 17, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class Sol {
public static void main (String [] args){
		String a= "sis";
		String b = "mississippi";
		 ArrayList<Integer>  ans = getFirstIndPerm(a,b);
		System.out.println(ans);
		}

private static ArrayList<Integer> getFirstIndPerm(String a, String b) {
	ArrayList<Integer>  ans = new ArrayList<Integer>();
	HashMap<Character,Integer > charMap = new HashMap<Character,Integer > ();
	HashMap<Character,Integer > checkMap = new HashMap<Character,Integer > ();
	int start = -1;
	int l=0;
	int shortLen=a.length();
	updateMap(a.toCharArray(),charMap);
	for(int i=0; i< b.length();++i){
		char cur = b.charAt(i);
		if(!charMap.containsKey(cur))
		{	
			l=0;
			checkMap.clear();
			start=-1;
			continue;
		}
		if(charMap.get(cur)==checkMap.get(cur)){
			for(int j=0;j<l;j++){
				char ch= b.charAt(start+j);
				checkMap.put(ch, checkMap.get(ch)-1);
				l--;
				if(l==0)
				{
					start=-1;
					break;
				}
				if(ch==cur){
					start+=j+1;
					break;
				}
			}
		}
		if(start==-1)
			start=i;
		if(!checkMap.containsKey(cur))
			checkMap.put(cur, 1);
		else
			checkMap.put(cur,checkMap.get(cur)+1);
		l++;
		if(l==shortLen){
			ans.add(start);
			char ch= b.charAt(start);
			checkMap.put(ch, checkMap.get(ch)-1);
			l--;
			start++;
			
		}
			
		
		
	}
	return ans;
}

private static void updateMap(char[] cs, HashMap<Character, Integer> charMap) {
	for(char ch : cs){
		if(charMap.containsKey(ch))
			charMap.put(ch,charMap.get(ch)+1);
		else
			charMap.put(ch,1);
	}
}
		
}

- Idan October 17, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class Sol {
public static void main (String [] args){
		String a= "sis";
		String b = "mississippi";
		 ArrayList<Integer>  ans = getFirstIndPerm(a,b);
		System.out.println(ans);
		}

private static ArrayList<Integer> getFirstIndPerm(String a, String b) {
	ArrayList<Integer>  ans = new ArrayList<Integer>();
	HashMap<Character,Integer > charMap = new HashMap<Character,Integer > ();
	HashMap<Character,Integer > checkMap = new HashMap<Character,Integer > ();
	int start = -1;
	int l=0;
	int shortLen=a.length();
	updateMap(a.toCharArray(),charMap);
	for(int i=0; i< b.length();++i){
		char cur = b.charAt(i);
		if(!charMap.containsKey(cur))
		{	
			l=0;
			checkMap.clear();
			start=-1;
			continue;
		}
		if(charMap.get(cur)==checkMap.get(cur)){
			for(int j=0;j<l;j++){
				char ch= b.charAt(start+j);
				checkMap.put(ch, checkMap.get(ch)-1);
				l--;
				if(l==0)
				{
					start=-1;
					break;
				}
				if(ch==cur){
					start+=j+1;
					break;
				}
			}
		}
		if(start==-1)
			start=i;
		if(!checkMap.containsKey(cur))
			checkMap.put(cur, 1);
		else
			checkMap.put(cur,checkMap.get(cur)+1);
		l++;
		if(l==shortLen){
			ans.add(start);
			char ch= b.charAt(start);
			checkMap.put(ch, checkMap.get(ch)-1);
			l--;
			start++;
			
		}
			
		
		
	}
	return ans;
}

private static void updateMap(char[] cs, HashMap<Character, Integer> charMap) {
	for(char ch : cs){
		if(charMap.containsKey(ch))
			charMap.put(ch,charMap.get(ch)+1);
		else
			charMap.put(ch,1);
	}
}
		
}

- idan.nik October 17, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Time complexity is O(N), and does not need extra space. It is possible that long string could repeat short string for several times (e.g.: 1abc2abc -> abc), so i use a int[] to store all start indexes.

var shortarr = shortstr.ToArray();
var longarr = longstr.ToArray();
List<int> output = new List<int>();
int si = 0, li = 0;
while (li < longarr.Length)
{
	while (li < longarr.Length && si < shortarr.Length && longarr[li] == shortarr[si])
	{
		li++;
		si++;
	}

	if (si == shortarr.Length)
	{
		output.Add(li - si);
		si = 0;
	}

	li++;
}

return output.ToArray();

- renbin October 20, 2015 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More