Facebook Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: Phone Interview
This is how to generate the first 26 prime numbers:
int[] GenerateNPrimeNumbers(int n){
int[] p = new int[n+1]{};
P[0] = 1;
p[1] = 3;
int i = 1; // # of found prime numbers
int num = 3;// start checking from the next odd number of 3
while( i< n){
num+= 2;// get the next odd number
boolean isPrime = true;
int j = 1; // skip 1
// check all prime numbers that are smaller than num squre
while(p[j] <= Math.sqr(num)){
if(num/p[j] == 0){
isPrime = false;
break;
}
j++;
}
if(isPrime){
p[i] = num;
i++;
}
}
return p;
}
I can only come up with a brute force method for this.
1) create a historgram of characters of needle string
2) Starting at the beginning of haystack pick sets of k chars (where k is the length of needle string). If the historgram of this set matches the one for needle then return true
Running time O(kn)
A solution in Java making two assumptions: 1) characters are [a-z], 2) no letter repetition. Extending from 1 is easy: just use larger freq and location arrays. Extending from 2 requires location to be an array of list since for each character there might be multiple possible location candidates. It then also requires modification to the algorithm to explore the different possible path, whereas with assumption 1 there is only a single path.
// assumption, [a-z]
private static boolean anaStrStr (String needle, String haystack){
if (haystack.length() < needle.length())
return false;
int [] freq = new int[26];
int [] location = new int[26];
for (int i = 0; i < haystack.length(); i++){
freq[(int)haystack.charAt(i)-'a'] ++;
location[(int)haystack.charAt(i)-'a'] = i;
}
int s = Integer.MAX_VALUE;
int e = 0;
for (int i = 0; i < needle.length(); i++){
if ( freq[(int)needle.charAt(i)-'a'] == 0)
return false;
freq[(int)needle.charAt(i)-'a']--;
int loc = location[(int)needle.charAt(i)-'a'];
if ( loc == (s-1) || location[(int)needle.charAt(i)-'a'] == (e+1) || s == Integer.MAX_VALUE || e == 0){
s = Math.min(s, location[(int)needle.charAt(i)-'a']);
e = Math.max(e, location[(int)needle.charAt(i)-'a']);
} else {
return false;
}
}
return true;
}
My idea to solve this:
1. either use a array of integer count[] to record the number of each character appears in string 1, or use a hashtable, character as key and count as value.
2. go through string2 from head to tail and lookup the current character in ur array or hashtable. each lookup takes O(1) in either case.
3. once you found a matching character, use it as starting point and check the characters following by.
4. if step 3 fail in middle, reset and start over the search from the next character in string2.
This algorithm should take only O(n) time and O(n) space.
I will try come up with code later
Here's an idea:
1. Define an array of 256 counters for the needle characters. Count the number of occurrences for each character in needle.
2. Define the same array as in (1) for the first needle.length() characters in haystack.
3. Compare between the arrays, if all the counters are equal return true. Otherwise:
4. Iterate from i=needle.length() to haystack.length() and in each iteration:
4.1. Increase the occurrence of haystack.charAt(i) by 1 in the array defined in (2).
4.2. Decrease the occurrence of haystack.charAt(i-needle.length()) by 1 in the array defined in (2).
4.3. Compare the two arrays just like in step (3), if they are equal return true, otherwise continue to the next iteration.
5. If we finished the loop without finding a match then we should return false because there is no needle anagram sub-string in haystack.
Basically, we just check whether for each consecutive needle.length() characters are an anagram of needle by comparing the number of appearances of every character.
Complexity: Assuming the number of characters is constant (256), the run-time complexity is O(n) where n=|haystack|. Space complexity is O(1) (fixed size arrays).
private static int[] buildFreqArray(String s, int len){
if ((s==null) || (len<0)){
throw new IllegalArgumentException("input string must not be null, len must be positive");
}
int[] freq = new int[NUM_OF_CHARS];
for (int i=0;i<freq.length;i++){freq[i]=0;}
for (int i=0;(i<s.length()) && (i<len);i++){freq[s.charAt(i)]++;}
return freq;
}
private static boolean areFreqEqual(int[] freq1, int[] freq2){
if ((freq1==null) || (freq2==null)){
throw new NullPointerException("freq1 or freq2 are null");
}
if (freq1.length!=freq2.length){return false;}
boolean b = true;
for (int i=0;(i<freq1.length) && (b);i++){
b = (freq1[i]==freq2[i]);
}
return b;
}
public static boolean anaStrStr(String needle, String haystack){
if ((needle==null) || (haystack==null)){
throw new NullPointerException("needle or hatstack are null");
}
if (needle.length()>haystack.length()){return false;}
int[] needleFreq = buildFreqArray(needle,needle.length());
int[] haystackFreq = buildFreqArray(haystack,needle.length());
if (areFreqEqual(needleFreq,haystackFreq)){return true;}
for (int i=needle.length();i<haystack.length();i++){
haystackFreq[haystack.charAt(i)]++;
haystackFreq[haystack.charAt(i-needle.length())]--;
if (areFreqEqual(needleFreq,haystackFreq)){return true;}
}
return false;
}
The below solution is an O(n) time and O(n) space algorithm, although I assumed that the 2 strings can contain any character (that is, the primitive type char in Java - 16-bit Unicode characters). We could conclude that it is O(1) space also, because char can only have 65535 different values, therefore the Map's size is limited (has an upper bound).
1. We first build a map that contains the needle's characters and their number of occurences.
2. At this point we assume that the difference between the needle and the haystack is the length of the needle, since we haven't read yet anything from the haystack.
3. We also create a map for the haystack (similar to that of the needle's map, but we will fill it up along the way, it is empty first).
4. Then we iterate over the haystack and at every character we update the map of the haystack by looking at the last n characters of the haystack (where n is the length of the needle). We can actually do this in constant steps because we just remove an occurence of the character at the current-n position from the haystack's map and add an occurence at the newly read position. Meanwhile, we keep track of the difference between the needle and the current haystack "window". If at any point this difference is 0, then we return true (the current part of the haystack is an anagram with the needle). Otherwise, if we reach the end of the haystack without finding an anagram of the needle, then we return false.
It runs in O(n) time because we iterate over the haystack only once and at every step we do constant operations.
public boolean anaStrStr(String needle, String haystack) {
if (haystack == null || haystack.length() == 0 || needle == null)
return false;
else if (needle.length() == 0)
return true;
Map<Character, Integer> needleMap = new HashMap<Character, Integer>();
for (int i = 0; i < needle.length(); i++) {
inc(needleMap, needle.charAt(i));
}
int diffChars = needleMap.keySet().size();
char[] h = haystack.toCharArray();
Map<Character, Integer> haystackMap = new HashMap<Character, Integer>();
for (int i = 0; i < haystack.length(); i++) {
if (i >= needle.length()) {
diffChars += diff(haystackMap, needleMap, h[i - needle.length()], false);
}
diffChars += diff(haystackMap, needleMap, h[i], true);
if (diffChars == 0)
return true;
}
return false;
}
private int diff(Map<Character, Integer> map1, Map<Character, Integer> map2, char key, boolean inc) {
int value = getValue(map1, key);
int value2 = getValue(map2, key);
if (inc)
inc(map1, key);
else
dec(map1, key);
int valueMod = getValue(map1, key);
if (value != value2 && valueMod == value2)
return -1;
if (value == value2 && valueMod != value2)
return 1;
return 0;
}
private int getValue(Map<Character, Integer> map, char key) {
return map.get(key) == null ? 0 : map.get(key);
}
private void inc(Map<Character, Integer> map, char key) {
if (!map.containsKey(key))
map.put(key, 1);
else
map.put(key, map.get(key) + 1);
}
private void dec(Map<Character, Integer> map, char key) {
if (!map.containsKey(key))
return;
else if (map.get(key) == 1)
map.remove(key);
else
map.put(key, map.get(key) - 1);
}
my implementation based on this algorithm
static boolean anaStrStr (String needle, String haystack){
Map<Character, Integer> needleMap = new HashMap<Character, Integer>();
for(int i=0; i<needle.length(); i++){
char c = needle.charAt(i);
if(needleMap.get(c)!=null){
needleMap.put(c, needleMap.get(c)+1);
}
else{
needleMap.put(c, 1);
}
}
int diff = needle.length();
Map<Character, Integer> haystackMap = new HashMap<Character, Integer>();
for(int i=0; i<needle.length(); i++){
char c = haystack.charAt(i);
if(haystackMap.get(c)!=null){
haystackMap.put(c, haystackMap.get(c)+1);
}
else{
haystackMap.put(c, 1);
}
if(needleMap.get(c)!=null && needleMap.get(c)>=haystackMap.get(c)){
diff--;
}
}
for(int i=needle.length(); i<haystack.length(); i++){
if(diff==0){
return true;
}
char c = haystack.charAt(i-needle.length());
haystackMap.put(c, haystackMap.get(c)-1);
if(needleMap.get(c)!=null && needleMap.get(c)>haystackMap.get(c)){
diff++;
}
c = haystack.charAt(i);
if(haystackMap.get(c)!=null){
haystackMap.put(c, haystackMap.get(c)+1);
}
else{
haystackMap.put(c, 1);
}
if(needleMap.get(c)!=null && needleMap.get(c)>=haystackMap.get(c)){
diff--;
}
}
return false;
}
An O(nk) solution where n=length of haystack and k= length of needle.
bool is_present(string str, string find_me)
{
//map to store count of each chracter
map<char, int>my_map;
for(int i=0; i<find_me.size(); ++i) {
if(my_map.find(find_me[i]) != my_map.end())
my_map[find_me[i]] = 1;
else
my_map[find_me[i]]++;
}
//creating a copy of map to use in each iteration
map<char, int>map_copy;
map_copy.insert(my_map.begin(), my_map.end());
map<char, int>::iterator it;
/*Each time we find the chracter in map check whether we
have got a match of length of string to find. */
int len = find_me.length();
//local variable to keep track of number of chars matched already
int count_len = 0;
for(int i=0; i<str.size(); ++i) {
it = map_copy.find(str[i]);
if(it==map_copy.end() || it->second == 0){
map_copy.clear();
map_copy.insert(my_map.begin(), my_map.end());
count_len = 0;
}
else {
count_len++;
if(count_len == len)
return 1;
it->second--;
}
}
return 0;
}
How about following solution. If we check every sub string of haystack to be anagram of needle?
var haystack = "somelongtexthavinganagrams";
var anagram = "xett";
function anaStrStr(needle, haystack) {
if (haystack.length >= needle.length) {
var sortedNeedle = needle.split('').sort().join('');
var anograms = {};
for (var index = 0; index < haystack.length - needle.length; index += 1) {
var anogram = haystack.substr(index, needle.length);
if (anogram.split('').sort().join('') == sortedNeedle) {
return true;
}
}
} else {
return false;
}
}
var exists = anaStrStr(anagram, haystack);
1. Find substring of the smallest string length in the largest string.
2. Sort the substring and the smallest string and return true if they are equal.
public static boolean isAnagrams(String s1, String s2) {
boolean result = false;
char[] c;String s;
if (s1.length() > s2.length()) {
s = s1;
c = s2.toCharArray();
} else {
s = s2;
c = s1.toCharArray();
}
for (int i=0; i<=(s.length()-c.length);i++) {
String sub = s.substring(i, i+c.length);
result = sort(sub).equals(sort(s1));
if (result == true)
break;
}
return result;
}
public static String sort(String s) {
char c1[] = s.toCharArray();
java.util.Arrays.sort(c1);
return new String(c1);
}
Here is another O(N) solution with some assumptions:
- The needle string is relatively small compared to haystack string. By relatively small, I mean the difference might go into a few orders of magnitude.
- The needle string is small in size so that operations like toLowerCase and toCharArray are O(1). Also, Arrays.sort operation on the needle would be O(1) then.
The algorithm is,
- Store the needle string sorted and lowercase in a hash table. This is O(1) by assumptions above.
- Iterate over haystack string, picking a substring equal to needle's length at every iterated index. Sort this substring, convert it to lowercase and search in hashtable. Iterating and getting substrings will be O(N) iterations. Each iteration sorts a substring, makes it lowercase which are O(1) by assumption and lookups in hashtable, O(1). So, this is overall O(N).
public class Strings {
public static boolean anaStrStr(String needle, String haystack) {
boolean found = false;
char[] needleChars = needle.toLowerCase().toCharArray();
Arrays.sort(needleChars);
String needleL = new String(needleChars);
String haystackL = haystack.toLowerCase();
Set<String> needlehash = new HashSet<String>();
needlehash.add(needleL);
for (int i = 0; i < haystack.length() - needle.length(); i++) {
char[] current = haystackL.substring(i, i + needle.length()).toCharArray();
Arrays.sort(current);
found = needlehash.contains(new String(current));
if (found)
break;
}
return found;
}
public static void main(String[] args) {
String s1 = "cat";
String s2 = "car";
String s3 = "actor";
String s4 = "full";
String s5 = "bullford";
System.out.println(anaStrStr(s1, s3));
System.out.println(anaStrStr(s2, s3));
System.out.println(anaStrStr(s4, s5));
}
}
Another solution in javaScript
1. Build hash out of needle characters keeping total count of them
2. Count unique characters number
3. Function works only when needle length is greater than haystack length
4. Slide range equal to needle length along haystack and check for characters inside. If character goes out of range we decrement its count, if character comes into range we increment its count. If number of characters matches their number in needle we increment total number of matches, if number of characters does not match anymore we decrement number of matches. When number of matches equals number of unique characters in needle we exit with success.
function anaStrStr(needle, haystack) {
if (haystack.length >= needle.length) {
var needleLen = needle.length;
var needleArray = needle.split('');
// Create hash for needle characters with value equal to their total count
var needleHash = {};
var dueMatches = 0;
for (var index = 0; index < needleLen; index += 1) {
var needleChar = needleArray[index];
if (needleHash.hasOwnProperty(needleChar)) {
needleHash[needleChar] -= 1;
} else {
needleHash[needleChar] = -1;
dueMatches += 1;
}
}
// Initialize hash
var matches = 0;
for (var index = 0; index < needleLen; index += 1) {
var haystackChar = haystack.substr(index, 1);
if (needleHash.hasOwnProperty(haystackChar)) {
if (needleHash[haystackChar] == 0) {
matches -= 1
}
needleHash[haystackChar] += 1;
if (needleHash[haystackChar] == 0) {
matches += 1
}
}
}
if (matches == dueMatches) {
return true;
} else {
for (var index = needleLen; index < haystack.length; index += 1) {
/* remove first character from account */
var prevHaystackChar = haystack.substr(index - needleLen, 1);
if (needleHash.hasOwnProperty(prevHaystackChar)) {
if (needleHash[prevHaystackChar] == 0) {
matches -= 1
}
needleHash[prevHaystackChar] -= 1;
if (needleHash[prevHaystackChar] == 0) {
matches += 1
}
}
/* add new character into account */
var haystackChar = haystack.substr(index, 1);
if (needleHash.hasOwnProperty(haystackChar)) {
if (needleHash[haystackChar] == 0) {
matches -= 1
}
needleHash[haystackChar] += 1;
if (needleHash[haystackChar] == 0) {
matches += 1
}
}
if (matches == dueMatches) {
return true;
}
}
}
}
return false;
}
var haystack = "somelongteexthavinganagrams";
var anagram = "xett";
var exists = anaStrStr(anagram, haystack);
exists;
bool anaStrStr (string needle, string haystack)
{
char *p1,p2,p1adv,p1start;
int bitmap[256],bitmap2[256];
p1 = haystack;
p1adv = haystack;
p2 = needle;
while(*p2) {
p1adv++;
p2++;
}
p2 = needle;
for(i=0;i<256;i++) {
bitmap[i] = 0;
}
while (*p2) {
bitmap[*p2]++;
}
while(*p1adv) {
p2 = needle;
p1start = p1;
memcpy(bitmap2,bitmap,256);
while (*p2 && *p1) {
if (!bitmap2[*p1]) {
break;
}
bitmap[*p1]--;
}
if (!*p2)
return 1;
p1 = p1start+1;
p1adv++;
}
return 0;
}
Time Complexity : O(nm)
I would go with a prime number algorithm. No need to check order of letters, only matters the result of the matching prodcut of corresponding prime numbers: Each lettre in the needle has a unique prime number equivalent and since primes numbers product is unique ... equals product means anagram
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace Trees
{
class Program
{
private static int[] primes100 = new int[]
{
3, 7, 11, 17, 23, 29, 37,
47, 59, 71, 89, 107, 131,
163, 197, 239, 293, 353,
431, 521, 631, 761, 919,
1103, 1327, 1597, 1931,
2333, 2801, 3371, 4049,
4861, 5839, 7013, 8419,
10103, 12143, 14591, 17519,
21023, 25229, 30293, 36353,
43627, 52361, 62851, 75431,
90523, 108631, 130363,
156437, 187751, 225307,
270371, 324449, 389357,
467237, 560689, 672827,
807403, 968897, 1162687,
1395263, 1674319, 2009191,
2411033, 2893249, 3471899,
4166287, 4999559, 5999471,
7199369
};
private static int[] getNPrimes(int _n)
{
int[] _primes;
if (_n <= 100)
_primes = primes100.Take(_n).ToArray();
else
{
_primes = new int[_n];
int number = 0;
int i = 2;
while (number < _n)
{
var isPrime = true;
for (int j = 2; j <= Math.Sqrt(i); j++)
{
if (i % j == 0 && i != 2)
isPrime = false;
}
if (isPrime)
{
_primes[number] = i;
number++;
}
i++;
}
}
return _primes;
}
private static bool anaStrStr(string needle, string haystack)
{
bool _output = false;
var needleDistinct = needle.ToCharArray().Distinct();
int[] arrayOfPrimes = getNPrimes(needleDistinct.Count());
Dictionary<char, int> primeByChar = new Dictionary<char, int>();
int i = 0;
int needlePrimeSignature = 1;
foreach (var c in needleDistinct)
{
if (!primeByChar.ContainsKey(c))
{
primeByChar.Add(c, arrayOfPrimes[i]);
i++;
}
}
foreach (var c in needle)
{
needlePrimeSignature *= primeByChar[c];
}
for (int j = 0; j <= (haystack.Length - needle.Length); j++)
{
var result = 1;
for (int k = j; k < needle.Length + j; k++)
{
var letter = haystack[k];
result *= primeByChar.ContainsKey(letter) ? primeByChar[haystack[k]] : 0;
}
_output = (result == needlePrimeSignature);
if (_output)
break;
}
return _output;
}
static void Main(string[] args)
{
Console.WriteLine("Enter needle");
var _needle = Console.ReadLine(); ;
Console.WriteLine("Enter haystack");
var _haystack = Console.ReadLine();
Console.WriteLine(anaStrStr(_needle, _haystack));
Console.ReadLine();
}
}
}
One missing point is that the anagram has to be a word. So we should have a dictionary handy. Looking up a word of length "n" in the dictionary takes O(n).
I assumed all words are in dictionary.
Python:
# A dictionary with max entropy containing all
# combinations of characters
def isInDict(word):
return True
def anaStrStr(word, text):
nw = len(word)
sword = ''.join(sorted(word))
for n in range(0, len(text) - nw):
if sword == ''.join(sorted(text[n: n + nw])):
return isInDict(text[n : n + nw])
Example:
# Test
text = "This is a complicated sentence,"
print anaStrStr("act", text)
print anaStrStr("tac", text)
print anaStrStr("xoxo", text)
print anaStrStr("tense", text) #
Result:
True
True
None
True
This solution seems clearer than the other ones. It has O(n) time complexity and O(k) extra space, where k is the size of the needle.
The idea is pretty simple you keep track of a count and a map of occurrences of the needle. In the event that you encounter a value that isn't in the needle, reset both of these to their original. When you reach count of zero, return true.
from collections import defaultdict
def anagram_in_str(needle, haystack):
needle_count_orig = defaultdict(int)
for ch in needle:
needle_count_orig[ch] += 1
needle_count, cnt = needle_count_orig.copy(), len(needle)
for i, ch in enumerate(haystack):
if ch in needle_count:
needle_count[ch] -= 1
cnt -= 1
if needle_count[ch] < 0:
needle_count, cnt = needle_count_orig.copy(), len(needle)
elif cnt == 0:
return True
else:
needle_count, cnt = needle_count_orig.copy(), len(needle)
return False
This solution seems clearer than the other ones. It has O(n) time complexity and O(k) extra space, where k is the size of the needle.
The idea is pretty simple you keep track of a count and a map of occurrences of the needle. In the event that you encounter a value that isn't in the needle, reset both of these to their original. When you reach count of zero, return true.
from collections import defaultdict
def anagram_in_str(needle, haystack):
needle_count_orig = defaultdict(int)
for ch in needle:
needle_count_orig[ch] += 1
needle_count, cnt = needle_count_orig.copy(), len(needle)
for i, ch in enumerate(haystack):
if ch in needle_count:
needle_count[ch] -= 1
cnt -= 1
if needle_count[ch] < 0:
needle_count, cnt = needle_count_orig.copy(), len(needle)
elif cnt == 0:
return True
else:
needle_count, cnt = needle_count_orig.copy(), len(needle)
return False
This solution seems clearer than the other ones. It has O(n) time complexity and O(k) extra space, where k is the size of the needle.
The idea is pretty simple you keep track of a count and a map of occurrences of the needle. In the event that you encounter a value that isn't in the needle, reset both of these to their original. When you reach count of zero, return true.
from collections import defaultdict
def anagram_in_str(needle, haystack):
needle_count_orig = defaultdict(int)
for ch in needle:
needle_count_orig[ch] += 1
needle_count, cnt = needle_count_orig.copy(), len(needle)
for i, ch in enumerate(haystack):
if ch in needle_count:
needle_count[ch] -= 1
cnt -= 1
if needle_count[ch] < 0:
needle_count, cnt = needle_count_orig.copy(), len(needle)
elif cnt == 0:
return True
else:
needle_count, cnt = needle_count_orig.copy(), len(needle)
return False
This solution seems clearer than the other ones. It has O(n) time complexity and O(k) extra space, where k is the size of the needle.
The idea is pretty simple you keep track of a count and a map of occurrences of the needle. In the event that you encounter a value that isn't in the needle, reset both of these to their original. When you reach count of zero, return true.
from collections import defaultdict
def anagram_in_str(needle, haystack):
needle_count_orig = defaultdict(int)
for ch in needle:
needle_count_orig[ch] += 1
needle_count, cnt = needle_count_orig.copy(), len(needle)
for i, ch in enumerate(haystack):
if ch in needle_count:
needle_count[ch] -= 1
cnt -= 1
if needle_count[ch] < 0:
needle_count, cnt = needle_count_orig.copy(), len(needle)
elif cnt == 0:
return True
else:
needle_count, cnt = needle_count_orig.copy(), len(needle)
return False
This solution seems clearer than the other ones. It has O(n) time complexity and O(k) extra space, where k is the size of the needle.
The idea is pretty simple you keep track of a count and a map of occurrences of the needle. In the event that you encounter a value that isn't in the needle, reset both of these to their original. When you reach count of zero, return true.
from collections import defaultdict
def anagram_in_str(needle, haystack):
needle_count_orig = defaultdict(int)
for ch in needle:
needle_count_orig[ch] += 1
needle_count, cnt = needle_count_orig.copy(), len(needle)
for i, ch in enumerate(haystack):
if ch in needle_count:
needle_count[ch] -= 1
cnt -= 1
if needle_count[ch] < 0:
needle_count, cnt = needle_count_orig.copy(), len(needle)
elif cnt == 0:
return True
else:
needle_count, cnt = needle_count_orig.copy(), len(needle)
return False
This solution seems clearer than the other ones. It has O(n) time complexity and O(k) extra space, where k is the size of the needle.
The idea is pretty simple you keep track of a count and a map of occurrences of the needle. In the event that you encounter a value that isn't in the needle, reset both of these to their original. When you reach count of zero, return true.
from collections import defaultdict
def anagram_in_str(needle, haystack):
needle_count_orig = defaultdict(int)
for ch in needle:
needle_count_orig[ch] += 1
needle_count, cnt = needle_count_orig.copy(), len(needle)
for i, ch in enumerate(haystack):
if ch in needle_count:
needle_count[ch] -= 1
cnt -= 1
if needle_count[ch] < 0:
needle_count, cnt = needle_count_orig.copy(), len(needle)
elif cnt == 0:
return True
else:
needle_count, cnt = needle_count_orig.copy(), len(needle)
return False
This solution seems clearer than the other ones. It has O(n) time complexity and O(k) extra space, where k is the size of the needle.
The idea is pretty simple you keep track of a count and a map of occurrences of the needle. In the event that you encounter a value that isn't in the needle, reset both of these to their original. When you reach count of zero, return true.
from collections import defaultdict
def anagram_in_str(needle, haystack):
needle_count_orig = defaultdict(int)
for ch in needle:
needle_count_orig[ch] += 1
needle_count, cnt = needle_count_orig.copy(), len(needle)
for i, ch in enumerate(haystack):
if ch in needle_count:
needle_count[ch] -= 1
cnt -= 1
if needle_count[ch] < 0:
needle_count, cnt = needle_count_orig.copy(), len(needle)
elif cnt == 0:
return True
else:
needle_count, cnt = needle_count_orig.copy(), len(needle)
return False
This solution seems clearer than the other ones. It has O(n) time complexity and O(k) extra space, where k is the size of the needle.
The idea is pretty simple you keep track of a count and a map of occurrences of the needle. In the event that you encounter a value that isn't in the needle, reset both of these to their original. When you reach count of zero, return true.
from collections import defaultdict
def anagram_in_str(needle, haystack):
needle_count_orig = defaultdict(int)
for ch in needle:
needle_count_orig[ch] += 1
needle_count, cnt = needle_count_orig.copy(), len(needle)
for i, ch in enumerate(haystack):
if ch in needle_count:
needle_count[ch] -= 1
cnt -= 1
if needle_count[ch] < 0:
needle_count, cnt = needle_count_orig.copy(), len(needle)
elif cnt == 0:
return True
else:
needle_count, cnt = needle_count_orig.copy(), len(needle)
return False
This solution seems clearer than the other ones. It has O(n) time complexity and O(k) extra space, where k is the size of the needle.
The idea is pretty simple you keep track of a count and a map of occurrences of the needle. In the event that you encounter a value that isn't in the needle, reset both of these to their original. When you reach count of zero, return true.
from collections import defaultdict
def anagram_in_str(needle, haystack):
needle_count_orig = defaultdict(int)
for ch in needle:
needle_count_orig[ch] += 1
needle_count, cnt = needle_count_orig.copy(), len(needle)
for i, ch in enumerate(haystack):
if ch in needle_count:
needle_count[ch] -= 1
cnt -= 1
if needle_count[ch] < 0:
needle_count, cnt = needle_count_orig.copy(), len(needle)
elif cnt == 0:
return True
else:
needle_count, cnt = needle_count_orig.copy(), len(needle)
return False
Actually the diff map can be just what we need from the haystack to have an anagram. The initial diff map contains all the letters in needle's map but the values are opposite, say needle map is a:2, b:1, c:1, diff map is a: -2, b: -1, c: -1.
Then for each char in stackhey, if it's in needle map, then we increase the value of the same key in diff map; otherwise, we add it to the diff map or increase its value by 1 in the diff map. Diff map means, if a value is negative, it means that "you own me these chars to be a anagram"; if a value is positive, it means that "you need to get rid of these chars to be a anagram". So after moving one step in heystack, check if diffMap is empty, if so, we found an anagram.
List<String> findAnagrams(String needle, String haystack){
List<String> anagrams = new ArrayList();
Map<char, int> map = new HashMap<char, int>();
Map<char, int> diffMap = new HashMap<char, int>();
int m = needle.length();
int n = haystack.length();
//Initialize needle map
for(int i = 0; i< m; i++){
char cur = needle.charAt[i];
if(!map.containsKey(cur){
map.put(cur, 1);
}else{
int num = map.get(cur);
map.set(cur, num+1);
}
}
// Initialize diff map
for(char c: map.getKeys()){
int num = map.get(c);
diffMap.add(c, num*(-1));
}
// Create a window to scan haystack
// using the last m chars
for(int i = n -1; i>= n - 1 - m; i--)
UpdateDiffMapAddChar(diffMap, haystack.charAt[i], map);
// scan haystack, add and remove one char at teach step;
while(i >= 0){
UpdateDiffMapAddChar(diffMap, haystack.charAt[i], map);
UpdateDiffMapRemoveChar(diffMap, heystack.charAt[i+ m];
// populate an anagram if diffMap is empty
if(diffMap.getKeys().size() == 0)
PopulateAnagram(haystack, i, needle.length(), anagrams);
}
return Anagrams;
}
PopulateAnagram(String s, int start, int count, ArrayList l){
l.add(s.subString(i, i+count);
}
UpdateDiffMapAddChar(HashMap<char, int> diffMap, char cur, HashMap<char, int> map){
if(map.containsKey(cur)){
int diffNum = diffMap.get(cur);
diffNum++;
if(diffNum == 0)
diffMap.remove(cur);
}else{
if(diffMap.contains(cur)){
int num = diffMap.get(cur);
diffMap.set(cur, num+1);
}else{
diffMap.set(cur, 1);
}
}
}
UpdateDiffMapRemoveChar(HashMap<char, int> diffMap, char cur){
if(diffMap.contains(cur)){
int num = diffMap.get(cur);
if(num-1 == 0)
diffMap.remove(cur);
}
}
My idea is to create a sign/key for the first string and then move throug the second string in a window of the lenght of the first string, in every step remove the last char (tail of window) and add a new one (head of window) and update the current sign/key.
public List<string> FindPalindromes(string s1, string s2)
{
var result = new List<string>();
if (s1.Length > s2.Length)
return result;
// Initialize the base palindrome array, this hold the base palindrome signature.
int[] a1 = new int[26];
foreach (var c in s1)
{
int index = c - 'a';
a1[index]++;
}
// Create the palindrome signature for the first s1.Length chars
int[] a2 = new int[26];
for (int i=0; i < s1.Length; i++)
{
int index = s2[i] - 'a';
a2[index]++;
}
// See how many chars conform the base palindrome, and calculate how many match with the starting palindrome
int total = 0;
int count = 0;
for (int i=0; i < a1.Length; i++)
{
if (a1[i] > 0)
total++;
if (a1[i] > 0 && a1[i] == a2[i])
count++;
}
Console.WriteLine("Total = " + total + ", Count = " + count);
// Start moving for the current palindrome a2 in S2, in each step check if we match the base palindrome,
// update a2 remove the last char from s2 and add de new one.
int first = s1.Length - 1;
int last = 0;
while (first < s2.Length)
{
// check if we match the base palindrome
if (count == total)
result.Add(s2.Substring(last, s1.Length));
// Remove the las char and update count
int index = s2[last] - 'a';
if (a1[index] > 0 && a1[index] == a2[index])
count--;
a2[index]--;
if (a1[index] > 0 && a1[index] == a2[index])
count++;
last++;
//index = s2[last] - 'a';
//a2[index]++;
first++;
if (first >= s2.Length)
break;
// Add First char and update count
index = s2[first] - 'a';
if (a1[index] > 0 && a1[index] == a2[index])
count--;
a2[index]++;
if (a1[index] > 0 && a1[index] == a2[index])
count++;
}
return result;
}
Any anagram? I assume you mean an anagram needle (the entire string), otherwise your example makes no sense.
Anyway, this is pretty straightforward. Reverse needle and then perform the KMP string matching algorithm. O(n) time and O(1) space.
I don't think KMP will work. Any valid permutation of the word in string needle if present in string haystack will be considered as a match. Hope it clarifies the question.
Use the first 26 prime numbers to calculate a hash value for each string window, anagrams of the needle will have the same hash value. E.g., a - 3, b - 5, c - 7, abc has hash value 3*5*7. because the products of prime numbers equal only when have the same prime numbers.
This solution is easier than the one I posted earlier
- Avaln May 28, 2014