Google Interview Question
Software EngineersCountry: Switzerland
Interview Type: Written Test
If the strings are character arrays one can sort both - O(n log n) - and compare the result:
private static boolean check(String a, String b) {
char[] aa = a.toCharArray();
char[] bb = b.toCharArray();
Arrays.sort(aa);
Arrays.sort(bb);
return Arrays.equals(aa, bb);
}
If strings are not too big (no risk of overflow).
public boolean isAnagrams(String str1, String str2){
if(str1 == null || str2 == null || str1.length() != str2.length()){
return false;
}
return getHash(str1) == getHash(str2);
}
public int getHash(String str){
int res = 0;
for(int i = 0; i < str.length(); i++){
res += str.charAt(i);
}
return res;
}
If entry is ASCII. I assume that there are no spaces and punctuation marks. Time complexity O(n), where n is string length. Space complexity O(1)
public boolean is Anagrams(String str1, String str2){
if(str1 == null || str2 == null || str1.length() != str2.length()){
return false;
}
int[] countsStr1 = new int[256];
int[] countsStr2 = new int[256];
for(int i =0; i < str1.length(); i++){
countsStr1[str1.charAt(i)] += 1;
countsStr2[str2.charAt(i)] += 1;
}
for(int i =0; i < countsStr1.length; i++){
if(countsStr1[i] != countsStr2[i]){
return false;
}
}
return true;
}
If entry is Unicode.
public boolean is Anagrams(String str1, String str2){
if(str1 == null || str2 == null || str1.length() != str2.length()){
return false;
}
Map<Character, Integer> countsStr1 = new HashMap<Character, Integer>();
Map<Character, Integer> countsStr2 = new HashMap<Character, Integer>();
for(int i =0; i < str1.length(); i++){
add(str1.charAt(i), countsStr1);
add(str2.charAt(i), countsStr2);
}
if(countsStr1.size() != countsStr2.size()){
return false;
}
if(!countsStr1.keySet().containsAll(countsStr2.keySet())){
return false;
}
if(!countsStr2.keySet().containsAll(countsStr1.keySet())){
return false;
}
for(Map.Entry<Character, Integer> entry1 : countsStr1.entrySet()){
Integer value2 = countsStr2.get(entry1.getKey());
if(entry1.getValue().intValue() != value2.intValue() ){
return false;
}
}
return true;
}
private void add(char value, Map<Character, Integer> counts){
Character ch = Character.valueOf(value);
Integer cnt = counts.get(ch);
if(cnt != null){
counts.put(ch, cnt + 1);
} else {
counts.put(ch, 1);
}
}
public class AnagramService
{
public boolean isAnagram(String a,String b)
{
if(a==null||b==null||a.trim().equals("")||b.trim().equals("")||a.length()!=b.length())
{
return false;
}
HashMap<Character,Integer> mp=new HashMap<Character,Integer>();
for(Character c: a)
{
if(!mp.contains(c))
{
mp.put(c,1);
}else{
int count=mp.get(c);
mp.put(c,count++);
}
}
for(Character c:b)
{
if(!mp.contains(c))
{
return false;
}
else
{
int count=mp.get(c);
count--;
if(count<0)
{
return false;
}
else if(count==0)
{
mp.remove(c);
}else
{
mp.put(c,count);
}
}
}
return(mp.isEmpty());
}
}
There are a couple approaches to this that generally depend on the expected length of the Strings being checked:
If the Strings are short, then sorting the actual content of the strings and comparing should be relatively easy and quick. However, this will cost O(m log m) time and O(m) memory where m is the length of the Strings.
Other methods include creating an anagrammatic signature of the String for comparison. Assuming you're using a character set like ASCII, then this approach would run with complexity O(m + s) costing memory O(s) where s is the size of the character set. This can be done by simply creating an array of ints the size of the character set and counting the frequency of the characters. To compare the two strings, simply compare the character frequency
Using ASCII, the approximate point at which I would definately go with the anagrammatic signature would be where the Strings are of about size 32 or greater. Now, this would change dramatically if I were doing something like comparing Genetic sequences. Those would have a character set of 4 (maybe 5 if counting RNA) and the Strings would be typically long.
Sample solution in C++
#include <string>
#include <unordered_map>
#include <iostream>
using namespace std;
bool isAnagram(const string& w1, const string& w2) {
if (w1.length() != w2.length()) return false;
unordered_map<char, int> m;
for (const auto ch: w1) {
++m[ch];
}
for (const auto ch: w2) {
--m[ch];
if (m[ch] == 0) m.erase(ch);
}
return m.empty();
}
int main() {
string s1 = "anagram";
string s2 = "ranagam";
string s3 = "banana";
cout << isAnagram(s1, s2) << endl;
cout << isAnagram(s1, s3) << endl;
return 0;
}
boolean isAnagram(String s1, String s2) {
if (null == s1 || null == s2 || s1.length() != s2.length()) {
return false;
}
int x = 0,sum1=0,sum2=0;
for (int i = 0; i < s1.length(); i++) {
x = x ^ s1.charAt(i) ^ s2.charAt(i);
sum1+= s1.charAt(i)
sum2+= s2.charAt(i)
}
// if anagram then XOR operation result will be 0
if (0 == x) {
return true;
}
return false;
}
You don't need to store data in a hash map or an array, and still be able to get O(n) time:
1) Loop over all char values mapped to int. So this goes from 0 to 255 for ASCII.
2) For each char value, count it's occurrences in the first string, and then in the second string. If they differ, return false (i.e. they're not anagrams)
3) If by the end the loop hasn't broken out, it means they're anagrams so return true.
Yes it may require a lot of internal loops but since the outer loop is hardcoded (from 0 to 255), the runtime is still technically O(n) but without requiring any space.
Ionut -- very nice answer
boolean isAnagram(String s1, String s2) {
if (null == s1 || null == s2 || s1.length() != s2.length()) {
return false;
}
int x = 0;
for (int i = 0; i < s1.length(); i++) {
x = x ^ s1.charAt(i) ^ s2.charAt(i);
}
// if anagram then XOR operation result will be 0
if (0 == x) {
return true;
}
return false;
}
boolean isAnagram(String s1, String s2) {
if (null == s1 || null == s2 || s1.length() != s2.length()) {
return false;
}
int x = 0;
for (int i = 0; i < s1.length(); i++) {
x = x ^ s1.charAt(i) ^ s2.charAt(i);
}
// if anagram then XOR operation result will be 0
if (0 == x) {
return true;
}
return false;
}
Count the characters in S1 and check if there are the same in S2 use a Hash Table to store the count, in theory HashTable operations are O(1) so the order is O(n). Another approach is to use a char array of 255 elements to store the count, is the same order O(N) but requires more memory O(M)
- hnatsu July 14, 2015