Google Interview Question
Quality Assurance EngineersCountry: India
Interview Type: In-Person
@Up you're right, it's not clear. if it's any number of characters then you have to calculate all the 1 to 9 letter word permutations and add them.
If P(n, k) is the number of k-permutations of a set of n elements, then you need to calculate P(9,9) + P(9,8)+...+P(9,1). Of course, you need to also take the duplicates into consideration, which makes the math more complicated.
But then again it's not clear, for instance, if single letter words are valid, etc.
@oZZz and @Sunny
I didn't say its 2^n - 1 unique words only, i said that there are 2^n-1 subsets. Check my response below to @Epic_coder.
Thus the number of subsets is N = 2^n - 1.
Then the words count = N! + (N-1)! + (N-2)! + ... + 1
Then less the duplicates FF, EE, II, F, I, E
Total Number of Unique words = words count - 6
@oOZz
I think your solution P(9,9) + P(9,8)+.. P(9,1) is close to right . The only thing missing is that there will be duplicates in each of the permutations. Any thoughts on how to avoid it?
@slbb You're still on the wrong page. This problem has nothing to do with the subsets, because sets can only have unique characters and the order of elements in a set is not important, but in this problem it matters. I think you're getting confused, because the subsets are contained in the permutations as well. Secondly the duplicates are much more than 5. In permutations we're trying to avoid counting the words twice, if the duplicate letters are swapped.
@Up AFAIK, you need to solve it case by case. For example,
P(9,1) = 6, this one is obvious, because you only have 6 distinct letters.
P(9,2) = two letter words
Case 1:
All Es only 1 way: EE
Case 2:
All Is only 1 way: II
Case 3:
All Fs is only 1 way: FF
Case 4:
All different two letter words is P(6,2) = 6*5 = 30, because 6 distinct letters
Then P(9, 2) = 1+1+1+30=33
P(9,3) = three letter words
Case 1:
two Es and one of C,F,I,N,T
5*(3!/2!)= 15
Case 2:
Two Is and one of C,E,F,N,T
5*(3!/2!)= 15
Case 3:
Two Fs and one of C,E,I,N,T
5*(3!/2!)= 15
Case 4:
All different three letter words is P(6,3) = 6*5*4 = 120, because 6 distinct letters
Then P(9, 3) = 15+15+15+120=165
...
It would make an interesting question if you were asked to find the number of unique dictionary words, otherwise it's just a simple permutations problem.
@Epic_coder
I think we should consider word of different size's.
2^n different subsets can be formed from a word of size n ( considering there are no duplicate letters) , then we need to find the permutation of each subset.
This algo will run forever. :)
You are right we have to consider words whose length is less than 9. Thus this is a subset problem but the different subsets should be 2^n - 1 since an empty set is not considered a valid word.
Thus the number of subsets is N = 2^ - 1.
Then the words count = N! + (N-1)! + (N-2)! + ... + 1
Then less the duplicates FF, EE, II, F, I, E
unique words count = words count - 6
2^n - 1 is an underestimate because let's say the string is AB. The words are A, B, AB and BA. And when the string is ABC, then the subset containing {A, B, C} alone can form 6 different words already: ABC, ACB, BAC, BCA, CAB, CBA.
#include <stdio.h>
#include <malloc.h>
#include <string.h>
void permutation(char *string,int level,int k)
{
char temp;
k = level;
if(level == strlen(string))
{
printf("%s ",string);
return;
}
while(1)
{
permutation(string,level+1,k);
if(k < strlen(string))
{
temp = *(string+(level - 1));
*(string+(level - 1)) = *(string + k);
*(string + k) = temp;
k++;
}
else
{
break;
}
}
return;
}
int main()
{
char *string = (char*)malloc(sizeof(50));
strcpy(string,"EFFICIENT");
int level = 1;
int k = 1;
permutation(string,level,k);
return 0;
}
HashSet<String> permutaion(String s){
HashSet<String> set=new HashSet<String>();
permutation(" ", s,set);
}
void permutation(String prefix,String s,HashSet<String> set){
if(s.length()==0){
set.add(prefix);
System.out.println( prefix);
}
for(int i=0;i<s.length();i++){
permutation(prefix+s.charAt(i),s.substring(0, i)+s.substring(i+1, s.length()),set);
}
}
Unique words will be = n!/ (number of Repeated Characters);
Solving the problem for the case of unique words of any length:
Also see: gist [dot] github [dot] com/kastur/6477508
/*
How many unique words (not necessarily a dictionary word) can be generated from
a word like: "EFFICIENT". Words can be of length 1...length(word)
Preprocess input to be a list of the form:
L = [(r, number of letters repeated r times)]
i.e. for the word EFFICIENT we get L = [(1, 3), (2, 3)]
since {C,N,T} each repeated once contributing to element (1,3)
and {E,F,I} each repeated twice contributing to element (2,3)
Let #[L] for the example would be:
#[(1, 3), (2, 3)] =
// number of unique words of length 9
[9! / ((2!)^3 * (1!)^3)] +
// number of unique words of length 8 obtained by removing 1 of 3 letters
// that are repeated twice.
+ 3 * #[(1, 4), (2, 2)]
// number of unique words of length 8 obtained by removing 1 of 3 letters
// that are repeated twice.
+ 3 * #[(1, 2), (2, 3)]
Base cases
#[(1, X)] = X! // unique words of length X when there are no repeated letters.
#[(0, X)] = #[(1, 0)] = 0
*/
#include <algorithm>
#include <array>
#include <cctype>
#include <cstdlib>
#include <ext/hash_map>
#include <ext/hash_set>
#include <fstream>
#include <iostream>
#include <list>
#include <limits>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <unistd.h>
#include <vector>
#include "utils-inl.h"
using namespace std;
using namespace __gnu_cxx;
class UniqueWords {
private:
typedef long long T;
map<int, int> map_;
int len_;
public:
UniqueWords(string word) {
sort(word.begin(), word.end());
for (auto iter = word.begin(); iter != word.end();) {
const auto eq_bounds = equal_range(iter, word.end(), *iter);
int count = distance(eq_bounds.first, eq_bounds.second);
map_[count]++;
iter = eq_bounds.second;
}
len_ = word.length();
}
T factorial(int x) {
return (x <= 1) ? 1 : x * factorial(x - 1);
}
void DecrementByAndEraseIfZero(int key, int decrement_amount) {
auto iter = map_.find(key);
if (iter == map_.end()) {
throw "key not found";
}
int* val = &iter->second;
(*val) -= decrement_amount;
if ((*val) < 0) {
throw "val below zero";
}
if (*val == 0) {
map_.erase(iter);
}
}
T CountUniqueWords() {
for (const auto kv : map_) {
cout << "<" << kv.first << ":" << kv.second << ">, ";
}
cout << "\n";
if (map_.size() == 1) {
if (map_[1] == 0) {
return 0;
} else {
return factorial(map_[1]);
}
}
T denom;
for (const auto iter : map_) {
denom = pow(factorial(iter.first), iter.second);
}
T sum = factorial(len_) / denom;
vector<int> recurse_keys;
for (const auto iter : map_) {
recurse_keys.push_back(iter.first);
}
--len_;
for (const int key : recurse_keys) {
if (key == 1) {
continue;
}
int repeats = map_[key];
DecrementByAndEraseIfZero(key, 1);
map_[key-1]++;
sum += repeats * CountUniqueWords();
map_[key]++;
DecrementByAndEraseIfZero((key-1), 1);
}
++len_;
return sum;
}
};
int main() {
vector<string> tokens;
while (ReadLine(" ", &tokens)) {
if (tokens.size() != 1) {
break;
}
cout << UniqueWords(tokens.at(0)).CountUniqueWords() << "\n";
}
}
Isn't this a simple permutation? So this means it's factorial of length of "EFFICIENT" = 9! = 362880
see 9! will also consider F & F as distinct alphabets .true for I also so
answer should be =9!/(2!*2!).
I seem to have misunderstood the question. Why would you consider F & F as distinct alphabets if "FFEICIENT" is valid word based on the problem statement. Also, the 9! didn't consider words whose length is less than 9. Thus, this could be (2^n - 1) as @Epic_coder mentioned. Check http://en.wikipedia.org/wiki/Combination#Number_of_k-combinations_for_all_k
CORRECTION:
2^n -1 subsets not unique words. check my response to @Epic_coder
It's permutation, but note that there are repeated characters.
- oOZz June 21, 20139!/2!x2!x2!= 45360, because you have 9 characters, but there are two Es, Is, and Fs.