## Google Interview Question for Software Engineer / Developers

Country: United States
Interview Type: Phone Interview

Comment hidden because of low score. Click to expand.
2
of 2 vote

1) In java we can create a Hash<String, int> where int will be the count(though we do not need that) Later we can find out the length of the hash. It will work for java.

2) In case of C we can use trie to implement this.

Please let me know if there is any better solution than the above.

Comment hidden because of low score. Click to expand.
0

using trie is a good option, but it takes O(n * m) time where m is the length of the string.

Comment hidden because of low score. Click to expand.
0

If you use a hash, won't you still have an 'm' factor for computing the hash function?
Actually, I can't imagine any algorithm without a dependency on 'm'...

Comment hidden because of low score. Click to expand.
0

In java you actually have a HashSet. Internally it uses a HashMap, but still you wouldn't have to think about the key -> value pair.

Comment hidden because of low score. Click to expand.
1
of 1 vote

Hashing would be best approach for O(n).
n = length of array.
Before inserting each element to hash table look for if found n-- otherwise insert to table.
return n

Comment hidden because of low score. Click to expand.
0
of 0 vote

is it array of Strings??
we can solve it in many ways..can you give the complexity of the soln?

Comment hidden because of low score. Click to expand.
0

Yes, array of strings, complexity O(n)

Comment hidden because of low score. Click to expand.
0

Hi all, I am unable to understand the question. Could you please elaborate it.

Comment hidden because of low score. Click to expand.
0
of 2 vote

Copy each element of the array into std::set. Size of set = number of unique elements!
complexity = O(n)

Comment hidden because of low score. Click to expand.
0

Mmm... this would lead to the question: in terms of what data structure does the STL implement sets?

Comment hidden because of low score. Click to expand.
0

A std::set is implemented as a red-black tree, so this "algorithm" would be O(n log n).

Comment hidden because of low score. Click to expand.
0

@Anonymous std::set is not implemented as a red-black tree, rather as a "binary search tree" and insertion in set would take logarithmic time and we find the size of set in constant time.

Comment hidden because of low score. Click to expand.
0

std::set is indeed red-black tree,which on it's turn is binary search tree (balanced), so overall complexity is NlogN (inserting N elements, logN for each insertion).

Comment hidden because of low score. Click to expand.
0
of 0 vote

TRIE IS good option. but only probelm it wastes so much of space.

step1: scan all strings and construct trie.
step2 : do Trie traversal, find unique string counts.

TC O(n)
SC O(n)

Comment hidden because of low score. Click to expand.
0

I don't think it wastes much more than a hash table.
Moreover, you don't need two passes: while building the trie, just increment a counter whenever you have to switch a node from non-terminal to terminal.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Trie is a good option, but proper approach would be cluster the strings with size as a perameter, then using a trie.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Hash option is good - with the caveat that a good hashing function is used with very low probability of collision. Also note that the strings themselves don't need to be inserted into the hash table, we only need to generate the keys.

Comment hidden because of low score. Click to expand.
0

can this problem be solved using operating on the given array?
for example:

/*Let 's' be the size of the array of strings and let 'a' be the array of strings*/
int counter(char a[][])
{
for(int i=0;i<s;i++)
{
if(!strcmp(s[i],s[i+1])
{
count++;/*This is the variable which is initialised to zero and counts no. of matching string*/
}
}
if(count>0)
{
count++;
}
return (s-count);
}

Comment hidden because of low score. Click to expand.
0

You can't have a perfect hash function, so for every collision you must compare the two strings to see if they're really the same, even if probability is low.
This means that you have to store the strings along with their keys.

Comment hidden because of low score. Click to expand.
0
of 0 vote

I'm thinking to use hashMap in java, each char element in the string as a key. Only put the item into the table when it's the first time to find the key. Then the size of the items will be the number of distinct element. Please correct me if anything is wrong. thanks!

``````public int distinct_element(String str){
HashMap<Character, Boolean> map=new HashMap<Character, Boolean>();
for(int i=0;i<str.length();i++){
char value=str.charAt(i);
if(map.containsKey(value)==false){
map.put(value, true);
}
}
return map.size();
}``````

Comment hidden because of low score. Click to expand.
0

Here is the essential thing is input is not number of words, it is n characters.

Comment hidden because of low score. Click to expand.
0
of 0 vote

How could it be o(n)? say string 12345, it will have 5 * (5 + 1) /2 node, that means if you count one by one, it takes n**2 / 2 steps. it is o(n**2).

Comment hidden because of low score. Click to expand.
0
of 0 vote

can someone please elaborate the question??

Comment hidden because of low score. Click to expand.
0
of 0 vote

So,

I could read the 2 possible solutions:
1) By Hash Table: Which is straight Forward. We just need to come up with a good Hash Function
2) Build TRIE.

Is there any other good solutions?

Thanks

Comment hidden because of low score. Click to expand.
0

If you want to use trie here, you need to ask the interviewer whether the characters in the string are in a limited set.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Use the first step of KMP.

For example, given a string "abab", the number of substrings is 4 * 5 / 2 = 10

Compute the next array, we got A= [-1 , -1 , 0 , 1]. This array implies the redundancies of the string. For example, A = 0, we know that s = s, there are one redundant substring. A = 1, we know that s[2, 3] = s[0,1] and s = s, there are two redundant substrings.

So the result is: n(n + 1) / 2 - SUM(A[i] + 1) i from 0 to n

Complexity: O(n)

Comment hidden because of low score. Click to expand.
0

Sorry. The algorithm is wrong.

We should use suffix array and lcp array to solve this problem. The complexity is O(n)

Give questions "cabab", we got the suffix array as follows

ab 0 2
aba 2 3
abab 3 4
b 0 5
bab 1 7
cabab 0 12

There are 12 distinct substrings in total

Time complexity is still O(n)

Comment hidden because of low score. Click to expand.
0
of 2 vote

Use a suffix array and calculate the longest common prefix for the adjacent strings in the array.
Total no. of substrings = n*(n+1)/2

Now find the sum of all LCP for the suffix array.

No. of distinct substrings = Total no. of substrings - sum of all LCP

``````#include<cstdio>
#include<cstring>
#include<algorithm>
#include<cstdlib>

using namespace std;

int pstrcmp(const void* p, const void* q)
{
return strcmp(*(const char**)p, *(const char**)q);
}

int lcp(char *p, char *q)
{
int   len = 0;

while(*p!='\0' && *p == *q){
len++;
p++;
q++;
}

return len;

}

int main()
{
char s;

printf("Enter string:\n");
scanf("%s",&s);

int len;

len = strlen(s);

char *a[len];

//Assign the suffix
for(int i=0;i<len;i++)
a[i] = &s[i];

//Sort the suffix array
qsort(a,len,sizeof(char *),pstrcmp);

int sum = 0;

for(int i = 0;i<len-1;i++){

int   l;

l = lcp(a[i],a[i+1]);
sum = sum + l;
}

int  res;

res = (len * (len+1))/2;

res = res - sum;

printf("%d\n",res);

return 0;

}``````

Comment hidden because of low score. Click to expand.
0

its array of strings..not subtrings in a string

Comment hidden because of low score. Click to expand.
0
of 0 vote

Different solutions:

a) Sort the array of strings
b) Put each string in a Set
c) Put each string in a HashMap
d) Construct a Trie from array of strings.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Could anyone tell me if my solution is O(n)?

``````std::vector<std::string> strings;

strings.push_back("hola");
strings.push_back("heli");
strings.push_back("caca");
strings.push_back("caca");
strings.push_back("pedo");
strings.push_back("heli");
strings.push_back("hola");

std::unordered_set<std::string> set;

for (unsigned int k = 0; k < strings.size(); ++k)
{
std::string string = strings.at(k);
set.insert(string);
}

int count = set.size();``````

Comment hidden because of low score. Click to expand.
0

Nope. Not O(n).

And why in the world would you name the variable as 'string'?

Comment hidden because of low score. Click to expand.
0

Yes, you are right. But, why in the world would you answer my question with a simple "Not O(n)" instead of giving me the run time of my code?

Comment hidden because of low score. Click to expand.
0

Oh sorry. Do you want your money back?

Comment hidden because of low score. Click to expand.
0

Oookay... enough. Thank you so much anyway :)

Comment hidden because of low score. Click to expand.
0

Assuming set operations are O(log n), this is Omega(n^2 log n).

Comment hidden because of low score. Click to expand.
0
of 0 vote

Below is my Java code for this problem:
I used a HashMap. I would ask the interview does Capitalization matter? (Ex. Is Cat the same as CaT?)
If Capitalization matters, use the code below. If not, add
eachString = eachString.toLowerCase(); to the for loop.

``````public static int distinctString ( String [] inputArray){

Map <String, Integer > hashMapOfString = new HashMap <String, Integer > ();

int distinctCount =0;

for ( String eachString : inputArray){

if ( ! hashMapOfString.containsKey(eachString) ){
hashMapOfString.put(eachString,1);
distinctCount++;
}

}

return distinctCount;
}``````

Comment hidden because of low score. Click to expand.
0
of 0 vote

In Java , we can use a HashSet. Add a string to the HashSet, if it returns false, which means the String is already present there, then remove that element from the HashSet. The resultant HashSet will contain all the distinct Strings.

Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

### Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

### Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.