Give you an array of String,

Google Interview Question for Software Engineer / Developers

0

of 0 votes

37
Answers
Give you an array of String,
return number of distinct strings in that array.
- superffeng April 20, 2012 in United States | Report Duplicate | Flag | PURGE
Google Software Engineer / Developer Algorithm

Email me when people comment.

An error occurred in subscribing you.

Country: United States
Interview Type: Phone Interview

More Questions from This Interview

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 2 vote

1) In java we can create a Hash<String, int> where int will be the count(though we do not need that) Later we can find out the length of the hash. It will work for java.

2) In case of C we can use trie to implement this.

Please let me know if there is any better solution than the above.

- Sourav April 21, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

using trie is a good option, but it takes O(n * m) time where m is the length of the string.

- krishna April 30, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

If you use a hash, won't you still have an 'm' factor for computing the hash function?
Actually, I can't imagine any algorithm without a dependency on 'm'...

- Anonymous May 06, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

In java you actually have a HashSet. Internally it uses a HashMap, but still you wouldn't have to think about the key -> value pair.

- Martin January 30, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

Hashing would be best approach for O(n).
n = length of array.
Before inserting each element to hash table look for if found n-- otherwise insert to table.
return n

- Punit Jain April 22, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

is it array of Strings??
we can solve it in many ways..can you give the complexity of the soln?

- 9aran April 20, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Yes, array of strings, complexity O(n)

- superffeng April 20, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Hi all, I am unable to understand the question. Could you please elaborate it.

- Srishti April 21, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 2 vote

Copy each element of the array into std::set. Size of set = number of unique elements!
complexity = O(n)

- Sai April 21, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Mmm... this would lead to the question: in terms of what data structure does the STL implement sets?

- Anonymous May 06, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

A std::set is implemented as a red-black tree, so this "algorithm" would be O(n log n).

- Anonymous May 07, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@Anonymous std::set is not implemented as a red-black tree, rather as a "binary search tree" and insertion in set would take logarithmic time and we find the size of set in constant time.

- Unknown June 05, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

So funny to read these comments :)
std::set is indeed red-black tree,which on it's turn is binary search tree (balanced), so overall complexity is NlogN (inserting N elements, logN for each insertion).

- gevorgk July 04, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

TRIE IS good option. but only probelm it wastes so much of space.

step1: scan all strings and construct trie.
step2 : do Trie traversal, find unique string counts.

TC O(n)
SC O(n)

- Anonymous April 21, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

I don't think it wastes much more than a hash table.
Moreover, you don't need two passes: while building the trie, just increment a counter whenever you have to switch a node from non-terminal to terminal.

- Anonymous May 06, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Trie is a good option, but proper approach would be cluster the strings with size as a perameter, then using a trie.

- krishna April 22, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Hash option is good - with the caveat that a good hashing function is used with very low probability of collision. Also note that the strings themselves don't need to be inserted into the hash table, we only need to generate the keys.

- Anonymous April 22, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

can this problem be solved using operating on the given array?
for example:

/*Let 's' be the size of the array of strings and let 'a' be the array of strings*/
int counter(char a[][])
{
for(int i=0;i<s;i++)
{
if(!strcmp(s[i],s[i+1])
{
count++;/*This is the variable which is initialised to zero and counts no. of matching string*/
}
}
if(count>0)
{
count++;
}
return (s-count);
}

- cder April 26, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

You can't have a perfect hash function, so for every collision you must compare the two strings to see if they're really the same, even if probability is low.
This means that you have to store the strings along with their keys.

- Dave May 06, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

I'm thinking to use hashMap in java, each char element in the string as a key. Only put the item into the table when it's the first time to find the key. Then the size of the items will be the number of distinct element. Please correct me if anything is wrong. thanks!

public int distinct_element(String str){
			HashMap<Character, Boolean> map=new HashMap<Character, Boolean>();
			for(int i=0;i<str.length();i++){
				char value=str.charAt(i);
				if(map.containsKey(value)==false){
					map.put(value, true);
				}
			}
			return map.size();
		}

- Anonymous June 14, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Here is the essential thing is input is not number of words, it is n characters.

- joejo August 29, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

How could it be o(n)? say string 12345, it will have 5 * (5 + 1) /2 node, that means if you count one by one, it takes n**2 / 2 steps. it is o(n**2).

- Joejo August 28, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

can someone please elaborate the question??

- Puneet August 29, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

So,

I could read the 2 possible solutions:
1) By Hash Table: Which is straight Forward. We just need to come up with a good Hash Function
2) Build TRIE.

Is there any other good solutions?

Thanks

- Andy2000 September 09, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

If you want to use trie here, you need to ask the interviewer whether the characters in the string are in a limited set.

- cjmemory September 23, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Use the first step of KMP.

For example, given a string "abab", the number of substrings is 4 * 5 / 2 = 10

Compute the next array, we got A= [-1 , -1 , 0 , 1]. This array implies the redundancies of the string. For example, A[2] = 0, we know that s[2] = s[0], there are one redundant substring. A[3] = 1, we know that s[2, 3] = s[0,1] and s[3] = s[1], there are two redundant substrings.

So the result is: n(n + 1) / 2 - SUM(A[i] + 1) i from 0 to n

Complexity: O(n)

- yy October 24, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Sorry. The algorithm is wrong.

We should use suffix array and lcp array to solve this problem. The complexity is O(n)

Give questions "cabab", we got the suffix array as follows

ab 0 2
aba 2 3
abab 3 4
b 0 5
bab 1 7
cabab 0 12

There are 12 distinct substrings in total

Time complexity is still O(n)

- yy October 24, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 2 vote

Use a suffix array and calculate the longest common prefix for the adjacent strings in the array.
Total no. of substrings = n*(n+1)/2

Now find the sum of all LCP for the suffix array.

No. of distinct substrings = Total no. of substrings - sum of all LCP

#include<cstdio>
#include<cstring>
#include<algorithm>
#include<cstdlib>

using namespace std;

int pstrcmp(const void* p, const void* q)
{
    return strcmp(*(const char**)p, *(const char**)q);
}


 int lcp(char *p, char *q)
{
	int   len = 0;

	while(*p!='\0' && *p == *q){
		len++;
		p++;
		q++;
	}

	return len;

}


int main()
{
	char s[100];

	printf("Enter string:\n");
	scanf("%s",&s);

	int len;

	len = strlen(s);

	char *a[len];


	//Assign the suffix
	for(int i=0;i<len;i++)
		a[i] = &s[i];
	
	//Sort the suffix array
	qsort(a,len,sizeof(char *),pstrcmp);

	int sum = 0;

	for(int i = 0;i<len-1;i++){

		int   l;

		l = lcp(a[i],a[i+1]);	
		sum = sum + l;
	}

	int  res;

	res = (len * (len+1))/2;

	res = res - sum;

	printf("%d\n",res);

	

	return 0;

}

- Mukul October 29, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

its array of strings..not subtrings in a string

- Anonymous August 08, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Different solutions:

a) Sort the array of strings
b) Put each string in a Set
c) Put each string in a HashMap
d) Construct a Trie from array of strings.

- mytestaccount2 October 02, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Could anyone tell me if my solution is O(n)?

std::vector<std::string> strings;
	
	strings.push_back("hola");
	strings.push_back("adeu");
	strings.push_back("heli");
	strings.push_back("caca");
	strings.push_back("caca");
	strings.push_back("pedo");
	strings.push_back("heli");
	strings.push_back("adeu");
	strings.push_back("hola");

	std::unordered_set<std::string> set;

	for (unsigned int k = 0; k < strings.size(); ++k)
	{
		std::string string = strings.at(k);
		set.insert(string);
	}

	int count = set.size();

- Dídac Pérez December 09, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Nope. Not O(n).

And why in the world would you name the variable as 'string'?

- Anonymous December 09, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Yes, you are right. But, why in the world would you answer my question with a simple "Not O(n)" instead of giving me the run time of my code?

- Dídac Pérez December 09, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Oh sorry. Do you want your money back?

- Anonymous December 09, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Oookay... enough. Thank you so much anyway :)

- Dídac Pérez December 09, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Assuming set operations are O(log n), this is Omega(n^2 log n).

- TEACH December 09, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Below is my Java code for this problem:
I used a HashMap. I would ask the interview does Capitalization matter? (Ex. Is Cat the same as CaT?)
If Capitalization matters, use the code below. If not, add
eachString = eachString.toLowerCase(); to the for loop.

public static int distinctString ( String [] inputArray){

			Map <String, Integer > hashMapOfString = new HashMap <String, Integer > ();
			
			int distinctCount =0;

			for ( String eachString : inputArray){

				      if ( ! hashMapOfString.containsKey(eachString) ){
					hashMapOfString.put(eachString,1);
					distinctCount++;	
				}

			}


			return distinctCount;
		}

- koeber99 October 20, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

In Java , we can use a HashSet. Add a string to the HashSet, if it returns false, which means the String is already present there, then remove that element from the HashSet. The resultant HashSet will contain all the distinct Strings.

- Somdutta August 02, 2015 | Flag Reply

CareerCup

Google Interview Question for Software Engineer / Developers

Books

Videos

Resume Review

Mock Interviews