Given an array of strings of 0

Google Interview Question for Software Engineer / Developers

0

of 0 votes

34
Answers
Given an array of strings of 0s and 1s. X and Y are also given. Return the maximum number of elements in a subset of the array elements which will X number of zeroes and Y number of 1s when combined. For eg: if array[] = {"01", "10", "0", "110"} X=3, Y=2
Answer should be 3 since first 3 strings when combined will give the required number of 0s and 1s.
- controlc September 18, 2011 in India | Report Duplicate | Flag | PURGE
Google Software Engineer / Developer Algorithm

Email me when people comment.

An error occurred in subscribing you.

Country: India
Interview Type: In-Person

More Questions from This Interview

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 6 vote

This solution seems to work, however I think the time complexity is O(X*Y*L) where L is the number of characters in the array of strings:

import java.util.*;
public class subsetSize {
	String[] arr;
	HashMap<int[], Integer> h;
	
	public subsetSize(String[] arr) {
		h = new HashMap<int[], Integer>();
		this.arr = arr;
	}
	
	public static int count0s(String s) {
		char[] c = s.toCharArray();
		int n = 0;
		for (int i = 0; i < c.length; i++) {
			if (c[i] == '0') n++;
		}
		return n;
	}
	
	public Integer getSubsetSize(int x, int y) {
		return getSubsetSize(arr.length-1, x, y);
	}
	
	public Integer getSubsetSize(int i, int x, int y) {
		Integer r;
		r = h.get(new int[] {i,x,y});
		if (r!=null) return r;
		
		if (i==-1) {
			if(x==0 && y==0) r =  0;
			else r = -1;
		} else if(x < 0 || y < 0) r = -1;
		else {
			int n0s = count0s(arr[i]);
			int n1s = arr[i].length() - n0s;
			
			Integer s1 = getSubsetSize(i-1, x-n0s,y-n1s) + 1;
			Integer s2 = getSubsetSize(i-1, x, y);
			r = Math.max(s1,s2);
		}
		h.put(new int[]{i,x,y}, r);
		return r;
	}
}

The recurrence relation is subsetSize(i,x,y) = max(subsetSize(i-1,x,y), subsetSize(i-1,x-number of 0s in arr[i], y-number of 1s in array[i]))

Each sub-problem takes O(l) where l is the length of array[i]. There are n*X*Y subproblems so the entire solution runs in O(n*X*Y*l). Since the number of individual characters in the array is equal to n*l, the solution runs in O(X*Y*L) which seems to be pseudo-polynomial due to X and Y not being given in unary encoding.

- Anonymous September 21, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Sorry, there was a small bug.

Integer s1 = getSubsetSize(i-1, x-n0s,y-n1s) + 1;
Integer s2 = getSubsetSize(i-1, x, y);
r = Math.max(s1,s2);

should be

Integer s1 = getSubsetSize(i-1, x-n0s,y-n1s) + 1;
Integer s2 = getSubsetSize(i-1, x, y);
if (s1 == 0) s1 = -1;
r = Math.max(s1,s2);

- Anonymous September 21, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I am think this hashmap won't work in the way you want since you didn't override the .hashcode() and .equals() method in the array object, hence h.get(...) will always return null...

- Anonymous October 10, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

This code has a bug that you didn't keep track the ones of in the array used. Like the example from the question, if we are looking for x=3, y=2. We want to find indexes 0, 1, 2 from the array. If you didn't keep track the one used, you may run into 0, 0, 2.

- Jeff October 22, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Hey, seeing your code, I got confused. Please clarify. Are you just making combinations of the String in the set?

- Deepak February 25, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

#include <iostream>
using namespace std;

int main()
{
	int f[100][100];

	int X = 3;
	int Y = 2;

	memset(f, 0, sizeof(f));

	const int aSize = 4;
	string a[aSize] = {"01", "10", "0", "110"};

	for(int i = 0; i < aSize; i++)
	{
		int zeroNum = 0;
		int oneNum = 0;
		for(int j = 0; j < a[i].size(); j++)
			if (a[i][j] == '1')
				oneNum++;
			else
				zeroNum++;

		for(int x = X; x >= 0; x--)
			for(int y = Y; y >= 0; y--)
			{
				int preX = x - zeroNum;
				int preY = y - oneNum;
				if (preX >= 0 && preY >= 0 && (f[preX][preY] > 0 || (preX == 0 && preY == 0)))
					f[x][y] = max(f[x][y], f[preX][preY] + 1);
			}

	}

	cout << f[X][Y] << endl;
}

- remlostime October 22, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@remlostine
Can you please explain a bit?
I tried your code for{01,001}
its giving me 4 as answer but it should be 2 only.

- kehkelunga November 13, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

dp[i][j] = j - i + 1 ; // if after concatination of these i to j elements the required no. of zeros and 1's are found.
or
dp[i][j] = max_int ; // if not found.

dp[i][j] = min(dp[i][j-1] ,dp[i+1][j], dp[i][j])

- Anonymous September 18, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

There are a couple of issues here. Firstly it should be max and not min. Secondly, the elements in the subset may or may not be contiguous.

- controlc September 18, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

what do you mean by continguos here.. ?

- Anonymous September 18, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

What if the given array is like {"01", "10", "110", "0"}? Does your solution still returns 3?

- controlc September 18, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

No... so what could be the actual solution... ?

- Anonymous September 18, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

what do you mean by continguos here.. ?

- Anonymous September 18, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

No... so what could be the actual solution... ?

- Anonymous September 18, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Sort the given array, in increasing no. of symbols each string contain.
like array[] = {"01", "10", "0", "110"} will become after sorting
array[] = {"0","01", "10", "110"}
int ans=0; // this will be the answer
Now for i=0 to n-1(array length)
a= No. of 0's in array[i];
b= No. of 1's in array[i];
if(a<=x&&b<=y){
ans++;//this element will be in subset
x-=a; y-=b;
}

- HarryPawar September 18, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

not working for all cases..
Eg. array=['11','00','110001']
x=3, y=3

- harry.iitr September 18, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

<pre lang="" line="1" title="CodeMonkey56585" class="run-this">typedef struct node
{
int zeroNums;
int oneNums;
int index;
}node;

int maxSubArrNum(node *arr, int nLength,int X,int Y)
{
if (NULL == arr || nLength <= 0)
return -1;

if (X < 0 || Y < 0)
return -1;

if (0 == X && 0 == Y)
return 0;

if (X == arr[0].zeroNums && Y == arr[0].oneNums)
return 1;

int max = -1;
for (int i = 1;i < nLength;i++)
{
int tmp = maxSubArrNum(arr+i,nLength-i,X-arr[0].zeroNums,Y-arr[0].oneNums);
if (tmp >= 0 && tmp > max) max = tmp;
}
if (max != -1)
max++;
return max;
}

int getArrResults(char **arr, int nLength, int X, int Y)
{
if (NULL == arr)
return -1;

node *nodeArr = (node*)malloc(sizeof(node)*nLength);
char *pstr = NULL;
for (int i = 0;i < nLength;i++)
{
pstr = arr[i];
nodeArr[i].oneNums = nodeArr[i].zeroNums = 0;
nodeArr[i].index = i;
while (*pstr)
*pstr++ == '0' ? nodeArr[i].zeroNums++:nodeArr[i].oneNums++;
}

int max = 0;
for (int i = 0;i < nLength;i++)
{
int tmp = maxSubArrNum(nodeArr+i,nLength-i,X,Y);
if (tmp > max) max = tmp;
}
return max;
}</pre><pre title="CodeMonkey56585" input="yes">
</pre>

- Anonymous September 21, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

It's a typical DP problem. Similar to Knapsack. Here's the logic. The recursion equation is like this:
1. Assume ai0/ai1 is the number of 0's/1's in string array[i].
2. Define s[i,j,k] is the max # of elements in array from element 0 to i which contain j 0's and k 1's. 0<=i<=N, 0<=j<=X, 1<=k<=Y
3. Initialize s[i,ai0,ai1]=1
4. s[i,j,k]=s[i-1,j,k](if j<ai0 or k<ai1) or
s[i,j,k]=max(s[i-1,j,k],s[i-1,j-ai0,k-ai1]) (if j>=ai0 and k>=ai1 and s[i-1,j-ai0,k-ai1]>0)

The time complexity is O(N*X*Y) and Space could be O(X*Y) if we only need to return the number of elements.

- g233007 October 09, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

s[i,j,k]=max(s[i-1,j,k],s[i-1,j-ai0,k-ai1] + 1)

- jackass October 18, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

The recurrence relation should be
dp[i][j] = max number of strings such that when concatenated, they contain i Xs and j Ys.
Then

initialize dp[][] with -1
dp[0][0] = 0;
for(i = 0 to 3) {
for(j = 0 to 2) {
for(all strings as string){
    nX = countX(string);
    nY = countY(string);
    if(nX >=i && nY >= j)
        dp[i][j] = max(dp[i-nX][j-nY]+1,dp[i][j]);
}
}
}
dp[3][2] should be yor answer.

- viiids November 09, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

int maxNumber(vector<string> arr, int X, int Y) {

int dp[arr.size()][X][Y];
int ai0[arr.size()], ai1[arr.size()];

for(int i = 0; i <= X; i++) {
for(int j = 0; j <= Y; j++) {
dp[0][i][j] = 0;
}
}

for(int i = 0; i < arr.size(); i++) {
ai1[i] = 0;
ai0[i] = 0;
for(int j = 0; j < arr[i].length(); j++) {
if(arr[i][j] == '1')
ai1[i]++;
else
ai0[i]++;
}
}

if(ai0[0] <= X && ai1[0] <= Y)
dp[0][ai0[0]][ai1[0]] = 1;

for(int i = 1; i < arr.size(); i++) {
for(int x = 0; x <= X; x++) {
for(int y = 0; y <= Y; y++) {
dp[i][x][y] = dp[i-1][x][y];
if(x-ai0[i] >= 0 && y-ai1[i] >=0) {
if(dp[i-1][x-ai0[i]][y-ai1[i]] != 0)
dp[i][x][y] = max(dp[i][x][y], dp[i-1][x-ai0[i]][y-ai1[i]] + 1);
else {
if(x == ai0[i] && y == ai1[i])
dp[i][x][y] = max(dp[i][x][y], 1);
}
}
}
}
}

cout << dp[arr.size() - 1][X][Y] << endl;

}

- Anon January 27, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

int maxNumber(vector<string> arr, int X, int Y) {

        int dp[arr.size()][X][Y];
        int ai0[arr.size()], ai1[arr.size()];



        for(int i = 0; i <= X; i++) {
                for(int j = 0; j <= Y; j++) {
                        dp[0][i][j] = 0;
                }
        }

        for(int i = 0; i < arr.size(); i++) {
                ai1[i] = 0;
                ai0[i] = 0;
                for(int j = 0; j < arr[i].length(); j++) {
                        if(arr[i][j] == '1')
                                ai1[i]++;
                        else
                                ai0[i]++;
                }
        }

        if(ai0[0] <= X && ai1[0] <= Y)
                dp[0][ai0[0]][ai1[0]] = 1;

        for(int i = 1; i < arr.size(); i++) {
                for(int x = 0; x <= X; x++) {
                        for(int y = 0; y <= Y; y++) {
                                dp[i][x][y] = dp[i-1][x][y];
                                if(x-ai0[i] >= 0 && y-ai1[i] >=0) {
                                        if(dp[i-1][x-ai0[i]][y-ai1[i]] != 0)
                                                dp[i][x][y] = max(dp[i][x][y], dp[i-1][x-ai0[i]][y-ai1[i]] + 1);
                                        else {
                                                if(x == ai0[i] && y == ai1[i])
                                                        dp[i][x][y] = max(dp[i][x][y], 1);
                                        }
                                }
                        }
                }
        }

        cout << dp[arr.size() - 1][X][Y] << endl;

}

- Anonymous January 27, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

wrong answer

- Anonymous July 27, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

I use dp to solve it. N represent the index of string, X represent the number of 0 and Y the number of 1. Here is the code. It passes several cases. Please tell me if you find it wrong.

#include <iostream>
#include <vector>
#include <climits>
using namespace std;

int minumum_numnber01(vector<string> strs, int X, int Y)
{
    int N=strs.size();
    int dp[N+1][X+1][Y+1];
    for(int i=0; i<=N; i++)
    {

        for(int j=0; j<=X; j++)
            for(int k=0; k<=Y; k++)
              dp[i][j][k]=INT_MAX/2;
          dp[i][0][0]=0;
    }

    for(int i=1; i<=N; i++)
     for(int j=1; j<=X; j++)
      for(int k=1; k<=Y; k++)
    {
        int tx=0, ty=0;
        for(auto ele:strs[i-1])
        {
           if(ele=='0') tx++;
           if(ele=='1') ty++;
        }
         dp[i][j][k]=dp[i-1][j][k];
         if(j-tx>=0 && k-ty>=0)
             dp[i][j][k]=min(dp[i-1][j][k], dp[i-1][j-tx][k-ty]+1);

    }
    cout<<dp[1][2][1]<<endl;
    return dp[N][X][Y];
}
int main()
{
   vector<string> strs{"001","01","0001","00011"};
   cout<<minumum_numnber01(strs, 6, 4)<<endl;
    return 0;
}

- Aimee July 29, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

typedef struct node
{
int nums0;
int nums1;
int index;
};

struct node means node.index string has node.nums0 0 and node.nums1 1.
so the question is equal to find node[i],node[j]..., that node[i].nums0
+node[j].nums0+... = x,node[i].nums1+node[j].nums1+... = y.
let SubSumEle(nodeArr, Arrlength,x,y) is result, the max result is
max(SubSumEle(nodeArr+0, Arrlength-0,x,y) ,SubSumEle(nodeArr+1, Arrlength-1,x,y) ,...)

- Anonymous September 18, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 3 vote

It is NP-complete.
For simplicity lets assume that the number of 0's or 1's is less than 100.
Then we can encode each string as y*100 + x where y is the # of 1's and x of 0's.
eg. the array {"01", "10", "0", "110"} becomes {101,101,1,201} and for X=3, Y=2 we get 203.
Now the problem is reduced to subset sum problem which is NP-complete.

- Anonymous September 19, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

You have changed the encoding. If the input is unary (like in this case) there is a linear time algorithm for subset sum, based on dynamic programming. Search the web for Strongly NP-Complete and pseudo polynomial time algorithm.

So as given, the problem is not NP-Complete (or to be more precise, NP-Hard, as it is not a decision problem).

- Anonymous September 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@anonymous: I think you are wrong here. No linear time algorithm exists for subset sum. It calls pseudo-polynomial time. Look up in wiki.

- anonymous2 September 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@anonymous2: Read properly: If the input is unary, then there is a polynomial time algorithm (linear might not be completely accurate) for subset sum.

i.e if array is {2,5,7} it is given as {"11", "11111", "1111111"}. Here size of the input is 14.

Check this out: cs.toronto.edu/~pmccabe/csc363-2005S/notes17.pdf

Look for UNARY-SUBSET-SUM.

- Anonymous September 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I concur with you. Thanks for introducing the fancy term "UNARY-SUBSET-SUM". As it shows the complexity is O(nt) where n = # of items, and t = size of input (represented in unary system).

So, for the given problem, how could we derive the solution? What would be the complexity of the DP solution?

- anonymous2 September 20, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

It is not fancy. All it says is the input representation matters when deciding complexity (especially so when taking about NP-Completeness).

See the first pages of pdf for the DP solution.

- Anonymous September 21, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I termed it "fancy" because it doesn't convince me to tell that generalized SUBSET SUM problem belongs to P, even corresponding UNARY-SUBSET-SUM does. What the point of thinking the input in UNARY system (where we formulate I/O in binary) and thus claiming that UNARY-SUBSET-SUM is solvable in polynomial time, when the complexity is O(nt) - same complexity of solving generalized subset sum. The only difference in UNARY-SUBSET-SUM formulation is t = "size" of input, whereas in generalized SUBSET-SUM t depends on "value" of input (and thus make the running time pseudo-polynomial in later case).

Now for the given problem, how could you encode it in UNARY? I can think about DP that solve it in O(nXY) where X = # of 0s, Y = # of 1s. Any faster method there?

- anonymous2 September 23, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@anaonymous2: You are missing the point entirely. UNARY-SUBSET-SUM is in P. The 'normal' SUBSET-SUM is NP-Complete.

As to the original interview question: it is ALREADY IN UNARY! How much time will it take to even read the whole input array? Isn't it Theta(t)?

If the encoding was different (eg you are given number of zeroes and ones instead of the whole string), it would be O(n + log t) (or something like that). Exponential difference.

For instance, say n was 5 and the five strings had around 2^(32) zeroes and ones each. This is like 40GB of data!

If instead the encoding was number of zeroes and ones, you would not require more than 320 bits!

By doing brute force in the second scenario (320 bits), you can quickly solve subset sum (as n=5).

But in the first scenario, you cannot avoid reading 40GB of data!

So saying there is no difference (rephrasing you) is incorrect.

And you are missing the context of the comments. Some guy comes in and claims the original problem is NP-Complete, which is _incorrect_ as the input is in _unary_. The comment was to point out that mistake.

- Anonymous September 23, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Thanks a lot. Your explanation clears out my confusion.

- anonymous2 September 23, 2011 | Flag

CareerCup

Google Interview Question for Software Engineer / Developers

Books

Videos

Resume Review

Mock Interviews