Amazon Interview Question
SDE-2sCountry: India
C[i] = A[i] - B[i].??
This won't work. The contiguous subarray sum of A and B might be equal, but C[i] need not be zero. Consider the case:
A = 0101
B = 1010
Then C would be -11-11. C is not zero here, but you still have got a subarrays in A and B whose sums are equal
I think the solution might be correct as we have to look at subarray in C whose sum is zero not just C....so if we see C={-11-11} the sum is zero for i=0,j=3
So its basically we are finding i & j where we have equal no of 0`s or 1`s:
Case 1: i=0, j = 0
Sum[A(i,j)] = 0, Sum[B(i,j)] = 1, j-i=0 -> (invalid case as sum A(i,j) not equal to sum B(i,j))
Case 2: i=0, j = 1
Sum[A(i,j)] = 1, Sum[B(i,j)] = 1, j-i=1
Case 3: i=0, j = 2
Sum[A(i,j)] = 1, Sum[B(i,j)] = 2, j-i=2 -> (invalid case as sum A(i,j) not equal to sum B(i,j))
Case 4: i=0, j = 3
Sum[A(i,j)] = 2, Sum[B(i,j)] = 2, j-i=3
Among the valid cases if you see i=0, j=3 is the indices which satisfies both the given conditions and you you have the same from finding the longest sub array from C[i]=A[i]-B[i]
Lets consider another example:
[i] 0 1 2 3 4 5 6 7 8 9 10
A=[ 0 1 0 0 0 1 0 0 1 0 0]
B=[ 1 0 1 1 1 0 1 1 0 1 0]
C=[-1 1 -1 -1 -1 1 -1 -1 1 -1 0]
So the longest sub-array in C whose sum is zero => i = 5 & j = 8
Sum[A(5,8)] = 2, j-i = 3
Sum[B(5,8)] = 2, j-i = 3
One more:
A=[001000100]
B=[000001000] -> i=3, j= 8
Thanks Anonymous for this solution, let me see if I come up with any negative test case to fail you algorithm.
@Sai
Haa! That's brute-force with time complexity O(N^2). Your algorithm will hardly terminate with large inputs.
@Iret,Aam.ok.Udohc
I am not sure its O(N^2).
"This is same as finding the longest subarray with sum zero in the array C[i] = A[i] - B[i]. Prefix sums + hashtable should give it." this looks to me O(n).
Please explain how you see it as O(N^2)
@elaas.eyitoohc.
Once you have computed the C array by doing A[i] - B[j], What will be your algorithm to identify i and j such that (j-i) is largest and C[i] + C[i+1]...+C[j] = 0? If you try out all the possible combinations of i and j such that 0<=i<n and 0<=j<n, then your algorithm will be of complexity O(n^2).
Hope you understand it at least now. BTW, you are a gifted "ayitoohc"
I think it should be C[i] = C[i-1] + A[i] - B[i],
Then start scanning C from both sides for a 0, which will give us (i,j)
I also think this solution she work.ppl having different opinion plz provide test case.
This algo works ! And it's indeed O(n) .
Say we have two Arrays:
A=[ 0 1 0 0 0 1 0 0 1 0 0]
B=[ 1 0 1 1 1 0 1 1 0 1 0]
Then we create two arrays : C[i] = A[i] - B[i] . D[i] = D[i-1] + C[i];
C=[-1 1 -1 -1 -1 1 -1 -1 1 -1 0]
D=[-1,0,-1,-2,-3,-2,-3,-4,-3,-4,-4]
now we consider D-array. if D[i] == D[j] , what does it means ?
it means sum( C[i,j ] ) ==0 .
and since C[i] =A[i] - B[i] , now it means sum( A[i,j] ) == sum ( B[i, j] )
so now the task becomes find two indices in D[] ,and (j - i) is maximum.
we need the help of HashMap.
Here is the algo:
1.create D[] Array. init a empty hash_map
2.traverse the D[] Array , if D[j] exists in hash_map (let's say its valus is i ),then we update the answer with (i, j ) ;
3. if D[j] doesn't exist in hash_map , insert it !
4.finally ,we got the answer , and now you see we only traverse the D[] array one time , so the complexity is O(n ).
5. thanks!
@Iret,Aam.ok.Udohc:
As I said, you are an idiot. Read the original answer again. Compute C[i] = A[i] - B[i] and then you do PREFIX SUMS + HASHTABLE...
(shyJohnnie seems to have elaborated).
Hope you understand it at least now.
Thank you. Come again.
Iret,Aam.ok.Udohc is such a frigging moron, a prime example of 'pearls before swine'. This solution works, and is actually quite nice.
Appeared before. The above solution is correct. See: question?id=14569662 which has code.
Why cant we do it recursive
main()
{
int max diff = INT_MIN;
fun(arr1,arr2,0,n, &max_diff);
}
int fun(int arr1, int arr2, int i, int n, int * max_diff)
{
if(i>j)
return;
// first take the sum from i to j in both array if it same than J-I is maximum so return that
int sum1 = sum(arr1, i,j);
int sum2 = sum(arr2,i,j);
if(sum1 == sum2)
{
if((j-i)>*min_diff)
*min_diff = j-i;
}
else
fun(arr1,arr2,i+1,j);
fun(arr1,arr2,i,j+1);
return *min_diff;
}
Use 2 cumulative arrays C1 and C2 for both the arrays. The locations C1[i] and C2[i] hold the sum of all values up to index i from both of the arrays respectively.
0. Start with a window of size n(starting at index i=0 and ending at index j=n-1)
1. Compare the values of that window size using arrays C1 and C2. Return (i,j) if they are equal
2. Reduce the window size by 1. Now i and j become 0 and n-2 respectively. Compare the values in the cumulative arrays for the ranges: (0,n-2) & (1,n-1)
3. Repeat steps 1 and 2 until the window size is reduced to 1 or a match in the cumulative arrays is found in which case the corresponding (i,j) would be returned.
What is the value in cumulative array for (1,n-1). arent the values cumulated from 0 index ?
@tryingtosolvemystery
The sum from 1..n-1 is obtained from the cumulative array by using C1[n-1] - C1[0].
That is the main advantage of a cumulative array. You can obtain the summation between any two indices (i,j) in constant time once you have a cumulative array.
Consider following example:
0 1 2 3 4 5 6 7 8 9
[1,0,0,0,1,1,1,0,1,0] -> Sum = 5
[1,1,0,0,0,0,0,0,1,0] -> Sum = 3
Improvement from the left: 0 + 1 step
Improvement from the right: 9 - 8 steps
-> trim from the left to index 1
...but the best solution is:
Index: 0 to 4
Diff: 4
Sum: 2
Can you elaborate?
Here is the C# solution.
Please suggest on optimization, with data structure in case of DP.
void Main()
{
// Given 2 binary arrays A and B i.e. containing only 0s and 1s each of size N.
// Find indices i,j such that Sum of elements from i to j in both arrays is equal and j-i (i.e. the length of the set i,j) is the maximum possible value.
Tuple<int,int> result = FindMaxSubArray(0);
Console.WriteLine("start index " + result.Item1.ToString() + "end index" + result.Item2.ToString());
}
// Define other methods and classes here
BitArray A = new BitArray(new bool[]{false,true,false,false,true,true,false,false,false,true});
BitArray B = new BitArray(new bool[]{true,true,false,false,false,true,false,false,true,true});
Tuple<int,int> FindMaxSubArray(int startIndex)
{
int endIndex = 0;
int additionOfA = 0, additionOfB = 0;
if(startIndex >= A.Length - 1)
{
return new Tuple<int,int>(startIndex,-1000);
}
for(int i=startIndex; i<A.Length; i++)
{
additionOfA = additionOfA + Convert.ToInt32(A[i]);
additionOfB = additionOfB + Convert.ToInt32(B[i]);
if(additionOfA == additionOfB)
{
endIndex = i;
}
}
Tuple<int,int> subProblemSolution = FindMaxSubArray(startIndex + 1);
if((subProblemSolution.Item2 - subProblemSolution.Item1) > (endIndex - startIndex))
{
return subProblemSolution;
}
else
{
return new Tuple<int,int>(startIndex,endIndex);
}
}
Its simple recursive solution.
Initially set start index =0 and calculate the max sub array satisfying the condition, and keep incrementing the start end index till length of array. and Compare the resulting subarray length and return the indexes with max subarray length.
Damn! Did you think about the time complexity? You are going to blow up the computer resources man.. Be kind to CPU and RAM.
thats why I asked for
"Please suggest on optimization, with data structure in case of DP", I guess you missed reading it.
You don't need any exotic DS for DP. Use either an array or a hashtable. Your ingenuity lies in coming up with the recursive formulation of a solution for DP and then converting that recursion to use tables.
Optimal solution for this is O(N)...
Basically we can try to classify problems such that kind of approach and complexity can be identified quickly first. Anyone wana share their ideas on the usual approaches you follow while problem solving ?
We can create a temporary array C, such that C[i] = A[i] XOR B[i] and then problem reduces to find the maximum subaray of C with equal number of 0's and 1's
geeksforgeeks.org/largest-subarray-with-equal-number-of-0s-and-1s/
Here is a C# code that refers to a solution mentioned above. Its worst case time complexity is O(n^2) though. I am wondering if there is an O(n) solution.
public static int[] GetLongestBinarySubArrayWithEqualSum(bool[] A, bool[] B)
{
int[] SumA = CreateSumArray(A);
int[] SumB = CreateSumArray(B);
for (int size = A.Length; size >= 1; size--)
{
int count = A.Length - size + 1;
for (int i = 0; i < count; i++)
{
int j = size - 1 + i;
if (HaveEqualSum(SumA, SumB, i, j))
{
return new int[] { i, j };
}
}
}
return null;
}
private static bool HaveEqualSum(int[] SumA, int[] SumB, int i, int j)
{
return ((SumA[j] - SumA[i]) == (SumB[j] - SumB[i]));
}
private static int[] CreateSumArray(bool[] array)
{
int[] sumArray = new int[array.Length];
sumArray[0] = (array[0] ? 1 : 0);
for (int x = 1; x < array.Length; x++)
{
sumArray[x] = sumArray[x - 1] + (array[x] ? 1 : 0);
}
return sumArray;
}
////////
void Func(int a[], int n, int m, int& resi, int& resj)
{
int cnt =0, rcnt=0;
int idx=-1, ridx=-1;
int idx1=-1, ridx1=-1;
printf("cnt1x=%d\n", m);
for(int i=0, j=n-1; i<n && j>=0; i++,j--)
{
if(a[i]) cnt++;
if(a[j])rcnt++;
if(cnt==1 && idx==-1)idx=i;
if(rcnt==1 && ridx==-1)ridx=j;
if(cnt==m && idx1==-1)idx1=i;
if(rcnt==m && ridx1==-1)ridx1=j;
if(cnt==m && rcnt==m)break;
}
int i,j,ri,rj;
if(idx>(n-1-idx1))
{
i=0; j=idx-1;
}
else{
i=idx1+1; j=n-1;
}
if(ridx1>(n-1-ridx))
{
ri=0; rj=ridx1-1;
}
else{
ri=ridx+1; rj=n-1;
}
printf("i,j = %d,%d ri,rj=%d,%d\n", i,j,ri,rj);
if((rj-ri)>(j-i))
{
resi=ri;
resj=rj;
}
else
{
resi=i;
resj=j;
}
}
int main()
{
int a[]={1,0,1,1,1,1,0,1};
int b[]={1,0,0,0,1,1,1,0};
int N=8;
for(int i=0;i<N;i++)
printf("%d ", a[i]);
printf("\nsize=%d\n", N);
for(int i=0;i<N;i++)
printf("%d ", b[i]);
printf("\n");
printf("\n");
int cnt1a=0, cnt1b=0, cnt1x=0;
for(int i=0; i<N; i++)
{
if(a[i])cnt1a++;
if(b[i])cnt1b++;
//if(a[i]!=b[i])
int temp=a[i];
a[i]=((a[i]==b[i]?0:1) && a[i]);
b[i]=((temp==b[i]?0:1) && b[i]);
}
for(int i=0;i<N;i++)
printf("%d ", a[i]);
printf("\nsize=%d\n", N);
for(int i=0;i<N;i++)
printf("%d ", b[i]);
printf("\n");
printf("cnt1xa=%d cnt1xb=%d\n", cnt1a, cnt1b);
cnt1x = cnt1a-cnt1b;
int resi=0, resj=N-1;
if(cnt1x==0) printf("i=0, j=%d\n", N-1);
else if(cnt1x>0)
Func(a, N, cnt1x, resi, resj);
else
Func(b, N, -cnt1x, resi, resj);
printf("resi=%d resj=%d\n", resi, resj);
return 1;
}
\\\\\
void Func(int a[], int n, int m, int& resi, int& resj)
{
int cnt =0, rcnt=0;
int idx=-1, ridx=-1;
int idx1=-1, ridx1=-1;
printf("cnt1x=%d\n", m);
for(int i=0, j=n-1; i<n && j>=0; i++,j--)
{
if(a[i]) cnt++;
if(a[j])rcnt++;
if(cnt==1 && idx==-1)idx=i;
if(rcnt==1 && ridx==-1)ridx=j;
if(cnt==m && idx1==-1)idx1=i;
if(rcnt==m && ridx1==-1)ridx1=j;
if(cnt==m && rcnt==m)break;
}
int i,j,ri,rj;
if(idx>(n-1-idx1))
{
i=0; j=idx-1;
}
else{
i=idx1+1; j=n-1;
}
if(ridx1>(n-1-ridx))
{
ri=0; rj=ridx1-1;
}
else{
ri=ridx+1; rj=n-1;
}
printf("i,j = %d,%d ri,rj=%d,%d\n", i,j,ri,rj);
if((rj-ri)>(j-i))
{
resi=ri;
resj=rj;
}
else
{
resi=i;
resj=j;
}
}
int main()
{
int a[]={1,0,1,1,1,1,0,1};
int b[]={1,0,0,0,1,1,1,0};
int N=8;
for(int i=0;i<N;i++)
printf("%d ", a[i]);
printf("\nsize=%d\n", N);
for(int i=0;i<N;i++)
printf("%d ", b[i]);
printf("\n");
printf("\n");
int cnt1a=0, cnt1b=0, cnt1x=0;
for(int i=0; i<N; i++)
{
if(a[i])cnt1a++;
if(b[i])cnt1b++;
//if(a[i]!=b[i])
int temp=a[i];
a[i]=((a[i]==b[i]?0:1) && a[i]);
b[i]=((temp==b[i]?0:1) && b[i]);
}
for(int i=0;i<N;i++)
printf("%d ", a[i]);
printf("\nsize=%d\n", N);
for(int i=0;i<N;i++)
printf("%d ", b[i]);
printf("\n");
printf("cnt1xa=%d cnt1xb=%d\n", cnt1a, cnt1b);
cnt1x = cnt1a-cnt1b;
int resi=0, resj=N-1;
if(cnt1x==0) printf("i=0, j=%d\n", N-1);
else if(cnt1x>0)
Func(a, N, cnt1x, resi, resj);
else
Func(b, N, -cnt1x, resi, resj);
printf("resi=%d resj=%d\n", resi, resj);
int abc;
cin>>abc;
return 1;
}
Its complexity is O(n). Since we are traversing arrays only twice.
The algo is to find no. of 1s in both arrays a& b,
Then operate
a[i] = (a[i]^b[i])&a[i]
b[i] = (a[i]^b[i])&b[i]
now iterate the array with more no. of 1's.
in the resultant array we'll count no. of 1's=extra 1s and then after excluding subarray with sum zero, rest is the reqd subarray
Sum Of A[i..j] = Sum Of B[i..j]
Sum Of A[i..j]- Sum Of B[i..j] =0
evaluate (Ai-Bi)..(Aj-Bj) and store for further Cumulative sum (say CSi)
Store CSi in Map, ( let it be like 10, 9,...... , 9, 11) what does it mean is: Span (i to j) contains same set of shuffled data, so subtracting within that span it disappeared and came to a value just before starting of the Span.
Reference: shyJohnnie's comment in this discussion
-- At first it sounds like DP or LCS or suffix, prefix or both of them but finally a math puzzle
-- I hate to fry brain, I don't know this company's pay structure or is it worth frying ?
Use cumulative array for comparing...
int maxCommonArray(int A[], int B[], int n)
{
unordered_map<int, vector<int>> dist;
int ASum = 0, BSum = 0, ABDiff;
for (int i=0; i<n; i++)
{
ASum = ASum+A[i];
BSum = BSum+B[i];
ABDiff = ASum-BSum;
dist[ABDiff].push_back(i);
}
//Get the max length;
int maxLength = -1;
unordered_map<int, vector<int>>::iterator dI;
for (dI = dist.begin(); dI != dist.end(); dI++)
{
vector<int> crt = dI->second;
int sizeC = crt.size();
int crtLen = crt[sizeC-1] - crt[0];
if (crtLen>maxLength)
maxLength = crtLen;
}
return maxLength;
}
How about this? The below piece of code has the input Arrays 'a' and 'b'.
static void Main(string[] args)
{
int x = 0;
int y = 0;
int[] a = { 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1};
int[] b = { 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1};
int sum_of_Array_A = 0;
int sum_of_Array_B = 0;
int n = 0;
int i = 0;
int j = 0;
n = a.Length;
for (x = 0; x < n; x++)
{
for (y = x; y < n; y++)
{
sum_of_Array_A = sum_of_Array_A + a[y];
sum_of_Array_B = sum_of_Array_B + b[y];
if (sum_of_Array_A == sum_of_Array_B)
{
if (y - x > j - i)
{
i = x;
j = y;
}
}
}
sum_of_Array_A = 0;
sum_of_Array_B = 0;
}
Console.WriteLine("The maximum length of the sub-array is {0} and the indices are {1}, {2}", j - i, i, j);
}
very simple code and O(n) complexity. hope i didn't miss any corner stone.
int main(){
int a[] ={1,0,1,0,0,1,1,0,1,0};
int b[] ={0,1,0,1,1,1,0,0,0,0};
static int c;
for(int i=0; i<(sizeof(a)/sizeof(a[0])-1); i++){
a[i+1] = a[i] + a[i+1];
b[i+1] = b[i] + b[i+1];
if (a[i+1] == b[i+1])
c=i+1;
}
cout<<"answer is i = 0 and j = "<<c<<endl;
system("pause");
}
void printMaxCommonSubArray(const vector<int> &A, const vector<int> &B)
{
assert(A.size() != B.size());
vector<int> C(A.size(), 0);
for(int i=0; i<A.size(); ++i)
C[i] = A[i]-B[i];
unordered_map<int, int> h;
int sum = 0;
h[0] = -1;
int x = 0;
int y = 0;
for(int i=0; i<C.size(); ++i)
{
sum += C[i];
if(h.find(sum) != h.end())
{
if(y-x+1 < i- h[sum])
{
x = h[sum]+1;
y = i;
}
}
else
h[sum] = i;
}
}
cout<<"from\t"<<x << "to\t"<<y<<"\n";
void Max_IJ(int A[], int B[], int len){
Hash SumTbl;
int i,j,indx, max, sum, prev;
i=j=max=sum=0;
for (indx=0; indx<len;indx++){
// sum so far. "-" means B has more 1's than A, and vice versa.
sum += A[indx]-B[indx];
// if the same "sum" is found before then
// it means that A and B has the same # of 1's and 0's from this last "sum"
// up until now. So, the goal here is just to find which time has the max
// indx - prev where prev is the index right after the last time the same
// "sum" is found.
if ((prev=SumTbl.get(sum)) == 0)
SumTbl.put(sum, indx);
else {
if ((indx-(prev+1)) > max) {
max = indx - (prev+1);
i = prev+1;
j = indx;
}
}//else
}//for
cout <<"Max (i,j) at: ("<<i<<","<<j<<")<<endl;
return;
}
// Just re-format the code for better reading from my last post.
// We just need 1 pass over both array at the same time -- O(n)
void Max_IJ(int A[], int B[], int len){
Hash SumTbl;
int i,j,indx, max, sum, prev;
i=j=max=sum=0;
for (indx=0; indx<len;indx++){
// sum so far. "-" means B has more 1's than A, and vice versa.
sum += A[indx]-B[indx];
// if the same "sum" is found before then
// it means that A and B has the same # of 1's and 0's from this last "sum"
// up until now. So, the goal is just to find which time has the max
// (indx - (prev+1)) where prev is the index right after the last time the
// same "sum" is found.
if ((prev=SumTbl.get(sum)) == 0)
SumTbl.put(sum, indx);
else {
if ((indx-(prev+1)) > max) {
max = indx - (prev+1);
i = prev+1;
j = indx;
}
}//else
}//for
cout <<"Max (i,j) at: ("<<i<<","<<j<<")<<endl;
return;
}
This is same as finding the longest subarray with sum zero in the array C[i] = A[i] - B[i].
- Anonymous August 07, 2013Prefix sums + hashtable should give it.