Amazon Interview Question
Country: India
if the array looks like this: int[] a = { 1,5,1,1,1,1}. Then the 2 partitions would be {1,1,1,1,1} and {5} as per above. But avg of a set would be calculated as (sum_of_elements/elements_count). So, if we partition the sets as {1,1,1,1,1} and {5}, then their avgs would be 1 and 5 respectively.
Does this somehow mean that given a set, one of the conditions required for it to be partitioned into 2 sets with same average is that it contains even number of elements?
The Problem is very similar to the Subset problem.I guess the interviewer is targetting the subset problem concept .
with the fact: the average of the whole array should be the same with the two equal averages.
rest is to solve the subset sum problem.
here is the code which tries to find all combinations of such partition.
( well I guess the struct "result" can be simply replaced by another int array)
struct result{
int sum;
int num;
int *arr;
};
bool findSubsetSumWithCount(int *arr, int maxCount, int sum, result &myResult, int curPos)
{
int count = maxCount -1;
if (curPos >= ARRAY_LEN ) return false;
if ( arr[curPos] == sum){
if (count == 0 ){
myResult.num ++;
myResult.sum = arr[curPos];
myResult.arr[curPos] = 1;
return true;
}else{
return false;
}
}else if ( arr[curPos] > sum){
return false;
}else{
if( findSubsetSumWithCount(arr, count, sum-arr[curPos], myResult, curPos +1)){
myResult.arr[curPos] = 1;
myResult.num ++;
myResult.sum += arr[curPos];
return true;
}else{
return findSubsetSumWithCount(arr, count+1, sum, myResult, curPos +1);
}
}
}
void initMyResult(result &myResult, int len)
{
myResult.sum = -1;
myResult.num = 0;
myResult.arr = new int[len];
for (int i =0; i< len; i++)
myResult.arr[i] = 0;
}
int main(void)
{
int arr[ARRAY_LEN] = {10, 5, 7, 6, 2,9,3};
result myResult;
int sum = 0;
int ave = 0;
int len = ARRAY_LEN;
initMyResult(myResult, len);
// find the total ave:
for (int i=0; i < len; i++){
sum += arr[i];
}
ave = (int) sum/ len ;
for(int i =1; i< len; i++){
int calSum = ave * i;
if(findSubsetSumWithCount(arr, i, calSum, myResult, 0)){
cout << "found one result, average is " << myResult.sum/myResult.num << ", the first p
artition is: {" ;
for(int j=0; j<len; j++){
if(myResult.arr[j] == 1)
cout <<" " << arr[j];
}
cout << " }" << endl;
}
initMyResult(myResult, len);
}
cout << "this is the end. if nothing found, then this is an unpartitionable array :(\n";
return 0;
}
tested it with {10, 5, 7, 6, 2,9,3}
results (i'm amazed that there are several combinations):
found one result, average is 6, the first partition is: { 5 7 }
found one result, average is 6, the first partition is: { 10 6 2 }
found one result, average is 6, the first partition is: { 10 2 9 3 }
found one result, average is 6, the first partition is: { 10 5 7 6 2 }
found one result, average is 6, the first partition is: { 10 5 7 2 9 3 }
also tested with {1,2,3,4}
results:
found one result, average is 2, the first partition is: { 2 }
found one result, average is 2, the first partition is: { 1 3 }
found one result, average is 2, the first partition is: { 1 2 3 }
_____________________________________
well, i think there are some duplicates.
change the
for(int i =1; i< len; i++){
int calSum = ave * i;
........
to
for (int i = 1; i < len/2 +1; i++)
would solve the redundancy problem.
But can't the average be a float ?
example : {2,3,4,5} avg = 3.5
array split is {2,5} and {3,4}
Let me know if I am wrong
the average can be float, but if there exists equal subarrays, then the ave*num (=sum) will not be float; otherwise this ave has no corresponding subarrays.
as in your example, avg=3.5, when i =2, 3.5*2 =7, and the alg can find a corresponding subarray.
if 3.5*3=10.5 which is a float, the alg can't find a subarray and therefore for this i=3 it returns false (but will continue to try i=4,5,....).
public static void main(String[] args) {
int a[]={8,15,0,0,12,11};
int b[]={5,8,15,2,6};
new ArrayMidpoint().intersectionOfArrays(a, b);
Map map = new HashMap();
int sum=0;
for (int i = 0; i < a.length; i++) {
sum=sum+a[i];
map.put( sum,i);
}
sum=0;
int index=0,count=1;
for(int i=a.length-1;i>=0;i--)
{
sum+=a[i];
if(map.get(sum)!=null)
{
index=Integer.parseInt(map.get(sum)+"");
if(index+count==a.length-1)
{
break;
}
}
count++;
}
for(int i=0;i<=index;i++)
{
System.out.print(a[i]+" ");
}
System.out.print(" ");
for(int i=index+1;i<a.length;i++)
{
System.out.print(a[i]+" ");
}
}
void print_partitions(int *pStart, int sz)
{
int left_sum = 0;
int right_sum = 0;
int i = sz;
while(i--){
right_sum += pStart[i];
}
for(i=0;i<sz;++i) {
left_sum += pStart[i];
right_sum -= pStart[i];
if ((i+1)*right_sum == (sz - i - 1)*left_sum) {
copy(pStart, &pStart[i+1], ostream_iterator<int>(cout, ", "));
cout << "<-> ";
copy(&pStart[i+1], &pStart[sz], ostream_iterator<int>(cout, ", "));
cout << endl;
}
}
}
void print_partitions(int *pStart, int sz)
{
int left_sum = 0;
int right_sum = 0;
int i = sz;
while(i--){
right_sum += pStart[i];
}
for(i=0;i<sz;++i) {
left_sum += pStart[i];
right_sum -= pStart[i];
if ((i+1)*right_sum == (sz - i - 1)*left_sum) {
copy(pStart, &pStart[i+1], ostream_iterator<int>(cout, ", "));
cout << "<-> ";
copy(&pStart[i+1], &pStart[sz], ostream_iterator<int>(cout, ", "));
cout << endl;
}
}
}
Guys first of all its an NP complete problem and it is a partition problem. There can be two solutions for this
1) A greedy solution where the numbers are traversed in reverse order and the numbers are inserted into two sets each time checking which set has the least sum.
2) Its a differencing algorithm
Let me consider a set {1,2,3,4}- Here we can determine whether the set can be partitioned or not
Algorithm
First have a boolean array bol[], which is like an hash array and this will have the bol[1],bol[2],bol[3],bol[4] as true ...
Next calculate the sum of the subset which is sumsubset=totalsum/2
Then from this sumsubset decrement till the value of bol[sumsubset]=true
Inside the loop make this calculation as the difference=(totalsum-sumsubset)-sumsubset
First have a boolean array bol[]
void balancepoint(int *a ,int n)
{
int i,sum1=0,sum2=0,avg1,avg2;
//find sum of the array
for(i=0;i<n;i++)
sum2+=a[i];
//find left hand side and right hand side avg.
for(i=0;i<n;i++)
{
sum1+=a[i]; avg1=sum1/(i+1);
sum2-=a[i]; avg2=sum2/(n-i-1);
if(avg1==avg2)
{cout<<i ;return;}
}
}
partition may contain elements that are non-contiguous in the array...........Moreover you have to partition rather than taking averages of subarrays....
Obviously there's a solution. In fact, I have at least a few in mind.
The question is how efficient your solution is. That's always the question in computer science. Algorithms for most things are obvious if you're willing to accept inefficient ones.
@eugene.yarovoi My questions was: are we given only the arrays which allow such partitioning, or the array contents are arbitrary, and we shall deduce also whether the solution exists. For example, this array cannot be partitioned: {1,2}
My solution is this.
Sort the array , using any Nlog(N ) algorithm.
After sorting, If the array has N elements, pick the
first element, and the last,
Second element and the last but one th element
and so on.. Until you pick N/2 element this will for subset -1
The remaining N/2 elements will form subset 2
Please read the question. It is talking about average. The subarrays do not have to contain N/2 elements.
I don't understand... If we will not have such array then..any algo can't grantee the division of array as expected. for example : 1, 3, 4, 15. how can you divide this array.
I have manually tested it.. Might be not full proof ..
Sort the array in decreasing order..
1. take the first number and add to one diff array.. and second number add to 2nd array.. avg of 1st will 1st number avg of 2nd will be 2nd number.
2. take the next element from original array and add to array having greater average and keep track of new avg and number of element added to that array for further calculation.
3. Do the step 2 untill the last element of original array..
4. If possibility exist you will find the arrays having equal average.
it will fail in this case:
1,2,3,4,5,15,19
the solution should be 1,5,15 and 2,3,4,19
Both arrays will have avg 7, but your solution will fail in this.
can you please check correctly ..
My algo will give output as
19 5 3 1
15 4 2
which is also correct... :) there may be more then one solution and mine algo is giving one of them...
Sanjay: I don't believe your algorithm will always yield the correct answer. Consider 3 12 24 20 31. The correct answer is 3 20 31 | 12 24. I believe your algo would try 3 24 31 | 12 20 and then consider this unpartionable.
19, 15, 5, 1....
with this test sample the algo fails to partition, into 19, 1 and 15,5
1)Sort the Array (nlgn)
2)Map the sorted array to balanced BST.
3)Now each level average in the binary array should be the average of the entire BST (given array)
4)Partition at any level of the created BST using BFS to make it into two parts.
5)If we need each partition to have n/2 elements, partition at level before leaf nodes.
Time complexity: O(n), Space Complexity: Constant Space(If we are using the array itself as BST).
Solution by mapping the problem to SubSet Problem
1. Calculate the sum of all elements (num[1….N])in the array, let it be X.
2. If X is odd , no solution is possible
3. If X is even, find a subset of elements in the array whose sum is X/2.
Solution to the subset sum problem
Solution consists of filling a 2d array M[i,j]; (Dynamic Programming)where
1 <= i <= N (number of elements in the array)
1 <= j <= X/2 (desired sum)
where M[i,j] = true iff there exists a subset in the array num[1..i] whose sum is j
Fill the array using the following recursion:-
1. M[1,j] = true iff num[1] == j, for 1 <= j <= X/2
(subset sum exists, if the first element is the sum we are looking for )
2. M [i,s] = M[i-1,s] || num[i] == s || M[ i-1,s-num[i] ]
(
A subset some of s exists in array of i elements if either of the following is true :-
1. The subset sum exists in array with less than i elements
2. ith element is equal to the sum we are looking for.
3. subset with less than i elements has a subset_sum of s-num[i], in this case
num[i] provides the remaining part of the required sum
)
Once the array has been filled, M[N,X/2] determines if array can be divided into two subsets with equal sum.
The subsets can be obtained by using a backpointer with each node as follows
1. backpointer = NULL;
2-1 backpointer = num[i-1]
2-2 backpointer = NULL
2-3 backpointer = num[i-s]
to print the subset(one of the subsets),
follow backpointer till a NULL is reached
an element is part of the solution only iff the corresponding value is true
Time : O(N*X)
Space : O(N*X)
what about 1, 3, 5 whose sum is 9 and can be divided in 3 and 1,5 :) check with example man.
Can you elaborate more on how that backpointer can be accomplished? On 2-3 are you sure num[i - s], it is quite easily to be out of index, isn't it?
@Backpointer - if M[N,X/2] is true, follow backpointers till u reach a null, along the trail only the cells that have current value as true are a part of solution(one of the subsets)
@2-3 - the value is false if it is out of bounds
* - as pointed out by Sanjay, this solution does splits the array into two subsets with equal sum and not average.
Thanks for the explaination.
Is backpointer implemented as another array for tracking purposes or some sort of linked-list?
dont u ppl think tat we are trying to complicate a simple problem.. the question said partition the array and NOT divide it into SUBSETS.. so array partition func shud return a split location.. i.e avg(LHS of split)=avg(RHS of split)
public static void main(String[] args) {
int a[]={8,15,0,0,12,11};
int b[]={5,8,15,2,6};
new ArrayMidpoint().intersectionOfArrays(a, b);
Map map = new HashMap();
int sum=0;
for (int i = 0; i < a.length; i++) {
sum=sum+a[i];
map.put( sum,i);
}
sum=0;
int index=0,count=1;
for(int i=a.length-1;i>=0;i--)
{
sum+=a[i];
if(map.get(sum)!=null)
{
index=Integer.parseInt(map.get(sum)+"");
if(index+count==a.length-1)
{
break;
}
}
count++;
}
for(int i=0;i<=index;i++)
{
System.out.print(a[i]+" ");
}
System.out.print(" ");
for(int i=index+1;i<a.length;i++)
{
System.out.print(a[i]+" ");
}
}
The average of those two arrays would be equal to the average of the whole. So once we got to know the average then we can search for subarray which has this average through knapsack algorithm (dynamic programming). Complexity would be n^2
- Aditya January 17, 2012