Google Interview Question
Country: United States
As mentioned in another post, the median is Nth and (N+1)/2. Just call this function twice. Still O(log N)
I may not understood your solution completely, but I don't think we need all the trouble.
Since both the arrays are sorted, we start two pointers at the middle elements of the arrays. Compare the elements.If they are equal they are the medians. If one is less than the other, move the pointer of the lesser value to the right. A constant number of such technical manipulations of the pointers and comparisons of the elements, you should be able to realize the median. Let me know if you have a counter-example.
@Murali
Counter example:
A1 = {0,2,8,22,34,50}
A2 = {1,3,4,10,15,26}
Step 1:
A1.median = (8+22)/2 = 15
A2.median = (4+10)/2 = 7
A2.median < A1.median ==> .. (moving A2 pointer right until A2.median = 15)
Then we get A1.median == A2.median (=15)
And we see that 0,2,8,1,3,4,10 are below 15 and 22,34,50,26 are above 15. Difference between the groups is 3, so the answer is wrong.
I haven't tried to run your solution, but thinking about it, I see:
1. This is not binary search if you don't reduce the problem size by half - which you dont do here, you advance your pointers by 1 at each step.
2. The arrays in your example are not of equal size as the problem state
3. How do you "find how many elements are smaller or equal to the middle elements in each array" you don't know anything about the relations between the elements of the two arrays, so you cannot simply sum up the indices to come up with this number (once you chose the average of the two medians, some elements can move to the greater or lower group).
IMHO your solution, if it works, is on O(N)
@Murali, that would be O(N)
@Anonymous, in each step, the range of one of the arrays is reduced in half. maybe you want to check it more carefully
Am I missing something, or is the example above (in Miguel's original answer) confusing because he is using two arrays that are NOT of equals sizes? Example:
v[] = {1,2,7,9}
w[] = {3,4,5,6,8}
Shouldn't these be equal sizes, as stated in the problem?
Input arrays are sorted. we have to find median of array of size 2n. median would be (2n+1)/2
start comparing the each element from array 1 and array 2, which ever element is less move forward for next element on that array only, proceed till (2n+1)/2th element return that element
time complexity ~O(n/2) space complexity: O(1)
The median will always be (nth + (n+1) th)/2 because there are two arrays of size n, so there will be 2n(even number of) elements.
Here is the solution in Java:
// Both arrays are sorted and have the same size
public static double findMedianOfTwoArrays(int[] a1, int[] a2) {
// Since total size is 2N, median is average of (length/2)th and ((length/2)+1)th largest values
int median1 = 0;
int median2 = 0;
int i1 = 0;
int i2 = 0;
for (int i = 0; i < a1.length; i++) {
median1 = a1[i1] <= a2[i2] ? a1[i1++] : a2[i2++];
}
if (i1 >= a1.length)
median2 = a2[i2];
else if (i2 >= a2.length)
median2 = a1[i1];
else
median2 = a1[i1] <= a2[i2] ? a1[i1] : a2[i2];
return (median1 + median2) / 2.0;
}
An O(log N) solution:
Algorithm:
1) Calculate the medians m1 and m2 of the input arrays arr1 and arr2 respectively.
2) If m1 and m2 both are equal then we are done and we return m1 (or m2).
3) If m1 is greater than m2, then median is present in one of the below two sub-arrays:
a) From first element of arr1 to m1 (arr1[0...n/2]).
b) From m2 to last element of ar2 (ar2[n/2...n-1]).
4) If m2 is greater than m1, then median is present in one of the below two sub-arrays:
a) From m1 to last element of ar1 (ar1[n/2...n-1]).
b) From first element of ar2 to m2 (ar2[0...n/2]).
5) Repeat the above process until size of both the sub-arrays becomes 2.
6) If size of the two arrays is 2 then use below formula to get the median:
Median = (max(arr1[0], arr2[0]) + min(arr1[1], arr2[1])) / 2
Complexity Theta(lg N):
double
median(int* begin1, int* end1, int* begin2, int* end2) {
unsigned n = end1 - begin1;
// assume n == end2 - begin2
if (n == 1)
return (*begin1 + *begin2) / 2.0;
n = n / 2;
int m1 = *(begin1 + n);
int m2 = *(end2 - n);
if (m1 < m2)
return median(begin1 + n, end1, begin2, end2 - n);
return median(begin1, begin1 + n, end2 - n, end2);
}
1. Since both arrays have N elements - we totally have 2N elements, so the median must be the average of 2 elements
2. The simplest way and O(N) way is to merge the 2 arrays (O(N)) and look for the (N/2)th element (O(N)).
I doubt it can be done in O(logN), since essentially the total set containing both elements is unsorted and you know nothing about the relations between the elements in the two arrays.
1. Since both arrays have N elements - we totally have 2N elements, so the median must be the average of 2 elements
2. The simplest way and O(N) way is to merge the 2 arrays (O(N)) and look for the (N/2)th element (O(N)).
I doubt it can be done in O(logN), since essentially the total set containing both elements is unsorted and you know nothing about the relations between the elements in the two arrays.
1. Since both arrays have N elements - we totally have 2N elements, so the median must be the average of 2 elements
2. The simplest way and O(N) way is to merge the 2 arrays (O(N)) and look for the (N/2)th element (O(N)).
I doubt it can be done in O(logN), since essentially the total set containing both elements is unsorted and you know nothing about the relations between the elements in the two arrays.
Actually, this question is very similar to 'how to find kth item in two sorted array with size N'.
For median, if sum of length is even, we need to find, 2n/2 + 2n/2+1. When sum of length is odd, we can simply find the number at 2n/2 position.
public static double findMedian(int[] a /* array1 */, int[] b /* array2 */) {
int median = getMedian(a.length, b.length);
if (median == 0) throw new IllegalArgumentException("Can not get median");
if (a[a.length - 1] <= b[0] || b[b.length-1] <= a[0]) {
return preProcessing(a, b, median);
}
if ((a.length + b.length)%2 == 0) {
return (findMedian(a, 0, b, 0, median) + findMedian(a, 0, b, 0, median + 1))/ 2; // 2logK -> O(logK)
}
return findMedian(a, 0, b, 0, median); // logK
}
private static double findMedian(int[] a, int aS, int[] b, int bS, int median /* median */) {
if (median == 1) {
return min(a[aS], b[bS]);
}
if (a[aS + median/2 -1] <= b[bS + median/2 -1] ) {
return findMedian(a, aS + median/2, b, bS, median - (median/2));
} else{
return findMedian(a, aS, b, bS + median/2, median - (median/2));
}
}
1. As both arrays are of size N, the median will be at index N+1.
2. Take a temp variable to run from 0 to N. take separate indexes for both array and adjust the index of the relevant array based on comparison.
int findMedian(int[] a, int[] b) {
int len = a.length;
System.out.println("median will be at pos::" + (len+1));
if (a[0] > b[len-1]) {
System.out.println("median is::" + a[0]);
return a[0];
} else if (a[len-1] < b[0]) {
System.out.println("median is::" + b[0]);
return b[0];
}
// stores the median
int median = -1;
// stores current index of first array
int index1 = 0;
// stores current index of second array
int index2 = 0;
// stores the number of iterations.
int temp = 0;
while (temp < len+1) {
if (a[index1] < b[index2]) {
median = a[index1];
index1++;
temp++;
} else {
median = b[index2];
index2++;
temp++;
}
}
return median;
}
/*
Given 2 set of arrays of size N(sorted +ve integers ) find the median of the resultant array of size 2N.
(dont even think of sorting the two arrays in a third array , though u can sort them. Try something better than order NLogN
for visual c++ 2005
*/
#include "stdafx.h"
#include <iostream>
#include <conio.h>
#include <algorithm>
#include <map>
#include <vector>
#include <list>
#include <iterator>
#include <math.h>
#include <numeric>
#include <sstream>
#include <stack>
#include <string>
using namespace std;
class Solution
{
public:
static int FindNLargestNumberInTwoArray( int arrA[], int nALength, int arrB[], int nBLength, int n)
{
if ( n == 1 )
return max( arrA[ nALength - 1], arrB[ nBLength - 1]);
int nRemoveNum = n / 2;
int nMinRemovedValueInA = arrA[ nALength - nRemoveNum ];
int nMinRemovedValueInB = arrB[ nBLength - nRemoveNum ];
if ( nMinRemovedValueInA > nMinRemovedValueInB )
// remove top nRemove Num in A
return FindNLargestNumberInTwoArray( arrA, nALength - nRemoveNum, arrB, nBLength, n - nRemoveNum );
else
// remove top nRemove Num in B
return FindNLargestNumberInTwoArray( arrA, nALength, arrB, nBLength - nRemoveNum, n - nRemoveNum );
}
static int FindMedian( int arrA[], int nALength, int arrB[], int nBLength)
{
return FindNLargestNumberInTwoArray(arrA, nALength, arrB, nBLength, ( nALength + nBLength + 1 ) / 2 );
}
};
int _tmain(int argc, _TCHAR* argv[])
{
int arrA[] = {1,3,5,7,8,9,17};
int arrB[] = {2,10,11,15,16,17,18};
int nResult = Solution::FindMedian( arrA, sizeof( arrA ) / sizeof( int ), arrB, sizeof( arrB ) / sizeof( int ) );
// output result
cout << "The answer is " << nResult << "." << endl;
_getch();
return 0;
}
I can get O(N)
int getMedimumOfTwoAray(int array1[], int array2[], int size)
{
int min, max;
int a_front, b_front, a_rear, b_rear;
int front_counter, rear_counter;
a_front = b_front = front_counter = rear_counter = 0;
a_rear = b_rear = size - 1;
min = (array1[0] > array2[0]) ? array2[0] : array1[0];
max = (array1[size-1] > array2[size-1]) ? array1[size-1] : array2[size-1];
while ( front_counter < size){
if (array1[a_front] > array2[b_front]){
min = array2[b_front];
b_front++;
}
else{
min = array1[a_front];
a_front++;
}
front_counter++;
if (array1[a_rear] > array2[b_rear]){
max = array1[a_rear];
a_rear--;
}
else{
max = array2[b_rear];
b_rear--;
}
rear_counter++;
}
return (min+max) / 2;
}
The idea is compare the median of both arrays:
Let Array A have smaller median M_A than Array B having M_B
M_A <= M_B
1. Delete elements in A before M_A (less than N elements will come before M_A in merged sorted array)
2. Delete elements in B before M_B (less than N elements will come after M_B in merged sorted array)
(be careful to delete equal number of elements from both A and B so that median remains same, and number of elements deleted should be < N/2 in both arrays)
Code:
--------------------------------------------------------------------------------
#include <iostream>
double findCommonMedian(int A[], int B[], int N) {
if(N <= 0)
return 0;
if(N == 1)
return ((double)A[0] + (double)B[0]) / 2;
int mid = (N - 1) / 2;
int * S = NULL;
int * L = NULL;
if(A[mid] <= B[mid]) {
S = A;
L = B;
}
else {
S = B;
L = A;
}
if(N == 2)
return ((double)S[mid + 1] + (double)L[mid]) / 2;
if(S && L) {
if(N & 1)
findCommonMedian(S + mid, L, mid + 1);
else {
findCommonMedian(S + mid, L, mid + 2);
}
}
}
int main() {
int A[] = {1, 4, 6, 7, 8, 9, 11, 13, 14, 17, 22};
int B[] = {2, 3, 5, 10, 12, 15, 16, 18, 19, 20, 21};
std::cout << findCommonMedian(A, B, 11) << std::endl;
return 0;
}
double findKthElement(int A[], int m, int B[], int n, int k){
if(m==0) return B[k-1];
if(n==0) return A[k-1];
if(k==0) return min(A[0], B[0]);
if(A[m/2] >= B[n/2]){
if(k > m/2+n/2+1)
return findKthElement(A, m, B+n/2+1, n-n/2-1, k-n/2-1);
else
return findKthElement(A, m/2, B, n, k);
}
else{
if(k > m/2+n/2+1)
return findKthElement(A+m/2+1, m-m/2-1, B, n, k-m/2-1);
else
return findKthElement(A, m, B, n/2, k);
}
}
double findMedianSortedArrays(int A[], int m, int B[], int n) {
if((m+n)&1)
return findKthElement(A, m, B, n, (m+n+1)/2);
else
return (findKthElement(A, m, B, n, (m+n)/2)
+ findKthElement(A, m, B, n, (m+n)/2+1))/2;
}
O(Log(n)) solution.
int two_array_median(int a[], int b[], int N) {
if (!a || !b || N < 1)
return -1;
if (N == 1)
return (a[0]+b[0])/2;
if (N == 2)
return (std::max(a[0], b[0]) + std::min(a[1], b[1]))/2;
int m1 = one_array_median(a, N);
int m2 = one_array_median(b, N);
int mid = N/2+1;
if (m1<m2) {
return two_array_median(a+N-mid, b, mid);
} else {
return two_array_median(a, b+N-mid, mid);
}
}
int one_array_median(int a[], int N) {
if (N % 2)
return a[N/2];
else
return (a[N/2 - 1] + a[N/2])/2;
}
void getMedian()
{
int array1[] = { 1,2,3,40,50};
int array2[] = {8,9,15,16,90};
int i =0,j=0;
int counter=1;
int median = (sizeof(array1)/sizeof(array1[0]) +sizeof(array2)/sizeof(array2[0]))/2;
int currentMedian;
while (counter <= median )
{
if(array1[i] < array2[j])
{
currentMedian = array1[i++];
}
else
{
currentMedian = array2[j++];
}
counter++;
}
cout<<"Median =" <<currentMedian;
}
a=[0,…,a_start, …, a_mid,…,a_end,…,a.length-1]
b=[0,…,b_start, …, b_mid,…,b_end,…,b.length-1]
Element number of a[a_start,…,a_mid] and b[b_start,…,b_mid] is half_length.
find the kth from a and b.
Assume a_mid is larger than b_mid.
If kIf k>=half_length, then we know that kth number can’t be in lower part of b, b[b_start,…,b_mid]. So next, we search a[a_start, a_end] and b[b_mid+1,…,b_end]. Meanwhile, we should set new value to k. k=k-(b_mid-b_start+1)
Return result:
if(a_start>a_end){
return b[b_start+k-1];
}
if(b_start>b_end){
return a[a_start+k-1];
}
To find my code and read more analysis, please visit my website: allenlipeng47.com/PersonalPage/index/view/94/nkey
a=[0,…,a_start, …, a_mid,…,a_end,…,a.length-1]
b=[0,…,b_start, …, b_mid,…,b_end,…,b.length-1]
Element number of a[a_start,…,a_mid] and b[b_start,…,b_mid] is half_length.
find the kth from a and b.
Assume a_mid is larger than b_mid.
If k>=half_length, then we know that kth number can’t be in lower part of b, b[b_start,…,b_mid]. So next, we search a[a_start, a_end] and b[b_mid+1,…,b_end]. Meanwhile, we should set new value to k. k=k-(b_mid-b_start+1)
Return result:
if(a_start>a_end){
return b[b_start+k-1];
}
if(b_start>b_end){
return a[a_start+k-1];
}
To find my code and read more analysis, please visit my website: allenlipeng47.com/PersonalPage/index/view/94/nkey
Algorithm:
1) Calculate the medians m1 and m2 of the input arrays ar1[]
and ar2[] respectively.
2) If m1 and m2 both are equal then we are done.
return m1 (or m2)
3) If m1 is greater than m2, then median is present in one
of the below two subarrays.
a) From first element of ar1 to m1 (ar1[0...|_n/2_|])
b) From m2 to last element of ar2 (ar2[|_n/2_|...n-1])
4) If m2 is greater than m1, then median is present in one
of the below two subarrays.
a) From m1 to last element of ar1 (ar1[|_n/2_|...n-1])
b) From first element of ar2 to m2 (ar2[0...|_n/2_|])
5) Repeat the above process until size of both the subarrays
becomes 2.
6) If size of the two arrays is 2 then use below formula to get
the median.
Median = (max(ar1[0], ar2[0]) + min(ar1[1], ar2[1]))/2
// A divide and conquer based efficient solution to find median
// of two sorted arrays of same size.
#include<bits/stdc++.h>
using namespace std;
int median(int [], int); /* to get median of a sorted array */
/* This function returns median of ar1[] and ar2[].
Assumptions in this function:
Both ar1[] and ar2[] are sorted arrays
Both have n elements */
int getMedian(int ar1[], int ar2[], int n)
{
/* return -1 for invalid input */
if (n <= 0)
return -1;
if (n == 1)
return (ar1[0] + ar2[0])/2;
if (n == 2)
return (max(ar1[0], ar2[0]) + min(ar1[1], ar2[1])) / 2;
int m1 = median(ar1, n); /* get the median of the first array */
int m2 = median(ar2, n); /* get the median of the second array */
/* If medians are equal then return either m1 or m2 */
if (m1 == m2)
return m1;
/* if m1 < m2 then median must exist in ar1[m1....] and
ar2[....m2] */
if (m1 < m2)
{
if (n % 2 == 0)
return getMedian(ar1 + n/2 - 1, ar2, n - n/2 +1);
return getMedian(ar1 + n/2, ar2, n - n/2);
}
/* if m1 > m2 then median must exist in ar1[....m1] and
ar2[m2...] */
if (n % 2 == 0)
return getMedian(ar2 + n/2 - 1, ar1, n - n/2 + 1);
return getMedian(ar2 + n/2, ar1, n - n/2);
}
/* Function to get median of a sorted array */
int median(int arr[], int n)
{
if (n%2 == 0)
return (arr[n/2] + arr[n/2-1])/2;
else
return arr[n/2]
}
We can reduce this to finding the Nth element in 2 sorted arrays. This can be done in O(log N) with a binary search.
My approach is to have the usual indices for binary searching an array, but for both arrays. Then, find how many elements are smaller or equal to the middle elements in each array. If the amount of numbers less or equal to the middle elements is less than N, then the answer can't be one of the numbers smaller or equal than the smallest middle element, so advance the binary search indices for them. Example:
v[] = {1,2,7,9}
w[] = {3,4,5,6,8}
nth = 4
1) v[midv] = 2, w[midw] = 5, the amount <= middles = 5 which is > N so decrease the largest middle
2) v[midv] = 2, w[midw] = 3 (left=0, right=1); amount = 3 <= N, advance midv, N = 2
3) v[midv] = 7 (3,4), w[midw] = 3 ; amount = 2 <= advance midw, N = 1
4) v[midv] = 7, w[midw] = 4 (1,1) ; amount = 2 > N, advance midv
...
leftv > rightv, answer is w[leftw] = 4
- Miguel Oliveira July 25, 2014