Amazon Interview Question
Software Engineer / Developers

Yes. That seemed to be the right approach to me also. If we know the numbers are all less than some bound M, we can use counting sort, which is O(n + M); the comparison pass then takes O(n) time.
Here is a sample implementation using quicksort:

#include <stdio.h>

#define N 8     /* size of first array */
#define M 8     /* size of second array */
#define SIZE 10 /* capacity of the result array */

void swap(int s[], int i, int j) {
    int temp = s[i];
    s[i] = s[j];
    s[j] = temp;
}

void quicksort(int s[], int l, int h) {
    int i, last;
    if (l >= h)
        return;
    for (last = l, i = l + 1; i <= h; i++)
        if (s[i] < s[l])
            swap(s, i, ++last);
    swap(s, l, last);
    quicksort(s, l, last - 1);
    quicksort(s, last + 1, h);
}

int main() {
    int s[N] = {2, 5, 6, 8, 10, 2, 3, 5}; /* array1 of size N */
    int t[M] = {4, 7, 3, 5, 2, 10, 2, 4}; /* array2 of size M */
    int r[SIZE];                          /* result array */
    int i, j, k;

    /* sort both arrays */
    quicksort(s, 0, N - 1);
    quicksort(t, 0, M - 1);

    /* one-pass merge: on a match record the value and advance both
       pointers; otherwise advance the pointer at the smaller value */
    i = j = k = 0;
    while (i < N && j < M) {
        if (s[i] == t[j]) {
            r[k++] = s[i];
            i++;
            j++;
        } else if (s[i] > t[j])
            j++;
        else
            i++;
    }

    for (i = 0; i < k; i++)
        printf("%2d ", r[i]);
    printf("\n");
    return 0;
}
Is the question to output the numbers for each occurrence in both arrays, in sorted order? If so, for the second example, shouldn't the output contain (2, 2, 5, 7)?
Sort the smaller array, and for every element in the larger array check (by binary search) whether it exists.
Keep two pivots into the larger array, *begin and *end.
For every element:
  if the element exists, increment the begin pivot;
  else swap it with *end and decrement the end pivot.
The elements from the start up to the begin pivot of the larger array are the common elements.
Complexity: O(m*log(m) + n*log(m)) where m < n { ==> O(n*log(m)) }
Space: O(1)
1) Construct a BST (binary search tree) from the smaller array.
2) For each element of the larger array, search for it in the BST.
3) If you find it, print that node and delete it from the BST.
Deleting on each hit is what lets the BST handle repeated elements correctly, as in the second example of the given question:
[2,2,2,3,4,5,6,7] & [2,2,5,5,7,10] output: [2,2,5,7]
This works well using a BST.
We can also construct the BST from the larger array instead of the smaller one.
The search phase is then O(n*log(m)), where
m -- size of the larger array, for which the BST was constructed
n -- size of the smaller array
Well, I guess the interviewer said that there is no space for a hashmap. If that's true, then a hashmap would have required O(n) space, which is the same as a binary tree, so I guess he wouldn't be much happier with the binary tree solution either.
Another alternative is to use counting sort if the range is small. Its space is not directly dependent on N but rather on the range of the numbers contained in the array, so if the range is small, space can be saved.
Yet another alternative, if the repetition counts are small, is to use a bit vector and assign, say, 4 bits per number. This allows counts of up to 15 per number, which still saves a lot of space compared to a conventional 32 bits per counter.
I've asked the interviewer about the range of the numbers; he said the range is as large as that of an integer. If the range were smaller, I would say we could use bucket sort or radix sort, but here that's not possible.
The problem with the BST, or any other searching method like binary search, is that it is flawed in the presence of duplicate integers in the arrays.
Take {2,2} and {2,2,2}: if you create the BST from the first array, the output will be {2,2,2}, which is wrong and should be {2,2}, unless you add extra data to track whether a particular node in the BST was matched earlier, which violates the constraint on additional space.
I guess the approach of sorting both arrays and maintaining a separate pointer into each is better.
Here we increment both pointers on a hit, and on a miss increment the pointer at the smaller value, until we reach the end of either list.
On a hit we print the value.
So if one array is of size m and the other of size n, the overall complexity is O(m*log(m) + n*log(n) + m + n),
and if n >> m it is O(n*log(n)).
But yes, if you use an STL map, it will work like a charm:

#include <iostream>
#include <map>
using namespace std;

int main()
{
    map<int, int> m1, m2;
    int arr1[] = {2,2,2,3,4,5,6,7}; int size1 = sizeof(arr1) / sizeof(int);
    int arr2[] = {2,2,5,5,7,10};    int size2 = sizeof(arr2) / sizeof(int);

    // count the occurrences of each value
    for (int i = 0; i < size1; ++i)
        m1[arr1[i]]++;
    for (int i = 0; i < size2; ++i)
        m2[arr2[i]]++;

    // iterate over the smaller map; print each common value
    // min(count1, count2) times (maps iterate in sorted key order)
    map<int, int>::iterator iter;
    if (m1.size() >= m2.size())
    {
        for (iter = m2.begin(); iter != m2.end(); ++iter)
        {
            if (m1.find(iter->first) != m1.end()) {
                int cnt = iter->second >= m1[iter->first] ? m1[iter->first] : iter->second;
                for (int i = 0; i < cnt; ++i)
                    cout << iter->first << " ";
            }
        }
    } else {
        for (iter = m1.begin(); iter != m1.end(); ++iter)
        {
            if (m2.find(iter->first) != m2.end()) {
                int cnt = iter->second >= m2[iter->first] ? m2[iter->first] : iter->second;
                for (int i = 0; i < cnt; ++i)
                    cout << iter->first << " ";
            }
        }
    }
    cout << endl;
    return 0;
}
Why not sort the two arrays, and then maintain one pointer per array to check each number one by one, just like the one-pass merge in merge sort?
- ibread January 22, 2011

It also takes O(n*log(n)), the same as doing the binary search, but it is much more efficient if we care about the constant factor.