## Interview Question

**Country:**United States

it takes O(k + c) where c may be even larger if there are too many duplicates, we can try handling the duplicates by tweaking the code. But can there be an O(logk) solution.

That would not be optimal. You are not taking advantage of the two arrays already being sorted.

Let s be 1 and take two pointers, one for each array, initially pointing to the last (greatest) element of its array.

1. If both arrays are exhausted, error: there are not k elements in the arrays.

2. If one array is exhausted, the element pointed to in the other array is the s-th largest.

3. Otherwise the maximum of the elements pointed to in both arrays is the s-th largest.

4. If s=k, return this element.

5. Otherwise move each pointer that pointed to the s-th largest element (this could be both pointers) backwards in its array until it reaches a different element or, if there are no more, call its array "exhausted".

6. Increment s and repeat.

In my stament, A[], B[] is the given two sorted arrays, and |A|=n,|B|=m

1) two pointers Algorithm must works (such as one sub-program in MERGE_SORT)

however, you may have to scan all the element in worst case (k=n+m). And the time complexity is O(n+m)

2) we could find that the result element will either in A[] or in B[] (or both cause the duplicates)

Now suppose the result element is in A[], we can binary search it.

for one element you are searching now (suppose it's X), we can calculate how many element in total is less or equal to X. (do another binary search in B[])

Then we can get the result if it's indeed in the A[].

For the result in B[], we could process in the same way.

so the Time Complexity is O( log(n)*log(m) )

Solution in C:

#!/usr/bin/python

from sys import stdin,stdout,argv

class SubArr:

def __init__(self, arr):

self.arr=arr

self.first=0

self.last=len(arr)

def get(self, pos):

res=9999

pos+=self.first

if pos<len(self.arr) and pos<self.last:

res=self.arr[pos]

return res

def size(self):

return self.last-self.first

def median(self):

sz=self.size()

pos=sz/2

if sz%2==1:

res=self.get(pos)

else:

res=(self.get(pos)+self.get(pos-1))/2.0

return res

def median2(s1, s2):

if s1.get(s1.size()-1)<=s2.get(0):

return s1.get(s1.size()-1)

elif s2.get(s2.size()-1)<=s1.get(0):

return s1.get(0)

elif s2.size()==1:

return s2.get(0)

t=s2.size()/2

med1=s1.median()

med2=s2.median()

if med1==med2:

return int(med1)

elif med1<med2:

s1.first+=t

s2.last-=t

else:

s1.last-=t

s2.first+=t

return median2(s1,s2)

def kth_elem(l1, l2, k):

if k>=len(l1)+len(l2):

return "error"

s1=SubArr(l1)

s2=SubArr(l2)

if s1.get(k)>s2.get(k):

tmp_s=s1

s1=s2

s2=tmp_s

if k==0:

return s1.get(0)

s1.last=k+1

s2.last=k

return median2(s1,s2)

def main():

l1=[1,3,4,10,15,20,22,27,35,50]

l2=[1,3,4,5,6,7,12,33]

print l1

print l2

for k in range(0, 20):

print str(k)+"th element is "+str(kth_elem(l1, l2, k))

if __name__ == "__main__":

main()

def median(s):

l=len(s)

if l%2==0:

return (s[l/2-1]+s[l/2])/2.0

else:

return s[(l-1)/2]

def median2(s1,s2):

if len(s2)==1:

return s2[0]

t=len(s2)/2

if median(s1)<median(s2):

s1=s1[-t:]

s2=s2[:t]

else:

s1=s1[:t]

s2=s2[-t:]

return median2(s1,s2)

def k_elem(s1, s2, k):

if s1[k-1]<s2[k-1]:

swap(s1,s2)

s1=s1[:k]

s2=s2[:k-1]

return median2(s1,s2)

```
def median(s):
l=len(s)
if l%2==0:
return (s[l/2-1]+s[l/2])/2.0
return s[(l-1)/2]
def median2(s1, s2):
print s1
print s2
if len(s2)==1:
if s2[0]>s1[1]:
return s1[1]
elif s2[0]<s1[0]:
return s1[0]
return s2[0]
t=len(s2)/2
print t
if median(s1)<median(s2):
s1=s1[t:]
s2=s2[:-t]
else:
s1=s1[:-t]
s2=s2[t:]
return median2(s1,s2)
def k_elem(s1, s2, k):
print s1
print s2
print k
if s1[k-1]>s2[k-1]:
tmp_s=s1
s1=s2
s2=tmp_s
s1=s1[:k]
s2=s2[:k-1]
return median2(s1,s2)
def main():
print "hello"
l1=[1,2,3,4]
l2=[5,6,7]
print k_elem(l1, l2, 3)
if __name__ == "__main__":
main()
```

Here's the clean solution in python - try it !

Explanation: take the first K elements of both sorted arrays and then find the median !!

Finding the median of two sorted arrays is done in O(log(K)) time by each time removing half of the elements of each array - the bottom half of the array with the lower median, and the upper half of the array with the higher median !!!

```
def median(s):
l=len(s)
if l%2==0:
return (s[l/2-1]+s[l/2])/2.0
return s[(l-1)/2]
def median2(s1, s2):
if len(s2)==1:
if s2[0]>s1[1]:
return s1[1]
elif s2[0]<s1[0]:
return s1[0]
return s2[0]
t=len(s2)/2
if median(s1)<median(s2):
s1=s1[t:]
s2=s2[:-t]
else:
s1=s1[:-t]
s2=s2[t:]
return median2(s1,s2)
def k_elem(s1, s2, k):
if s1[k-1]>s2[k-1]:
tmp_s=s1
s1=s2
s2=tmp_s
s1=s1[:k]
s2=s2[:k-1]
return median2(s1,s2)
def main():
l1=[1,3,15,14]
l2=[5,6,7]
print k_elem(l1, l2, 3)
```

- Buzz_Lightyear November 16, 2012