Microsoft Interview Question
Computer Scientists
Country: United States
Interview Type: In-Person
@blackfever
>> total time complexity will be m, where m is the total number of elements, i.e. m = k*n
What about the heap insert operation, which is O(log k)? You will have O(nk) inserts into the heap.
The time complexity will be m*log k; assuming log k << m, it can be taken as m. Of course, we do need to do n*k inserts into the heap.
>> assuming log k << m so it can be taken as m
FYI, in big-O notation you can conventionally ignore constants, but you can't drop functions. Algorithms with O(log*(n)) complexity are still described as having asymptotic dependence on n; they are never considered constant-time just because log*(n) grows extremely slowly.
How do we know which array the removed element (from the heap) came from? Do we have to check all the arrays for that element?
@guest: in the heap we keep two things, the value of the element and the array it belongs to, so when the minimum is removed we know which array to fetch the next element from.
What is the constant factor of heapsort on k sorted arrays?
As far as I know, the constant factor of heapsort is about 11.62 * n*ln(n),
and that of quicksort about 2 * n*ln(n).
But in the case of k sorted arrays, will it remain the same or change, and why?
The way the question has been framed, I understand the total time should be m, so here is a pairwise-merge solution. (As the analysis below shows, it actually comes out to O(m log k), not O(m).)
1. Pair the arrays so that we have k/2 pairs, and merge the arrays in each pair. Merging one pair of n-element arrays takes O(n) comparisons, so merging all the pairs takes O(n*k/2) = O(nk).
2. Repeat step 1 until we have our final sorted array. After each round the arrays double in length, so every round still touches all nk elements.
Total time complexity for all merges:
= O(nk) for round 1 + O(nk) for round 2 + ... over log k rounds
= O(nk * log k)
= O(m log k)
(The geometric series k/2 + k/4 + k/8 + ... only counts the number of merges per round; it misses that round j merges arrays of length 2^(j-1)*n, so each round's cost stays at O(nk) rather than halving.)
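As a sketch of this pairwise-merge idea (illustrative C++, not from the thread; `mergeTwo` and `mergePairwise` are made-up names): each round halves the number of arrays while touching all m elements, over about log2(k) rounds.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Merge two sorted vectors in O(|a| + |b|).
std::vector<int> mergeTwo(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> out;
    out.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(out));
    return out;
}

// Repeatedly pair up the arrays and merge each pair until one remains.
// Each round halves the number of arrays but touches all m = k*n elements,
// and there are about log2(k) rounds.
std::vector<int> mergePairwise(std::vector<std::vector<int>> arrays) {
    if (arrays.empty()) return {};
    while (arrays.size() > 1) {
        std::vector<std::vector<int>> next;
        for (std::size_t i = 0; i + 1 < arrays.size(); i += 2)
            next.push_back(mergeTwo(arrays[i], arrays[i + 1]));
        if (arrays.size() % 2 == 1)   // odd array out advances to the next round
            next.push_back(arrays.back());
        arrays = std::move(next);
    }
    return arrays[0];
}
```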
you can consider this approach:
a. In your example there are 5 arrays and they are sorted.
b. Suppose the arrays are a, b, c, d, e, each of size m. Take indices i = 0 and j = 0, a write index w, and a destination array didx large enough to hold all 5m elements.
c. Then do this:
1. Merge a and b with a standard two-way merge:
while (i < m && j < m)
    if (a[i] < b[j]) { didx[w] = a[i]; i++; w++; }
    else { didx[w] = b[j]; j++; w++; }
then copy over whichever array still has elements left.
2. Reset i = 0 and merge c against the 2m elements already in didx in the same way (reading ahead of where you write, so the write index never overtakes the read index), and repeat for d and e.
At the end didx holds all the elements in sorted order. Note that each later merge re-scans everything merged so far, so this sequential scheme costs O(k^2 * n) overall.
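A cleaned-up sketch of this sequential-merge idea in C++ (illustrative; `mergeSequential` is a made-up name): each new array is merged against everything accumulated so far.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Merge the k sorted arrays one at a time into an accumulator.
// The j-th merge re-scans all elements accumulated so far, so the total
// work is O(n * (1 + 2 + ... + k)) = O(k^2 * n).
std::vector<int> mergeSequential(const std::vector<std::vector<int>>& arrays) {
    std::vector<int> acc;
    for (const std::vector<int>& a : arrays) {
        std::vector<int> merged;
        merged.reserve(acc.size() + a.size());
        std::merge(acc.begin(), acc.end(), a.begin(), a.end(),
                   std::back_inserter(merged));
        acc = std::move(merged);
    }
    return acc;
}
```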
How about if we create a BST using the first array and then, for each remaining array, keep inserting its elements into the tree as per BST rules? If each insertion were counted as one operation, the total would be k*n = m, but each insertion actually costs O(height), so even a balanced tree gives O(m log m) in total.
Once the tree is complete, simply traverse it using an in-order traversal (pre-order would not produce sorted output).
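A minimal sketch of the BST idea, substituting `std::multiset` (a self-balancing red-black tree) for a hand-rolled BST; reading the container front to back is exactly an in-order traversal.

```cpp
#include <set>
#include <vector>

// Insert every element into a self-balancing BST (std::multiset is a
// red-black tree), then read it back front to back, which is exactly an
// in-order traversal. Each insert costs O(log m), so the total is
// O(m log m), not O(m).
std::vector<int> mergeViaBST(const std::vector<std::vector<int>>& arrays) {
    std::multiset<int> tree;
    for (const std::vector<int>& a : arrays)
        tree.insert(a.begin(), a.end());   // duplicates are kept
    return std::vector<int>(tree.begin(), tree.end());
}
```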
Since space is not an issue, we can create a value-indexed table with two entries per slot: one for the actual value and a second for its frequency. It takes O(k*n) to tally all the elements of these lists by value. Because the slots are indexed by the node values themselves (this is direct addressing, as in counting sort, rather than a general hash map, so it assumes a bounded integer range), the entries come out sorted. Next, we write these values, repeated according to their frequencies, into the final destination. The pseudocode looks like:
for (i = 0; i < HashSize; i++)
{
    for (j = 0; j < hash[i][0]; j++)   // hash[i][0] is the frequency, hash[i][1] the actual value
        destination[index++] = hash[i][1];
}
This is also O(n*k) overall, plus O(HashSize) to sweep the table; the inner loop only replays repeated values, and the total number of writes equals the number of elements, so it does not affect the overall complexity.
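A sketch of this counting idea in C++ (illustrative; it assumes non-negative integers bounded by a known `maxValue`, which is what makes the table come out sorted):

```cpp
#include <vector>

// Tally how many times each value occurs (direct addressing, i.e. counting
// sort, rather than a general hash map), then replay the counts in
// increasing order. Assumes values lie in [0, maxValue].
std::vector<int> mergeByCounting(const std::vector<std::vector<int>>& arrays,
                                 int maxValue) {
    std::vector<int> freq(maxValue + 1, 0);
    for (const std::vector<int>& a : arrays)
        for (int v : a)
            ++freq[v];                      // O(k*n) to tally everything
    std::vector<int> dest;
    for (int v = 0; v <= maxValue; ++v)     // O(maxValue + k*n) to emit
        dest.insert(dest.end(), freq[v], v);
    return dest;
}
```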
I don't understand why people are bent on using a heap here.
Wouldn't it be more efficient to just keep track of the current pointer (call it curPtr) in each of the k lists?
1.) All k curPtr values are initialized to 0. We find the minimum of the k elements (list1[curPtr1], list2[curPtr2], etc.).
2.) We output that minimum element and increment the corresponding curPtr.
3.) We make sure that once the curPtr for some list reaches n, we ignore that list in later iterations.
4.) Since we output one element in each iteration, the function will run in O(K*n) time.
Maybe I have missed something or committed some grave blunder, so please feel free to comment.
UPDATE: Here is some quick code:
#include <stdio.h>

int main(void)
{
    int k, n, mat[100][100];

    printf("Enter K:");
    scanf("%d", &k);
    printf("Enter N:");
    scanf("%d", &n);
    for (int i = 0; i < k; i++)
    {
        for (int j = 0; j < n; j++)
        {
            scanf("%d", &mat[i][j]);
        }
    }

    int ptr[100], min, index;  /* ptr[i] = current position in list i */
    int done = 0;
    for (int i = 0; i < k; i++)
    {
        ptr[i] = 0;
    }

    index = 0;
    for (int i = 0; done == 0; i++)  /* one iteration per output element */
    {
        /* scan all k lists for the smallest un-consumed element */
        min = mat[index][ptr[index]];
        for (int j = 0; j < k; j++)
        {
            if ((ptr[j] < n) && (min > mat[j][ptr[j]]))
            {
                min = mat[j][ptr[j]];
                index = j;
            }
        }
        printf("%d\n", min);
        ptr[index]++;

        /* stop once every list is exhausted; otherwise leave index
           pointing at a list that still has elements */
        done = 1;
        for (int j = 0; j < k; j++)
        {
            if (ptr[j] < n)
            {
                done = 0;
                index = j;
                break;
            }
        }
    }
    return 0;
}
Your step 1 runs in O(k) per output element. Using a heap speeds that up to O(log k); that's the reason.
OK. But my overall algorithm seems to be O(k*n), which looks better than O(k*n*log k). Isn't it?
Let M = K*N be the total amount of numbers.
Your algorithm runs in O(M * K), while the heap version runs in O( M * log K ).
Note that your "for(int i = 0; done == 0; i++)" loop runs M times, not N, and then you have a nested search over each of the K arrays.
The algorithm is as follows:
- blackfever, August 01, 2013
1. Create a min-heap of the first elements of all the arrays. (A heap node is an object containing a number's value and the index of the array it belongs to.) O(k log k)
2. Remove the minimum element from the min-heap.
3. Add the next element of the array the removed minimum belonged to into the heap.
Repeat steps 2 and 3.
Total time complexity will be m, where m is the total number of elements, i.e. m = k*n.
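The heap steps above can be sketched compactly using `std::priority_queue` as the min-heap (illustrative C++, not from the thread; `mergeKSorted` is a made-up name; each node stores the value, its source array, and its position within that array):

```cpp
#include <queue>
#include <tuple>
#include <vector>

// A heap node: (value, index of source array, position within that array).
// Tuples compare lexicographically, so the value dominates the ordering.
using Node = std::tuple<int, int, int>;

std::vector<int> mergeKSorted(const std::vector<std::vector<int>>& arrays) {
    // std::greater turns priority_queue into a min-heap.
    std::priority_queue<Node, std::vector<Node>, std::greater<Node>> heap;
    for (int a = 0; a < (int)arrays.size(); ++a)   // step 1: O(k log k)
        if (!arrays[a].empty())
            heap.emplace(arrays[a][0], a, 0);

    std::vector<int> out;
    while (!heap.empty()) {                        // m pops, O(log k) each
        auto [value, a, i] = heap.top();           // step 2: remove the minimum
        heap.pop();
        out.push_back(value);
        if (i + 1 < (int)arrays[a].size())         // step 3: push its successor
            heap.emplace(arrays[a][i + 1], a, i + 1);
    }
    return out;                                    // total: O(m log k)
}
```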