IBM Interview Question for Software Engineer / Developers






Comment hidden because of low score. Click to expand.
0
of 0 vote

Divide the list into many parts
Now take 2 list and find median of both the list
Now take the numbers which are in between the medians and the median of them will give the median of both the list.
Do it for rest of the part

- DashDash August 07, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Dont think that will work

- Anonymous August 07, 2010 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

I feel there might be some better algorithm , anyone?

- Triton August 07, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I am not sure if there is any subtle way to take medians of each chunk and with some book keeping, we can get to the solution.

So, I would go with a naive way of sorting the entire array and then finding out the median. Let's say we have 'M' memory available and the file size is 'N'. (assuming N and M have same units)
1) Number of chunks C = N / M;
2) Sort each chunk into memory and sort it and flush it to disk into a new file F'
3) So, now do a merge sort over the chunks:
That is, bring in first half of Chunk1 and first half of Chunk2 and then do a merge sort and keeping outputting the correct sort into the new file. (just like how we create a new array during merge sort).
So, if you visualize, you are creating tree from leaves to root. (leaves being chunks and then gradually merge sorting until you get 1 big sorted file).
Yeah, disadvantage is that you are creating more memory every time (but we can flush everything later from the disk).

Once you have the sorted file on the disk, bring in memory Chunk C/2 into memory and then take its middle element.
PS: Yes, we will have to work out on the ceiling/floor issues but that is not interesting.

Hope it helps. Thanks!

- Aatish August 08, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

The file is huge. The whole file cannot be loaded into memory! Moreover, why should you ever solve it in nlogn (sort) when you just have to find the median?

Think about doing a one-sided binary search since you do not know the size of the list at first. If you do not know about it, google it! I would use the same partition mechanism as used in the selection strategy to also incorporate one-sided binary search into it. I can imagine that this problem can be solved in two passes (i.e) O(n).

- Kurt August 08, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

But that would involve lot of block transfers i.e. for one iteration of the partition mechanism there has to be k block transfers where k=No of blocks used to store the integers.

Besides the algorithm you suggested although has O(n) asymptotic complexity, the constants involved are pretty huge, in practice it may not work well.

- Triton August 08, 2010 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

i have a solution... first take 3 numbers and find the median.. now add the next number and find the median of that 4 number.... now add one more and continue this till the numbers end... it works... this is called decrease and conquer algo

- Anonymous August 08, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

i have a solution... first take 3 numbers and find the median.. now add the next number and find the median of that 4 number.... now add one more and continue this till the numbers end... it works... this is called decrease and conquer algo

- Anonymous August 08, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Nope. it will not. The fact that you will bring in all the numbers in memory is not possible (As you have limited memory)

- Anonymous August 08, 2010 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

Find the Range
Build the histogram of 10 bins ... that will give you a crude estimate
Keep refining the estimate till you converge to one value.

Complexity: log(Range)*N

- zubair August 09, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

details pls?

- mohit August 09, 2010 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

There is a nice solution at http discuss.joelonsoftware.comdefault.asp?interview.11.794214 (credit to Calvin):

-----------------COPY&PASTE----------------
1. Use a max-heap with k elements to go through all the N elements, which will get the biggest k elements. Record the smallest in heap, say W. Complexity: O(N*logk)

2. Do step 1 again, but filter out elements bigger than W. This time the smallest element in heap is the 2kth biggest element. Do this step N/(2*k) times.

3. Output the heap in sorted order. The N/2 - floor(N/(2*k)) * k is the result.

------------END OF COPY&PASTE----------------

- Kolo August 09, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

any idea on how k and N are related?

- mohit August 10, 2010 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

I think k should be chosen so that the max-heap can fit in the limited memory.

- Kolo August 10, 2010 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

Use definition: median minimizes f(X) = sum(|xi - X|).

A possible solution of NlogN complexity:
1. run through the file and find min(x) and max(x): N ops
2. do a standard binary search to minimize f(X) = sum(|xi - X|), where initial X is set to (min(x)+max(x))/2 and the boundaries set to min(x) and max(x): each iteration requires going through the whole file to calculate f(X), also it takes logN cycles to reach a given precision: total NlogN ops

- Anonymous August 09, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

If the file contains 5*10^10 numbers with values -MAXINT, 5*10^10 numbers with values MAXINT, and 0 then 0 is the median. In this example, sum(|xi-X|) can cause a buffer overflow error.

- Kolo August 09, 2010 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Valid point, but the problem can be easily fixed: instead of sum() calculate the corresponding average using iterative approach to avoid overflow.

In particular, define
A(X,1)=x(1)-X,
A(X,n+1)=n/(n+1)*(A(X,n)+(x(n)-X)/n)
Then minimize A(X,N) as a function of X as described above.

- Mefodii August 10, 2010 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

how do you do a binary search on unsorted input?

- abc March 25, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

by the way why doesn't the median-of-median's algorithm work here?

- Anonymous August 10, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Take the first element as pivot. run through the whole file to check how many elements are less than pivot and how many are greater. This will give an idea whether the median will be greater or less than the current pivot.

Now go for the next element in the file to make it pivot (use previos decisions to bound the range of elements dat can be possible candidates for median), run this way until for on pivot equal number of elements on both larger and smaller side are found..

- a.khan August 20, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

a.khan = adnan khan?

- Anonymous January 20, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
-1
of 1 vote

What you have suggested is the Median of Medians strategy used in pivot selection of Quicksort but I don't think that it will always be the actual median of the list.

What I would suggest is take a constant k and then divide the list into k groups. For every group create a min heap of it(Choose k such that k elements fit into the memory)Complexity till now:O(n/k * k)=O(n).
Now create a heap out of the elements at the root of the heaps that we have. Remove the root and bring in the next element from the list where the root belonged and heapify, do it n/2 times,you get the median.Overall Complexity:O(n+ n*log k) = O(n log k)

- Triton August 07, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

I think this is not right solution... so Fuck off!!!

- VIKAS KUMAR MALL August 07, 2010 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Don't be an idiot VIKAS. If you have a working solution let's hear it.

- Anonymous August 07, 2010 | Flag


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More