## Microsoft Interview Question Software Engineer / Developers

• 0

We are given a list of numbers A= {5,9,12,16,25,30,45} and their fixed probability of occurance P={0.22,0.18,0.20,0.05,0.25,0.02,0.08} where sum(p) =1. The problem is to arrange the numbers in a binary search tree in a way that minimizes the expected total access time. Note: In a binary search tree, number of comparisions needed to access element at depth d is (1+d)

Country: India
Interview Type: In-Person

Comment hidden because of low score. Click to expand.
2
of 8 vote

Yes this problem can be solved by using dynamic programming in O(n3) time
C(i,j) = min { C(i,r-1) + p(r) + C(r+1,j) + sum_{k=i}^{r-1} p(k) + sum_{k=r+1}^{j} p(k) } i <=j
i<= r <=j
= 0 if i = j-1 here we have no occurrence of non existing element node , i.e we only search nodes which are present

To understand this problem more clearly you can watch this video

Comment hidden because of low score. Click to expand.
0

I wish the explanation in your post was clear enough to understand without the video.

The video seems to contain a valid approach.

Comment hidden because of low score. Click to expand.
0

Agreed to the comment above. The explanation could be more clear.

Comment hidden because of low score. Click to expand.
1
of 7 vote

Just Build a Huffman Tree!

Comment hidden because of low score. Click to expand.
0

in huffman tree, the element with higher frequency or probability can be reached in short time than element with low frequency or probability..

to have details of huffman encoding.. see CORMEN's Introduction to Algorithm

Comment hidden because of low score. Click to expand.
0

They require to build binary search tree and not Huffman tree.

Comment hidden because of low score. Click to expand.
0

you are right..
1. sort the elements according to the probability => preorder of BST
2. build BST

so that Root element is the element with highest probability..
for the given case..by doing like this only drawback , 30 is searched faster than 16.

Comment hidden because of low score. Click to expand.
0

cobra: that's not correct. Consider a tree whose inorder traversal is 1, 2, 3, 4, 5, 6, .., N and whose probabilities are, for instance, .01, .02, .03, .035, .... Strictly ascending probabilities with respect to the in-order traversal.Your idea would place N as the root, N-1 as its left child, N-2 as the left child of that, ... Your tree would be a linked list. This can be worse in the expectation than a more balanced tree even if the root of the balanced tree isn't the node that has the highest probability.

Comment hidden because of low score. Click to expand.
0

i guess cobra's solution is correct. its just about to keep the probabilities as the keys to BST and info as satellite data.

Comment hidden because of low score. Click to expand.
0

It's not correct, for the reason I mentioned. Take a dataset like the one I mentioned. Run cobra's algorithm. Then run the dynamic programming algorithm described by another response. Then observe how the dynamic programming solution yields a better expected time. This shows that cobra's algorithm isn't optimal.

Comment hidden because of low score. Click to expand.
1
of 1 vote

The problem is very similar to matrix chain multiplication, the only difference is the calculation of new comparison from previous ones.
Now look at it very carefully, it is clear that we need to use DP, and in DP also we need to calculate(remember) a matrix. Now because the not a be placed at any position from i to j (if we are calculating MAT[ i ][ j ]), so the complexity will go to O(N*N*N)
Now Here is the algorithm :
M[i][j] = min(p[k] + M[i][k-1] + M[k+1][j] + sum(M[r][r] for r=i:j) for k=i:j), please consider boundary conditions,
we set M[i][i] = p[i], not that firstly we sort the array of number and maintain their corresponding probabilities also(sort can be in any order).

For example like above construct a upper diagonal matrix, here each entry actually store the comparisons only. Now if you want to construct the tree, you need another matrix(B od same size which store node data), which tells who to be chosen as parent node at each place, so M[0][N-1] will give you cost and B[0][N-1] will give you root node.

Please let me know if am not clear..

Comment hidden because of low score. Click to expand.
0
of 0 vote

Knuth has a section on this.

Dynamic programming problem.

Comment hidden because of low score. Click to expand.
0

is knuth a book on algorithms?

Comment hidden because of low score. Click to expand.
0

If you are trying to be funny, you succeeded in making me laugh.

I believe the answer refers to "The Art of Computer Programming" by Donald E Knuth. (One of the volumes I suppose, probably Vol II, on sorting and searching).

Comment hidden because of low score. Click to expand.
0
of 0 vote

Use a splay tree, and insert the elements one-by-one in increasing order of probability. In a splay tree (which is a special type of BST), the most recently inserted/accessed elements occur at the top of the tree.

Note; I'm not sure this solution is correct.

Comment hidden because of low score. Click to expand.
0

Why would it be correct? You haven't given a good rationale as to why it would be. Just because the most frequent elements are near the top, doesn't mean they're arranged in the most optimal way, or if they are, you should give a proof or at least an argument. I think that, generally speaking, when you post a solution to a non-trivial problem, the burden is on you to show that the solution is correct.

Try comparing your results with the results of the DP approach, whose optimality is easy to see and prove. You'll probably find cases where your trees are suboptimal.

Comment hidden because of low score. Click to expand.
0

I feel the splay tree solution mentioned here is a good attempt ... but if nodes are considered in increasing order of probability for constructing the BST, then it will lead to a linked list like structure ... so this will not be an efficient BST ... Let me know if I am missing anything here

Comment hidden because of low score. Click to expand.
0
of 0 vote

Use OBST
cse.yorku.ca/~aaw/Gubarenko/BSTAnimation. html

Comment hidden because of low score. Click to expand.
0
of 0 vote

It appears to me that it might be simply done by using sorting the elements on the basis of their probabilities, and then putting them in the tree in a level-order fashion, i.e., if the elements after sorting are e1, e2, e3, e4, e5, e6 and so on, then e1 will be root, e2, e3 would be its children, e4, e5 would be e2's children and so on.

Here's my logic for that:
The expected time is Sum{P(i)*T(i)} for all i, right? and we need to minimize it. So, simple enough, i say. Put the Highest P(i)'s with the lowest T(i)'s... as many as you can. the highest P(i) would be at root, with t=1 (lowest) , followed by e2 and e3, the next highests getting the next lowest time (t=2).

Comment hidden because of low score. Click to expand.
0

Then u may not end up in a bst!!

Comment hidden because of low score. Click to expand.
0

Agreed with madhu. You need to ensure you form a BST. You'd be right if that was not a constraint here.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Treap will solve this

Comment hidden because of low score. Click to expand.
0

Just because a treap is a tree with some random structure doesn't mean that it's the solution to this question involving trees and probabilities. The solution here is deterministic because we're asked to minimize an expectation.

The DP approach is the right idea.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Just build a priority queue using a heap where the priority is the probability of occurance in this case.

Comment hidden because of low score. Click to expand.
0
of 0 vote

DP
Base Case:
OBST(i,0) = 0 OBST(0,j) = 0

Sub Optimal Problem:
OBST(i,j) = OBST(i,r-1)+P(r)+OBST(r+1,j) + Sum(pk)(i<=k<=j)
i<=r<=j

Comment hidden because of low score. Click to expand.
-1
of 1 vote

This problem can be solved using Huffman encoding

Comment hidden because of low score. Click to expand.
0

How?

Comment hidden because of low score. Click to expand.
-2
of 2 vote

@samant.bit do dey expect the o/p to be 25 12 45 5 16 30 9???

Comment hidden because of low score. Click to expand.
0

Huffman tree is based on greedy algorithm, however that will not always give correct (optimal) solution for BST

Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

### Books

is a comprehensive book walking you through every aspect of getting a job at a top tech company, while focuses on software engineering interviews.

### Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.