Programming Interview Questions

Page:

1

Comment hidden because of low score. Click to expand.

2

of 2 vote

Target is to find closest 100 starts.

The best way to implement is K-NN algorithm.
Solution Step
1) Here assume K is 10 . Selection of K is find out using RMSE. this is another bigger topic and let's not deep dive into that.
2) Here are selecting 10 centroid not 1 centroid. Why 10 centroid?. Because some star cluster can happen more than one location.
3) Let's take an analogy. let's assume in US is a space (3 dimension) and people are stars. We have to find where the population density is more. NY is one cluster, LA is another cluster. Bay area is 3rd cluster and similary we have more clusters. Out of these we have to find the densely populated location. Of course the obvious answer is NY city. Of course we don't know the answer how do we calculate.

4) Read first 10 rows from the fille ( because we assume our K is 10) and make them as centroid.
5) Read rows from 1 to 100000 and for each row find closest centroid.

for rows 1 to10000
for cluster 1 to 10
closestCluster = min(closestCluster,(cluster[j]-row[i]))
end
end

6) Assume we randomly select centroid to find densely populated location. Assume we selected 10 location in kansas state (because they are in the middle of US) and we calculate distance between each person to all the 10 location. Find the min distance and assign that person to that centroid.

7) Once we assign all the points/person to the centroid , recalculate the centroid.
8) Each centroid has some points/person assigned to them. Recalculate the centroid for x,y,z using ((x1+x2+x3+...+xn)/n
This is the new centroid. After this step the centroids are moved to different direction and they are scattered.

9) Continue the step 5,6,7,8 till the centroid is converged.
10) When I say converge centroid doesn't change much distance from previous runtime to current time and change is really insignificant.
11) While finding centroid for each point/person , maintaing a minheap structure for each cluster and this maintains informations about 100 closest point from the cluster.
12) For each cluster find distance between centroid to each point and sum them up.

int[] distances;
for each cluster c
c=0;
for each points 1 to 100
c=c+ distance between point and cluster[i]
end
distance[i]=c;
end

13) After this step find the mindistance from the array and all the points belongs to that centroid/cluster are closest points.

Note : In some places I used generously data structure. I was more concentrating on finding the solution and implementation rather than best implementation. If you explain the stuff above it takes more than 60 minutes and he would be more happy with solution.

- analog76 February 11, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 vote

Ans1: (P+Q) choose P times.
Ans2: P power Q times ( for all 4 possible way).
Ans3: Use Combinatorics and Inclusion exlucsion principle.
suppose the points (m,n) are blocked where m<P and n <Q.
then
(P+Q) choose p - (all the points from origin to M,n) ( all the points from m,n to P and Q).

- analog76 February 11, 2012 | Flag Reply

Page:

1

CareerCup

analog76

Books

Videos

Resume Review

Mock Interviews