CapitalIQ Interview Question
ConsultantsCountry: India
Interview Type: Phone Interview
@oOZz
Randomized partition subroutine randomly chooses a pivot and partitions the set around that pivot so that all the elements less than the pivot are to the left of the pivot and all the elements greater than the pivot are to the right of the pivot. How would you tell the randomized partition subroutine to partition the set around the
(k-1)st element?
How would you tell the randomized partition subroutine to partition the set around the (k-1)st element?
You don't tell the randomized partition subroutine anything. It partitions the array around the pivot element A[p] and returns p. Therefore, you need to call it until the p = n-k (not p = k -1 actually, I corrected it above.)
static void quick_select(int[] array, int start, int end, int k) {
int ktop = n - k;
while (true) {
int p = rnd_partition(array, start, end);
if (ktop < p) {
end = p-1;
} else if (ktop > p) {
start = p+1;
} else {
// at this point p == n-k is true
// so print array [k, n)
break;
}
}
}
static int rnd_partition(int[] array, int start, int end) {
int low = start -1;
int high = end;
// randint() return a random int between start and end
swap(array, randint(start, end), end);
int pivot = array[end];
while (true) {
while(array[++low] < pivot);
while(array[--high] > pivot && high > low);
if (low >= high) break;
swap(array, low, high);
}
swap(array, low, end);
return end;
}
isn't it the worst case running time is O(n^2) as in every selection we may get the min element as pivot.
i think, need some modification in selecting pivot. while using randomized selection, we can choose pth ele of every 5 element continuesly n then amont them, choos the pth element, (it may not be the pth ele but it will be near to that) complexity O(n).
@oOZz
Good one. Quicksort's partition subroutine is a wonderful subroutine. To me it appears to be one of the fundamental subroutines of computing that solves many types of problems in one shot. +1 for you for cleverly exploiting it's power. However, I would like to make slight modifications to your code as I see a small bug there.
The modifications are:
1. Getting rid of ktop variable. For the quick select algorithm, instead of passing k as input, you can directly pass n-k as input. (Instead of choosing the kth order static, we need to choose (n-k)th order static for this problem)
2. Adjust the value of k(added the statement k = k-p), when you recurse on the right part of the pivot.
static void quick_select(int[] array, int start, int end, int k) {
while (true) {
int p = rnd_partition(array, start, end);
if (k < p) {
end = p-1;
} else if (k > p) {
start = p+1;
k= k-p;
} else {
// at this point p == n-k is true
// so print array [k, n)
break;
}
}
}
Min-heap of size k is a right choice here. Quick select finds only kth largest element in the array. Consider array: {3,3,3, 4,4,4,4} and k = 6. The output should be = {3,3,4,4,4,4}. Quick select returns 4 and even if we iterate array to find all elements >= 4, we will not get correct result.
1. consider a minheap, with a count variable counting number of elements in heap.
2. iterate array, insert elements in heap until count is less than k+1;
3. continue iterating array, compare the ith element with top of minheap, if ele is less then continue
else
{ replace the top of minheap with the element. update the heap (logk) }
complexity O(n*log(k) ).
Use a min-heap of size k.
1. For the first k elements, add them to the heap as they seen.
2. For the rest of the elements if the given element is greater than the min-value of the heap, extract/remove the min element from the heap and add the new element.
3. In the end, all the elements present in the heap are the top k elements. Worst-case time complexity: O(nlgk)
THE BEST THAT WE CAN DO WITH MIN-HEAP..!!
Given: array of N elements.
To Find: Top K elements.
-Take first K elements from array and build-min-heap --- O(K) time.
-Now take the rest (N-K) elements and compare with root element. IF greater than Root insert into min-heap and perform min-heapify--- O(log K) time.
-In worst case we need to do (N-K) times min-heapify ---O((N-K)log K).
Total worst case time --- O(K) + O((N-K)log K)
This is a variation of the selection problem (see en.wikipedia.org/wiki/Selection_algorithm). Basically, you need to use the partition subroutine from quicksort. Since this is a common problem, there is even utilities in certain languages for this operation, e.g, std::partition in C++.
- oOZz June 19, 2013Partition the array until p = n-k. p is the pivot that is returned by the partition function.This will reorder the array such a way that all the elements that appear before index p (pivot) are less than A[p] and all elements after p are greater. At this point the reordered array A' will have all the "top" k numbers on the right side of the pivot.
"Expected" time complexity is O(n) for randomized partition subroutine. O(1) space.