buried.shopno
BAN USERUsing Radix sort on the value and the rank of each element, O(n) seems to be possible if we assume fixed (4 byte) int data type. For arbitrary length data type, this claim is invalid. It's obvious no comparison based approach can get an O(n) solution here.
A work through example to demonstrate the ranking computation.
A = [12, 2, 6, 14, 5, 9, 11]
consider rightmost bit LSB1
12, 2, 6, 14 => LSB1=0, so each with rank = 0
5, 9, 11 => LSB1=1, so each gets rank = 4 (size of element with LSB1=0)
so, after processing rightmost bit,
B = [0, 0, 0, 0, 4, 4, 4]
Next, consider 2nd rightmost bit LSB2
12, 5, 9 = > LSB2=0
use radix sort on the rank of these elements
current rank of 12 is 0, so it remains 0
current rank of 5,9 is 4, so both rank is assigned 1
2, 6, 14, 11 => LSB2 = 1
use radix sort on the rank of these elements
current rank of 2, 6, 14 is 0, so each assigned to rank 3
current rank of 11 is 4, so rank is assigned 6
so, after processing 2nd rightmost bit,
B = [0, 3, 3, 3, 1, 1, 6]
Next, consider 3nd rightmost bit LSB3
2, 9, 11 => LSB3 = 0
use radix sort on the rank of these elements
current rank of 9 is 1, so assigned to rank 0
current rank of 2 is 3, so assigned to rank 1
current rank of 11 is 6, so assigned to rank 2
12, 6, 14, 5 => LSB3 = 1
use radix sort on the rank of these elements
current rank of 12 is 0, so assigned to rank 3
current rank of 5 is 1, so assigned to rank 4
current rank of 6,14 is 3, so each assigned to rank 5
so, after processing 3rd rightmost bit,
B = [3, 1, 5, 5, 4, 0, 2]
Finally, consider 4th rightmost bit LSB4
2, 6, 5 => LSB4 = 0
use radix sort on the rank of these elements
current rank of 2 is 1, so assigned to rank 0
current rank of 5 is 4, so assigned to rank 1
current rank of 6 is 5, so assigned to rank 2
12, 14, 9, 11 => LSB4 = 1
use radix sort on the rank of these elements
current rank of 9 is 0, so assigned to rank 3
current rank of 11 is 2, so assigned to rank 4
current rank of 12 is 3, so assigned to rank 5
current rank of 14 is 5, so assigned to rank 6
so, after processing 3rd rightmost bit,
B = [5, 0, 2, 6, 1, 3, 4]
A simpler approach to solve this problem:
1. start with beginIdx = 0, move the endIdx as long as there is no duplicate in a window.
2. In a window, with no duplicate; keep track of minChar and maxChar corresponding to characters in window(i,j). If there is no duplicate, and if maxChar-minChar == j-i, it implies range(i,j) contains consecutive chars.
3. Keep update of max window found.
4. Move beginIdx to 1, 2,....
There are O(n^2) ranges in total. Maintaining a hash to keep track of chars in a window, time complexity is O(n^2).
You probably have not good understanding on DP. When you start up with problem size (i.e. substring size) of length 1, and then add a letter to substring, you get a problem of size 2. This 2-letter problem could be thought of as 2 sub-problems each of size 1. For example, "a" and "i" both are valid 1-letter word. For "ai" which is a 2-letter substring, we can re-use the information we computed previously.
Now, consider another substring "zoo". If we add another letter 'm', "zoom" becomes a 4-letter substring. Now, there is no valid partitioning of zoom (i.e. zoo+m, zo+om, z+oom); hence, "zoom" is looked up in dictionary for a valid match. As this is valid word, we get single partition for word "zoom". However, if it were something else, like "zool", then surely no valid way to partition it. Then, we have to take following letters to make up valid partitioning, for example "zoology".
If you are still not convinced, you could check my DP solution posted above. Take this as a challenge to break this code with some counter example, where DP fails. Modify the dictionary as you wish.
Modified code is here (with special cases handling):
ideone.com/jgQGK
@singhashish.osu: I couldn't comprehend your recursive definition. Therefore, I doubt if it's possible in O(n^2).
I solved it using DP that takes O(n^3). The approach is to solve subproblem (i.e. substring) of size 2, then size 3, ... up to n. Every step uses information of previous computed results.
Let, dict[i][j] is "true" if substring[i...j] is a word in given dictionary
else, dict[i][j] is "false". Computing dict[i][j] for all possible i/j values, takes O(n^3) if we store the dictionary in trie.
Let, word[i][j] is "true" if substring[i....j] can be partitioned into 1 or more parts such that each part is either (a) a word, or (b) comprises 2 or more words.
Initally, word[i][j] = dict[i][j]
Now, by recursive definition, word[i][j] is "true" if and only if there exists an index k such that word[i][k] and word[k+1][j] both are true, for i <= k < j. Use a path[][] matrix to keep the partition point to place "space".
You can check the implementation here : ideone.com/EIGjJ
Modify the dict[] array to test with other inputs. Welcome any catch in this code. Thanks.
@OP: you need not to maintain any ordering of elements. You only need to know the max and min of selected n items (one from each array). Your goal is to minimize the difference between max and min element. To understand why this is same to original problem, carefully follow my previous responses.
- buried.shopno July 02, 2011@OP: Your statement is contradicting. You are saying to consider nC2 difference. For n = 3, it's 3; for n = 4, it's 6. However, you said to consider 4 differences - as per my initial understanding.
- buried.shopno July 02, 2011Jeff Erickson (UIUC professor)'s lecture note is a good one. You might find other universities sites, specially on MIT OCW resources.
- buried.shopno July 01, 2011Well, I think it needs clarification (from original poster) if we really needs NC2 differences, or N differences. I'd work, and let know if find some efficient way for the NC2 differences.
- buried.shopno July 01, 2011You did a mistake here. I've mentioned that a >= b >= c >= d, so the way of your mapping is wrong. My solution gives 2*(max elem - min elem) = 2 *(8-2) = 12.
- buried.shopno July 01, 2011I assume the solution to the transformed problem has been discussed many times here. It's the same technique that is used for k-way merge (discussed in Cormen's Algorithm book). It needs maintaining a min-heap of current smallest elements of each array. At each step, the smallest one is popped, and the next larger one of that array is pushed on heap.
- buried.shopno July 01, 2011Let we've 4 arrays, and the solution is {a,b,c,d}. Let, a >= b >= c >= d.
Now, the expression for sum of the difference = abs(a-b) + abs(b-c) + abs(c-d) + abs(d-a).
We can write it as = (a-b) + (b-c) + (c-d) + (a-d).
First 3 components equates to (a-d).
Hence, the expression becomes 2*(a-d).
Hope this helps to understand how the two problems are related.
- buried.shopno July 01, 2011I guess the arrays are sorted. In that case, the problem is same as selecting one element from each of n arrays such that (max element - min element) is minimized. If total # of element in n arrays is m, then using a min heap it can be solved in O(m logn).
- buried.shopno June 30, 2011Well, there are basically 4 cases to handle:
Let, r1 = current node of 1st tree
r2 = current node of 2nd tree
Case 1 : r1->data < r2->data
2 subproblems to solve:
first, solve r1 and r2->left
second, solve r1->right and r2
Case 2 : r1->data > r2->data
similar
Case 3 : r1->data == r2->data
2 cases to consider here:
(a) current node is part of largest common BST
compute common subtree size rooted at r1 and r2
(b)current node is NOT part of largest common BST
2 subproblems to solve:
first, solve r1->left and r2->left
second, solve r1->right and r2->right
Why case (a) and (b) are different? Look at below 2 examples: ______________10______________
/ \
_______3______ ______14______
/ \ / \
0 ___6__ 12 18
/ \
4 7
\ \
5 8 ______________10______________
/ \
_______2______ 14______
/ \ \
1 ___6__ __18
/ \ /
4 7 16
\
8
largest common BST exists at node : 6 of size 4
Here, root (10) is not part of largest common BST ______________10______________
/ \
_______3______ ______14______
/ \ / \
0 ___6__ 12 18
/ \
4 7
\ \
5 8 ______________10______________
/ \
_______2______ 14______
/ \ \
1 ___5__ __18
/ \ /
4 7 16
\
8
largest common BST exists at node : 10 of size 3
Here, root (10) is part of largest common BST
I doubt if it's possible in linear time. lol's solution works for some specific case. Here's an example to show the difference:
______________10______________
/ \
_______3______ ______14
/ \ /
0 ___6__ 12
/ \
4 7
\ \
5 8
_______8______
/ \
___3__ 10__
/ \ \
1 _5 14
/ \ /
4 7 12
largest common BST exists at node : 10 of size 3
However, lol's solution gives output subtree rooted at 14 (of size 2). It depends what interviewer exactly asked for. To achieve above output, it's straight forward to get the solution in O(nm) time.
- buried.shopno June 23, 2011I second ftfish. Anonymous's comment is provocative in the sense that he has doubt in those who post smart logic, but don't post solution. Com'on dude, you've NOT hired people here to code the problems for you. If you are smart (i assume as you've perhaps "real world" experience), code yourself. If you are not, ask suggestion about coding efficiency (like, which data structure or STL would be best for a problem etc).
There are 1000+ algorithm problems. No candidate needs to write code for all those. Because coding is like maths, you need to grasp and practice as much as possible; but need not to solve all of those as the set of problems is infinite. It's one's responsibility to learn how to code efficiently. He can post his code here for comments. But, it should not be the practice that one (say ftfish for this problem) will post an algorithm, and also an efficient implementation for others. It is like spoon-feeding.
No offense please.
I hope this works in O(n logn) time. Plz dig into the solution, and point any possible flaws. Thanks.
1. let the points are represented in a sorted manner in a matrix p[N][N] such that
(a)in row i, each point has same x co-ordinate; sorted by y co-ordinate values
(b)rows are sorted based on x co-ordinate values, i.e. row (i+1)'s x co-ordinate > row (i)'s x co-ordinate
=> time: O(n logn)
2. For every point (x,y) compute the set S of points which lie in the open range (x-5, y-5). As points are sorted, it can be done in O(log n). Test if any point of S has less than 5 manhattan-distance (MD) wrt (x,y). If one exists, terminate the computation; and return FALSE. Otherwise, proceed with next point.
=> time: O(n logn)
3. This is the crucial argument. To grasp it, it's better to work on paper.
If it could happen that in each open range (x-5,y-5) for each point (x,y) there could be O(n) points in S, then effectively it'd be O(n^2) complexity.
But, the good is that it won't be the case. Each point Q(x',y') can be in at most 4 open ranges for 4 points, say, A,B,C,D (each of them have a MD >= 5, otherwise if MD(A,B) < 5, it'll return FALSE). The reason is that if Q lies in range (Ax-5,Ay-5), but not inside the square region denoted by 4 points (Ax+5, Ay), (Ax, Ay+5), (Ax-5, Ay), (Ax, Ay-5); then it could exist at maximum three other points (B,C,D)'s ranges, but not have a MD < 5 wrt either of B, or C, or D.
This argument establishes that in the worst case, maximum 4n points to be checked for the given n points. This ensures O(log n) per point complexity in an amortized sense.
@S: what did you mean by "null" & "empty". Empty means string of 0 length. How then you differentiate "null" ? Thanks.
- buried.shopno July 13, 2010Thanks for your illustration. Your approach seems perfect from my understanding.
- buried.shopno July 13, 2010@fiddler.g: would u plz explain the idea to balance the height of the BST? a simple approach could be to make the "median" node as "root" of the tree, and do that recursively so that the height is balanced.
It'd be more appreciable practice and helpful for others if you don't post the code, rather explain the idea. People should first attempt to code themselves :)...
Thanks.
Cool illustration!
You are smart, dude :)
Let S is given string, T is reverse of S. Build a generalized suffix tree GST (see Wikipedia) for S and T. Search for the longest common prefix in GST.
example: S = 1232, T = 2321
suffixes of S = {1232,232,32,2}, and of T = {2321,321,21,1}
"232" is the longest common prefix. Even though number of possible suffixes in GST can be O(n^2), it can be collapsed to represent in compact way. As compact GST can be be built in O(n) time with O(n) space, it's probably optimal solution (surely, not easy to code with space/time constraint).
Plz correct if I'm wrong.
Sorry, I couldn't get convinced yet. My confusion lies in the reasoning that if every instance of hamilotonian circuit problem (for the above problem) can be converted to corresponding instance of euler tour problem, doesn't it contradict with the definition of NP-Completeness? As the former lies in NPC, where as the latter lies in P. Could you illustrate this point kindly?
- buried.shopno July 10, 2010ftfish's approach seems correct.
@ibnipun10: your ans is wrong. it'd be 110100.
@ftfish: I think your formulation is incorrect. "string" should be treated as "node", an edge would exist between nodes (u,v), if string u ends with character 'x', and string v begins with same character 'x'.
Now, it becomes hamiltonian circuit problem (NP complete). So some backtracking approach would give a solution.
Plz correct if my interpretation is wrong.
I've no clue how this sort of range pruning works for this problem! Before designing a solution you should work out with both random and some particular patterns at least.
Try very simple input: 11111100000000000
Now, it seems convincing. But, I think it'd be O(n|T|) time solution, because for window[i,j] it needs to check if it contains all chars of set T atleast once. Please share if you think it can be truly doable in O(n) time, and explain the details to do the above check in O(1) time. Thanks.
- buried.shopno July 10, 2010@ftfish: your approach is not exactly correct.
"when the left border moves right, the right border also does." it fails in below case:
string: acccabeb
pattern: ab
windows should be [1,6], then [5,6], then [5,8]
smallest is [5,6]. your approach can't find this boundary, as you mentioned to forward right boarder whenever left boarder forwards.
@ftfish: Cool :)
- buried.shopno July 10, 2010Could you pls explain your approach a little to get O(k logn) solution?
The method that I mentioned maintains a k-element heap (smallest element from each of k sorted list), and extract min from this heap takes O(log k); so a total of O(n logk) to get all n elements in sorted sequence. Thanks.
@MaYank: How about the complexity of your code for n-way merge? Isn't it O(nk)?
How much time was given for the coding during the phone interview? For efficiency, you'd probably mention using a heap (that would take longer to code) to get an O(n logk) solution.
An example to support my approach:
3 8 5 1 7 18 16 4
sublist[3,8] => min=3,max=8,profit=5
sublist[5,1] => min=1,max=5,profit=-4
merge [3,8,5,1] => min=1,max=8, profit = MAX {5,-4, (5-3)} = 5
sublist[7,18] => min=7,max=18,profit=11
sublist[16,4] => min=4,max=16,profit=-12
merge [7,18,16,4] => min=4,max=18, profit = MAX {11,-12, (16-7)} = 11
final merge[3,8,5,1,7,18,16,4] => profit = MAX {5, 11, (18-1)} = 17
So it is an O(n) solution, lowest bound for this problem. Plz comment on any catches in this method.
Oh, a basic mistake I did in my previous explanation is that you need to do such recursion to get profit maximized:
max_profit = Max{ find_profit(1st half), find_profit(2nd half), merge(1st half, 2nd half)}.
I guess an O(n) solution may be achievable if we keep the sequence un-sorted, and keep 2 variables to keep min_val and max_val of each sub-lists (as it's recursively during merge we need comparing constant values).
[*]'sell' price is the rightmost (replaced by) 'sell' price is the rightmost in the 2nd half as we like to maximize the difference.
Also to keep the original index, we would need a structure to keep track of the index in given sequence. An alternative approach may work without sorting the list (to keep the given sequence unchanged). At that case, merge would be bit different: scan 1st half to find the minimum price, and 2nd half to find the maximum price (as before recursively).
I think a divide-and-conquer approach works here like merge-sort. We recursively divide the range in half:
1) divide steps:
[i] 'buy' and 'sell' lies in 1st half
[ii]'buy' and 'sell' lies in 2nd half
2) merge: 'buy' lies in 1st half and and 'sell' lies in 2nd half -> this takes O(1) time because 'buy' price is the leftmost one in 1st half, and 'sell' price is the rightmost. However, to merge two sorted list, it'll be O(n).
So time complexity is O(nlog n). Plz comment if seems this doesn't work for every input.
What if there are 3 'a', 5 'b'. XOR is not the solution I guess.
- buried.shopno May 27, 2010So, is it the convention even a similar question is asked in previous round, and asked in some later rounds; then you should tell it first that "A similar question was asked previously?" What if this new question may be superficially related to previous round's question, but not exactly same?
- buried.shopno May 27, 2010How it works in O(n) time? Creating the n*n matrix itself takes O(n^2) time. Also, k-way merge sort will take > O(n) time. Any idea?
- buried.shopno May 26, 2010I could come up an idea with 9 iterations:
1) in 1st stage, divide 100 balls in 3 set: A(33), B(33), C(34)
if, weight(A) = weight(B), C has the different ball.
else, remove 1 ball from C, and compare weight(A) with weight(C). If weight(A) =
weight(C'), then B contains the different ball; else, A contains.
so, Max. 2 iterations needed to cut the size 1/3rd of the actual size.
2) in second stage, we have either 33 or 34 balls' set that contain the different one. apply above procedure to down the size to 11 or 12 -> 2 more iterations
3) in third stage, we have either 11 or 12 balls' set that contain the different one.
apply above procedure to down the size to 3 or 4 -> 2 more iterations
4) For 3 balls' set it takes 2 iteration to find the different one (either heavy or light). For 4 balls' set it takes 3 iterations to find the different one.
So total max. iterations is 9. Anyone has a better idea? possibly 7 or 8? If the ball is known to be heavier or lighter than regular ones, then 5 iteration is optimum one.
would you mind to explain your answer?
- buried.shopno May 25, 2010I came up an idea similar to Vishal. However, the summation of the harmonic series is O(nlogn) [REF: corman]
Now, to find n-th largest sum among O(nlogn) elements can be done in O(nlogn) time using the median-of-median finding order statistic method [REF: corman]
Finally finding the n largest sum-pairs can be done by comparing with n-th largest sum. So total running time is O(nlogn).
However, still O(n) solution seems quite hard to figure out.
@Anonymous: with a code, explain your idea. that wud help others to catch flaw in ur logic.
Would u mind to check ur code for this input:
A = {100,89,75,52}
B = {71, 63, 48, 45}
Just compute the remainder. When you reach a remainder previously seen, you start a recurring sequence.
Example: 17/7 = 2.428571428....
Remainder sequence: 3->2->6->4->5->1->3
as length of this sequence is 6, repeating pattern : 428571.
max sequence length = (denominator-1)
Seems, your method doesn't work for below input:
A = {100,89,75,52}
B = {71, 63,62, 48}
pre_A = {0, 11, 14, 23}
pre_B = {0, 8, 1, 14}
largest 4 sum = {100+71, 100+63, 100+62, 89+71 }
Your method skips 100+62 pair, because you increase 'i' just after comparing pre_a[i]>pre_b[j].
A modified problem: print every m-th element from the last of the linked list.
A recursive solution:
int traverse (struct node *cur, int m)
{
if (cur == NULL) return 0;
int count = 1 + traverse (cur->ptr, m);
if ( count%m == 0) cout << cur->data << " ";
return count;
}
void base26(int n)
{
if(n<base) {
putchar('a'+n%26-1);
return;
}
base26(n/26);
putchar('a'+n%26-1);
}
If you get a same reminder during the process of division, then it'll repeat th quotient pattern. For example:
17/7 = 2.428571428571....
See the remainders:
3->2->6->4->5->1->3
It also say the length of recurrence is 6. The max. length is (denominator-1).
max throws # 43
8 (to check where 1st ball breaks, i.e. 1->2->4->8->16->32->64->100) +
35 at max (to check if it breaks from 65 to 99)
O(n) is claimed here as outer and inner radix sort - each runs a fixed number of iterations. For 4 byte integer, it's 32 if 2 bin is used for radix sort.
- buried.shopno October 05, 2011