cristian.botau
This is actually O(N), because you have the recurrence relation:
T(n) = 2*T(n/2) + O(1)
For O(log N) you should do only one recursive call, i.e:
T(n) = T(n/2) + O(1)
You can obtain that by reusing the result of "power(a, n / 2)" to compute "power(a, n - n / 2)", which avoids the second recursive call.
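A minimal sketch of the single-call version, assuming a non-negative integer exponent and ignoring overflow. The signature and the long long types are illustrative choices, not taken from the post being discussed:

```cpp
#include <cassert>

// Computes a^n with one recursive call per level: T(n) = T(n/2) + O(1).
// Assumes n >= 0; overflow is not handled in this sketch.
long long power(long long a, int n) {
    assert(n >= 0);
    if (n == 0)
        return 1;
    long long half = power(a, n / 2);   // computed once, reused below
    long long result = half * half;     // a^(2 * (n / 2))
    if (n % 2 == 1)
        result *= a;                    // for odd n, n - n / 2 == n / 2 + 1
    return result;
}
```

For example, power(2, 10) makes five recursive calls (for n = 5, 2, 1, 0 plus the initial one) instead of the roughly 2*N calls of the two-call version.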
Cheng Q.'s approach is correct. Here is an implementation for this idea.
I use the array pos, where pos[i] = the leftmost position where count_1 - count_0 = i
Time complexity: O(N)
Space complexity: additional O(N)
#include <iostream>
#include <algorithm>
#include <cstring>
const int MAX_N = 100;
const int INF = 0x3F3F3F3F;
using namespace std;
// Returns the length of the longest subarray of a[0..n-1] with as many 1s as 0s.
int solve(int a[MAX_N], int n) {
    int b[2*MAX_N + 1];
    int *pos = b + MAX_N; // pos points to the middle of b so that we can use negative indices for accessing pos elements
    for (int i = -n; i <= n; ++i)
        pos[i] = INF;
    pos[0] = -1; // the empty prefix has balance 0, so a matching subarray may start at index 0
    int result = 0;
    for (int i = 0, count = 0; i < n; ++i) {
        count += a[i] ? 1 : -1;
        result = max(i - pos[count], result);
        pos[count] = min(i, pos[count]);
    }
    return result;
}
int main() {
    int a[MAX_N] = {1,1,1,1,1,0,0,0,0,0,0,0,1};
    //int a[MAX_N] = {0,0,1,0,1,0,1,0,1};
    cout << solve(a, 13) << endl;
    return 0;
}

cristian.botau
August 14, 2013
The decision to take a[start] out of the sequence if a[start] != more and a[end] != more doesn't seem to lead to a correct answer in some cases.
Take, for example, a = 1111100000001
@dmxmails: The complexity can be further reduced if you use a clever data structure that supports fast insertion at an arbitrary position.
This can be achieved using a modified skip list (expected insertion time O(logN)) or a modified balanced binary tree in which each node stores the number of nodes in the subtree rooted at that node (worst case insertion time: O(logN)).
So, the final complexity of the algorithm would be O(N*log(N)).
You can solve the problem using the dynamic programming technique.
Construct the matrix match, where match[i][j] is true iff the first i characters of a match the first j characters of b.
The recurrence relation is the following:
match[i][j] =
(match[i-1][j-1] && character_matches(a[i - 1], b[j - 1])) ||
/* eg: (abc, a?c) => (abcd, a?cd) */
(match[i-1][j] && b[j - 1] == '*') ||
/* eg: (ab, a*) => (abc, a*) */
(match[i][j-1] && b[j - 1] == '*')
/* eg: (abc, ab) => (abc, ab*) */
And the base case is: match[0][0] = true
Time complexity: O(A*B),
Space complexity: O(A*B), can be reduced to O(A+B)
where A and B are the lengths of a and b, respectively
Here is the code:
#include <iostream>
#include <string>
#include <vector>
using namespace std;
bool isCharMatch(char a, char b) {
    return (b == '?') || (b == '*') || (a == b);
}
// match[i][j] = (match[i-1][j-1] && matches(a[i - 1], b[j - 1])) ||
//               (match[i-1][j] && b[j - 1] == '*') ||
//               (match[i][j-1] && b[j - 1] == '*')
bool isMatch(const string& a, const string& b) {
    // a vector is used because variable-length arrays are not standard C++
    vector<vector<bool> > match(a.length() + 1, vector<bool>(b.length() + 1, false));
    for (size_t i = 0; i <= a.length(); ++i)
        for (size_t j = 0; j <= b.length(); ++j) {
            match[i][j] = (i == 0) && (j == 0);
            if (i > 0 && j > 0)
                match[i][j] = match[i][j] || (match[i-1][j-1] && isCharMatch(a[i-1], b[j-1]));
            if (i > 0 && j > 0)
                match[i][j] = match[i][j] || (match[i-1][j] && b[j-1] == '*');
            if (j > 0)
                match[i][j] = match[i][j] || (match[i][j-1] && b[j-1] == '*');
        }
    return match[a.length()][b.length()];
}
void test(const string& a, const string& b) {
    cout << "match(" << a << ", " << b << ") = " << isMatch(a, b) << endl;
}
int main() {
    test("abab", "*b*");
    test("abab", "a**b");
    test("abab", "a**b");
    test("", "");
    test("ab", "");
    test("ab", "*");
    test("", "**");
    test("", "*?");
    return 0;
}
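The space reduction mentioned above can be sketched by keeping only two rows of the table, since row i depends only on rows i-1 and i. This is a sketch under the same matching rules ('?' matches one character, '*' matches any sequence, including the empty one); isMatchTwoRows and its demo function are names made up for illustration, and the extra space used here is O(B):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same recurrence as the full table, but only rows i-1 (prev) and i (cur) are kept.
bool isMatchTwoRows(const std::string& a, const std::string& b) {
    const std::size_t n = a.length(), m = b.length();
    std::vector<char> prev(m + 1, 0), cur(m + 1, 0);
    for (std::size_t i = 0; i <= n; ++i) {
        for (std::size_t j = 0; j <= m; ++j) {
            bool ok = (i == 0 && j == 0);
            if (j > 0) {
                const bool star = (b[j - 1] == '*');
                if (i > 0) {
                    const bool charOk = star || b[j - 1] == '?' || a[i - 1] == b[j - 1];
                    ok = ok || (prev[j - 1] && charOk);   // match[i-1][j-1]
                    ok = ok || (prev[j] && star);         // match[i-1][j]
                }
                ok = ok || (cur[j - 1] && star);          // match[i][j-1]
            }
            cur[j] = ok;
        }
        prev.swap(cur);   // the row just computed becomes "previous"
    }
    return prev[m] != 0;  // after the final swap, prev holds row n
}

// Usage sketch.
void isMatchTwoRowsDemo() {
    assert(isMatchTwoRows("abab", "*b*"));
    assert(!isMatchTwoRows("", "*?"));
}
```

The stale contents of cur are harmless because every cell of a row is written before it is read within that row.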

cristian.botau
July 31, 2013
I've updated the answer to contain the program and the tests used for it. Hopefully I didn't leave any important test cases out. If you find any failing input data please post it ;)
- cristian.botau July 29, 2013
@Apostle: Oh, sorry, I haven't paid attention to the requirement (I thought that the largest square was asked for).
- cristian.botau July 25, 2013
@Apostle:
"1. No. It's an O(N^2) algorithm. each element of the matrix is traversed at most thrice."
I agree with Chih.Chiu.19. It seems to be O(N^3) by your explanation. What happens with your algorithm on an NxN matrix filled all with 1?
While doing the diagonal parsing (looking for corners), what happens after you have processed a corner? Do you continue parsing the diagonal, or do you skip the whole diagonal? (Otherwise I don't see how your algorithm runs in less than O(N^3).)
That's because it doesn't make sense in modular arithmetic (floating point numbers don't make much sense in modular arithmetic).
Actually, it would make sense if you computed the modular multiplicative inverse of x^abs(y) when y is negative, but that inverse exists only if x^abs(y) and z are coprime (so the problem might not always have an answer).
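As a rough illustration of the coprimality condition, here is a sketch of a modular inverse computed with the extended Euclidean algorithm. The names extendedGcd and modInverse are made up for this example, and the approach is only one standard way to do it:

```cpp
#include <cassert>

// Extended Euclid: returns g = gcd(a, b) and sets x, y so that a*x + b*y == g.
long long extendedGcd(long long a, long long b, long long& x, long long& y) {
    if (b == 0) {
        x = 1;
        y = 0;
        return a;
    }
    long long x1, y1;
    long long g = extendedGcd(b, a % b, x1, y1);
    x = y1;
    y = x1 - (a / b) * y1;
    return g;
}

// Stores in inv the value with (a * inv) % z == 1 and returns true,
// or returns false when a and z are not coprime (no inverse exists).
bool modInverse(long long a, long long z, long long& inv) {
    assert(z > 1);
    a %= z;
    if (a < 0)
        a += z;                      // normalize a into [0, z)
    long long x, y;
    if (extendedGcd(a, z, x, y) != 1)
        return false;
    inv = ((x % z) + z) % z;         // bring the coefficient into [0, z)
    return true;
}
```

With this, a negative exponent could be handled by inverting x^abs(y) % z when modInverse succeeds, and reporting "no answer" when it returns false.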
Here is an O(log(y)) algorithm. It is the classical fast exponentiation algorithm. I won't go into details because it is a simple algorithm and can easily be found by googling it.
However, the trick to this problem is to watch out for arithmetical overflows:
- since x^y can grow really big, you can't just compute x^y and then apply % z, since it will likely overflow; so you need to apply the modulo z operation on each multiplication;
- furthermore, for high values of z even one multiplication can overflow (take for instance x = 2 billion - 1, y = 2, z = 2 billion), so you need to use the long long type for each multiplication;
In order to make sure there is no arithmetic overflow happening, I've defined the modMul(x, y, z) operation which performs the operation "(x * y) % z" and guarantees there is no overflow.
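inline int modMul(int x, int y, int z) {
    // widen to long long before multiplying so the product cannot overflow
    int result = ((long long)x * y) % z;
    return result;
}
int power(int x, int y, int z) {
    if (y == 0)
        return 1 % z; // 1 % z so that z == 1 yields 0
    int sqrt = power(x, y / 2, z); // x^(y/2) % z, computed only once
    int result = modMul(sqrt, sqrt, z);
    if (y % 2 == 1)
        result = modMul(result, x, z);
    return result;
}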

cristian.botau
July 24, 2013
The line:
int y = power(x,n-1/2);
is incorrect: "/" takes precedence over "-", so the expression n-1/2 actually evaluates to n (1/2 is 0 in integer division). Btw, since n is odd you could just write n/2 (it truncates the result, so it is equivalent to (n-1)/2).
Also, your solution will likely overflow since x^y can easily get over 2^31-1. You need to apply "% z" to each multiplication inside the power() function.
What if x and z are very large, like 2^31-10? Even if you use "% z" for each multiplication it is not enough: x*x, for example, will overflow. So, when performing a multiplication you need to use long long ints (which can hold numbers up to 2^63-1). For example, the line "return y*y" should be "return ((long long)y * y) % z".
For a correct implementation, which also works for large int values for x, y, z check my answer.
I am not sorting and I'm not actually making any bucket. The buckets are used for algorithm explanation.
You basically need the following operations:
- find x, the smallest element in the input vector such that (x >= 1 and x < 2): easily done in O(N) with a simple pass through the array
- take the smallest element & second smallest element from the input array (and ignore the first element of the triple): doable in O(N)
- find x, the highest element s.t. (x >= 0.5 and x < 1): O(N)
- etc.
To generalize, the algorithm uses operations like: find the first/second lowest/highest element which lies in the interval [a..b). These operations are doable in O(N).
Consider the following buckets: (0..0.5), [0.5..1), [1..2), [2..inf).
Obviously, we ignore numbers in [2..inf).
Now basically we need to treat all cases of choosing 3 numbers from the 3 buckets.
We only need to look at the following cases (the other cases are "worse" or are covered by these):
1. If possible, choose the smallest element from bucket [1..2) => for the 2nd and 3rd we need to take the smallest 2 elements available. If sum < 2 then return true.
2. If possible, choose the two smallest elements from bucket [0.5 .. 1) => for the 3rd we need to take the smallest element available. If sum < 2 then return true;
3. If possible, choose the highest element from bucket [0.5 .. 1) => if possible, for the 2nd and 3rd take the highest and the second highest from bucket (0 .. 0.5). If sum > 1 then return true.
4. If possible, choose the highest 3 elements from bucket (0..0.5). If sum > 1 then return true.
If none of the cases above found a solution then return false.
Space complexity: O(1), you don't need to explicitly store numbers in buckets.
Time complexity: each operation (e.g.: find smallest element from bucket [1..2), etc.) can be done in O(N). There is a constant number of these operations => overall complexity O(N)
LATER EDIT:
Since the answer was downgraded without any question or explanation why it would be wrong, here is the actual code and the associated tests. Hopefully I didn't forget any relevant test case.
The code could be optimized and condensed further, but I tried to make it as clear as possible (with regard to the explanations above and the space and time requirements).
#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
using namespace std;
const float INF = 1000;
bool inInterval(float x, float st, float end) { return x >= st && x < end; }
bool findFirstSmallest(const vector<float>& a, float start, float end, float &res) {
int found = 0;
res = INF;
for (int i = 0; i < a.size(); ++i)
if (inInterval(a[i], start, end)) {
++found;
res = min(res, a[i]);
}
return found >= 1;
}
bool findFirstHighest(const vector<float>& a, float start, float end, float &res) {
int found = 0;
res = -INF;
for (int i = 0; i < a.size(); ++i)
if (inInterval(a[i], start, end)) {
++found;
res = max(res, a[i]);
}
return found >= 1;
}
bool findSecondSmallest(const vector<float>& a, float start, float end, float &res) {
int found = 0;
float first = INF, second = INF;
for (int i = 0; i < a.size(); ++i)
if (inInterval(a[i], start, end)) {
++found;
if (a[i] <= first) {
second = first;
first = a[i];
} else if (a[i] <= second)
second = a[i];
}
res = second;
return found >= 2;
}
bool findSecondHighest(const vector<float>& a, float start, float end, float &res) {
int found = 0;
float first = -INF, second = -INF;
for (int i = 0; i < a.size(); ++i)
if (inInterval(a[i], start, end)) {
++found;
if (a[i] >= first) {
second = first;
first = a[i];
} else if (a[i] >= second)
second = a[i];
}
res = second;
return found >= 2;
}
bool findThirdSmallest(const vector<float>& a, float start, float end, float &res) {
int found = 0;
float first = INF, second = INF, third = INF;
for (int i = 0; i < a.size(); ++i)
if (inInterval(a[i], start, end)) {
++found;
if (a[i] <= first) {
third = second;
second = first;
first = a[i];
} else if (a[i] <= second) {
third = second;
second = a[i];
} else if (a[i] <= third)
third = a[i];
}
res = third;
return found >= 3;
}
bool findThirdHighest(const vector<float>& a, float start, float end, float &res) {
int found = 0;
float first = -INF, second = -INF, third = -INF;
for (int i = 0; i < a.size(); ++i)
if (inInterval(a[i], start, end)) {
++found;
if (a[i] >= first) {
third = second;
second = first;
first = a[i];
} else if (a[i] >= second) {
third = second;
second = a[i];
} else if (a[i] >= third)
third = a[i];
}
res = third;
return found >= 3;
}
bool solve(const vector<float>& a, float& x, float &y, float& z) {
if (findFirstSmallest(a, 1, 2, x) &&
findFirstSmallest(a, 0, 1, y) &&
findSecondSmallest(a, 0, 1, z))
if (x + y + z < 2) return true;
if (findFirstSmallest(a, 0.5, 1, x) &&
findSecondSmallest(a, 0.5, 1, y) &&
(findFirstSmallest(a, 0, 0.5, z) || findThirdSmallest(a, 0.5, 1, z) ))
if (x + y + z < 2) return true;
if (findFirstSmallest(a, 0.5, 1, x) &&
findFirstHighest(a, 0, 0.5, y) &&
findSecondHighest(a, 0, 0.5, z))
if (x + y + z >= 1) return true;
if (findFirstHighest(a, 0, 0.5, x) &&
findSecondHighest(a, 0, 0.5, y) &&
findThirdHighest(a, 0, 0.5, z))
if (x + y + z >= 1) return true;
return false;
}
void test(const vector<float>& a) {
cout << "Test: ";
copy(a.begin(), a.end(), ostream_iterator<float>(cout, " "));
cout << endl;
float x, y, z;
if (solve(a, x, y, z))
cout << "Solution: " << x << " " << y << " " << z << endl;
else
cout << "Solution not found!" << endl;
cout << endl;
}
#define arrSize(a) (sizeof(a) / sizeof(a[0]))
int main() {
float test1[] = {0.1, 0.2, 0.2, 0.3, 2.0, 3.0};
test(vector<float>(test1, test1 + arrSize(test1)));
float test2[] = {0.1, 0.3, 0.3, 0.4, 2.0, 3.0};
test(vector<float>(test2, test2 + arrSize(test2)));
float test3[] = {0.1, 0.1, 0.2, 0.6, 2.0, 3.0};
test(vector<float>(test3, test3 + arrSize(test3)));
float test4[] = {0.5, 0.6, 0.6, 2.0, 3.0};
test(vector<float>(test4, test4 + arrSize(test4)));
float test5[] = {0.6, 0.6, 2.0, 3.0};
test(vector<float>(test5, test5 + arrSize(test5)));
float test6[] = {0.6, 0.6, 1.0, 2.0, 3.0};
test(vector<float>(test6, test6 + arrSize(test6)));
float test7[] = {0.1, 0.2, 0.5, 1.0, 2.0, 3.0};
test(vector<float>(test7, test7 + arrSize(test7)));
float test8[] = {0.1, 0.7, 0.6, 1.0, 2.0, 3.0};
test(vector<float>(test8, test8 + arrSize(test8)));
float test9[] = {0.5, 0.5, 0.6, 1.0, 2.0, 3.0};
test(vector<float>(test9, test9 + arrSize(test9)));
float test10[] = {1.6, 1.2, 1.0, 2.0, 3.0};
test(vector<float>(test10, test10 + arrSize(test10)));
return 0;
}

cristian.botau
February 22, 2013
It is not 3SUM-hard. The data has other characteristics which might make the problem solvable in linear time:
- numbers are positive
- sum must lie in an interval
Check my answer on how we can "exploit" these relaxed requirements in order to obtain a linear algorithm.
No, it is O(N^2) because the largestArea() function runs in O(N) time. This is because if you count the total operations done in the innermost loop "while (!St.empty())" you'll see it is O(N) (you can't pop more than N elements).
- cristian.botau February 12, 2013
@nitingupta180: Nice & optimal solution.
The algorithm for largestArea() could be made a little faster if you notice that you only need the first loop (the one computing L). That loop can be modified like so: whenever you pop an element from the stack, you update the result with the rectangle corresponding to that element (you know where it starts and you know that it ends here, at i). You also need to take care of the elements not yet popped at the end of the loop.
However, I think that the solution that computes both L and R is easier to understand.
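For reference, the single-loop variant described above might look like the sketch below. It assumes the usual "largest rectangle under a histogram" setting with non-negative bar heights; largestAreaSinglePass is a hypothetical name, and the original largestArea() is not shown in this thread:

```cpp
#include <algorithm>
#include <cassert>
#include <stack>
#include <vector>

// Largest rectangle under a histogram, using one loop and one stack.
long long largestAreaSinglePass(const std::vector<int>& h) {
    std::stack<int> st;            // indices of bars with strictly increasing heights
    long long best = 0;
    const int n = static_cast<int>(h.size());
    for (int i = 0; i <= n; ++i) {
        // A virtual bar of height 0 at i == n flushes whatever is left on the stack.
        const int cur = (i == n) ? 0 : h[i];
        assert(cur >= 0);
        while (!st.empty() && h[st.top()] >= cur) {
            const int height = h[st.top()];
            st.pop();
            // The popped bar's rectangle starts right after the new top and ends at i - 1.
            const int left = st.empty() ? 0 : st.top() + 1;
            best = std::max(best, static_cast<long long>(height) * (i - left));
        }
        if (i < n)
            st.push(i);
    }
    return best;
}
```

Each index is pushed once and popped once, so the loop stays O(N), just like the two-pass version that computes both L and R.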
The solution is correct, but there is a small problem with the notation of the vector:
You are multiplying a 2x1 matrix with a 2x2 matrix, and that is not possible.
Either swap places of vector and matrix, or define the vector horizontally (i.e 1x2 matrix).
| f(n-1) f(n-2) | x | 2 1 | = | f(n) f(n-1) |
                    | 2 0 |
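Read as a row vector times the matrix M = [[2, 1], [2, 0]], the product gives f(n) = 2*f(n-1) + 2*f(n-2), so [f(1) f(0)] x M^(n-1) = [f(n) f(n-1)]. The sketch below evaluates this with fast matrix exponentiation. The initial values f0 and f1 and the modulus mod are parameters I am assuming for illustration, since the original problem statement is not part of this thread:

```cpp
#include <cassert>
#include <cstdint>

struct Mat2 { std::uint64_t a, b, c, d; };   // represents [[a, b], [c, d]]

Mat2 mul(const Mat2& x, const Mat2& y, std::uint64_t mod) {
    return { (x.a * y.a + x.b * y.c) % mod, (x.a * y.b + x.b * y.d) % mod,
             (x.c * y.a + x.d * y.c) % mod, (x.c * y.b + x.d * y.d) % mod };
}

// f(n) % mod for f(n) = 2*f(n-1) + 2*f(n-2), with f(0) = f0 and f(1) = f1.
std::uint64_t fAt(std::uint64_t n, std::uint64_t f0, std::uint64_t f1, std::uint64_t mod) {
    assert(mod > 0 && mod < (1ULL << 31));   // keeps every product below 2^62
    if (n == 0)
        return f0 % mod;
    Mat2 result = {1, 0, 0, 1};              // identity
    Mat2 base = {2, 1, 2, 0};                // M
    for (std::uint64_t e = n - 1; e > 0; e /= 2) {
        if (e % 2 == 1)
            result = mul(result, base, mod);
        base = mul(base, base, mod);
    }
    // [f1 f0] x M^(n-1) = [f(n) f(n-1)]; only the first component is needed.
    return ((f1 % mod) * result.a + (f0 % mod) * result.c) % mod;
}
```

With f0 = f1 = 1 the sequence starts 1, 1, 4, 10, 28, 76, which is easy to check by hand against the recurrence.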

cristian.botau
January 25, 2013
You only need the number of nodes in the left subtree of every node.
- cristian.botau January 23, 2013
For that you need to use additional data (like a hash map) for efficiently determining the position in the heap. Try googling how decrease-key is implemented efficiently for a heap.
However I recommend that you use a std::set (actually multiset or map in order to deal with duplicate elements) instead of a heap. That will make implementation of the algorithm much easier. I used the term "heap" in the solution description because of its main purpose (to keep track of the minimum).
@arwin: I hope I understood your question properly. Here is the response:
When we have to delete elements from j to j' it doesn't take O(logN) time. It takes (j' - j)*O(logN).
However, if you count the elements that are deleted for all the steps the algorithm performs then there are at most N elements to delete (because when you delete an element you increase j', it is never decremented and it goes up to N).
Or to put it in another way: throughout the running of the algorithm, you heap.remove() each element of the array at most once.
Like your algorithm: simple, concise and general. However, no vote for you until you put a proper brief description in words of the algorithm.
- cristian.botau November 17, 2012
Yeah, my bad :)
Although, if the order doesn't matter, I don't see how the fact that the list is sorted may be helpful.
I think the key to solving this problem is to use the information that the list of words is sorted (i.e.: you already have the word list preprocessed to help you with the query).
Consider (for complexity computation):
N - number of words in the word list (max. 1 million)
L - size of a word (max. 40)
A - size of the alphabet (= 26)
For 1 letter distance you can use the following algorithm:
1. Generate all the possible words that are 1 letter distance away from the query word
- this has the complexity O(L*A)
2. Look up each of the generated words in the word list using a binary search:
- the look up of an individual word is O(L*logN)
- we have O(L*A) lookups => final complexity is O(L^2*A*logN) which is roughly (not considering the hidden constant) 832,000 operations, which is better than O(L*N), which is roughly 40 million operations.
For distance = 2, this algorithm performs worse than the O(L*N) version.
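The distance-1 algorithm above could be sketched as follows. I am assuming that "1 letter distance" means exactly one substituted letter (same length) and that words use only lowercase a-z; oneLetterAway is a made-up name:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Returns the words of the sorted list that differ from query in exactly one position.
std::vector<std::string> oneLetterAway(const std::vector<std::string>& sortedWords,
                                       const std::string& query) {
    std::vector<std::string> found;
    std::string candidate = query;
    for (std::size_t pos = 0; pos < query.size(); ++pos) {
        assert(query[pos] >= 'a' && query[pos] <= 'z');   // assumed alphabet
        for (char c = 'a'; c <= 'z'; ++c) {
            if (c == query[pos])
                continue;
            candidate[pos] = c;                            // one of the O(L*A) candidates
            if (std::binary_search(sortedWords.begin(), sortedWords.end(), candidate))
                found.push_back(candidate);                // O(L*logN) per lookup
        }
        candidate[pos] = query[pos];                       // restore before the next position
    }
    return found;
}
```

For the list {"bat", "cab", "cat", "cot", "dog"} and the query "cat", this finds "bat", "cot" and "cab", in the order the positions are tried.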
Later edit: @warrior: in case your last reply was not referring to my comment then just ignore what I just wrote :)
I didn't say that your solution is incorrect (it is actually correct), but I don't like the fact that it uses backtracking.
Regarding what I proposed, you misunderstood one thing: it doesn't permute the remaining digits, it finds the next permutation for the whole number.
Here is an example:
X = [1, 6, 7, 3, 2], Y = [6, 7, 8, 9, 1]
Algorithm:
[6, ?] -> [6, 7, ?] - no remaining larger digits? -> [6, 7, 3, 2, 1] - next permutation -> [7, 1, 2, 3, 6]
Binary search trees have the property that the inorder traversal of the tree is a sorted array. The converse of this property is also true.
So the easiest algorithm would be:
1. array a = inorder_traversal(tree)
2. check if a is sorted increasingly
Of course, you can merge those two steps into one and not use the additional array.
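A sketch of the merged version, which walks the tree in order and only remembers the previously visited node. The Node layout is an assumption for this example, and duplicates are treated as invalid (strictly increasing order):

```cpp
#include <cassert>

struct Node {
    int value;
    Node* left;
    Node* right;
};

// In-order walk that checks each value is larger than the one visited before it.
// prev points to the last visited node (nullptr before the first one).
bool isBstInOrder(const Node* node, const Node*& prev) {
    if (node == nullptr)
        return true;
    if (!isBstInOrder(node->left, prev))
        return false;
    if (prev != nullptr && prev->value >= node->value)
        return false;              // order broken: not a BST
    prev = node;
    return isBstInOrder(node->right, prev);
}

bool isBst(const Node* root) {
    const Node* prev = nullptr;
    return isBstInOrder(root, prev);
}

// Usage sketch with two hand-built trees.
void isBstDemo() {
    Node a = {1, nullptr, nullptr}, b = {3, nullptr, nullptr};
    Node good = {2, &a, &b};       // in-order: 1, 2, 3
    Node bad = {2, &b, &a};        // in-order: 3, 2, 1
    assert(isBst(&good));
    assert(!isBst(&bad));
}
```

This uses O(height) stack space for the recursion and no additional array.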
The algorithm looks incorrect.
Please correct me if I didn't understand it properly: you basically check for every node if (direct left child < parent) and (direct right child > parent).
If so, in the case below your algorithm returns a false positive:
      5
     / \
    2   7
   / \
  1   10

cristian.botau
November 01, 2012
evaluateExpressionPow has side effects. After a call of pow(a, b), if I call pow(c, d) where c != a it will use the pp computed for a (which is obviously wrong).
- cristian.botau November 01, 2012
You don't need to backtrack if you are out of digits that are higher than the one in y.
Why not just generate the largest possible number with the remaining digits (even though it will be lower than y) and then run the next permutation algorithm on the result?
You can solve it even more efficiently (O(N)) using a deque instead of a min-heap. Check my answer, it includes an explanation for the min-heap version as well as for the deque version.
- cristian.botau November 01, 2012
@bambam:
Using '\0' to end an array of ints is a little bit creepy. Why not just use 0 instead? (it's basically the same value and you don't force the compiler to cast your char to int)
I haven't analyzed your solution in depth but those inner loops make me a little skeptical about the O(N) complexity you're claiming.
(min2 >= min1) and (p + min1 >= k) implies that (p + min2 >= k)
Hence use of min2 is redundant.
This can be done in O(N*log(N)) time using a min-heap or O(N) using a deque.
Basically, the algorithm works like this: for each index i in the array, compute the longest subarray that ends at position i and satisfies the requested condition.
Now, let's consider we're at index i, and [j ... i-1] is the longest subarray found in the previous iteration (for i-1). In order to compute the subarray for this iteration we need to find the smallest j' >= j such that min(a[j'], .., a[i-1]) + a[i] >= K.
Now, the trick is how to find j' efficiently.
A first approach is to use a min-heap: start with j' = j, then increment j' and remove element a[j'] from the heap until the condition holds (or you reach i). Since j is incremented at most N times => there are a total of N calls to heap.remove_element. Since i is incremented N times => there are N calls to heap.insert_element. => final complexity O(N*log(N)).
A second approach, which is a little bit trickier (I suggest getting a pen and paper for this), is to use a deque instead of a heap. The constructed deque will have these important properties:
- at the front of the deque is the index of the minimum element in the sequence [j..i-1] (just like the heap)
- the second element is the index of the minimum element in the sequence that remains after removing the first minimum along with the elements in front of it;
- and so on.
So basically if deque = [m1, m2, ...] then the initial sequence looks like this [j ... m1 ... m2 ... i-1], and:
- m1 is the index of the minimum of sequence [j .. i-1],
- m2 is the index of the minimum of sequence (m1 .. i-1] (please note that the interval is open at m1)
I won't explain how you perform the operations on the deque in order to preserve those properties (try to work them out yourself or look at the code; if you have any questions feel free to ask). Below is the implementation of the time-optimal (deque) solution. The methods for updating the deque are push(i), which updates the deque by adding element a[i], and popBadMins(), which removes minimums from the deque and returns the new j'.
Friendly advice: If you're not familiar with the deque trick, I suggest you try to understand it because it has proved helpful in programming contests.
#include <iostream>
#include <vector>
#include <deque>
using namespace std;
#define MAX_N 10000
struct Sol {
int st, end;
Sol(int s, int e) : st(s), end(e) {};
};
int A[MAX_N], N, K;
vector<Sol> sol;
int maxLen = 0;
deque<int> q;
// adds the [st, end] interval to the solution set if it is maximal so far
void update_sol(int st, int end) {
int len = end - st + 1;
if (len > maxLen) {
maxLen = len;
sol.clear();
}
if (len == maxLen)
sol.push_back(Sol(st, end));
}
void read_data() {
cin >> N >> K;
for (int i = 0; i < N; ++i)
cin >> A[i];
}
void push(int index) {
int val = A[index];
while (!q.empty() && val <= A[q.back()])
q.pop_back();
q.push_back(index);
}
int popBadMins(int prevStart, int endIndex) {
int val = A[endIndex];
int result = prevStart;
while (!q.empty() && val + A[q.front()] < K) {
result = q.front();
q.pop_front();
}
return result;
}
void solve() {
for (int i = 0, j = -1; i < N; ++i) {
j = popBadMins(j, i);
push(i);
update_sol(j+1, i);
}
}
void print_result() {
for (int i = 0; i < sol.size(); ++i) {
const Sol& s = sol[i];
for (int j = s.st; j <= s.end; ++j)
cout << A[j] << " ";
cout << endl;
}
}
int main() {
read_data();
solve();
print_result();
return 0;
}
Note: Didn't test this thoroughly so I might have missed some corner cases.
Oh, and sorry for the long post.
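For readers new to the deque trick, it is probably easiest to see in its best-known form, the sliding-window minimum. The sketch below solves that simpler, related problem, not the one from this post; slidingWindowMin and the window length w are names chosen for illustration:

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Minimum of every window of length w, in O(N) total.
std::vector<int> slidingWindowMin(const std::vector<int>& a, int w) {
    assert(w >= 1);
    std::deque<int> q;             // indices; their values increase from front to back
    std::vector<int> mins;
    for (int i = 0; i < static_cast<int>(a.size()); ++i) {
        while (!q.empty() && a[q.back()] >= a[i])
            q.pop_back();          // a[i] is no larger and stays longer, so these can never win
        q.push_back(i);
        if (q.front() <= i - w)
            q.pop_front();         // the front index fell out of the window [i-w+1, i]
        if (i >= w - 1)
            mins.push_back(a[q.front()]);
    }
    return mins;
}
```

The front of the deque plays the role of m1 from the description, the next entry the role of m2, and so on; every index is pushed and popped at most once, which is where the O(N) bound comes from.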
Forgot to mention that there is no solution in case nextArrangment() method fails (i.e. this is the highest arrangement for x digits and yet still lower than y).
- cristian.botau October 31, 2012
Note: I assume x and y have the same number of digits
This is an O(N) algorithm:
Here is pseudocode with explanations:
1. create digits histogram for x in order to be able to efficiently extract a given digit from it
for (int i = 0; i < x.size(); ++i)
++histogram[x[i]];
2. start from most significant digit (assuming its index is 0) and basically use the same digit from y on the same position in result. If at some point you don't have that digit, you select the smallest digit higher than the digit you're looking for and then put the remaining digits in increasing order and you have your answer. If there is no larger digit then put the remaining digits in decreasing order.
In that last case you've got yourself the closest number to y, but lower. So you need to generate the next lexicographic permutation - see step 3 (in C++ there is std::next_permutation that does just that).
for (i = 0; i < y.size(); ++i)
{
try to extract digit y[i] from histogram
if (y[i] found in histogram) then { result[i] = y[i] }
else if (there is a digit d in histogram s.t. d > y[i])
{
result[i] = the smallest digit d from histogram s.t. d > y[i]
// put remaining digits in increasing order
result[(i+1)..y.size()] = histogram.sortIncreasing();
// found the number, woohoo!!
break for loop;
}
else /* there are only digits lower than y[i] */
{
// put remaining digits in decreasing order
result[i..y.size()] = histogram.sortDecreasing();
// found closest number smaller than y
break for loop;
}
}
3. Now the variable result is either:
- the result we're looking for, i.e.: the closest number greater than or equal to y
- the closest number less than y, in which case we need to generate the next lexicographic permutation of digits
So we need to do this check:
if (result < y)
result = nextPermutation(result);
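The three steps can be put together in C++ roughly as below. This is a sketch that assumes x and y are decimal digit strings of equal length; closestAtLeast is a name invented for this example, and std::next_permutation stands in for nextPermutation:

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Stores in result the smallest arrangement of x's digits that is >= y and returns true,
// or returns false if even the largest arrangement is below y.
bool closestAtLeast(const std::string& x, const std::string& y, std::string& result) {
    assert(x.size() == y.size());
    int histogram[10] = {0};
    for (char c : x)
        ++histogram[c - '0'];                       // step 1

    result.clear();
    for (std::size_t i = 0; i < y.size(); ++i) {    // step 2
        const int want = y[i] - '0';
        if (histogram[want] > 0) {                  // same digit as y is available
            --histogram[want];
            result += y[i];
            continue;
        }
        int d = want + 1;
        while (d <= 9 && histogram[d] == 0)
            ++d;
        if (d <= 9) {
            // smallest larger digit, then the remaining digits in increasing order
            --histogram[d];
            result += static_cast<char>('0' + d);
            for (int k = 0; k <= 9; ++k)
                result.append(static_cast<std::size_t>(histogram[k]), static_cast<char>('0' + k));
        } else {
            // only smaller digits are left: put them in decreasing order
            for (int k = 9; k >= 0; --k)
                result.append(static_cast<std::size_t>(histogram[k]), static_cast<char>('0' + k));
        }
        break;
    }
    if (result < y)                                 // step 3
        return std::next_permutation(result.begin(), result.end());
    return true;
}
```

For x = "16732" and y = "67891" the loop builds "67321", which is below y, and the next permutation gives "71236". Since both strings have the same length, comparing them lexicographically is the same as comparing the numbers.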

cristian.botau
October 31, 2012
The question asks for the *number* of pairs. In order to compute the number of pairs you don't necessarily need to iterate over each pair.
- cristian.botau October 04, 2013