Here's an algorithm with worst-case o(N^(1+epsilon)) time complexity and O(1) space, where epsilon is an arbitrarily small constant greater than zero. So the algorithm is "almost linear".
The inner loop is O(N), but it runs only d(N) times, where d(N) is the well-known divisor function - the number of divisors of N. d(N) = o(N^epsilon) for every epsilon > 0, so the overall time complexity is o(N*N^epsilon) = o(N^(1+epsilon)).
#include <string>
using namespace std;

bool checkPattern(const string& str)
{
    int len = (int)str.size();
    for (int i = 1; i <= len / 2; i++) { // i is a candidate period length
        if (len % i == 0) {
            int j = i;
            while (j < len && str[j] == str[j % i]) j++;
            if (j == len) return true; // the whole string is the period repeated
        }
    }
    return false;
}
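A quick usage sketch (the inputs are illustrative examples of mine, not from the original question):
#include <cassert>

int main()
{
    assert(checkPattern("abcabcabc"));  // "abc" repeated three times
    assert(!checkPattern("abcab"));     // no smaller repeating period
    return 0;
}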
First of all, the example output in the question is wrong. Here's the correct output:
[4 6 8] [7 6 8] [9 4 8]
[3 4 8] [4 8] [9 7 8]
[3 7 8] [7 8] [3 9 6]
[9 6] [3 6] [3 9]
No of groups: 12
OK, let's analyze the problem. The silly part of the problem is that we have to find and report ALL
the groups. That leads to an algorithm with non-polynomial worst-case time complexity.
If we had to find any group matching a given criteria, or prove that no such group exists, or solve an optimization problem
(e.g. find the group of maximum length divisible by X), then we could provide a nice polynomial (or, as some people say, pseudo-polynomial) time
algorithm based on dynamic programming with complexity O(X*N).
For example, if we want to maximize the length of the group, we can use the following DP recurrence:
dp[i][s] = max(dp[i-1][s], dp[i-1][(x + s - arr[i-1] % x) % x] + 1)
where dp[i][s] holds the maximum length of a subset formed from the first i elements of the array whose sum is s modulo x.
Note that in my solution all sums are computed modulo X, so s is the remainder of the sum divided by X.
dp[arr.size()][0] will contain the length of the largest subset with a sum divisible by x.
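To make the recurrence concrete, here's a minimal sketch of the pure DP variant (the function name is mine; I also compress the table to a rolling array of size X, which is possible here because no backtracking is needed; assumes x > 0):

#include <vector>
#include <climits>
#include <algorithm>
using namespace std;

int maxLenDivisibleByX(const vector<int>& arr, int x)
{
    // dp[s] = maximum length of a subset of the processed prefix with sum == s (mod x)
    vector<int> dp(x, INT_MIN), next(x);
    dp[0] = 0; // the empty subset has sum 0 and length 0
    for (size_t i = 1; i <= arr.size(); i++) {
        int r = ((arr[i - 1] % x) + x) % x; // remainder normalized into [0, x)
        for (int s = 0; s < x; s++) {
            int skip = dp[s];                       // arr[i-1] is not taken
            int from = dp[((s - r) % x + x) % x];   // state before taking arr[i-1]
            int take = (from == INT_MIN) ? INT_MIN : from + 1;
            next[s] = max(skip, take);
        }
        dp.swap(next);
    }
    return dp[0]; // length of the largest subset with a sum divisible by x
}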
OK, let's solve the original problem. We want to print all groups efficiently while keeping the running time (pseudo-)polynomial
when there are only a few groups to report. The algorithm below is a hybrid of dynamic programming and a
search with pruning and backtracking. The dp[i][s] recurrence mentioned above is used to avoid recursing into costly branches
which don't lead to printable results.
Space complexity is O(X*N); time complexity is O(X*(N+k)), where k is the number of groups found and printed.
If no groups exist with the given criteria, or O(k) == O(N), the algorithm terminates in O(X*N) time.
If O(k) == O(N^c) for some constant c, the algorithm terminates in polynomial time O(X*N^c).
Code:
#include <iostream>
#include <vector>
#include <climits>
#include <algorithm>
using namespace std;

class FindGroupsWithSumDivisibleByX
{
    int x;
    const vector<int>& arr;
    int count;
    // dp[i][j] holds the maximum length of a subsequence of the first i elements
    // of arr which sums to j modulo x
    vector<vector<int>> dp;
public:
    FindGroupsWithSumDivisibleByX(const vector<int>& arr, int x) : x(x), arr(arr)
    {
        dp.resize(arr.size() + 1, vector<int>(x, INT_MIN));
    }
    void PrintAllGroups()
    {
        count = 0;
        vector<int> output(arr.size());
        Backtrack(output, 0, (int)arr.size(), 0);
        cout << "No of groups: " << count << endl;
    }
    int Backtrack(vector<int>& output, int pos, int i, int sumModuloX)
    {
        if (i == 0) {
            PrintGroup(output, pos, sumModuloX);
            return 0;
        }
        int& cache = dp[i][sumModuloX];
        if (cache != INT_MIN) {
            if (pos >= x || cache == 0 || (pos == 0 && cache == 1)) {
                // print the accumulated group (if it matches the criteria)
                PrintGroup(output, pos, sumModuloX);
                // prune the current recursive search branch
                return cache;
            }
        }
        output[pos] = arr[i - 1];
        // case a: adding arr[i-1] to the group
        int a = Backtrack(output, pos + 1, i - 1, (x + sumModuloX - (arr[i - 1] % x)) % x) + 1;
        // case b: not adding arr[i-1] to the group
        int b = Backtrack(output, pos, i - 1, sumModuloX);
        cache = max(a, b);
        return cache;
    }
    void PrintGroup(vector<int>& output, int pos, int sumModuloX)
    {
        if (sumModuloX == 0 && pos > 1 && pos <= x) {
            for (int j = pos - 1; j >= 0; j--)
                cout << output[j] << " ";
            cout << endl;
            count++;
        }
    }
};
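A hypothetical usage sketch (the array and X below are illustrative; they are not the input from the original question):

int main()
{
    vector<int> arr = {3, 4, 6, 7, 8, 9};
    FindGroupsWithSumDivisibleByX finder(arr, 3);
    finder.PrintAllGroups(); // prints each group of 2..3 elements whose sum is divisible by 3
    return 0;
}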
Some notes:
1) Negative numbers: the code supports negative numbers. Note, however, that prior to C++11 the sign of the % result for a negative operand was implementation-defined (since C++11 the result takes the sign of the dividend).
As written, the code will work on mainstream compilers, but if you care about strict portability
you can replace arr[i] % x with sgn(arr[i]) * (abs(arr[i]) % x), or simply normalize the remainder into [0, x).
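For example, a tiny helper along these lines (the name posMod is mine) does the normalization:

int posMod(int a, int x)
{
    return ((a % x) + x) % x; // result is always in [0, x) for x > 0
}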
2) The space complexity can be optimized further: technically the program needs only 2 bits per entry of the dp array,
so the storage can be reduced. It can even be reduced to a single bit matrix without changing the asymptotic complexity,
but at the cost of larger constants.
3) I also assumed that the numbers in the array are unique. If this assumption doesn't hold, we also need to take care to avoid reporting duplicate groups (unless duplicates are permitted).
OK, as many of us have concluded, it's impossible to construct an algorithm for this problem with worst-case time complexity better than O(N^2). What to do when the interviewer still expects an O(N) worst-case solution [where O(N^2) pairs have to be reported]?
1) Provide a formal proof that this is impossible in the classic computation model (a Turing machine) [an argument about the number of pairs is enough].
2) Tell the interviewer that we're engineers and we are going to solve this no matter what. Show which constraints we can change to achieve O(N) time complexity or better.
For instance, we can step outside the classical sequential model and design a system (either new hardware or a distributed system with many processors/computers) where we can leverage O(N) processors with very efficient communication between each pair of processors. In this case an O(N) solution is possible: we can find the O(N^2) pairs and report them in O(N) time into storage distributed across our O(N) nodes.
If I were an interviewer at Amazon, I'd be happy to hear such an answer.
In practice QuickSort is usually implemented with random pivot selection and optimizations like Dutch national flag partitioning. That makes the algorithm's *expected* time complexity O(N log N) regardless of the input!
You won't be able to construct anything like this for the problem we are discussing.
Even unmodified quicksort works well on random input; it has average complexity O(N log N). For the problem we are discussing, no algorithm will work well even on random input, since in our case the expected number of pairs in a random input is O(N^2).
So this is fundamentally an O(N^2) problem. The only thing you can do is optimize the special cases where the expected number of pairs is asymptotically lower than O(N^2).
Alternatively, you can cross the line and apply various forms of cheating - see the idea of the O(1) "algorithm" described in the comments.
The interviewer will be very happy with this response, especially if you clarify the assumptions, show why it is impossible to do it the "standard" way, and articulate when exactly the parallelized solution is effective: namely, when the comparison operation is more expensive than processor allocation and inter-process communication. Obviously, the proposed communication scheme must be rational (e.g. there is no need for reorganization; it is enough to send each request to every node each time).
When asking questions like this, the interviewer clearly wants to see whether the candidate can think outside the box. You can argue whether this question is good or not; usually the interviewer provides enough hints. It is an open-ended question, and the discussion can even end up with designing your own hardware architecture for the problem.
I like this question.
I assume here that the data structure can hold arbitrary numbers (data); otherwise the problem isn't interesting.
First of all, your intuition is right:
> But not sure Max is even possible in O(1) with the presence of delete operation?
Such a data structure/algorithm is impossible on a Turing machine.
Proof by contradiction:
Suppose this data structure exists. Then we could build a general-purpose comparison sorting algorithm by pushing all the data into the structure and then pulling the elements out one by one using the max and delete operations, with O(N) total complexity. However, we know that no comparison sorting algorithm on a Turing machine can do better than O(N log N). Q.E.D.
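To make the reduction explicit, here's a hedged sketch; MaxStructure stands for the hypothetical structure from the claim, with insert, max and erase all assumed O(1):

#include <vector>
using namespace std;

template <typename T, typename MaxStructure>
vector<T> impossibleSort(const vector<T>& data)
{
    MaxStructure s;
    for (const T& v : data) s.insert(v); // N inserts
    vector<T> sorted(data.size());
    for (int i = (int)data.size() - 1; i >= 0; i--) {
        sorted[i] = s.max();  // O(1) by assumption
        s.erase(s.max());     // O(1) by assumption
    }
    return sorted; // a comparison sort in O(N) total - the contradiction
}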
Now, we're engineers, so let's solve the problem :)
Given the proof above, the only possible loophole is to step outside the Turing machine model. Since the question is asked at Amazon, where interviewers are passionate about scale, I'd propose parallelizing the computation and leveraging O(N) processors/computers. In this model all the requested operations (including search) can be implemented with O(1) time complexity.
Victor: even if the tree is self-balancing, the complexity is still O(N^2).
Note that you have to find ALL the pairs. In the worst case (or on random input) there are O(N^2) pairs to report, so the overall complexity is O(N^2). Using a self-balancing tree doesn't help here.
Basically I wanted to show that it is not necessary to look at the whole input string: if the string is long enough (longer than some constant threshold), the result is 'true'. The threshold depends only on the alphabet size, not on the length of the input string. So the key part of this "solution" is "if (str.size() > threshold) return true;".
Anything we do after this check doesn't depend on the input string length, so the running time of the "ugly" part is bounded by a constant.
In this specific case the "ugly" part is just a brute-force search which checks all combinations of pairs.
This is a discrete bin packing problem - a variant of classical bin packing ( en.wikipedia.org/wiki/Bin_packing_problem ) where bins are computers and items are tasks. Please note that the heuristic algorithms proposed in the discussion are incorrect solutions for this task (unless approximations are acceptable).
In general, bin packing is NP-hard; however, we can take advantage of the fact that weights and capacities are discrete (integers) and solve it with a pseudo-polynomial dynamic programming algorithm with O(P) space and O(P) time, where P is the product of the capacities of all bins multiplied by the number of items. The storage is a multidimensional bit array which caches partial solutions: dpCache[i][b0][b1]...[bk] is a bit storing whether the first i items can be packed into bins with remaining capacities b0...bk.
Here's the code:
#include <iostream>
#include <vector>
using namespace std;

class DiscreteBinPacking
{
public:
    DiscreteBinPacking(const vector<int>& bins, const vector<int>& items)
        : bins(bins), items(items), remainingCapacity(bins)
    {
        // compute the size of the bit-vector storing the multidimensional
        // matrix which caches partial solutions
        int size = (int)items.size();
        for (size_t i = 0; i < bins.size(); i++) {
            size *= bins[i] + 1;
        }
        dpCache.resize(size);
        dpCacheMask.resize(size);
    }
    bool CanBePacked()
    {
        return CanBePacked((int)items.size());
    }
private:
    const vector<int> bins;
    const vector<int> items;
    vector<int> remainingCapacity;
    // two bit-vectors, each representing a multidimensional bit array caching partial solutions
    vector<bool> dpCache;
    vector<bool> dpCacheMask;
    // returns true if bin packing is solvable for the first i items
    bool CanBePacked(int i)
    {
        if (i == 0) return true;
        int cacheIndex = i - 1;
        for (size_t b = 0; b < bins.size(); b++) {
            cacheIndex *= bins[b] + 1;
            cacheIndex += remainingCapacity[b];
        }
        if (dpCacheMask[cacheIndex]) return dpCache[cacheIndex];
        bool result = false;
        int itemSize = items[i - 1];
        // try to place item i-1 into every bin it still fits into
        for (size_t b = 0; b < remainingCapacity.size() && !result; b++) {
            if (remainingCapacity[b] >= itemSize) {
                remainingCapacity[b] -= itemSize;
                result = CanBePacked(i - 1);
                remainingCapacity[b] += itemSize;
            }
        }
        dpCacheMask[cacheIndex] = true;
        dpCache[cacheIndex] = result;
        return result;
    }
};
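A hypothetical usage sketch (capacities and item sizes are illustrative):

int main()
{
    vector<int> bins = {10, 7};
    vector<int> items = {5, 5, 4, 3};
    DiscreteBinPacking packer(bins, items);
    // {5, 5} fills the first bin and {4, 3} the second, so this prints "packable"
    cout << (packer.CanBePacked() ? "packable" : "not packable") << endl;
    return 0;
}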
I worked on this problem independently and found that my solution is exactly the same as vgeek's; however, my implementation is shorter and easier to read.
O(N) time, O(N) space.
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

void printSequence(int n)
{
    if (n < 1) return;
    vector<int> v(n);
    v[0] = 1;
    int p2 = 0, p3 = 0, p5 = 0, p7 = 0;
    cout << 1 << endl;
    for (int i = 1; i < n; i++) {
        int n2 = v[p2] * 2;
        int n3 = v[p3] * 3;
        int n5 = v[p5] * 5;
        int n7 = v[p7] * 7;
        int next = min(min(n2, n3), min(n5, n7));
        v[i] = next;
        // advance every pointer that produced the minimum to skip duplicates
        if (next == n2) p2++;
        if (next == n3) p3++;
        if (next == n5) p5++;
        if (next == n7) p7++;
        cout << next << endl;
    }
}
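A quick usage sketch; the first 10 printed values are 1 2 3 4 5 6 7 8 9 10 (11 and 13 are skipped since they aren't products of 2, 3, 5 and 7):

int main()
{
    printSequence(10);
    return 0;
}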
O(log^2(N)) solution!
I'm surprised that this question has so many votes and replies, yet nobody has posted a
solution better than O(sqrt(N)). Please vote this solution up so new participants can see it.
The solution below is O(log^2(N)) time and O(log(N)) space.
As for space, it needs to store at most log(N) integers - which is just 32 integers if an integer is 32-bit.
The algorithm uses only additions, subtractions and comparisons.
It is based on a two-level one-sided binary search.
On the first level it searches for the square root of N, which requires ~log(sqrt(N)) iterations.
Since we can't use multiplication, the algorithm uses a nested one-sided binary search to compute squares.
Thus, the total time complexity is O(log(sqrt(N)) * log(sqrt(N))) = O(log^2(N)/4) = O(log^2(N)).
The trickiest part is making the function work on all possible values of int, especially making
it robust for values near INT_MAX. For simplicity of implementation I allow integers to overflow on addition and use a comparison with 0 as an overflow check (strictly speaking, signed overflow is undefined behavior in C++; unsigned arithmetic would make this well-defined).
#include <vector>
#include <cassert>
using namespace std;

int computeSquare(int v, const vector<int>& powerOfTwoTable); // defined below

bool isPerfectSquare(int n)
{
    // Handling edge cases:
    if (n < 1) return false;
    if (n == 1) return true;
    // Fill a table with powers of two while simultaneously performing a
    // one-sided binary search for the [left, left*2] range which contains sqrt(N).
    // The table keeps at most log(N)/2 integers.
    vector<int> powerOf2Table; // 1 2 4 8 16 ...
    int left = 1, right = 1, sqr = 1;
    do { // O(log(N)) iterations
        powerOf2Table.push_back(right);
        left = right;
        right = right + right;       // *= 2
        sqr = sqr + sqr + sqr + sqr; // *= 4
    } while (sqr > 0 && sqr < n);
    if (sqr == n) return true;
    // now use bisection to find sqrt(N)
    for (int p = (int)powerOf2Table.size() - 2; p >= 0; p--) { // O(log(N)) iterations
        int mid = left + powerOf2Table[p];       // = (left + right) / 2
        sqr = computeSquare(mid, powerOf2Table); // this call is O(log(N))
        if (sqr == n) return true;
        if (sqr > 0 && sqr < n) left = mid;
        else right = mid;
    }
    assert(right == left + 1);
    return false;
}
// Computes the square of the argument (v*v) using only additions.
// Returns a negative value in the case of overflow.
int computeSquare(int v, const vector<int>& powerOfTwoTable)
{
    // one-sided binary search for the result range [v*i/2 .. v*i]
    vector<int> multTable(powerOfTwoTable.size()); // 1*v 2*v 4*v 8*v ...
    int left = 1, right = 1, idx = 0, square = v;
    do { // O(log(N)) iterations
        multTable[idx++] = square;
        left = right;
        right += right;
        if (square > 0) square += square;
    } while (right < v);
    if (right == v) return square;
    square = multTable[idx - 1];
    if (square < 0) return -1;
    // now perform bisection of the range
    for (int p = idx - 2; p >= 0; p--) { // O(log(N)) iterations
        int mid = left + powerOfTwoTable[p]; // == (left + right) / 2
        if (mid == v)
            return square + multTable[p]; // might overflow but this is OK
        if (mid < v) {
            square += multTable[p];
            if (square < 0) return -1;
            left = mid;
        }
        else {
            right = mid;
        }
    }
    assert(right == left + 1);
    return square;
}
// Asymptotic complexity - O(1) space, O(1) time. Not a joke :)
#include <string>
using namespace std;

bool containsRepeating2CharSubsequence(const string& str)
{
    const int alphabetSize = 26;
    // Pigeonhole bound: any longer string contains more than 26^2 disjoint
    // adjacent character pairs, so two of those pairs must be equal, which
    // gives the repeating 2-char subsequence.
    const int threshold = 2 * alphabetSize * alphabetSize + 1;
    if ((int)str.size() > threshold)
        return true; // according to the pigeonhole principle
    int len = (int)str.size();
    // let me write something purposefully ugly here to demonstrate
    // that asymptotically it doesn't matter [still O(1) space, O(1) time] :)
    for (int c1 = 0; c1 < len - 3; c1++)
        for (int c2 = c1 + 1; c2 < len - 2; c2++)
            for (int c3 = c2 + 1; c3 < len - 1; c3++)
                if (str[c1] == str[c3])
                    for (int c4 = c3 + 1; c4 < len; c4++)
                        if (str[c2] == str[c4]) return true;
    return false;
}
I realized that I was wrong. Your algorithm is fine (and in fact identical to simon.zhangsm's solution if you trim the allocation to k+1).
The difference between (simon's and yours) vs (mine and anonymous') solutions is that we interpret the value of k differently: you count buy-wait-and-sell as a single transaction, while we count buy and sell as separate
operations - but we generate the same results.
(To avoid confusing others, I deleted my original post where I said your solution was wrong.)
O(kN) time, O(k) space.
k is the total number of operations (buy and sell count separately).
#include <iostream>
#include <vector>
#include <climits>
#include <algorithm>
using namespace std;

int maximizeProfit(const vector<int>& v, int k)
{
    if (v.empty() || k < 1) return 0;
    // s1[j] - best balance after j operations while holding a stock
    // s0[j] - best balance after j operations while not holding a stock
    vector<int> s0(k + 1), s1(k + 1);
    s1[0] = INT_MIN / 2; // "impossible" state; /2 avoids overflow when v[i] is added
    for (int j = 1; j <= k; j++) {
        s1[j] = -v[0];
    }
    for (int i = 1; i < (int)v.size(); i++) {
        for (int j = k; j >= 1; j--) {
            s1[j] = max(s1[j], s0[j - 1] - v[i]); // j-th operation is a buy
            s0[j] = max(s0[j], s1[j - 1] + v[i]); // j-th operation is a sell
        }
    }
    return s0[k];
}
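A usage sketch (prices are illustrative): with k = 4, i.e. two buys plus two sells, the best profit below is (6 - 1) + (7 - 2) = 10:

int main()
{
    vector<int> prices = {1, 6, 2, 7};
    cout << maximizeProfit(prices, 4) << endl; // prints 10
    return 0;
}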
Unfortunately, the second part of your solution gives the minimum connections between individual pairs of ponds but doesn't give the global minimum (as requested).
First, you don't have to connect every pair of ponds (a minimum spanning tree is enough); second, there can be a configuration of k (k > 2) ponds which is connected more efficiently by a star-like (or cross-like) connection instead of connections between individual pairs.
Consider the following example. It shows that the globally optimal solution doesn't always consist of the shortest distances between pairs of ponds.
.........
....0000.
00..0..0.
0...0....
0........
000...000
........0
....0...0
.0..0..00
.0000....
I noticed that the only allowed change is turning a 1 into a 0, which makes the remaining part of the problem simpler (i.e. we can only connect ponds).
Here's a sketch of an algorithm:
1)
Finding the ponds in the NxN matrix is trivial: O(N^2) time with a BFS traversal for finding connected components.
2)
The next step is to add all points on the edges of all ponds to the queue [we can do this as part of step 1] and keep running BFS, growing all ponds until the whole matrix is filled (a code sketch of this step follows below). We end up with regions similar to a Voronoi diagram under the Manhattan distance metric. This requires O(N^2) time.
3)
What's nice about the resulting data is that the boundaries of the computed regions give us all possible shortest connections between each pair of ponds, including "star-like" connections between several ponds.
Based on that we can compute a graph whose nodes are ponds and whose edges are connections between ponds. The graph may contain fake nodes for star-like connections. The weight of an edge is the number of changes (Manhattan distance metric).
4)
Then we run a minimum spanning tree algorithm on the computed graph, which gives us the minimal connections. The spanning tree search should be slightly modified to exclude fake nodes from tracking, so they aren't constrained to be part of the resulting tree.
----
Time complexity: O(N^2 + P^2), where P is the number of ponds.
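Here's a minimal sketch of step 2 (multi-source BFS); it assumes step 1 already labeled pond cells (label[r][c] > 0 for pond cells, 0 for land), and the function name and layout are mine:

#include <vector>
#include <queue>
using namespace std;

// Grows all ponds simultaneously: on exit dist[r][c] is the BFS (Manhattan)
// distance to the nearest pond and label[r][c] is the id of that pond.
void expandPonds(vector<vector<int>>& label, vector<vector<int>>& dist)
{
    int n = (int)label.size();
    queue<pair<int, int>> q;
    for (int r = 0; r < n; r++)
        for (int c = 0; c < n; c++) {
            if (label[r][c] > 0) { dist[r][c] = 0; q.push(make_pair(r, c)); }
            else dist[r][c] = -1; // not reached yet
        }
    const int dr[] = {1, -1, 0, 0}, dc[] = {0, 0, 1, -1};
    while (!q.empty()) {
        pair<int, int> cur = q.front(); q.pop();
        for (int d = 0; d < 4; d++) {
            int nr = cur.first + dr[d], nc = cur.second + dc[d];
            if (nr < 0 || nr >= n || nc < 0 || nc >= n || dist[nr][nc] != -1)
                continue;
            dist[nr][nc] = dist[cur.first][cur.second] + 1;
            label[nr][nc] = label[cur.first][cur.second]; // pond that claimed this cell
            q.push(make_pair(nr, nc));
        }
    }
}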
Any other ideas? I personally don't like my solution - it sounds too complicated for an interview problem. Please criticize and suggest alternatives.
As Victor said, this is an open-ended question. I can give at least five explanations for why this can happen.
However, most likely it happens because of an uninitialized variable in one of the future stack frames created after the printf call.
Without the printf, the uninitialized variable has some value (whatever happens to be on the stack); the call to printf modifies the stack, changing the future value of that variable. This triggers different runtime behavior.
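A hedged illustration of the mechanism (deliberately undefined behavior; the exact values depend on the compiler, optimization level and ABI, so treat this as a sketch):

#include <stdio.h>

int readUninitialized()
{
    int y;    // never initialized: holds whatever bytes are left on the stack
    return y; // undefined behavior: reading an indeterminate value
}

int main()
{
    printf("hello\n"); // may leave different data in the stack area reused below
    printf("%d\n", readUninitialized()); // result can change if the line above is removed
    return 0;
}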
The question has a certain amount of uncertainty/ambiguity. It is actually a pretty good question if the goal is to trigger a discussion that resolves it.
For instance, O(log(N)) is possible when the tree is balanced and one of the following conditions is met:
a) k << N, so we can use in-order traversal;
b) each node stores the count of nodes in the subtree rooted at it, so we can find the k-th element recursively (see the sketch below).
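A minimal sketch of option (b); the node layout with a cached subtree size is my assumption:

struct Node
{
    int value;
    int subtreeSize; // number of nodes in the subtree rooted here, maintained on insert/delete
    Node* left;
    Node* right;
};

int sizeOf(const Node* n) { return n ? n->subtreeSize : 0; }

// Returns the k-th smallest value (1-based); assumes 1 <= k <= sizeOf(root).
int kthSmallest(const Node* root, int k)
{
    int leftSize = sizeOf(root->left);
    if (k <= leftSize) return kthSmallest(root->left, k);
    if (k == leftSize + 1) return root->value;
    return kthSmallest(root->right, k - leftSize - 1);
}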
Incorrect. This is generally unsafe.
It could be safe only in some edge cases, e.g. on most compilers it will be safe IF the class already has a virtual method AND the destructor is never called through a base-class pointer from the client code (directly or indirectly via any form of inlining).
This is obviously a C++ question (it mentions a virtual destructor).
The C++ ABI isn't standardized and the answers depend on a bunch of implementation details, but for most compilers all of the changes a, b, c, d are generally unsafe.
For each case (a, b, c, d) there are numerous "safe" special cases.
For instance, changing the signature of a non-virtual method which is never called is safe, as is introducing a data field into a class whose layout is never observed by the client code (directly or indirectly via inlined functions), and so on.
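For illustration, a hedged sketch of the layout hazard behind "adding a data field" (Widget and the version namespaces are hypothetical stand-ins for two builds of the same library class):

#include <stdio.h>

namespace v1 { struct Widget { int a; }; }        // layout the client was compiled against
namespace v2 { struct Widget { int a; int b; }; } // layout after the library change

int main()
{
    // A client built against v1 bakes sizeof(v1::Widget) into its array math,
    // new[]/delete calls and inlined accessors; running it against a v2 binary
    // without recompiling silently corrupts memory.
    printf("v1: %zu bytes, v2: %zu bytes\n", sizeof(v1::Widget), sizeof(v2::Widget));
    return 0;
}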
The code above is incorrect, since it doesn't handle an important edge case [0 and negative numbers].
Handling such edge cases is important in these kinds of interview questions. Here's a correct version:
bool power2(int a)
{
    // a power of two is positive and has exactly one bit set
    return (a > 0) && ((a & (a - 1)) == 0);
}
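Quick checks covering the edge cases mentioned above:

#include <cassert>

int main()
{
    assert(power2(1) && power2(2) && power2(64));
    assert(!power2(0) && !power2(-4) && !power2(6));
    return 0;
}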
> The proof doesn't cover non-comparison based models
It doesn't need to. We're within the comparison model, since the data structure implements a MAX() operation. Finding the maximum of two (or more) numbers is, by definition, a comparison operation, so building a sorting algorithm from a series of MAX() calls yields a comparison sorting algorithm.
In fact, I don't even need to prove that it must be a comparison sort.