Chris
BAN USERSWE
 0of 0 votes
AnswersA new species has been discovered and we are trying to understand their form of communication. So, we start by identifying their alphabet. Tough Mr. Brown managed to steal a vocabulary that has words in it in ascending order and a team already identified the letters. So, we have now ordered words as sequences of numbers, print every letter once in ascending order.
 Chris in United States
[3,2]
[4,3,2]
[4,1,1]
[1, 2]
[1, 1]
alphabet: [3, 4, 2, 1] Report Duplicate  Flag  PURGE
Software Engineer Data Structures
Treat it as a stream: when moving the window smartly k elemts leave the window and k enter. Let's suppose we do partition the first k*k window, to find the median, this takes O(k*k). Then I move right, I will know of the k elements I remove and I get k elements to add. If I only added k elements I could simply use two heaps and kind of calculate a "running" mean. But now, I have to remove k elements as well. So I'd need a heap where I can find and remove a specific element. One can do this with a custom bin heap implementation and use a HT as index.
Let's see, the heaps are O(k^2) in size. Initial creation is O(k*k*lg(k)). Then for each of the (nk)^2 moves it takes 2*lg(k). So O(k^2*lg(k^2)+n^2*lg(k)). That is O(n^2*lg(k)). Instead of O(n^2*n^2) or O(n^4*lg(n^2)) depending on brute force approach.
I think decent, improvement possible by Fibonacci heaps, but either way a bit ugly to code due to the special requirement on having a trackable heap element that can be referenced (indexes) by hash table.
what should it return: the index of max value, if max values occurs multiple times, should it have a state and cycle through the max values or should it return one of the indexes with equal probability?
if the question was with equal probability, then this would be the solution:
int getIndex(const std::vector<int>& numbers) { // suppose, MIN_INT doesn't occure in numbers
int maxNum = numeric_limits<int>::min();
size_t idx = 1;
size_t maxNumCtr = 0;
for(size_t i = 0; i<numbers.size(); ++i) {
int num = numbers[i];
if(num > maxNum) {
idx = i; // probability 1
maxNumCtr = 1;
} else if (num == maxNum) {
maxNumCtr ++;
idx = ((rand() % maxNumCtr) == 0) ? i : idx;
// P(idx=i) = 1/(maxNumCtr+1): 1/2 for 2nd, 1/3 for 3rd...
// you can prove it's uniform by induction
// if P((rand() % maxNumCtr) == 0) would be 1/maxNumCtr which it is not exactly.
// modulo will not generate uniform distributions
}
}
return idx; // 1 if empty array
}
the solution is O(n), the only improvement when knowing maxCount previously would be, to select which occurence of maxValue would be picked in advance. This results in less calles to rand (which is usually not a big problem because they are fast) and in O(n/2) average runtime which is still O(n). On large n certainly a plus, expecially if max values tend to be in the front of the vector opposed to the end.
Further if the function get's called repeatedly with the same vector<int> one could change the signature to a set and a query function and remember the maxValue indexes, so we had a O(n) scan first and O(1) subsequent calls.
Highscalability.com is a good source.
Http question: use long polling from the client, so he can get notifications
Scalability: I would not necessarily say it's not scalable, there is not enough info given here to judge that. But the storage of pants in fields in classical relational manner is a bit hard to scale and unnecessary. You could just store a object as whole which contains the two set of pans with coordinates or the field as a whole. This can then be easily cached as a whole. That's easier to scale.
The password: it's bad practice to store it with encryption, store the hash and a seed, LS :)
Highscalability.com is fun to read, then search for Google's Jeff Dean videos on YouTube, which I think is inspiring.
just do the symbolic multiplication by following the recursion
rec(res, i) = [
rec(res + {a}, i1) for all a in arr2d[i] if i >= 0
res if i < 0
]
result = {rec({}, len(arr2d))}
the code
#include <vector>
#include <iostream>
using namespace std;
void symbolicMulRec(const vector<vector<char>>& input, vector<char>& current, size_t& resCtr)
{
if (current.size() == input.size()) { // base case
if (current.size() == 0) return; // empty reesult, input.size() == 0
cout << (resCtr++ == 0 ? "{" : ",{") << current[0];
for (size_t i = 1; i < current.size(); ++i) {
cout << "," << current[i];
}
cout << "}";
} else {
current.push_back(' ');
for (char c : input[current.size()  1]) {
current.back() = c;
symbolicMulRec(input, current, resCtr);
}
current.pop_back();
}
}
void symbolicMul(const vector<vector<char>>& input)
{
cout << "{";
vector<char> current;
size_t resCtr = 0;
symbolicMulRec(input, current, resCtr);
cout << "}" << endl;
}
int main()
{
symbolicMul({});
symbolicMul({ {}, {} });
symbolicMul({ { 'a', 'b'}, {'c', 'd'} });
symbolicMul({ { 'a', 'b' },{ 'c', 'd' }, {'e', 'f'} });
/* output
{}
{}
{{a,c},{a,d},{b,c},{b,d}}
{{a,c,e},{a,c,f},{a,d,e},{a,d,f},{b,c,e},{b,c,f},{b,d,e},{b,d,f}}
*/
}

Chris
September 13, 2017 In O(1) because it sais "each element points to the index of the next element".
That means every element's value must be [0..n1]. So, it points either to itself or
an other element. Pointing to itself is a cycle. Thus every such array with at least
1 element will contain a cycle.
Can be proven by induction or by intuition. By intuition: try to form a linked list
without loop when there is no special indication for the nextpointer to mark the end.
That's probably not the intention of the question ;), So I change interpreting arr[i] = i as loop, but rather interpret it as vertex with no outgoing edge (end node).
For convenience I keep the assumptions there can't be elements with value < 0 or >= n,
when n is the number of elements in arr. Now the graph has more interesting properties:
a) it is directed
b) every vertex has at most one outgoing edge (linked list)
The question is, no extra space is allowed, but are we allowed to modify the array?
If not then use the brute force approach (O(n^2)).
If modification is allowed, next question is do I need to undo the modification after
the algo finishes. I assume I can modify and don't have to undo:
Assuming I start DFS at a single vertex, it will detect a cycle if it revisits a vertex.
If I start DFS from an other vertex, visiting an already visited vertex could mean:
 it was visited by a prior DFS run from an other vertex: cylce if prev. rund found cycle
 it was visited by this DFS run: cycle
Usually with DFS there is an enter and leave counter helping to distinguish between cycles forward edges and cross edges. Just a visited falg isn't enough (that's why some solutions here fail on test case 1 below). But since I have at most one outgoing edge, in a single DFS pass there can't be cross edges nor forward edges. But among different DFS passes I need to distinguish between whether I step into a cycle or hit a previous DFS tree.
I start at every vertext s: 0 <= s < n
I follow it's path until I get to a vertex with value >= n. While following that path
I change each vertext value to n + s. If the last vertex visited for this start
has a value of n + s, it was a cycle.
This will have O(n) time complexity because each vertex is visited exactly once.
#include <vector>
#include <iostream>
#include <limits>
using namespace std;
// solution 1: no extraspace, doesn't modify given vector, time complexity O(n^2)
// considers arr[i] = i as end node, not as cycle
// arr[i] < 0 and arr[i] >= n are not allowed
bool hasCycleBf(const vector<unsigned int>& arr) {
size_t n = arr.size();
for (size_t s = 0; s < n; ++s) {
size_t i = s, l = 0;
while (l <= n && i < n && arr[i] != i) {
i = arr[i];
l++;
}
if (l > n) return true;
}
return false;
}
// solution 2: marking visited elements, time complexity O(n)
bool hasCycle(vector<unsigned int> arr) // copy arr here, lazy programmer
{
size_t n = arr.size();
// do a kind of DFS from each starting index
for (size_t s = 0; s < n; ++s) {
size_t i = s;
while (i < n && arr[i] != i) {
int t = arr[i];
arr[i] = n + s;
i = t;
}
if (i == n + s) return true;
}
return false;
}
int main()
{ //0,1,2,3,4,5
cout << hasCycleBf({ 0,1,5,4,3,5 }); // cycle
cout << hasCycleBf({ 1,2,3,4,5,0 }); // cycle
cout << hasCycleBf({ 1,2,3,4,5,5 }); // no cycle
cout << hasCycleBf({ 1,2,4,4,5,5 }); // no cycle
cout << endl;
cout << hasCycle({ 0,1,5,4,3,5 }); // cycle
cout << hasCycle({ 1,2,3,4,5,0 }); // cycle
cout << hasCycle({ 1,2,3,4,5,5 }); // no cycle
cout << hasCycle({ 1,2,4,4,5,5 }); // no cycle
return 0;
}

Chris
September 13, 2017 @prashant.tah
maybe imagine it that way:
 if you extract the words from the input buffer and then compare each extracted word against w, what do you do:
 extract approximately n/m words in O(n)
 after (or while) doing that you do for each word at most m comparsions, which is again O(n)
 total of O(n)
or another way:
 while finding word separators in the input buffer, you could compare against w, knowing if w
is a match while you extract. This is O(n), too  slightly more obvious.
But if you try to search a pattern within each extracted word with a naive approach you would do:
 extracting n/m words in O(n)
 for each word of length m and the pattern of length p, perform p*(mp+1) comparisons.
For pattern matching in linear time KnuthMorrisPratt is especially interesting because it covers the whole prefix and suffix discussion which I think gives some insight (a pattern is a match if it has a common prefix on a common suffix  sounds strange until it makes click).
I think your solution idea is good. But it's a simple question. So, what was the interviewer interested in?
 the algorithm runs in O(n), and most time is consumed by fetching data from disc to memory.
 the queue contributes to memory usage and constant time factors. A ringbuffer could help on reducing constant factors on time and space complexity, (less copying around) but this optimization will come for the price of code complexity, so it needs to be well decided.
Other than that, I only see he was thinking of a pattern instead a word to look for, which would explain your O(n*m) time. In this case, preprocess the pattern so matching is O(n) again.
just for completeness, here the animated (console, take PS cons on windows) just to demonstrate how the solution would actually look like. Note that the Traveling Salesman part is implemented naive, so just adding 9 cakes will be already slow as it checks 362'000 routes. Of course there are much better implementations. But this shows the problem and a naive solution. It's fun, the real fun is optimizing it...
#include <vector>
#include <unordered_set>
#include <queue>
#include <cassert>
#include <list>
#include <iostream>
#include <string>
#include <thread>
using namespace std;
struct Point
{
int x_, y_;
bool operator== (const Point& rhs) const { return rhs.x_ == x_ && rhs.y_ == y_; }
bool operator!= (const Point& rhs) const { return !(*this == rhs); }
};
constexpr char toLower(char c) { return c >= 'A' && c <= 'Z' ? c  'A' + 'a' : c; }
void clearScreen() { cout << "\033[2J\033[" << 1 << ";" << 1 << "H"; }
void printAt(int c, int r, const string& str) { cout << "\033[" << r << ";" << c << "H" << str; }
class CockieSolver
{
string field_;
int m_; // matrix number of columns (x)
int n_; // matrix number of rows (y)
// coordinate of all destinations, destinations are {start, all cookies, end=friend}
vector<Point> destinations_;
vector<vector<int>> destDist_; // all pair distances of paths among destinations
// paths between all pair of destinations (memory wasting, due to lazy programmer)
vector<vector<list<Point>>> destPath_;
const pair<int, int> MOVES[4]{ {0, 1}, { 1,0 }, { 0,1 }, { 1,0 } }; // Manhatten moves
public:
// 'c' denotes a cookie, ' ' moveable, '#' wall, 'F' friend to finish,
// 'S' start (0,0) in the problem statement
CockieSolver(string&& field) : field_(field)
{
parseField();
}
void solveAndPrint()
{
if (!findAllPairShortesPath()) {
cout << "there are cookies I can't get, abort" << endl;
return;
}
auto bestRoute = findShortestRoute();
clearScreen();
cout << "solved the matrix with path length " << bestRoute.first << endl;
cout << field_ << endl;
for (int i = 1; i < bestRoute.second.size(); i++) {
int from = bestRoute.second[i1];
int to = bestRoute.second[i];
for (const Point& p : destPath_[from][to]) {
printAt(p.x_ + 1, p.y_ + 2, "*");
this_thread::sleep_for(200ms);
}
}
printAt(0, n_ + 2, "press any key");
getchar();
}
private:
// crazy O((destinations_.size()  2)!) solution that tries all possible routes to
// get all cheese cakes
pair<int, vector<int>> findShortestRoute()
{
unordered_set<int> usedDests;
vector<int> bestRoute;
vector<int> curRoute;
int bestRouteLen = numeric_limits<int>::max();
curRoute.push_back(0); // "start at start"
usedDests.insert(0); // can't take start anymore
findShortestRouteRec(curRoute, 0, usedDests, bestRoute, bestRouteLen);
return{ bestRouteLen, bestRoute };
}
void findShortestRouteRec(vector<int>& curRoute, int curRouteLen, unordered_set<int>& usedDests,
vector<int>& bestRoute, int& bestRouteLen)
{
if (curRoute.size() == destinations_.size()  1) {
curRouteLen += destDist_[curRoute.back()][destDist_.size()  1]; // end is fixed
if (curRouteLen < bestRouteLen) {
bestRoute = curRoute;
bestRoute.push_back(destDist_.size()  1);
bestRouteLen = curRouteLen;
}
}
for (int next = 1; next < destinations_.size()  1; ++next) {
if (usedDests.find(next) != usedDests.end()) continue; // can't use it
usedDests.insert(next);
curRouteLen += destDist_[curRoute.back()][next];
curRoute.push_back(next);
findShortestRouteRec(curRoute, curRouteLen, usedDests, bestRoute, bestRouteLen);
curRoute.pop_back();
curRouteLen = destDist_[curRoute.back()][next];
usedDests.erase(next);
}
}
// most primitive implementation, after this step, the distance between each
// destination is known as well as the path (how ever primitive this is, the bottle
// neck is to build the route
bool findAllPairShortesPath()
{
// in a undirected sparse graph, allpair shortest path can be done using V times
// BFS. That's okay, there are improvements although, using dual source BFS,
// A* etc. It will limit the number of vertices we need to discover for each pair
int destCt = destinations_.size();
destDist_ = vector<vector<int>>(destCt, vector<int>(destCt, numeric_limits<int>::max()));
destPath_ = vector<vector<list<Point>>>(destCt, vector<list<Point>>(destCt));
for (int start = 0; start < destCt; start++) { // for all destination nodes
// this is shortest path on the map between the coordinates of start
// and all fields of the map. This will tell us, how far are the other
// destinations from that start destination
vector<vector<int>> mapDist(m_, vector<int>(n_, numeric_limits<int>::max()));
vector<vector<Point>> mapParent(m_, vector<Point>(n_, { 1, 1 }));
Point startPt = destinations_[start];
mapDist[startPt.x_][startPt.y_] = 0; // start with 0 distance on map
deque<Point> q({ destinations_[start] });
while (!q.empty()) {
Point cur = q.back(); q.pop_back();
int curDist = mapDist[cur.x_][cur.y_]; // distance of curPt from start
for (const auto& move : MOVES) {
Point next = Point{ cur.x_ + move.first, cur.y_ + move.second };
if (canMove(next) && mapDist[next.x_][next.y_] == numeric_limits<int>::max()) { // not yet visited
mapParent[next.x_][next.y_] = cur;
mapDist[next.x_][next.y_] = curDist + 1;
q.push_front(next);
}
}
}
// read out for all pairs (start, end) the distance from start to end
// backtrack the path as well (total memory nonsense of course, but
// for simplicity to later print the path)
for (int end = 0; end < destCt; ++end) {
Point destPt = destinations_[end];
destDist_[start][end] = mapDist[destPt.x_][destPt.y_];
if (mapDist[destPt.x_][destPt.y_] == numeric_limits<int>::max()) {
return false; // can't reach this point, no solution
}
list<Point> path;
while (destPt != startPt) {
path.push_front(destPt);
destPt = mapParent[destPt.x_][destPt.y_];
}
destPath_[start][end] = move(path);
}
}
}
inline bool canMove(const Point& pt) const {
return pt.x_ >= 0 && pt.y_ >= 0 && pt.x_ < m_ && pt.y_ < n_ && field_[pt.y_ * (m_+1) + pt.x_] != '#';
};
// parse string field
void parseField() {
m_ = 1;
n_ = 0;
destinations_.clear();
destinations_.push_back({1,1}); // start
Point end{ 1, 1 };
int pos = 0;
int j = 0;
while (pos < field_.length()) {
int i = 0;
while (pos < field_.length() && field_[pos] != '\n') { // only \n, no \n\r etc...
switch (toLower(field_[pos])) {
case 'c': // cookie
destinations_.push_back({ i, j });
break;
case 's':
destinations_[0] = { i, j };
break;
case 'f':
end = { i, j };
break;
}
pos++;
i++;
}
assert(m_ == 1  m_ == i); // each j must contain same # of iums
m_ = i;
n_++;
j++;
pos++;
}
destinations_.push_back(end); // end is the last destination
assert(destinations_.front().x_ >= 0 && destinations_.back().x_ >= 0); // start and end must be set
}
};
int main()
{
CockieSolver cs(
"S C \n"
"### ######### \n"
" C # \n"
" # \n"
" C C## # F \n"
"## #### # \n"
" C # C \n"
" ##### # \n"
" C \n");
cs.solveAndPrint();
}

Chris
September 11, 2017 Hum, it's travelling salesman: shortest path to collect all cheese cakes.
1) find all cakes: O(n*m)
2) do all pair shortest path among all pairs of {cheese cakes, start, end}.
3) find optimal route (np hard problem): you can try all routes which is O(n!) or accept an error an do linear optimization (or solve it in P and prove p=np and get famous and watch the world turn upside down)
There are two problems to solve here, the pairs shortest path wich demands for a quadratic amount of shortest path, so the shortest path algos it self should be fast. The graph is sparse and directed, so with classical algos the best is dual source bfs, since it's a map, a* will outperform it if the blocking fields do not introduce a complex maze...
The other is traveling salesman, where the best is linear optimization, e.g. simplex. Focus on the subproblem you feel most comfortable in the 40 Minutes and mention the other briefly or does anyone code both well in 20 min?
@Alex, thanks, you're right the whole graph strategy code is pretty much crap ;( I wanted to do everythin in once, but that outcome is wrong. My first idea, that works, is to remove all cycles first. Then add flow from forward and cross edges to existing (shortest) path. I should stick with it. I still don't think that's the most efficient way to solve it in the graph, but the impl above was just wrong. need to remove it
 Chris September 08, 2017@tryingtolearn: an other thing, the offset is tempting but you need to take into account the following:
if you create a sum over k element, and each element is original value + offset, you have k offsets in the sum. if a+b+c=d, then sum(a+o,b+o,c+o) = d+3*o, but you would offset d only by one o (as it is in the array). For the dp to work, you either assume positive numbers or, create a dp range into the negative and positive (which actually should be possible)
@trying to learn: indeed there is a dp solution to calculate subsets that sum up to a target value (a pseudo polynomial solution). One can modify it for this problem, one can even get all subsets for a given target, but at this point you need to choose one or none of this subsets and repeat for all choices and all following choices to build the min. It's codeable, certainly, but it's ugly, more efficient then the other solutions here, but still not nice. There might be a more elegant step from the sumproblem to deciding which subsets to use, but I haven't really found that. It looks like a contest question, so there might be some hacky dp that works with special conditions, which are not really given here (maybe elements are unique, etc..)
btw. for the perfect sum dp, you do not need to sort  as for knapsack, no soting is required, because the question is only is the element in or not and basically what you do is iterate over the range of the desired sum and mark all sums that you can reach with (and without from previous iterations) that element. How ever, sort doesn't hurt, because it's an order of magnitude faster than the actual DP ;)
1. I need to find subsets that sum up to an element in the array, that is actually itself an NP complete problem. So I run this problem n times (for each element in the array). But it's worse, because I need the actual subsets which can be many (Teta(n!) .. e.g. assume 0,1,1,11,1,1,1,1)
2. of all this subsets I need to find out what the set of disjoint subsets is that has the most elements.
difficult, any hints, ideas?
A valid path of length n=4 would be 1,8,3,4 an other is 1,8,1,8 and yet an other is 1,8,3,8
I think there is no closed form, or at least no obvious one because from some fields you can do two moves, from others 3, so we might need to calculate using the recursion
rec(num, i) = rec(8, i1) + rec(6, i1) if i >= 0 and num = 1
rec(7, i1) + rec(9, i1) if i >= 0 and num = 2
etc..
0 if i < 0
result = sum(num, n1) for each 0 <= num < 9
instead of hardcoding the moves it might be neater to create a table of moves and use that one.
this recursion has a common subproblem, that is for each pair (num, i) the number of combinations. I can iteratively build it up in an O(n) with O(1) space or just insert memoization on the recursion using a HT (using O(n) memory)
 Chris September 04, 2017@Makarand: I had the same results except your last step. I think the last step is incorrect because at your second step b receives 689, and spends nothing, at the last step she receives only 204.
but neither can I conclude, how the get to the 482 flow from b to w as the question suggest as result. Especially not because b clearly receives money on all edges. I think there is an error with the question.
Further, if the graph is more complex, there will always be several options where you can reroute the cash flow, some will in the near future pay out to be better because it allows further reduction whereas the other path wouldn't allow this. Thus it seems there is no greedy approach to the problem.
Neither is clear if we can introduce new flows among nodes or if we can only work on existing edges. Then if an edge has only a positive flow, can we reverse it or would that be a new edge?
I would solve it calculating netflow in each node (sum of incomming  sum of outgoing). Then I would try to connect negative flow to positive flow such that the number of connections is minimal. I think that's a bipartite matching (graph) problem, I'm not fit enough with all the flow network algorithms to get a clear statement how to solve it optimally. Maybe someone could help here.
@Alex: agreed a better description would be useful.Anyway I started reading into the topic a bit and figured out it's a well known topic.
I read this survey, sis.uta.fi/cs/reports/pdf/A19983.pdf
or found knuths: cs.utsa.edu/~wagner/knuth/fasc4a.pdf enlighting paper ;)
Is the order fixed in which you have to visit the channels or can you pick the order?
what is the starting start and end channel used for? Does it mean if you start the channel "start" was running and channel "end" should be set after visiting all channels in that array?
I assume the order is fixed and we can create an array C [start, A[0], A[1], ... ,A[n1], end] that is the sequence I have to tune channels in.
I need seiling(log10(C[i]+1)) key hits to enter the number C[i]
I need ABS(C[i1]  C[i]) key hits to tune in using + or 
I need ABS(C[i2]  C[i]) + 1 key hits to tune in using back and + or 
I will have to use the sum of above min for each C[i] 0 < i < n key hits to tune in all assuming C[0] = start was tuned in when I started.
point update? do you mean element update?
fenwicktree, segmenttree if both queries have about same frequencies
an array with the calculated sum, if much more range queries occure than element ubdates
an array where I calculate the sum every time it is asked, if much more element updates happen than range queries
straight forward DP solution with O(n*m) space and time complexity.
the recursion is
max_rect(i, j) = min(max_rect(i1, j), max_rect(i, j1), max_rect(i1, j1)) if i > 0 and j > 0
0 if i < 0 or j < 0
#include <vector>
#include <algorithm>
#include <iostream>
using namespace std;
void findMaxSquare(const vector<vector<int>>& matrix)
{
int m = matrix.size();
int n = m > 0 ? matrix[0].size() : 0;
vector<vector<int>> dp(vector<vector<int>>(m+1, vector<int>(n+1, 0)));
int max_sq = 0;
int max_i = 1;
int max_j = 1;
for (int i = 0; i < m; ++i) {
for (int j = 0; j < n; j++) {
if (matrix[i][j] == 1) {
int cur = min(dp[i][j], min(dp[i][j + 1], dp[i + 1][j])) + 1;
dp[i + 1][j + 1] = cur;
if (cur > max_sq) {
max_sq = cur;
max_i = i  cur + 1; // from top
max_j = j  cur + 1; // from left
}
}
}
}
cout << "edge length: " << max_sq << ", from top: " << max_i << ", from left: " << max_j;
}
int main()
{
findMaxSquare(
{
{1,0,0,1,0,0},
{1,0,1,1,1,1},
{0,1,1,1,1,0},
{1,1,1,1,1,0},
{1,0,1,1,1,0},
}
);
}

Chris
August 31, 2017 @Alex: it's not uniform, assume n = 3: P(1/3) for left size to be 0,1,2 in first recursion
if it is 1, we are done with the three
O
/ \
O O
but with P(1/3) left size = 0 you still have two options:
O O
\ \
O O
\ /
O O
so 4 of the 5 possibile trees have P(1/2*1/3) and one has P(1/3)
that is not uniform. but I thought of this one, too, it's tempting...
I would differentiate the sizes of the array, if A and B are about the same size we can sort both and perform a merge like operation:
int j = 0;
for(int i = 0; i < a.size(); i++) {
while(j < b.size() && b[j] <= a[i]) j++;
result[i] = j;
}
If we can do radix sort, we end up with a linear algorithm.
If B forms an unsorted sequence of consecutive elements such that B[i]1 is contained in B except when B[i] = min(B), we can create a histogram in O(B) and query that in O(1) for a total space complexity of O(B + A)
I don't think there is a truly linear approach to this problem because finding one rank in an unsorted array is O(n). Here we kind of need to find m ranks, so sorting seems the best to optimize querying the rank multiple times, if no special constraints are given.
How ever the emphasis of "unsorted" in the question is interesting.
You are right (@emb, @NoOne) the question is clearly pointing to selecting a binary tree from all possible binary trees (given a number of nodes) and not generating a random sequence to generate a BST from.
I had to read how to do it in linear time and I think it's definitely very hard to do it in an interview. Especially if you need to prove the uniformity.
How ever there are two ideas that could help:
1) Makarand's code generates a *full* random binary tree. That is easier.
2) From a valid string with parenthesis you can generate a binary tree (grammatical method)
following up with 2 would lead to the following challanges:
1) if we generate a string of length 2*n with open and closing parenthesis, what is the probability to produce an open or closed parenthesis at position i. While this in the end is a relatively easy formula r*(k+r+2)/2*k(r + 1), where r is the number of not closed open parentesis and k the number of symbols yet to pick (k=2*ni), I wouldn't be able to draw that formula in an interview.
What I could do is, enumerate all possible strings and pick one. This is exponential of course. One can actually enumerate recursively and pick while enumerating, so at the end of the process one has a valid random string of parenthesis:
n = 0
random_parent_string = ''
def rec(prefix, r, k):
if k>r: rec(prefix + "(", r+1, k1)
if k>0 and r>0: rec(prefix + ")", r1, k1)
if k == 0:
n += 1
if(rand(n) == n1): #assume rand generates uniform [0, n)
random_parent_string = prefix; # 1st with P(1), 2nd with P(1/2), 3rd with P(1/3) ...
then generate the binary tree from this random parenthesis string.
 Chris August 31, 2017an alternative approach using an AST.
Where was this question from? Looks like from a contest.
/*
the productions of the expression are
RootExpression = {Term}*
Term = Variable  Expression
Expression = '(' Term ')'
Variable = [azAZ]
the following structs represent a syntax tree, I omitted
access levels to have shorter code
the application of the transformation can be optimized as
SRRS = S
SRSRSRS = SRS
RSS = RS
etc...
*/
#include <vector>
#include <algorithm>
#include <string>
#include <sstream>
#include <iostream>
using namespace std;
struct Term
{
Term() { }
virtual void simplify() { };
virtual void reverse() { };
virtual void print(ostream& os) = 0;
};
struct Variable : Term
{
char name_;
Variable(char name) : name_(name) { }
void print(ostream& os) override { os << name_; }
};
struct Expression : Term
{
vector<Term*> terms_;
// parses and creates expression structure
Expression(istream& is) {
char chr;
while (is.get(chr)) {
if (chr == '(') terms_.push_back(new Expression(is));
else if (chr == ')') break;
else terms_.push_back(new Variable(chr));
}
}
// reverses
void reverse() override {
std::reverse(terms_.begin(), terms_.end());
for (Term* t : terms_) t>reverse();
}
// simplifies
void simplify() override {
Expression *ex = dynamic_cast<Expression*>(*terms_.begin());
if (ex != nullptr) {
ex>simplify();
terms_.erase(terms_.begin());
terms_.insert(terms_.begin(), ex>terms_.begin(), ex>terms_.end());
}
for (Term* t : terms_) t>simplify();
}
// print
void print(ostream& os) override { os << "("; printSubExprs(os); os << ")"; }
void printSubExprs(ostream& os) { for (Term* t : terms_) t>print(os); }
};
struct RootExpression : Expression {
RootExpression(istream& is) : Expression(is) { }
void print(ostream& os) override { printSubExprs(os); }
};
void transformAndPrintExpression(const string& expression, const string& transformation)
{
RootExpression expr = RootExpression(stringstream(expression));
// optimize transformation string
bool reverse = false;
bool simplify = false;
bool simplifyReversed = false;
for (char t : transformation) {
if (t == 'S') {
simplify = !reverse;
simplifyReversed = reverse;
} else if (t == 'R') {
reverse = !reverse;
}
}
// apply
if (simplify) {
expr.simplify();
}
if (simplifyReversed) {
expr.reverse();
reverse = !reverse;
expr.simplify();
}
if (reverse) {
expr.reverse();
}
expr.print(cout);
cout << endl;
}
int main()
{
transformAndPrintExpression("(AB)(C(DE))", "SS"); // AB(C(DE))
transformAndPrintExpression("(((AB)C)D)E", "SS"); // ABCDE
transformAndPrintExpression("(((AB)C)D)E", "SR"); // EDCBA
transformAndPrintExpression("(AB)C((DE)F)", "SRS"); // FEDCBA
transformAndPrintExpression("(AB(CD((EF)G)H))(IJ)", "SRS"); // JI(H(GFE)DC)BA
transformAndPrintExpression("(A(B(C)D)E(FGH(IJ)))", "SRRSSRSSRRRSSRS"); // JIHFGE(D(C)B)A
transformAndPrintExpression("(A(B(C)D)E(FGH(IJ)))", "SRS"); // JIHFGE(D(C)B)A
transformAndPrintExpression("(A(BC(D(E((Z)K(L)))F))GH(IJK))", "S"); // A(BC(D(E(ZK(L)))F))GH(IJK)/S
}

Chris
August 29, 2017 I'd do a BFS to do topological sort and cycle detection. When I detect a cycle I delete the whole BFS tree I'm in and mark the modules as not buildable. When ever the BFS reaches a not buildable module it does the same as when it had entered the cycle directly (because it wouldn't further explore, I have the buildable flag)
Why ever the samples don't mention the last dependency, I print it as well.
e.g. master > tensorflow > numphy > ubuntu
@majia: does FB meanwhile use TF?
#include <unordered_map>
#include <stack>
#include <vector>
#include <string>
#include <algorithm>
#include <iostream>
using namespace std;
struct Module {
string module_; // name of the module
Module* parent_ = nullptr; // parent from DFS traversal tree
bool buildable_ = true; // can be built
size_t enter_ = 0; // DFS enter
size_t leave_ = 0; // DFS leave
vector<Module*> adj_; // dependencies
};
struct DFSStackItem {
Module* module_;
bool visited_;
Module* parent_;
};
void printBuildOrder(const vector<pair<string, string>>& dependencies)
{
// 1. create adjacency list rep for graph
unordered_map<string, Module> moduleDep;
for (auto& dep : dependencies) {
moduleDep[dep.first].module_ = dep.first;
moduleDep[dep.second].module_ = dep.second;
moduleDep[dep.first].adj_.push_back(&moduleDep[dep.second]);
}
// 2. perform dfs from each item, store departure time
size_t time = 1;
for (auto moduleIt = moduleDep.begin(); moduleIt != moduleDep.end(); ++moduleIt) {
if (moduleIt>second.enter_ != 0) continue;
stack<DFSStackItem> stk({ {&moduleIt>second, false, nullptr} });
while (!stk.empty()) {
auto u = stk.top().module_;
if (stk.top().visited_) {
stk.pop();
if (u>leave_ == 0) {
u>leave_ = time++;
}
} else {
if (!u>buildable_ 
(u>enter_ > 0 && u>leave_ == 0)) {
// cycle or not buildable
u>buildable_ = false;
u = stk.top().parent_;
stk.pop();
while (u != nullptr) {
u>buildable_ = false;
u = u>parent_;
}
} else {
stk.top().visited_ = true; // entered
u>parent_ = stk.top().parent_; // visited from
u>enter_ = time++;
for (Module* v : u>adj_) {
stk.push(DFSStackItem{ v, false, u });
}
}
}
}
}
// copy buildable Module into modules array
vector<Module*> modules;
for (auto& m : moduleDep) {
if (m.second.buildable_) {
modules.push_back(&m.second);
}
}
// topological sort
sort(modules.begin(), modules.end(),
[](Module* a, Module* b) { return a>leave_ > b>leave_; });
// print
for (Module* m : modules) {
cout << m>module_ << " ";
}
cout << endl;
}
int main()
{
printBuildOrder({ { "master", "ubuntu" },{ "numpy", "master" },{ "tensorflow", "numpy" } });
printBuildOrder({ { "python", "numpy" },{ "numpy", "python" },{ "tensorflow", "ubuntu" } });
printBuildOrder({ { "b", "c" },{ "c", "d" },{ "a", "b" },{ "d", "e" },{ "e","c" },{ "f", "g" } });
}

Chris
August 28, 2017 if it needs to be recursive, the best I can come up is:
const int DIRECTIONS[][2] { {1,0}, {0,1}, {1,0}, {0,1} };
void printMatrixCircularRec(const vector<vector<int>>& matrix, int x, int y, int x2, int y2)
{
if (y > y2  x > x2) return;
int dir = 0, cx = x, cy = y;
while (y <= y2 && x <= x2) {
cout << matrix[cy][cx] << " "; // a bit cheap
if (cx + DIRECTIONS[dir][0] < x
 cx + DIRECTIONS[dir][0] > x2
 cy + DIRECTIONS[dir][1] < y
 cy + DIRECTIONS[dir][1] > y2) {
dir++;
if (DIRECTIONS[dir % 4][0] > 0) x++;
if (DIRECTIONS[dir % 4][0] < 0) x2;
if (DIRECTIONS[dir % 4][1] > 0) y++;
if (DIRECTIONS[dir % 4][1] < 0) y2;
if (dir == 4) break;
}
cx += DIRECTIONS[dir][0];
cy += DIRECTIONS[dir][1];
}
printMatrixCircularRec(matrix, x, y, x2, y2);
}
void printMatrixCircular(const vector<vector<int>>& matrix)
{
if (matrix.size() == 0  matrix[0][0] == 0) return;
printMatrixCircularRec(matrix, 0, 0, matrix[0].size()  1, matrix.size()  1);
cout << endl;
}

Chris
August 26, 2017 Why would one union two sorted arrays in memory utilizing multiple core's (or even worse one core with multiple threads) to achieve that? The bottle neck is clearly memory and not CPU.
Did he tell the two sets were on different machines? Did he potentially want to start a discussion about when multi threading makes sense (in parctice I've seen lots and lots of multithreading approaches due to a lack of understanding on overlapped I/O and other concepts)?
Especially in Java, I can't imagine doing anything good in multithreading that task. A different thing is multiple arrays on multiple machines. That's more interesting.
@tnutty2k8: the getMinCost function is extremely inefficient. As it develops all possible spanning trees for one search. Doing min of all DFS trees is not a good approach for shortest path. Further your visited dictionaries v key is a string with c to string and r to string. Thus 11 concatenate 1 = 1 concatenate 11 will both lead to 111 but it's really not the same field...
suggested and fun reading on BFS:
geeksforgeeks.org/breadthfirsttraversalforagraph/
more fun:
youtube.com/watch?v=sCYnVzuh4
The recursion is:
dp(i) = max[dp(i2) + arrary[i], if i >= 0
dp(i1) + array[i], if i >= 0
0] if i < 0
result = max(dp(len(array)1), dp(len(array)2)
can implement it using memoization or construct a dp array in an iterative
manner, and with constant space, optimizing further (because the recursion
only access last elements)
#include <vector>
#include <algorithm>
#include <iostream>
using namespace std;
int max_subsequence_sum_special(const vector<int>& arr)
{
vector<int> dp(arr.size() + 2, 0);
for (size_t i = 0; i < arr.size(); ++i) {
dp[i + 2] = max(dp[i] + arr[i], dp[i + 1] + arr[i]);
}
return max(dp[arr.size()+1], dp[arr.size()]);
}
int max_subsequence_sum_special_cs(const vector<int>& arr)
{
int dpi = 0;
int dpi1 = 0;
for (size_t i = 0; i < arr.size(); ++i) {
int temp_dpi = dpi1;
dpi1 = max(dpi1 + arr[i], dpi + arr[i]);
dpi = temp_dpi;
}
return max(dpi1, dpi);
}

Chris
August 25, 2017 1. scan the matrix to find the destinations. Store them in a list with (n, x, y)
2. sort that list so n is ascending
3. find shortest path from list[0]x/y to list[1]x/y, then from list[1]x/y to list [2]x/y etc..
4. you can do all sort of shortest path that work on a map. Again A* will be the best here if your matrix is large and the destinations you need to travel are reasonable far from each other. Alternatives are BFS, dual source BFS. But please don't use Dijkstra SP if you don't plan to add heuristics, Dijkstra SP is typically used when you have arbitrary positive distances between nodes in the graph. That's not the case here.
Treat it as a graph problem. start node is the start word. adjacent words are words where one of the characters is changed and the result is in the dictionary. Then you can perform a BFS on it to find the shortest path from start to end.
Optimization potential exists:
 dual source BFS
 Accelerate search with a heuristic (A*, guess the remaining distance as the number of characters to change to the end  which is never an overestimation, so you'll find the shortest path)
 Find adjacents: put the dictionary into a Trie where you can navigate the Nodes inside, so you can change characters at single positions (this wins on very large alphabets in practice and loose on reasonable small alphabets due to cache advantages of Hashtable)
for this simple questions, make sure all border cases, overflows, etc. are covered.
as well, maybe define the allowed token/string exactly (e.g. using regexp)
that would be "\s*0x[09azAZ]+\s*"
#include <string>
#include <iostream>
#include <locale>
using namespace std;
int hexCharToInt(char c)
{
if (c >= '0' && c <= '9') return c  '0';
if (c >= 'A' && c <= 'F') return 10 + c  'A';
if (c >= 'a' && c <= 'f') return 10 + c  'a';
throw "unexpected character";
}
unsigned long hex2int(const string& str)
{
size_t i = 0;
while (i < str.size() && isspace(str[i])) i++; // trailing spaces
if (i == str.size()) throw "unexpected end of string"; // string must not consist of only whitespaces
if (str[i++] != '0') throw "unexpected character"; // a 0x must initiate the hex string
if (i == str.size()) throw "unexpected end of string"; // 0 is not a valid hex string
if (str[i++] != 'x') throw "unexpected character"; // 0x to initiate hex string
if (i == str.size()) throw "unexpected end of string"; // string must not end now
if (isspace(str[i])) throw "unexpected character"; // no space allowed after "0x"
unsigned long value = 0;
while(i < str.size() && !isspace(str[i])) {
unsigned long or = value;
value <<= 4;
if (value >> 4 != or) throw "overflow";
value = hexCharToInt(str[i]);
i++;
}
while (i < str.size() && isspace(str[i])) i++; // trailing spaces, if any
if (i < str.size()  1) throw "unexpected character"; // no other characters expected
return value;
}
int main()
{
cout << hex2int("0x0") << endl;
cout << hex2int(" 0x80") << endl;
cout << hex2int(" 0xff ") << endl;
cout << hex2int(" 0xffffffff ") << endl;
cout << hex2int(" 0x1fffffff1 ") << endl; // should throw overflow exception
cout << hex2int(" 1fffffff1 ") << endl; // should throw unexpected char exception
}

Chris
August 23, 2017 I guess I would start with an O(n^2*p) version (n: #objects, p: #properties): compare for each pair all properties and output if at least one property equals
Alternative, one can build indexes over all objects and all properties (e.g. in a hash table). Then it's finding the sets of objects equal to a single property and union this sets for all properties. That is worst case O(p*n), but probably much better in average (depends on the number of objects that have equality)
how about looking at it as a graph:
 the nodes are connected if consecutive(e.g. if for node with value 3, if there is a
value 2 and/or 4 those would be connected to 3 with an edge)
 then it would be finding the largest connected component in the graph.
1. run through the graph and put all nodes into a HT.
2. loop over all the numbers in the array. If the number has been visited before,
repeat 2 with next number. Otherwise push that number onto a stack and find
component size using DFS, while marking all visited numbers.
3. max on component size
4. on the maxed component, walk to the min value, and add to result from there
I assume no value occurs multiple times, it would be easy to solve with
duplicates using a ht instead of a set where the value would be key and the
frequency value.
requires O(n) extra space, time complexity of O(n)
/*
#include <vector>
#include <stack>
#include <unordered_set>
#include <algorithm>
#include <iostream>
#include <limits>
using namespace std;
void print_longest_consecutive_sequence(const vector<int>& arr)
{
if (arr.size() == 0) return;
unordered_set<int> elements(arr.begin(), arr.end()); // O(n)
unordered_set<int> visited;
int max_component_size = 0;
int max_component_minElement = 0;
for (int e : elements) { // O(n)
if (visited.find(e) != visited.end()) continue; // compensate from 1)
stack<int> st({e});
int component_size = 0;
int min_element = numeric_limits<int>::max();
while (!st.empty()) { // O(n) for each loop, penaltize 1)
int u = st.top(); st.pop();
min_element = min(min_element, u);
visited.insert(u);
component_size++;
for (int adj : { u  1, u + 1 }) {
if (elements.find(adj) != elements.end()
&& visited.find(adj) == visited.end()) {
st.push(adj);
}
}
}
if (max_component_size < component_size) {
max_component_size = component_size;
max_component_minElement = min_element;
}
}
// at most O(n)
int e = max_component_minElement;
while (true) {
cout << e;
auto it = elements.find(e + 1);
if (it == elements.end()) break;
cout << ", ";
e = *it;
}
}
int main()
{
print_longest_consecutive_sequence({100, 4, 200, 1, 3, 2});
}

Chris
August 22, 2017 /*
an O(n^2) solution based on [en.wikipedia.org/wiki/Longest_common_substring_problem]:
the summarized idea is that the longest common sub string of two strings a, b is as well the longest
common suffix(LCS) of ALL prefixes of a and b.
the longest common suffix of two strings a, b is :
LCS(a, b, i, j) = LCS(a, b, i  1, j  1) + 1 if a[i] == b[j]
LCS(a, b, len(a), len(b))
the longest common substring is therefore:
max(LCS(a, b, k, l) for all 1 < k <= len(a), and all 1 < l <= len(b)
if a = b, the longest common substring would actually be a.
But I assume the question asks for all sub strings that are not prefixes or suffixes of a,
assuming "banana" is a suffix and prefix and substring of "banana".
therefore the valid prefixes must not start at the same index in the given string:
max(LCS(a, a, k, l)) for all 1 < k <= len(a) and all 1 < l <= len(b) where k != l
*/
string maxRepeatingSubstring(const string& str)
{
// that is the max common substring where start and end isn't equal
size_t n = str.length();
vector<vector<size_t>> dp(n + 1, vector<size_t>(n + 1, 0));
size_t max_start = 0;
size_t max_len = 0;
for (size_t i = 0; i < n; ++i) {
for (size_t j = 0; j < n; ++j) {
if (i == j) continue; // that would be the same start
if (str[i] == str[j]) {
size_t cur_len = dp[i][j] + 1;
dp[i + 1][j + 1] = cur_len;
if (cur_len > max_len) {
max_start = i  cur_len + 1;
max_len = cur_len;
}
}
}
}
if (max_len > 0) return str.substr(max_start, max_len);
else return string();
}
int main()
{
cout << maxRepeatingSubstring("banana") << endl;
cout << maxRepeatingSubstring("aaa") << endl;
}

Chris
August 22, 2017 extract all suffixes, sort them and check suffix[i] and suffix [i+1]... I can imagine that's something one can develop in an interview.

for banana:
suffixes[]= {anana, nana, ana, na, a}, sorted: {anana, ana, a, nana, na}
anana > ana: 3
ana > a: 1
nana > na: 2
tehre are:

to create suffixes O(n)
to sort suffixes O(n*n*lg(n))
to compare suffixes (O(n^2))
total: O(n^2*lg(n))
The recursive formula is good for small and constant k. Most recursion can easily be fixed for large k (nk = constant and small). Since the binomial coefficient grows exponential if k is for example n/2 all recursive solutions are of exponential time.
To solve this, there is a naive pseudopolynomial time algo with an upper bound O(target * n * k * k). It's much better than the recursive solutions which are exponential and it only works if the input is positive and no duplicates occur which was the problem statement.
It finds one solution if at least one exists.It works as follows:
It tracks all seen sums but at most one per number of elements and sum (because 4+2 = 6 and 5+1 = 6 both have two elements, there is no need to consider them independently in the future).
unordered_set<int> anySumOfK_poly(const vector<int>& arr, int target, int k)
{
vector<vector<unordered_set<int>>> dp(target + 1, vector<unordered_set<int>>(k));
for (int a : arr) {
for (int i = 0; i + a <= target; i++) {
for (int l = 0; l < k  1; l++) {
if (!dp[i][l].empty() && dp[i + a][l + 1].empty() && dp[i][l].find(a) == dp[i][l].end()) { // only one solution
dp[i + a][l + 1] = dp[i][l]; // copy (at most k elements)
dp[i + a][l + 1].insert(a); // add a
if (l + 2 == k && i + a == target) return dp[i + a][l + 1]; // found a solution
}
}
}
dp[a][0] = unordered_set<int>{ a };
}
return unordered_set<int>();
}

Chris
August 19, 2017 Can you please restate the problem:
Do you want to know, if there is a subtree D in the tree T that has value x in the root of D and value y somewhere in D. If you write binary tree you assume no BST (no order)?
In this case something like:
def find(root, values, i):
if root is None: return False
if root.value == values[i]: i = i  1
if i < 0: return True
return find(root.left, v, i) or find(root.right, v, i)
def find_if_descendant(root, x, y):
return find(root, [x, y], 1)

Chris
August 17, 2017 your matrix looks right, but it has overheads:
 at recursion [0][0] you create [0][1], which will create [0][1], and [1][1] (among others)
 but at [0][0] you create as well [1][0], which leads to create [1][1] (the second time)
you have elements created multiple times, I guess if you count your LList instances it will be 9 instead of 6
Rep
Rep
Repjudydschultz1234, Android test engineer at Eterno Infotech Pvt Ltd
Spent a weekend researching weebles in Nigeria. My current pet project is developing strategies for how to break someone's ...
RepAmber Van is the top rated company that offers friendly and professional removals services.Our featured services includes domestic moves ...
RepJames Gulledge, Dev Lead at ADP
Get you driving in comfort and class in Atlanta! Prestige Luxury Rentals offer a wide selection of premium vehicles Our ...
Open Chat in New Window
@PenChief: I think that's the best solution, as well with stopping when value doesn't toggle anymore. I was surprised he accepted the parent pointer in the Node. How ever it's easy enough to fix that.
 Chris September 20, 2017