Chris
BAN USERSWE
 0of 0 votes
AnswersA new species has been discovered and we are trying to understand their form of communication. So, we start by identifying their alphabet. Tough Mr. Brown managed to steal a vocabulary that has words in it in ascending order and a team already identified the letters. So, we have now ordered words as sequences of numbers, print every letter once in ascending order.
 Chris in United States
[3,2]
[4,3,2]
[4,1,1]
[1, 2]
[1, 1]
alphabet: [3, 4, 2, 1] Report Duplicate  Flag  PURGE
Software Engineer Data Structures
like this
#include <algorithm>
#include <iostream>
#include <string>
#include <utility>
#include <unordered_map>
#include <vector>
using namespace std;
vector<string> topNVideos(const vector<pair<string, unsigned int>>& videos,
size_t topN) {
// TODO: topN is a constant 10 in the given problem, this method only behaves
// nice if topN is small.
unordered_map<string, unsigned int> video_count;
vector<pair<string, unsigned int>> result_count;
for (const auto& v : videos) {
video_count[v.first] += v.second; // TODO: ensure no overflow
// notice here as well, the string as key, which makes the thing not O(n)
// but O(n*avg_string_length(video_title)), probabily in practice we would
// work with an integer e.g. a reference to the actual title ...
// As well notice the copying around of string's which is not nice
// maybe we would use somethign like std::string_view in practice with C++ 17.
}
for (const auto& vc : video_count) {
// insertion sort for small and constant topN ok, makes that thing
// O(N*10) which is O(N). There are better ways to cut off topN.
// (e.g. quicksorts partition, especially if topN doesn't need to be sorted)
result_count.push_back(vc);
for (size_t i = result_count.size()  1; i > 0; i) {
if (result_count[i].second <= result_count[i  1].second) break;
swap(result_count[i], result_count[i  1]);
}
if(result_count.size() > topN) {
result_count.pop_back();
}
}
vector<string> result(result_count.size());
transform(result_count.begin(), result_count.end(), result.begin(),
[](const pair<string, unsigned int>& vcount) > string { return vcount.first; });
return result;
}
int main()
{
vector<pair<string, unsigned int>> movies = {
{"Rambo", 1},
{"Indiana Jones", 2},
{"Lethal Weapon", 2},
{"Indiana Jones", 5},
{"Rambo", 1},
{"Lethal Weapon", 4},
{"Casablanca", 7},
{"Star Wars", 9},
{"The Godfather", 15},
{"Wallstreet", 3}};
for(const string& video : topNVideos(movies, 4)) { // top 4 instead of top 10
cout << video << endl;
}
}

Chris
September 04, 2018 The plain recursion is obviously O(2^n) because if the pattern is something like aa and the string is like aaaa at every position you have two valid options, advance in the pattern or recurse it. So, the naive is exponential. Whether you start at the end or beginning doesn't make a difference.Other examples are ababaa with aba... O(m*n) using memoization isn't that bad.
Better will be preprocessing the generator string, finding out which suffix is a common prefix, so you can optimize backtracking. KnuthMorrisPratt O(n+m) style algorithm... but in 30 Minutes, assuming you first want to be safe by doing some brute force is tough.
took @Gabriels code, made it purely recursive and added missing case.
If you start thinking recursive it's easy:
 for the +: must advance the pattern, but two options with the string, advance or not
 for the *: three options: advance pattern and string, advance only pattern, advance only string
 if not + and not *: if they match advance pattern and string
 termination: if pattern reached the end but string didn't (no match), if both reached the end (match)
do not forget all the border cases, so there is no buffer overrun
#include <iostream>
#include <string>
using namespace std;
using iter = string::iterator;
bool reg(iter is, iter ie, iter rs, iter re)
{
if (is == ie && rs == re) return true;
if (rs == re) return false;
if (*rs == '+') {
if (reg(is, ie, rs + 1, re)) return true; // advance pattern but not input (empty character)
return (is != ie) && reg(is + 1, ie, rs + 1, re); // advance input and pattern, if input not at the end
}
if (*rs == '*') {
if (reg(is, ie, rs + 1, re)) return true; // advance pattern but not input (empty sequence)
return (is != ie) && (reg(is + 1, ie, rs, re)  reg(is + 1, ie, rs + 1, re)); // advance input but not patter OR advance input and pattern
}
return (is != ie) && (*is == *rs) && reg(is + 1, ie, rs + 1, re); // if they match recurse
}
void test(string s, string regx, bool match) {
cout << "TEST CASE input: '" << s << "' expression '" << regx << "'"
<< (reg(s.begin(), s.end(), regx.begin(), regx.end()) == match ? " PASSED" : " FAILED")
<< endl;
}
int main() {
test("baaabab", "ba*a++", true);
test("baaabab", "ba*a+", true);
test("baaabab", "a*ab", false);
test("", "", true);
test("1", "", false);
test("", "+", true);
test(" ", "+", true);
test(" ", "+*", true);
test(" ", "+++", true);
test(" ", "+**", true);
test("", "+**", true);
test("", "+**+", true);
test("a", "+**+", true);
test("zzaazzuu", "+**+", true);
test("zzaazzuup", "+*p*+", true);
test("zzaazzuup", "+a**+", false);
test("aaaaaa", "+**", true);
test("aaaabbba", "+**", false);
test("aabc", "a+++bc", true);
return 0;
}
so what's space and runtime complexity (n is length string, m is length pattern)
space is the stack, and the stack depth will never exceed O(m + n) (actually O(max(n,m)) because it will always follow a branch to the end where it fails latest or succeeds.
worst case time is trickier:
 theoretically it branches of at the * case three times O(3 ^ (n + m))
 practically string: zzzq, pattern: ***z
how to improve is an interesting discussion here:
 preprocessing the pattern to only test the suffix if prefixes are common: this is not trivial with the * and +
 creating an automaton, in it's core the same as above
 no of them is trivial, there is a simpler dp way hidden in the recursion so one can prune some branches for example, if the pattern position and string position has already been senn, one can store the result already obtained. That way it should come down to O(m*n) when adding memoization.
I am not sure I fully understand your approach. What I understand:
 find components of the graph (connected islands of sensors that are within reach)
 if such a components max y and min y is beyond both corridor lines it will block the way through
Finding those components can be done in O(n^2) naively, there is an optimization to this that does not improve worst case, but is better with lots of sensors. The question is, how to optimize looking at n1 sensors to find adjacent sensors of a given sensor.
 rasterize the sensor positions and maintain a list of sensors located in a raster square
 for a given sensor, check the surrounding squares, if raster square is the sensor reach (diameter) you need to check 9 squares, accessing those sensors is O(1) e.g. with a hashtable on raster x,y... if all sensors are located in 9 or fewer squares, you do not win anything, if they are normally distributed over thousands of squares, you will end up in O(n*n/squarecount)
Cool question, thanks for sharing.
1. To only add a single person, do one pass and pick the largest gap, keep start index and gap. O(n) time O(1) space
2. To add multiple, pass once and build a max heap (prio. queue) with gaps and position, ordered by gap, when adding a person remove top element, calculate position where to add, if left or right gap exists, add those to the heap.
O(n) space, initially takes O(n*lg(n)) to form heap, later it takes O(lg(n)) to add a person
The question asks for O(1) access to a random element, you are allowed to preprocess. To pick a random element n O(1) you need to know how many quotes there are, create a random number and access in O(1). If the quotes do not change, you traverse once and store in a second file (the index file) the offset to the quote. You can read the index into memory or random access it (fixed record length). If you do not want to self create an index use a dbms. If you need to maintain the index after adding/deleting quotes, use a btree. Of the stuff doesn't fit on a single machine, shard.
 Chris March 27, 2018// let's assume only 26 characters, all lower case
// a >= b
bool is_greater_or_equal(const string& a, const string& b, const vector<int>& rank) {
size_t l = min(a.size(), b.size());
for (size_t i = 0; i < l; ++i) {
int ao = rank[a[i]  'a'];
int bo = rank[b[i]  'a'];
if (ao != bo) return ao > bo;
}
return a.size() >= b.size();
}
int main()
{
vector<string> words{"cc", "cb", "bb", "ac" };
vector<char> ordering{ 'c', 'b', 'a' };
vector<int> rank(26, 1);
int i = 0;
for (char c : ordering) rank[c  'a'] = i++;
for (size_t i = 1; i < words.size(); ++i) {
if (!is_greater_or_equal(words[i], words[i  1], rank)) {
cout << "not ordered";
return 1;
}
}
cout << "ordered";
return 0;

Chris
February 28, 2018 Theoretical all paths, which is NP hard and makes little sense. You can adapt A* (a star) which estimates the remaining distance e.g. by using euclidian distance. You just do not consider candidates in the queue that are longer (actual + estimate) than a certain value (e.g. two times the distance from your location to the airport)...
That way you could solve it more or less efficiently. A star is basically Dijkstra, but it adds the estimated remaining distance to the actual distance. And it will not end in your case with the first route but it will continue searching untill all paths got to the destination or exceeded the max distance you decide to search for.
The other problem, I assume we can assume the large file is chronological, so we might want to use binary search.
I would first define what a column in a tree is. I think a recursion can help:
#columns(root)=#columns(root.left)+length(root.value)+#columns(root.right),
#columns(empty tree)=0
The position at which to print node value is: (level of the node, #columns(root.left))
If I can't print at x,y positions I need a level order traversal to print line by line.
It is O(n) time and O(n) space (for the level order traversal and the position)
Note, there are better ways to layout the tree, if the tree has branches that can be kind of nested into each other.
Fb, Google wants a design of components, e.g. databases, load balancer, frontend server, Backend, Cache, etc. Then usually a sharding scheme, how do you deal with loads, server fails, etc... It's really versatile, depending on the question.
Think for example how would you store millions of pictures ala flicker, or how do you serve millions of news feeds for Facebook, etc. You will be the active part in this interview, so you will drive the conversation, if you know how to do a scalable database design, try to lead the discussion that way.
OO design might be a problem, as some of these are dogmatic, beauty contests  often not solving a problem  but then there are those situations where it's the right thing  but for YouTube, Facebook, flicker, Twitter etc. This are just not the problems they are faced with or faced with 99% of time.
Maybe ask your self, if I had to build Facebook, what would I care most, what's important. Then go and read/watch what they did  e.g. highscalability . Com.
Cheers and good luck
1. it's unclear why its relevant that push pushes to the head. Because if pop takes out all elements with the same probability, as soon as there is more than one element in it, the "list" acts like a perfect shuffler. What you get out is randomized, no order, nothing. So, you will need to sort in some way. That may be, you push the element and it's original position so you can reconstruct easy, or you sort the output.
2. There is no difference between 2 and 1. You see a shuffled array when you try to access the data. Where it has been shuffled is really not that much of a deal.
I sugest restating the question so it becomes clearer.
@Alex, two things:
1) I I defined it a round table, because I didn't like the special cases on the left and right end. so, 0 would be next to 6. First thing was a straight bench, came to mz mind, but a round one is more convenient with less corner cases and since that wasn't specified.
2) Any chance to PM? Maybe follow my profile link, if your interested.
pangram contains all letters of the alphabet Y. Palindrome reads the same forward and backwards.
Solution for palindrome is trivial, for every word find an inverse word or finding the inverse word missing the last character of the original word (even, odd palindrome) and minimize. If the words are in a dictionary it's O(n*m), n being the number of words, m being the average word length. How to imrpove: start with shorter words, needs some kind of order, use a trie (theoretically faster in constants, practically not due to cash misses)
For minimal pangram, the problem is a minimal cover set, I think it's called. It is known to be NPhard. A greedy approach exists, that is not optimal, one takes the word with the most characters not already taken. This way the problem get's O(Y*n*m), Y is the alphabet, thus constant (e.g. 26), n is the number of words and m is the average word length. If we preprocess the words in O(n*m) we can get the algorithm down to O(Y^2*n) thus to O(n) for constant alphabet.
Questions:
 is the input given as a flag array or array with taken seat positions, is it sorted?
 has the bench a start and an end or is it a circular bench around a circular table?
 Should it be optimzed to seat a single person or multiple people?
Assumption/Definition:
 start with an array of sorted, occupied seats, assume the array is sorted
 to avoid edge cases, assume it's a circular bench around a round table
 n: number of seats
Solution:
 create a binary heap that contains the min distance from a seating between two
taken seats
 when querying for a seat, use that heaps top element, remove it, create two new \
once (left and right) and reinsert it
 O(n) space, O(lg(n)) time per query, O(m*lg(m)) to prepare, assuming, m seats are
taken initially.
Additionals:
what if seats can get unoccupied again once they were occupied?
 in this case, it's better to use a tree instead of a heap, becaue the tree has an
order and it's easy to have a vector with treeitems so, I can access either
using the seat index to merge empty spaces if a person leaves.
 Besides this, it stais the same problem.
#include <vector>
#include <queue>
#include <iostream>
using namespace std;
class SeatFinder
{
struct QComp {
bool operator () (const pair<int, int>& a, const pair<int, int>& b) {
return a.second < b.second;
}
};
size_t n_;
priority_queue<pair<int, int>, vector<pair<int, int>>, QComp> heap_; // start, length
public:
SeatFinder(int seatCount, const vector<int>& takenSeats) : n_(seatCount) {
size_t m = takenSeats.size();
for (size_t i = 1; i <= takenSeats.size(); ++i) {
int start = takenSeats[i  1];
int end = takenSeats[i % m];
putSlot(start, end);
}
}
// returns next empty slot with highest distance to left and right
int getNextEmptySlot() {
if (heap_.empty()) {
putSlot(0, 0);
return 0;
}
if (heap_.top().second < 1) return 1; // returns 1, because there is no space left.
auto slot = heap_.top(); // start, length > start + length is the next taken seat, start is the previous taken seat, length must be >= 2
heap_.pop();
int seatIdx = (slot.first + 1 + slot.second / 2) % n_; // start + floor(length / 2) takes it to the middle
putSlot(slot.first, seatIdx);
putSlot(seatIdx, (slot.first + slot.second + 1) % n_);
return seatIdx;
}
private:
// closed interval [start, end] where start is the previous taken seat and end is the next taken seat
void putSlot(int start, int end) {
int length = (n_ + (end  1)  start) % n_;
heap_.push({ start, length });
}
};
int main()
{
SeatFinder sf(16, {1, 2, 3, 9});
for (int i = 0; i < 16; i++) {
cout << sf.getNextEmptySlot() << endl;
}
}

Chris
January 04, 2018 the question is totally unclear, the 2^32th fibonacci number is extremly large, a number with roughly 1 Billion digits. Even NoOnes refered closed form formula will never calculate it in O(1) unless n is considered to be a constant (2^32) which is a bit silly. In the other hand the question could be interpreted as well as telling for any number between 1 and 2^32 wheter it is a Fibonacci, which then is very easy, since there are 46 fibonacci numbers in this range and those can easily be precalculated.
 Chris January 02, 2018it's an easy question but not that simple:
 is the input an element in the tree or an arbitrary value, I assume an arbitrary value
 I assume the tree is balanced, so the time complexity of below code is O(lg(n)), otherwise O(n) is best. space complexity is here the same as time complexity unless you do the search by patching pointers temporarily.
The naive algo is inorder traversal, keeping the previous element and when the search value is passed, exploring the next element to see which value is closer. O(n)
More sophisticated is:
const Node* nearestElement(const Node* root, int value, const Node* nearest = nullptr)
{
if(root == null) return nearest;
if(root>value_ == value) return root;
if(nearest == nullptr  abs(root>value_  value) < abs(nearest>value_  value)) {
nearest = root;
}
if(root>value_ > value) {
return nearestElement(root>left_, value, nearest);
}
return nearestElement(root>right_, value, nearest);
}

Chris
December 26, 2017 Essentially the same question:
careercup.com/question?id=4509450021896192
Since interviewers gives milliseconds and might have a devices background, depending on allowed jitter, you might need to use RTC events, if that's easily available on your OS. You can craft this into the entioned algo.
Assumptions:
 There is no negative weight
 Multiple weights can occure as in the smaple given
 I assume first kweight means max the topk, according to summed weight
Approach:
 clearly, if the input was sorted according to weight, the highest is the one with all
balls (if no neg. weights)
 the next lower is when excluding the element with least weight
 the next lower is excluding the 2nd lowest weight
 the third is either excluding the 3rd lowest or the 2nd and lowest
The basic idea is pretty straight forward:
 sort input as described above
 maintain a priority queue with the highest combination on top
 When picking an element from that queue, calculate the next options, where it is
not straight forward to decide which will be the next higher one. Put all of them into
the queue, then next loop the highest is picked etc. That is similar to Dijkstra
shortest path. But in order to be efficient, I only calculate next elements when an
item gets picked and this needs a bith of thought, but I can create a tree, similar to
a Fenwick on the fly ...
When an item is picked there are at most two new options, one is to take the next
element, the other is to take the current element and the lowest possible
 For the time complexity, there is the issue we need to produce the results which is
O(n*k). Then there is the sort which is O(n*lg(n)) and the maintenance of the queue which
is O(k*lg(k)). The queue component is irrelevant since always smaller than the sort. Then
there are some copy operations inside the algo, which are executed at most 2k times, thus
resulting in O(k*n). Overall it is O(n*k + n*lg(n))  depending on the input one or the
other term dominiates.
Cool question, never saw this one, really nice! Although a bit hard to come up with the
"next element" algorithm in 30 minutes together with all the rest.
#include <string>
#include <vector>
#include <queue>
#include <iostream>
using namespace std;
using uint = unsigned int;
void print_top_k_subsets(const vector<pair<string, uint>>& input, uint k)
{
if (input.size() < 1  k < 1) return;
uint result_count = 0;
// sort
auto sinput(input); // sorted input, smallest weight first
sort(sinput.begin(), sinput.end(),
[](const pair<string, uint>& a, const pair<string, uint>& b) {
return a.second < b.second;
});
// queue with queitem having least loss on top
struct QueueItem {
int weight_loss;
int last_excluded;
int last_excludable;
vector<bool> excluded;
};
auto qcomp = [](const QueueItem& a, const QueueItem& b) {
return a.weight_loss > b.weight_loss;
};
priority_queue<QueueItem, vector<QueueItem>, decltype(qcomp)> q(qcomp);
q.push(QueueItem({0, 1, static_cast<int>(sinput.size()  1), vector<bool>(sinput.size(), false)}));
while (!q.empty() && result_count < k) {
QueueItem current = q.top(); // copies the vector, not so nice
q.pop();
// print result (should be put into a seperate function)
result_count++;
cout << result_count << "th: {";
int sum = 0;
bool first = false;
for (int i = 0; i < sinput.size(); ++i) {
if (!current.excluded[i]) {
sum += sinput[i].second;
cout << (first ? ", " : "") << sinput[i].first;
first = true;
}
}
cout << "} sum=" << sum << endl;
// get next potential candidates and put it into the queue
if (current.last_excluded > 0) {
QueueItem new_item(current);
new_item.weight_loss += sinput[0].second;
new_item.last_excludable = current.last_excluded  1;
new_item.last_excluded = 0;
new_item.excluded[0] = true;
q.push(new_item);
}
if (current.last_excluded < current.last_excludable) {
if (current.last_excluded >= 0) {
current.weight_loss = sinput[current.last_excluded].second;
current.excluded[current.last_excluded] = false;
}
current.last_excluded++;
current.weight_loss += sinput[current.last_excluded].second;
current.excluded[current.last_excluded] = true;
q.push(current);
}
}
}
int main()
{
cout << "sample 1" << endl << "" << endl;
print_top_k_subsets({ {"red", 4}, {"yellow", 1}, {"green", 2}, {"blue", 2} }, 5);
cout << "sample 2" << endl << "" << endl;
print_top_k_subsets({ { "red", 4 },{ "yellow", 1 },{ "green", 2 },{ "blue", 2 }, {"amber", 1}, {"black", 2}, {"white", 1} }, 15);

Chris
December 10, 2017 That's a systems design question, not coding, I assume.
 I define the thing I want the rolling statistics as fact (e.g. user id, hashtag, ...)
 Then I would split the time window into slots such a way that the timewindow / slots provides a reasonable update rate for the statistics while it keeps the number of slots small enough for a feasible solution.
 Now I have two things: a queue which represents a time slot. Each queue items holds a hash table which counts for the seen facts the occurrences (frequency)
 Further I keep a hashtable that will count the occurrences total, it will have the fact as key and a refrence to a treeitem as value. The treeitem has the frequency as key and the fact as value.
 If an item arrives, I will update the head of the queue or create a new queue item if a new slot is touched by the timestamp of the arriving item. I will as well increment the frequency of that item in my hashtable>treeitem structure
 I will check as well the oldest items in the queue and if they are older than the new item arrived, I will subtract the keyvalue pairs from my main statistic and throw away that queue item away
Now I can always query my tree for the topk keys and receive the the topk items.
This design has a few assumptions and weak points:
 it is optimized for a high query rate on top most k items.
 it is slightly wrong, because I add items as they arrive to the hashtable > treeitem structure, but remove them with a lower granularity of time per slot
 I only update the statistics if items arrive, assuming, items arrive always and if not, the statistics relatives don't change (which is true, but maybe top k should be an empty list if no items arrived longer than the moving window)
For high availability and fault tolerance of single servers:
 a number of 'frequency aggregatos' that receive the events and assign them to time slots (either synchronized time or fine grained ... the whole time discussion could fill a couple of minutes here). Here I could place a load balancer in front or if I control the clients assign different clients a different aggregator and fallback if primary for that client fails, etc..
 multiple 'redundant statistics runners' that will receive data from aggregator and sum up to final statistics as pointed out in prior section. The aggregator will translate thousand or hundereds of thouasands of requests to just one per time slot, which will be a list of keyvalue pairs it will forward to the main server. It might already cut away the long tail in order to minimize traffic... but that's another discussion because it's not that easy, one might need to consider reelevating longtail's that peak up or if very small time slots are considered, if a fact occurs exactly once in a time slot, it might need reelevation as well otherwise statistics might be wrong.
Now, I can query any one of the redundant statistics runner. How ever, I need to account for statistics runner that are temporarily overloaded or didn't receive all updates from the aggregators, because they will return a different and wrong statistics.
@cool guy
the hash table is more space effective than a trie due to pointers etc. and therefore a hashtable can be cash friendlier if short enough. If you use hashtable in the naive way, you will need to recalculate the hash for the string with every lookup, which is not so nice. If you write your own hashtable and hash function you can avoid this by reusing the hashvalue of k1 characters to calculate hash of k characters (expanding prefix). With a trie, you know if a prefix is not in the dict as soon as you expand a character c[k], with a hashtable you will have to expand every prefix to the max word size in order to make sure it's not in there. This makes the hash algorithm ineffective for some crafted sample input. The tries memory disadvantage can be improved using a patricia tree  sometimes compacted trie, but it will never be as compact in memory as a hashtable.
There was a question on LeetCode like this, and I tried to beat the fastest c++ implementation which used a hashtable approach. I tried using a trie and even a patricia tree implementation. It turned out, the time to create the trie dominated the advantage of the faster lookups because it was one dictionary per query, so, that's always a good thing to think about, how many queries per dictionary instance. An other thing I noticed there, the data in the testcases were just not advantageous for the trie implementation. Short dictionary (fits into cache) with relatively short words.
I could imagine a combination being good, create a sort of trie on substrings of words of fixed size, say 5 characters, so for short words it's like a single hashtable, long words that share a prefix, would behave like a trie etc... I haven't seen anybody doing that and haven't thought really about it, but it kind of jumped into my mind right now.
edited:
maybe I should have added, a ptrica trie can be more space effective than a hashtable for crafted input, but it's not the case for a regular western language.
1) calculate absolute times when a task get's added
2) push it into a priority queue with smallest absolute on top (binary heap)
3) have a runner that polls the queue once in a time interval (e.g. second) and checks if top task is to be executed, pop items from the heap that need to be executed.
4) now, the question is, how long does the run take? can I execute it synchronously or will this delay the scheduler? I could as well span into a thread pool to call run there.
a more sophisticated edge case is, if two items receive the same absolute time, is it required to maintain order of insertion when running? Might be relevant in some use cases. The mentioned edge case, is obvious. at 5 task b runs, at 10 task a runs.
runtime and space complexity is obvious.
@sachin323: You should pick the path with smallest p, where p is an element of each path from (0,{0..n1}) to (m1,x) where (c[i+1], {r[j+1], r[j1], r[j]} is the largest value). This sounds strange... surely possible, but what for? I see no connection to tallest horizon.
Anyway, if that's the task, I'd scan the matrix for minimal value that can be reached from any of the 3 left hand pixels. Then I know through which pixel the path goes. From there I'd expand right and left: O(n*m), no extra space, except to store the path.
I assume he wanted something else, like detecting a couture, like maybe an upper edge of a scanned page or so... but I can't see that in here.
anyway, sorry to hear it didn't work out this time.
It's hard to tell what his intentions were, but certainly questions are needed such as:
 multiple "directories" per user?
 multiple users per directory?
 if prefix matches, does it mean he has access to all sub. dirs.?
 can there be a revoke in a sub. Dir?
 how many grants / user typically, outliers?
 how many users, how many directories.
 how deep are those paths typically, where is the grant in this hirarchy
One approach can be, have an entity for a user, and a list of directory prefixes, load the user with grants and check if pathprefix is in there, maybe cache user. If there are thousands of grants per user but only few grants per directory, you may do the opposite on the directory, load per prefix the number of users.
Etc...
between a and b, I assume a half open interval [a, b)
question is the distribution, if it should be normal, the previous posts are probably right (needs a prove though). If we want uniform distribution things are different.
I could create a random function that creates a uniform random number in the range [0 and 2^n) by applying the given function to each bit. Then I could use this new function to create a random number in the range [0, 2^n) where n would be lg2(ba) + 1. If the random number exceeds ba, I try again, if not, I add it to a and return the result.
I figured a good approach is a maze runner that detects the outer contour of the rooms. Then it should zigzag through the interiour, similar like flood filling. It's going to be hard to prove it will really work in 30 minutes  as it is to implement it correct in 30 minutes.
 Chris December 03, 2017since it seems to be a contest question and I noticed people posting questions from live contests, I'd just advice to use a segment tree. The queries etc. should be easy with it. If you set a one and it was a zero, check if there is a segment on the left and right and merge accordingly.
 Chris December 03, 2017the question will be, random with what distribution?
I assume with a uniform distribution: each point in the final region has the same probability to be chosen. For a single rectangle this is straight forward:
x = rectangle.left + random(rectangle.width);
y = rectangle.top + random(rectangle.height);
For multiple rectangles that don't overlap, we can say, we choose the rectangle proportionally to it's area and once the rectangle is chosen, pick a point in there.
long long are_sum = 0;
for(Rectangle& r : rectangles) {
area_sum += r.width * r.height;
if(random(area_sum) < r.width * r.height) {
x = rectangle.left + random(rectangle.width);
y = rectangle.top + random(rectangle.height);
}
}
But if they overlap it's tricky. The easiest way I can think of is to union the new rectangle with the existing area and only consider the rectangles that are not already covered and apply method 2.
 Chris December 03, 2017Assumptions:
 the cell where the boat is starting has hight 1, the boat can float
 it takes one time unit to move let, right, up and down (Manhatten moves)
 in order to be able to float on a field the water level must be field hight + 1 when
I reach there. for example level is 2 when I start at field (0,0) I can go to
field (1,0) in one time unit if level of that field (1,0) is 2.
I'd solve it using a Dijkstras sort of shortest path. The cost to go to an adjacent field
is currentpathlength + min(1, fieldheight  currentpathlength). That is, the cost is defined by the field and not the weight of an edge. Thus I don't need the relaxation step from Dijkstra.
I could add heuristics that estimates remaining distance, to create an A* like thing.
#include <queue>
#include <vector>
#include <algorithm>
#include <iostream>
using namespace std;
int min_time_to_reach_lower_right(const vector<vector<int>>& matrix)
{
const pair<int, int> MOVES[] = { {1,0},{1,0},{0,1},{0,1} };
struct QueueItem {
int row, col, pathlen;
};
int m = matrix.size();
if (m == 0) return 0;
int n = matrix[0].size();
if (n == 0) return 0;
auto qcomp = [](const QueueItem& a, const QueueItem& b) { return a.pathlen > b.pathlen; };
vector<vector<bool>> explored(m, vector<bool>(n));
priority_queue<QueueItem, vector<QueueItem>, decltype(qcomp)> pq(qcomp);
pq.push({0, 0, 0});
explored[0][0] = true;
while (!pq.empty()) {
auto qi = pq.top();
pq.pop();
if (qi.row == n  1 && qi.col == m  1) return qi.pathlen;
for (auto& move : MOVES) {
int r = qi.row + move.first;
int c = qi.col + move.second;
if (r >= 0 && r < m && c >= 0 && c < n && !explored[r][c]) {
explored[r][c] = true;
pq.push({ r, c, max(qi.pathlen + 1, matrix[r][c] + 1) });
}
}
}
return 1; // that's actually an error, it should always be able to reach the end
}

Chris
December 03, 2017 Let's assume my interpretation is correct (see my other post), the solution is using a DFSlike approach. Its more efficient than BFS because I expect a route to be found starting on a early row and if there is a choice, the upper row wins if there is a path to the right through that element. So, if I run BFS it will use that very primitive heuristic to expand the upper routes first.
A BFS, opposed to that, will explore all routes simultaneously looking for a shortest path. So it will expand a lot of routes it doesn't need to if the desired route doesn't start from the very bottom. How ever, worst case scenario is the same, has the same bigO properties.
That would be O(m*n) time and O(m*n) space complexity.
#include <vector>
#include <stack>
#include <algorithm>
#include <iostream>
using namespace std;
// return the sequence of rows to be choosen if columns start at 0 and end at n1
vector<int> find_highest_path(const vector<vector<bool>>& matrix)
{
vector<int> path;
int m = matrix.size();
if (m == 0) return path;
int n = matrix[0].size();
if (n == 0) return path;
vector<vector<int>> parent(m, vector<int>(n, 1));
int lr = 1;
for (int i = 0; lr == 1 && i < m; ++i) {
stack<pair<int, int>> s;
if (!matrix[i][0]) continue;
s.push({ i, 0 });
while (!s.empty()) {
int r = s.top().first;
int c = s.top().second;
s.pop();
if (c == n  1) {
lr = r;
break;
}
for (int nr = min(r + 1, m  1); nr >= max(0, r  1); nr) {
if (matrix[nr][c + 1] && parent[nr][c + 1] == 1) {
parent[nr][c + 1] = r;
s.push({nr, c + 1});
}
}
}
}
// back track path
int c = n  1;
while (lr != 1) {
path.push_back(lr);
lr = parent[lr][c];
c;
}
reverse(path.begin(), path.end());
return path;
}
template<class T>
ostream& operator <<(ostream& os, const vector<T>& c)
{
os << "{";
bool first = true;
for (auto e : c) {
if (!first) os << "," << e;
else os << e;
first = false;
}
os << "}";
return os;
}
int main()
{
cout << find_highest_path(
{
{ 0,0,1,1,0,1,1,0,0,0,0,0 },
{ 1,1,0,0,1,0,1,1,1,1,0,1 },
{ 0,0,1,0,0,0,0,1,0,0,1,0 },
{ 1,1,1,1,0,0,1,1,0,1,0,0 },
{ 0,0,0,1,1,1,0,1,1,0,0,0 },
{ 0,0,1,0,0,1,0,0,0,1,1,1 },
{ 1,0,0,0,0,0,1,1,1,1,1,1 },
}) << endl; // should return 1,1,0,0,1,0,0,1,1,1,2,1
cout << find_highest_path(
{
{0,0,1,1,0,1,1,0,0,1,0,0},
{1,1,0,0,1,0,0,1,1,0,0,1},
{0,0,1,0,0,0,0,1,0,0,1,0},
{1,1,1,1,0,0,1,1,0,1,0,0},
{0,0,0,1,1,1,0,1,1,0,0,0},
{0,0,1,0,0,1,0,0,0,1,1,1},
{1,0,0,0,0,0,1,1,1,1,1,1},
}) << endl; // should return 1,1,2,3,4,4,3,3,4,3,2,1
}

Chris
December 03, 2017 Are you saying you want lines L which are paths from column 0 to column m1 in a matrix where adjacent pixels are m[c+1][r1], m[c+1][r], m[c+1][r+1] from pixel m[c][r] if that pixel is 1. I assume with tallest line you mean the line l of L that reaches the highest point, that is minimum r (row), if we assume upperleft is r=0, c=0? If there are multiple with the same minimum r, pick the one that has most points on that height.
 Chris December 03, 2017Opposed to exact content, one might want to have a comparison that is robust against typos, layout dependent differences (e.g. table layout vs. column layout), intended introduction of minor differences, etc.
To minimize false positives, the approaches previously described work fine. If you want to have a balance between false positives and false negatives, the following approach is be better suited:
1) strip html tags off, take the text, strip filling words (words with no content, like "and", "it", "a", ...) from these words, do word stemming (reading = read, etc...) of remaining words and then count the occurrence of each word, so you get two multidimensional vectors (k dimensions, for each word stem a dimension)
2) feed the two vectors into a comparison function (like cosinesimilarity) and now define a threshold when do you consider the page same content and when not. Clearly, if you have a database to test against (where you know the result you want to achieve) you have an advantage.
recursive definition should do it in a tree, in a generic graph it would be NPhard.
int max_path_rec(Node* root, int& max_path)
{
if (root == nullptr) return numeric_limits<int>::min();
int left_branch = max(0, max_path_rec(root>left_, max_path));
int right_branch = max(0, max_path_rec(root>right_, max_path));
max_path = max(max_path, root>value_ + left_branch + right_branch);
return root>value_ + max(left_branch, right_branch);
}
int max_path(Node* root)
{
int mpath = numeric_limits<int>::min();
max_path_rec(root, mpath);
return mpath;
}

Chris
December 02, 2017 I assume connected is adjacent and the Node has no parent pointer.
Node* find_local_min(Node *root)
{
stack<pair<Node*, Node*>> s;
s.push({ root, root });
while (!s.empty()) {
Node* p = s.top().first; // parent of current, none = root if p = n
Node* c = s.top().second; // current
int cv = c>value_; // current value
s.pop();
if (c>left_ != nullptr && c>right_ != nullptr
&& cv < p>value_ && cv < c>left_>value_ && cv < c>right_>value_) {
return c;
}
if (c>left_ != nullptr) s.push({ false, c>left_ });
if (c>right_ != nullptr) s.push({ false, c>right_ });
}
return nullptr;
}

Chris
December 01, 2017 solution O(n*lg(n)) time O(n) space:
1) iterate j from 0 to n1
2) maintain a vector, e.g dp. That vector gets extended if element a[j] is smaller than
last element of vector
3) if element doesn't get extended, binary search in vector for the element that is smaller
than a[j] it's index will be i, max on ji
#include <vector>
#include <algorithm>
#include <iostream>
using namespace std;
// returns index of largest element < value in a[dp[i]]
int largest_smaller_idx(const vector<int>&a, const vector<int>& dp, int value)
{
int l = 0;
int r = dp.size() + 1;
while (l + 1 < r) {
int m = (l + r  2) / 2;
if (a[dp[m]] < value) {
r = m + 1;
} else {
l = m + 1;
}
}
return l;
}
int max_dist_larger_right(const vector<int>& a)
{
if (a.size() < 2) return 1;
int max_d = 1;
vector<int> dp(1, 0);
for (int j = 1; j < a.size(); ++j) {
if (a[j] < a[dp.back()]) {
dp.push_back(j);
} else if (a[j] > a[dp.back()]) {
int i = largest_smaller_idx(a, dp, a[j]);
max_d = max(j  dp[i], max_d);
}
}
return max_d;
}
int main()
{
cout << "TC1: " << (max_dist_larger_right({ 9, 11, 7, 8, 3, 6, 6, 11, 9, 8, 7 }) == 7) << endl;
cout << "TC2: " << (max_dist_larger_right({ 1, 2 }) == 1) << endl;
cout << "TC3: " << (max_dist_larger_right({ 1, 2, 1 }) == 1) << endl;
cout << "TC4: " << (max_dist_larger_right({ 1, 1, 2, 1 }) == 2) << endl;
cout << "TC5: " << (max_dist_larger_right({ 1, 2, 1, 3 }) == 3) << endl;
cout << "TC6: " << (max_dist_larger_right({ 9, 8, 1, 2, 1, 2, 1, 3, 2, 1, 0 }) == 6) << endl;
cout << "TC7: " << (max_dist_larger_right({ 1, 1 }) == 1) << endl;
cout << "TC8: " << (max_dist_larger_right({ 2, 1 }) == 1) << endl;
}

Chris
December 01, 2017 is it like each element must be rounded to the next power or the sum of all elements? I assume each element:
unsigned long long memory_required(const vector<unsigned int>& elements) {
unsigned long long total = 0;
for(auto e : elements) {
unsigned long long mask = 1 << 31;
while(mask > 0 && (mask & e) == 0) mask >>= 1;
if(e == mask) total += mask;
else total += mask << 1;
}
return total;
}

Chris
November 29, 2017 @CoolGuy: I don't get the difference of Job runner and worker in your latest post. How ever, let's assume the job runner is a job scheduler and distributes the work to workers. So it acts like a Loadbalancer. A typical thing you can do to return results is IP rewriting. The job runner sends a request to the worker but the senderip is actually the client'sip .. but your interface states not a synchronous interface...
 Chris November 29, 2017
Rep
Rep
Repjudydschultz1234, Android test engineer at Eterno Infotech Pvt Ltd
Spent a weekend researching weebles in Nigeria. My current pet project is developing strategies for how to break someone's ...
RepAmber Van is the top rated company that offers friendly and professional removals services.Our featured services includes domestic moves ...
RepJames Gulledge, Dev Lead at ADP
Get you driving in comfort and class in Atlanta! Prestige Luxury Rentals offer a wide selection of premium vehicles Our ...
Open Chat in New Window
You need to find a path from start that does not end in the end node. You can brute force all paths for from start. You can optimize if you store for a given node if it has a path to end so you do not redo the same work over and over. Recursive solution should be straight forward.
 Chris January 02, 2019