A new species has been discovered and we are trying to understand their form of communication. So we start by identifying their alphabet. Mr. Brown managed to steal a vocabulary that has its words in ascending order, and a team has already identified the letters. So we now have the ordered words as sequences of numbers; print every letter once, in ascending order.
- Chris in United States
[3,2]
[4,3,2]
[4,1,1]
[1, 2]
[1, 1]
alphabet: [3, 4, 2, 1]
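One way to recover such an alphabet is a topological sort over the ordering constraints given by adjacent words: the first position where two neighboring words differ tells you which letter comes first. A sketch (`derive_alphabet` is my name for it; when several orders are consistent with the data, any one of them is returned, and the [3, 4, 2, 1] above is one such order):

```cpp
#include <vector>
#include <map>
#include <set>
#include <queue>
using namespace std;

// Derive the alphabet order from a vocabulary that is sorted in that order.
vector<int> derive_alphabet(const vector<vector<int>>& words)
{
    map<int, set<int>> adj; // letter -> letters known to come after it
    map<int, int> indeg;    // in-degree per letter
    for (const auto& w : words)
        for (int letter : w) indeg.emplace(letter, 0); // register all letters
    for (size_t i = 0; i + 1 < words.size(); ++i) {
        const auto& a = words[i];
        const auto& b = words[i + 1];
        // the first differing position gives the constraint a[j] < b[j]
        for (size_t j = 0; j < a.size() && j < b.size(); ++j) {
            if (a[j] != b[j]) {
                if (adj[a[j]].insert(b[j]).second) indeg[b[j]]++;
                break;
            }
        }
    }
    // Kahn's algorithm over the constraint graph
    queue<int> q;
    for (auto& p : indeg) if (p.second == 0) q.push(p.first);
    vector<int> order;
    while (!q.empty()) {
        int u = q.front(); q.pop();
        order.push_back(u);
        for (int v : adj[u]) if (--indeg[v] == 0) q.push(v);
    }
    return order; // shorter than the letter count if the input was inconsistent
}
```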
@Alex, not sure I understood your question... Are you saying the queue is full with 1 Mio items because that's the limit per interval, then at least a whole interval passes with no requests, and now another burst arrives?
Then, when the first of these requests arrives, it will see the queue is full, remove only the oldest item and place itself at the back. The next item does the same, until one arrives that can't remove from the front because the front is not old enough. That would work with 1 Mio items, if you want to spend the memory for it. If it's a ring buffer it's the same; nothing ever gets moved, copied or iterated. I would question the 1 Mio / interval and whether it's worth spending the memory for that purpose, but it would work. If it's really 1 Mio, I would definitely start thinking about compacting into sub-intervals and counting. The solution would still be O(1) but much more complicated.
I think that's the charm of the solution: it never needs to iterate, so it's always plain O(1), not amortized, which is neat for high throughput.
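The "compacting into sub-intervals and counting" idea could look roughly like this (a sketch with my own names, not the original code): instead of one timestamp per request, keep one counter per sub-interval. Memory becomes O(buckets) instead of O(request limit), at the cost of the window boundary being approximate to one sub-interval. The clock is passed in as abstract ticks so it can be mocked:

```cpp
#include <vector>
#include <cstdint>
using namespace std;

// Sliding-window rate limiter that counts requests per sub-interval ("bucket").
class BucketLimiter
{
private:
    int64_t bucketLen_;        // length of one sub-interval in ticks
    vector<int64_t> counts_;   // requests counted per bucket
    vector<int64_t> stamps_;   // which sub-interval each bucket currently holds (-1: empty)
    int64_t limit_;            // allowed requests per window
    int64_t total_;            // sum over all live buckets
public:
    BucketLimiter(int64_t interval, size_t buckets, int64_t limit)
        : bucketLen_(interval / (int64_t)buckets), counts_(buckets, 0),
          stamps_(buckets, -1), limit_(limit), total_(0) {}

    bool processRequest(int64_t now) {
        int64_t cur = now / bucketLen_; // current sub-interval number
        int64_t n = (int64_t)counts_.size();
        // expire buckets whose sub-interval slid out of the window
        for (int64_t i = 0; i < n; ++i) {
            if (stamps_[i] >= 0 && cur - stamps_[i] >= n) {
                total_ -= counts_[i];
                counts_[i] = 0;
                stamps_[i] = -1;
            }
        }
        if (total_ >= limit_) return false; // window already full
        size_t idx = (size_t)(cur % n);     // after expiry, this bucket is ours or empty
        stamps_[idx] = cur;
        counts_[idx]++;
        total_++;
        return true;
    }
};
```

The per-request loop touches a fixed number of buckets, so it is still O(1) for a constant bucket count.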
I've interpreted the question as not using the same color more than twice in a row / column, whereas the question was to not use the same color more than twice in adjacent fields in a row and in a column.
Anyway, here's the backtracking algorithm for the Sudoku-like question:
#include <vector>
#include <stack>
#include <cstdlib>
#include <limits>
using namespace std;

using uint = unsigned int;
const uint EMPTY = numeric_limits<uint>::max();

vector<vector<uint>> generate_rnd_field(uint m, uint n, uint k)
{
    vector<vector<vector<uint>>> options(m, vector<vector<uint>>(n));
    vector<vector<uint>> result(m, vector<uint>(n, EMPTY));
    vector<vector<uint>> rows(m, vector<uint>(k));
    vector<vector<uint>> cols(n, vector<uint>(k));
    stack<pair<size_t, size_t>> s;
    s.push({0, 0});
    while (!s.empty()) {
        size_t r = s.top().first;
        size_t c = s.top().second;
        auto& field_options = options[r][c];
        if (result[r][c] == EMPTY) {
            // calculate options
            for (uint i = 0; i < k; ++i) {
                if (rows[r][i] + cols[c][i] < 2) {
                    field_options.push_back(i);
                }
            }
        } else {
            // an option was used but didn't succeed; put the value back into the
            // row and column counts
            uint value = result[r][c];
            rows[r][value]--;
            cols[c][value]--;
        }
        if (field_options.empty()) { // no options left
            s.pop(); // backtrack
            result[r][c] = EMPTY; // gets new options when coming back
        } else {
            size_t value_idx = rand() % field_options.size();
            uint value = field_options[value_idx];
            rows[r][value]++;
            cols[c][value]++;
            result[r][c] = value;
            field_options.erase(field_options.begin() + value_idx);
            // next field
            if (++c == n) {
                c = 0;
                if (++r == m) return result;
            }
            s.push({r, c});
        }
    }
    return vector<vector<uint>>(); // empty to indicate no solution exists
}
The question asks for inter-process communication but disallows it... that's a bit strange, as it is not a very clear term. Are sockets, pipes and named mutexes considered IPC? Is shared memory, a shared file or a memory-mapped file IPC as well for the person asking? How about storing a value in the cloud and retrieving it?
- Chris November 29, 2017
/*
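For illustration of the term, one of the textbook mechanisms from that list, a POSIX pipe, fits in a few lines (this is just a demonstration of the mechanism, not part of the original answer; between a parent and a forked child this is inter-process communication, here both ends live in one process to keep it short):

```cpp
#include <unistd.h> // pipe, read, write, close (POSIX)
#include <string>
using namespace std;

// Send a short message through a POSIX pipe and read it back.
string through_pipe(const string& msg)
{
    int fd[2];
    if (pipe(fd) != 0) return "";              // fd[0]: read end, fd[1]: write end
    write(fd[1], msg.c_str(), msg.size());     // short message fits the pipe buffer
    char buf[256] = {0};
    ssize_t n = read(fd[0], buf, sizeof(buf) - 1);
    close(fd[0]);
    close(fd[1]);
    return string(buf, n > 0 ? (size_t)n : 0);
}
```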
There is a straightforward solution that is about 5 lines:
- count the requests in a moving window of one interval.
- I do this by using a queue and limiting its size. When a token arrives:
  1) I check if the oldest token in the queue can leave the window
  2) I accept the token if queue size < allowed requests,
     and add it to the queue
*/
// returns true if the request can be processed or false if not
bool processRequest() {
    auto now = steady_clock::now();
    if (!requests_.empty() && (now - requests_.front()) > interval_) requests_.pop();
    if (requests_.size() < requestLimit_) {
        requests_.push(now);
        return true;
    }
    return false;
}
/*
This is O(1) time and O(allowed-requests) space.
There are issues and things to consider:
1) bursts: say you have 100 requests per minute, but in the first second 100
   requests arrive; how should the system behave? Maybe better to scale down to
   1 request every 300 ms?
2) synchronization: the need to secure the structure against concurrent access.
   I would expect requests being processed on multiple cores. The C++ queue
   allocates on the heap, so it needs protection using a mutex.
   That's not good in a high-throughput system, first because a mutex is expensive,
   second because of what happens while it is locked: the queue will access the
   heap, which is expensive and unnecessary (long locking period, long wait,
   long runtime).
   Therefore the queue can be replaced with a fixed-size ring buffer.
   A lockless version might be possible, but is tricky to get right.
- The injection/decorator/aspect/attribute thing I spared. If it needs to be fast,
  maybe use templates instead of polymorphism in C++.
- Testing is another interesting question. I think in order to do that properly I
  need to mock/inject the clock. So I'd design it to take a clock interface or type.
I put the basic implementation into a Monte Carlo simulation below.
*/
#include <chrono>
#include <cstdlib>
#include <vector>
#include <queue>
#include <thread>
#include <iostream>
#include <iomanip>
using namespace std;
using namespace std::chrono;
class RateLimitter
{
private:
    queue<steady_clock::time_point> requests_;
    milliseconds interval_;
    unsigned int requestLimit_;
public:
    RateLimitter(milliseconds interval, unsigned int requestLimit)
        : interval_(interval), requestLimit_(requestLimit) {}
    bool processRequest() {
        auto now = steady_clock::now();
        if (!requests_.empty() && (now - requests_.front()) > interval_) requests_.pop();
        if (requests_.size() < requestLimit_) {
            requests_.push(now);
            return true;
        }
        return false;
    }
};
int main()
{
    RateLimitter r(milliseconds(1000), 10); // freq: 10/s
    int produced = 0, processed = 0;
    auto start = system_clock::now();
    while (++produced < 1000) {
        this_thread::sleep_for(milliseconds(rand() % 100)); // avg. freq: 20/s
        if (r.processRequest()) {
            processed++;
            auto elapsed = system_clock::now() - start;
            if (processed % 10 == 0)
                cout << setprecision(3)
                    << "processed: " << processed << " "
                    << static_cast<double>(processed) / (duration_cast<seconds>(elapsed).count() + 1)
                    << " req./s \t produced " << produced << " "
                    << static_cast<double>(produced) / (duration_cast<seconds>(elapsed).count() + 1)
                    << " req./s" << endl;
        }
    }
}
Important questions may be:
- size of job
- character of job (CPU bound, I/O bound)
- where's the data for the job (needs to be read from other services, ...)
- notification (really callback, no long polling etc. - firewalls of clients? clients behind a NAT?)
- authentication & authorization model
- criticality of requirements on single processing, keeping order of requests (e.g. from the same client, over all, ...) etc.
A load balancer would round-robin (or distribute smarter, e.g. on a hash ring) to workers that execute jobs. Each worker has two slaves that the worker notifies when it gets a request. A worker is also a slave for another worker. That means it keeps the jobs of its slaves and could jump in in case a worker dies; otherwise it wouldn't work on jobs it is slave for.
If jobs and / or results need to be persisted, the worker would be responsible for storing the job in a database. The database would be sharded on job id. The job id would be something like user_id concatenated with a running job number per user (assuming the per-user frequency of jobs is at most a few a second and typically a few every hour). To prevent data loss, the database is replicated.
For notification I would try to notify from the worker itself. Before notifying I would persist the result of the job, if needed, and tell the slaves that I am now notifying, so in case the worker crashes while notifying, one of the slaves can take over.
But notification could fail (client not reachable). In this case I would retry from the same worker immediately once, after a few seconds a second time, and after 1-2 minutes again. If it still fails, either I'm living in a network partition (which would hopefully be detected by one of my slaves) or the client is gone for some time.
The fail-over from worker to a slave is critical and must be coordinated (depending on the requirements, how important it is to not process a job twice or notify a client twice in rare cases).
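The "distribute smarter, e.g. on a hash ring" part can be sketched with a minimal consistent-hash ring (a sketch; the node names and the use of `std::hash` are my assumptions, and a production version would add virtual nodes for better balance):

```cpp
#include <map>
#include <string>
#include <functional>
using namespace std;

// Minimal consistent-hash ring: nodes sit at hash positions on a ring and a
// job goes to the first node clockwise from the job's hash. Adding or removing
// a node only remaps the keys between it and its ring neighbor.
class HashRing
{
private:
    map<size_t, string> ring_; // position on the ring -> node name
    hash<string> h_;
public:
    void add_node(const string& name) { ring_[h_(name)] = name; }
    void remove_node(const string& name) { ring_.erase(h_(name)); }
    string node_for(const string& jobId) const {
        if (ring_.empty()) return "";
        auto it = ring_.lower_bound(h_(jobId)); // first node clockwise
        if (it == ring_.end()) it = ring_.begin(); // wrap around the ring
        return it->second;
    }
};
```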
@Raphael: The approach is good, I think the code has an issue: when moving the "slow" pointer shouldn't we decrement the element count that leaves the window (not only the one that has been seen more than k times)
while (s.charAt(slow) != c) {
    seen[s.charAt(slow)]--;
    slow++;
}
e.g.
1,2,3,2,2,1,1,5,6,7,8, k = 2 --> 3,2,2,1,1,5,6,7,8
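With the decrement in place, the window keeps an exact count of every element inside it. A C++ sketch of the corrected idea, interpreting the problem as "longest window in which no value occurs more than k times" (function name and framing are mine):

```cpp
#include <vector>
#include <unordered_map>
#include <algorithm>
using namespace std;

// Longest contiguous subarray in which no value occurs more than k times.
// 'seen' holds counts inside the window [slow..fast]; when a value exceeds k,
// slow advances, and every value leaving the window is decremented.
size_t longest_k_window(const vector<int>& a, int k)
{
    unordered_map<int, int> seen;
    size_t best = 0, slow = 0;
    for (size_t fast = 0; fast < a.size(); ++fast) {
        if (++seen[a[fast]] > k) {
            while (a[slow] != a[fast]) { // shrink until the offending value leaves once
                seen[a[slow]]--;         // decrement everything that leaves the window
                slow++;
            }
            seen[a[slow]]--; // drop one occurrence of the offending value itself
            slow++;
        }
        best = max(best, fast - slow + 1);
    }
    return best;
}
```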
I assume the + in the BNF indicates repeat once or more. Actually, to make sense it should be (because you need at least two operands):
expr := int | '(' op expr expr + ')'
further I assume integer is:
integer := ['-'] digit digit +
digit := '0' .. '9'
[] indicates optional
there are multiple ways to parse it. Since a BNF is given and since it's parsable without lookahead I would approach it like a simple parser, with a super primitive lexer to get some structure into the code:
#include <string>
#include <iostream>
using namespace std;
class Parser
{
private:
size_t offset_;
string expression_;
public:
Parser(const string& expr) : offset_(0), expression_(expr) { }
int getValue() {
int result;
if (!expr(result)) throw "syntax error";
skipWhiteSpaces();
if (!isEOF()) throw "syntax error";
return result;
}
private:
// production expr := int | '(' op expr expr + ')'
bool expr(int& value) {
int operand;
int oper; // 0: +, 1: *
if (matchInteger(value)) { return true; }
if (!match('(')) return false;
if (!op(oper)) throw "syntax error, expecting operator";
if (!expr(value)) throw "syntax error, expecting integer";
if (!expr(operand)) throw "syntax error, expecting integer";
do {
if (oper == 0) value += operand;
else if (oper == 1) value *= operand;
} while (expr(operand));
if (!match(')')) throw "syntax error, expecting ')'";
return true;
}
// production op := '+' | '*'
bool op(int& op) {
op = -1;
if (match('+')) {
op = 0;
} else if (match('*')) {
op = 1;
} else {
return false;
}
return true;
}
// -- from here, primitive lexer functionality --
bool matchInteger(int& value) {
size_t o = offset_;
bool neg = false;
bool numbers = false;
while (o < expression_.size() && expression_[o] == ' ') o++; // read spaces
if (o < expression_.size() && (neg = (expression_[o] == '-'))) o++; // read '-'
while (o < expression_.size() && expression_[o] == ' ') o++; // read more spaces
value = 0;
while (o < expression_.size() && expression_[o] >= '0' && expression_[o] <= '9') {
value = value * 10 + (expression_[o++] - '0');
numbers = true;
}
if (!numbers) return false;
if (neg) value = -value;
offset_ = o;
return true;
}
bool match(char c) {
skipWhiteSpaces();
if (isEOF()) return false;
if (expression_[offset_] == c) {
offset_++;
return true;
}
return false;
}
void skipWhiteSpaces() {
while (!isEOF() && expression_[offset_] == ' ') { ++offset_; }
}
bool isEOF() const { return offset_ >= expression_.size(); }
};
int main()
{
cout << (Parser("3").getValue() == 3) << endl;
cout << (Parser("(+ 1 2)").getValue() == 3) << endl;
cout << (Parser("(+ 3 4 5)").getValue() == 12) << endl;
cout << (Parser("(+7 (*8 12) (*2 (+9 4) 7) 3)").getValue() == 288) << endl;
}
@Scavi: your O(n) is O(n*k) because of the substring. Easy to solve. I had the same idea there.
Follow-up 2 is an instance of the superstring problem (a known NP-hard problem). I think the order matters (besides deciding whether to extend left or right considering the common prefix / suffix), with the brute force over orders making it exponential in m. The naive greedy algorithm looks for the two elements that overlap the most; this already produces a better order.
Your described runtime is only exponential in k, but the original problem is exponential in m. Therefore the algorithm is most likely not correct, but it will find a shorter solution than the first approach.
I doubt anybody comes up with the approximation algorithm in 30 minutes if he hasn't just learned it.
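The naive greedy step ("repeatedly merge the two strings that overlap the most") can be sketched as below. It is a heuristic, not an optimal algorithm, and it ignores full containment of one string in another; the function names are mine:

```cpp
#include <string>
#include <vector>
#include <algorithm>
using namespace std;

// Length of the longest suffix of a that is a prefix of b.
size_t overlap(const string& a, const string& b)
{
    size_t best = 0;
    for (size_t len = 1; len <= min(a.size(), b.size()); ++len)
        if (a.compare(a.size() - len, len, b, 0, len) == 0) best = len;
    return best;
}

// Greedy superstring: repeatedly merge the ordered pair with maximum overlap.
string greedy_superstring(vector<string> parts)
{
    while (parts.size() > 1) {
        size_t bi = 0, bj = 1, bo = 0;
        for (size_t i = 0; i < parts.size(); ++i)
            for (size_t j = 0; j < parts.size(); ++j) {
                if (i == j) continue;
                size_t o = overlap(parts[i], parts[j]);
                if (o >= bo) { bo = o; bi = i; bj = j; }
            }
        parts[bi] += parts[bj].substr(bo); // append the non-overlapping tail of j to i
        parts.erase(parts.begin() + bj);
    }
    return parts.empty() ? "" : parts[0];
}
```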
It can be solved in O(n) as pointed out by @dubai_data_science, but there is also an O(lg(n)) solution involving binary search:
- go to the middle and compare the element there with its predecessor. If you find a rising value, you know everything left of you is increasing, so the peak is in the range [middle..end]. Repeat that step on this range until you're on the peak ;-)
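That binary search can be sketched as follows (assuming an array that strictly increases and then strictly decreases; the function name is mine):

```cpp
#include <vector>
using namespace std;

// Find the peak index of an array that strictly increases, then strictly
// decreases. If arr[m] < arr[m+1] the slope still rises, so the peak is to
// the right of m; otherwise it is at m or to the left.
size_t peak_index(const vector<int>& arr)
{
    size_t l = 0, r = arr.size() - 1; // invariant: peak index is in [l..r]
    while (l < r) {
        size_t m = (l + r) / 2;
        if (arr[m] < arr[m + 1]) l = m + 1; // rising: peak is right of m
        else r = m;                          // falling (or at peak): peak at or left of m
    }
    return l;
}
```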
The logic is: the number in the input is the rank of the element you need to place. So you need an array of unique elements sorted ascending (e.g. 1..n, or 0..n-1).
You keep picking the k-th rank, where k is given by the input array (n-a[i]). Why? Because then you will have n-k larger elements left that you can place to the right.
If you use an array, it's O(n^2) time due to the remove operation, which copies n/2 elements on average.
If you use a tree (sorted map, red-black tree, etc.) you can get it down to O(n*lg(n)) time.
I assume:
- jobs are identified using an integer
- I get dependencies in the form of pairs (a, b) and the set of all a and all b
unioned gives me the set of all jobs.
- Semantics of the directed edge (a,b) is: in order to start b, a must be completed
- The dependencies form a DAG, no need to check for cycles
- you can call get_next_stage anytime and you will get the set of jobs that can
be started due to the calls of job_done since the last call of get_next_stage
(maybe a better name would be get_unlocked_jobs)
- you can only complete jobs that get_nextStage returned
- you can't complete a job twice
#include <vector>
#include <algorithm>
#include <iostream>
#include <unordered_map>
#include <unordered_set>
using namespace std;
ostream& operator <<(ostream& os, const vector<int>& v);
class JobScheduler
{
private:
unordered_map<int, unordered_set<int>> adjList_; // jobid -> list of dependent jobids
unordered_map<int, int> incidentDeg_; // job, in-degree, jobs that have
// in-degree 0 and are done are removed from HT
vector<int> nextStage_;
public:
JobScheduler(const vector<pair<int, int>>& deps) {
for (auto dep : deps) {
incidentDeg_[dep.first] += 0; // ensure they're in, don't change
if (adjList_[dep.first].insert(dep.second).second) {
incidentDeg_[dep.second]++; // skip duplicate dependencies
}
}
for (auto indeg : incidentDeg_) {
if (indeg.second == 0) {
nextStage_.push_back(indeg.first);
}
}
}
vector<int> get_nextStage() {
vector<int> res(move(nextStage_));
nextStage_.clear(); // the moved-from state is unspecified, clear explicitly
return res;
}
void jobDone(int jobId) {
auto it = incidentDeg_.find(jobId);
if (it == incidentDeg_.end() || it->second != 0) throw "error"; // see assumptions
incidentDeg_.erase(it); //
for (auto next : adjList_[jobId]) {
if (--incidentDeg_[next] == 0) {
nextStage_.push_back(next);
}
}
}
};
int main()
{
JobScheduler s({
{1, 2},
{5, 2},
{2, 3},
{1, 2}, // duplicate dependency
{3, 4},
{4, 9},
{7, 4},
{2, 7},
{2, 4}, // redundant 2, 3 -> 3, 4
{2, 8} });
cout << s.get_nextStage() << endl; // {1, 5}
s.jobDone(1);
cout << s.get_nextStage() << endl; // {}
s.jobDone(5);
cout << s.get_nextStage() << endl; // {2}
s.jobDone(2);
cout << s.get_nextStage() << endl; // {8, 3, 7}
s.jobDone(8);
cout << s.get_nextStage() << endl; // {}
s.jobDone(3);
cout << s.get_nextStage() << endl; // {}
s.jobDone(7);
cout << s.get_nextStage() << endl; // {4}
s.jobDone(4);
cout << s.get_nextStage() << endl; // {9}
s.jobDone(9);
cout << s.get_nextStage() << endl; // {}
}
ostream& operator <<(ostream& os, const vector<int>& v)
{
os << "{";
bool first = true;
for (auto e : v) {
if (!first) os << "," << e;
else os << e;
first = false;
}
os << "}";
return os;
}
I'ts "just" a binary search. The problem getting binary search correct are:
- it must terminate
- it must work correctly in corner cases
I found it helps to use the notion of loop-invariant, to get it right:
#include <vector>
#include <iostream>
using namespace std;
// returns index to last element in arr that is <= val
// OR if all elements are > val, returns arr.size()
size_t lower_bound_idx(const vector<int>& arr, int val)
{
size_t l = 0;
size_t r = arr.size();
// loop-invariant:
// - [l..r): half-open interval of size >= 1: r-l >= 1
// - arr[l] <= val
// - arr[r] > val or r = arr.size()
// the following "if" establishes the loop invariant for the while loop below.
// I could skip it and let the while loop return 0, but this would
// violate the loop invariant and make lower_bound behave unexpectedly, and
// thus is a nice source of further errors...
if (!(arr[l] <= val)) return arr.size();
while (r - l > 1) {
size_t m = (l + r) / 2;
if (arr[m] <= val) l = m;
else r = m;
}
return l;
}
int find_closest(const vector<int>& arr, int val)
{
if (arr.size() == 0) return -1;
auto i = lower_bound_idx(arr, val);
if (i >= arr.size()) return arr[0]; // indicates arr[0] > val
if (i == arr.size() - 1) return arr.back(); // arr.back() <= val
if (val - arr[i] <= arr[i + 1] - val) return arr[i];
return arr[i + 1];
}
int main()
{
cout << (find_closest({ 2,5,6,7,8,8,9 }, 5) == 5) << endl;
cout << (find_closest({ 2,5,6,7,8,8,9 }, 11) == 9) << endl;
cout << (find_closest({ 2,5,6,7,8,8,9 }, 4) == 5) << endl;
cout << (find_closest({ 2,5,6,7,8,8,9 }, 2) == 2) << endl;
cout << (find_closest({ 2,5,6,7,8,8,9 }, 1) == 2) << endl;
cout << (find_closest({ 2,5,6,7,8,8,9 }, 1) == 2) << endl;
cout << (find_closest({ 10 }, 10) == 10) << endl;
cout << (find_closest({ 10 }, 5) == 10) << endl;
cout << (find_closest({ 10 }, 12) == 10) << endl;
cout << (find_closest({ 1,1 }, 1) == 1) << endl;
cout << (find_closest({ 1,1 }, 2) == 1) << endl;
cout << (find_closest({ 1,1 }, 0) == 1) << endl;
cout << (find_closest({ 1,4 }, 3) == 4) << endl;
cout << (find_closest({ 1,4 }, 4) == 4) << endl;
cout << (find_closest({ 1,4 }, 5) == 4) << endl;
cout << (find_closest({ 1,4 }, 2) == 1) << endl;
cout << (find_closest({ 1,4 }, 1) == 1) << endl;
cout << (find_closest({ 1,4 }, -1) == 1) << endl;
}
@Alex came up with a cool idea in another thread to solve this kind of question in O(n*m^2):
- scan from top to bottom, line by line
- for each line, remember each combination of two 1s and push it into a hash set
- if you ever find that combination again in a later line, you've got your rectangle
I think the solution has a lot of charm, especially if it's a sparse matrix.
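A sketch of that idea (the function name and the packing of a column pair into one 64-bit key are my own choices): seeing the same column pair in two different rows means four 1s forming a rectangle's corners.

```cpp
#include <vector>
#include <unordered_set>
#include <cstdint>
using namespace std;

// Returns true if the 0/1 matrix contains four 1s that form the corners of an
// axis-aligned rectangle. O(n * m^2) over n rows and m columns.
bool has_rectangle(const vector<vector<int>>& mat)
{
    unordered_set<uint64_t> seen; // column pairs observed in earlier rows
    for (const auto& row : mat) {
        for (size_t c1 = 0; c1 < row.size(); ++c1) {
            if (!row[c1]) continue;
            for (size_t c2 = c1 + 1; c2 < row.size(); ++c2) {
                if (!row[c2]) continue;
                uint64_t key = (uint64_t)c1 << 32 | (uint64_t)c2; // pack the pair
                if (!seen.insert(key).second) return true; // pair seen in an earlier row
            }
        }
    }
    return false;
}
```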
@Matviy Ilyashenko:
I think you can do it in O(sqrt(n)) and avoid many expensive sqrt calls.
Note that your list doesn't really help you much (it saves you n^2 - n multiplications for the cost of memory/cache access). You may as well want 4*4 + 0*0 = 16, so maybe we should start at 0. You could also just repeat the loop of your square calculation:
for (long i = 0; i * i <= n; i++)
    for (long j = i; j * j <= n; j++)
        if (i * i + j * j == n) return true;
return false;
But if you put i*i into a hash set, I think you don't need to test all pairs and you can avoid the sqrt calls. Note that a sqrt calculation is faster on modern CPUs than maintaining a hash set.
unordered_set<int> s;
for (int i = 0; i * i <= n; ++i) {
    s.insert(i * i);
    if (s.count(n - i * i) > 0) return true;
}
return false;
If the relation is transitive (from aRb and bRc follows aRc) and symmetric (from aRb follows bRa), then it's finding the connected components in an undirected graph and just picking one vertex from each component.
I assume the relation is not to be treated as transitive, so I don't see much else than a brute force, maximizing over the recursion where you exclude directly related elements from the set of potential candidates. Every element has two options: it is in the set or it isn't. Since it's only about finding the size, I assume we can use some DP techniques, similar to knapsack.
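For the transitive + symmetric case, the component picking can be sketched with a union-find (a sketch under that assumption; elements are assumed to be numbered 0..n-1 and the names are mine):

```cpp
#include <vector>
#include <numeric>
#include <unordered_set>
#include <utility>
using namespace std;

// Union-find over n elements with path compression.
struct DSU {
    vector<int> parent;
    DSU(int n) : parent(n) { iota(parent.begin(), parent.end(), 0); }
    int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }
    void unite(int a, int b) { parent[find(a)] = find(b); }
};

// Feed the relation pairs into the DSU and return one representative
// (the smallest-numbered element) per connected component.
vector<int> pick_representatives(int n, const vector<pair<int, int>>& rel)
{
    DSU dsu(n);
    for (auto& p : rel) dsu.unite(p.first, p.second);
    unordered_set<int> roots;
    vector<int> reps;
    for (int i = 0; i < n; ++i)
        if (roots.insert(dsu.find(i)).second) reps.push_back(i);
    return reps;
}
```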
Were those interviews PhD level?
An Euler tour visits each edge exactly once. A Hamiltonian path visits each vertex exactly once. Not all graphs have Euler tours or Hamiltonian paths. But what is a two-way Euler tour that can be drawn such that it does not cross itself at any vertex? Is this a Hamiltonian path that is also an Euler tour?
- Chris November 12, 2017
Straightforward BFS to find the shortest path from s to s. Just ensure the start node is not put into the visited nodes at first. Since no relaxation is needed (unweighted graph), the order of first discovery in BFS also gives the shortest path to each node. Store its parent and don't care about rediscovery (as usual). If I get back to s, I have the shortest cycle.
Runtime is O(|V|+|E|); in particular there may be no cycle at all (e.g. if you give me the root of a tree as start). Space is O(|V|).
The core of the algo is this code, assuming s is the given node:
unordered_map<Node*, Node*> parents;
deque<Node*> q{ 1, s };
while (!q.empty()) {
    Node* u = q.back();
    q.pop_back();
    for (auto v : u->adjacents_) {
        if (parents.count(v) > 0) continue;
        parents[v] = u;
        if (v == s) {
            q.clear();
            break;
        } else {
            q.push_front(v);
        }
    }
}
the whole solution is
#include <string>
#include <vector>
#include <queue>
#include <algorithm>
#include <unordered_set>
#include <unordered_map>
#include <iostream>
using namespace std;
struct Node
{
string label_;
unordered_set<Node*> adjacents_;
Node(const string& label) : label_(label) {}
void link(Node* v) { adjacents_.insert(v); }
};
vector<Node*> shortest_cycle_path(Node* s)
{
unordered_map<Node*, Node*> parents;
deque<Node*> q{ 1, s };
while (!q.empty()) {
Node* u = q.back();
q.pop_back();
for (auto v : u->adjacents_) {
if (parents.count(v) > 0) continue;
parents[v] = u;
if (v == s) {
q.clear();
break;
} else {
q.push_front(v);
}
}
}
// recover path
vector<Node*> path;
auto current = s;
do {
path.push_back(current);
current = parents[current];
} while (current != nullptr && current != s);
if (current != nullptr) path.push_back(s);
if (path.size() == 1) path.clear(); // no cycle
reverse(path.begin(), path.end());
return path;
}
ostream& operator << (ostream& os, const vector<Node*>& v) {
    os << "{";
    bool first = true;
    for (auto n : v) {
        if (first) {
            os << n->label_;
            first = false;
        } else {
            os << "," << n->label_;
        }
    }
    os << "}";
    return os;
}
int main()
{
/*
/------(f)<- /--\
v ---/ \ v |
(a) / ----->(e)---/
| / / ^
vv | /
(b)-->(c)-->(d)-->(g)->(h)
\ ^
-------/
*/
Node a("a");
Node b("b");
Node c("c");
Node d("d");
Node e("e");
Node f("f");
Node g("g");
Node h("h");
a.link(&b);
b.link(&c);
c.link(&e);
c.link(&d);
d.link(&e);
e.link(&f);
e.link(&e);
f.link(&b);
f.link(&a);
d.link(&g);
d.link(&h);
g.link(&h);
cout << shortest_cycle_path(&b) << endl; // {b,c,e,f,b}
cout << shortest_cycle_path(&a) << endl; // {a,b,c,e,f,a}
cout << shortest_cycle_path(&e) << endl; // {e,e}
cout << shortest_cycle_path(&g) << endl; // {}
}
I assume there are no negative values for gas stations, so the only direction you can go is right. It's not greedy, because in order to make the best decision you need to look further than just the next gas station.
(Note: I assumed I can fill gas up, but after reviewing the given sample data I think the original question was that you can take the gas but you don't top your tank up, you kind of replace the tank... if it's like this, there is a linear solution as well (I think)... anyway, understanding this one will help creating the other one)
/*
0-----3-----6---8----------15-----20
(17)  15    4   5          6      (0)
the problem is more convenient to talk about if we include the start gas and the end
position in the arrays and subtract one from the solution at the end (one stop less).
pos = [0, 3, 6, 8, 15, 20]
gas = [17, 15, 4, 5, 6, 0]
at a certain point you have two options: take the gas or leave it there.
rec(i, g) = min{
    infinity, if g < 0 // can't reach
    0, if i == n-1 and g >= 0 // no more stop needed
    rec(i+1, g - (pos[i+1] - pos[i]))
    rec(i+1, g - (pos[i+1] - pos[i]) + gas[i]) + 1
}
rec(0, 0) would then be the solution of the recursion.
note that this has a time complexity of 2^n and will most likely exceed the time given
to find a solution.
the next question is how to improve it.
look for a common subproblem... I think about it like this: I can make it to each gas
station with index i doing at most i-1 stops. But I can eventually get there with i-2
stops in multiple ways. However, I am only interested in the max gas over all ways
with the same number of stops. etc.
so I keep an array gas_stops[n][n] which holds the max amount of gas per number of
stops per station (later we can optimize this to two arrays of size n).
it starts with gas_stops[0][0] = start_gas, all other elements must be -1
for(int i = 1; i < n; ++i) {
    int dist = pos[i] - pos[i-1]; // how far to travel
    for(int stops = 0; stops < n; ++stops) {
        if(gas_stops[i-1][stops] >= dist) { // can I make it to this stop
            gas_stops[i][stops] = max(gas_stops[i][stops], gas_stops[i-1][stops]-dist); // don't pick up gas at i
            gas_stops[i][stops+1] = max(gas_stops[i][stops+1], gas_stops[i-1][stops]-dist+gas[i]); // pick up gas at i
        }
    }
}
// get the least stops for the last station
for(int stops = 0; stops < n; ++stops) {
    if(gas_stops[n-1][stops] >= 0) return stops;
}
since we only access gas_stops[i-1] and [i], we only need two arrays at a time,
so we end up with an O(n^2) time and O(n) space solution for this common problem
*/
#include <vector>
#include <iostream>
#include <limits>
#include <algorithm>
using namespace std;
// -- recursive brute force version --
int min_stops_rec(const vector<int>& pos, const vector<int>& gas, int g, int i)
{
if (g < 0) return numeric_limits<int>::max(); // ran out of gas
if (i + 1 == pos.size()) return 0; // reached last station
int dist = pos[i + 1] - pos[i]; // distance
return min(min_stops_rec(pos, gas, g - dist, i + 1), // don't pick up gas
1 + min_stops_rec(pos, gas, g - dist + gas[i], i + 1)); // pick up gas
}
int min_stops_rec(const vector<int>& pos, const vector<int>& gas)
{
return min_stops_rec(pos, gas, 0, 0) - 1;
}
// -- optimized version (space optimization not done for better readability) --
int min_stops_opt(const vector<int>& pos, const vector<int>& gas)
{
    int n = pos.size();
    vector<vector<int>> dp(n, vector<int>(n + 1, -1)); // n+1: 'stops + 1' below may reach n
    dp[0][0] = gas[0];
    for (int i = 1; i < n; ++i) {
        int dist = pos[i] - pos[i - 1];
        for (int stops = 0; stops < n; ++stops) {
            if (dp[i - 1][stops] >= dist) {
                dp[i][stops] = max(dp[i][stops], dp[i - 1][stops] - dist);
                dp[i][stops + 1] = max(dp[i][stops + 1], dp[i - 1][stops] - dist + gas[i]);
            }
        }
    }
    for (int stops = 0; stops <= n; ++stops) { // start at 0: the start gas may suffice
        if (dp[n - 1][stops] >= 0)
            return stops;
    }
    return numeric_limits<int>::max(); // can't make it
}
int main()
{
cout << min_stops_rec({ 0, 3, 6, 8, 15, 20 }, { 17, 15, 4, 5, 6, 0 }) << endl; // 1 stop at 15 or 1 stop at 3
cout << min_stops_opt({ 0, 3, 6, 8, 15, 20 }, { 17, 15, 4, 5, 6, 0 }) << endl; // 1 stop at 15 or 1 stop at 3
// other example
cout << min_stops_rec({ 0, 2, 4, 5, 10, 15 }, { 5, 10, 20, 10, 5, 0 }) << endl; // 1 stop at 5
cout << min_stops_opt({ 0, 2, 4, 5, 10, 15 }, { 5, 10, 20, 10, 5, 0 }) << endl; // 1 stop at 5
}
Assume there are two directed graphs G1, G2 in an adjacency list representation.
The question is: is there a function g1tog2(vg1_i) that returns a vertex vg2_j of G2
for a given vertex vg1_i of G1, such that for every edge (vg1_u, vg1_v) of G1
there is a corresponding edge (vg2_u, vg2_v) of G2 where vg2_u = g1tog2(vg1_u)
and vg2_v = g1tog2(vg1_v).
The problem (graph isomorphism) is in NP, and no polynomial algorithm is known
for it. Being in NP means there is a polynomial verification function.
That's the way I would approach it, knowing no efficient algorithm is known:
- create a function that creates all possibles mappings of vg1_i to vg2_j where
out-degree(vg1_i) = out-degree(vg2_j)
- call a verification function that verifies if all edges given by graph g1 have a
corresponding edge in g2 under one given mapping (g1tog2)
example graphs
g1:
(a)-->(d)
/
v /---\
(b)-->(c)<-/
g2: a=2, b=4, c=1, d=3
(2)-->(3)
/
v /---\
(4)-->(1)<-/
permutations:
1) a->1: doesn't work, different out-degree
2) a->2, b->1, c->3: doesn't work, different out-degree
3) a->2, b->1, c->4, d->3: a valid permutation, but it fails the validation method because of
   (a,d), (a,b); g1tog2(a) = 2, g1tog2(d) = 3, g1tog2(b) = 1, giving (2,3), (2,1),
   but it's (2,3), (2,4) in g2
4) a->2, b->3: doesn't work, different out-degree
5) a->2, b->4, c->1, d->3
obviously the last one will find for every edge in g1 a corresponding edge in g2
#include <string>
#include <stack>
#include <unordered_set>
#include <unordered_map>
#include <iostream>
using namespace std;
struct Node
{
string label_;
unordered_set<Node*> adjacents_;
Node(const string& label) : label_(label) {}
void link(Node* v) { adjacents_.insert(v); }
};
// this function verifies whether the two graphs g1 and g2 are isomorphic
// taking into account the function g1tog2, which is a 1:1 mapping of
// each node in g1 to a node in g2
bool are_graphs_isomorphic_verification(unordered_set<Node*>& g1, // graph 1
unordered_set<Node*>& g2, // graph 2
unordered_map<Node*, Node*>& g1tog2) // node g1 -> node g2
{
unordered_set<Node*> visited;
for (auto start : g1) { // standard DFS, outer loop
if (visited.count(start) > 0) continue;
stack<Node*> stk;
stk.push(start);
while (!stk.empty()) { // inner loop, to build a single DFS tree with root=start
auto ug1 = stk.top();
stk.pop();
if (visited.count(ug1) > 0) continue;
visited.insert(ug1);
auto ug2 = g1tog2[ug1];
// I assume ug1, ug2 adjacents.size() are equal, as this is ensured by
// permutation method below,
// otherwise: if (ug1->adjacents_.size() != ug2->adjacents_.size()) return false;
for (auto vg1 : ug1->adjacents_) {
if (ug2->adjacents_.count(g1tog2[vg1]) == 0) return false;
stk.push(vg1);
}
}
}
return true;
}
// this function creates all possible mappings of any node in g1 to any node in
// g2. It calls the verification method above for all possible mappings.
// If isomorphism is detected with one of those mappings, the function returns true
bool are_graphs_isomorphic(unordered_set<Node*>& g1, unordered_set<Node*>& g2)
{
if (g1.size() != g2.size()) return false;
if (g1.size() == 0 && g2.size() == 0) return true;
unordered_map<Node*, Node*> g1tog2; // node g1 -> node g2
for (auto u1 = g1.begin(), u2 = g2.begin(); u1 != g1.end(); ++u1, ++u2) {
g1tog2[*u1] = *u2;
}
// create all valid mappings between nodes of g1 and nodes of g2.
// the sizes of the adjacency lists give a cheap first indication of whether a
// mapping may be valid: if they don't match, it makes no sense to further
// develop that permutation or to verify it.
struct StackItem {
decltype(g1tog2)::iterator i, j; // the two nodes to swap to create permutations
bool leave; // mark, if entered that common prefix or if about to leave it
};
stack<StackItem> stk;
// first step, what node of g2 could be at first position so that adj-list-sizes match
for (auto j = g1tog2.begin(); j != g1tog2.end(); ++j) {
if (j->second->adjacents_.size() == g1tog2.begin()->first->adjacents_.size()) {
stk.push({ g1tog2.begin(), j, false });
}
}
while (!stk.empty()) {
if (stk.top().leave) {
swap(stk.top().i->second, stk.top().j->second); // swap back
stk.pop();
} else {
stk.top().leave = true; // recurse, remove next time
swap(stk.top().i->second, stk.top().j->second); // swap
auto ni = next(stk.top().i);
if (ni == g1tog2.end()) {
if (are_graphs_isomorphic_verification(g1, g2, g1tog2)) {
return true;
}
} else {
for (auto nj = ni; nj != g1tog2.end(); ++nj) {
if (ni->first->adjacents_.size() == nj->second->adjacents_.size()) {
stk.push({ ni, nj, false });
}
}
}
}
}
return false;
}
int main()
{
Node a("a");
Node b("b");
Node c("c");
Node d("d");
a.link(&b);
a.link(&d);
b.link(&c);
c.link(&c);
unordered_set<Node*> g1{ &a, &b, &c, &d };
Node n1("1");
Node n2("2");
Node n3("3");
Node n4("4");
n1.link(&n1);
n2.link(&n3);
n2.link(&n4);
n4.link(&n1);
unordered_set<Node*> g2{&n1, &n2, &n3, &n4};
cout << "g1, g2: " << are_graphs_isomorphic(g1, g2) << endl;
cout << "g2, g1: " << are_graphs_isomorphic(g2, g1) << endl;
cout << "g1, g1: " << are_graphs_isomorphic(g1, g1) << endl;
cout << "g2, g2: " << are_graphs_isomorphic(g2, g2) << endl;
n1.adjacents_.clear();
cout << "g1, g2: " << are_graphs_isomorphic(g1, g2) << endl;
cout << "g2, g1: " << are_graphs_isomorphic(g2, g1) << endl;
}
If perimeter is the number of steps to walk around an island, you can just search for connected components and count how many of their cells border the water.
This will detect islands and islands within islands, and the perimeter in the case of an island with a lake includes the inner shore line.
An alternative is to use a maze runner; it's more complicated though. This way, it detects only the outer shore line and ignores the inner part of the island. That way it can be fast on large islands, but it will not detect islands within islands.
/*
000
010
000
perimeter = 1
0000
0110
0110
0000
perimeter = 4
00000
01110
01010
01010
01110
00000
perimeter = 10
0000000
0*****0
00**1*0
000***0
0000000
* indicates perimeter (would be '1's)
perimeter = 11
0000000
0*****0
0000000
perimeter = 5
0000000
0*000*0
0*****0
0**0*00
perimeter = 10
*/
#include <vector>
#include <queue>
#include <array>
#include <iostream>
using namespace std;
const int DIRECTIONS = 4; // up,right,down,left
const array<array<int, 2>, DIRECTIONS> MOVES{ { { -1, 0 }, { 0, 1 }, { 1, 0 }, { 0, -1 } } };
const array<array<int, 2>, DIRECTIONS> LEFT_FIELD{ {{0, -1}, { -1, 0 }, {0, 1}, { 1, 0 } } };
ostream& operator <<(ostream& os, const vector<int>& v);
ostream& operator <<(ostream& os, const vector<vector<int>>& vv);
bool is_within_map(vector<vector<int>>& map, int x, int y);
bool is_unvisited_land(vector<vector<int>>& map, int x, int y);
bool is_water(vector<vector<int>>& map, int x, int y);
bool is_unvisited_water(vector<vector<int>>& map, int x, int y);
bool is_waterLeft(vector<vector<int>>& map, int x, int y, int d);
vector<int> calc_perimeter(vector<vector<int>>& map)
{
vector<int> result;
deque<pair<int, int>> s;
for (int x = 0; x < map.size(); ++x) s.push_front({ x, -1 });
while (!s.empty()) {
int x = s.back().first;
int y = s.back().second;
s.pop_back();
if (is_unvisited_land(map, x, y)) {
// start maze runner here
int perimeter = 0;
int d = 0, ox = -1, oy = -1, od = 0; // originates from
do {
if (is_unvisited_land(map, x, y)) { // if it hasn't been visited
map[x][y] = 3; // visited land
perimeter++; // count the field as perimeter
}
if (is_unvisited_water(map, x, y + 1)) { // keep scanning
s.push_front({ x, y + 1 });
}
// adjust direction to not walk into water
for (int i = 0; i < DIRECTIONS; ++i, d = (d + 1) % DIRECTIONS) {
if (!is_water(map, x + MOVES[d][0], y + MOVES[d][1])) break;
}
if (ox == -1 && oy == -1) { // is it the first step?
ox = x;
oy = y;
od = d;
} else {
if (x == ox && y == oy && d == od) break; // walked around once
}
x += MOVES[d][0];
y += MOVES[d][1];
if (!is_waterLeft(map, x, y, d)) { // no water on the left-> turn left
d = (d - 1 + DIRECTIONS) % DIRECTIONS;
}
} while (true);
result.push_back(perimeter);
} else if(is_water(map, x, y)) { // it's water
if (is_within_map(map, x, y)) {
map[x][y] = 2; // visited water
}
if (is_within_map(map, x, y + 1)) { // move further right
s.push_front({ x, y + 1 });
}
}
}
return result;
}
int main()
{
vector<vector<int>> map{
{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, },
{0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, },
{0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, },
{0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, },
{0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, },
{0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, },
{0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, },
{0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, },
{0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, },
};
cout << "before:" << endl << map << endl << endl;
cout << "result:" << calc_perimeter(map) << endl;
cout << "after:" << endl << map << endl << endl;
}
/* output
before:
{
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
{0,1,0,0,0,1,1,1,0,0,1,1,1,1,1,0,0,1}
{0,0,0,1,0,1,0,1,0,0,0,1,1,1,1,0,0,1}
{0,0,0,0,0,1,0,1,0,0,0,0,0,1,1,0,0,1}
{0,1,1,0,0,1,1,1,0,1,1,1,1,1,1,1,1,1}
{0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
{0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,1,1,0}
{0,1,1,1,1,1,0,0,1,1,1,1,1,0,0,1,0,0}
{0,0,0,0,0,0,0,0,1,1,0,1,0,0,0,0,0,0}
}
result:{1,4,5,1,10,10,23,3}
after:
{
{2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2}
{2,3,2,2,2,3,3,3,2,2,3,3,3,3,3,2,2,3}
{2,2,2,3,2,3,2,3,2,2,2,3,3,3,3,2,2,3}
{2,2,2,2,2,3,2,3,2,2,2,2,2,3,3,2,2,3}
{2,3,3,2,2,3,3,3,2,3,3,3,3,3,3,3,3,3}
{2,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2}
{2,2,2,2,2,2,2,2,3,2,2,2,3,2,2,3,3,2}
{2,3,3,3,3,3,2,2,3,3,3,3,3,2,2,3,2,2}
{2,2,2,2,2,2,2,2,3,3,2,3,2,2,2,2,2,2}
}
*/
bool is_within_map(vector<vector<int>>& map, int x, int y)
{
return (static_cast<unsigned int>(x) < map.size()
&& static_cast<unsigned int>(y) < map[x].size());
}
bool is_unvisited_land(vector<vector<int>>& map, int x, int y)
{
return is_within_map(map, x, y) && map[x][y] == 1;
}
bool is_water(vector<vector<int>>& map, int x, int y)
{
return !is_within_map(map, x, y) || map[x][y] == 0 || map[x][y] == 2; // visited or unvisited water
}
bool is_unvisited_water(vector<vector<int>>& map, int x, int y)
{
return is_within_map(map, x, y) && map[x][y] == 0;
}
bool is_waterLeft(vector<vector<int>>& map, int x, int y, int d)
{
return is_water(map, x + LEFT_FIELD[d][0], y + LEFT_FIELD[d][1]);
}
ostream& operator <<(ostream& os, const vector<int>& v)
{
os << "{";
bool first = true;
for (auto e : v) {
if (!first) os << "," << e;
else os << e;
first = false;
}
os << "}";
return os;
}
ostream& operator <<(ostream& os, const vector<vector<int>>& vv)
{
os << "{" << endl;
for (auto v : vv) {
os << " " << v << endl;
}
os << "}";
return os;
}
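For comparison, the simpler connected-components variant from the first paragraph could look like this (my own sketch, not part of the original answer; it uses the same cell-counting definition of perimeter and the same markers as above: 0 = water, 1 = land, 3 = visited land):

```cpp
#include <vector>
#include <queue>
#include <utility>
using namespace std;

// Flood-fill one island starting at (sx, sy) and count the land cells that
// touch water or the map border.
int island_perimeter(vector<vector<int>>& map, int sx, int sy)
{
    const int dx[] = { -1, 1, 0, 0 }, dy[] = { 0, 0, -1, 1 };
    int perimeter = 0;
    queue<pair<int, int>> q;
    q.push({ sx, sy });
    map[sx][sy] = 3; // 3 = visited land, as in the code above
    while (!q.empty()) {
        int x = q.front().first, y = q.front().second;
        q.pop();
        bool borders_water = false;
        for (int d = 0; d < 4; ++d) {
            int nx = x + dx[d], ny = y + dy[d];
            bool inside = nx >= 0 && nx < (int)map.size()
                && ny >= 0 && ny < (int)map[nx].size();
            if (!inside || map[nx][ny] == 0) borders_water = true; // water or map edge
            else if (map[nx][ny] == 1) { map[nx][ny] = 3; q.push({ nx, ny }); }
        }
        if (borders_water) ++perimeter;
    }
    return perimeter;
}
```

Unlike the maze runner, this counts the inner shore line of an island with a lake too, and a driver loop over all unvisited land cells would find islands within islands.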
for some *intuition* on why longest path is NP-hard on an undirected, general graph:
1. It's not a decision problem: if somebody gave you a longest path, you couldn't verify it in polynomial time, because in order to do so, you would need to know that there isn't another, longer path. A strong argument that the optimization version isn't NP-complete but NP-hard (the decision version, "is there a simple path with at least k vertices", is the one that is NP-complete).
2. Why can't BFS / DFS solve the problem? Because the order in which you traverse the graph matters, for example:
(b)
/ \
(a)--(c)--(e)
\ /
(d)
let's assume some DFS traversals:
i) if it did d->a->b->c->e, we have it: that is obviously a longest path
ii) d->a->c->e, a->b: isn't, but one can argue, similar to how the longest path in a DAG
works, one just needs to extend the existing DFS-tree when detecting a forward node,
so let's consider:
iii) d->c->b->a, c->e: maybe if we start a DFS from each node? e.g. from (e) we must find
the longest path?
iv) e->c->d->a->b: cool, how about
v) e->c->a->d, a->b: hmm... maybe it ain't that easy
3) you can reduce from the Hamiltonian path problem: if a Hamiltonian path exists, that is the longest path. Deciding whether a Hamiltonian path exists in a graph is a well-known NP-complete problem.
@studip.innovates: "neighbors are one direction mapped" is a minimal change in wording, but a completely different problem: in a DAG (directed acyclic graph) the longest path problem is in P; the original problem is NP-hard though.
[en.wikipedia.org/wiki/Longest_path_problem]
Assumptions:
- "unique path" means a path given by labels is non-ambiguous
- longest path means the path that includes the most labels / visits the most vertices, without visiting a vertex twice (a Hamiltonian path, if it exists, is the best solution)
- as the name says, it's an undirected graph; I assume the given adjacency-list
representation of the graph ensures every edge (u,v) originating at u has an edge (v,u)
originating at v in its "neighbors" list
- If I encounter a cycle, I do not consider infinite paths as a solution and try to find
the longest path that does not go through a cycle (I ignore the cycle)
- I rename "neighbors" to adjacents
The problem is NP-hard, so an exponential solution is okay. I am not sure if there are approximation algorithms or what they would look like.
Therefore, I'd completely forget about graph algorithms and do some sort of counting
using the adjacency-list representation of the graph:
- try all nodes as the first node
- now for the second node, try all adjacents of the first node that aren't already used
- repeat until no more options are available; do this recursively, so all options are
evaluated. Stop early if a Hamiltonian path (all vertices visited once) was accidentally found.
#include <unordered_set>
#include <vector>
#include <algorithm>
using namespace std;
class Node // defined before the functions below: a forward declaration is not enough to call getAdjacents()
{
private:
int value_;
unordered_set<Node*> adjacents_;
public:
const unordered_set<Node*>& getAdjacents() const { return adjacents_; }
void link(Node* other) {
other->adjacents_.insert(this);
adjacents_.insert(other);
}
};
int max_path_rec(const vector<Node*>& graph, Node* previous, unordered_set<Node*>& usedNodes)
{
int max_sub_path = 0;
for (Node* next : previous->getAdjacents()) {
if (usedNodes.count(next) == 0) {
usedNodes.insert(next);
max_sub_path = max(max_sub_path, 1 + max_path_rec(graph, next, usedNodes));
usedNodes.erase(next);
// early exit: the sub-path already covers every node not on the current prefix
if (max_sub_path == static_cast<int>(graph.size() - usedNodes.size())) break;
}
}
return max_sub_path;
}
int max_path(const vector<Node*>& graph)
{
int max_graph = 0;
for (Node* first : graph) {
unordered_set<Node*> usedNodes{ first };
max_graph = max(max_graph, 1 + max_path_rec(graph, first, usedNodes));
if (max_graph == static_cast<int>(graph.size())) break; // a Hamiltonian path was found
}
return max_graph;
}
This can be nicely implemented if the dictionary is in a trie.
Just start finding words in the trie, and put all word breaks onto
a stack, together with the number of words found so far (1 at the beginning).
Then go back to the stack, pick the next candidate, and try to extend it until you
eventually reach the end of the string.
The question is when to stop. If you want to be sure to minimize, you have to
work through the complete stack. Intuitively, I would say the algorithm will first
try the longest words (with common prefixes); if it can make it through, the first solution
has a high chance of being the best. "High chance" needs to be investigated if your life
depends on speed and accuracy.
Here is an implementation with a hash set for the dictionary; slower, but it doesn't need
a trie implementation.
#include <string>
#include <unordered_set>
#include <stack>
#include <algorithm>
#include <limits>
#include <iostream>
using namespace std;
int min_word_breaks(const string& inputStr, const unordered_set<string>& dict)
{
int min_word_breaks = numeric_limits<int>::max();
size_t max_word_len = 0;
for (const auto& word : dict) max_word_len = max(max_word_len, word.size());
stack<pair<size_t, int>> s({ {0, 0} });
while (!s.empty()) {
auto idx = s.top().first;
auto word_ct = s.top().second;
s.pop();
if (idx == inputStr.length()) {
min_word_breaks = min(min_word_breaks, word_ct - 1);
} else if (word_ct + 1 < min_word_breaks) { // explore if I could beat current best
for (size_t len = 1; idx + len <= inputStr.length() && len <= max_word_len; ++len) {
if (dict.count(inputStr.substr(idx, len))) {
s.push({ idx + len, word_ct + 1 });
}
}
}
}
return min_word_breaks;
}
int main()
{
cout << min_word_breaks("CatMat", { "Cat", "Mat", "Ca", "tM", "at", "C", "Dog", "og", "Do" }) << endl;
}
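For reference, the trie variant described at the top could look roughly like this (my own sketch, not part of the original answer; `Trie` and `min_words` are made-up names, and it assumes ASCII input). It walks the trie from each break position and recurses on every dictionary word that matches:

```cpp
#include <array>
#include <limits>
#include <memory>
#include <string>
#include <algorithm>

// Minimal trie over ASCII characters; a set isWord flag marks a dictionary word.
struct Trie {
    std::array<std::unique_ptr<Trie>, 128> next{};
    bool isWord = false;
    void insert(const std::string& w) {
        Trie* t = this;
        for (unsigned char c : w) {
            if (!t->next[c]) t->next[c].reset(new Trie());
            t = t->next[c].get();
        }
        t->isWord = true;
    }
};

// Minimal number of words s[idx..] splits into, or INT_MAX if it can't be split.
int min_words(const std::string& s, size_t idx, const Trie& root)
{
    if (idx == s.size()) return 0;
    int best = std::numeric_limits<int>::max();
    const Trie* t = &root;
    for (size_t i = idx; i < s.size(); ++i) {
        t = t->next[static_cast<unsigned char>(s[i])].get();
        if (!t) break; // no dictionary word continues with this prefix
        if (t->isWord) {
            int sub = min_words(s, i + 1, root);
            if (sub != std::numeric_limits<int>::max())
                best = std::min(best, sub + 1);
        }
    }
    return best;
}
```

With the dictionary from the main above, `min_words("CatMat", 0, trie)` yields 2 ("Cat" + "Mat"); the number of breaks is one less than the number of words.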
For the first question, just traverse the vote list and, if vote.T <= T, increment
the vote count for person vote.Name. While doing that, track the maximum.
(O(n*l) time, O(c*l) space, where c is the number of candidates and l is the average length of a name)
follow-up 1: instead of maximizing one, keep the hashtable with votes[person] = number of votes.
Now put that into a vector and find the k-th element (using e.g. quickselect /
quicksort's partition method, which is linear on average)
(O(n*l) time, O(c*l) space)
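The selection step from follow-up 1 can be sketched with std::nth_element (my own sketch; the names and counts are made up):

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>
using namespace std;

// Keep the k highest-voted candidates in the first k slots, O(c) average
// time, without fully sorting the vector.
vector<pair<string, int>> top_k(vector<pair<string, int>> counts, size_t k)
{
    if (k > counts.size()) k = counts.size();
    nth_element(counts.begin(), counts.begin() + k, counts.end(),
        [](const pair<string, int>& a, const pair<string, int>& b) {
            return a.second > b.second; // descending by vote count
        });
    counts.resize(k); // order within the top k is unspecified
    return counts;
}
```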
follow-up 2: I assume the top K candidates at a certain time T are given and I have to find that time.
I have to keep all candidates sorted at each step and compare the top k of them with
the given list. The first part (keeping candidates sorted at each step) can be done
using a balanced binary tree, so I have O(n*lg(n)+n*l) for creating and maintaining that tree.
(I will have to map the name to an integer and have a nameId instead of the
string in the tree)
Then I would have k comparisons per iteration, which is O(k*n*lg(n)+n*l). The factor k
I can get rid of if I implement the tree so that I monitor when elements enter and
leave the top k. If one of the desired candidates enters the top k, I decrement the number of
candidates I still need in the top k; if one leaves, I increment it back. When that counter (which
starts at k) reaches 0, I'm done: I found the first time the desired condition happened.
#include <string>
#include <vector>
#include <unordered_map>
using namespace std;
struct Vote
{
int time_;
string name_;
};
string find_top_at_time(const vector<Vote>& votes, int time) {
unordered_map<string, int> votes_per_name;
string max_votes_name;
int max_votes_count = -1;
for (const Vote& vote : votes) { // O(n)
if (vote.time_ <= time) {
auto it = votes_per_name.find(vote.name_); // O(l)
if (it == votes_per_name.end()) {
it = votes_per_name.insert({ vote.name_, 0 }).first; // O(l) amortized
}
it->second++;
if (it->second > max_votes_count) {
max_votes_count = it->second;
max_votes_name = vote.name_; // O(l)
}
}
}
return max_votes_name;
}
select the first with P(first) = 1 and keep that as the selected number
if the 2nd arrives, select it with P(2nd) = 1/2, run the experiment (e.g. if (random(2) == 0)) and take it as the selected number accordingly, otherwise keep the other
if the 3rd arrives, select it with P(3rd) = 1/3, e.g. if (random(3) == 0) ...
etc...
you need to prove uniformity, given your random(int max) is uniform. Element k is selected if it is picked at step k and then survives every later step:
P(selected = k) = 1/k * (1 - 1/(k+1)) * (1 - 1/(k+2)) * ... * (1 - 1/n)
= 1/k * k/(k+1) * (k+1)/(k+2) * ... * (n-1)/n = 1/n (the product telescopes)
e.g. P(selected = first) = 1/1 * 1/2 * 2/3 * ... * (n-1)/n = 1/n
int selectFromStream(intstream s) { // pseudocode: intstream and random() are assumed helpers
size_t read_so_far = 0;
int selected = numeric_limits<int>::min(); // suppose min never occurs in the stream
while(!s.is_eof()) {
read_so_far++;
if(random(read_so_far) == 0) { // suppose random(max) is uniform over [0, max)
selected = s.get_current();
}
s.next();
}
return selected;
}
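A runnable version of the sketch (my own harness, using <random> instead of the assumed random(max) helper):

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Reservoir sampling with a reservoir of size one: pick a single element
// uniformly from a sequence processed as a one-pass stream.
int select_from_stream(const std::vector<int>& stream, std::mt19937& rng)
{
    int selected = 0;
    std::size_t read_so_far = 0;
    for (int value : stream) {
        ++read_so_far;
        // replace the current pick with probability 1/read_so_far
        if (std::uniform_int_distribution<std::size_t>(0, read_so_far - 1)(rng) == 0) {
            selected = value;
        }
    }
    return selected;
}
```

Repeating the experiment many times should show each stream element selected with frequency close to 1/n.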
what's a higher programming language? Is Perl? Is Bash? You basically only need a way to parse the file, an associative dictionary (ideally a hash table), and a block function you can somehow call.
Does it make sense to block on requests per IP only? It might be a proxy server or many users sitting behind a NAT. ... anyway, iterating the logs and counting per IP using a hashtable should do it.
1) it's just the Manhattan distance: first move m-1 fields down, then n-1 left. Probably the original question had some fields in the matrix one couldn't touch. Then we would need a shortest path algorithm; simple to implement would be single-source BFS, elegant and fast would be A*. In the middle is bi-directional BFS (easier to implement than A*, still quite wasteful on a map).
2) There is an infinite number of paths, because there is no requirement that limits the path length or prohibits cycles. So, if there is one path, there are infinitely many (you can cycle between any two fields as long as you want).
If you limit either the path length or set a constraint of not visiting a field twice, the problem has a DP solution.
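The length-limited variant has a straightforward DP over the step count. A sketch (my own interpretation of that follow-up; it counts walks, i.e. revisits are allowed, of exactly `steps` moves from (0,0) to (m-1,n-1) with 4-directional moves):

```cpp
#include <cstdint>
#include <vector>
using namespace std;

// cur[x][y] = number of walks of length s that end at (x, y); one more step
// distributes each count to the four neighbours.
uint64_t count_paths_with_length(int m, int n, int steps)
{
    vector<vector<uint64_t>> cur(m, vector<uint64_t>(n, 0));
    cur[0][0] = 1; // one way to stand at the start after 0 steps
    const int dx[] = { -1, 1, 0, 0 }, dy[] = { 0, 0, -1, 1 };
    for (int s = 0; s < steps; ++s) {
        vector<vector<uint64_t>> nxt(m, vector<uint64_t>(n, 0));
        for (int x = 0; x < m; ++x)
            for (int y = 0; y < n; ++y)
                if (cur[x][y] != 0)
                    for (int d = 0; d < 4; ++d) {
                        int nx = x + dx[d], ny = y + dy[d];
                        if (nx >= 0 && nx < m && ny >= 0 && ny < n)
                            nxt[nx][ny] += cur[x][y];
                    }
        swap(cur, nxt);
    }
    return cur[m - 1][n - 1];
}
```

On a 2x2 grid with exactly 2 steps there are two walks (right-down and down-right); forbidden fields would simply be skipped in the inner loop.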
it's a common problem when serializing/de-serializing, just in the deep-copy flavor.
There are multiple pointers to the same Node, so I need to create each Node only once. I don't know which reference comes first, so I need a kind of factory method that creates the node if it isn't already created or returns the pointer to the existing copy. Object identification in this example is the pointer value.
Memory cleanup is not done here; the code will leak, of course.
#include <unordered_map>
using namespace std;
class Node
{
char value_;
Node* next_;
Node* random_;
public:
Node(char value) : value_{ value }, next_{ nullptr }, random_{ nullptr } {}
Node* deepCopy() const {
unordered_map<const Node*, Node*> refs; // hashtable, orig-ptr -> copy-ptr
const Node* current = this; // start copying at this node, not at next_
Node* thisCpy = getOrCreate(refs, current);
Node* currentCpy = thisCpy;
while (current != nullptr) {
currentCpy->random_ = getOrCreate(refs, current->random_);
currentCpy->next_ = getOrCreate(refs, current->next_);
current = current->next_;
currentCpy = currentCpy->next_;
}
return thisCpy;
}
private:
static Node* getOrCreate(unordered_map<const Node*, Node*>& refs, const Node* orig) {
if (orig == nullptr) return nullptr;
auto it = refs.find(orig);
if (it == refs.end()) {
Node* copy = new Node{ orig->value_ };
refs[orig] = copy;
return copy;
}
return it->second;
}
};
Cool, I like the randomization approach, but I would still combine it with the layered bitmap described below: imagine you have to find the last empty slot among 1 Mio; you will, on average, try 1 Mio times. There are two issues: as the empty slots become few, it takes longer and longer to find one, and it's non-deterministic in terms of run time - it is all easy only with few active sessions.
The other issue is, a 6-digit session number chosen at random doesn't really introduce security, meaning it's still easy to guess a session id if there is no other security measure. Randomized session keys that do introduce some security are significantly larger, so that the space of used numbers is sparse.
But randomized combined with the layered bitmap would be cool :-) although still nondeterministic, but that might be ok... ...
@shilpitaroy87: distribution is not an easy problem. You have only 1 Mio sessions, and active players will probably move with the time zones around the world (maybe it's night at a certain location...). I guess at this point the 10^6 sessions maybe don't make so much sense anymore (technically). Maybe you have servers on each continent, just to improve the gaming experience due to round-trip times and to save costs. So, a session might be local to a geographical location as well. That's the first and maybe easiest catch.
Then even in a geographically local zone, you might end up with the need to have two or three such session servers (replicas). If you can have a master and a hot standby, then the only thing you need to make sure is that your slave is "almost always" in sync with the master, plus a load balancer that will switch to the slave if the master is offline (and a strategy for how the master can get back online).
But what if one master can't deal with the traffic anymore? You could start with the DNS server again and do round robin etc. However, this will partition your session space, etc. etc.
I think at this point there are so many questions to ask, a generic answer is not possible anymore. What infrastructure does the company have? Do they host on a public cloud? What does that cloud offer? How many concurrent sessions in a time zone do we expect? Why 10^6 session id's, why don't we extend? Should we do a perfect solution that only few can pay for, or a pragmatic approach that we can grow later, if we have money? Are we a startup or are we Google? etc. etc. etc.
updated:
He may have wanted a bit array instead of a HT. Why? Because a HT storing 32-bit session ids will use more than 32 bits per entry, so when all sessions are assigned you have more than 4 bytes * 1 Mio. That's probably between 4 MB and 8 MB, not a big deal actually. Lookup is O(1), etc... But you can do better in terms of memory usage and constant factors if you expect many, many concurrent sessions (not worth it for 10 or 100).
Use an integer array, let's say 64-bit: each int can store 64 used/not-used infos. You will use 1 MBit, that's 32 times less space. Get and set are bit shifting, direct memory access and bit masking. No amortized O(1) from growing the HT, no hash calculation, no re-allocation.
Something else to consider is the get function: how to quickly get an unused session. O(n) is obvious but not good. We could introduce a layered bitmap (a structure in the spirit of a skip list): an array that marks whether there is an empty slot in the layer below, each layer reducing the size by, let's say, 64.
at the bottom layer you have ceiling(10^6 / 64) = ~16'000 64-bit integers, 128 KByte
the layer above stores in each bit whether all of the 64 integers below are full; that's ~250 64-bit integers, 2 KBytes
the layer above that would be 4 64-bit integers
... you get log64(n) layers, so you can find an empty session in log64(1 Mio) * 64 bit-manipulation steps (worst case), 5 * 64 ~ 300 instructions, few cache misses, should be fast
/* sample with 4 bit integers instead of 64 (all binary)
1. layer 0001
||||
|\\\---------\
| \\----- \
| \ \ \
2. layer 0101|1001|0000|1111
3. layer 0001|1111|0111|1111|1111|0101|0011|1111
note: a bit in an upper layer is 1 only if all 4 bits of the group below are 1;
otherwise a 0 marks that there is some empty slot below
*/
The 64 I used because I assume a 64-bit architecture; it makes little sense to fetch 32-bit values from memory / cache when I can fetch 64 bits at the same speed. The bottleneck will be memory and cache I/O anyway, so use the bus width.
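A minimal sketch of such a layered free-slot structure (my own; two layers of 64, i.e. 4096 slots instead of 10^6, to keep it short; __builtin_ctzll is a GCC/Clang builtin):

```cpp
#include <array>
#include <cstdint>

// A set bit means "session id in use". top_ bit i is set when words_[i] is
// completely full, so finding a free id is two count-trailing-zeros
// operations instead of a linear scan.
class SessionSlots {
    uint64_t top_ = 0;
    std::array<uint64_t, 64> words_{};
public:
    int acquire() { // returns a free session id, or -1 if all 4096 are in use
        if (top_ == ~0ULL) return -1;
        int w = __builtin_ctzll(~top_);      // first word with a free bit
        int b = __builtin_ctzll(~words_[w]); // first free bit in that word
        words_[w] |= 1ULL << b;
        if (words_[w] == ~0ULL) top_ |= 1ULL << w; // word is now full
        return w * 64 + b;
    }
    void release(int id) {
        words_[id / 64] &= ~(1ULL << (id % 64));
        top_ &= ~(1ULL << (id / 64)); // its word has a free slot again
    }
};
```

For 10^6 slots you would add a third layer the same way; acquire and release stay a constant number of word operations per layer.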
Also interesting would be a follow-up: how to distribute/scale across machines if you had many more sessions and want some resilience against single-machine failure / network partitioning.
I understood there is a service that receives an SMS with a keyword and sends back flights via SMS? Or available seats?
I suppose you want to mock the SMS service so you can test the system without actually needing the round trip over the cell network?
It depends on when you want to test that and on which infrastructure. You might go as far as having fake numbers that your text-message gateway will reroute to an e-mail or any other queue for testing. But then you might want to make sure those are not accessible from outside, lest you give away your seats to fake customers and end up in non-fake news ...
not sure how this design will scale, but anyway, I understood we have:
- enum integer values that are hierarchical, e.g. each decimal place represents a hierarchy level
- there is a way to look up the textual representation of an enum value, e.g. there is a hashtable with integer keys and the text representations as values
- you want the parent key of a key? that is integer division by 10
- I wonder what you do if there are more than 10 children... but anyway, here is the code
unordered_map<int, string> enums_{ {1, "car"}, {11, "toyota"}, {111, "prius"} /*, ... */ };
pair<int, string> get_parent(int value) {
if (value >= 10) {
auto it = enums_.find(value / 10);
if (it != enums_.end()) {
return *it; // (parent key, parent text)
}
}
throw "error, has no parent (?)";
}
1. Assumptions:
- coordinates are integers (no floating point)
- forming rectangles means I need the 4 corner points
- parallel to x,y means there is either no difference in x or in y direction between two points adjacent to each other (horizontal or vertical lines)
- points and lines count as area 0
2. Put all points into a hashtable with key x << 32 | y
3. for each pair of points, check if the other two corners exist; if so, maximize the area
This is O(n^2); I guess a better runtime is feasible.
in C++ with the major overflow issues covered
#include <vector>
#include <unordered_set>
#include <iostream>
using namespace std;
using ull = unsigned long long;
constexpr ull key(int x, int y) noexcept {
return static_cast<ull>(x) << 32 | static_cast<unsigned int>(y);
}
ull find_max_area(const vector<pair<int, int>>& points) // vector of (x,y)
{
if (points.size() < 4) return 0; // fewer than 4 points can't form a rectangle
ull max_area = 0; // points and lines count as area 0
unordered_set<ull> points_ht;
for (auto& p : points) points_ht.insert(key(p.first, p.second)); // could remove duplicates here
for (size_t i = 0; i + 1 < points.size(); ++i) {
for (size_t j = i + 1; j < points.size(); ++j) {
ull area = static_cast<ull>(abs(points[i].first - points[j].first)) // the subtractions can still overflow for extreme coordinates
* static_cast<ull>(abs(points[i].second - points[j].second));
if (area > max_area
&& points_ht.count(key(points[i].first, points[j].second)) > 0
&& points_ht.count(key(points[j].first, points[i].second)) > 0) {
max_area = area;
}
}
}
return max_area;
}
int main()
{
cout << find_max_area({ {0,0}, {0,3},{0,6},{2,0},{2,3},{2,6},{5,6},{6,0},{6,4} }) << endl; // 12
}
As mentioned by others, I need to maintain the intervals that are covered and don't overlap.
This can be done using a BST keeping the starts in order. I need to ensure nothing overlaps;
that is, when I insert an interval, I might need to merge. When I merge, I only count
the part of the interval that was not already covered.
time complexity: O(n*lg(n))
space complexity: O(n)
I think the implementation can be significantly less complex. Didn't test corner
cases though.
It turns out it's slightly easier to implement using half-open intervals.
With a Fenwick tree (binary indexed tree) we can't do much good here, because the coverage of intervals is not a reversible operation (unlike e.g. sum).
#include <vector>
#include <map>
#include <limits>
#include <iostream>
using namespace std;
struct Interval
{
int start_, end_;
};
void countCoveredLength(const vector<Interval>& stream)
{
int covered_space = 0;
map<int, int> covered({ { numeric_limits<int>::min(), numeric_limits<int>::min() },
{ numeric_limits<int>::max(), numeric_limits<int>::max() } }); // map<start, end>
for (const Interval& i : stream) {
if (i.start_ != i.end_) { // ignore empty intervals
auto it = prev(covered.upper_bound(i.start_)); // it points to interval that starts before i
if (it->second < i.start_) { // if it ends before i starts, add a new one
it = covered.insert(it, { i.start_, i.start_ });
}
while (it->second < i.end_) { // if i ends after it, extend tree
auto it_next = next(it);
if (it_next->first <= i.end_) { // next interval in tree starts before i ends, merge them
covered_space += it_next->first - it->second;
it->second = it_next->second;
covered.erase(it_next);
}
else { // next interval in tree starts after i ends
covered_space += i.end_ - it->second; // new end - old end
it->second = i.end_;
}
}
}
cout << covered_space << " ";
}
}
int main()
{
countCoveredLength({ {1,3},{4,7},{5,8},{3,6},{9,12} });
}
Maybe this question is inspired by the Google Bigtable paper. But I am not sure what the question really is. Do you want all the urls ending in .com or ending in google.com?
The question asks for a substring, which could be "e.c" ... But then the fact that those strings are urls doesn't matter.
Let's assume the list of URLs is huge and updated rarely, whereas there are many queries. Let's further assume the query asks for a common suffix, like give me all urls ending in .com or all urls ending in "e.com".
In this case, we could reverse the urls ("com.google...") and sort them once. Then we can binary search into the sorted list. That can be parallelized neatly (all steps).
That's basically what @Alex does, just that he inverts the string when constructing the Trie, and constructing the Trie is favorable in O-notation. Just keep in mind, Tries are difficult to scale for big data...
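The reverse-and-sort idea can be sketched like this (my own sketch; it assumes ASCII urls so that '\x7f' is an upper bound for the prefix range):

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// Reverse every url, sort once, then answer a suffix query with a binary
// search for the reversed suffix as a prefix of the sorted list.
vector<string> urls_with_suffix(vector<string> urls, const string& suffix)
{
    for (auto& u : urls) reverse(u.begin(), u.end());
    sort(urls.begin(), urls.end()); // done once, amortized over many queries
    string rs(suffix.rbegin(), suffix.rend());
    auto lo = lower_bound(urls.begin(), urls.end(), rs);
    string hi_key = rs;
    hi_key.push_back('\x7f'); // just above any rs-prefixed string (ASCII assumption)
    auto hi = lower_bound(urls.begin(), urls.end(), hi_key);
    vector<string> result;
    for (auto it = lo; it != hi; ++it)
        result.push_back(string(it->rbegin(), it->rend())); // reverse back
    return result;
}
```

In a real system the reversing and sorting would of course happen once at build time, not per query.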
Assumptions:
- Results are sets, meaning a number can occur at most once (e.g. {1,2,3} is a set,
{1,1,2} is a multi-set; the latter is not taken into account as a valid result)
- The empty set is considered to sum to 0, so there is always at least one solution:
the empty set, which is a valid subset of any set
- I simplify the problem by using a set as input. n = number of elements in the set.
Brute force solution:
- There are 2^n possibilities, which include the empty set.
- The exponential brute force approach just tests all those options
- The problem is a well-known NP-complete problem
time complexity O(2^n), space complexity O(n)
Pseudo-polynomial time algo:
- One can construct a solution using knapsack-style counting twice
- Let's assume S* is the powerset of the input A and S is a set in that powerset
- If sum(S) = 0, I can say sum(P) = -sum(N), where P contains the positive elements of S
and N the negative ones.
- So, if I had a hashmap with all sums reachable by subsets of A consisting of
only positive numbers, and a hashmap with the same for the negative numbers of A, I
could calculate in a linear pass all combinations that sum to 0
- There is a complication, which is "0" itself: first, the set {0} is a solution;
second, every other combination could come with the 0 element or without.
So if 0 exists in A, it doubles the number of solutions and adds 1. Therefore I use
0 neither in the positive nor in the negative subsets.
the time and space complexity is
O(n*min(sum(A[i], where A[i] > 0), sum(-A[j], where A[j] < 0)))
Below are both implementations in C++11
#include <unordered_map>
#include <unordered_set>
#include <algorithm>
#include <iostream>
using namespace std;
// --- brute force algo ---
size_t count_subsets_sum0_bf_rec(const unordered_set<int>& input,
unordered_set<int>::iterator pos, int sum)
{
if (pos == input.end()) return sum == 0 ? 1 : 0;
int value = *pos;
++pos;
return count_subsets_sum0_bf_rec(input, pos, sum + value) +
count_subsets_sum0_bf_rec(input, pos, sum);
}
size_t count_subsets_sum0_bf(const unordered_set<int>& input)
{
return count_subsets_sum0_bf_rec(input, input.begin(), 0);
}
// --- poly time algo ---
unordered_map<int, int> sum_count(const unordered_set<int>& input, int max_sum)
{
unordered_map<int, int> previous;
previous[0] = 1;
for (auto it = input.begin(); it != input.end(); ++it) {
unordered_map<int, int> current;
for (auto it2 = previous.begin(); it2 != previous.end(); ++it2) {
current[it2->first] += it2->second;
if (it2->first + *it <= max_sum) { // sum is within range
current[it2->first + *it] += it2->second;
}
}
swap(current, previous);
}
return previous;
}
size_t count_subsets_sum0_poly(const unordered_set<int>& input)
{
int pos_sum = 0, neg_sum = 0;
unordered_set<int> pos, neg;
for (int e : input) {
if (e > 0) {
pos.insert(e);
pos_sum += e;
} else if (e < 0) {
neg.insert(-e);
neg_sum -= e;
}
}
int min_sum = min(pos_sum, neg_sum);
unordered_map<int, int> pos_sums = sum_count(pos, min_sum);
unordered_map<int, int> neg_sums = sum_count(neg, min_sum);
int count = 0;
for (auto& ps : pos_sums) {
if (ps.first == 0) continue;
auto nsit = neg_sums.find(ps.first);
if (nsit != neg_sums.end()) {
count += nsit->second * ps.second;
}
}
if (input.size() != neg.size() + pos.size()) { // did the input contain a 0 element?
count = count * 2 + 1; // every set found so far also works with 0 added, plus the set {0}
}
}
count++; // the empty set
return count;
}
void test(const unordered_set<int>& input) {
cout << "test case: {";
for (int e : input) cout << e << " ";
cout << "}" << endl;
int poly = count_subsets_sum0_poly(input);
cout << " poly result: " << poly << endl;
int bf = count_subsets_sum0_bf(input);
cout << " brute force result: " << bf << endl << endl;
}
int main()
{
test({ });
test({ 0 });
test({ 0, -1, 1 });
test({ 0, -1, -2, 1, 2 });
test({ 0, -1, -2, -3, 1, 2, 3, 4, 5, 6 });
test({ -1, -2, -3, 1, 2, 3, 4, 5, 6 });
test({ 0, -1, -2, -4, -6, 1, 2, 3, 4, 5, 6 });
}
without a use case this is hard to answer out of the blue; a few questions that might help moving it forward:
- How big is the data?
- Is it structured data or a stream (e.g. video, audio)?
- Is it just a configuration issue?
- What do you plan to do with the data once received?
- Does it fit in memory on the requesting server (the one that looks up the data in the cache)? How about if it handles multiple requests? Does it just send the data on to another machine or server?
- Can you break it down into smaller pieces and get those pieces individually (multi-get)?
- Can you work on parts of the data?
- Can you compress the data? (does this make sense?)
- Maybe the server just streams the data back to the client, so it might help to use ip-rewriting and make the cache servers send individual chunks to the client... not sure if memcached can do that...
... maybe you can add a use case and some numbers?
neat!
my approach (assuming the same keyId can open a door from both sides):
A node in the graph is a room. A room has two attributes, a keyId (int) and a treasure flag
(bool). The edges are the doors. A door has one more attribute, the keyId (int)
it needs in order to be opened. KeyId -1 is a sentinel for "no key".
To solve it, I choose a DFS which I perform until I reach a door I can't open. Then
I put the blocked branch into a HT with the keyId as HT-key and a list of rooms
I can continue with once I get that key as HT-value.
It will be linear in the number of rooms if the graph is sparse, or linear in the number
of doors if the graph is dense.
#include <iostream>
#include <vector>
#include <stack>
#include <algorithm>
#include <unordered_map>
#include <unordered_set>
using namespace std;

struct Room;

struct Door
{
    Room* next_;
    int keyId_;
}; // not a good representation of the real world, but convenient for the algo

struct Room
{
    int keyId_ = -1;
    bool hasTreasure_ = false;
    vector<Door> doors_;

    Room(int keyId) : keyId_(keyId), hasTreasure_(false) { }
    Room(bool hasTreasure) : keyId_(-1), hasTreasure_(hasTreasure) { }

    void connect(Room *next, int keyId = -1) {
        doors_.push_back({ next, keyId });
        next->doors_.push_back({ this, keyId });
    }
};

bool hasPuzzleASolution(const Room* start)
{
    unordered_map<int, vector<const Room*>> lockedRooms; // keyId -> locked rooms
    unordered_set<int> keys{ { -1 } }; // collected keys (-1 = "no key needed")
    unordered_set<const Room*> explored; // rooms explored
    stack<const Room*> s{ {start} }; // rooms explorable

    while (s.size() > 0) {
        auto current = s.top();
        s.pop();
        if (explored.count(current)) continue; // multiple paths could have led me here
        explored.insert(current);
        if (current->hasTreasure_) return true; // found a / the treasure
        if (keys.count(current->keyId_) == 0) { // a new key
            keys.insert(current->keyId_);
            auto lrIt = lockedRooms.find(current->keyId_);
            if (lrIt != lockedRooms.end()) { // unlocked some rooms
                for_each(lrIt->second.begin(), lrIt->second.end(),
                         [&s](const Room* r) { s.push(r); });
                lrIt->second.clear();
            }
        }
        for (auto door : current->doors_) {
            if (explored.count(door.next_) > 0) continue;
            if (keys.count(door.keyId_) > 0) { // already have the key
                s.push(door.next_);
            } else {
                lockedRooms[door.keyId_].push_back(door.next_);
            }
        }
    }
    return false;
}

int main()
{
    Room a(false), b(1), c(3), d(-1), e(2), f(true);
    a.connect(&b);
    b.connect(&c, 6);
    a.connect(&c, 1);
    a.connect(&d, 1);
    d.connect(&e, 3);
    e.connect(&f, 7);
    cout << hasPuzzleASolution(&a) << endl;
    c.connect(&f, 2);
    cout << hasPuzzleASolution(&a) << endl;
    return 0;
}
@Alex, that's right as well. I'd also build a histogram if I had to reduce space and couldn't shorten the interval.
- Chris November 29, 2017