## Facebook Interview Question

Software Engineer / Developers**Country:**United States

**Interview Type:**In-Person

There could be an exponential number of paths, so this would not be fast. Finding all *independent* paths isn't right either, because the new shortest path to a destination need not be independent of the original shortest path; it needs to simply not contain 1 edge.

Can you talk a little more about it? If the graph has cycles. It maybe runs forever to find all paths from source to destination.

Run dijkstra's on the graph (assuming we don't have negative edge cycles). For each node keep an adjacency list for best paths for ex: I can reach node 'f' directly from s, but when I also know next best path is via 'g' so adjacency list of 'f' will have s->g->.... Once we've this adjacency list setup, we just have to find path from destination to source traversing best available paths (fetched from our list).

There are many ambiguities in that definition of the problem. First of all is the graph weighted or unweighted? If graph is weighted all you need to do to avoid considering some edge is mark it somehow and then skip it during traversal. Dijkstra with binary heap would give you O(n lg n). Simple solution, something must be missing.

I saw this problem "Going Office" on Interviewstreet.

PS - This is not my solution. Credit goes to Richard Kempe

The main idea behind the problem is simple: given a weighted undirected graph G=(V,E) and two nodes s,t∈V, we wish to answer Q queries of the following form:

What is the shortest path from s to t given that edge e∈E is removed from the graph?

(This type of problem is also known as the "most vital arc" problem )

Let N=|V| and M=|E| as usual. In this case, we are guaranteed that N,M,Q≤200000.

Prerequisites

You should definitely know Dijkstra's algorithm before attempting to solve this problem. Also, knowing about segment trees with lazy propagation would be good, but is not required to understand the solution.

Building Bridges

First of all, it should be clear that naively running Dijkstra's isn't going to run in time, because it requires O(QNlogN) operations. In fact, the size of the input suggests that we should look for at worst an O(NlogN) solution. What can we do then? We may first try to solve an easier problem: can we determine whether an edge e, when removed, will increase the shortest path length between s and t?

Definitions:

An edge is optimal if it is on a shortest path from s to t.

An edge is a bridge* if it is on all shortest paths from s to t. (Here we refer only to shortest paths in G, without having removed any edges.)

Let ds(u) (resp. dt(u)) be the shortest-path distance from s (resp. t) to u, and let OPT be the shortest-path distance from s to t. Then we can show the following properties:

Lemmata:

e=(u,v) is optimal if and only ds(u)+length(e)+dt(v)=OPT.

e=(u,v) is a bridge if and only if it is optimal and for all other optimal e′=(u′,v′), we have either ds(v′)≤ds(u) or ds(v)≤ds(u′). In other words, when we travel along an optimal path from s to t, then between lengths ds(u) and ds(v), we must be traveling along e. By convention we denote a bridge with the letter b.

As a corollary of the above statement, e is a bridge if and only if, when removed from G, the shortest path distance from s to t increases.

(Proofs are left to the reader.) These properties should give you some idea of how to find all the bridges in G. If you still need some hints, an algorithm for computing bridges is included at the end of this post.

Islands

Now we have a way of identifying which edges, when removed, affect the shortest path length. This still leaves one major question unanswered: given a bridge b, how exactly do we determine the shortest path length from s to t once b is removed? Running Dijkstra's repeatedly is too slow, since we can easily construct examples in which every edge is a bridge. (Consider a graph consisting of only a path from s to t.)

Let K denote the number of bridges. Note that Lemma 2 above implies that we can uniquely order the bridges b0,b1,…,bK−1 based on their relative distances from s. Could we use this to help us?

Definition: Let the i-th island be defined as the set of all vertices v, such that there exists a shortest path from s to v using no more than i bridges. (If v is not connected to either s or t, then we can ignore it -- it lies on some strange continent somewhere far away.)

The intuition behind islands and bridges is this: bridges are the fastest way to "travel" between the islands when going from s to t. If we remove an edge that is not a bridge, our shortest path is not affected. However, when we take out a bridge, we are forced to travel along another (possibly much longer) path.

Proposition: Consider the bridge bi, connecting the i-th and i+1-st islands. Let Ei be the set of all edges e connecting an island with index ≤i to an island with index >i. Then the shortest path from s to t that bypasses bi must have length

mine=(u,v)∈Ei{ds(u)+length(e)+dt(v)}

Proof: Any path P from s to t bypassing bi must include some edge in Ei. Now consider some edge e=(u,v)∈Ei -- the shortest path from s to t that goes through e must have length ds(u)+length(e)+dt(v). Lastly, by the way we have constructed the islands, we know that bi is not on the shortest paths from s to u or from v to t.

A Data Structure for Bypass Length

Let bypass(i) denote the length of the shortest path that bypasses bi. The above observations suggest that we can run the following algorithm to compute bypass(i):

For each non-bridge edge e=(u,v) connecting islands i and j, where i<j:

For each k∈{i,i+1,…,j−1}:

bypass(k)←min{bypass(k),ds(u)+length(e)+dt(v)}

However, there is a slight problem with this algorithm: it's too slow. Through some tricky analysis, you can show that in the worst case, we can construct a graph that will cause the above subroutine to take O(M3/2) operations.

What we need is a data structure that supports the following two operations efficiently:

update(i,j,val), which runs bypass(k)←min{bypass(k),val} for all k∈{i,i+1,…,j−1}

query(i), which returns bypass(i)

One possible data structure that can support both operations in O(logK) time is called a segment tree with lazy propagation or a segment tree with range update. There are several tutorials online about this type of tree, and it's a good data structure to be familiar with. (The main problem right now is that the structure is rather complicated and the online tutorials are not very good. I may do an article on it some time, if there is sufficient demand.)

Putting It All Together

The previous observations suggest the following algorithm for solving the problem:

Compute ds and dt using Dijkstra's algorithm.

Find the bridges:

First find all the optimal edges by iterating over all edges e=(u,v), and storing the edges such that ds(u)+length(e)+dt(v)=OPT.

Sort all the optimal edges by ds(u) and ds(v). Use the criterion from above to find the bridges.

Find the islands.

One way (the hard way) is to run a modified version of Dijkstra.

A smarter and shorter alternative is to just run multiple depth-first searches from s and from the far end of each bridge. It's up to the reader to figure out how to do so.

Find the bypassing paths.

Initialise your favourite range-update data structure.

Iterate over all the non-bridge edges. If e=(u,v) is an edge crossing from island i to island j>i, then range-update bypass(i),bypass(i+1),…,bypass(j−1) such that they are no more than ds(u)+length(e)+dt(v).

Process all the queries. If the edge in question is not a bridge, then we simply return the optimal length. Otherwise, we query the range-update data structure to find the best bypass length.

All of the above steps run in require O(RlogR) operations where R=max{N,M,Q}, so the algorithm has complexity O(RlogR).

Wait...what? If there's only 1 path, and you remove one of the edges on it, won't that mean there now definitely won't be any paths.

?

Guys the graph is weighted .. and we cannot sort and store all pahts as the number of paths could be 10^22.

You should clarify the problem more. What did you mean when you said there's only one path between any two cities? If that's true, it sufficies to check whether the deleted edge is on the only path, and depending on that, the shortest path will either be the path you had before or will not exist.

If there is only one path between any two cities, there are cycles in the graph. It's a minimum spanning tree. Can you revisit the problem and clear the ambiguities.

Sounds like the edges are weighted (representing the distance), such as e1 = (v1,v2,d) and there exists at most 1 edge between any two nodes? Then a weighted shortest path algorithm helps...but do we want to execute it fully n times, once for each round, for the entire graph?

Suppose on round 1, some edge e1 = (v1,v2,d) is missing and the shortest path is computed from the start to end node. but we could also compute the shortest path from

the start node to every node and save those paths (or at least the cost).

Then on round 2, e1 is placed back in and another edge, e2 = (v3,v4,d) is removed. If e2 is on the shortest path in round 1, then the solution for round 1 no longer holds, otherwise the solution from round 1 -may- still hold.

By placing e1, is e1 now on the new shortest path...seems like we could focus on the cost of the path from (a) the start node to v1, (b) cost of e1, (c) the cost of v2 to the end node. (a) and (b) are already known, so we compute (c), add them together to get the cost of the new path containing e1.

So maybe something along these lines, where shortest path is recomputed, not for the whole graph, but from the vertices of the two changed edges.

Suppose we have run the Dijkstra algorithm on original graph: g0;

We have got the shortest distance array dist0[n].

Now, we have a new graph: g with the edge (w, v) removed;

dist0[n] and dist[n] are two global arrays, which store the shortest distance array from source of the original and removed edge graph.

Algorithm should be as followed;

void shortest(int s, int t) // s source, t destination

{

// 1. initialize

// copy dist0[n] to dist[n]// dist[n] is the array storing the new shortest distance;

for (int v = 0; v < n; v++)

{

dist[v] = dist0[v];

}

// 2. compare dist0[v] - dist0[w] and weight(w, v), if weight(w, v) == dist0[v] - dist0[w], we need // to update dist[v];

if (dist0[v] - dist0[w] == weight(w, v))

{

dist[v] = 10000000;

for ( j in adjR[v] ) // adjR is the reversed graph adjacency list

dist[v] = min(shortest, dist[j] + weight(j, v));

} // new dijkstra greedy score dist[v] is the minimum score of all other vertices scores + edge weights;

// recursively update the dijkstra greedy score vertices connected to vertices v;

update(s, t, v);

}

// recursive program.

void update(int s, int t, int v)

{

int u;

for (u in adj[v])

{

if (dist0[u] - dist0[v] = weight(u, v))

{

dist[u] = dist[v] + weight(u, v);

update(s, t, v);

}

}

One way could be to store the path lengths from source to dest in in a min heap. now whenever you remove the an egde , you check if the edge belongs to the shortest path(O)1) time). If yes then delete the egde, swap the shortest path with the path stored at the end of heap array, and heapify the array(excluding the swapped array). continue this process till you find a the shortest path which is not a part of removed edge.

This can work for the cases when edge is added back. Just figure out the path sums which were part of that edge , increase the size of heap array by 1 and heapify again to get the shortest path

find all paths from source to destination, sort them and store.

- Anonymous April 22, 2012remove one edge, go through the stored paths and find the first which doesn't contain the removed edge