Interview Question
Software Engineer / DevelopersCountry: United States
Yep, this was a CodeSprint RocketFuel problem. It's probably harder than anything you'll encounter in an entry-level interview.
This is an interesting problem. So I spent some time writing it out. It is harder than a general interview question. But if you know dijkstra and dp well, the solution is still not obscure. But it is true that I cannot finish this code in 45 mins.
The code pasted here is correct. It's a complete standard C++ program with very clear output. Everyone here could copy it and run it yourself to see the correctness.
I extended the original problem a little bit-- the costs of decreasing a place and increasing it are different. In my implementation, you can control the cost of decreasing 1 unit of the hills as well as increasing them. So in the 2nd test of my code, I choose the increamental cost as 100, so the ouput effectively avoid any increasing move.
The algorithm is simple combination of DP and Dijkstra. Instead of doing dijkstra once, I did it (elevation_of_lake - elevation_of_dam) times. You can consider it as replacing the inner loop of a 2-d DP algorithm with the dijkstra traversal. That's it.
So obviously, the time complexity is O(dijkstra * (elevation_of_lake - elevation_of_dam))
There are many ways to implement dijkstra, depending on the propotion of edges to nodes. In my implementation, I chose a basic way, so the time complexity is O(matrix_size) . A prioriety queue implementation could reduce it by a constant.
In short, the time complexity of my algorithm is O(matrix_size^2 * (elevation_of_lake - elevation_of_dam)) and it is optimal (other implementation of dijkstra can only improve it by a const number, not affecting the big-O)
The space complexity is O(matrix_size) because I did a standard DP memory compression.
If you are willing to understand this non-interview question. I may talk more.
Here is the code.
#include <iostream>
#include <limits>
#include <vector>
using namespace std;
const static double INFINITY = numeric_limits<double>::infinity();
struct Position
{
int m, n;
Position (int m_ = -1, int n_ = -1) : m (m_), n(n_) {}
Position left() const {return Position( m , n - 1);}
Position right() const {return Position (m ,n + 1);}
Position above() const {return Position (m - 1, n );}
Position below() const {return Position (m + 1, n );}
};
class DamLakeSolver
{
int m_row, m_col;
int m_level;
double m_inc_cost;
double m_dec_cost;
vector<int> m_matrix;
vector<double> m_cost_matrix;
vector<bool> m_done_matrix;
vector<Position> m_traceback_matrix;
inline int idx(Position pos) const{ return (pos.m) * (m_col) + (pos.n) ;}
void relax(Position pos0, Position pos1);
void dijkstra(Position lake_pos) ;
Position extractMin() const;
void printPath(Position lake_pos) const;
public:
inline void setIncrementalCost(double cost) {m_inc_cost = cost;}
inline void setDecrementalCost(double cost) {m_dec_cost = cost;}
double solve(Position lake_pos, Position dam_pos);
DamLakeSolver(int * matrix, int row, int col)
:m_row(row), m_col(col), m_level(0), m_matrix(matrix, matrix + col * row), m_inc_cost(1.), m_dec_cost(1.)
{
}
};
void DamLakeSolver::printPath(Position lake_pos) const
{
Position cur_pos = lake_pos;
cout<<"Position: "<<cur_pos.m + 1 <<" "<<cur_pos.n + 1;
Position pre_pos = cur_pos;
cur_pos = m_traceback_matrix[idx(cur_pos)];
while (cur_pos.m != -1)
{
double cost = m_cost_matrix[idx(pre_pos)] - m_cost_matrix[idx(cur_pos)];
cout<<" Cost: "<<cost<<endl<<"Position: "<<cur_pos.m + 1 <<" "<<cur_pos.n + 1;
pre_pos = cur_pos;
cur_pos = m_traceback_matrix[idx(cur_pos)];
}
cout<<endl;
}
double DamLakeSolver::solve(Position lake_pos, Position dam_pos)
{
//programmers count index from 0!
lake_pos.m --;
lake_pos.n --;
dam_pos.m --;
dam_pos.n --;
m_cost_matrix.assign(m_row * m_col , INFINITY);
m_traceback_matrix.assign(m_col * m_row, Position());
for (m_level = m_matrix[idx(dam_pos)]; m_level <= m_matrix[idx(lake_pos)]; ++ m_level)
{
dijkstra(dam_pos);
}
printPath(lake_pos);
return m_cost_matrix[idx(lake_pos)];
}
Position DamLakeSolver::extractMin() const
{
Position min_pos(0, 0);
double min_cost = INFINITY;
for (int i = 0; i < m_row ; ++i)
for (int j = 0; j < m_col; ++j)
{
Position cur_pos(i, j);
if (m_done_matrix[idx(cur_pos)])
continue;
if (min_cost > m_cost_matrix[idx(cur_pos)])
{
min_cost = m_cost_matrix[idx(cur_pos)];
min_pos = cur_pos;
}
}
return min_pos;
}
void DamLakeSolver::dijkstra(Position dam_pos)
{
m_cost_matrix[idx(dam_pos)] = 0;
m_done_matrix.assign(m_row * m_col, false);
m_done_matrix[idx(dam_pos)] = true;
Position cur_pos = dam_pos;
for (int i = 0; i < m_col * m_row; ++i)
{
if (cur_pos.n > 0){ relax(cur_pos, cur_pos.left());}
if (cur_pos.n < m_col - 1) {relax(cur_pos, cur_pos.right());}
if (cur_pos.m > 0) {relax(cur_pos, cur_pos.above());}
if (cur_pos.m < m_row - 1) {relax(cur_pos, cur_pos.below());}
cur_pos = extractMin();
m_done_matrix[idx(cur_pos)] = true;
}
}
void DamLakeSolver::relax(Position pos0, Position pos1)
{
int path_cost = m_cost_matrix[idx(pos0)];
double &cur_cost = m_cost_matrix[idx(pos1)];
double new_cost = 0;
if (m_matrix[idx(pos1)] < m_level)
new_cost = path_cost + m_inc_cost * (m_level - m_matrix[idx(pos1)]);
else
new_cost = path_cost + m_dec_cost * (m_matrix[idx(pos1)] - m_level);
if (new_cost < cur_cost)
{
cur_cost = new_cost;
m_traceback_matrix[idx(pos1)] = pos0;
}
}
int main()
{
int matrix[] = {2, 7, 3, 1, 0,
2, 1, 1, 7, 1,
7, 7, 7, 2, 1};
DamLakeSolver dls(matrix, 3, 5);
dls.setIncrementalCost(1); // the cost of decreasing 1 unit of elevation of the hill.
dls.setDecrementalCost(1); // the cost of increasing 1 unit of elevation of the hill.
cout<<dls.solve(Position(1,1), Position (3,4))<<endl;// lake --> dam
//if you change the cost ratio, the answer changes.
dls.setIncrementalCost(100);
dls.setDecrementalCost(1);
cout<<dls.solve(Position(1,1), Position (3,4))<<endl;
}
A typo, the time complexity of my implementation of Dijkstra is O(matrix_size^2 ) instead of O(matrix_size). It is a typo. You can see that in the conclusion part, I gave the right time complexity: O(matrix_size^2 * (elevation_of_lake - elevation_of_dam))
Why is this the best a Dijkstra's algorithm can achieve? I believe O(matrix_size * (elevation_of_lake - elevation_of_dam)) should be achievable. By matrix size, you mean M*N?
because, the basic way of implementing Dijkstra has a O(V^2) time complexity. To use a min-heap-based priority queue, we could do it in O(ElgV). This only improves the algo when E = O(V^2/lgV), which in this case, does not really hold. It could reduce it by a constant only.
Of course, if we use most advanced Dijkstra algo, which uses a Fibonacci heap, we could do it in O(VlgV + E). However, I don't want to post 500lines to implent the Fibonacci heap.
I suppose my question would then be why your graph has more than O(M*N) edges. I'll have to dig into your solution to find out.
Yes, you are right. Thanks for coming up with this question. After I reevaluated the algorithm, I found my graph does have only O(M*N) edges. So to do Dijkstra using a min-heap-based priority queue will improve the running time by lg(matrix_size) and to do it using a Fibonacci heap, we could reach the best total running time: O(matrix_size*log(matrix_size) * (elevation_of_lake - elevation_of_dam))
The above solution outputs a minimum cost of infinity for input matrix {2, 9, 9, 9, 9,
2, 2, 2, 2, 9,
7, 7, 7, 7, 9}, lake pos(1,1), and dam pos(3,4).
If I understand the constraints correctly, dam height can be decreased, so min cost should be just be 7-5=2, the path then is just through the 2's.
More specifically, the input that will be fed into your program will be formatted as follows:
N M
lakeLocationX lakeLocationY
damLocationX damLocationY
elevationX1Y1elevationX1Y2...elevationX1YM
elevationX2Y1elevationX2Y2...elevationX2YM
.
.
.
elevationXNY1elevationXNY2...elevationXNYM
where lakeLocationX and damLocationX are integers in the interval [1, N], and lakeLocationY and damLocationY are integers in the interval [1, M], and all elements are integers in the interval [0, 9]. N
Given such an input, your program should print out the minimum cost to build an irrigation path from the lake to the dam. The constraints are as follows. The lake and dam will not be at the same location. The elevation of all points except for the lake can be increased or decreased at a cost of one for every unit of change (you may leave the elevation the same for a cost of 0). N and M will each be at most 300. After paying for any excavation that is necessary, you can build a path at 0 additional cost if there is a sequence of points starting at the lake and ending at the dam such that the following are true:
1. (Contiguous path) Each point in the sequence is adjacent to the previous point (no diagonal adjacency -- points in the interior of the map have exactly 4 adjacent points)
2. (Downhill path) Each point in the sequence, including the lake and dam, has an elevation that is at most that of the previous point in the sequence.
For example, if the input is the following:
3 5
1 1
3 4
27310
21171
77721
I don't think I completely understand the question, but if you can decrease and increase the landscape as you want, then the minimum path is basically a straight line from the closest points of the lake and dam.
However if you are going to increase and decrease a height, then the problem directly becomes NP hard.
As I have already commented, if the cost associated with reducing or adding altitude is not significant, just draw a straight line between those two points (use any of line drawing a logs for this), that is basically your cheapest path.
If not, then for each path, we have one additional dimension of cost, which is cost taken to take that point (that is cost of reducing/increasing height of the path).
If you create a graph, this is similar to shortest path problem with weighted edges.
That will give you a N2rd solution.
3-dimensional DP problem: M[i][j][k] - minimal cost to build a canal from the lake to location (i,j) having elevation k.
Accordingly if k > 'height of the lake' then M[i][j][k] = +infy
In fact the problem is 2-dimensional because the elevation can vary from 0 to 9
Though one challenge here is that M[i][j] will depend on all 4 neighbours M[i-1][j], M[i][j-1],
M[i+1][j] and M[i][j+1], and I am not sure how to use "bottom-up" DP approach here..
Probably the best case would be to use recursive algo with memoization
it's like finding a path on chess board, the only difference is you need to "make" your path.
here is my thought:
1. static if the board is small, one can calculate all possible path -- which is (m+n)!/n! for a m*n size board --- and then their cost. pick the smallest one
2. dynamic way (recursive)
start from left up corner (position[0][0]), walk our way down.
keep a set of finished and unfinished paths we have so far, sorted by their cost ( path with same cost are sorted by the length, we keep the longer length one on top, because they are closer to finishing)
----------------------------------------------
take the top path in our set, if it's not fished yet, make it grow. check the last step of the path.
say it's at step [x][y], then we can make our river further flow right, or flow left.
please note that, to make our river flowing right , H[x][y+1] < H[x][y] is not enough, we also need to make sure it won't flow down or left, thus H[x+1][y] > H[x][y], H[x][y-1]>H[x][y].
update our set with these two new paths and their cost. remember to remove the old one first.
-------------------------------
if we keep doing this, the first full path we have is the best (or one of the best) one.
In my opinion, This is modified problem of Travelling Salesmen problem, where in place of cost they hv put the height and the new restriction of no diagonal travel.
Like I said, First Mark all the points in the matrices as NULL from the start and hill as H.
Now use any of prims/kruskal/Dijkstra or watever algo to find the minimum cost.
Rather U can jst modify their algo to derive your own algo with given restriction.
Why in the world would this be a traveling salesman problem? Traveling salesman deals with visiting all the nodes in a graph. Here there is no such requirement. Besides, this problem is clearly not NP-hard since it's being asked.
Dude this is a question rocket fuel contest.Are you sure that this was asked in your Google Interviews? Rocket fuel also had same Problem Statement and test cases thats why I guessed!!
- Vineet Setia May 20, 2012