Amazon Interview Question
Developer Program EngineersCountry: United States
I apologize, I read blob throughout the thread and took it as "blob" (fits in any space).
Thanks pxe for "flood fill" (label given for DFS on these things).
For the rectangular problem, let's consider a square matrix of width N (simply so that solving recurrences and comparing becomes easier in one variable). We can modify the algorithms back to MxN after we compare runtimes (with some faith).
1) Worst Possible Brute force:
Pick two points in the grid (topleft and bottomright corners), and check if it encloses a valid area.
Looks O(N^4)*O(N^2) = O(N^6) or something horrible like that (might be a loose upper bound).
2) How about divide and conquer? (As usual, assume N is power of 2 to make recurrence easier to solve):
Divide the grid up into 4 smaller squares, and recurse into each to find max area of any valid blob in each.
We must also consider any blob that spans more than 1 of the subsquares. We can fill this out starting from the centre to find the max such rectange. This is upperbound O(N^2). We then return the max of the centre blob, and the 4 recursive call return values up the call chain.
So recurrence looks T(N) = 4*T(N/2) + O(N^2)
T(N) = O(N^2 lgN) , should be easy to code
correct me if I'm wrong please
3) Theoretical minimum for problem: T(N) = O(N^2) , gotta touch them all.
You should probably try first to get this theoretical minimum by making use of the heavy dose of overlapping subproblems (and opt. sub. structure) in 1) brute force above. We can modify that into a table-ed bottom up approach I think, and try to get O(N^2).
No time to code, a movie is coming up :)
Wait,
The recursive calls can return more information that can aid the caller.
Ouch, many possibilities abound, still let's leave it as abstract O(N^2) as the combine middle-check step.
Got it I think.
We can pass up some interesting data from the recursive calls to the caller to actually calculate the middle max area in O(N). Try it out. This actually came while thinking about optimizing the 2) D and Q algorithm.
T(N) = 4*T(N/2) + O(N) <--- optimized middle calculation
T(N) = O(N ^(log_base2(4)) ) = O(N^2)
I'll code it up tomorrow. For now, this proves writing down ideas can lead to iterative improvements in solutions (or attempts) very fast.
I will first introduce my method, then give you java code.
Method:
Time Complexity for "MxN" matrix: O(min(M, N)^2 max(M, N))
Algorithm (take M < N):
We will read rows 1-by-1. We have a register which is an array of integers of length "N". Call it sum_zero.
When reading row "r" of matrix update "sum_zero" as:
if (matrix[r][c] == 0)
sum_zero[c]++;
else
sum_zero[c] = 0;
,i.e., I am finding the number of consecutive zeros in column "c" up to row "r".
The rest of the method is described in an example.
Assume we have "5" columns and have read 3 rows. First of all, sum_zero[c] < 4 for all c.
Assume we have
sum_zero = {0, 1, 2, 1, 0}
.
This means the part of matrix we have read so far has been like
{1, ,0 ,0 ,0, 1}
{X, ,1 ,0 ,1 ,X}
{X, ,X, ,1, ,X, ,X}
Note that value of "X" does not matter. So in this case, we have a rectangle of size "1x3" and one of size "2x1". To find the rectangle, all we need is the array sum_zero.
To Do:
Go through array sum zero and look for a continuous run of values larger than "k". Let's say we found L[k] as the largest run. Then, we have a rectangle with dimensions "(k + 1) x L[k]".
When reading row "k", the maximum value of sum_zero[c] is k + 1. So we need to check "k" numbers. This is done in O(k x N).
Since we are repeating the sequence for "k = 1, 2, ..., M", the overall time complexity is
O(N x M^2) and if we are smart, we choose M < N.
This is actually a cubic time for square matrices, which might not be good. But for sure the optimal time complexity is \theta(N^2) so this is not really that bad.
Java Code: (I would like to thank "Mike" for preparing the array (input). I copied it from his code above)
// Class MaxRectangle.java
package careercup;
public class MaxRectangle {
public MaxRectangle(int[][] A) {
matrix = A;
nrows = A.length;
ncolumns = A[0].length;
}
public int GetMaxRectArea() {
int max_area = 0;
int[] sum_zero = new int[ncolumns];
for (int c = 0; c < ncolumns; c++)
sum_zero[c] = 0;
for (int r = 0; r < nrows; r++) {
for (int c = 0; c < ncolumns; c++)
if (matrix[r][c] == 0)
sum_zero[c]++;
else
sum_zero[c] = 0;
// decide
int new_max_area = GetMaxArea(sum_zero, r);
if (new_max_area > max_area)
max_area = new_max_area;
}
return max_area;
}
private int GetMaxArea(int[] sum_zero, int r) {
// Look for longest run of value "k + 1"
int area_to_report = 0;
for (int k = 0; k < r; k++) {
int longest_run = 0;
int this_run = 0;
boolean saw_same = false;
for (int c = 0; c < ncolumns; c++) {
if (saw_same){
// we have been counting a run
if (sum_zero[c] > k)
this_run++;
else {
// end of this run
saw_same = false;
if (this_run > longest_run)
longest_run = this_run;
}
}
else {
// check if a new run is starting
if (sum_zero[c] > k) {
this_run = 1;
saw_same = true;
}
}
}
int new_area = longest_run * (k + 1);
if (new_area > area_to_report)
area_to_report = new_area;
}
return area_to_report;
}
int[][] matrix;
int nrows, ncolumns;
}
// class Program.java
// Nothing special. Just initiating the input and calling the algorithm. Big "thank-you" to Mike for writing the input in a nice reusable format. :)
public class Program {
public static void main(String[] args) {
int[][] input = GetMatrix();
MaxRectangle max_rect = new MaxRectangle(input);
int max_area = max_rect.GetMaxRectArea();
System.out.println("Maximum area in a rectangle of zeros is " + Integer.toString(max_area));
}
private static int[][] GetMatrix() {
int input[][] = { { 0,0,0,1,1,1,1,1,1,0,0,0,1,1 },
{ 1,1,1,0,0,0,1,0,1,1,1,1,1,0 },
{ 1,1,0,0,0,1,1,1,1,1,1,1,0,0 },
{ 0,0,1,1,1,1,0,0,0,0,1,1,1,1 },
{ 0,0,1,1,0,0,0,0,1,1,0,0,0,0 },
{ 1,1,0,0,0,1,0,0,0,0,0,1,1,0 } };
return input;
}
}
public class MatrixArea {
class Rect {
int row;
int col;
int width;
int height;
int area;
}
public static void main(String[] args) {
int matrix[][] = { { 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1 },
{ 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0 },
{ 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0 },
{ 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1 },
{ 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0 },
{ 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0 } };
findArea(matrix);
}
private static void findArea(int[][] input) {
Rect maxArea = new MatrixArea().new Rect();
int row = input.length;
int col = input[0].length;
for (int i = 0; i < row; i++) {
for (int j = 0; j < col; j++) {
if (input[i][j] == 0) {
Rect rect = matrixArea(input, i, j);
System.out.print(rect.area + " ");
if (rect.area > maxArea.area) {
maxArea = rect;
maxArea.row = i;
maxArea.col = j;
}
} else {
System.out.print(" ");
}
}
System.out.println();
}
System.out.println("Max Area " + maxArea.area + " (" + maxArea.width
+ " * " + maxArea.height + ") from index " + maxArea.row + ","
+ maxArea.col);
}
private static Rect matrixArea(int[][] input, int x, int y) {
int row = input.length;
int col = input[0].length;
int width = 0;
int height = 0;
int maxArea = 0;
Rect rect = new MatrixArea().new Rect();
for (int i = x; i < row; i++) {
if (input[i][y] != 0) {
break;
}
int curWidth = 0;
for (int j = y; j < col; j++) {
if (input[i][j] == 0) {
curWidth++;
} else {
break;
}
}
if (curWidth == 0)
break;
if (curWidth < width || width == 0)
width = curWidth;
height++;
int area = height * width;
if (area > maxArea) {
rect.width = width;
rect.height = height;
rect.area = area;
maxArea = area;
}
}
return rect;
}
}
This was fun and interesting, thanks.
Typing idea raw here (pseudocode + with bugs):
// assume RxC array, R : number of rows, C: number of columns
unsigned int maxarea=0;
for(r=0; r<R; r++)
for(c=0; c<C; c++)
if(A[r][c] == 0)
maxarea = max( tallyarea(A, r, c, R, C) , maxarea ); //inline func.
//recreate array:
for(r=0; r<R; r++)
for(c=0; c<C; c++)
if(A[r][c] == -1)
A[r][c]= 0;
//return max area:
return max area;
//helper recursive function
unsigned int tallyarea(A, r, c, R, C)
{
int i,j;
int areaaround=0;
if( ! (0<=r && r<C && 0<=c && c<C ) ) return 0; //if out of bounds
A[r][c]= -1; //mark this cell as already counted as part of some area
for(i=-1; i<=1; i++)
for(j=-1; j<=1; j++)
if(!(i==0 && j==0) && A[r+i][c+j]==0) //if a nonzero cell is around us
arearound += tallyarea(A,r+i,c+j); //accumulate any unseen area
return areaaround + 1; //unseen area around this cell plus area cell
}
This should be runtime O(RC), the theoretical min. for this problem.
There might be a better way to do this, but I haven't bothered googling.
What would someone call such an algorithm? Blob area something (like Adnan says)?
It's basically DFS, the way I saw it.
I allowed above diagonal connections as part of blobbing more area.
We can change this bit to be more restrictive otherwise
8 directions:
for(i=-1; i<=1; i++)
for(j=-1; j<=1; j++)
if(!(i==0 && j==0) && A[r+i][c+j]==0) //checks 8 directions
The other obvious blob extending case is to only allow 4 directions. It becomes easier to code....
//helper recursive function
unsigned int tallyarea(A, r, c) <-- assume knows about R,C or whatever
{
int d;
unsigned int areaaround=0;
//if out of bounds or not a unseen zero cell, no area to add
if( !(0<=r && r<C && 0<=c && c<C ) || A[r][c] != 0) return 0;
A[r][c]= -1; //mark this cell as already counted as part of some area
for(d=-1; d<=1; d+=2)
areaaround += tallyarea(A,r+d, c) + tallyarea(A,r, c+d);
return areaaround + 1;
}
Yes, sorry I'm a lazy coder, reflected in my style.
I believe your answer considers all blobs which are continuous in either up/down or right/left directions. But the question asks about blobs of type m*n, which means only rectangular blobs.
Look at the example.
It said "areas"
And the input grid has more blobby shapes and barely and real rectangles.
m*n is meaningless and can mean "input is a rectangle, not just a square"
Why do you and I have to guess?
This place has serious issues. Efficiency is garbage here.
People spend 1 min. speed typing their questions because they want to save 1 min. of their time, and cause havoc. Selfish idiots.
We can also use the Flood-Fill algorithm. Here's the complete code
#include<stdio.h>
#include<stdlib.h>
#define ROW 6
#define COL 14
int c[ROW][COL]={0,0,0,1,1,1,1,1,1,0,0,0,1,1,
1,1,1,0,0,0,1,0,1,1,1,1,1,0,
1,1,0,0,0,1,1,1,1,1,1,1,0,0,
0,0,1,1,1,1,0,0,0,0,1,1,1,1,
0,0,1,1,0,0,0,0,1,1,0,0,0,0,
1,1,0,0,0,1,0,0,0,0,0,1,1,0};
void FloodFill_Recursive(int row,int col, int color,int NewColor, int &cnt)
{
if(row<0 || col<0)
return;
if(row>=ROW || col>=COL)
return;
if(c[row][col]==color)
{
cnt++;
c[row][col]=NewColor;
FloodFill_Recursive(row+1,col,color,NewColor,cnt);
FloodFill_Recursive(row-1,col,color,NewColor,cnt);
FloodFill_Recursive(row,col+1,color,NewColor,cnt);
FloodFill_Recursive(row,col-1,color,NewColor,cnt);
}
}
void disp()
{
for(int i=0;i<ROW;i++)
{
for(int j=0;j<COL;j++)
printf("%d ",c[i][j]);
printf("\n");
}
}
int main()
{
int cnt=0;
int x=-1;
int y=-1;
int StartX,StartY;
int max=INT_MIN;
disp();
for(int i=0;i<ROW;i++)
{
x=i;
for(int j=1;j<COL;j++)
{
y=j;
FloodFill_Recursive(i,j,0,8,cnt);
if(cnt>max)
{
StartX=x;
StartY=y;
max=cnt;
}
cnt=0;
}
}
printf("Largest Blob Size:\t %d",max);
FloodFill_Recursive(StartX,StartY,8,7,cnt);
printf("\n");
disp();
return 1;
}
@ pxe, you are right. I forgot this constraint. It basically fills the largest connected areas.
@pxe, I have modified this code to consider only the rectangular areas. Here's the new function:
void FloodFill_Recursive(int row,int col, int color,int NewColor, int &cnt)
{
if(row<0 || col<0)
return;
if(row>=ROW || col>=COL)
return;
if((c[row][col]==color || c[row][col]==NewColor) && (row+1 < ROW && col+1 <COL && ((c[row][col+1]==color || c[row][col+1]==NewColor) && (c[row+1][col]==color|| c[row+1][col]==NewColor) && (c[row+1][col+1]==color || c[row+1][col+1]==NewColor))))
{
if(c[row][col]!=NewColor)
{
c[row][col]=NewColor;
cnt++;
}
if(c[row][col+1]!=NewColor)
{
c[row][col+1]=NewColor;
cnt++;
}
if(c[row+1][col]!=NewColor)
{
c[row+1][col]=NewColor;
cnt++;
}
if(c[row+1][col+1]!=NewColor)
{
c[row+1][col+1]=NewColor;
cnt++;
}
FloodFill_Recursive(row,col+1,color,NewColor,cnt);
FloodFill_Recursive(row+1,col,color,NewColor,cnt);
FloodFill_Recursive(row+1,col+1,color,NewColor,cnt);
}
}
The main function is essentially as above. Only the "j" counter should start from zero now.
From all zeros, start to grow rectangular region.
- Mark October 23, 2013If I can tell a zero covered region is convex or concave, I would be able to save more time as I can start only one 0 in a convex region; that is, if I get an area from one 0 of a convex region, starting from all other zero will yield the same result.