Google Interview Question
Software Engineer / DevelopersHere is a possible solution in C#. It lacks error checking of arguments. The processing step is O(n^2) both in time and space. After that, any average can be computed in O(1) time.
// build a running sum of input
// output[x,y] = sum of all elements in the rectactange with top left as [0,0] and bottom right as [x,y]
int[,] Process(int[,] input)
{
int[,] output = new int[input.GetLength(0), input.GetLength(1)];
int row, col;
output[0, 0] = input[0, 0];
// populate top most row of output
for (col = 1; col < input.GetLength(1); col++)
{
output[0, col] = output[0, col - 1] + input[0, col];
}
// populate left most column of output
for (row = 1; row < input.GetLength(0); row++)
{
output[row, 0] = output[row - 1, 0] + input[row, 0];
}
// populate remaining elements of output
for (row = 1; row < input.GetLength(0); row++)
{
for (col = 1; col < input.GetLength(1); col++)
{
output[row, col] = output[row, col - 1] - output[row - 1, col - 1] + output[row - 1, col] + input[row, col];
}
}
return output;
}
// compute sum of all elements in a sub-rectangle defined by start and end coordinates
int Sum(int[,] output, int startRow, int startCol, int endRow, int endCol)
{
int A, B, C, D;
A = (startRow == 0 || startCol == 0) ? 0 : output[startRow - 1, startCol - 1];
B = (startRow == 0) ? 0 : output[startRow - 1, endCol];
C = (startCol == 0) ? 0 : output[endRow, startCol - 1];
D = output[endRow, endCol];
return D - C - B + A;
}
// compute the number of elements in a sub-rectangle
int CountElements(int startRow, int startCol, int endRow, int endCol)
{
int totalRows = endRow - startRow + 1;
int totalCols = endCol - startCol + 1;
return totalRows * totalCols;
}
// compute the average of all the elements in a sub-rectange
double Average(int[,] output, int startRow, int startCol, int endRow, int endCol)
{
return (double) Sum(output, startRow, startCol, endRow, endCol) /
CountElements(startRow, startCol, endRow, endCol);
}
I dint get why do we need all this ? Can you please explain ?
Just to calculate the sum... is the below function not enough ?
int getAverageOfRectangle(int x1,int y1,int x2,int y2){
if(mapp.ContainsKey("x1"+"x2"+"y1"+"y2"))
return mapp.get("x1"+"x2"+"y1"+"y2");
int c=0;
for(int i=x1;i<x2;i++) {
for(int j=y1;j<y2;j++) {
sum+=matrix[i,j];
c++;
}
}
mapp.put("x1"+"x2"+"y1"+"y2",sum/c);
return sum/c;
}
That function will not run in constant time. If the subarray is very large like 100,000 X 100,000 elements, then it will take a really long time to run. The c# code above will take a really long time to run in order to preprocess. But after that, the average of any subarray can be done in constant time.
public class MatrixAverage {
public static void main(String[] args) throws Exception {
int n = 3;
int m = 4;
int[][] x = {{1,2,3,4}, {5,6,7,6}, {9,2,6,1}};
print(x);
processMatrix(x);
print(matrix);
System.out.println("Average " + getAverage(1, 1, 2, 3));
}
private static int[][] matrix;
static void print(int[][] a) {
int n = a.length;
if(n == 0) {
return;
}
int m = a[0].length;
for(int i = 0; i < n; ++i) {
for(int j = 0; j < m; ++j) {
System.out.print(a[i][j] + " ");
}
System.out.println();
}
}
public static float getAverage(int i1, int j1, int i2, int j2) {
int sum = matrix[i2][j2] - matrix[i2][j1-1] - matrix[i1-1][j2] + matrix[i1-1][j1-1];
int count = (j2-j1 + 1) * (i2 - i1 + 1);
return ((float)sum) / ((float)count);
}
static void processMatrix(int[][] input) {
if(input == null) {
return;
}
int n = input.length;
if(n == 0) {
return;
}
int m = input[0].length;
matrix = new int[n][m];
matrix[0][0] = input[0][0];
for(int i = 1; i < n; ++i) {
matrix[i][0] = matrix[i - 1][0] + input[i][0];
}
for(int j = 1; j < m; ++j) {
matrix[0][j] = matrix[0][j - 1] + input[0][j];
}
for(int i = 1; i < n; ++i)
for(int j = 1; j < m; ++j) {
matrix[i][j] = matrix[i][j-1] + matrix[i-1][j] - matrix[i-1][j-1] + input[i][j];
}
}
First idea that came to my mind:
Assume the graph is represented in adjacency matrix format,
Serialize:
1. count the number of columns in the matrix. Say it's C.
2. Write C<delimiter> to file
3. Now traverse the matrix in row major way and write each value<delimiter> to the file.
Deserialize:
1. Read the first character from file, it's number of columns in the matrix. say it's C
2. Read C values from the file(ignore delimiter), that would be a row of adjacency matrix.
3. Keep doing 2 until end of file.
I convert the problem as rather than calculate the average, calculate the sum. Once you get the sum, you can count the number of elements in constant time and then yield the average.
- Xiaonan Ji March 08, 2011Suppose we ask for sum of a rectangle M[row1:row2,col1:col2], which spreads across rows between row1 and row2 and columns between col1 and col2. We have:
Sum(M[row1:row2,col1:col2]) = Sum(M[1:row2,1:col2]) - Sum(M[1:row1,1:col2]) - Sum(M[1:row2,1:col1]) + Sum(M[1:row1,1:col1]).
From the above formula, we can see that as long as we can precalculate sum of rectangle M[1:row,1:col], for each row and each column, we can get sum of any rectangles in constant time.
The precalculation takes another matrix (denoted as A) with size O(nm). For each cell (row,col), we store the sum of M[1:row,1:col]. We can use dynamic programming to do the preprocess, which has the incremental calculation as:
A[i,j] = A[i-1,j] + A[i,j-1] + M[i,j] - A[i-1,j-1].
Clearly, it takes an extra matrix to store the preprocess matrix. Thus the space is doubled. It takes O(nm) time to preprocess. After that, it takes constant time to calculate average of any rectangle.
I didn't read sai's C# code and verified whether his idea is the same as mine or not. Anyway it is no harm to take his as implementation and mine as comments to better explain the solution if the ideas are the same.