Amazon Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: In-Person
For Example, suppose we are looking for M(2,4), then Max, sub_size, sub_index, index goes as follow.
-------------------------------------------------------------------------------
|(1,2) | (1,3) | (1,4) | (1,5) | (2,3) | (2,4) | (2,5) | (3,4) | (3,5) | (4,5)|
-------------------------------------------------------------------------------
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
-------------------------------------------------------------------------------
<-----------------------------Max---------------------------------------------->
<-----------------------sub_size-------------->
^
|
sub_index
^
|
index
I understand that it takes O(N^2) space.
how about run time for query. I believe it is O(lgN). Correct me if I am wrong. Thank you
Actually, this is not optimized storage. Given any 3 points, you can have the 3rd distance figured out by the other 2. Using that logic, one can cut the size of the 1 array to at least half (if not more). With some more thinking, one can index all those, too.
@Son: Given the lengths of AB and BC in a triangle, can you find the length of AC? The answer is no. It might still be possible to use a knowledge of certain distances to compute certain other distances, but it's not as simple as you suggest.
@Vincent: It's constant time, assuming O(1) arithmetic operations. There's nothing in there such that the running time depends on N.
@SRB though the logic is correct but you mention "Store the Matrix in single dimensional array by removing all the elements on & above major diagonal" which means only those elements will be stored for which i>j however in the example you have taken elements above major diagonal but thats fine as the logic is correct
My Q is do we need to do so much calculations? consider if we only save lower half as u mentioned (i>j). Then in your example elements saved will be
-------------------------------------------------------------------------------
|(2,1) | (3),1 | (3,2) | (4,1) | (4,2) | (4,3) | (5,1) | (5,2) | (5,3) | (5,4)|
-------------------------------------------------------------------------------
So for calculating index in new array we can just use
new_index = (i-2)*(i-1)/2 +j
because for reaching any element in virtual 2D array we will have to skip (i-2)*(i-1)/2 elements in i-1 rows and then j elements in ith column
@kevin: Here is the formula:
Address of any element a(i,j) = number of elements up to a(i,j) element
= Total number of elements in first (i-1) rows + number of elements upto jth column in the ith row
= 0+1+2+...(i-2) + j
= (i-2)*(i-1)/2 + j
Here is why (0+1+2...(i-2) )
suppose take 3x3 matrix
a11 a12 a13
a21 a22 a23
a31 a32 a33
we are taking lower triangular matrix excluding diagnol element, so here is the number in each row
0
1
2
I suggest following solution (written in java):
public static void main(String[] args) {
int N = 5;
int[][] matrix = new int[][]{
{ 0, 1, 2, 3, 4},
{ 1, 0, 5, 6, 7},
{ 2, 5, 0, 8, 9},
{ 3, 6, 8, 0,10},
{ 4, 7, 9,10, 0},
};
int[] matrixArray = new int[]{
1, 2, 3, 4,
5, 6, 7,
8, 9,
10
};
// print to ensure that representation is ok
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
System.out.print(getMatrixElement(matrixArray, i, j, N)+" ");
}
System.out.println();
}
}
static int getMatrixElement(int[] matrixArray, int i, int j, int N) {
//check index validity
if (i < 0 || N <= i || j < 0 || N <= j) throw new IllegalArgumentException();
//check equality
if (i == j) return 0;
// ensure j<i
if (i < j) {
int tmp = i;
i = j;
j = tmp;
}
int rowOffset;
if (j == 0){
rowOffset = 0;
} else {
rowOffset = (2*N-j-1) * j/2; // offset is an arithmetic progression like: 4 .. 3 .. 2 .. 1
}
return matrixArray[rowOffset + (i-j-1)];
}
/*
Assume the elements are stored in the array as given below
(2,1),
(3,1),(3,2),
(4,1),(4,2),(4,3),
(5,1),(5,2).......
*/
#include<stdio.h>
#define N 20
int find_distance_index(int a,int b);
main()
{
int i,j,distance[(N*N-1)/2];
FILE *fp;
fp=fopen("vin.txt","w");
for (i=1;i<=N;i++)
{
for(j=1;j<=N;j++)
{
fprintf(fp,"%d \t",find_distance_index(i,j));
}
fprintf(fp,"\n");
}
fclose(fp);
getchar();
return 0;
}
int find_distance_index(int a,int b)
{
int return_value = 0;
if(a!=b)
{
return_value= (a>b)?((a-1)*(a-2)/2)+b: ((b-1)*(b-2))/2+a;
}
return(return_value-1);
}
If you know the distance from a to b and a to c, you don't need to store the distance from b to c
yes, we need to if the points are in 2-D space. In 1-D space we can sort the points on the basis of their co-ordinate and store it.
If you know the distance AC and you also know AB, how will you find BC? You cannot infer the length of the third side of a triangle from the length of the two other sides.
The problem with the matrix is that it occupies N*N rooms of the memory, where only a small subset of node-pairs (let's say m) have distance information. To solve this problem, one can use a hashmap, where the keys are the pair information, and the values are the distance between the two. The key of a pair could be defined as an interger, which can be uniquely built having the pair ids. For example, (x*N + y) can be used as the key id for the nodes with ids x, and y.
Using this scheme, the required space for storing the information will be: 2*len(Integer) *m/alpha , which for an enough large N, it would be much smaller than N*N*len(Integer)
But here say if all N*N values are there after removing the symmetric ones, Then also the amount of N * (N-1) / 2 values is so huge to store in a hashmap right. Lets say N is 10000.
"The problem with the matrix is that it occupies N*N rooms of the memory, where only a small subset of node-pairs (let's say m) have distance information."
I don't understand what you mean by this sentence. The problem statement suggests a distance is defined between every pair of points, not some small subset.
In case the matrix is asymmetric, then can we use the following approach?
Create a hashMap, with key=distance, value=bucket of pairs having such a distance between them.
PROS: If the overall original matrix size >> number of distance distances between pairs, then this approach will have a lesser space requirement.
CONS: If the number of distinct distances between pairs is comparable to size of the original matrix, then this approach will fail.
KD trees must be used for storing the data.
building a kd-tree has O(N·logN) time complexity and O(K·N) space complexity
nearest neighbor search - close to O(logN)
M nearest neighbors - close to O(M·logN)
Store the Matrix in single dimensional array by removing all the elements on & above major diagonal.
E.g. Let matrix size is 5X5, then the single dimensional array will have following list
-------------------------------------------------------------------------------
|(1,2) | (1,3) | (1,4) | (1,5) | (2,3) | (2,4) | (2,5) | (3,4) | (3,5) | (4,5)|
-------------------------------------------------------------------------------
The following function will calculate the exact index position of (i,j) in single dimensional array of matrix size NXN.
S(N) = O(N*(N-1)/2)
- SRB January 14, 2013O(N) = O(1)