Amazon Interview Question for SDE1s


Country: United States
Interview Type: Phone Interview




Comment hidden because of low score. Click to expand.
1
of 3 vote

Possible approach :

1. Assume that each element is a node.
2. Each '1' node is connected to the neighbor if the neighbor is also '1'
3. Find the largest connected component of the graph.

- Anonymous April 24, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Do you have some sample code to depict your algorithm ?

- guilhebl May 12, 2015 | Flag
Comment hidden because of low score. Click to expand.
1
of 1 vote

We can DFS starting from some cell travelling until the DFS reaches a 0. We record the number of visited cells as we DFS and return this value. Let's call this DFS function F.

We now do F on every cell, and return the maximum value found from the Fs.

There are of course many ways of optimizing this, but this is the general idea.

- SK April 25, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

We can think of this problem as a sort of graph coloring problem, where each "island" of 1s should have it's unique color, and also each "oceans" of 0s should have it's unique color.

Based on that,:

1. calculate each group or "island" of elements(those clustered together) of 1s, tagging each country with a number starting from 1 (color), and 0s too.

2. after all groups are calculated retrieve size of largest cluster group of 1s.

public static void solveCountNumCountries() {
	int m[][]= {{1,1,0,0,0},{1,1,0,0,0},{1,0,0,1,1},{0,0,0,0,0},{1,0,1,0,1}};
	System.out.println(countNumCountries(m));
}

public static int countNumCountries(int[][] m) {				
	int[][] country = new int[m.length][m[0].length];
	for (int i = 0; i < country.length; i++) {
		for (int j = 0; j < country[0].length; j++) {
			country[i][j] = -1;
		}			
	}
	
	int ctyCount = 1;
	
	for (int i = 0; i < m.length; i++) {
		for (int j = 0; j < m[0].length; j++) {
			if(country[i][j] == -1) {
				expandBorders(m, country, i, j, m[i][j], ctyCount++);
			}
		}
	}

	int[] ctyCounts = new int[ctyCount];
	for (int i = 0; i < ctyCounts.length; i++) {
		ctyCounts[i] = 0;			
	}
	
	int maxCount = 0; 
	for (int i = 0; i < m.length; i++) {
		for (int j = 0; j < m[0].length; j++) {
			ctyCounts[country[i][j] - 1]++;
			if(m[i][j] == 1) {
				maxCount = Math.max(maxCount, ctyCounts[country[i][j] - 1]);
			}
		}
	}
	
	return maxCount;
}

private static void expandBorders(int[][] m, int[][] c, int i, int j, int elemVal, int ctyCount) {						
	if (i >= 0 && i < m.length && j >= 0 && j < m[0].length) {
		if(c[i][j] == - 1 && m[i][j] == elemVal) {
			c[i][j] = ctyCount;
			
			// check borders clockwise rotation
			if (i >= 0 && i < m.length && j > 0 && j < m[0].length) {
				expandBorders(m, c, i, j - 1, elemVal, ctyCount);
			}		
			if (i > 0 && i < m.length && j > 0 && j < m[0].length) {
				expandBorders(m, c, i-1, j-1, elemVal, ctyCount);
			}
			if (i > 0 && i < m.length && j >= 0 && j < m[0].length) {
				expandBorders(m, c, i-1, j, elemVal, ctyCount);
			}
			if (i > 0 && i < m.length && j >= 0 && j < m[0].length-1) {
				expandBorders(m, c, i-1, j+1, elemVal, ctyCount);
			}
			if (i >= 0 && i < m.length && j >= 0 && j < m[0].length-1) {
				expandBorders(m, c, i, j+1, elemVal, ctyCount);
			}
			if (i >= 0 && i < m.length-1 && j >= 0 && j < m[0].length-1) {
				expandBorders(m, c, i+1, j+1, elemVal, ctyCount);
			}
			if (i >= 0 && i < m.length-1 && j >= 0 && j < m[0].length) {
				expandBorders(m, c, i+1, j, elemVal, ctyCount);
			}
			if (i >= 0 && i < m.length-1 && j > 0 && j < m[0].length) {
				expandBorders(m, c, i+1, j-1, elemVal, ctyCount);
			}		 		
			
		}
	}		
}

// output: 5

- guilhebl April 22, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Why do we need to color if we can find out the largest connected component ?

- King@Work May 10, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

@King@Work the solution above expands on each island of connected components finding the largest one, do you have a code sample of your idea and how would you calculate the largest connected component?

- guilhebl May 12, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

General idea:
- View matrix as a graph
- Have a maxSize variable to track the max cluster found so far

1. Search for next unvisited node n with value 1
2. Perform DFS or BFS starting at n where you only continue traversing if the current node is unvisited with a value 1. Keep incrementing a size variable as you traverse.
3. Update maxSize if needed
4. Go back to 1 until there are no more unvisited nodes with value 1
5. Return maxSize

This problem can be broken down into a few methods. I use an auxiliary boolean matrix d to track visited nodes and a Point class to pass around (x,y) pairs.

// Returns next point that contains an unvisited node with value 1
// As a small optimization, start search from x, y, searching row by row, left to right.
Point findNextAvailableCluster(int [][] m, boolean [][] d, int x, int y) {...}

// Perform DFS or BFS starting at (x,y) in m and return the size of the cluster
// Precondition is that (x,y) is an unvisited node with value 1
int getClusterSize(int [][] m, boolean [][] d, int x, int y) {...}

// Keep finding the next unvisited node with value 1
int findMaxClusterSize(int [][] m) {

    if (m == null || m.length <= 0 || m[0].length <= 0) {
        return 0;
    }

    int maxSize = 0;
    boolean [][] d = new boolean[m.length][m[0].length];
    Point p = findNextAvailableCluster(m, d, 0, 0);

    while (p != null) {
        int size = getClusterSize(m, d, p.x, p.y);
        if (size > maxSize) {
            maxSize = size;
        }
        p = findNextAvailableCluster(m, d, p.x, p.y);
    }

    return maxSize;
}

Note that the BFS/DFS only needs to traverse two 'children' at each step: (x+1,y) and (x,y+1)
This is b/c we are always going left->right, top->bottom when finding the next available cluster, and that (x+1,y+1) will be intrinsically searched when looking at one of the children.

- JW April 26, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I decided to tackle this problem a little differently and limit myself to iterative solutions only since others have already done recursive solutions. Here's my C# solution. (If the 2D array size is variable, this would be pretty nasty since it'd be O(n^4). But that wasn't specified, so I'm assuming constant array sizes here.)

public struct ClusterNode
        {
            public bool Checked { get; set; }
            public int ClusterId { get; set; }
        }

        private const int MAX_X = 10;
        private const int MAX_Y = 10;

        private int LargestCluster(ClusterNode[,] clusterArray)
        {
            int result = 0;
            List<int> clusterCount = new List<int>();
            for (int x = 0; x < MAX_X; ++x)
            {
                for (int y = 0; y < MAX_Y; ++y)
                {
                    if (clusterArray[x, y].Checked)
                    {
                        // Assign same cluster ID as checked node above
                        if (y > 0 && clusterArray[x, y - 1].Checked)
                        {
                            clusterArray[x, y].ClusterId = clusterArray[x, y - 1].ClusterId;
                            clusterCount[clusterArray[x, y].ClusterId]++;

                            // If node to the left is also checked but with a different cluster ID, merge them into one cluster.
                            if (x > 0 && clusterArray[x - 1, y].Checked
                                && clusterArray[x - 1, y].ClusterId != clusterArray[x, y].ClusterId)
                            {
                                TakeOverBadCluster(clusterArray, clusterCount, 
                                    clusterArray[x - 1, y].ClusterId, clusterArray[x, y].ClusterId, x, y);
                            }
                        }
                        // Assign same cluster ID as checked node to the left
                        else if (x > 0 && clusterArray[x - 1, y].Checked)
                        {
                            clusterArray[x, y].ClusterId = clusterArray[x - 1, y].ClusterId;
                            clusterCount[clusterArray[x, y].ClusterId]++;
                        }
                        // Assign new cluster ID
                        else
                        {
                            clusterCount.Add(1);
                            clusterArray[x, y].ClusterId = clusterCount.Count - 1;
                        }
                    }
                }
            }
            return clusterCount.Count > 0 ? clusterCount.Max() : 0;
        }

        private void TakeOverBadCluster(ClusterNode[,] clusterArray, List<int> clusterCount, 
            int badClusterId, int goodClusterId, int currentX, int currentY)
        {
            for (int x = 0; x < MAX_X; x++)
            {
                for (int y = 0; y < MAX_Y; y++)
                {
                    if (x == currentX && y == currentY)
                    {
                        break;
                    }
                    if (x >= currentX)
                    {
                        continue;
                    }

                    if (clusterArray[x, y].Checked && clusterArray[x, y].ClusterId == badClusterId)
                    {
                        clusterArray[x, y].ClusterId = goodClusterId;
                        clusterCount[goodClusterId]++;
                    }
                }
                clusterCount[badClusterId] = 0;
            }
        }

- RS June 11, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

public static int FindBiggestClusterInField(int[][] field)
        {
            // put all positions in hash set that have 1.
            // take one out find all positions that exist in set adjacent.
            // in this case as it is 5 X 5 we can join the indexes so that it is 00 to 44 as keys in the set.
            // we ensure we do not try to check the same position twice as it is removed from the groomed set.
            HashSet<int> uncheckedOneSpaces = new HashSet<int>();

            // O(n) cost. 
            // groom and store positions to be counted
            for (int row = 0; row < field.Length; row++)
            {
                for (int column = 0; column < field[row].Length; column++)
                {
                    int position = (row * 10) + column;
                    if (field[row][column] == 1)
                    {
                        uncheckedOneSpaces.Add(position);
                    }
                }
            }

            int highestCount = 0;

            // O(m*4) : m <= n cost. Still O(n)
            // while we have uncounted clusters keep going
            while (uncheckedOneSpaces.Count > 0)
            {
                Queue<int> positionsToCheck = new Queue<int>();
                
                // Code reduction action
                Action<int> queueIfPresentinUncheckedSet = (position) =>
                {
                    if (uncheckedOneSpaces.Remove(position))
                    {
                        positionsToCheck.Enqueue(position);
                    }
                };

                // Seed queue with one of the positions that have not been checked.
                // and remove it so it wont be counted again.
                var tempPosition = uncheckedOneSpaces.First();
                uncheckedOneSpaces.Remove(tempPosition);
                positionsToCheck.Enqueue(tempPosition);

                int count = 0;

                while (positionsToCheck.Count > 0)
                {
                    // foreach position find its adjacent positions that exist in unchecked set;
                    // place them in positionsToCheckSet
                    var currentPosition = positionsToCheck.Dequeue();
                    count++;

                    // add up one 
                    queueIfPresentinUncheckedSet(currentPosition - 10);

                    // add down one
                    queueIfPresentinUncheckedSet(currentPosition + 10);

                    // add left one * works due to current limitation of width of field.*
                    queueIfPresentinUncheckedSet(currentPosition - 1);

                    // add right one
                    queueIfPresentinUncheckedSet(currentPosition + 1);
                }

                if (count > highestCount)
                {
                    highestCount = count;
                }
            }

            return highestCount;
        }

- MS coder September 07, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

def find_max_cluster_size(arr):
    checked_status = [[False] * len(arr[0]) for i in range(len(arr))]

    max_cluster_size = 0
    next_unchecked = (0, 0)

    while next_unchecked:
        x, y = next_unchecked[0], next_unchecked[1]
        max_cluster_size = max(
            max_cluster_size, find_cluster_size(arr, x, y, checked_status))
        next_unchecked = find_next_unchecked(x, y, checked_status)

    return max_cluster_size


def find_next_unchecked(x, y, checked_status):
    """
    Finds the next node that has not yet been checked

    Starts at the last known node that we attempted a find_cluster_size
    """
    starting_x = x
    for i in range(y, len(checked_status)):
        for j in range(starting_x, len(checked_status[0])):
            if not checked_status[i][j]:
                return (j, i)

        starting_x = 0

    return None


def find_cluster_size(arr, x, y, checked_status):
    """
    Get the size of the cluster starting from this node in directions left,
    right and down
    """
    try:
        if checked_status[y][x]:
            return 0
    except IndexError:
        # We're beyond the size of the array. Easier than checking if x and y
        # valid
        return 0

    checked_status[y][x] = True

    if arr[y][x] == 0:
        return 0

    # Add one for this block, and everything in the cluster to the right, to
    # the left and underneath
    return (
        1
        + find_cluster_size(arr, x+1, y, checked_status)
        + find_cluster_size(arr, x-1, y, checked_status)
        + find_cluster_size(arr, x, y+1, checked_status)
    )


if __name__ == '__main__':
    table = [
        [1, 1, 1, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [0, 0, 0, 1, 0],
        [1, 1, 1, 1, 1],
    ]
    for row in table:
        print row
    print find_max_cluster_size(table)

- maguire.brendan October 21, 2015 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More