Walmart Labs Interview Question
Software Engineer / DevelopersCountry: India
Interview Type: In-Person
What are you saying? 10^30*10*40 characters? Assume 1 byte for each character.
1GigaByte = 1024*1024*1024 bytes
10^32 bytes = 10^23 GigaBytes
You cannot have this much memory (or even secondary storage) available.
Now let us assume that somehow we manage to arrange this massive memory. You know that you have to visit each character at least once. To process 10^7 characters it takes an order of a second assuming O(1) for each character. Now to process 10^32 characters it will take 10^25 seconds means 10^17 years.
Who are saying that EACH cell in this matrix is translated to memory or any storage? We can save only not-null values into memory and realise function which return value (or null) by index. For example, excel do not save empty cells in a memory. Moreover, this task is abstract and we must solve it without any connections to memory size and real times.
That is a very big machine. The earth has 10^50 atoms in it some out of the box thinking will be required. Did your interviewers had big almond shaped eyes, big heads and a gray pallor?
All joking aside we have a big matrix or the matrix is very sparse and or does not have to actually be stored like say the number of states in some game.
The simple algorithm is to do a recursive search starting at every node in the matrix. At each recursion try all 4 directions including the one we came from. We are talking perhaps 10 to 40 lines of code. if we have some way to quickly iterate through nodes of a particular value then this comes into the realm of reality. At 10^10 notes on where we have been we will probably go to an iterative solution that moves back and forth along a big file stored on some disks.
We can give each starting node to a different processor if we are talking a parallel computing and we could even fork off different branches of a search at any point along the way if we wanted. We might even consider starting in the middle of the string to be found and giving each side to a different processor. Under some regimes this might help though I can't be sure.
This can be turned into a graph problem. The fact that the graph and the query is huge should be of no concern at first; start simple.
- Stu November 05, 2013We treat the matrix of characters as a graph such that there is an edge from a cell to all 8 adjacent cells (less for the cells along the edges). There's also an edge from a cell to itself.
The first thing to do would be to do an O(n) scan of the matrix to find all cells that begin with the same character as the query string.
Next, we can perform an A* search from each of those cells. Our heuristic is the # of characters not yet matched. We only generate children who satisfy the criteria that their cell contains the next character in the query. We can terminate the search if the open list is empty and we haven't found a solution, exactly like A*.
Because we're not guaranteed to find a string match in the matrix, our worst case will be something like O(V^2), where we've performed a BFS from each node in the graph.