Amazon Interview Question for Software Engineer / Developers


Country: United States
Interview Type: In-Person





A trie is better than a HashTable with respect to space complexity. If you have two words with a common prefix, e.g. abc and abcde, you still have to create two separate keys in the HT, whereas in a trie the words share the common prefix.
But if access time is the most important factor, the HT is better. So I think you should have explained both approaches and compared their pros and cons.
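To make the prefix-sharing point concrete, here is a tiny sketch (the two words and their definitions are made up for illustration):

import java.util.HashMap;

// Two words with a shared prefix:
//   HashMap keys duplicate the prefix: "abc" + "abcde" = 8 key characters stored.
//   A trie stores one node per distinct character position: a-b-c-d-e = 5 nodes,
//   and both words reuse the a-b-c path.
public class PrefixSharing {
    public static void main(String[] args) {
        HashMap<String, String> map = new HashMap<>();
        map.put("abc", "first definition");
        map.put("abcde", "second definition"); // the prefix "abc" is stored again inside this key
        System.out.println(map.size() + " keys, " + ("abc".length() + "abcde".length()) + " key characters");
    }
}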

- thelineofcode February 05, 2014

How does a trie solve the follow-up issues? For example, how does it handle the scalability and memory issues? Also, regarding the trie's space complexity: for words that share a common prefix, like abc and abcde, even though they share the prefix in the trie, doesn't it still store two values, one for abc and one for abcde?

- Guy February 06, 2014

Tries require a lot more memory because of the indexing arrays in every node. That is not the case if the trie is really dense and every node has close to a full alphabet of links; in that case we actually use all of that otherwise redundant space.

- Anonymous February 06, 2014

Well, each trie node just keeps an array of size 26 (one slot per letter) or a hashtable<Character, TrieNode> whose size is at most 26. I don't think it requires much additional space. In the case of the hashtable implementation, for the inputs abc, abcd, abcde, abcdef, and abcdefg, the common prefix abc is stored five times, whereas a trie stores abc only once.

- Guy February 06, 2014

You can use a function like crc32("abc"), which will give you an integer value for your word.
You may choose the server to store or look up your word as crc32("abc") % (number of your servers).

However, there is a problem with this approach: if you add new servers to your app, you have to redistribute your words across the servers again...
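A minimal sketch of this modulo-sharding idea using java.util.zip.CRC32 (the WordSharder class name and serverCount parameter are mine, not part of the original comment):

import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class WordSharder {
    private final int serverCount;

    public WordSharder(int serverCount) {
        this.serverCount = serverCount;
    }

    // Index of the server that should store or look up this word.
    public int serverFor(String word) {
        CRC32 crc = new CRC32();
        crc.update(word.getBytes(StandardCharsets.UTF_8));
        // getValue() returns a non-negative long, so the modulo is safe.
        return (int) (crc.getValue() % serverCount);
    }
}

With new WordSharder(8), serverFor("abc") always lands on the same one of the 8 servers; changing the server count remaps almost every word, which is exactly the rebalancing problem mentioned above.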

- darik June 11, 2014

Store the dictionary using a B+ tree in a file (the same way a B+ tree index is built in a database). Each page of the file represents a node of the B+ tree. This gives the tree a high fan-out (for example, with a 4 KB page size and a 32-bit number as the page number (pointer), each node has a fan-out of 4 KB / 4 B = 1024) and hence a very low depth.

A B+ tree is preferable to a HashTable when the hashtable is larger than RAM. If the hashtable does not fit in RAM, its random access pattern increases thrashing.
In this case a B+ tree benefits from locality of reference, since the levels near the root can be kept in RAM (cache).
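To make the fan-out/depth claim concrete, here is a back-of-the-envelope sketch using the same 4 KB page and 32-bit pointer assumptions as above:

public class BPlusTreeFanOut {
    public static void main(String[] args) {
        long pageSizeBytes = 4 * 1024; // one B+ tree node per 4 KB file page
        long pointerBytes = 4;         // 32-bit page number
        long fanOut = pageSizeBytes / pointerBytes; // 1024 children per node
        // With fan-out f, a tree of depth d can reference roughly f^d pages.
        System.out.println("fan-out = " + fanOut);                                        // 1024
        System.out.println("depth 3 reaches ~" + (long) Math.pow(fanOut, 3) + " pages");  // ~1 billion
    }
}

So three levels of such a tree are enough to index on the order of a billion pages, and only the nodes along one root-to-leaf path need to be read per lookup.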

- jigs August 30, 2014

I believe it could be solved by a trie-style search plus a B+ tree. In your load balancer you can write routing logic so that, on search or insert, a string starting with a or b, for example, is sent to a particular machine. Then on that machine, we can use a B+ tree to store and retrieve the words.
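A minimal sketch of that first-letter routing in the load balancer (the PrefixRouter class and machines array are hypothetical, and it assumes words start with a..z; a real partition would also need to handle the skewed distribution of starting letters):

public class PrefixRouter {
    private final String[] machines; // e.g. { "host-0", "host-1", ... }

    public PrefixRouter(String[] machines) {
        this.machines = machines;
    }

    // Route by the first letter of the word: a..z map to buckets 0..25.
    public String machineFor(String word) {
        int bucket = Character.toLowerCase(word.charAt(0)) - 'a';
        return machines[bucket % machines.length];
    }
}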

- Phoenix September 01, 2015

I think he asked about memory since you are storing everything in the hash table. Here is an implementation using a trie, where each node keeps a HashMap of children.

import java.util.HashMap;

public class Dictionary
{
    // Root of the trie; it holds no character itself.
    public Node head = new Node();

    private class Node{

        public Node parent;
        public char index;          // character on the edge leading to this node
        public String definition;   // non-null only if a word ends at this node
        public HashMap<Character, Node> children;

        public Node(){
            parent = null;
            index = '\u0000';
            definition = null;
            children = new HashMap<>();
        }

        public Node(Node parent, char index, String definition){
            this.parent = parent;
            this.index = index;
            this.definition = definition;
            children = new HashMap<>();
        }

        public Node(Node parent, char index){
            this(parent, index, null);
        }

        private boolean hasDefinition(){
            return definition != null;
        }
    }

    // Walks the trie character by character, creating missing nodes,
    // and stores the definition at the node for the last character.
    public void insert(String word, String definition){

        Node currentNode = head;

        for(int i = 0; i < word.length(); i++){
            char index = word.charAt(i);
            if(currentNode.children.containsKey(index)){
                currentNode = currentNode.children.get(index);
            }
            else{
                Node newChild = new Node(currentNode, index);
                currentNode.children.put(index, newChild);
                currentNode = newChild;
            }
        }
        currentNode.definition = definition;
    }

    // Returns the definition for the word, or "" if the word is not in the dictionary.
    public String lookup(String word){

        Node currentNode = head;

        for(int i = 0; i < word.length(); i++){
            char index = word.charAt(i);
            if(currentNode.children.containsKey(index)){
                currentNode = currentNode.children.get(index);
            }
            else{
                return "";
            }
        }
        return currentNode.hasDefinition() ? currentNode.definition : "";
    }
}
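A minimal usage sketch for the class above (the words and definitions are made up):

Dictionary dict = new Dictionary();
dict.insert("abc", "first three letters of the alphabet");
dict.insert("abcde", "first five letters of the alphabet");
System.out.println(dict.lookup("abcde")); // prints the second definition
System.out.println(dict.lookup("abcd"));  // prefix with no definition -> ""
System.out.println(dict.lookup("xyz"));   // not in the trie -> ""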

- HurricaneCheetah February 08, 2014

You could use an MD5 function to generate a string for each input and use the resulting string as the key in your HashMap, for example:
- md5("abc") = "900150983cd24fb0d6963f7d28e17f72"
- md5("abcd") = "e2fc714c4727ee9395f324cd2e7f331f"

You would have a lower probability of key clashes in this case.
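A minimal sketch of computing that MD5 hex string with the standard MessageDigest API (the Md5Key class and md5Hex method names are mine):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class Md5Key {
    public static String md5Hex(String input) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b)); // two lowercase hex digits per byte
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(md5Hex("abc")); // 900150983cd24fb0d6963f7d28e17f72
    }
}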

- guilhebl February 06, 2014

How would this help? Could you please explain? Thanks.

- Guy February 06, 2014

Since the interviewer stated in 2) that you are out of memory, I believe a distributed hash table would solve the issue. You would apply a hash function to the key string, and with the result you could index into a range of machines that can answer the request. Using consistent hashing would avoid the bucket collision issues pointed out above, and you would be able to scale to several machines.
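A minimal consistent-hashing sketch of that idea (using CRC32 as the hash and one ring position per server are simplifying assumptions of mine; real systems use stronger hashes and many virtual nodes per server):

import java.nio.charset.StandardCharsets;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

public class ConsistentHashRing {
    private final SortedMap<Long, String> ring = new TreeMap<>();

    private long hash(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public void addServer(String server) {
        ring.put(hash(server), server);
    }

    // A word maps to the first server at or after its hash on the ring (wrapping around),
    // so adding a server only moves the keys between it and its predecessor.
    public String serverFor(String word) {
        SortedMap<Long, String> tail = ring.tailMap(hash(word));
        return ring.get(tail.isEmpty() ? ring.firstKey() : tail.firstKey());
    }
}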

- guilhebl February 06, 2014

Which bucket collision issues were you referring to? Even though we use a distributed hashtable, it would still be possible for all machines to run out of memory.

- Guy February 06, 2014

