Amazon Interview Question for SDE1s


Country: India
Interview Type: Phone Interview




If we say "huge file", it can be arbitrarily large, and even the number of unique elements can be too large to fit in memory (whether you use a trie or a hash table). So I think:

The obvious answer is of course to keep a hash map/trie and store a counter of the occurrences of each element as you move through the file. This is (in terms of time complexity) the optimal solution.

If, however, your space requirements are tight, you can perform an external sort on the file and then find the longest consecutive run of equal elements. The run-finding pass has a constant memory footprint, and the whole procedure can be done in Θ(n log n).
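The run-finding pass over the sorted output can be sketched like this (a minimal sketch in Java; it assumes the words arrive already sorted, e.g. streamed from the external sort's output, and the class name is illustrative):

```java
import java.util.Iterator;

public class LongestRun {
    // Given words in sorted order, return the most frequent one
    // using O(1) extra memory: equal words form consecutive runs,
    // so we only track the current run and the best run seen so far.
    public static String mostFrequent(Iterator<String> sortedWords) {
        String best = null, cur = null;
        long bestCount = 0, curCount = 0;
        while (sortedWords.hasNext()) {
            String w = sortedWords.next();
            if (w.equals(cur)) {
                curCount++;           // run continues
            } else {
                cur = w;              // new run starts
                curCount = 1;
            }
            if (curCount > bestCount) {
                bestCount = curCount;
                best = cur;
            }
        }
        return best;
    }
}
```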

- Ajeet May 17, 2014

A trie is good. Store the count in the leaf. Keep the max count and word in a global variable; if a new word's count exceeds the current max, update it.

If the N most frequent words have to be found, use a max-heap that orders words by frequency.
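One common way to realize the top-N part (a sketch with illustrative names; it counts with a map first, then keeps a size-N min-heap ordered by frequency, which is equivalent to popping N times from a full max-heap):

```java
import java.util.*;

public class TopN {
    // Return the n most frequent words, most frequent first.
    public static List<String> topN(List<String> words, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) counts.merge(w, 1, Integer::sum);

        // Size-n min-heap: the least frequent of the current top n
        // sits at the root and is evicted when a better word arrives.
        PriorityQueue<Map.Entry<String, Integer>> heap =
            new PriorityQueue<>((a, b) -> a.getValue() - b.getValue());
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            heap.offer(e);
            if (heap.size() > n) heap.poll();
        }

        // Popping the min-heap yields ascending frequency; prepend to reverse.
        LinkedList<String> result = new LinkedList<>();
        while (!heap.isEmpty()) result.addFirst(heap.poll().getKey());
        return result;
    }
}
```

This is O(n log N) after counting, which beats sorting all distinct words when N is small.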

- geekyjaks May 16, 2014

Why not use a hash table with the word as the key and the count as the value?

- Anon May 16, 2014

Anon - I don't see why a hash table or a trie wouldn't work just fine for this.

- Michael.J.Keating May 16, 2014

Simple, with O(n) complexity (n = number of words):

package amazon.test.practice;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Scanner;

public class Wordcount {
    public static void main(String[] args) throws FileNotFoundException {
        HashMap<String, Integer> hm = new HashMap<String, Integer>();
        Scanner input = new Scanner(new File("c:\\1.txt"));
        String max = "none";
        hm.put(max, 0);
        while (input.hasNext()) {
            String word = input.next();
            if (hm.containsKey(word)) {
                hm.put(word, hm.get(word) + 1);
            } else {
                hm.put(word, 1);
            }
            // track the running maximum so no second pass is needed
            if (hm.get(word) > hm.get(max)) {
                max = word;
            }
        }
        System.out.println(max + "-------" + hm.get(max));
    }
}

- satishkumardhule May 17, 2014

In my opinion, it depends on the environment you have, how frequently you have to do this, and what the expected time is.
If the requirement is to do this often and you have a client/server environment with multiple workers, some sort of MapReduce might be an alternative. You have a server that creates chunks (splits) of the given file and passes the chunk boundaries (from, to) to the workers. Some workers do the "map" and others do the "reduce".
In the map phase, each worker creates a mapping of words to occurrences for its chunk, then passes it to an intermediate store (e.g. disk).
In the reduce phase, workers take the map results from different workers and combine them into a final result.
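A single-process sketch of that pipeline (chunking, worker scheduling, and the intermediate store are omitted; the class and method names are illustrative):

```java
import java.util.*;

public class MapReduceWordCount {
    // "Map" phase: each worker counts words within its own chunk.
    static Map<String, Integer> mapPhase(List<String> chunk) {
        Map<String, Integer> partial = new HashMap<>();
        for (String w : chunk) partial.merge(w, 1, Integer::sum);
        return partial;
    }

    // "Reduce" phase: merge the partial counts from all workers.
    static Map<String, Integer> reducePhase(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> p : partials)
            for (Map.Entry<String, Integer> e : p.entrySet())
                total.merge(e.getKey(), e.getValue(), Integer::sum);
        return total;
    }

    public static String mostFrequent(List<List<String>> chunks) {
        List<Map<String, Integer>> partials = new ArrayList<>();
        for (List<String> chunk : chunks) partials.add(mapPhase(chunk));
        Map<String, Integer> total = reducePhase(partials);
        return Collections.max(total.entrySet(),
                               Map.Entry.comparingByValue()).getKey();
    }
}
```

In a real deployment each `mapPhase` call would run on a separate worker, so the chunks are processed in parallel.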

- Scavi May 17, 2014

The complexity of using a trie is O(nlogk), where k is the average word length.

- Anonymous May 18, 2014

Where does the log come into the picture? IMO, it's O(nk).

- anon123 May 20, 2014

Iterate through the file reading one word at a time and record each word in a dictionary.

Case 1: if the word is already present in the dictionary, increment its value.
Case 2: if the word is not present, add the word as a key with value 1.

C# code:

Dictionary<string, int> dicWordCount = new Dictionary<string, int>();
string w;
while ((w = file.ReadLine()) != null)   // file is an open StreamReader
{
    if (dicWordCount.ContainsKey(w))
    {
        dicWordCount[w]++; // increment the value by one
    }
    else
    {
        dicWordCount.Add(w, 1);
    }
}

// now, iterate through the entries and find the max value
int maxOccurrence = 0;
string keyWord = "";
foreach (KeyValuePair<string, int> pair in dicWordCount)
{
    if (maxOccurrence < pair.Value)
    {
        maxOccurrence = pair.Value;
        keyWord = pair.Key;
    }
}

- Anonymous May 18, 2014

Heap-based solution:
1. Create a heap data structure that holds (string, count) pairs.
2. Read each string and search the heap for it; if found, update its count and re-heapify, else insert a new node with count = 1.

- HardCode June 23, 2014

