Amazon Interview Question for Interns


Country: United States
Interview Type: Phone Interview




Comment hidden because of low score. Click to expand.
0
of 0 vote

Since you will have to parse the whole logger anyway the average complexity of the program is theta(N) where N is the number of records
here is a simple program in python

import datetime
def find_repeated_customer():
	file_obj = open("file path","r")
	customer_last_visit = {}
	repeat_customer = set()
	while line in file_obj:
		timestamp,customer_id,page_id = line.split(" : ")
		last_visit = customer_last_vist.get(customer_id,None)
		if not last_visit:
			customer_last_visit[customer_id] = last_visit
		else:
			# assuming time stamp looks like 2013-03-29 01:03:26.947000
			year,month,date = timestamp.split(" ")[0].split("-")
			current_visit = datetime.date(year,month,date)
			day_diff = current_visit - last_visit
			if day_diff >=1:
				repeat_customer.add(customer_id)
			customer_last_visit[customer_id] = current_visit

- Sandy March 28, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

public static Set<String> findRepeatCustomers(String fileName){
		Set<String> set = new HashSet<String>();
		Map<String, Integer> map = new HashMap<String, Integer>();
		FileReader fr = null;
		BufferedReader br = null;
		String line=null, custId=null;
		try {
			fr = new FileReader(fileName);
			br = new BufferedReader(fr);
			while((line = br.readLine()) != null && line.length()>0) {
				StringTokenizer st = new StringTokenizer(line, ":");
				st.nextToken();
				custId = st.nextToken();
				if (map.containsKey(custId)){
					Integer count = map.get(custId) + 1;
					map.put(custId, count);
				} else
					map.put(custId, new Integer(1));
			}
			for(String cID: map.keySet()) {
				if (!map.get(cID).equals(new Integer(1)))
					set.add(cID);
			}
			return set;
		} catch (IOException ioex) {
			System.out.println(ioex.getMessage());
		} finally {
			try {
				br.close();
				fr.close();
			} catch(IOException ioex){
				System.out.println(ioex.getMessage());
			}
		}
		
		return set;
	}

- lin March 28, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Your code does not consider the condition "after at least a day. "
without this condition the problem is a pretty trivial hashmap lookup..

- AmazonUser March 28, 2013 | Flag
Comment hidden because of low score. Click to expand.
3
of 3 votes

To fix your code: Since the file is a *log* file, the entries are going to be sorted by default. We take advantage of this. Maintain one map and a set
For each entry in the log file
if(map.contains(custid))
Timestamp ts = map.get(custId);
if(difference(current rowtimestamp, ts) > 24 hours)
Set.add (custid) // you found your repeat customer here
map.remove(custid) // we don't want to process timestamp difference for this customer again
else{
if(!set.contains(custid))
map.add(custid, timestamp)
}

- AmazonUser March 28, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

public void PrintRepeatCustomers()
        {
            Console.WriteLine("Enter path to log file.");
            var path = Console.ReadLine();
            if (path == null) return;
            var lines = File.ReadAllLines(path);
            var listLines = lines.ToList();
            //logs will only contain the last visit timestamp of customer
            var logs = new Dictionary<long,DateTime>();
            foreach (var line in listLines)
            {
                var terms = line.Split(':');
                var customerId = long.Parse(terms[1]);
                var timestamp = DateTime.Parse(terms[0]);
                if(logs.ContainsKey(customerId) && timestamp - logs[customerId] < TimeSpan.FromDays(1))
                {
                    Console.WriteLine(customerId);
                }
                logs[customerId] = timestamp;
            }
        }

- sathley March 28, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I would just read timeStamp and customId of each record, and put them into chaining hash table (using customId as key). Traversing the hash table, in each chained link-list, minus first one (time) from last one, if result bigger than one day, print out.

- outdoor March 29, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

its better if we do the above question using BST where each node contain timestamp and customer id . tree is designed using customer id . each time for new entry in log comes,seach will be log(n) only on the basis of customer id. And as node is already present check the time if its more than one day print else
update the time. And if customer id is not present then create a new node for that customer...

- csgeeg March 29, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Why not use hashmap? Its o(1), which is better than o(log n) of bst.

- Jay April 14, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

Approach:

As mentioned in qus we have to find customer who returns to Amazon site(any page)....
so we should focus on Amazon site customer only.....
all those log entry which describe for non -Amazon site we should ignore.

now there are two case on page ID:
1. page id is kind of numeric, alphanumeric or something we can't just say its Amazon's site page without checking Amazon site page IDs list
2. page id is always containing something common so that we can figure it out its Amazon's site page for eg: pkt.amazon.com/index , pkt.amazon.com/review

-For any of case we will sort out only Amazon pages visited by customer
-Pick a customer Id and calculate date from given timestamp
-Find out next occurrence for that customer id in logs and check for visited date from timestamp
-In case there is no diff in date find for next entry and remove one entry from log list
-In case till the end we dont find the customer visit in >= 1 day remove that customer visit entry
-In case there is diff of at least one day in both log entry data then make a entry of the customer in Result list and remove all the occurrence of that customer from sorted log list so that it wil be easy to do search in log list for other customers.

-Now pick another customer and apply logic as above....

- PKT March 29, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I would take the number of days to be accounted from the first and last lines of the log file. and then sort the lines according to the cust id first and then timestamp if the custid is same. for instance, I keep an ArrayList<Line> and the comparator used will have the compare method as follows



public int compare(Line line1, Line line2){
if(line1.custid.equals(line2.custid){
return line1. timestamp - line2.timestamp;
}
return line1.custid - line2.custid;

}

This can be done in o(nlogn) time, where n is the number of lines.
Now we just have to go through the entire sorted list once and make sure for each customer, the timestamps visited confirms with every day in the range we calculated first.

this will just take o(n) time..

The total time complexity is o(nlogn)

- Arun April 02, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

This approach takes the cust_id's and then checks the timestapms
1. Take a cust_id
2. Check for that cust_id in the log file, for every record found compare TimeStamps, if difference is greater than a day, print the cust_id and mark that cust_id as visited(so that we do not count it again,we can use a hash table here)
3. Repeat the process for the unprocessed cust_id's

If cust_id's are integers, we can sort the file based on cust_id's and then compare the timestamps,i.e when they visit and display the result.

I know that this is not the optimal approach but I think it will work....!!

- pulkit.mehra.001 March 03, 2014 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More