Amazon Interview Question
Role: SDE-2, Team: Transportation
Country: India
Interview Type: In-Person
This is a standard graph traversal problem: each URL is a node, and the set of images on each page is the data to be processed at that node.
We can use BFS or DFS to traverse the URLs. One simple metric for image quality is file size: the larger the image, the better we assume its quality to be. Divide the sizes into a few buckets and assign an integer score to each bucket. For example, an image in the 4-6 MB range could be considered average and assigned a 2, while an image under 10 KB could be considered poor and assigned a 0.
After visiting all the images, we can compute the average image-quality score.
This is just a basic idea; there are any number of other ways to measure image quality.
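The idea above can be sketched roughly as follows. This is a minimal illustration, assuming hypothetical helper callbacks `get_links(url)` and `get_image_sizes(url)` that fetch a page's outgoing links and its image sizes in bytes; the bucket thresholds are placeholders, not part of the original answer.

```python
from collections import deque

def score_image(size_bytes):
    """Map an image's size to a quality bucket (hypothetical thresholds)."""
    if size_bytes < 10 * 1024:        # under 10 KB: poor
        return 0
    if size_bytes < 1 * 1024 * 1024:  # under 1 MB: below average
        return 1
    if size_bytes < 6 * 1024 * 1024:  # up to ~6 MB: average
        return 2
    return 3                          # larger: good

def average_quality(start_url, get_links, get_image_sizes):
    """BFS over the URL graph, accumulating image scores.

    get_links and get_image_sizes are assumed helpers supplied by the
    caller (e.g. wrappers around an HTTP fetch + HTML parse).
    """
    visited = {start_url}
    queue = deque([start_url])
    total, count = 0, 0
    while queue:
        url = queue.popleft()
        for size in get_image_sizes(url):
            total += score_image(size)
            count += 1
        for nxt in get_links(url):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return total / count if count else 0.0
```

The `visited` set keeps the BFS from revisiting pages when the link graph contains cycles.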
Okay, let's break the problem down, beginning with the calculation:
1. Visit one site (1 URL)
2. Go through all of its images, mark each one from Very Good down to Poor, and compute some sort of average. We need a marking scale, so let's take: Very Good = 4, Good = 3, Average = 2, Poor = 1.
Let S = total score of the URL and N = total number of images on the URL; the average score for the URL is then S / N.
3. Now, to store the score of a website, use a hashtable / STL map that holds key-value pairs: key = URL, value = Avg_Score_For_URL. This also serves as a check for already-visited URLs: before visiting a URL, check whether its entry already exists in the hashtable.
This would be a very basic solution.
- puneet.sohi June 15, 2015
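The steps above can be sketched as a short crawl loop. This is only a sketch: `get_links(url)` and `get_image_scores(url)` are hypothetical helpers (the latter returns the per-image marks on the 4/3/2/1 scale), and a Python dict stands in for the hashtable / STL map.

```python
def crawl_scores(start_url, get_links, get_image_scores):
    """Crawl from start_url, storing key = URL, value = Avg_Score_For_URL.

    The scores dict doubles as the visited check: a URL already present
    as a key has been processed and is skipped.
    """
    scores = {}              # URL -> average image score (S / N)
    stack = [start_url]
    while stack:
        url = stack.pop()
        if url in scores:    # entry exists: already visited
            continue
        marks = get_image_scores(url)   # e.g. [4, 3, 2, 1], one per image
        scores[url] = sum(marks) / len(marks) if marks else 0.0
        stack.extend(get_links(url))
    return scores
```

Using the dict itself as the visited set avoids keeping a separate structure, at the cost of storing a 0.0 entry for image-free pages.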