Let me do this with MapReduce instead of Spark; I don't know Spark well, but I would assume the concept is similar.
- kabs November 24, 2018

file1: <startDate endDate>
1 5
8 20
file2: <date visitor>
2 5
3 8
10 120
So the answer should be 8-20, since that date range has the most visitors.
Assumption: since file1 contains only the ranges, it should be a small file that can be loaded into Hadoop's distributed cache.
Now run the map code over file2; it should do the following:
1) Read file2 and, for each date, look up which range it belongs to in the distributed cache and increment the counter for that range, e.g.:
for 2, increment the counter for 1_5
for 3, increment the counter for 1_5
for 10, increment the counter for 8_20
2) Output <range, counter> as the map output.
3) In the reduce phase, sum all the counters for each range.
Also, we need total order sorting so that the combined output of all the reducers is sorted.
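The map/reduce steps above can be sketched as a single-process simulation. This is a hedged sketch, not a real Hadoop job: the `ranges` and `visits` lists stand in for file1 (distributed cache) and file2 (mapper input), dates are plain integers for simplicity, and the mapper emits one <range, visitors> pair per record rather than pre-aggregating with a combiner (the reduce-side sums come out the same either way).

```python
# Minimal single-process simulation of the described MapReduce job.
# Assumed sample data matching the example above; a real job would read
# file1 from the distributed cache and stream file2 through mappers.
from collections import defaultdict

ranges = [(1, 5), (8, 20)]            # file1: <startDate endDate>
visits = [(2, 5), (3, 8), (10, 120)]  # file2: <date visitors>

def mapper(date, visitors):
    # Find the range this date falls into and emit <range, visitors>.
    for start, end in ranges:
        if start <= date <= end:
            yield (start, end), visitors

def reducer(pairs):
    # Sum the counters per range, as the reduce phase would.
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)

map_output = [kv for date, v in visits for kv in mapper(date, v)]
totals = reducer(map_output)
best = max(totals, key=totals.get)
print(best, totals[best])  # (8, 20) 120 -- the 8-20 range wins
```

Sorting the final `totals` by key would correspond to the total order sorting step, so the output of all reducers taken together is globally sorted by range.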