FactSet Research Systems, Inc Interview Question
Software Engineer / Developers
Country: United States
MapReduce is actually Map -> Merge -> Reduce
- gogo March 31, 2014
In general, MapReduce reads data from HDFS as {key, value} records.
Value would correspond to your record and Key would correspond to the primary key of the row.
Map is used to transform & filter keys and values.
Merge collects and sorts values of the same key (Hadoop calls this phase "shuffle and sort").
Reduce is any aggregation operation.
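The three phases can be sketched in plain Python. This is only a single-process stand-in, not Hadoop's API: `run_mapreduce`, `mapper`, `reducer` and the sample rows are all hypothetical names and data made up for illustration.

```python
from collections import defaultdict

def run_mapreduce(records, mapper, reducer):
    """Single-process stand-in for Map -> Merge -> Reduce (not Hadoop's API)."""
    # Map: each input {key, value} pair may emit zero or more new pairs.
    mapped = []
    for key, value in records:
        mapped.extend(mapper(key, value))

    # Merge: collect all values that share a key.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)

    # Reduce: aggregate each key's values, visiting keys in sorted order.
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output

# Made-up input: key = row id (primary key), value = (category, amount).
rows = [(1, ("fruit", 3)), (2, ("veg", 5)), (3, ("fruit", 4)), (4, ("misc", 0))]

def mapper(row_id, row):
    category, amount = row
    if amount > 0:              # filter: drop empty rows
        yield category, amount  # transform: re-key by category

def reducer(category, amounts):
    yield category, sum(amounts)  # aggregation

result = run_mapreduce(rows, mapper, reducer)
print(result)  # [('fruit', 7), ('veg', 5)]
```

In a real cluster the Map and Reduce calls run in parallel on different machines, and the framework does the Merge step for you.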
Some SQL queries, e.g. those with inner queries, may require cascading MapReduce runs.
E.g.
Sales table row = {txn_id, txn_time, item_id, price, quantity}
SELECT DAY_OF_WEEK(txn_time) AS dow, SUM(price*quantity) AS day_value
FROM Sales
WHERE item_id = 1
GROUP BY DAY_OF_WEEK(txn_time)
HAVING SUM(price*quantity) > 10
Map operation:
1) keep only rows where item_id == 1 (the WHERE clause)
2) calculate value = price*quantity
3) find dow = DAY_OF_WEEK(txn_time)
4) write out a {dow, value} record (item_id is not needed, as only one item passes the filter)
Merge operation is automatic and collates all records with the same dow
Reduce:
1) sum the values for each dow to get day_value
2) write out {dow, day_value} only if day_value > 10 (the HAVING clause)
You will get files on HDFS that list each day of the week alongside its day_value.
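Here is a rough Python sketch of the Sales example, run in one process rather than on Hadoop. The sample rows are made up, the function names (`map_sale`, `merge`, `reduce_day`) are hypothetical, and `isoweekday()` (1 = Monday ... 7 = Sunday) is assumed as a stand-in for DAY_OF_WEEK.

```python
from collections import defaultdict
from datetime import datetime

# Made-up sample rows: (txn_id, txn_time, item_id, price, quantity)
sales = [
    (1, datetime(2014, 3, 31, 9, 0), 1, 4.0, 2),   # Monday, value 8.0
    (2, datetime(2014, 3, 31, 15, 0), 1, 4.0, 1),  # Monday, value 4.0
    (3, datetime(2014, 4, 1, 10, 0), 1, 4.0, 1),   # Tuesday, value 4.0
    (4, datetime(2014, 4, 1, 11, 0), 2, 50.0, 1),  # Tuesday, different item
]

def map_sale(row):
    txn_id, txn_time, item_id, price, quantity = row
    if item_id == 1:                                   # WHERE item_id = 1
        yield txn_time.isoweekday(), price * quantity  # {dow, value}

def merge(pairs):
    # Done automatically by the framework: collate values by dow.
    groups = defaultdict(list)
    for dow, value in pairs:
        groups[dow].append(value)
    return sorted(groups.items())

def reduce_day(dow, values):
    day_value = sum(values)      # SUM(price*quantity)
    if day_value > 10:           # HAVING ... > 10
        yield dow, day_value

mapped = [pair for row in sales for pair in map_sale(row)]
output = [out for dow, values in merge(mapped) for out in reduce_day(dow, values)]
print(output)  # [(1, 12.0)]
```

Only Monday survives: its two item-1 sales sum to 12.0, while Tuesday's single item-1 sale (4.0) fails the > 10 check.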