Facebook Interview Question
Data EngineersCountry: United States
Interview Type: In-Person
It should return $95.15. Your query is returning $85.95. For the year 2016 there are 4 lines and highest amount is $95.15. In the subselect order by should be desc. Code tested in SQL Server.
id supp_id date amount
2 1 2016-05-11 16:35:00.000 95.15
4 1 2016-07-09 09:00:00.000 95.15
5 3 2016-01-09 08:15:00.000 95.15
Tell the interviewer what he's looking for - a self join to find all the invoices with the highest value. Also tell him that doing a self-join is a very expensive operation, esp when the amount field is not indexed. The most efficient way is to scan the table once and write a client program to find the answer.
-- using SQL Server T-SQL. I'm not sure why Suppliers are even listed. You didn't mention that we want them in the result. So I'll assume the Suppliers are there, but we don't need them.
-------code-----
DECLARE @max_paid_amt int
-- this will get the highest amount, but if there's more than one with the same, we won't get the rest.
SET @max_paid_amt = (
SELECT TOP 1 I.paid_amt FROM Invoices I WHERE YEAR(I.inv_date) = 2016 ORDER BY I.paid_amt DESC
)
-- select all of them. All fields in the table.
SELECT I.*
FROM Invoices WHERE year(I.inv_date) = 2016 AND I.paid_amt = @max_paid_amt
--using SQL SERVER T-SQL. I'm assuming that the 2 tables are already created and populated. Question doesn't mention that we actually will return anything in Suppliers table, so I'm not.
---code---
DECLARE @max_paid_amt int
-- this will get the highest amount, but if there's more than one with the same, we won't get the rest.
SET @max_paid_amt = (
SELECT TOP 1 I.paid_amt FROM Invoices I WHERE YEAR(I.inv_date) = 2016 ORDER BY I.paid_amt DESC
)
-- select all of them.
SELECT I.*
FROM Invoices WHERE year(I.inv_date) = 2016 AND I.paid_amt = @max_paid_amt
I dont think there will be a single invoice so anything with limit 1 or TOP wont be the right solution, We cant group by either because then we need to use aggregate functions ( max, min) which we are not allowed to use.
This is what I came up with, Let me know if this is wrong -
select inv_id,supp_id,supp_name,inv_amt from (
select i.inv_id,i.supp_id,s.supp_name,i.inv_amt, rank() over(partition by i.inv_id order by i.inv_amt desc) as "Rank" from invoice i
left outer join Supplier s
on i.supp_id=s.supp_id
where year(i.payment_date)='2016' ) A
where A.Rank = 1
Please note that I am joining supplier table to get supp_name, if we dont need supplier name then that join can be removed.
Can someone check and confirm if this is correct -
select inv_id,supp_id,inv_date,inv_amt from (
select i.id as inv_id,s.id as supp_id,i.inv_date,i.inv_amt,dense_rank() over(partition by i.id,s.id,i.inv_date order by i.inv_amt desc) as ranks from invoice i
left outer join supplier s
on i.supp_id=s.id
where date_part('year',i.inv_date)='2016' ) A
where A.ranks=1
as mysql query
and the DDL to create the schema
- Chris July 01, 2017