Data Engineer Interview Questions
- 0 of 0 votes
Create a custom feature transformer in Spark Scala. Let's say the DataFrame looks like this:
+------------------------+
|              email_list|
+------------------------+
|  testmail1115@gmail.com|
|    mavenmaven@mlail.com|
|dnd.7899334622@gmail.com|
+------------------------+
If I use the transformer, it converts the input array of strings into an array of n-grams, like below:
+------------------------+--------------------+
|              email_list|              ngrams|
+------------------------+--------------------+
|  testmail1115@gmail.com|[t e, e s, s t, t...|
|    mavenmaven@mlail.com|[m a, a v, v e, e...|
|dnd.7899334622@gmail.com|     [d n, n d, d...|
+------------------------+--------------------+
How can I get the distinct n-grams present, rather than the pattern or array?
- ashwini.padhy89 December 03, 2018 in India
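The question targets Spark, but the underlying idea can be sketched without it: flatten every row's n-gram array into one collection and keep each value once. The following is a minimal plain-Python illustration, not a Spark transformer. `char_ngrams` and `distinct_ngrams` are hypothetical helper names, and the space-joined character bigrams are an assumption based on the sample output.

```python
def char_ngrams(text, n=2):
    """Return space-joined character n-grams, e.g. 'test' -> ['t e', 'e s', 's t']."""
    chars = list(text)
    return [" ".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def distinct_ngrams(rows, n=2):
    """Collect every n-gram that occurs in any row, each one only once."""
    seen = set()
    for row in rows:
        seen.update(char_ngrams(row, n))
    return sorted(seen)


emails = [
    "testmail1115@gmail.com",
    "mavenmaven@mlail.com",
    "dnd.7899334622@gmail.com",
]
print(distinct_ngrams(emails))
```

In a DataFrame setting the same two steps would correspond to flattening the array column into one row per n-gram and then de-duplicating those rows.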
StartUp Data Engineer
- 0 of 0 votes
You have two files in HDFS: one has date ranges in two columns (start date and end date), and the other has two columns, date and visitors. Write Spark code that finds the date range with the maximum number of visitors using these two files.
- tokritijain October 30, 2018 in India
Amazon Data Engineer
- -1 of 1 vote
We have two sequences A and B consisting of integers, both of length N, and you would like them to be (strictly) increasing, i.e. for each K (0 ≤ K < N − 1), A[K] < A[K + 1] and B[K] < B[K + 1]. Thus, you need to modify the sequences, but the only manipulation you can perform is to swap an arbitrary element in sequence A with the corresponding element in sequence B. That is, both elements to be exchanged must occupy the same index position within each sequence.
For example, given A = [5, 3, 7, 7, 10] and B = [1, 6, 6, 9, 9], you can swap elements at positions 1 and 3, obtaining A = [5, 6, 7, 9, 10], B = [1, 3, 6, 7, 9].
Your goal is to make both sequences increasing, using the smallest number of swaps.
Write a function:
    public int minswaps(int[] A, int[] B);
that, given two zero-indexed arrays A, B of length N, containing integers, returns the minimum number of swapping operations required to make the given arrays increasing. If it is impossible to achieve the goal, return −1.
For example, given:
    A[0] = 5     B[0] = 1
    A[1] = 3     B[1] = 6
    A[2] = 7     B[2] = 6
    A[3] = 7     B[3] = 9
    A[4] = 10    B[4] = 9
your function should return 2, as explained above.
Given:
    A[0] = 5     B[0] = 2
    A[1] = -3    B[1] = 6
    A[2] = 6     B[2] = -5
    A[3] = 4     B[3] = 1
    A[4] = 8     B[4] = 0
your function should return −1, since you cannot perform operations that would make the sequences become increasing.
Given:
    A[0] = 1     B[0] = -2
    A[1] = 5     B[1] = 0
    A[2] = 6     B[2] = 2
your function should return 0, since the sequences are already increasing.
Assume that:
    N is an integer within the range [2..100,000];
    each element of arrays A, B is an integer within the range [−1,000,000,000..1,000,000,000];
    A and B have equal lengths.
Complexity: O(N)
- koustav.adorable March 03, 2018 in United States
Amazon Data Engineer Algorithm
- 1 of 1 vote
Given an array A, find the number of tuples (i, j, k, l) such that A[i] + A[j] + A[k] = A[l], where i < j < k < l.
- ajay.raj January 26, 2018 in United States
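A brute-force scan over all four indices is O(N^4). One way to reduce this to O(N^2) is to fix k, keep a running count of all pair sums A[i] + A[j] with i < j < k, and look up A[l] − A[k] for every l > k. The sketch below assumes integer values; `count_tuples` is an illustrative name.

```python
from collections import Counter


def count_tuples(a):
    """Count index tuples i < j < k < l with a[i] + a[j] + a[k] == a[l]."""
    pair_sums = Counter()  # how often each a[i] + a[j] occurs with i < j < k
    total = 0
    for k in range(len(a)):
        # Every l to the right of k needs a pair summing to a[l] - a[k].
        for l in range(k + 1, len(a)):
            total += pair_sums[a[l] - a[k]]
        # Register pairs (i, k); they become valid (i, j) pairs for later k.
        for i in range(k):
            pair_sums[a[i] + a[k]] += 1
    return total


print(count_tuples([1, 1, 1, 3]))
```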
Google Data Engineer
- 0 of 0 votes
There are three numbers a, b, and c. The product of any two numbers is equal to the third number. For example, a*b=c, b*c=a, or a*c=b. What are the possible values of a, b, and c?
- D PRAVEEN KUMAR January 23, 2018 in India
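Assuming the three equations a*b=c, b*c=a, and a*c=b must all hold at once, multiplying them together gives (abc)^2 = abc, so abc is either 0 or 1. If abc = 0, one value is zero and the equations force the other two to zero as well. If abc = 1, then c*c = (a*b)*c = 1 and likewise for a and b, so each value is 1 or −1 with an even number of −1s. A small brute-force search over integers (the range limit is an arbitrary choice for illustration) agrees with that reasoning:

```python
from itertools import product


def product_triples(limit=3):
    """All integer triples in [-limit, limit] where the product of any
    two values equals the third."""
    values = range(-limit, limit + 1)
    return [
        (a, b, c)
        for a, b, c in product(values, repeat=3)
        if a * b == c and b * c == a and a * c == b
    ]


print(product_triples())
# [(-1, -1, 1), (-1, 1, -1), (0, 0, 0), (1, -1, -1), (1, 1, 1)]
```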
Skill Subsist Impulse Ltd Data Engineer General Questions and Comments
- 0 of 0 votes
You are given a 2xN board and two kinds of tiles: 1x2 (two squares across) and 2x1 (two squares up). How many ways can you fill the board?
Follow-up: four new kinds of tiles are added, L shapes at different angles, so there are now six kinds of tiles. How many ways are there now?
**   **   *     *
*     *   **   **
- ajay.raj January 23, 2018 in United States
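Both counts follow short recurrences. With dominoes only, the last column is either one vertical tile or the end of two horizontal tiles, giving f(n) = f(n−1) + f(n−2). For the follow-up, the sketch assumes the four new tiles are the four rotations of an L-shaped tromino covering three squares, for which a known recurrence is f(n) = 2·f(n−1) + f(n−3). The function names are illustrative.

```python
def domino_tilings(n):
    """Ways to fill a 2 x n board with 1x2 and 2x1 dominoes."""
    prev, cur = 1, 1  # f(0), f(1)
    for _ in range(n - 1):
        prev, cur = cur, prev + cur
    return cur


def domino_tromino_tilings(n):
    """Ways to fill a 2 x n board with dominoes plus L-shaped trominoes
    in all four rotations (assumed reading of the follow-up)."""
    f = [1, 1, 2]  # f(0), f(1), f(2)
    for i in range(3, n + 1):
        f.append(2 * f[i - 1] + f[i - 3])
    return f[n]


print([domino_tilings(n) for n in range(1, 6)])          # [1, 2, 3, 5, 8]
print([domino_tromino_tilings(n) for n in range(1, 6)])  # [1, 2, 5, 11, 24]
```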
Google Data Engineer
- 0 of 0 votes
Given a binary matrix where 0 represents sea and 1 represents land, the value also represents the altitude. If a cell is originally land and all eight of its neighbors are also land, that cell becomes 2; the elevations of a cell and each of its eight neighbors cannot differ by more than 1. Return the highest altitude that can be reached (special case: if the entire matrix is 1, the altitude is unlimited).
- ajay.raj January 23, 2018 in United States
Google Data Engineer
- 0 of 0 votes
Given a weighted n-ary tree, find the longest path from the root node to a leaf node.
class Node {
    int id;
    // connected node id, edge weight
    Map<Integer, Integer> edges;
}
- ajay.raj January 21, 2018 in United States
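A depth-first traversal that carries the accumulated weight down each branch is one way to answer this. The sketch below mirrors the `Node` class as a plain dictionary from node id to `{child id: edge weight}`; that representation, the assumption that edges only point from parent to child, and the name `longest_root_to_leaf` are all illustrative choices.

```python
def longest_root_to_leaf(edges_by_id, root_id):
    """Largest total edge weight over all root-to-leaf paths.

    edges_by_id maps a node id to {child id: edge weight}; a node with
    no outgoing edges is treated as a leaf.
    """
    best = None
    stack = [(root_id, 0)]  # (node id, weight accumulated from the root)
    while stack:
        node_id, distance = stack.pop()
        children = edges_by_id.get(node_id, {})
        if not children:
            best = distance if best is None else max(best, distance)
        for child_id, weight in children.items():
            stack.append((child_id, distance + weight))
    return best


tree = {1: {2: 3, 3: 1}, 2: {4: 2}, 3: {}, 4: {}}
print(longest_root_to_leaf(tree, 1))  # 5, via 1 -> 2 -> 4
```

An explicit stack is used instead of recursion so that deep trees do not hit Python's recursion limit.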
Google Data Engineer
- 0 of 0 votes
Given a binary matrix, count the number of squares that can be formed entirely of 0s.
- ajay.raj January 20, 2018 in United States
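A common dynamic-programming approach stores, for each cell, the side length of the largest all-zero square whose bottom-right corner is that cell. A cell with value s is the corner of exactly s squares (sides 1 through s), so summing the values gives the count. This is a sketch under the assumption that squares are axis-aligned and the matrix is rectangular; `count_zero_squares` is an illustrative name.

```python
def count_zero_squares(matrix):
    """Count axis-aligned squares of any size that contain only 0s."""
    if not matrix:
        return 0
    cols = len(matrix[0])
    prev = [0] * (cols + 1)  # previous row of the DP table, padded on the left
    total = 0
    for row in matrix:
        cur = [0] * (cols + 1)
        for j, value in enumerate(row, start=1):
            if value == 0:
                # Limited by the squares ending above, to the left, and diagonally.
                cur[j] = 1 + min(prev[j], prev[j - 1], cur[j - 1])
                total += cur[j]
        prev = cur
    return total


print(count_zero_squares([[0, 0], [0, 0]]))  # 5: four 1x1 squares and one 2x2
```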
Google Data Engineer
- 0 of 0 votes
Given a string p, called the order, such as abc, which means a comes before b, and so on, and a second string s, determine whether s follows the order of p. Return a boolean.
Examples:
    aaa returns true
    cba returns false
    aaxyc returns true; letters that do not appear in the order are skipped
- ajay.raj January 20, 2018 in United States
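One reading consistent with the three examples: map each letter of the order to its position, scan s, ignore letters without a position, and fail as soon as a position goes backwards. The sketch below follows that reading; `follows_order` is an illustrative name.

```python
def follows_order(order, s):
    """True if the letters of s that appear in order never go backwards."""
    rank = {ch: i for i, ch in enumerate(order)}
    last = -1
    for ch in s:
        position = rank.get(ch)
        if position is None:
            continue  # letter not mentioned in the order, skip it
        if position < last:
            return False
        last = position
    return True


print(follows_order("abc", "aaa"))    # True
print(follows_order("abc", "cba"))    # False
print(follows_order("abc", "aaxyc"))  # True
```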
Google Data Engineer
- 0 of 0 votes
MS FTE Question:
Find the gaps in 1,2,5,6,10
Answer: 3,4,7,8,9
- mktauseef October 04, 2017 in United States
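In procedural code, the gaps are the integers between the smallest and largest value that are not in the input. The sketch below assumes the input is a list of integers; `find_gaps` is an illustrative name, and the original question may have expected a SQL answer instead.

```python
def find_gaps(numbers):
    """Integers between min(numbers) and max(numbers) missing from numbers."""
    if not numbers:
        return []
    present = set(numbers)
    return [n for n in range(min(numbers), max(numbers) + 1) if n not in present]


print(find_gaps([1, 2, 5, 6, 10]))  # [3, 4, 7, 8, 9]
```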
Data Engineer Database
- 0 of 0 votes
Design a system to find the top 10 Twitter hashtags in the most recent 1 min, 10 min, and 1 hr.
- SmashDUNK August 28, 2017 in United States
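A single-process sketch of one building block is shown below: hashtag counts are kept in one-minute buckets, and a query merges the buckets that fall inside the requested window. It is only an illustration of the windowing idea, not a full design. It assumes timestamps are in seconds and arrive in non-decreasing order, it rounds windows to whole minutes, and the class and method names are hypothetical. A production system would also need partitioning and approximate counting.

```python
from collections import Counter, deque


class TrendingHashtags:
    """Per-minute hashtag counts with top-k queries over recent windows."""

    def __init__(self, max_minutes=60):
        self.max_minutes = max_minutes
        self.buckets = deque()  # (minute index, Counter), oldest first

    def record(self, hashtag, timestamp):
        minute = int(timestamp // 60)
        if not self.buckets or self.buckets[-1][0] != minute:
            self.buckets.append((minute, Counter()))
        self.buckets[-1][1][hashtag] += 1
        # Drop buckets that are older than the longest supported window.
        while self.buckets and self.buckets[0][0] <= minute - self.max_minutes:
            self.buckets.popleft()

    def top(self, window_minutes, now, k=10):
        current = int(now // 60)
        merged = Counter()
        for minute, counts in self.buckets:
            if current - window_minutes < minute <= current:
                merged.update(counts)
        return merged.most_common(k)


tracker = TrendingHashtags()
for tag, ts in [("#a", 0), ("#a", 1), ("#a", 2), ("#b", 3), ("#b", 600), ("#c", 605)]:
    tracker.record(tag, ts)
print(tracker.top(60, now=605, k=2))  # [('#a', 3), ('#b', 2)]
```

Keeping one bucket per minute means the 1 min, 10 min, and 1 hr queries all read the same structure and differ only in how many buckets they merge.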
Twitter Data Engineer Software Design
- 0 of 0 votes
number_one = "193283492420348904832902348908239048823480823"
number_two = "3248234890238902348823940990234"
Question:
1) I need to multiply these and get the answer
2) DO NOT CONVERT TO INT AND DO THE MULTIPLICATION
- shopatlemo July 01, 2017 in United States
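A schoolbook multiplication over the digit characters is one way to satisfy the constraint. The sketch below never converts either whole string to an integer; it only reads individual digits through their character codes, which is an assumption about what the interviewer allows. It also assumes non-negative inputs made of decimal digits, and `multiply_strings` is an illustrative name.

```python
def multiply_strings(x, y):
    """Multiply two non-negative decimal strings digit by digit."""
    result = [0] * (len(x) + len(y))  # the product has at most this many digits
    for i in range(len(x) - 1, -1, -1):
        for j in range(len(y) - 1, -1, -1):
            digit_product = (ord(x[i]) - ord("0")) * (ord(y[j]) - ord("0"))
            pos = i + j + 1
            total = result[pos] + digit_product
            result[pos] = total % 10        # digit that stays in this column
            result[pos - 1] += total // 10  # carry into the next column
    digits = "".join(str(d) for d in result).lstrip("0")
    return digits or "0"


number_one = "193283492420348904832902348908239048823480823"
number_two = "3248234890238902348823940990234"
print(multiply_strings(number_one, number_two))
```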
Facebook Data Engineer Python
- 0 of 0 votes
I have two tables:
Supplier Table:
    supp_id
    supp_name
Invoice Table:
    inv_id
    supp_id
    inv_date
    inv_amt
    payment_date
    paid_amt
I want to list the invoice(s) that have the highest invoice amount (inv_amt) for the year 2016.
DO NOT USE THE MIN/MAX FUNCTIONS
- shopatlemo July 01, 2017 in United States
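One way to avoid MIN/MAX is an anti-join: keep a 2016 invoice only if no other 2016 invoice has a larger amount. The sketch below runs that query against an in-memory SQLite database from Python. The table name `invoice`, the ISO `YYYY-MM-DD` text dates, and the sample rows are assumptions made for illustration; they do not come from the question.

```python
import sqlite3

QUERY = """
SELECT i.inv_id, i.inv_amt
FROM invoice AS i
WHERE i.inv_date >= '2016-01-01' AND i.inv_date < '2017-01-01'
  AND NOT EXISTS (
      SELECT 1
      FROM invoice AS j
      WHERE j.inv_date >= '2016-01-01' AND j.inv_date < '2017-01-01'
        AND j.inv_amt > i.inv_amt
  )
ORDER BY i.inv_id
"""


def top_invoices_2016(conn):
    """Return (inv_id, inv_amt) for every 2016 invoice with the top amount."""
    return conn.execute(QUERY).fetchall()


# Hypothetical sample data, including a tie and rows outside 2016.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice (inv_id INTEGER, supp_id INTEGER, inv_date TEXT, "
    "inv_amt REAL, payment_date TEXT, paid_amt REAL)"
)
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 10, "2016-03-01", 500.0, None, None),
        (2, 11, "2016-07-15", 900.0, None, None),
        (3, 10, "2016-11-30", 900.0, None, None),
        (4, 12, "2017-01-05", 5000.0, None, None),
        (5, 12, "2015-12-31", 7000.0, None, None),
    ],
)
print(top_invoices_2016(conn))  # [(2, 900.0), (3, 900.0)]
```

Because the filter compares amounts rather than picking a single row, tied invoices are all returned, which matches the "invoice(s)" wording.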
Facebook Data Engineer SQL
- 0 of 0 votes
Questions related to GROUP BY with HAVING. The ER diagram provided was a customer table.
- harshvp April 12, 2017 in United States for Search
Facebook Data Engineer Database
- 0 of 0 votes
Find the percentage of male customers in a specific area out of all the customers in that area.
- harshvp April 12, 2017 in United States for Search
Facebook Data Engineer Database
- 0 of 0 votes
Get the total number of departments for each employee.
- harshvp April 12, 2017 in United States for Search
Facebook Data Engineer Database