## Data Engineer Interview Questions

- 0 of 0 votes

Create a custom feature transformer in Spark (Scala). Say the DataFrame looks like this:

- ashwini.padhy89 December 03, 2018 in India

+------------------------+
|email_list              |
+------------------------+
|testmail1115@gmail.com  |
|mavenmaven@mlail.com    |
|dnd.7899334622@gmail.com|
+------------------------+

If I use the transformer, it converts the input array of strings into an array of n-grams, like this:

+------------------------+--------------------+
|email_list              |ngrams              |
+------------------------+--------------------+
|testmail1115@gmail.com  |[t e, e s, s t, t...|
|mavenmaven@mlail.com    |[m a, a v, v e, e...|
|dnd.7899334622@gmail.com|[d n, n d, d...     |
+------------------------+--------------------+

How can I get the distinct n-grams present, rather than the pattern or array?
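One possible reading of the question is "collect every distinct character bigram across all rows". The sketch below shows that idea in plain Python rather than Spark; the function names `char_ngrams` and `distinct_ngrams` are hypothetical, and the space-joined output format simply mirrors the `t e, e s` style shown in the table.

```python
def char_ngrams(text, n=2):
    """Return the character n-grams of text, joined with spaces (e.g. 't e')."""
    return [" ".join(text[i:i + n]) for i in range(len(text) - n + 1)]


def distinct_ngrams(emails, n=2):
    """Return each n-gram once, in order of first appearance across all emails."""
    seen = set()
    ordered = []
    for email in emails:
        for gram in char_ngrams(email, n):
            if gram not in seen:
                seen.add(gram)
                ordered.append(gram)
    return ordered
```

In Spark itself, a comparable result would probably come from exploding the n-gram array column and taking distinct values, but that is an assumption about the intended pipeline.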

StartUp Data Engineer

- 0 of 0 votes

You have two files in HDFS: one holds date ranges in two columns (start date and end date), and the other holds two columns, date and visitors. Write Spark code that uses both files to find the date range with the maximum number of visitors.

- tokritijain October 30, 2018 in India
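The core logic can be sketched without Spark: sum the visitors that fall inside each range and keep the largest total. This is a minimal plain-Python sketch that assumes both range endpoints are inclusive and that the data fits in memory; `busiest_range` is a hypothetical name.

```python
from datetime import date


def busiest_range(ranges, visits):
    """ranges: list of (start, end) dates, inclusive.
    visits: list of (day, visitors) pairs.
    Returns (best_range, total_visitors), or (None, None) if ranges is empty."""
    best_range, best_total = None, None
    for start, end in ranges:
        total = sum(count for day, count in visits if start <= day <= end)
        if best_total is None or total > best_total:
            best_range, best_total = (start, end), total
    return best_range, best_total
```

In Spark, the same idea would likely be a range join between the two DataFrames followed by a group-by on the range and a sum of visitors.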

Amazon Data Engineer

- -1 of 1 vote

We have two sequences A and B consisting of integers, both of length N, and you would like them to be (strictly) increasing, i.e. for each K (0 ≤ K < N − 1), A[K] < A[K + 1] and B[K] < B[K + 1]. Thus, you need to modify the sequences, but the only manipulation you can perform is to swap an arbitrary element in sequence A with the corresponding element in sequence B. That is, both elements to be exchanged must occupy the same index position within each sequence.

- koustav.adorable March 03, 2018 in United States

For example, given A = [5, 3, 7, 7, 10] and B = [1, 6, 6, 9, 9], you can swap elements at positions 1 and 3, obtaining A = [5, 6, 7, 9, 10], B = [1, 3, 6, 7, 9].

Your goal is to make both sequences increasing, using the smallest number of swaps.

Write a function:

public int minswaps(int[] A, int[] B);

that, given two zero-indexed arrays A, B of length N, containing integers, returns the minimum number of swapping operations required to make the given arrays increasing. If it is impossible to achieve the goal, return −1.

For example, given:

A[0] = 5 B[0] = 1

A[1] = 3 B[1] = 6

A[2] = 7 B[2] = 6

A[3] = 7 B[3] = 9

A[4] = 10 B[4] = 9

your function should return 2, as explained above.

Given:

A[0] = 5 B[0] = 2

A[1] = -3 B[1] = 6

A[2] = 6 B[2] = -5

A[3] = 4 B[3] = 1

A[4] = 8 B[4] = 0

your function should return −1, since you cannot perform operations that would make the sequences become increasing.

Given:

A[0] = 1 B[0] = -2

A[1] = 5 B[1] = 0

A[2] = 6 B[2] = 2

your function should return 0, since the sequences are already increasing.

Assume that:

N is an integer within the range [2..100,000];

each element of arrays A, B is an integer within the range [−1,000,000,000..1,000,000,000];

A and B have equal lengths.

Complexity: O(N)
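One common way to reach O(N) is a two-state dynamic program: for each index, track the cheapest cost when that index is kept and when it is swapped. The sketch below is one such approach, not necessarily the expected answer; it is written in Python with the hypothetical name `min_swaps` rather than the Java signature above.

```python
def min_swaps(a, b):
    """Minimum same-index swaps that make both a and b strictly increasing, or -1."""
    inf = float("inf")
    keep, swap = 0, 1  # cost so far if index 0 is kept / swapped
    for i in range(1, len(a)):
        new_keep = new_swap = inf
        # Same decision at i-1 and i (both kept, or both swapped).
        if a[i - 1] < a[i] and b[i - 1] < b[i]:
            new_keep = min(new_keep, keep)
            new_swap = min(new_swap, swap + 1)
        # Opposite decisions at i-1 and i.
        if a[i - 1] < b[i] and b[i - 1] < a[i]:
            new_keep = min(new_keep, swap)
            new_swap = min(new_swap, keep + 1)
        keep, swap = new_keep, new_swap
    best = min(keep, swap)
    return -1 if best == inf else best
```

The two states use O(1) extra memory, and an unreachable state stays at infinity, which is how the impossible case is detected.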

Amazon Data Engineer Algorithm

- 1 of 1 vote

Given an array A, find the number of tuples (i, j, k, l) such that A[i] + A[j] + A[k] = A[l], where i < j < k < l.

- ajay.raj January 26, 2018 in United States
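A possible O(N²) approach rewrites the condition as A[i] + A[j] = A[l] − A[k] and keeps a running count of pair sums whose indices both lie left of k. The following is a sketch of that idea; `count_quadruples` is a hypothetical name.

```python
from collections import Counter


def count_quadruples(a):
    """Count index tuples i < j < k < l with a[i] + a[j] + a[k] == a[l]."""
    pair_sums = Counter()  # a[i] + a[j] for all i < j < current k
    total = 0
    for k in range(len(a)):
        # Use k as the third index and try every l to its right.
        for l in range(k + 1, len(a)):
            total += pair_sums[a[l] - a[k]]
        # Now allow k to serve as j for later values of k.
        for i in range(k):
            pair_sums[a[i] + a[k]] += 1
    return total
```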

Google Data Engineer

- 0 of 0 votes

There are three numbers a, b, and c. The product of any two of them is equal to the third. That is, a*b = c, b*c = a, and a*c = b. What are the possible values of a, b, and c?

- D PRAVEEN KUMAR January 23, 2018 in India
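Assuming all three equations must hold at once, multiplying them gives (abc)² = abc, so abc is 0 or 1; that leaves the all-zero triple and the triples of ±1 whose product is 1. The brute-force check below only searches a small set of integers, so it illustrates the answer rather than proving it; `product_triples` is a hypothetical name.

```python
from itertools import product


def product_triples(values):
    """Return all (a, b, c) drawn from values where each pairwise product equals the third."""
    return [
        (a, b, c)
        for a, b, c in product(values, repeat=3)
        if a * b == c and b * c == a and a * c == b
    ]
```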

Skill Subsist Impulse Ltd Data Engineer General Questions and Comments

- 0 of 0 votes

You are given a 2xN board and two kinds of tiles: 1x2 (two squares across) and 2x1 (two squares up). How many ways can you fill the board?

`** ** * * * * ** **`

Follow-up: four new kinds of tiles are added, L shapes in different orientations, so there are now six kinds of tiles. How many ways can you fill the board now?

- ajay.raj January 23, 2018 in United States
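For the base question with only the two domino tiles, the count follows the Fibonacci recurrence: the last column is either one vertical tile (leaving a 2x(N−1) board) or two stacked horizontal tiles (leaving 2x(N−2)). A minimal sketch, with the hypothetical name `count_tilings`; it does not cover the L-shaped follow-up.

```python
def count_tilings(n):
    """Number of ways to tile a 2 x n board with 1x2 and 2x1 dominoes."""
    prev, curr = 1, 1  # ways for widths 0 and 1
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr
```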

Google Data Engineer

- 0 of 0 votes

Given a binary matrix where 0 represents sea and 1 represents land, the value also represents the altitude. If a cell is land and all eight of its neighbors are also land, that cell can become 2; in general, the altitude of a cell and of each of its eight neighbors cannot differ by more than 1. Return the highest altitude that can be reached (special case: if the entire matrix is 1, the altitude is unlimited).

- ajay.raj January 23, 2018 in United States
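Under one reading of the problem, each land cell can rise to its distance from the nearest sea cell, where a step to any of the eight neighbors counts as 1. That suggests a multi-source breadth-first search starting from every sea cell. The sketch below follows that interpretation, which is an assumption; it treats cells outside the matrix as neither sea nor land and returns `None` for the unlimited case. `highest_altitude` is a hypothetical name.

```python
from collections import deque


def highest_altitude(grid):
    """Largest 8-neighbor distance from any cell to the nearest 0, or None if no 0 exists."""
    rows, cols = len(grid), len(grid[0])
    dist = [[-1] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0:
                dist[r][c] = 0
                queue.append((r, c))
    if not queue:
        return None  # all land: altitude is unbounded
    best = 0
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] == -1:
                    dist[nr][nc] = dist[r][c] + 1
                    best = max(best, dist[nr][nc])
                    queue.append((nr, nc))
    return best
```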

Google Data Engineer

- 0 of 0 votes

Given a weighted n-ary tree, find the longest path from the root node to a leaf node.

- ajay.raj January 21, 2018 in United States

class Node {
    int id;
    // connected node id, edge weight
    Map<Integer, Integer> edges;
}
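A depth-first traversal that carries the accumulated weight down each branch is one straightforward way to solve this. The Python sketch below assumes the tree is given as a dictionary from node id to `{child id: edge weight}`, mirroring the `edges` map in the `Node` class, and that edges point from parent to child; `longest_root_to_leaf` is a hypothetical name.

```python
def longest_root_to_leaf(edges, root):
    """edges: dict mapping node id -> {child id: edge weight}.
    Returns the largest total weight over all root-to-leaf paths."""
    best = None
    stack = [(root, 0)]  # (node id, weight accumulated from the root)
    while stack:
        node, dist = stack.pop()
        children = edges.get(node, {})
        if not children:  # leaf: compare its path weight with the best so far
            best = dist if best is None else max(best, dist)
        for child, weight in children.items():
            stack.append((child, dist + weight))
    return best
```

An explicit stack avoids recursion-depth limits on very deep trees.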

Google Data Engineer

- 0 of 0 votes

Given a binary matrix, count the number of squares that consist entirely of 0s.

- ajay.raj January 20, 2018 in United States
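A standard dynamic-programming approach records, for each cell, the side of the largest all-zero square whose bottom-right corner is that cell; that side length is also the number of all-zero squares ending there, so summing it over the matrix gives the count. A sketch, with the hypothetical name `count_zero_squares`:

```python
def count_zero_squares(matrix):
    """Count all square sub-matrices (of any size) that contain only 0s."""
    if not matrix or not matrix[0]:
        return 0
    cols = len(matrix[0])
    prev = [0] * cols  # largest-square sides for the previous row
    total = 0
    for r, row in enumerate(matrix):
        curr = [0] * cols
        for c, value in enumerate(row):
            if value == 0:
                if r == 0 or c == 0:
                    curr[c] = 1
                else:
                    curr[c] = 1 + min(prev[c], prev[c - 1], curr[c - 1])
                total += curr[c]
        prev = curr
    return total
```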

Google Data Engineer

- 0 of 0 votes

Given a string p, called the order, such as abc, meaning a comes before b, and so on.

- ajay.raj January 20, 2018 in United States

Given a second string s, determine whether it follows the order of p and return a boolean.

Examples:

aaa returns true.

cba returns false.

aaxyc returns true; letters that do not appear in the order are skipped.
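One way to check this is to map each letter of the order to its position and verify that the positions seen in s never decrease. A sketch, with the hypothetical name `follows_order`, using p = abc for the examples:

```python
def follows_order(order, s):
    """Return True if the letters of s that occur in order appear in non-decreasing rank."""
    rank = {ch: i for i, ch in enumerate(order)}
    last = -1
    for ch in s:
        if ch not in rank:
            continue  # letters outside the order are skipped
        if rank[ch] < last:
            return False
        last = rank[ch]
    return True
```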

Google Data Engineer

- 0 of 0 votes

MS FTE Question:

- mktauseef October 04, 2017 in United States

Find the gaps in the sequence 1, 2, 5, 6, 10.

Answer: 3, 4, 7, 8, 9
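The expected answer can be reproduced by listing every integer between the smallest and largest value that is absent from the input. A plain-Python sketch (the question is tagged Database, so a SQL answer may have been intended); `find_gaps` is a hypothetical name.

```python
def find_gaps(numbers):
    """Return the integers missing between min(numbers) and max(numbers)."""
    present = set(numbers)
    return [n for n in range(min(numbers), max(numbers) + 1) if n not in present]
```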

Data Engineer Database

- 0 of 0 votes

Design a system to find the top 10 Twitter hashtags in the most recent 1 minute, 10 minutes, and 1 hour.

- SmashDUNK August 28, 2017 in United States
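At the heart of such a system is a sliding-window counter: add each hashtag as it arrives and subtract it once it falls out of the window. The single-process sketch below shows only that core, assuming timestamps are in seconds and arrive in non-decreasing order; a real design would also need partitioning and approximate counting, which are not shown. `TrendingHashtags` is a hypothetical name, and one instance per window (60, 600, 3600 seconds) would cover the three ranges.

```python
from collections import Counter, deque


class TrendingHashtags:
    """Sliding-window hashtag counter for a single time window."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()  # (timestamp, tag) in arrival order
        self.counts = Counter()

    def record(self, tag, timestamp):
        self.events.append((timestamp, tag))
        self.counts[tag] += 1

    def _expire(self, now):
        # Drop events that are at least one full window older than now.
        while self.events and self.events[0][0] <= now - self.window:
            _, tag = self.events.popleft()
            self.counts[tag] -= 1
            if self.counts[tag] == 0:
                del self.counts[tag]

    def top(self, now, k=10):
        self._expire(now)
        return [tag for tag, _ in self.counts.most_common(k)]
```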

Twitter Data Engineer Software Design

- 0 of 0 votes

number_one = "193283492420348904832902348908239048823480823"

- shopatlemo July 01, 2017 in United States

number_two = "3248234890238902348823940990234"

Question:

1) Multiply these two numbers and return the answer.

2) DO NOT CONVERT TO INT AND DO THE MULTIPLICATION
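Schoolbook long multiplication on the digit characters is one way to satisfy the constraint. The sketch below assumes both inputs are non-negative decimal strings and that converting a single character to its digit value is allowed (only whole-number conversion is avoided); `multiply_strings` is a hypothetical name.

```python
def multiply_strings(x, y):
    """Multiply two non-negative decimal strings without converting them to int."""
    if x == "0" or y == "0":
        return "0"
    result = [0] * (len(x) + len(y))  # digit slots, most significant first
    for i in range(len(x) - 1, -1, -1):
        dx = ord(x[i]) - ord("0")
        for j in range(len(y) - 1, -1, -1):
            dy = ord(y[j]) - ord("0")
            total = dx * dy + result[i + j + 1]
            result[i + j + 1] = total % 10   # digit stays in this slot
            result[i + j] += total // 10     # carry moves one slot left
    digits = "".join(chr(d + ord("0")) for d in result)
    return digits.lstrip("0") or "0"
```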

Facebook Data Engineer Python

- 0 of 0 votes

I have two tables:

- shopatlemo July 01, 2017 in United States

Supplier Table: supp_id, supp_name

Invoice Table: inv_id, supp_id, inv_date, inv_amt, payment_date, paid_amt

I want to list the invoice(s) that have the highest inv_amt for the year 2016.

DO NOT USE MIN/MAX functions.
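One way to avoid MIN/MAX is an anti-join: keep every 2016 invoice for which no other 2016 invoice has a larger amount. The sketch below runs that query through Python's built-in `sqlite3` module so it is self-contained. It assumes `inv_date` is stored as ISO `YYYY-MM-DD` text and leaves out the payment columns for brevity; `top_invoices_2016` is a hypothetical name.

```python
import sqlite3

TOP_INVOICES_2016 = """
SELECT i.inv_id, i.inv_amt
FROM invoice AS i
WHERE i.inv_date >= '2016-01-01' AND i.inv_date < '2017-01-01'
  AND NOT EXISTS (
      SELECT 1
      FROM invoice AS j
      WHERE j.inv_date >= '2016-01-01' AND j.inv_date < '2017-01-01'
        AND j.inv_amt > i.inv_amt
  )
ORDER BY i.inv_id
"""


def top_invoices_2016(rows):
    """rows: iterable of (inv_id, supp_id, inv_date, inv_amt) with ISO date strings."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoice (inv_id INTEGER, supp_id INTEGER, inv_date TEXT, inv_amt REAL)"
    )
    conn.executemany("INSERT INTO invoice VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(TOP_INVOICES_2016).fetchall()
    conn.close()
    return result
```

Unlike an ORDER BY with LIMIT 1, this form returns every invoice that ties for the highest amount.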

Facebook Data Engineer SQL

- 0 of 0 votes

Questions related to GROUP BY with HAVING. The ER diagram provided was for a customer table.

- harshvp April 12, 2017 in United States for Search

Facebook Data Engineer Database

- 0 of 0 votes

Find the percentage of male customers in a specific area out of all the customers in that area.

- harshvp April 12, 2017 in United States for Search
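The question does not give the customer schema, so the sketch below assumes a hypothetical `customer` table with `cust_id`, `gender` (values such as 'M' and 'F'), and `area` columns. It uses conditional aggregation, run through Python's built-in `sqlite3` module; `male_percentage` is a hypothetical name.

```python
import sqlite3

MALE_SHARE_SQL = """
SELECT 100.0 * SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END) / COUNT(*)
FROM customer
WHERE area = ?
"""


def male_percentage(rows, area):
    """rows: iterable of (cust_id, gender, area). Returns a percentage, or None if the area has no customers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (cust_id INTEGER, gender TEXT, area TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    (share,) = conn.execute(MALE_SHARE_SQL, (area,)).fetchone()
    conn.close()
    return share
```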

Facebook Data Engineer Database

- 0 of 0 votes

Get the total number of departments for each employee.

- harshvp April 12, 2017 in United States for Search
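No schema is given here either, so this sketch assumes the data is a list of hypothetical (emp_id, dept_id) assignment pairs and counts the distinct departments per employee, which is what a `COUNT(DISTINCT dept_id) ... GROUP BY emp_id` query would return. `departments_per_employee` is a hypothetical name.

```python
from collections import defaultdict


def departments_per_employee(assignments):
    """assignments: iterable of (emp_id, dept_id) pairs; repeated pairs count once."""
    depts = defaultdict(set)
    for emp_id, dept_id in assignments:
        depts[emp_id].add(dept_id)
    return {emp_id: len(ids) for emp_id, ids in depts.items()}
```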

Facebook Data Engineer Database
