## ThoughtWorks Interview Question

Applications Developers

**Country:** India

**Interview Type:** Written Test

Just a thought process (I'm about to enter undergraduate school), but what I make of this problem is creating a data structure (say, a set) for every subject.

The first thing to establish is the total marks of each test.

For example, if subject 1 is out of 100, we can subtract each obtained score from 100 and enter the result in the set: 100 - 20 = 80, so 80 enters the set.

Then we take the second student's score, subtract it from 100, and get 65.

Comparing 65 with the 80 already in the set, since 65 is less than 80, we can regard the second student's score as higher than the first student's.

Do the same with the third student.

Just a thought process and an idea.
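A minimal Python sketch of this subtraction idea (my illustration, not the poster's code; the student names and the total of 100 come from the example above):

```python
# Subtract each obtained score from the subject's total; a smaller
# remainder means a higher score, so comparing remainders ranks students.
total = 100
scores = {'Student1': 20, 'Student2': 35}

remainders = {name: total - s for name, s in scores.items()}
# Student1 -> 80, Student2 -> 65

# The smallest remainder identifies the highest scorer in this subject.
best = min(remainders, key=remainders.get)
print(best)  # Student2
```

Note that this is equivalent to simply taking the maximum of the raw scores; the subtraction only reverses the order being compared.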

Sorting is not a good idea here; we do not need it. The problem can be solved without sorting, in linear time.

Specifically, we can solve it in 2n steps, and in fact in exactly n, i.e. in a single traversal. But first, here is the 2n version in ZoomBA:

```
data = [ [ 'Student1', 20, 40, 65 ],
         [ 'Student2', 35, 40, 50 ],
         [ 'Student3', 10, 55, 65 ] ]
// for 3 subjects... can be generalized ...
max = list( [0:3] ) -> { -1 } // hoping scores do not become -ve
max = fold( data, max ) -> {
    row = $.o ; max = $.p
    for ( i = 1 ; i < 4 ; i += 1 ){
        if ( row[i] > max[i-1] ){
            max[i-1] = row[i]
        }
    }
    max // return
}
// next, pick and group toppers: student -> topped_in_sub_count
// select those who are toppers in at least n subjects
n = 2
toppers = fold( data ) -> {
    row = $.o
    count = sum( [1:4] ) -> { row[$.o] == max[$.i] ? 1 : 0 }
    continue( count < n ) // skip when count is < n
    [ row[0], count ] // return
}
println( toppers )
```
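The single-traversal (exact n) variant mentioned above can be sketched as follows. This Python version is my own sketch, not the original poster's code (the function name `toppers` and the dict return shape are my choices). The trick is to track, per subject, both the running maximum and the set of students currently tied at it, so no second pass over the data is needed:

```python
# One pass over the rows: per subject, keep the current max score and
# the set of students achieving it, updating both as rows arrive.
def toppers(data, n):
    num_subjects = len(data[0]) - 1
    best = [None] * num_subjects                    # max score per subject
    leaders = [set() for _ in range(num_subjects)]  # students at that max

    for row in data:
        name, scores = row[0], row[1:]
        for i, s in enumerate(scores):
            if best[i] is None or s > best[i]:
                best[i] = s
                leaders[i] = {name}       # new strict leader
            elif s == best[i]:
                leaders[i].add(name)      # tie at the current max

    # count, per student, how many subjects they top
    counts = {}
    for group in leaders:
        for name in group:
            counts[name] = counts.get(name, 0) + 1
    return {name: c for name, c in counts.items() if c >= n}

data = [['Student1', 20, 40, 65],
        ['Student2', 35, 40, 50],
        ['Student3', 10, 55, 65]]
print(toppers(data, 2))  # {'Student3': 2}
```

On the sample data, Student3 tops subject 2 (55) and ties for the top of subject 3 (65), so only Student3 tops at least two subjects.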


```
object Misc {
  def main(args: Array[String]) {
    // Student1 20 40 65
    // Student2 35 40 50
    // Student3 10 55 65
    // Given n = 2.
    // Find the names of the students who got top marks in at least n subjects.
    val warehouseLocation = "file:${system:user.dir}/spark-warehouse"
    val sparkSession = SparkSession.builder().appName("SparkUseCaseLearning").master("local")
      .config("spark.sql.warehouse.dir", warehouseLocation)
      .enableHiveSupport().getOrCreate()
    import sparkSession.sqlContext.implicits._
    val studentMarksDF = Seq((20, 40, 65), (35, 40, 50), (10, 55, 65)).toDF("sub1", "sub2", "sub3")
    val totalMarksDF = studentMarksDF.withColumn("Total", $"sub1" + $"sub2" + $"sub3").orderBy($"Total".desc)
    totalMarksDF.show()
    // +----+----+----+-----+
    // |sub1|sub2|sub3|Total|
    // +----+----+----+-----+
    // |  10|  55|  65|  130|
    // |  20|  40|  65|  125|
    // |  35|  40|  50|  125|
    // +----+----+----+-----+
    println(totalMarksDF.head())
    // [10,55,65,130]
  }
}
```

The answer is given in the question: Student 3 obviously has the highest marks, and it would be wrong to average out the scores of Students 1 and 2 because their totals are equal. If Student 3 is our baseline and 65 is the high score, then Student 1 also scored a 65, which is higher than anything Student 2 scored.

- Amanda December 13, 2016