## Amazon Interview Question

SDE-2s**Country:**India

**Interview Type:**In-Person

This is problem of "Maximum Independent Set", so your solution is not correct. As a NPC problem, maybe the only way is searching.

MIS is fundamentally different problem than the one mentioned here.

MIS is finding a maximal subset of nodes S a Graph (G=(V,E). Where no two elements of S are connected by edge E.

This is create a new Graph G'(V',null) ; V' is subset of V ,such that there are no edges in the new graph.

Greedy approach of removing node with maximum outgoing edges will work.

You approach to solve the problem is good ,but you have to include some more steps when you remove any node.

Lets say you are maintaining a counter which counts how many times ,you re removing nodes for that set.

Construct a graph of given set with edges of common between set.

When you remove first edge ,remove common node from one set and set the flag of that set 1 denotes that 1 node has removed from that set

Construct the graph again with remaining nodes

Take the next edge from the ,remove the node from the set where counter is less

if both the counters are equal then remove the nodes from the bigger set.

Repeat the above process till there are any edge left in the graph

Then give the final set of independent sets.

I understood this problem as a problem with multiple sets with integer numbers. There are common numbers between the sets. So remove the common numbers so that all sets are independent sets.

Example question:

Set1: {1,2,3,4,5}

Set2: {3,6,7}

Set3: {3,4,5}

Now remove the duplicate elements in the set so that you get maximum number of sets

So instead of removing all the elements from set 3 as they are all duplicate, I can generate the output result as

Set1: {1,2,5}

Set2: {6,7}

Set3: {4}

Is this a correct interpretation of this problem?

Note: This can not be a maximum independent set problem as noted by some. In the MIS problem usually the goal is to find an independent "weighted" set with highest sum. Here the restriction for the independent set is the set with no two adjacent elements present in the set.

example: for the set {1,4,5,4,10,6,1,4}, the max independent set is {1,5,10,4} with the max weight of 20.

Ta for write up, helps make sense of the problem.

One question I have is why is the 3 is excluded from the results?

Only solution that springs to mind:

* Scan through and get a list of numbers in more than one list.

* Then go through each list, and each number in each list.

* If current number is not shared with other lists

* add it to the curent list

* add it to a visted Stack so we know not to use it in lists we're yet to visit

* go to next item in that list.

* If current number is shared and we haven't added it to previous list (its not in visited Stack) then have 2 choices

* add it to current list, visited Stack, and continue

* don't add it to this list and continue, leaving it open to be used in any future lists that have it

* Each time we hit a situation where we've evaluated an entire path, as in we've looked at each number in each list, we compare the solution to the best we've got so far and if its better save it.

The performance is going to be dominated by the branching, in terms of choice between putting it in current list or not, caused by a shared unvisited number. I'd think O(2^S) where S is the count of instances where a particular number is in more than one list.

Having said that with this approach the 3 would be added to one of the sets which is why I ask.

Using Greedy Approach, we can solve this. As per the problem, we have given N number of sets, for example,

S1 = {1,2,3,4}

S2 = {2,3}

S3 = {4}

S4 = {5,6}

Optimal Solution for this problem is S2, S3 & S4.

Algorithm for this Optimal solution is,

Find the set with less number of elements in it and remove the sets that intersect with it.

From above, we have S3 with less number of elements ie. {4}. Now remove the set which intersect with S3. So S1 intersect with S3. Now remove S1. And do this recursively , So we end up getting the optimal solution.

```
// Given N sets of integers, remove some sets so that the remaining all sets
// are disjoint with one another. Find the optimal solution so that the number
// of sets remaining at the end is maximum
import java.util.*;
// Instance of this class passed during downward recursion.
class Superset extends HashSet<Integer> {
// Add set only if it's disjoint, updating the used map accordingly.
public boolean addMaybe(Set<Integer> set) {
for (int i : set) {
if (this.contains(i))
return false;
}
// If here, add the entire set.
this.addAll(set);
return true;
}
}
// Note: No real need for the set to be sorted during processing, but it makes for nicer output display.
// Empty instances of this class are created by the recursion "base case", and passed up during post-recursion ascent,
// with each level adding an index to the set.
// Note: The integers in the TreeSet correspond to indices into the ArrayList of candidate sets being down-selected.
class Optset extends TreeSet<Integer> {}
// Encapsulate the optimization algorithm.
class OptimalSet {
private ArrayList<Set<Integer>> sets;
int n, m, max;
// The original list of int sets, along with the numeric parameters, is passed during construction.
OptimalSet(ArrayList<Set<Integer>> sets, int n, int m, int max) {
this.sets = sets;
this.n = n;
this.m = m;
this.max = max;
}
// Recursive algorithm for finding optimal set of sets.
// Explanation: At each level (defined by input idx), we check to see whether the int set corresponding to input idx
// can be added to the existing Superset (also input) without violating the disjointness constraint. If not, we
// return null to prune the subtree at this point; otherwise, we add the int set to the Superset, and loop over
// "child" levels, each of which represents an int set candidate for addition to the current superset. Each child
// recursion returns its optimal set (Optset). Because the recursions that build the subtrees have knowledge of the
// tree above them (via passed Superset instance), we know the returned set is valid; thus, each level simply
// chooses the "best" subtree set (i.e., the one containing the most int sets), adds itself, and returns the updated
// Optset. After all recursion has finished, there will be only 1 surviving Optset: the optimal one (or at least one
// that has no betters).
Optset find(Superset sset, int idx) {
// Add the set at idx (if possible).
// TODO: Potential optimization. Don't addMaybe if no possibility to recurse.
if (idx < 0 || sset.addMaybe(sets.get(idx))) {
// Accept this level.
if (idx < 0)
// Bootstrap.
sset = new Superset();
// Recurse on subtrees, looking for best Optset.
Optset bset = null;
for (int i = idx + 1; i < n; i++) {
Optset oset = find(sset, i);
if (oset != null && (bset == null || oset.size() > bset.size())) {
// Found a better subtree
bset = oset;
}
}
// Is there a best set from amongst children?
if (bset == null)
// Either no children, or no child subtree that can be added.
bset = new Optset();
if (idx >= 0) {
// Remove what was added by addMaybe before returning.
// Rationale: Reuse a single Superset for entire recursion, rather than cloning many.
sset.removeAll(sets.get(idx));
// Add self before returning.
bset.add(idx);
}
return bset;
}
// Prune this level (since adding it to superset violates disjointness).
return null;
}
}
public class OptimalSetTest {
// Usage:
// Example usage: 10 sets, 5 ints per set, ints between 0 and 99
// java OptimalSetTest 10 5 100
public static void main(String argv[]) {
int n = Integer.parseInt(argv[0]); // # sets
int m = Integer.parseInt(argv[1]); // ints per set
int max = Integer.parseInt(argv[2]); // exclusive endpoint of rand range
// Build array of n sets of random ints; each set contains m ints in range [0, max), and each int is unique
// within its set (but not across the sets).
Random rand = new Random();
ArrayList<Set<Integer>> sets = new ArrayList<Set<Integer>>();
for (int i = 0; i < n; i++) {
TreeSet<Integer> set = new TreeSet<Integer>();
sets.add(set);
// Fill the set with random ints.
for (int j = 0; j < m; j++) {
int ri;
while (set.contains(ri = rand.nextInt(max))) {}
set.add(ri);
}
}
// Display the sets, along with the index in master array of each.
int j = 0;
for (Set<Integer> set : sets) {
System.out.println(j++ + "\t: " + set);
}
// Invoke recursive method that returns optimal set of sets.
// Note: Returned Optset contains only the indices of the selected sets.
Optset oset = new OptimalSet(sets, n, m, max).find(null, -1);
// Display the optimal set of sets.
System.out.println();
System.out.println("Optimal set of sets (containing " + oset.size() + " disjoint subsets):");
// Concatenate the set of sets into a single superset both for display purposes, and to demonstrate uniqueness
// using implicit constraints of a TreeSet.
Set<Integer> all = new TreeSet<Integer>();
for (int i : oset) {
System.out.println(i + "\t: " + sets.get(i));
all.addAll(sets.get(i));
}
System.out.println("Superset (containing " + all.size() + " ints):");
System.out.println(all);
}
}
// vim:ts=4:sw=4:et:tw=120
```

A simple approach is a recursive approach. A given set is either in the optimal set, or is not. Considering both these possibilities, we can write an algorithm that makes this comparison recursively.

ArrayList<Set<Integer>> getMaxDisjointLists(ArrayList<Set<Integer>> list){

if(list.isEmpty){

return list;

}

ArrayList<Set<Integer>> withoutSet0 getMaxDisjointLists(list.remove(0));

HashSet<Integer> currentSet = new HashSet<Integer>();

for(Integer i: list.get(0)){

currentSet.add(i);

}

for(int i=1;i<list.size();i++){

if(!currentSet.retainAll(list.get(i)).isEmpty()){

list.remove(i);

}

withSet0 = getMaxDisjointLists(list);

if(withSet0.size() >withoutSet0.size()){

return withSet0;

}else{

return withoutSet0;

}

}

I would do it this way:

//create 3 collections, the first representing the N sets. the second being the "winning collection of sets, starting off empty, collection 3 being my 'working collection'

for(i=1;i<collection1.length;i++)

{

clear out collection 3

//assume set i MUST be in the 'winning set'

add set(i) from collection 1 to collection 3

loop through the sets and add all sets that are disjoint to set(i) to collection 3

if collection3.length > collection2.length

we have a new winner, so clear collection 2, and copy collection 3 into it.

}

this does not handle ties

```
# Question
# Given N sets of integers, remove some sets so that the remaining all sets are disjoint with one another.
# Find the optimal solution # so that the number of sets remaining at the end is maximum
def return_disjoint_sets(list_of_sets):
index_hash = {} # key is element, # value is the index of set
set_idx = 0
for set in list_of_sets:
for elem in list(set):
try:
index_hash[elem].append(set_idx)
except:
index_hash[elem] = [set_idx]
set_idx+=1
# now we have a index hash that looks like
# for example when a list of sets l=[{1,2,3},{4,5},{6,7},{3,5}] is given
# index_hash = {1:[0], 2:[0], 3:[0,3], 4:[1], 5:[1,3], 6:[2], 7:[2]}
# now we know what element exist more than one sets
remove_idx = []
set_idx = 0
for set in list_of_sets:
for elem in list(set):
if len(index_hash[elem])>1:
remove_idx.append(set_idx)
break
set_idx+=1
# make a list of disjoint sets
return_list = []
set_idx = 0
for set in list_of_sets:
if set_idx not in remove_idx:
return_list.append(set)
set_idx+=1
return return_list
def main():
list_of_set_example = [set([1,2,3]),set([4,5]),set([6,7]),set([3,5])]
print return_disjoint_sets(list_of_set_example)
if __name__=="__main__":
main()
```

I would allude to JerseyDukes solution on the definition of Dynamic (Bottom Up)

The Maximum Number of DisjointSets = Max( MaxNumber Of DisjointSets with A Already in the List, MaxNumber of DisjointSets with B Already in the List) and so on.

Rough Pseudo,

So we iterate over the remaining in workingList

if (ListInCUrrentIndex NOT DISJOINT to CurrentSuccessList) continue;

SuccessList.Add(ListInCurrentIndex)

n = 1+ Maximum(givenList - ListInCurrentIndex);

if n > max, Update variables.

Note, you are repeatedly calculating if two sets are disjoint, you can memoize it in a hashmap (ToString_ToString as the Key). You can use one other HashMap to Hash Set to ToString if the interviewer will let you.

I don't see anything that beats a simple combinatoric approach.

```
/**
* @param max indices of the maximum list of disjoint sets
* @param disjoint the union of all used sets
* @param used indices of the sets included in the disjoint set. HashSet isn't necessarily efficient, but it's simple
* @return indices of the maximum number of disjoint sets
*/
public Set<Integer> getDisjointSets(Set<Integer> max,
Set<Integer> disjoint,
Set<Integer> used,
int idx,
List<Set<Integer>> sets) {
if(max.size() < used.size()) {
max.clear();
max.addAll(used);
}
// while used has a chance to beat max
for(; idx < sets.size() && max.size() < used.size() + (sets.size()-idx); idx++) {
Set<Integer> set = sets.get(idx);
if(!set.stream().anyMatch(item->disjoint.contains(item))) {
disjoint.addAll(set);
used.add(idx);
getDisjointSets(max, disjoint, used, idx+1, sets);
used.remove(idx);
disjoint.removeAll(set);
}
}
return max;
}
```

PS: the given problem is NP hard

what they are expecting ...

1.A backtracking problem non polynomail probelm

2. you can give a greeyd problem with the argument that it will not work in most of the cases

done!!!

Construct a graph where sets are nodes and an edge exists between them if there is a common element. Start from the node with maximum number of outgoing edges and keep removing the nodes until there are no edges between nodes

- BalajiN July 12, 2014