Given a set of numbers, find t

Google Interview Question for Applications Developers

5

of 5 votes

57
Answers
Given a set of numbers, find the longest subset with consecutive numbers be it any order.

Input:
S = { 5, 1, 9, 3, 8, 20, 4, 10, 2, 11, 3}

we have 2 consecutive sets
s1 = {1, 2, 3, 4, 5}
s2 = { 8, 9, 10, 11}

Ans.
s1 = {1, 2, 3, 4, 5}
- anonymous June 14, 2013 in India | Report Duplicate | Flag | PURGE
Google Applications Developer Algorithm

Email me when people comment.

An error occurred in subscribing you.

Country: India
Interview Type: In-Person

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 4 vote

here is my answer to this question

1. Create a new hashmap. Its key will contain starting index of contiguous set and value be ending index of contiguous set.
2. Iterator through array.
3. Let each value be x. look for x-1 and x+1 in hashmap.
case1. there exists an entry for x-1, so update its value to x if x is greater than existing value
case2. there exists an entry for x+1, get its value, say vx and insert an entry with key as x and value as vx.
case 3. there exists an entry for x; simply ignore
case 4: insert into hashmap, key x and value x

Ex: S = { 5, 1, 9, 3, 8, 20, 4, 10, 2, 11}
5 => { (5,5) }
1 => { (5,5), (1,1) }
9 => { (5,5), (1,1), (9,9) }
3 => { (5,5), (1,1), (9,9), (3,3) }
8 => { (5,5), (1,1), (8,9), (3,3) }
20 => { (5,5), (1,1), (8,9), (3,3), (20,20) }
4 => { (5,5), (1,1), (8,9), (3,4), (20,20) }
10 => { (5,5), (1,1), (8,9), (3,4), (20,20), (10,10) }
2 => { (5,5), (1,2), (8,9), (3,4), (20,20), (10,10) }
11 => { (5,5), (1,1), (8,9), (2,4), (20,20), (10,11) }
3 => { (5,5), (1,2), (8,9), (3,4), (20,20), (10,11) } -> (ignored)

after array iteration is done, repeat step 3 with minor modifications by iterating over map again and longest set will be with max diff b/w key and value.

Let key be k and value be x
case1. there exists an entry for x-1, so update its value to x if x is greater than existing value. delete entry with key = k
case2. there exists an entry for x+1, get its value, say vx and update k's value to vx.
delete entry for with key = x+1. repeat this step with x = vx

Now iterate over { (5,5), (1,2), (8,9), (3,4), (20,20), (10,11) }
5 => no action,
1 => update 1's value to 4 and delete (2,4) resulting in
{ (5,5), (1,4), (8,9), (20,20), (10,11) }
repeat with x = 4
update 1's key to 5 and delete (4,5) resulting in
{ (1,5), (8,9), (20,20), (10,11) }
8 -> update 8's value to 11 and delete (10, 11) resulting in
{ (1,5), (8,11), (20,20) }
repeat with x = 11 -> no action
20 => no action
{ (1,5), (8,11), (20,20) }

Haven't mentioned above, at each step we will maintain key-value pair with max diff. So at the end of map iteration, we have result.

Complexit = O(n)...though I'm finding this complex..Is there a simplified way of doing this.

- anonymous June 14, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

-2

of 2 votes

you put a lot of effort into writing that up in plain text ... i wrote a similar solution shorter in code below : )

- emalaj23 June 15, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@emalaj23: because he is here to learn and not brag about it that he solved the problem.Please explain your algorithm rather than just duming some lines of code.

- aka June 15, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

didn't mean to brag - just suggesting that sometimes its easier to try to code it rather than explaining it in text

my solution is very similar to his in that we keep a map where the key and value represent the range of consecutive numbers, except that i combine the ranges of consecutives before and after, so when inserting 4 with existing entries (2,3) (3,2) (5,7) (7,5), all of that is simplified to (2,7) (7,2)

- emalaj23 June 15, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

It is very hard to implement the iteration over hash map. It is simply not good to modify the hash map while erasing existing entries and inserting new entries. Of course I don't mean the solution is incorrect, I am just expressing the practicality of writing a source code.

- Anonymous December 15, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

When you iterate over { (5,5), (1,2), (8,9), (3,4), (20,20), (10,11) } and merge (1,2) and (3,4) for instance, how do you know there won't be (5, 6) and others for instance ?

If so, how can you guarantee that this merging operation is in constant time (not linear, for instance) ?

- Thomas February 09, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 4 vote

Sorting seems too obvious. If this is a legit Google interview question, I'd suspect there must be a better solution than O(n log n).

I can think of a linear solution, but it has a lot of assumptions; if the numbers are small positive numbers, let's say [0,31), then you can use a 32 bit integer and set the bits with the corresponding number in the int array. e.g., S = { 5, 1, 9, 3, 8, 20, 4, 10, 2, 11, 3} becomes binary digit:
00000000 00010000 00001111 00111110

Then you can check the longest contiguous set bit sequence. This is linear time, but the worst case is the length of the bit vector, not the array length. Similarly, you can use 64 bit number for [0,64), for bigger numbers, you'll have to use a bit vector or a bit set.

Note that this solution will only make sense, if the numbers are positive, small, and/or there are a lot of repetitions.

- oOZz June 14, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

On similar lines -- instead of trying to store them in bit array (or a binary number), we can simply iterate over [min, max] range and compute the max range that has all numbers present in given array.

This has fewer constraints. Only constraint :- [min, max] range should be reasonable. Numbers can be negative/big/repeated etc.

- X June 15, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@oOZz: what if there are duplicates? I am not sure how to interpret the word "set of numbers" here - does it mean they are unique? But if not then for instance [1,1,1,4,5] your answer would return [4,5]. Basically what you did is countring sort and we can extend it a bit for negative/bigger/not unique numbers using a different data structure.

- Anonymous June 18, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@anon: numbers in a set have to be unique

for [1,1,1,4,5] -> answer is [4,5]

- anonymous June 19, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

//Time Complexity: O(n)
	public static int[] longestConsecutiveSequence(int[] data){
		int first = Integer.MAX_VALUE; // the first number of the maximum consecutive sequence
        int length = 0; // the length of the maximum consecutive sequence
        Map<Integer, Integer> table = new Hashtable<Integer, Integer>();
        for(int value: data) {
        	if(!table.containsKey(value)) {
        		int start = value;
        		int end = value;
        		if(table.containsKey(value + 1) && table.get(value + 1) >= value + 1) {
        			end = table.get(value + 1);
        			table.remove(value + 1);
        			table.remove(end);
        		}
        		if(table.containsKey(value - 1) && table.get(value - 1) <= value - 1) {
        			start = table.get(value - 1);
        			table.remove(value - 1);
        			table.remove(start);
        		}
        		table.put(start, end);
        		table.put(end, start);
        		if(end - start + 1 > length) {
        			first = start;
        			length = end - start + 1;
        		}
        	}
        }
        int[] seq = new int[length];
        for(int i = 0; i < length; i++){
        	seq[i] = first + i;
        }
        return seq;
    }

- Adnan Ahmad September 30, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

What is the expected time complexity? Sorting can get the job done easily in O(nlgn). Also, are there any pecularities about data that they all are +ve numbers and that they fall within a certain range, duplicates allowed etc etc?

- Murali Mohan June 14, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

sorting is definitely one solution with O(nlogn). Can we do it better?

there is no data pattern and numbers are not unique....

- anonymous June 14, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 2 votes

@anonymous

I think we can try do better. The notion of 'consecutive numbers' divides the set into equivalence classes. We can construct an undirected graph with numbers as node values and edges between two nodes if they differ in value by 1. Run DFS to find connected components(which are equivalent classes) finding their size along. The connected component with maximum size is the maximum contiguous subset. Complexity would be O(|V| + |E|). In this case, since we have |E|<|V|, we can say that we have got a O(|V|) solution.

The disjoint-set data structure also enables to paritition the set into equivalent classes. I suppose we can use that as well to find a solution to this problem.

- Murali Mohan June 14, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@dumbo...this looks like good sol.

not sure how easy/difficult is to code this..

- anonymous June 14, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@Dumbo: please explain how undirected graph can help here?

- aka June 14, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

@Dumbo you can get the connected components in liner time. However, how will you create a graph with edges that differ in value by 1? Doesn't that require you to search the graph every time you try to insert a node/vertex and add the edge? That'll be a quadratic operation.

- oOZz June 14, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Build the adjacency list using a hash table.Keys would be the numbers and values would be (number, groupid) pairs.

Traverse the array. Add number k as a key to the hash table if not already present. If numbers k+1 and k-1 are present in the hash table, add k to their adjacency lists and vice-versa. (To add more to the implementation details, we need to add k to it's own adjacency list as well). We will then have constant time per insertion.

Initially the groupid is set to null for all of the elements.Iterate over the elements of the hash table and run DFS marking the groupid of the connected components if not already set. Alongside, maintain the size of the largest connected component and its groupid. In the end print all elements of the groupid with the largest size

- Murali Mohan June 14, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@Dumbo Is groupid just a bit set/not-set? If the elements come in the order 1,4,2,3....3 will go both in 2 and 4, how do we print that?

- Is_it? June 15, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 6 votes

@oOZz

Thought of giving a shape to my ideas using connected components of a graph. Finally, a flawless solution with O(n) time complexity that works putting an end to speculations.

Provide input as:
19 16 5 1 9 3 18 17 8 20 4 10 2 11 3 6 13 15 12 14

import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.regex.Pattern;

public class MaxConsecSubset {
	
	int ccSize = 0, ccMin = 0, ccMax = 0;
	int tmpSize = 0, tmpMin = 0, tmpMax = 0;
	static HashMap<Integer, HashMap<Integer, Integer>> graph = new HashMap<Integer, HashMap<Integer, Integer>>();
	
	
	public void DFS(int nodeId, int groupId) {
		if (graph.get(nodeId).get(nodeId) == null) {
			graph.get(nodeId).put(nodeId, groupId);
						
			tmpSize++;
			if (nodeId > tmpMax) {
				tmpMax = nodeId;
			}
			
			if (nodeId < tmpMin) {
				tmpMin = nodeId;
			}
			
			for (Map.Entry<Integer, Integer> entry: graph.get(nodeId).entrySet()) {
				if (entry.getKey() != nodeId) {
					DFS(entry.getKey(), groupId);
				}
			}
		}
	}
	
	public void findEquivalenceClasses() {
		
		for (Map.Entry<Integer, HashMap<Integer, Integer>> entry: graph.entrySet()) {
			if (entry.getValue().get(entry.getKey()) == null) {
				tmpSize = 0; tmpMax = -999999; tmpMin = 999999;
				DFS(entry.getKey(), entry.getKey());
				
				if (tmpSize > ccSize) {
					ccSize = tmpSize;
					ccMin = tmpMin;
					ccMax = tmpMax;
				}
			}
		}		
	}	
	
	public void readInput() {
		Scanner scanner = new Scanner(System.in);
		scanner.useDelimiter(System.getProperty("line.separator"));
		Pattern delimiters = Pattern.compile(System.getProperty("line.separator")+"|\\s");
		scanner.useDelimiter(delimiters);
		int i;
		while(scanner.hasNextInt()) {
			i = scanner.nextInt();			
			
			if (!graph.containsKey(i)) {
				HashMap<Integer, Integer> adjElements = new HashMap<Integer, Integer>();
				adjElements.put(i, null);
				graph.put(i, adjElements);
			}
			
			if (graph.containsKey(i + 1)) {
				graph.get(i).put(i+1, null);				
				graph.get(i+1).put(i, null);
			}
			
			if (graph.containsKey(i - 1)) {
				graph.get(i).put(i-1, null);				
				graph.get(i-1).put(i, null);
			}
		}
	}
	
	public static void main(String[] args) {
		MaxConsecSubset maxCS = new MaxConsecSubset();
		maxCS.readInput();
		maxCS.findEquivalenceClasses();
		
		System.out.println("The maximum consecutive subset is:");
		for (int j = maxCS.ccMin; j <= maxCS.ccMax; j++) {
			System.out.print(j + " ");
		}
		System.out.println();
	}
}

- Murali Mohan June 15, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@dumbo..this looks good..

- anonymous June 16, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@Dumbo +1 my friend. This code works.

- oOZz June 17, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@Dumbo..

I see that you call 2 functions - readInput() which is O(n) and then findEquivalenceClasses().

In second function, you call DFS for each node. DFS is O(n) and so findEquivalenceClasses() becomes O(n^2) and thus overall complexity as O(n^2).

Not sure, I am missing out something in ur algo...

- anonymous June 19, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@Anonymous:

Regardless of the number of times DFS is called, the algorithm to find connected components runs in O(|V| + |E|). The reason for this is upon the first visit, DFS() marks a node as visited and explores its adjacency list. Once a node is marked as visited, its adjacency list is never explored again. Thus the adjacency list of each node in a graph is explored only once, hence the complexity O(|V| + |E|).

In this specific case, |E| < |V|, so the complexity is O(|V|) = O(n).

- Murali Mohan June 19, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@Dumbo: I agree with you that since we mark nodes as visited so essentially we visit each node only once. However, how are marking nodes as visited in algo..

or shouldn't hashmap entries be removed to prevent them from being visited again.

- anonymous June 20, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@anonymous

Here, I use the notion of a groupid, which is the id of a connected component. Initially the groupid is null. When a node is visited, groupid is set to a non-null value. This serves as a 'visited' flag in my algorithm. The groupid though is not much significance and is exclusively serving the purpose of a 'visited' flag.

- Murali Mohan June 20, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

int max_seq = 0;
int start = 0;
int end = 0;
for(int i = 0; i < l -1 ; i++) {
 
 int current = a[i];
 int next = a[i + 1];
 
 // first sequence match
 if (current + 1 == next) {
  int tmp = max_seq;
  max_seq = 2;
  // check for further sequence
  int k;
  for(k = i + 1; i < l -1 ; i ++) {
   current = a[k];
   next = a[k + 1];
   if (current + 1 != next) {
    // sequence break
    break;
   } else {
    max_seq++;
   }
  
  // maximum info poulation
  if (tmp > max_seq) {
   max_seq = tmp;
  } else {
   start = i;
   end = k;
  }
  
 }
 }
}

May be some variables are unused and could be optimized.
Variable name(s) are self explanatory

I think solution is so simple, does Google really look for it?
Guys if you have any better solution ....

- Debasis Jana June 14, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

I worked out same logic!! :-)

- Karthik June 20, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

I made a mistake..... So in that case we need to sort it first.
This may require typical nlog(n) time then some more operations.

So alternate solution is to maintain a HashTable (provided if numbers may not be unique)

NB: If the numbers are not unique then above code fragment should be changed ( a[k] == a[k + 1] + 1 || a[k] == a[k + 1]

Set<Integer> max_set = ...

HashSet<Integer> numbers = ...
numbers.add(number);

Iterator<Integer> iterator = numbers.iterator();
// iterates numbers
while(iterator.hasNext()) {
 int number = iterator.next();
 // checking for this number sequence
 int c = 0;
 int s = numbers.size();
 Set<Integer> tmp_set = ...
 while(c < s - 1) {
  int nxt = number + ++c;
  if(!numbers.contains(nxt)) {
   break;
  } else {
   tmp_set.add(nxt);
  }
 }

 // max sequence determination
 if(tmp_set.size() > max_set.size()) {
  max_set = tmp_set;
 }
}

To find something hashing is better

- Debasis Jana June 14, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Can we use a hash table? we'll take the smallest element from the array and add 1 to it, and will check whether hash table has new element or not. if it has we'll again add one to the new element and check. (here we got a set of consecutive numbers).
When new element is not present, we'll take the next smallest element in the hash table and go on with the same process.

- Is_it? June 15, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Declare a struct

struct consecseq{
int first;
int last;
}

Declare an array of this structure where elements will be stored in sorted order of first/last.

For every element in array;

1) Do binary search to find the position where to place the value
2) if currval == (structarray[i].start - 1), add this element and change the first. Check if this can be merged with structarray[i-1] if exists
3) if currval == (structarray[j].last + 1), add this element and change the last. Check if this can be merged with structarray[i+1] if exists
4) If none of above, add a new structarray element in the current position and move the later elements.
5) At the same maximum and index of maximum length should be maintained.

- praneeth June 15, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 2 vote

O(n) solution using hash-table
Need to traverse the array twice
1. First, fill the array elements in the hash-table Now, start with the first element (say k),
2. Check if k is present in the hash-table. If not, go to the next element.
3. If yes, find the max n such that elements between k & k+n are all present in hash-table, and then remove them from the hash-table.
4. Similarly, find the max m such that elements between k-m and k are present in the hash-table and remove them from the hash-table.
5. Now we know a sequence which is present in the array.
6. Save k-m and the count (m-n) (this is the sequence found so far). Update the min_element and the count if m-n is larger than so far count.
7. Remove k from the hash-table.
8. Move to the next element in the array and repeat step 2
9. The largest sequence is : min_element, min_element+1, ..., min_element+count

- ACP Pradyuman June 15, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

After sorting the whole list

We need two pointers start and end
->n1, n2, n3 .................................. N<-
For each start pointer move pointer from end once end number == start number + (start index + difference between and start and end index) terminate program

Here if the probability is too high of getting a maximum sequence this provides best solution.
If there is too low probability then it does not provide then HashTable provides best solution

- Debasis Jana June 15, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

O(n) solution traversing the list only *once*

public static Set<Integer> compute(Set<Integer> ints){
		Map<Integer, Integer> map = new HashMap<Integer, Integer>();
		
		// produce map of ranges of consecutive numbers
		for(Integer key : ints){
			if(map.containsKey(key)) continue;
			map.put(key, key);
			
			if(map.containsKey(key+1)){
				map.put(map.get(key), map.get(key+1));
				map.put(map.get(key+1), map.get(key));
				if(map.get(key) != key+1) map.remove(key+1);
			}
			
			if(map.containsKey(key-1)){
				map.put(map.get(key), map.get(key-1));
				map.put(map.get(key-1), map.get(key));
				if(map.get(key) != key-1) map.remove(key-1);
			}
		}
		
		// find longest range
		int longest = Integer.MIN_VALUE, start = 0;
		for(Entry<Integer, Integer> entry : map.entrySet()){
			if(Math.abs(entry.getKey() - entry.getValue()) > longest)
				start = Math.min(entry.getKey(), entry.getValue());
		}
		
		// create the list to return
		Set<Integer> answer = new HashSet<Integer>();
		for(int i = start; i <= start+longest; i++)
			answer.add(i);
		
		return answer;
	}

- emalaj23 June 15, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

The map STL itself is implemented as a BST. Even if you use it to store (key,value) pairs, it'll use logn operations every time you check whether a key exists or not. I don't really see this as an efficient solution.

- Anonymous June 16, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

Instead of "map", you can use STL "unordered_map" that is implemented using a hash table. Easy as that.

- arturo June 16, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

sites.google.com/site/spaceofjameschen/home/array/find-the-longest-subset-with-consecutive-numbers----google

- Anonymous June 15, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I have implemented an algorithm with complexity O(n) using a Hash Table (I used STL "unordered_map") that only has to iterate over the array once, and has onl a max of 6 accesses to the Hash Table per iteration.

I have implemented other working versions of this functions with complexity O(n) (one of them using a bitset), but this is the one I like the most. Here is the C++ implementation:

vector<int> findLongestSubset(int* arr, int size)
{
	unordered_map<int,int> mp;
	unordered_map<int,int>::iterator it;

	mp[arr[0]] = 1;	

	int max_seq = 1;
	int value_of_sequence = 0;
	int max_seq_first_elem = 0;
	int first_elem = 0;

	for(int i=1;i<size;i++)
	{	
		first_elem = arr[i];

		//Check if element is already in the map
		it = mp.find(arr[i]);
		if(it!=mp.end())
			continue;

		value_of_sequence = 0;
	
		//Check previous element in order by value if exists in map
		it = mp.find(arr[i]-1);
		if(it!=mp.end())
		{
			value_of_sequence = it->second;
			first_elem = arr[i]-value_of_sequence;
			//Update beginning of sequence
			mp[first_elem] = value_of_sequence+1;
		}
	
		value_of_sequence++;
		mp[arr[i]]=value_of_sequence;

		//Check next element in order by value if exists in map
		it = mp.find(arr[i]+1);
		if(it!=mp.end())
		{
			value_of_sequence = value_of_sequence+it->second;
			mp[first_elem] = value_of_sequence;
			mp[arr[i]+it->second] = value_of_sequence;
		}

		//Update max sequence and max sequence's first element
		if(value_of_sequence > max_seq)
		{
			max_seq = value_of_sequence;
			max_seq_first_elem = first_elem;
		}
	}

	vector<int> ret(max_seq); 

	for(int i=0; i<max_seq;i++)
		ret[i] = max_seq_first_elem+i;

	return ret;
}

- arturo June 16, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I thought of this solution:

1. Create all the subsets and store the integer of each line in a set (disregarding order of integer values). Store each set to a list. Lets say call the list as
List<Set<Integer>> numberSubsetList = new ArrayList<Set<Integer>>();
2. set int longestSetIdx = -1, currSetLen = -1, longestSet = -1
2. For each set in the list
a. set currSetLen = # of item in current set
a. get max and min while computing sum of integers (call the sum as totalVal)
b. based on max and min compute sum of consecutive numbers between min and max
consecSum ((#of items in set)/2)*(max+min)
c. if totalVal = consecSum and longestSetLen < currSetLen then
longestSetIdx = index of set
longestSetLen = currSetLen
3. Return numberSubsetList.get(longestSetIdx)

Technically, this is O(N) disregarding of course the cost of populating numberSubsetList.

- slbb June 18, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Everyone does really complicated stuff....
Here is an order n solution that is very easy to read (in Python).

def longest_common_subset( input ):
  hash_map = {}
  # O(n)
  for elem in input:
    hash_map[elem] = True
  
  lcs = []
  for elem in input:
    #Try higher first
    higher_lcs = []
    counter = elem
    while counter in hash_map:
      higher_lcs.append(counter)
      counter = counter + 1
    
    #Try lower now
    lower_lcs = []
    counter = elem
    while counter in hash_map:
      lower_lcs.append(counter)
      counter = counter - 1
      
    longest_option_lcs = higher_lcs if len(higher_lcs) > len(lower_lcs) else lower_lcs
    lcs = longest_option_lcs if len(longest_option_lcs) > len(lcs)  else  lcs
    
  return lcs

- Farid June 20, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Whoops forgot about those while loops :(
(it's pretty late).

nvm them

- F June 20, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Plz explain your algorithm

- Murali Mohan June 20, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@Farid:
1. i think longest_option_lcs should be
longest_option_lcs = lower_lcs + higher_lcs

2. It will blow up to O(n^2) when whole set is answer. O(n) for traversing the set and then for each element, you go through all elements again to construct longest set

- anon June 20, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

the worst time complexity for this algorithm will be O(n^2), but i think this can easily be fixed. After using element to construct lcs, you can set hashmap[ele] = false. in the future traversal, if hashmap[ele] == false, then visit the following node. In this way, we make sure that each node is only visited once.

- Anonymous July 09, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

thanks for explaination dumbo

- anonymous June 20, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

1. find the sequences in a hashmap with H[x] = x-1 if x, x-1 both are in the list, otherwise H[X] = 0, through one iteration of the set. O(N)
for example: { 5, 1, 9, 3, 8, 20, 4, 10, 2, 11, 3}
maps to : { 1->-1, 2->1, 3->2, 4->3, 5->4, 8->-1, 9->8, 10->9, 11->10, 20->-1}
2. reverse the map and also get the headers of the sequences, through one iteration of the original list.
new hashmap: { 1->2, 2->3, 3->4,4->5, 8->9, 9->10, 10->11 }, with headers {1, 8, 20 }
(3) from each header find the sequences and return the longest one, iterate throght all numbers once. O(N)

vector<int> findLongestSubset(int* arr, int size)
{
	unordered_map<int,int> mp;
	// iteration 1 to get connected components
	for ( int i = 0; i < size; ++i )
	{
		int n = arr[i];
		if ( mp.find(n-1) != mp.end() )
			mp[n] = n-1;
		else 
			mp[n] = -1;
		if ( mp.find(n+1) != mp.end() )
			mp[n+1] = n;
	}
	// iteration 2 to reverse the sequences
	unordered_map<int,int> mp2;
	vector<int> headers;
	for ( int i = 0; i < size; ++i )
	{
		int curr = arr[i];
		int next = mp[curr];
		if ( next == -1 )
			headers.push_back(curr);
		else
			mp2[next] = curr;
	}
	// iteration 3 to comparing the sequences
	// we use vector here to store all numbers in a sequences, but start and end number are actually sufficient
	vector<int> maxSequences;
	for (int i = 0; i < headers.size(); ++i )
	{
		int curr = headers[i];
		vector<int> sequence;
		sequence.push_back(curr);
		while ( mp2.find(curr ) != mp2.end() ) 
		{
			curr = mp2[curr];
			sequence.push_back(curr);
		}
		if ( sequence.size() > maxSequence.size() )
			maxSequence = sequence;
	}
	return maxSequences;
}

- testjay June 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

First we can sort the array and then start a check from the first index, checking next index value that it is consecutive with previous index value and keep incrementing count. As soon as we get a break from checking we can print the required subset up to the latest value of client.

- Ashish July 08, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

O(n) solution in Python.

def longest_sequential_subset(S):
    A = {}
    longest = set()
    for x in S:
        low = x - 1
        high = x + 1
        if low in A and high in A and A[low] is not A[high]:
            A[low].update(A[high])
            A[high] = A[low]
        s = A[low] if low in A else A[high] if high in A else set()
        s.add(x)
        if len(s) > len(longest):
            longest = s
        A[x] = s
    return longest

- Jcon July 25, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I also prefer using the hash table to solve the problem, and the expected time complexity is O(n), and here is my code in C++:

#include <iostream>
#include <vector>
#include <list>

struct node
{
	int key;
	int beg, end;
	node( int key ) : key(key), beg(key), end(key){}
	int size() const{ return end + 1 - beg; }
};

struct hash_table
{
	const int divisor;
	std::vector< std::list<node> > buckets;
	hash_table( int divisor ) : divisor(divisor), buckets( divisor, std::list<node>() ) {}

	typedef std::list<node>::iterator nd_iter;

	std::pair< bool, nd_iter > insert( const node& nd )
	{
		int i = nd.key % divisor;
		for( nd_iter it = buckets[i].begin(); it != buckets[i].end(); ++it )
			if( it->key == nd.key ) return std::make_pair( false, it );
		buckets[i].push_back(nd);
		return std::make_pair( true, std::prev( buckets[i].end() ) );
	}

	std::pair< bool, nd_iter > search( int key )
	{
		int i = key % divisor;
		for( nd_iter it = buckets[i].begin(); it != buckets[i].end(); ++it )
			if( it->key == key ) return std::make_pair( true, it );
		return std::make_pair( false, buckets[i].end() );
	}

	void find_longest()
	{
		int max_size = 0;
		std::list<node>::iterator kt;
		for( std::vector< std::list<node> >::iterator it = buckets.begin(); it != buckets.end(); ++it )
		{
			for( std::list<node>::iterator jt = it->begin(); jt != it->end(); ++jt )
			{
				if( jt->size() > max_size ) 
				{
					max_size = jt->size();
					kt = jt;
				}
			}
		}

		for( int i = kt->beg; i <= kt->end; ++i ) std::cout << i << " ";
		std::cout << std::endl;
	}
};

void longest_increasing_sequence( int a[], int n )
{
	hash_table ht(11);
	std::pair< bool, hash_table::nd_iter > ait, lait, rait;
	for( int i = 0; i < n; ++i )
	{
		ait = ht.insert( node(a[i]) );
		if( ait.first )
		{
			lait = ht.search(a[i]-1);
			rait = ht.search(a[i]+1);

			if( lait.first ) ait.second->beg = lait.second->beg;
			if( rait.first )
			{
				ait.second->end = rait.second->end;
				rait.second->beg = ait.second->beg;
			}
			if( lait.first ) lait.second->end = ait.second->end;
		}
	}

	ht.find_longest();
}

template< std::size_t n >
std::size_t size( int (&a)[n] )
{
	return n;
}

int main( int argc, char* argv[] )
{
	int a[] = { 5, 1, 9, 3, 8, 20, 4, 10, 2, 11, 3 };
	longest_increasing_sequence( a, size(a) );

	return 0;
}

- weijjcg August 19, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I first figured out the O(nlogn) solution (sort -> go through it once to find the longest sequence), and then immediately realized there must be an O(n) solution. Others have found it, but just to throw my C++, O(n) solution up here (I think others have come up with similar, but most seem to be more complex than they need to be):

1 #include <stdio.h>
  2 #include <iostream>
  3 #include <vector>
  4 #include <unordered_set>
  5
  6 using namespace std;
  7
  8 int getCount(unordered_set<int> *nums, int ndx, bool goingUp) {
  9   if(nums->find(ndx) != nums->end()) {
 10     nums->erase(ndx);
 11     int next = goingUp ? ndx+1 : ndx-1;
 12     return 1 + getCount(nums, next, goingUp);
 13   }
 14
 15   return 0;
 16 }
 17
 18 void getLongestSequence(vector<int> nums) {
 19   unordered_set<int> allNums;
 20
 21   for(int i = 0; i < nums.size(); i++) { // O(n)
 22     allNums.insert(nums[i]);
 23   }
 24
 25   int largest = 0;
 26   int start = 0;
 27
 28   unordered_set<int>::iterator itr = allNums.begin();
 29
 30   // O(n) because only touching each node 1x (either here, or in getCount)
 31   while(itr != allNums.end()) {
 32     int total = 1;
 33     int lowest = getCount(&allNums, *itr-1, false);
 34     int highest = getCount(&allNums, *itr+1, true);
 35
 36     total += lowest + highest;
 37
 38     if(total > largest) {
 39       largest = total;
 40       start = *itr - lowest;
 41     }
 42
 43     itr = allNums.erase(itr);
 44   }
 45
 46   printf("Largest sequence is: { ");
 47   for(int i = start; i < start+largest; i++) {
 48     printf("%d ", i);
 49   }
 50   printf("}\n");
 51
 52 }
 53
 54 int main(int argc, char *argv[]) {
 55   vector<int> nums;
 56   int val;
 57   while(cin >> val)
 58     nums.push_back(val);
 59
 60
 61   printf("Numbers are: { ");
 62   for(int i = 0; i < nums.size(); i++)
 63     printf("%d ", nums[i]);
 64
 65   printf("}\n");
 66   getLongestSequence(nums);
 67
 68   return 0;
 69 }

- Bryan August 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

A solution in Python:

def f(A):
	
	hash = {}
	
	# Make a 'hash-table' with each number
	for a in A:
		hash[a] = a
		
	ret = []
	
	while hash:
		# Get first item in the hash-list
		cont = []
		a = hash.pop(hash.items()[0][0])
		
		# Find all < numbers
		c = a - 1
		while hash.has_key(c):
			cont.append(c)
			del hash[c]
			c -= 1
		
		cont += [a]
		
		# Find all > numbers
		c = a + 1	
		while hash.has_key(c):
			cont.append(c)
			del hash[c]
			c += 1
			
		if len(cont) > len(ret): ret = cont
	
	return ret	
	
A = [5, 1, 9, 3, 8, 20, 4, 10, 2, 11, 3]
print f(A)

- Gerard September 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

A solution in Python:

def f(A):
	
	hash = {}
	
	# Make a 'hash-table' with each number
	for a in A:
		hash[a] = a
		
	ret = []
	
	while hash:
		# Get first item in the hash-list
		cont = []
		a = hash.pop(hash.items()[0][0])
		
		# Find all < numbers
		c = a - 1
		while hash.has_key(c):
			cont.append(c)
			del hash[c]
			c -= 1
		
		cont += [a]
		
		# Find all > numbers
		c = a + 1	
		while hash.has_key(c):
			cont.append(c)
			del hash[c]
			c += 1
			
		if len(cont) > len(ret): ret = cont
	
	return ret	
	
A = [5, 1, 9, 3, 8, 20, 4, 10, 2, 11, 3]
print f(A)

- Gerard September 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

we can also try doing it with tree binary tree and then using in order traversal we can find a sequence. this sequence can be stored in an array...and then we can determine the next consecutive number in the complete sequence and so the subset.

- anonymous October 02, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Why can't we just use SortedSet<Integer> (close to O(1) add operation) ? This can solve the problem easily..

- AJ October 08, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Amortized O(N) solution using HashSet and HashMap

import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;


public class MaxConseqSubseequence {
	public static int[]  getLongestConseqSeq(int [] arr){
		HashSet<Integer> seen = new HashSet<>();
		HashMap<Integer, int[] > numToRange = new HashMap<>();	

		for( int cur : arr){
			if(!seen.add(cur)) continue;

			int[] left = numToRange.get(cur-1);
			int[] right = numToRange.get(cur+1);

			if(left != null && right !=null){
				if(left[0] != left[1]) numToRange.remove(left[1]);
				numToRange.remove(right[0]);
				left[1] = right[1];
				numToRange.put(right[1], left);
			} else {
				if(left != null) {
					if(left[0] != left[1]) numToRange.remove(left[1]);
					left[1] = cur;	
					numToRange.put(cur, left);

				} else if(right != null){
					if(right[0] != right[1]) numToRange.remove(right[0]);
					right[0] = cur;
					numToRange.put(cur, right);
				} else { // both null
					int [] newRange = { cur, cur };
					numToRange.put(cur, newRange);
				}	
			}
		}

		int [] maxRange = null;
		int maxRangeSize = -1;
		for( int [] range: numToRange.values() ){
			int rSize = range[1] - range[0];
			if(rSize > maxRangeSize){
				maxRangeSize = rSize;
				maxRange = range;
			}
		}



		return maxRange;
	}


	public static void main(String[] args) {
		int [] arr = { 1,3, 2, 5, 1, 9, 3, 8, 6, 4, 7, 2, 11, 3} ;
		System.out.println("The longest sequence in " + Arrays.toString(arr)+ " is "+ Arrays.toString(getLongestConseqSeq(arr)));
		
	}

}

- konst June 09, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

int main()
{
	int S[] = { 5, 1, 9, 3, 8, 20, 4, 10, 2, 11, 3};
	int N = sizeof(S)/sizeof(int);
	unordered_map<int,bool> map;
	int longest = 0, start; 
	
	for (int i = 0; i < N; i++)
		map.insert(make_pair<int,bool>(S[i],true));

	for (int i = 0; i < N; i++)
	if (map[S[i]])
	{
		map[S[i]] = false;
	
		int p = S[i];
		while (map.find(p-1) != map.end())
			map[--p] = false;
		
		int q = S[i];
		while (map.find(q+1) != map.end())
			map[++q] = false;
		
		if (q-p+1 > longest)
		{
			longest = q-p+1;
			start = p;
		}
	}
	for (int i = 0; i < longest; i++)
		cout << start+i << " ";
	return 0;
}

- Anonymous July 21, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

- if (map[S[i]]) { map[S[i]] = false; int p = S[i]; while (map.find(p-1) != map.end()) map[--p] = false; int q = S[i]; while (map.find(q+1) != map.end()) map[++q] = false; if (q-p+1 > longest) { longest = q-p+1; start = p; } August 02, 2014 | Flag Reply

CareerCup

Google Interview Question for Applications Developers

Books

Videos

Resume Review

Mock Interviews