Given a array of integers ther

unknown Interview Question for Software Engineers

0

of 0 votes

14
Answers
Given a array of integers there is one that is repeated several time. How would you compute the length of the sequence of repeated elements.
Assuming the initial array is sorted can you do better than O(n) both for spatial and temporal cost?
- Fernando May 29, 2017 | Report Duplicate | Flag | PURGE
unknown Software Engineer Coding

Email me when people comment.

An error occurred in subscribing you.

Interview Type: Phone Interview

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 2 vote

actually, if it is sorted and if max-diference of two adjacent elements is 1, then the length of the repeated sequence is n-1-(array[n-1]-array[0]). If you want to know the elements value, you can binary search i:

#include <vector>
#include <iostream>
#include <utility>

using namespace std;

// vector a is sorted, max-diference of two adjacent elements is 1
pair<int,int> lengthOfRepeatedSeq(const vector<int>& a) {
	if (a.size() = 0) return {0, 0};
	int o = a[0];
	int s = 0;
	int e = a.size() - 1;
	while (s < e) {
		int m = (s + e) / 2;
		if (a[m] >= m + o) s = m + 1; // if a[m] = m + o, there is no repeating character in [s..m]
		else e = m; // if a[m] < m, there is a repeating character in [s..m]
	}
	return{ a[s], a.size() - (a[a.size() - 1] - o) };
}

ostream& operator << (ostream& o, pair<int, int> p) {
	o << p.first << ", " << p.second;
	return o;
}

int main()
{
	cout << lengthOfRepeatedSeq({ 1,2,3,4,4,4,5,6 }) << endl;
	cout << lengthOfRepeatedSeq({ 4,4,4,4,4 }) << endl;
	cout << lengthOfRepeatedSeq({ 2 }) << endl; // not really a valid input
	cout << lengthOfRepeatedSeq({ 6,7,8,9,10,10,11 }) << endl;
	cout << lengthOfRepeatedSeq({ 6,7,8,9,10,10,10 }) << endl;
	cout << lengthOfRepeatedSeq({ 5,5,6,7 }) << endl;
	/* output:
	4, 3
	4, 5
	2, 1
	10, 2
	10, 3
	5, 2
	*/
}

- Chris May 29, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

Here is a solution in python for the version of O(n)

from collections import Counter

def length_of_repeated_elements(s):
    c = Counter(s)
    max_length = v = -1
    for value, length in c.iteritems():
        if length > max_length:
            max_length = length
            v = value
    return v, max_length

- Fernando May 29, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

@Fernando: I didn't quite understand how you find the middle element using binary search. Could you point that out, e.g. how to decide whether to go left or right of the middle element in the binary search loop?

- Chris May 29, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

@NoOne: Maybe I'm bit slow today, but let's take this example with only one repetition.
array = [5,9,13,15,20,20]
start = 0, end = 5, middle = 2 (value 13)
how to know it's in the interval (middle..end] when looking at 13 and maybe 9 and 15?
the array could be 5,5,13,15,17,20 and it would look the same at the first step of binary search, but in order to get the O(lg(n)) behavior one has to decide for interval [start..middle] or (middle..end]
...
if you could post the python/java/c code, that finds the single repeated element in a sorted array in less than O(N), that would be awesome ;-) for example with the array above.

- Chris May 29, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

@ChrisK you are right you can't perform that search in less than O(n). You can perform the search in a binary fashion but you still would have to go trough all the elements in the array in order to find the repeated sequence in the worst case. To be honest the answer to the question comes before that part so I did not think that very thoroughly. The real answer is that you can't solve the problem better than O(n). The source code I gave is the proof. You are solving a relaxed version of the problem and it still takes O(n). 2log(n) is still linear as the base of the logarithm is also 2. You can get a formal demonstration using the master theorem. The recursion is T(n) = 2T(n/2) + O(1). The thing is that on the interview I got assured by the interviewer that you can do better. What a nice feeling when I finally got the proof that you can't. I think I am satisfied with that answer so I will stop looking for a way to find an improvement over O(n).

- Fernando May 30, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

@Fernando: I agree. Concluded:
1) The best conceivable run-time is O(n) for the initial problem.(an informal "prove" is that you need to look at all elements because if you do not look at one element and if this one is the repeated element, there is no way to tell which element (value) repeats.

2) You can relax it in several ways:
a) define that adjacent elements differ by at most 1 (see my initial post), you can find the number of repeated elements then in O(1) and the element in O(lg(n))
b) you know the position of one element and thus need to find the end and start of the sequence as you pointed out

BUT, O(2*log(n)) is O(log(n) if "*" is a multiplication) because constant factors can be eliminated (if it was O(2^log(n)) where "^" being power then this is n because 2^log(n) = n, if log means the logarithm with base 2)

I don't agree with your recursion that's why the application of the master theorem turned out to be O(n): The recursion is T(n) = T(n/2). You do it twice for n, but not on every "recursion level" that's why you can't put the 2 in front of the T. They way you wrote it would be, in your binary search you had to either use recursion or introduce a stack so you can follow both path, the left and the right side which is clearly not the case, a T(n) = 2*T(n/2) recursion would look like:

def fun(s, e):
	if s == e: return 1
	m = (s + e + 1) // 2
	return fun(s, m) + fun(m, e)     # or max(fun(s, m), fun(m, e))

fun(0, n-1)

but you do clearly:

def fun(s, e):
	if s == e: return 1
	m = (s + e + 1) // 2
	if (condition): fun(s, m)
	else fun(m, e)

fun(0, n-1)
fun(0, n-1)

an other way to thin about is that doing a single binary search leads to O(lg(n)) why should two binary searches done after each other lead to O(n).

Even your prove might be wrong, I think it's important to remember interviewer and interviewee are humans with their own background in terms of professional experience, education and personal, cultural background. So whatever is said, is said from an implicit context, which is a bias. It can be, the person really thought he helped you, even if it wasn't the case for you since you didn't have his context...

- Chris May 30, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

@ChrisK you are right with that

- Fernando May 30, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

If you know that the array it is sorted you can reduce the memory cost to O(1). Here is an implementation in python:

def length_of_repeated_elements(s):
    l = 0
    for x in xrange(len(s) - 1):
        y = x
        while s[y] == s[y + 1]:
            l += 1
            y += 1
        if x != y:
            break
    return l

To see if we can do better than linear cost let's assume that we know the repeated value and a random position in the array that it occurs. In that situation we can perform a binary search towards both sides to get the lower and upper limits of the repeated sequence.
For example the array [1,2,3,4,4,4,7], the repeated value 4, and the position 5
We could perform from both sides a binary search to reach the limits. In this case 3 and 5.
Here is a python code that solves this relaxed problem.

def length_of_repeated_elements(s, value, position):
    if s[0] == s[-1]:
        return len(s)
        
    high = position
    low = 0
    while (low < high):
        middle = (low + high) / 2
        print "Low", low, high, middle
        if s[middle] == value:
            if s[middle-1] != value:
                break
            else:
                high = middle
        else:
            if low == middle:
                middle = high
                break
            else:
                low = middle
                           
    first_element = middle
    
    high = len(s)
    low = position
    while (low < high):
        middle = (low + high) / 2
        print "High", low, high, middle
        if s[middle] == value:
            if middle == len(s) - 1 or s[middle+1] != value:
                break
            else:
                low = middle
        else:
            high = middle
    last_element = middle
    
    return first_element, last_element

The cost is 2 log(n).
Now to answer the question about how to find the repeated value and the position the answer is similar we can perform the binary search until we hit a random position with element as we have shown using the previous code it doesn't matter which position we find.

- Fernando May 29, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

def find_max_repeated_elem(a){
    m = mset(a)
    #(min,max) = minmax(m){ $.l.value < $.r.value }
    max.value 
}
def find_max_repeated_elem_sorted(a){
    p = { 'item': a[0] , 'count' : 0, 'max' : 0 }
    p = fold ( a , p ) -> {
        if( $.o == $.p.item ){  
           $.p.count += 1
        } else {
           if ( $.p.count > $.p.max ){
                $.p.max = $.p.count 
           }
           $.p.count = 1 
           $.p.item = $.o 
        }
        $.p // return 
    } 
    p.item  
}

- NoOne May 29, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

@ChrisK,
Suppose there is only one element that is repeated. Only one. The number of repetitions unknown. Then, what we can do is - simple go to middle and search for left/right is same or not. That would work.

- NoOne May 29, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

@ChrisK is right. I guess bad health is getting into my mind too. I do not think it is possible to get down below exact n.

- NoOne May 30, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

First Case: Array Not Sorted. We iterate through the input. We keep track of previously encountered elements in an unordered_set. Once we find a previously encountered element we stop searching and count the number of remaining elements in the input. Space complexity: O(n); (If we never find a match we will store all the elements in the unordered_set). Time complexity: O(n).

template<typename iter>
std::size_t
lengthOfRepeatedSequence(iter first, iter last)
{
  using value_t = std::iterator_traits<iter>::value;
  std::unordered_set<value_t> previously_encountered;
  std::size_t result = 0;
  value_t repeated_elem = value_t();
  bool notFound = true;
  for(; first != last && notFound; ++first){
    const auto term = *first;
    if(1 == previously_encountered.count(term)){
      notFound = false;
      repeated_elem = *term;
    } else {
      previously_encountered.insert(term);
    }
  }
  if(!notFound){
    assert((first != last));
    result = 1 + std::count(first,last,repeated_elem);
	   
  }
  return result;
}

If the array is already sorted we can cut down on the space complexity because repeated elements will "live" next to each other. (I don't think) we can cut down on the time complexity because we need to examine every element in the list to conclude a repeated element doesn't exist. If the input has n elements and we only examine $n-1$ of them, it is possible that the element we neglected is part of the the only pair of repeated elements.

template<typename iter>
std::size_t
lengthOfRepeatedSequenceSorted(iter first, iter last)
{
  assert(std::is_sorted(first,last) );
  bool notFound = true;
  std::size_t result = 0;
  while(first != last && notFound){
    const auto val = *first;
    int count = 0;
    while(first != last && *first == val){
      ++count;
      ++first;
    }
    if(count != 1){
      notFound = false;
      result = count;      
    }
  }
  return count;
}

- fred May 31, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

def length_of_repeated_elements(s):
    l = 0
    for i in range(0, len(s)-1):
        t = i
        if s[i] == s[t+1]:
            if l != 0:
                l += 1
            else:
                l += 2
    return l

- e June 13, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

It is possible by binary search, if the array is sorted.
1. Apply binary search to find the first index of the repeated value.
2. Apply binary search to find the last index of the repeated value.
3. Subtract last_index and first_index(you will get the total frequency).

- shankar upadhyay July 11, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

{{{ and }} - Anonymous July 29, 2017 | Flag Reply

CareerCup

unknown Interview Question for Software Engineers

Books

Videos

Resume Review

Mock Interviews