Knewton Interview Question
Software Engineer Interns
Country: United States
Interview Type: Phone Interview
Actually, you can generate a random sequence of distinct indices in a lazy fashion (or equivalently, generate a random permutation in a lazy fashion).
That way you don't need the fallback scan at the end at all.
The algorithm will look like
    for index in random_permutation(n):
        if a[index-1] != null:
            return a[index-1]
    throw AllNullException
random_permutation can be implemented to yield the permutation lazily, using an algorithm similar to Fisher-Yates but maintaining the array as a hashtable.
import random

# Generates a random permutation of 1, 2, ..., n lazily.
def random_permutation(n):
    rand = random.SystemRandom()
    num_left = n
    table = {}
    while num_left > 0:
        r = rand.randint(1, num_left)
        oldr = r
        if r in table:
            # This slot already holds a swapped-in element; take that instead.
            r = table[r]
        if num_left in table:
            # The case when the last slot was picked earlier:
            # move its current occupant.
            table[oldr] = table[num_left]
        else:
            # The swap-last-element-with-picked-element step
            # of Fisher-Yates.
            table[oldr] = num_left
        num_left -= 1
        yield r
Each non-null is equally likely to show up first.
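For completeness, here is the generator together with the selection loop as one runnable Python sketch (the function name random_non_null and the use of ValueError in place of AllNullException are my choices, not from the original post):

```python
import random

def random_permutation(n):
    """Lazily yield a uniform random permutation of 1..n
    (Fisher-Yates maintained in a dict instead of an array)."""
    rand = random.SystemRandom()
    num_left = n
    table = {}
    while num_left > 0:
        r = rand.randint(1, num_left)
        oldr = r
        if r in table:
            r = table[r]
        # Move whatever currently occupies the last slot into the picked slot.
        table[oldr] = table.get(num_left, num_left)
        num_left -= 1
        yield r

def random_non_null(a):
    """Return a uniformly random non-None element of a,
    raising if every element is None."""
    for index in random_permutation(len(a)):
        if a[index - 1] is not None:
            return a[index - 1]
    raise ValueError("all elements are None")
```

Because the permutation is generated lazily, only as many indices are drawn as there are probes, so the expected work stays low when non-nulls are plentiful.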
@vz, well, Eugene is sometimes right too, and here his idea does work. His problem is that he thinks too highly of himself to spend much time on any problem, so he ends up offering theoretical opinions that are right only about half the time.
@Le Subbu: That works too. I like that idea actually -- that way you don't have to take a special case just to guarantee termination.
@Fake Le Subbu: Most of the "theoretical opinions" you're criticizing are situations where the algorithm is stated without code. This has nothing to do with not spending enough time on the problem, and everything to do with an assumption that readers can code for themselves if they know the algorithm.
* What is implied by 10 million? That doesn't look large to me to hold in memory - only 40MB.
* Can I modify the array?
If so, perform binary search until you find a non-null element, then swap that element with the central-most null element in the array, and return it.
The modifiability and the swap step are there so that over time you collect non-null elements near the center of the array, gradually making the hunt more efficient; perhaps you can cease doing the swap step eventually - say after about (10 million / # of non-nulls) calls to the method.
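The answer gives no code, and "binary search" on unsorted data is ambiguous; one possible reading is midpoint-first probing plus the swap-toward-center step, sketched below in Python. The names find_non_null and pick_and_centralize are hypothetical, and this interpretation is mine, not the original poster's:

```python
def find_non_null(a, lo, hi):
    # Bisection-style probe: check the midpoint, then recurse into each half.
    if lo > hi:
        return -1
    mid = (lo + hi) // 2
    if a[mid] is not None:
        return mid
    left = find_non_null(a, lo, mid - 1)
    return left if left != -1 else find_non_null(a, mid + 1, hi)

def pick_and_centralize(a):
    # Find a non-None element, then swap it with the None slot nearest the
    # center, so repeated calls cluster non-None values around the middle.
    i = find_non_null(a, 0, len(a) - 1)
    if i == -1:
        return None  # every element is None
    center = len(a) // 2
    for d in range(len(a)):
        for j in (center - d, center + d):
            if 0 <= j < len(a) and a[j] is None:
                a[i], a[j] = a[j], a[i]
                return a[j]
    return a[i]  # no None slot anywhere: nothing left to centralize
```

Note that midpoint-first probing is not uniform over the non-null elements; the original answer does not claim uniformity either.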
An alternative solution is to iterate through the array to filter out the null elements. You can then just choose an element at random from the filtered array.
This has a time complexity of O(n) and a space complexity of O(n). However, if the array does not change regularly, you only need to do the filter stage once (O(n)), which makes the select method constant time.
For cases where space matters, the array changes regularly, or the method is only called once, my other answer is better (as it is constant space and linear time).
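A minimal Python sketch of the filter-once, pick-many approach (the NonNullPicker class and its method names are illustrative, not from the original answer):

```python
import random

class NonNullPicker:
    """Pay one O(n) filtering pass up front; each pick is then O(1)."""

    def __init__(self, a):
        # O(n) extra space: keep only the non-None elements.
        self.non_nulls = [x for x in a if x is not None]

    def pick(self):
        """Return a uniformly random non-None element, or None if none exist."""
        if not self.non_nulls:
            return None
        return random.choice(self.non_nulls)
```

If the underlying array changes, the filtered list must be rebuilt, which is why this trade-off only pays off for a static array queried many times.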
If the values in the given array are randomly distributed, without any prior information on the distribution, then a simple linear scan is good enough: it is just as likely to return any given non-null element as picking random indices would be.
That is because randomizing an already random array adds no benefit.
A simple (and quick, O(n)) answer is to choose a random start position and then search linearly for a non-null element.
public static int randomElement(Object[] a, int start, int end) {
    int offset = (new Random()).nextInt(end - start);
    for (int counter = 0; counter < end - start; counter++) {
        // i is start + counter + offset, wrapped so that start <= i < end
        int i = start + ((counter + offset) % (end - start));
        if (a[i] != null) {
            return i;
        }
    }
    // Return -1 if all elements are null
    return -1;
}
This will be fine if the elements are evenly distributed, but if they are grouped, the element at the start of the group will be returned more often. For example, in the array [null, null, null, foo, bar], foo will be returned more often than bar.
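That bias can be verified exactly by enumerating every possible start offset for the example above (a Python rendering of the same cyclic scan; the counts are exhaustive, not sampled):

```python
def scan_from(a, offset):
    """Return the first non-None element scanning cyclically from offset."""
    n = len(a)
    for counter in range(n):
        i = (offset + counter) % n
        if a[i] is not None:
            return a[i]
    return None

a = [None, None, None, 'foo', 'bar']
hits = [scan_from(a, off) for off in range(len(a))]
# 'foo' is reached from offsets 0, 1, 2, and 3; 'bar' only from offset 4.
print(hits.count('foo'), hits.count('bar'))  # 4 1
```

So with a uniformly random start offset, foo is returned with probability 4/5 and bar with probability 1/5.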
What happens if the random start position generated contains no non-null elements after it?
If you look at the assignment to i, you can see that the combined offset (counter + offset) is taken modulo the size of the range and added to the start index.
This means i always stays within the range start <= i < end, with its value wrapping around like ..., end-2, end-1, start, start+1, ...
Therefore, if there are no non-null elements after the (start + offset) position (and the linear search reaches end), it continues the linear search from the start position.
If there are no non-null elements at all, the loop finishes (after end - start iterations) and returns -1 (to indicate there are no non-null elements).
I don't like this question.
1) The situation is contrived. Why would someone who wants this query allow such an array to be built in the first place?
2) We don't know what process created the 10 million element array (e.g., if we can assume it was a uniform random process, then we can scan to the first non-null element and return it). How many such queries will be performed? This, along with other considerations, determines whether preprocessing makes sense.
3) Why would
Eh?
This is a simplified version.
Think of it as selecting a random element satisfying a specific predicate, where the predicate isn't known in advance.
If you are totally in the dark about the possible predicates (even sets of them?), then make sure you place the elements into a data structure that randomizes on insertion.
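No specific structure is named; one minimal sketch of a randomize-on-insert container is an append-then-swap list (an online Fisher-Yates shuffle), so a later linear scan returns a uniformly random element among those matching a predicate. The RandomizedBag name is hypothetical:

```python
import random

class RandomizedBag:
    """Append-and-swap keeps the backing list in uniformly random order."""

    def __init__(self):
        self.items = []

    def insert(self, x):
        # Append, then swap the new element into a random slot.
        # This is the inside-out Fisher-Yates shuffle, applied online.
        self.items.append(x)
        j = random.randrange(len(self.items))
        self.items[-1], self.items[j] = self.items[j], self.items[-1]

    def first_matching(self, pred):
        """Linear scan; because the order is uniformly random, the first
        match is a uniformly random element among those satisfying pred."""
        for x in self.items:
            if pred(x):
                return x
        return None
```

Each insert is O(1), and a query costs O(n/k) expected scans when k of the n elements satisfy the predicate.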
Perhaps the data structure is that way because of other constraints unknown to us?
Don't call a problem stupid so easily. This is one of the problems which sounds likely to have come from a "real world" problem.
Funny that you of all people find it stupid.
Anyway...
The way the question is posed has no interesting/optimal/"cool" answers.
Prove the above wrong and then talk.
Why should there be optimal/cool solutions? Real world problems usually don't.
btw, your points are what make this a good question!
It is ripe for discussion, discuss a solution, see if a different data structure works, etc.
Anyway. I don't want to "talk" anymore. You are worse than I thought.
The question is stupid the way it is posed ("in a reasonable amount of time"??). The only answers possible in the question's current form are brute-forcish.
A good engineer would know the predicates beforehand and split the relevant streams off into auxiliary ArrayLists/vectors, or, without knowing the predicates, would have duplicated/siphoned off the whole stream into a randomize-on-insert data structure.
The OP should post the discussions he had with the interviewer or he should mention more details.
Don't support the current quality of postings on this website.
Pick an index at random and check if the element there is non-null. Keep re-picking indexes at random until you get a non-null element. This approach makes no assumptions about the distribution of the data and has an expected running time of O(N/K) if there are K non-null elements in the array, which is the best you can hope for if you have no information about how elements are distributed.
- eugene.yarovoi March 16, 2014

The algorithm as stated above is unbounded in terms of worst-case (as opposed to expected) time, but that can be remedied by adding a condition that if we fail to find a non-null element after n steps, we will do a linear scan of the array to find all the non-null values, and pick randomly from that list. This change caps the algorithm at O(n) time. Since the linear-time scan is only done when O(n) time has already been used anyway, the bounded algorithm is slower than the unbounded one by at most a constant factor for any input. It is therefore O(N/K) expected time and O(N) worst-case time.
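A sketch of the bounded version in Python, as described (probe up to n times, then fall back to a full scan; returning None for an empty or all-null array is my choice, not stated in the answer):

```python
import random

def random_non_null(a):
    """Expected O(n/k) random probing, capped at O(n) by a fallback scan."""
    n = len(a)
    if n == 0:
        return None
    for _ in range(n):  # after n failed probes, stop probing
        x = a[random.randrange(n)]
        if x is not None:
            return x
    # Fallback: collect all non-None values and pick uniformly among them.
    # This scan only runs after O(n) work has already been spent probing.
    non_nulls = [x for x in a if x is not None]
    return random.choice(non_nulls) if non_nulls else None
```

Both branches pick uniformly among the non-null elements, so the cutoff changes the worst-case bound without changing the output distribution.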