Programming Interview Questions

showell30@yahoo.com

Questions (1)

Comments (50)

Followers (6)

Reputation 1485

Page:

1 2 3 4 5 6

Comment hidden because of low score. Click to expand.

of 0 vote

It's reasonable to assume that the interviewer is talking about a balanced binary tree, even though the question doesn't explicitly state it. Even for a horribly imbalanced tree, this approach is better than a naive inorder traversal.

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

The question is poorly written. It's not clear whether the goal is to enumerate all the nodes in the range or simply find any matching node.

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

The question doesn't specifically mention storage constraints, but you might want to upvote "Storage-efficient C Sieve" anyway. :)

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Storage-efficient C Sieve.

This code conserves memory (for large N) by applying the sieve to only 1000 numbers at a time. It also skips even numbers.

#include <stdlib.h>
#include <stdio.h>

struct prime {
    int n;
    int max_multiple;
};

void sieve_out_multiples(
        int *sieve,
        struct prime *primes,
        int i,
        int n_start,
        int n_end)
{
    int offset;
    while (primes[i].max_multiple < n_end) {
        offset = (primes[i].max_multiple - n_start) / 2;
        sieve[offset] = 1;
        primes[i].max_multiple += 2 * primes[i].n;
    }
}

int count_odd_primes(n) {
    if (n % 2 == 0) --n;
    int chunk_size = 1000; // number of odds
    int n_start = 3;
    int n_end = n_start + 2 * chunk_size;
    if (n_end > n+2) n_end = n+2;
    int cnt = 0;
    chunk_size = (n_end - n_start) / 2;
    int alloc_size = chunk_size; // enough to get started
    struct prime *primes = malloc(alloc_size * sizeof(*primes));
    int *sieve = malloc(chunk_size * sizeof(int));
    int i;
    while (n_start <= n) {
        n_end = n_start + 2 * chunk_size;
        if (n_end > n+2) n_end = n+2;
        chunk_size = (n_end - n_start) / 2;
        for (i = 0; i < chunk_size; ++i) {
            sieve[i] = 0;
        }
        // mark all composities from primes so far
        for (i = 0; i < cnt; ++i) {
            sieve_out_multiples(sieve, primes, i, n_start, n_end);
        }
        for (i = n_start; i < n_end; i += 2) {
            int offset = (i - n_start) / 2;
            if (sieve[offset] == 0) {
                // we found a prime!
                // printf("prime %d\n", i);
                if (cnt >= alloc_size) {
                    alloc_size += chunk_size / 2 + 1;
                    primes = realloc(primes, alloc_size * sizeof(*primes));
                }
                primes[cnt].n = i;
                primes[cnt].max_multiple = 3 * i;
                sieve_out_multiples(sieve, primes, cnt, n_start, n_end);
                ++cnt;
            }
        }
        n_start = n_end;
        if (n_start % 2 == 0) ++n_start;
    }
    return cnt;
}

int count_primes(int n) {
    if (n <= 1) return 0;
    if (n == 2) return 1;
    return count_odd_primes(n) + 1;
}

int main(int argc, char **argv) {
    printf("%d\n", count_primes(100));
    printf("%d\n", count_primes(1000000));
    return 0;
}

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Yep, it "stops" being O(1) as soon as the number of bits in the binary number can't be stored in a word. While not theoretically O(1), the algorithm does allow you to increment numbers up to 2 ** (2**32) very quickly, whereas a 32-bit machine only increments numbers up to 2**32 quickly.

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Recursive single-pass Python solution:

def diameter_height(tree):
    if tree is None: return (0,0)
    left, val, right = tree
    ld, lh = diameter_height(left)
    rd, rh = diameter_height(right)
    d = max([ld, rd, lh+rh+1])
    h = max([lh+1, ld+1])
    return d, h

def diameter(tree):
    d, h = diameter_height(tree)
    return d

assert 0 == diameter(None)
assert 1 == diameter((None, 1, None))
three_node_tree = ((None, 0, None), 1, (None, 2, None))
assert 3 == diameter(three_node_tree)
seven_node_tree = (three_node_tree, 'foo', three_node_tree)
assert 5 == diameter(seven_node_tree)
imbalanced_tree = (seven_node_tree, 'bar', None)
assert 5 == diameter(imbalanced_tree)

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Python sort () and sorted() have supported a key argument for a while, which allows you to specify a key for sorting. Python also sorts tuples lexographically. So, it's easy to compose a key function from two other key functions:

def sorted2(lst, key1, key2):
    def f(v): return (key1(v), key2(v))
    return sorted(lst, key=f)

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

The problem with inorder traversal is that it's O(N). See "Divide and Conquer Solution" for a simple approach that is much more efficient for large trees.

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 6 vote

Divide and Conquer Solution:

If the tree is empty, return not found. If min <= root <= max, then return root. If root < min, recursively search the right subtree. Otherwise, recursively search the left subtree.

Assuming the tree is balanced, this runs in O(log N) time, vs. O(N) time for an inorder traversal.

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Can you determine which bit to flip in constant time? I assume the answer is yes.

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

P.P.S. I said all "even numbers are composite," but of course I meant all even numbers >= 4. A real world implementation would special case for small values of N.

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

It's only lazy with respect to B values. Yes, of course you'd have to sort the As, which is at least 50% of the work. I didn't mean to overstate the advantage.

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

P.S. This solution is less time-efficient than a simple Sieve (although not drastically so), but it's much better in terms of storage.

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

If storage is constained, see my solution "Maintain an ordered list of upcoming prime multiples." My solution requires storing a tuple for every prime less than N, which is of course better than O(N).

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

Maintain an ordered list of upcoming prime multiples. When n=20, your list will look like this:

(21, 3)
(21, 7)
(22, 2)
(22, 11)
(25, 5)
(26, 13)
(34, 17)
(38,19)

Pop off the top of the list, and now you know that 21 is composite. Push (24,3) on to the list:

(21, 7)
(22, 2)
(22, 11)
(24, 3)
(25, 5)
(26, 13)
(34, 17)
(38,19)

Pop off the top of the list again. 21 is a duplicate, so no new info on composites/primes here, but you can update the list:

(22, 2)
(22, 11)
(24, 3)
(25, 5)
(26, 13)
(28,7)
(34, 17)
(38,19)

Pop off (22,2) and push (24,2):

(22, 11)
(24, 2)
(24, 3)
(22, 11)
(25, 5)
(26, 13)
(28,7)
(34, 17)
(38,19)

Pop off (22,11) and push (33,11)

(24, 2)
(24, 3)
(25, 5)
(26, 13)
(28,7)
(33, 11)
(34, 17)
(38,19)

Next pop off (24,2). You know your last composite was 22, and your newest composite is 24, so you just found a prime! Now you push two values into the list:

(24, 3)
(25, 5)
(26, 2) # next multiple of 2
(26, 13)
(28,7)
(33, 11)
(34, 17)
(38,19)
(46, 23) # next multiple of 23

Hopefully you get the idea by now. An easy optimization is to only deal with odd numbers, since all even numbers are composite. When excluding evens, your list would look like this after 30:

(33, 3)
(33, 11)
(35, 5)
(35, 7)
(39, 13)
(51, 17)
(57, 19)
(69, 23)
(87, 29)

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 8 vote

Three approaches come to mind.
1) Sort on A, then scan the list for runs of elements with the same value for A, then sort each sublist for field B.
2) Sort on A/B simultaneously by using a compare function or key function that uses both fields.
3) Sort the list on B, then stable-sort the list on A.

A nice feature of approach #1 is that you can do it lazily, i.e. only sort the sublists when the second field is queried.

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Nax, you need to follow the process. Start with a. The suggested solution says that b, c, and d stand to the right of him (in some order) and then e is to left. In the recursive step, you arrange b, c, and d. There's no way you'll ever have e standing next to d if you follow the instructions.

- showell30@yahoo.com February 23, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

P.S. To avoid off-by-one errors, make sure you understand how you define height and diameter for segments. I define these lengths in terms of the number of nodes, so a segment from A to B to C has length 3 by my definition, not 2, despite 2 making more sense from a physical/intuitive perspective. Likewise, a tree with a single node has height 1 by definition.

- showell30@yahoo.com February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 5 vote

Write a function that returns a tuple/struct of two values for a given tree. The first value is the height, and the second value is the diameter length.

The terminating case is a an empty tree, which returns 0,0.

The recursive calculations are these:

height = max(height(left), height(left)) + 1
diameter = max(diameter(left), diameter(right), height(left)+height(right)+1)

To fully complete the exercise, have a helper function that calls the other function and returns only the diameter as the final answer.

- showell30@yahoo.com February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

The reason there's a gap in the representation for 4 (the None as the second value) is to facilitate the transformation to 5 without having to shift values. For small numbers this seems overkill, but think about incrementing b1011111001 to b1011111010. When the number of runs change, you don't want to shift over all the runs, because that would be O(N).

- showell30@yahoo.com February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 2 vote

If you are asked this, try to make your first answer as simple as possible: "I would use a set to keep track of integers that I already encountered." Then engage the interviewer in dialog. Does she have any special constraints? Is time scarce, storage scarce, or both? Also, make sure that the interviewer knows that you know the difference between interface and implementation. There are a zillion ways to implement a set, but the most important feature of a set is that you can put a value into it, and the set will remember that it's part of the, um, set. The actual implementation can be based on a hash, but be clear that it's an implementation detail.

- showell30@yahoo.com February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I believe this should work.

- showell30@yahoo.com February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

If you have to do an O(N) algorithm to find the rotation point, then you might as well just do a straightforward O(N) search as well, since O(N) + O(log N) is O(N).

- showell30@yahoo.com February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

This is the way to do it. Even though the underlying representation is only binary in an abstract sense, it's totally trivial to serialize the number in binary form from the compressed representation that the OP suggests. See my Python implementation on this page for more details.

- showell30@yahoo.com February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

This is not for the faint of heart. :)

# In this completely contrived problem, we represent a binary
# number as a list of runs of zeros and ones, and we minimize
# bit flipping and storage insertion/deletion by having the
# number of bits in a run represent the number of bits to
# jump in the array.  Incrementing a binary number, no matter
# how large, involves only modifying up to three tuples in place
# and possibly appending a sentinel to the end of the list.
def test():
    numbers = {
        0: [None],
        1: [(1,1), None],
        2: [(1,0), (1,1), None],
        3: [(2,1), None,  None],
        4: [(2,0), None,  (1,1), None],
        5: [(1,1), (1,0), (1,1), None],
        6: [(1,0), (2,1), None,  None],
        7: [(3,1), None,  None,  None],
        8: [(3,0), None,  None,  (1,1), None],
        9: [(1,1), (2,0), None,  (1,1), None],
       10: [(1,0), (1,1), (1,0), (1,1), None],
    }
    for k, v in numbers.items():
        assert k == to_decimal(v)
    num = [None] # zero
    for i in range(1024 * 128 + 5):
        assert i == to_decimal(num)
        incr_bin(num)
    print i
    print num
    print to_binary(num)
    print 'DONE!'

def incr_bin(num):
    if num[0] is None:
        # 0 -> 1
        num[0] = (1, 1)
        num.append(None)
        return
    num_bits, bit = num[0]
    if bit == 0:
        num_zeros = num_bits
        if num_zeros == 1:
            # 011 -> 111
            num_ones, _ = num[1]
            num[0] = (num_ones+1, 1)
        else:
            # 0001 -> 1001
            num[0] = (1,1)
            num[1] = (num_zeros-1, 0)
    else:
        num_ones = num_bits
        if num[num_ones] is None:
            # 1111 -> 00001
            num[0] = (num_ones, 0)
            num[num_ones] = (1,1)
            # We only need to allocate storage
            # when we reach a power of 2!
            num.append(None)
        else:
            num_zeros, _ = num[num_ones]
            if num_zeros == 1:
                # 11101111 -> 0001111
                num_higher_ones, _ = num[num_ones+1]
                num[0] = (num_ones, 0)
                num[num_ones] = (num_higher_ones+1, 1)
            else:
                # 1001xxx -> 0101xxx
                num[0] = (num_ones, 0)
                num[num_ones] = (1, 1)
                num[num_ones+1] = (num_zeros-1, 0)

# helper function for testing
def to_decimal(num, i=0):
    if num[i] is None:
        return 0
    result = 0
    num_bits, bit = num[i]
    if bit == 1:
        result = 2 ** num_bits - 1
    return result + (2 ** num_bits) * to_decimal(num, i+num_bits)

# another helper function
def to_binary(num):
    s = ''
    i = 0
    while num[i]:
        num_bits, bit = num[i]
        s = str(bit) * num_bits + s
        i += num_bits
    return s

test()

- showell30@yahoo.com February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

This Python solution assumes two challenges. First, we have to do it in O(logN) time. Second, we aren't told in advance how many positions the array was rotated.

def test():
    # Assume we have a list that is sorted, except that is
    # rotated by an unknown offset.  First, we build a helper
    # function to determine the position of the smallest
    # element in the list (aka the rotation offset).
    assert 0 == pos_smallest_element([0, 0, 0])
    assert 0 == pos_smallest_element([0, 1, 2])
    assert 1 == pos_smallest_element([2, 0, 1])
    assert 2 == pos_smallest_element([1, 2, 0])
    assert 3 == pos_smallest_element([4, 5, 6, 0, 1, 2, 3])
    # Once we know the smallest element, it's pretty trivial
    # to implement search in terms of a generic binary search.
    test_lst = [10, 2, 4, 6, 8]
    assert 0 == search(10, test_lst)
    assert 1 == search(2, test_lst)
    assert 2 == search(4, test_lst)
    assert 3 == search(6, test_lst)
    assert 4 == search(8, test_lst)
    assert None == search(0, test_lst)
    assert None == search(7, test_lst)
    assert None == search(11, test_lst)

def bsearch(val, lowest, highest, get):
    # A generic binary search uses a get() function, so
    # that we're not coupled to the storage scheme.  We
    # just need a guarantee that get(a) <= get(b) when a <= b.
    def f(low, high):
        if low > high:
            return None
        mid = (low+high) / 2
        mid_val = get(mid)
        if val == mid_val:
            return mid
        if val < mid_val:
            return f(low, mid-1)
        else:
            return f(mid+1, high)
    return f(lowest, highest)

def search(val, lst):
    offset = pos_smallest_element(lst)
    def actual_pos(i):
        return (i+offset) % len(lst)
    def get(i):
        return lst[actual_pos(i)]
    answer = bsearch(val, 0, len(lst)-1, get)
    if answer is None:
        return None
    return actual_pos(answer)

def pos_smallest_element(lst):
    def pos(low, high):
        if low == high:
            return low
        if low + 1 == high:
            if lst[low] <= lst[high]:
                return low
            else:
                return high
        mid = (low + high) / 2
        if lst[low] <= lst[mid] <= lst[high]:
            return low
        if lst[low] <= lst[mid]:
            return pos(mid, high)
        else:
            return pos(low, mid)
            
    return pos(0, len(lst) - 1)

test()

- showell30@yahoo.com February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Try finding 2 in 10, 2, 4, 6, 8. I think it breaks your algorithm, but there's an easy way to prove me wrong. :)

- showell30@yahoo.com February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

If you aren't told rotateAmount in advance, how do you determine it in non-linear time?

- showell30@yahoo.com February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

# You can simulate linked lists in Python with tuples.
# You want the least significant bit to be the head of
# the list.
def test():
    one = (1, None)
    two = (0, (1, None))
    three = (1, (1, None))
    six = (0, (1, (1, None)))
    seven = (1, (1, (1, None)))
    thirteen = (1, (0, (1, (1, None))))
    assert three ==  sum(one, two)
    assert thirteen == sum(six, seven)
    assert 15 == to_decimal(sum(two, sum(six, seven)))

def sum(lst1, lst2, carry=0):
    if lst1 is None and lst2 is None:
        if carry:
            return (1, None)
        else:
            return None
    if lst1 is None:
        b1, lst1 = 0, None
    else:
        b1, lst1 = lst1
    if lst2 is None:
        b2, lst2 = 0, None
    else:
        b2, lst2 = lst2
    b = b1 + b2 + carry
    if b >= 2:
        return (b-2, sum(lst1, lst2, 1))
    else:
        return (b, sum(lst1, lst2, 0))
    
def to_decimal(lst):
    if lst is None:
        return 0
    b, rest = lst
    return b + 2 * to_decimal(rest)

test()

- showell30@yahoo.com February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

It appears that you are working with a reversed list, not a rotated list. It's not clear that you read the problem correctly.

- showell30@yahoo.com February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Solving this is O(N) time is trivial, so the challenge is to solve it in sub-linear time. If the interviewer tells you where the smallest element in the list is, then you can do a binary search with the known rotation offset. If you don't know where the smallest element is, then the challenge is to find the smallest element in the list.

To find the smallest element in a rotated sorted list, break the lists in half. If low <= mid and mid <= low, then the smallest element is the first element, and you're done. If low > mid, then the smallest element must be in the first half of the list, so recurse on the first half of the list. If mid > high, then the smallest element must be in the second half of the list, so recurse on that.

The key insight here is that when you break a rotated sorted list in half, only one half of the list will have the smallest element, and the other half of the list will be in sequence.

- showell30@yahoo.com February 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

P.S. After the first pass you will already have an estimate of the median that is within 16 million of the exact median. On the second pass you will have an estimate that is within 64k of the precise median. After the third pass you will be at most 256 off. After the 4th pass you'll know the precise integer median. If you're dealing with 64-bit numbers, you will do eight passes.

- showell30@yahoo.com February 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Assume that the integers are 32-bit integers and that you can read the data stream four different times (either by asking for it again or persisting it to disk). You can solve this in four passes. In the first pass, shift each number by 24 bits and increment one of 256 buckets. After the first pass, iterate through the buckets to find out which contains the median value and what its rank will be. On the second pass, mask out the highest eight bits to filter values then use the next eight bits to to populate 256 buckets again. Repeat for the third pass and fourth pass. After the fourth pass you will have your result.

- showell30@yahoo.com February 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Treating the x dimension separately from the y dimension is a nice touch here. For any interview problem in 2D or 3D, a good think to ask yourself is whether you can treat the movements in the different dimensions as being independent. If you can treat the dimensions as being independent, then you basically can reduce the problem to a 1D problem.

- showell30@yahoo.com February 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Here is a Python solution.

def compute_spiral_area(movements):
    i = 0
    max_x = 0
    min_x = 0
    max_y = 0
    min_y = 0
    x = 0
    y = 0
    while True:
        if i >= len(movements): break
        x += movements[i]
        if x > max_x: max_x = x
        i += 1
        if i >= len(movements): break
        y += movements[i]
        if y > max_y: max_y = y
        i += 1
        if i >= len(movements): break
        x -= movements[i]
        if x < min_x: min_x = x
        i += 1
        if i >= len(movements): break
        y -= movements[i]
        if y < min_y: min_y = y
        i += 1
    return (max_x - min_x) * (max_y - min_y)

input = [4,3,5,4,10]
print compute_spiral_area(input)

- showell30@yahoo.com February 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

See my solution "Here's the in-place solution in Python..." to see how you can solve this working backward after first compressing the 1s in an initial pass.

- showell30@yahoo.com February 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Read the problem again. The challenge here is rewriting the array in constant space. Streaming this to standard output is trivial.

- showell30@yahoo.com February 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Here's the in-place solution in Python, with tests. It's done in constant space. There's no recursion and the array never grows larger than the final result.

def test():
    test_strings = [
        ('a1b1c1', 'abc'),
        ('a1b3c5', 'abbbccccc'),
        ('a5b3c1', 'aaaaabbbc'),
        ('a5b5c5', 'aaaaabbbbbccccc'),
    ]
    for s, expected_result in test_strings:
        # strings are immutable in Python, so make this a list
        lst = list(s)
        expand(lst)
        assert expected_result == ''.join(lst)

def expand(a):
    # i = input index
    # j = output index
    # In our first pass, we compress the 1s
    # and compute how much space we'll have.
    i = 0
    j = 0
    new_len = 0
    while i < len(a):
        c = a[i]
        cnt = int(a[i+1])
        new_len += cnt
        i += 2
        a[j] = c
        j += 1
        if cnt > 1:
            a[j] = str(cnt)
            j += 1
    # Set the array to correct length.
    while len(a) > new_len:
        a.pop()
    # Note that this is the only place we append
    # to the list.  Everything is in-place, and we
    # have constant storage for the locals: i, j, cnt, new_len.
    while len(a) < new_len:
        a.append(None) 
    # In the second pass, we work backward through
    # the string and write out the final result.  By
    # working backward, we know we'll have room.
    i = j - 1 # last element in compress string
    j = new_len - 1 # where we write
    while i >= 0:
        c = a[i]
        i -= 1
        try:
            cnt = int(c)
            c = a[i]
            i -= 1
        except:
            cnt = 1
        while cnt > 0:
            a[j] = c
            j -= 1
            cnt -= 1

test()

- showell30@yahoo.com February 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

You just need to compress the 1s in a first pass. After that, the reverse methodology works fine. In the first pass, you compress the 1s and compute the length of the expanded string. In the second pass, you start from the back. Read a character. If it's a letter, just write it to the end of the array. If it's a number, read back one more to get the letter, and then write the appropriate numbers to the back of the string.

- showell30@yahoo.com February 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

If the integers can range from 0 to 2<<32 (approximately 4 billion), then you still have scaling issues. Assume the first half of your dataset ranges from 1 billion to 1.5 billion. You need to track every number, since the second half of the dataset could all be less than 1 billion or all be more than 1.5 billion.

It's a good question, though. If you constrain the size of the numbers more, than the "bucket" approach becomes practical.

- showell30@yahoo.com February 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

See my answer that begins with "Let's assume memory and disk are both constrained...". It elaborates on the dual-heap approach in the context of large datasets.

- showell30@yahoo.com February 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

Let's assume memory and disk are both constrained, but you can assume that your dataset is ordered in a fairly random way. Use the first element as the provisional median. Maintain two heaps, one for numbers lower than the median and one for numbers higher than the median. For each new number, decide which heap to update and keep track of the size of both heaps. Whenever you get two elements in a row that go to the same heap, you pop off the top element from the large heap, compare it to the most recent number, and decide the new provisional median. Put the other number in the smaller heap.

Since memory and disk are constrained, you will want to limit the size of the heaps. This means that numbers that are "extreme" will get tossed. This is risky, because the "extreme" number might actually be a non-extreme number that's only "extreme" relative to early numbers in the dataset. For real world problems, this is often an acceptable risk, as you're effectively treating early values as outliers. You can detect the situation by keeping track of the max number discarded for being too small and keeping track of the min number discarded for being too big.

If you want to compute the *exact* median, then you have to store the first half billion integers *somewhere*, since every one of the first half billion integers could theoretically still be the median without any knowledge of the second half of the sample.

- showell30@yahoo.com February 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

Yup, if parens aren't allowed, then you're better off just generating all the possible answers and evaluating them. To generate the sequence of numbers, use a permutation algorithm. To generate the symbols, use an odometer algorithm. To evaluate the expression, first scan for * and /, then scan for + and -.

- showell30@yahoo.com February 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

To clarify, I am defining N as the perimeter of the rectangle. The algorithm does not depend on it being a square. At worst it does m+m comparisons (N/2), which makes it O(N).

- showell30@yahoo.com February 20, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

The only problem with this solution is that you don't really take advantage of the parallelism of a distributed system. You want to come up with some divide and conquer scheme that allows multiple nodes to be computing partial sums simultaneously. See the Python binary tree solution, for example.

- showell30@yahoo.com February 20, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 6 vote

Let the nodes form a binary tree. Every node has 0 or 1 parents and between 0 and 2 children. Nodes with children wait for them to report their values and pass a partial sum up to their parents, then wait for the parents to report the eventual sums. For parallel nodes this runs in O(logN) time. The synchronous simulation below runs in linear time.

import random

class Node:
    def __init__(self, n, num_nodes, val, send, report_final_result):
        self.send = send
        self.report_final_result = report_final_result
        self.children = []
        self.sum = val
        if 2*n <= num_nodes:
            self.children.append(2*n)
        if 2*n+1 <= num_nodes:
            self.children.append(2*n+1)
        self.parent = n / 2
        self.values_to_wait_for = len(self.children)
        self.child_values = []
    def start(self):
        if self.values_to_wait_for == 0:
            if self.parent:
                self.send(self.parent, self.sum)
            else:
                self.report_final_result(self.sum)
    def receive(self, val):
        if self.values_to_wait_for > 0:
            self._handle_child_val(val)
        else:
            self._handle_parent_val(val)
    def _handle_child_val(self, val):
        self.sum += val
        self.values_to_wait_for -= 1
        if self.values_to_wait_for == 0:
            if self.parent:
                self.send(self.parent, self.sum)
            else:
                # we're the root
                self.report_final_result(self.sum)
                for child in self.children:
                    self.send(child, self.sum)
    def _handle_parent_val(self, val):
        self.report_final_result(val)
        for child in self.children:
            self.send(child, val)

def simulate(num_nodes):
    nodes = []
    pending_nodes = set()
    sum = 0
    def make_dispatch(sender):
        def f(n, val):
            # connect send to recipient's receive() method
            print 'sender %d sent %d to %d' % (sender, val, n)
            nodes[n-1].receive(val)
        return f
    def make_report_final_result(sender):
        def f(val):
            print sender, val
            assert val == sum
            pending_nodes.remove(sender)
        return f
    for i in range(num_nodes):
        node_val = random.randint(0, 100)
        sum += node_val
        node = Node(i+1, num_nodes, node_val,
                make_dispatch(i+1),
                make_report_final_result(i+1)
        )
        nodes.append(node)
        pending_nodes.add(i+1)
    for i in range(num_nodes):
        nodes[i].start()
    assert 0 == len(pending_nodes)

simulate(2000)

- showell30@yahoo.com February 20, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 5 vote

Write all the elements to disk in a first O(N) pass, and randomly capture about 99 of them into memory. Sort the 99 elements to create 100 ranges for the billion elements to be bucketed into. Each range will be a file on disk. For each of the billion integers, do a binary search on your 99 randomly sampled integers to determine which file they go to next. Keep track of how many integers are in each file. At the end of the second pass you will have a hundred files, each with about ten million elements each. You'll know how many elements are in each file, and you'll be able to easily compute which file contains the median, and where the median ranks within that file. At that point, use an in-memory solution to find the nth element in the file that you know contains the median. Sorting ten million integers is pretty fast, so you can probably just sort all of them without having to go to a fancy heap-based approach. The total cost here is roughly 2 billion disk writes and roughly 11 billion integer comparisons.

This will get the job done easily for most data sets, but you can refine it by having provisions for one range taking an unexpectedly large number of values. Also, after a certain amount of time, you'll know that it will be impossible for certain ranges to ever contain the median, so you can short circuit writing values to their files and simply update their counts.

- showell30@yahoo.com February 20, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

# In Python a simple way to represent a binary
# tree is as a tuple: (left_tree, val, right_tree)
def create_tree(lst):
    def _maketree(low, high):
        # Making a balanced tree is simply a matter
        # of taking the middle value as the root
        # and recursively building the subtrees
        # on the left and right.
        if low > high:
            return None
        if low == high:
            return (None, lst[low], None)
        mid = (low + high) / 2
        val = lst[mid]
        left = _maketree(low, mid - 1)
        right = _maketree(mid + 1, high)
        return (left, val, right)
    return _maketree(0, len(lst) - 1)

def traverse(tree):
    if tree is None:
        return
    left, val, right = tree
    traverse(left)
    print val
    traverse(right)

def tree_height(tree):
    if tree is None:
        return 0
    left, val, right = tree
    lh = tree_height(left)
    rh = tree_height(right)
    return max([lh, rh]) + 1

lst = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
tree = create_tree(lst)
print tree
assert 4 == tree_height(tree)
traverse(tree)

- showell30@yahoo.com February 20, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Here is an efficient algorithm that uses divide and conquer. You divide up runners into the first division and second division and let them recursively order themselves. Then you combine the divisions. The winner of the bottom division tries to beat the last place person in the top division, and if successful, works his way up the ranks. As soon as the top division's loser beats the bottom division's winner, you can simply chain the two divisions together.

import random

class Runner:
    pass

def race(race_results, c1, c2):
    if (c1, c2) in race_results:
        return race_results[(c1, c2)]
    if random.random() < 0.5:
    # if c1 > c2:
        print '%d beats %d' % (c1, c2)
        race_results[(c1, c2)] = c1
        return c1
    else:
        print '%d beats %d' % (c2, c1)
        race_results[(c1, c2)] = c2
        return c2

def race_runners(race_results, runners, low, high):
    if low == high:
        return low, low
    if low + 1 == high:
        winner = race(race_results, low, high)
        if winner == low:
            runners[low].inferior = high
            runners[high].superior = low
            return low, high
        else:
            runners[low].superior = high
            runners[high].inferior = low
            return high, low
    mid = (low + high) / 2
    winner1, loser1 = race_runners(race_results, runners, low, mid)
    winner2, loser2 = race_runners(race_results, runners, mid + 1, high)
    print 'merge:'
    print_chain(runners, winner1)
    print_chain(runners, winner2)
    while winner2 is not None:
        on_deck = runners[winner2].inferior
        # Try to promote the winner of the second division
        # up through the ranks of the first division.
        defeated = None
        victor = None
        loser = loser1
        while loser is not None:
            winner = race(race_results, winner2, loser)
            if winner == winner2:
                defeated = loser
                loser = runners[loser].superior
            else:
                victor = loser
                break
        if defeated is None:
            # winner2 beat nobody, so
            # connect the ladders and we're done
            print 'simple connect'
            runners[loser1].inferior = winner2
            runners[winner2].superior = loser1
            break
        if victor is None:
            # we have a new alpha dog
            runners[defeated].superior = winner2
            runners[winner2].inferior = defeated
            print 'new alpha dog'
            winner1 = winner2
        else:
            # insert our new place
            print 'insertion'
            runners[victor].inferior = winner2
            runners[winner2].superior = victor
            runners[winner2].inferior = defeated
            runners[defeated].superior = winner2
        if on_deck is None:
            # top division loser couldn't hold off anybody
            loser2 = loser1
            break
        runners[on_deck].superior = None
        winner2 = on_deck
    print_chain(runners, winner1)
    return winner1, loser2


def print_chain(runners, head):
    chain = []
    while head is not None:
        chain.append(head)
        next = runners[head].inferior
        if next and runners[next].superior != head:
            print chain
            raise Exception('bad reverse linking')
        head = next
    print 'chain', chain

def ladder(num_runners):
    runners = []
    for i in range(num_runners):
        runner = Runner()
        runner.superior = None
        runner.inferior = None
        runners.append(runner)
    race_results = {}
    winner, loser = race_runners(race_results, runners, 0, num_runners-1)
    print_chain(runners, winner)

def test():
    for i in range(1, 25):
        print '-----'
        ladder(i)

test()

- showell30@yahoo.com February 19, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

This seems overly complicated, when this problem is easily solved in O(M) by traversing adjacent cells and leaving breadcrumbs. See the Python solution, for example.

- showell30@yahoo.com February 19, 2013 | Flag Reply

Page:

1 2 3 4 5 6

CareerCup

showell30@yahoo.com

Books

Videos

Resume Review

Mock Interviews