adr
0 of 0 votes
Answers
You are standing in the top-left corner of an n*m grid. At each step you can only move up, down, right or left. Count the number of unique paths to the bottom-right corner of the grid (paths cannot cross themselves). The interviewer suggested that a backtracking solution is not the most performant one.
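No solution was posted for this one. For reference, here is the plain backtracking counter the interviewer called non-performant (exactly counting self-avoiding paths is expensive in general; the hinted faster approach would be something like a broken-profile DP). The function name and the driver values are mine:

```python
def count_paths(n, m):
    # Count self-avoiding paths from the top-left to the bottom-right
    # cell of an n*m grid, moving up/down/left/right.
    visited = [[False] * m for _ in range(n)]

    def dfs(r, c):
        if r == n - 1 and c == m - 1:
            return 1
        visited[r][c] = True
        total = 0
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < n and 0 <= nc < m and not visited[nr][nc]:
                total += dfs(nr, nc)
        visited[r][c] = False  # backtrack
        return total

    return dfs(0, 0)
```

On a 2*2 grid this finds the 2 monotone paths; already on 3*3 the count grows to 12, which is why backtracking does not scale.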
adr
Software Engineer
OK, here's some code. It can be further optimised to not use naive union-find and to not brute-force when searching for overlapping areas. Hope it helps.
# Assume all sensors are within a room; the actual width of the room does not matter.
def canGoFromLeftToRight(roomHeight, sensors, r):
    ids = range(len(sensors))

    def union(i, j):
        ids[find(i)] = find(j)

    def find(i):
        while i != ids[i]:
            ids[i] = ids[ids[i]]  # path halving
            i = ids[i]
        return i

    top = []
    bottom = []
    for i, [x, y] in enumerate(sensors):
        if y + r >= roomHeight:  # overlaps the top side of the room
            top += [i]
        if y <= r:  # overlaps the bottom side of the room
            bottom += [i]
    if not top or not bottom:
        return True
    # unite all sensors overlapping the top
    for i, j in zip(top, top[1:]):
        union(j, i)
    # unite all sensors overlapping the bottom
    for i, j in zip(bottom, bottom[1:]):
        union(i, j)
    # unite all sensors overlapping each other; the examples below assume
    # two coverage areas "overlap" when the centers are within r of each
    # other (for two disks of radius r the geometric test would be
    # distance <= 2r)
    for i, [x, y] in enumerate(sensors):
        for I, [X, Y] in enumerate(sensors[i+1:], i+1):
            if (X-x)*(X-x) + (Y-y)*(Y-y) <= r*r:
                union(i, I)
    return find(top[0]) != find(bottom[0])

canGoFromLeftToRight(1, [(0,0),(0.5,0.2),(0.7,0.4),(0.6,0.6),(1,1)], 0.5)  # False
canGoFromLeftToRight(1, [(0,0),(0.5,0.2),(0.7,0.4),(1,1)], 0.5)  # True

adr
September 16, 2018 This is a percolation problem solvable using union-find.
The thief cannot go from left to right iff there exists a path from top to bottom via sensor coverage areas. You have a graph where two sensors are connected if their coverage areas overlap. You need to find whether there exists a sensor overlapping the top side of the room that is connected with a sensor overlapping the bottom side of the room.
Solution 1: min-priority queue where the key for each point is the distance to (0,0). Heapify the input array and pop the first K. Complexity: O(n + k*log(n)).
Solution 2: use a selection algorithm like quickselect to get the Kth closest element. Complexity: O(n) average, O(n*n) worst case. You'll get a partially sorted array where the Kth element is in its place and all elements to the left are closer to (0,0) or at the same distance. Note that the complexity of this method does not even depend on K.
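Solution 1 can lean on the standard library: heapq.nsmallest maintains a bounded heap internally, giving O(n*log(k)). The function name and sample points below are my own sketch:

```python
import heapq

def k_closest(points, k):
    # heapq.nsmallest keeps at most k candidates in its internal heap,
    # so the input is never fully sorted
    return heapq.nsmallest(k, points, key=lambda p: p[0]*p[0] + p[1]*p[1])
```

The result comes back ordered by distance, closest first.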
from collections import Counter

def count(arr):
    # Character counts at even and at odd positions form the signature of a
    # string; sorting the items makes the signature order-independent.
    fs = [(tuple(sorted(Counter(x[::2]).items())),
           tuple(sorted(Counter(x[1::2]).items())))
          for x in arr]
    return sum(z for (x, z) in Counter(fs).items() if z > 1)

print count(["abcd", "cbad", "bacd"])
print count(["abcd", "acbd", "adcb", "cdba",
             "bcda", "badc"])

adr
August 09, 2018 @robb.krakow Sure. This has some similarities with union-find. Use an array to store, for N elements, which of the two partitions each belongs to, with 0 meaning unmarked/undecided. When you see a rule (a,b) you look up a and b in the array. If both of them are unmarked, i.e. rel[a] == rel[b] (we assume a != b; they cannot already be marked as belonging to the same partition, since the rules are non-contradictory), mark them as belonging to opposite partitions. Otherwise identify which of a, b is marked and mark the other one as belonging to the opposite partition.
adr July 27, 2018 Assuming the set of rules is complete and non-contradictory, i.e. after applying the rules every element is in exactly one of the two partitions:
def part(n, k):
    rel = [0]*n  # 0 - undecided, 1 - first partition, -1 - second partition
    for a, b in k:
        if rel[a-1] == rel[b-1]:  # both still undecided
            rel[a-1] = 1
            rel[b-1] = -1
        else:
            if rel[b-1] == 0:
                a, b = b, a
            rel[a-1] = -rel[b-1]
    r1 = []
    r2 = []
    for i in xrange(n):
        r = r1 if rel[i] == 1 else r2
        r += [i+1]
    return (r1, r2)

adr
July 26, 2018 You only need to know P and K. The list of N integers is irrelevant to the problem.
The first part is to factor P into primes. Then, if all the elements in the resultant list of prime factors are unique, there is a simple combinatorial solution: 1 + sum_{i=1..K-1} C(i, L-1), where C(k,n) = n!/(k!(n-k)!) and L is the length of the list of primes.
I find the general case hard though. When we have duplicates in the list of primes (think of [2, 2, 2, 2, 2], where [2,2],[2,2,2] and [2,2,2],[2,2] result in the same terms, 4*8 = 8*4) I can't think of any better way than to iterate over all 2-way, 3-way, .., K-way splits of the list of prime factors, compute the resultant product terms, and count ignoring repetitions.
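The brute-force count for the general case can be sketched by enumerating factors in nondecreasing order, which sidesteps the 4*8 vs 8*4 repetition automatically. The function name and its exact contract (unordered product of at most k factors, the single factor p itself counting as one way) are my assumptions:

```python
def count_products(p, k, lo=2):
    # Number of ways to write p as an unordered product of at most k
    # factors, each >= lo; enumerating divisors in nondecreasing order
    # makes every factorization appear exactly once.
    if k == 0:
        return 0
    total = 1 if p >= lo else 0  # take p itself as the final factor
    d = lo
    while d * d <= p:
        if p % d == 0:
            total += count_products(p // d, k - 1, d)
        d += 1
    return total
```

For example, 30 as at most 2 factors gives 4 ways (30, 2*15, 3*10, 5*6), and 32 = 2**5 as at most 5 factors gives 7 ways, one per integer partition of 5.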
The idea is to accommodate the "cheapest" tasks first, using a cpu with the fewest free cores.
import heapq

def count(tasks, cpus):
    tasks = sorted(tasks)
    heapq.heapify(cpus)
    res = 0
    for t in tasks:
        # drop cpus that cannot fit even the cheapest remaining task
        while cpus and cpus[0] < t:
            heapq.heappop(cpus)
        if not cpus:
            return res
        cores = cpus[0]
        res += 1
        heapq.heapreplace(cpus, cores - t)
    return res

adr
July 23, 2018 Similar to @Flint's solution but a bit shorter, O(len(seq) * (sum(seq) - magic_number)):
def is_magic(seq, magic_number):
    if not seq:
        return False
    magic_number -= seq[0]  # the first element must be positive
    seq = seq[1:]
    target = sum(seq) - magic_number
    if (target % 2) or (target < 0):
        return False
    target /= 2
    # subset-sum DP over the remaining elements, expressed as a reduce
    return reduce(lambda w, i: [w[m] or (w[m - seq[i]] if seq[i] <= m else False)
                                for m in xrange(target + 1)],
                  xrange(1, len(seq)),
                  [True] + [i + 1 == seq[0] for i in xrange(target)])[-1]

adr
July 21, 2018 Assuming you are on a 64-bit system where practically any file on your hard drive can fit in your virtual address space, you just do an external sort, mmap the resultant file, and then do a usual 3-sum with pointers. The OS will take care of (un)loading pages to/from real memory.
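The in-memory part after the external sort is the textbook two-pointer scan over the sorted data; a sketch over a plain list (an mmap'ed sorted file would be indexed the same way; names are mine):

```python
def three_sum_exists(sorted_nums, target):
    # For each fixed first element, sweep two pointers inwards over the
    # rest of the sorted data looking for the complementary pair.
    n = len(sorted_nums)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        while lo < hi:
            s = sorted_nums[i] + sorted_nums[lo] + sorted_nums[hi]
            if s == target:
                return True
            if s < target:
                lo += 1
            else:
                hi -= 1
    return False
```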
adr July 20, 2018 The enabling logic should be straightforward provided that you have ample memory. You could use a hash map for the counters: id -> (timestamp, count). If someone makes a request you check if there is an entry. If there isn't, or if the timestamp in the entry is less than the current timestamp - X, add/replace it with (timestamp, 0). Otherwise, if the counter is less than Y, replace the entry with (timestamp, count+1). Otherwise, block the request. You will also want to limit the size of the map if the range of ids is too big, by evicting old values. Different strategies could be used, like keeping a priority queue timestamp -> id where the top element in the queue is the oldest. Before making checks against the hash map you could repeatedly pop elements off the queue if they are too old and remove the corresponding elements from the hash map.
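A minimal sketch of the hash-map counter scheme, in a fixed-window variant (class and method names are mine; x is the window length and y the per-window limit, as in the description above):

```python
class RateLimiter(object):
    # Per id, keep the window's start timestamp and a request counter.
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.counters = {}  # id -> (window start timestamp, count)

    def allow(self, rid, now):
        entry = self.counters.get(rid)
        if entry is None or entry[0] <= now - self.x:
            self.counters[rid] = (now, 0)  # missing or stale: new window
            return True
        start, count = entry
        if count < self.y:
            self.counters[rid] = (start, count + 1)
            return True
        return False  # over the limit: block
```

The eviction of old ids described above would be layered on top of this map.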
adr July 18, 2018 So, if I understand the problem description, you are given a (timestamp, id) data stream and asked to compute the most frequent ids falling into the specified time window?
This is a well known problem. We need to know the size of the data stream that fits in the biggest time window of interest, as compared to the available memory. If you need to compute the exact top-K heavy hitters it will require O(N) memory (it's proved that even exact top-1 requires O(N) space worst-case), where N is the biggest number of data points fitting into the time window. If you can satisfy this requirement, then it's one thing and the solution is relatively easy. On the other hand (I suspect this is what the interviewer wanted to hear), if the size of the stream is too huge and you can tolerate errors, then it calls for an approximation algorithm like lossy counting etc., see https://en.m.wikipedia.org/wiki/Streaming_algorithm#Frequent_elements
See also here: https://cstheory.stackexchange.com/questions/19802/top-k-frequent-items-in-data-stream
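One of the classic approximation schemes in that family, Misra-Gries, is compact enough to sketch here (my own illustration, not from the thread):

```python
def misra_gries(stream, k):
    # Misra-Gries summary with k-1 counters: every element occurring more
    # than n/k times in a stream of length n is guaranteed to survive as
    # a key; a second pass over the data confirms the exact counts.
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters
```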
@Flint The problem can be rephrased as such: your input is an integer K>0, and a sequence S = s1 s2 .. sn, 1<=si<=K, 1<=i<=n. Consider a set Q of all possible sequences q1 q2 .. qz, 1 <= qi <= K, z >= 1 that are not subsequences of S (that cannot be derived from S by deleting some or no elements without changing the order of the remaining elements). You need to find the length of the shortest sequence in Q.
adr July 15, 2018 @aonecoding provided a cool solution. It took me a while to convince myself that it actually works. Here is a sketch of a proof if you are interested:
1) If an input string S does not contain all K characters, then the answer is obviously 1, as any missing character forms a string that is not a subsequence of S.
2) Otherwise, an input string S can be represented as a concatenation S1 S2 where S1 is the *shortest* prefix of S that contains all K characters and S2 is a (possibly empty) suffix.
Now, if the suffix S2 contains all K characters it can also be represented as a concatenation of a shortest prefix that contains all K characters and a (possibly empty) suffix.
Repeating this process until we get a suffix that does not contain all K characters, an input string S can uniquely be represented as a concatenation S1 S2 .. Sn where Si (i in [1..n-1]) is the shortest prefix of the rest of S (to be more precise, of the suffix of S produced by taking the S1 S2 .. Si-1 prefix off) containing all K characters, and Sn is the (possibly empty) suffix that does not contain all K characters (meaning some of the K characters are not in Sn).
3) Since Si is the shortest prefix with all K characters, its last character occurs in it only once (otherwise this character could be omitted, producing a shorter prefix with all K characters).
4) This is how to build a string that is guaranteed not to be a subsequence of S while also having minimum length among strings with this property (meaning that any shorter string is necessarily a subsequence of S): s1 s2 .. sn, where si (i in [1..n-1]) is the last character of Si, and sn is any of the K characters not present in Sn.
First, let's prove that any shorter sequence q1 q2 .. qk, k<n is a subsequence of S. This part is obvious, since qi is in Si, because Si contains all K characters (i in [1..k]).
Now, let's prove that the string s1 s2 .. sn is not a subsequence of S. Assume otherwise. Since S1 contains s1 only as the last character, either the whole of s1 s2 .. sn is a subsequence of S2 .. Sn, or just s2 s3 .. sn is a subsequence of S2 .. Sn. Either way, s2 s3 .. sn is a subsequence of S2 S3 .. Sn. Now, since S2 contains s2 only as the last character, s3 .. sn is a subsequence of S3 .. Sn. Repeating this n-1 times we get that sn is a subsequence of Sn, but this cannot be true because sn was specifically chosen as one of the K characters not in Sn. We get a contradiction, which means s1 s2 .. sn is not a subsequence of S.
Now, having proved that s1 .. sn is the minimal string that is not a subsequence of S, we see that in order to compute n we just need to break the input string S up into the concatenation S1 .. Sn-1 Sn, as outlined in 2), and count the number of parts. This is exactly what the algorithm is doing.
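The counting in that last step reduces to one greedy pass; a minimal sketch, assuming the alphabet is exactly k symbols (the function name is mine):

```python
def shortest_non_subseq_len(s, k):
    # Greedily split s into blocks, each the shortest prefix of the
    # remainder containing all k symbols; the answer is blocks + 1.
    seen = set()
    blocks = 0
    for c in s:
        seen.add(c)
        if len(seen) == k:
            blocks += 1
            seen = set()
    return blocks + 1
```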
Sounds like you only need 10s worth of data for your check, so you could use a priority queue where the key is the timestamp and the value is (word, timestamp). The top element in the queue is the one with the smallest timestamp (oldest). So every time the iterator is used it could repeatedly remove the top element (if it is too old) both from the queue and from the hash map, until the oldest stored element's timestamp is not smaller than the current timestamp - 10s.
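A sketch of that eviction scheme (class and method names are mine; the stale check in contains() only evicts a word if it has not been refreshed since its queue entry was made):

```python
import heapq

class RecentWords(object):
    # Keep only words seen within the last `window` seconds.
    def __init__(self, window=10):
        self.window = window
        self.words = {}  # word -> latest timestamp
        self.heap = []   # (timestamp, word), oldest on top

    def add(self, word, ts):
        self.words[word] = ts
        heapq.heappush(self.heap, (ts, word))

    def contains(self, word, now):
        # evict queue entries older than now - window
        while self.heap and self.heap[0][0] < now - self.window:
            ts, w = heapq.heappop(self.heap)
            if self.words.get(w) == ts:  # not refreshed since: drop it
                del self.words[w]
        return word in self.words
```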
adr July 05, 2018 Quickselect. No extra space needed. O(n) expected time, O(n*n) worst case time.
def nSmallest(ps, n, key):
    beg = 0
    end = len(ps)
    while beg != end:
        first = beg
        last = end - 1
        piv = last
        pivk = key(piv)
        while first < last:
            if key(first) < pivk:
                first += 1
            else:
                last -= 1
                ps[first], ps[last] = ps[last], ps[first]
        ps[first], ps[piv] = ps[piv], ps[first]
        if first < n - 1:
            beg = first + 1
        elif first > n - 1:
            end = first
        else:
            return ps[:first+1]
    return ps[:beg]

def nNearest(ps, n):
    def key(i):
        (px, py) = ps[i]
        return px*px + py*py
    return nSmallest(ps, n, key)
O(n) extra space if mutating the input is not allowed:
def nSmallest(ps, n, key):
    r = []
    while ps and n > 0:
        piv = ps[-1]
        pivk = key(piv)
        lh, rh = [], []
        for t in ps[:-1]:
            (lh if key(t) < pivk else rh).append(t)
        if len(lh) < n:
            r += lh + [piv]  # counting the pivot guarantees progress
            ps = rh
            n -= len(lh) + 1
        elif len(lh) > n:
            ps = lh
        else:
            return r + lh
    return r

adr
June 26, 2018 If one can fit the entire map in memory, then union-find is overkill:
import random
n, m = 3, 4
grid = [[random.choice("XO") for _ in xrange(m)] for _ in xrange(n)]

def numIslands(grid):
    count = 0
    for p in xrange(n*m):
        r, c = p/m, p%m
        if grid[r][c] == 'X':
            count += 1
            grid[r][c] = 'V'  # mark visited
            l = [p]
            while l:
                p = l.pop()
                for z in filter(lambda z: 0 <= z < n*m
                                and grid[z/m][z%m] == 'X'
                                and abs(z/m - p/m)*abs(z%m - p%m) < 2,
                                [p-1, p+1, p-m, p+m]):
                    grid[z/m][z%m] = 'V'
                    l += [z]
    return count

for v in grid:
    print v
print numIslands(grid)

adr
June 17, 2018 Recursive:
def topoSort(ts):
    r, g, v = ([[]], {}, set())
    for [a, b] in ts:
        g.update({a: g.get(a, []) + [b]})
    def go(k):
        if k not in v:
            for m in g.get(k, []):
                go(m)
            v.add(k)
            r[0] += [k]
    for k in g.keys():
        go(k)
    return r[0][::-1]

print topoSort(["AB", "CE", "CD", "DE", "AC", "BC"])
Iterative:
def topoSort(ts):
    r, g, v = ([], {}, set())
    for [a, b] in ts:
        g.update({a: g.get(a, []) + [b]})
    for k in g.keys():
        st = [(k, g.get(k, []))]
        while st:
            (k, ms) = st.pop()
            if k not in v:
                if ms:
                    st += [(k, ms[1:])]
                    st += [(ms[0], g.get(ms[0], []))]  # descend into the child
                else:
                    v.add(k)
                    r += [k]
    return r[::-1]

print topoSort(["AB", "CE", "CD", "DE", "AC", "BC"])

adr
June 14, 2018 If the generator string has the property that no two prefix and suffix substrings share a common prefix (e.g. "ab" in the description has this property whereas "aba" does not), then the problem has a simple O(n) solution:
def isValid(s, g):
    if not g:
        return False
    (st, gi) = ([], 0)  # stack of interrupted positions in g, current position in g
    for c in s:
        if c == g[gi]:
            gi += 1
            if gi == len(g):  # finished a copy of g, resume the interrupted one
                gi = st.pop() if st else 0
        elif c == g[0]:  # a new copy of g starts inside the current one
            st += [gi]
            gi = 1
        else:
            return False
    return not st and gi == 0

print isValid("abcababccabc", "abc")

adr
June 12, 2018 O(n) solution, uses O(2**k) memory.
def foo(s, k):
    if len(s) < k + 2**k - 1:
        return False
    (count, v) = (2**k - 1, [0]*2**k)  # windows still unseen, seen flags
    q = reduce(lambda a, b: 2*a + b, s[:k], 0)  # first k-bit window as a number
    v[q] = 1
    z = 2**(k-1)
    for si in xrange(len(s) - k):
        q = (q - s[si]*z)*2 + s[si+k]  # slide the window by one bit
        if v[q] == 0:
            v[q] = 1
            count -= 1
            if count == 0:
                return True
    return False

print foo([1, 0, 0, 1, 1], 2)

adr
June 11, 2018 BFS solution:
import random
(n, m) = (10, 15)
(mx, s) = ([random.randint(0, 9) for _ in xrange(n*m)], random.randint(1, 50))
(v, l, r) = ([0]*m*n, [[[0], mx[0]]], [])
while l and not r:
    nl = []
    for [p, ps] in l:
        if ps == s:
            r = p
            break
        if ps < s:
            pp = p[-1]
            for np in [pp + d for d in [-1, 1, -m, m]
                       if 0 <= pp+d < n*m and abs((pp+d) % m - pp % m) < 2 and v[pp+d] == 0]:
                v[np] = 1
                nl += [[p + [np], ps + mx[np]]]
    l = nl

def printm(mx):
    print ""
    for i in xrange(n):
        print '[' + ' '.join(map(str, mx[i*m:i*m+m])) + ']'

print "target =", s
printm(mx)
if r:
    printm([1 if i in r else 0 for i in xrange(n*m)])
Output:
target = 34
[9 1 1 6 9 1 4 8 5 3 2 5 9 4 4]
[7 1 6 6 3 6 9 3 3 7 0 9 3 0 1]
[9 2 6 7 3 7 0 7 8 1 6 7 0 7 5]
[0 3 3 3 4 2 9 4 8 5 4 5 0 9 0]
[4 3 1 4 4 4 5 0 6 9 5 7 8 7 5]
[4 7 1 4 6 5 4 0 1 9 0 1 8 6 0]
[6 8 5 9 7 8 3 6 7 5 5 1 0 4 4]
[1 6 2 9 8 2 5 9 7 4 9 9 7 5 5]
[9 1 9 9 0 9 5 6 4 9 4 4 9 9 9]
[4 1 5 3 1 5 9 0 2 4 1 2 8 4 3]
[1 1 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 1 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 1 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 1 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 1 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 1 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 1 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

adr
June 10, 2018
import random
(n, m, bcount) = (5, 10, 4)
(pos, bpos) = (random.randint(0, n*m - 1), set(random.sample(xrange(n*m), bcount)))
(l, visited) = (set([pos]), set([pos]))
(r, steps) = (bpos.intersection(l), 0)
while l and not r:
    steps += 1
    l = set([p + d for p in l for d in [-1, 1, -m, +m]
             if 0 <= p+d < n*m and abs(p % m - (p+d) % m) < 2 and p+d not in visited])
    visited.update(l)
    r = bpos.intersection(l)
bmap = ['*' if i == pos else 'X' if i in r else 'x' if i in bpos else ' ' for i in xrange(n*m)]
for i in xrange(n):
    print ''.join(bmap[i*m:i*m+m])
print "distance =", steps
Output:
x * 
 x X 
 
 x 
 
distance = 2

adr
June 10, 2018 I assume by infinitely large you mean an array that does not fit in memory, otherwise your program will obviously never terminate. If that's the case, this sounds like a simple problem:
* Create k files (k - the number of tags)
* Read the input array in blocks that fit in memory
* If an element in the block has tag i, add it to the i-th file
* When you are done processing the array, just traverse your files in order
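A small simulation of the k-file scheme using temporary files (names are mine; the block-wise reading is elided, and records is a plain iterable of (tag, value) pairs):

```python
import tempfile

def external_tag_sort(records, k):
    # One scratch file per tag: stream the records through, appending each
    # value to its tag's file, then read the files back in tag order.
    files = [tempfile.TemporaryFile(mode="w+") for _ in range(k)]
    for tag, value in records:
        files[tag].write(value + "\n")
    result = []
    for f in files:
        f.seek(0)
        result.extend(line.rstrip("\n") for line in f)
        f.close()
    return result
```

Note the pass is stable: within a tag, the original order of elements is preserved.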
We just need to maintain the k largest elements seen so far. We can use a min-heap for this:
* Load the huge array in blocks that fit in memory
* If a number in the block is greater than or equal to the min element in the heap, or if the heap has fewer than k elements, add it to the heap, and remove the min element if the number of elements in the heap becomes k+1
* When we are done processing the array, the min element in the heap is the result
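The min-heap bookkeeping, sketched (the function name is mine; blocks stands in for the chunks of the huge array):

```python
import heapq

def kth_largest(blocks, k):
    # The k largest numbers seen so far live in a min-heap; after all
    # blocks are consumed, the heap's minimum is the k-th largest overall.
    heap = []
    for block in blocks:
        for x in block:
            if len(heap) < k:
                heapq.heappush(heap, x)
            elif x >= heap[0]:
                heapq.heapreplace(heap, x)  # drop the old min, keep size k
    return heap[0]
```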
Under these assumptions I would do this:
* Create a bit array of 2**32 bits (only 512MB required)
* Load the huge array in blocks that fit in memory
* If a number in the block has been seen before (the bit corresponding to this number is set), output it as a duplicate; otherwise set the bit
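A sketch of the bitmap check (the universe parameter is my addition so the idea can be tried on a small range; at 2**32 the bytearray is exactly 512MB):

```python
def find_duplicates(numbers, universe=2**32):
    # One bit per possible value; a set bit means the value was seen before.
    seen = bytearray((universe + 7) // 8)
    dups = []
    for x in numbers:
        byte, bit = x >> 3, 1 << (x & 7)
        if seen[byte] & bit:
            dups.append(x)
        else:
            seen[byte] |= bit
    return dups
```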
O(n*n*m), uses O(m) memory.
from operator import add, sub
import random

def maxseq(s, k):
    # longest contiguous window of s with sum <= k (two pointers; s is positive)
    first = i = cost = 0
    r = [0, [0, 0]]
    def update():
        if r[0] < i - first:
            r[0] = i - first
            r[1] = [first, i]
    while i < len(s):
        if s[i] + cost <= k:
            cost += s[i]
            i += 1
        else:
            update()
            cost -= s[first]
            first += 1
    update()
    return r[1]

def maxrect(v, k):
    first = 0
    end = len(v)
    r = [0, [0, 0, 0, 0]]
    def update(beg, end, arr):
        [b, e] = maxseq(arr, k)
        a = (e - b) * (end - beg)
        if r[0] < a:
            r[0] = a
            r[1] = [beg, end, b, e]
        return e - b
    while first < end:
        # top row fixed at `first`, bottom edge growing downwards
        arr = v[first]
        ub = (end - first) * len(arr)
        for j in xrange(first, end):
            if r[0] < ub:
                ub = (end - first) * update(first, j+1, arr)
            if j+1 != end:
                arr = map(add, arr, v[j+1])
        arr = map(sub, arr, v[first])
        first += 1
        # reuse the accumulated column sums for the next top row,
        # bottom edge shrinking upwards
        for j in reversed(xrange(first, end)):
            if r[0] < (j - first + 1) * len(arr):
                update(first, j+1, arr)
            if first != j:
                arr = map(sub, arr, v[j])
        first += 1
    return r

random.seed(42)
k = random.randint(1, 42)
n = 10
m = 15
v = [[random.randint(1, 7) for _ in range(m)] for _ in range(n)]
for vv in v:
    print vv
print k
[a, r] = maxrect(v, k)
for i in xrange(r[0], r[1]):
    print v[i][r[2]:r[3]]

adr
June 04, 2018
import string
import math
import operator

alph = string.ascii_lowercase
# the first 26 primes (101 is the 26th), one per letter
primes = [x for x in range(2, 102) if all(x % y != 0 for y in range(2, int(math.sqrt(x)) + 1))]
# a word's signature: the product of its letters' primes; two words have
# the same signature iff they are anagrams of each other
foo = lambda str: reduce(operator.mul, map(lambda x: primes[x], map(lambda x: alph.index(x), str)), 1)

def bar(dict, xdict, str, n):
    for p in [str[n:z:] for z in range(n, len(str)+1)]:
        x = foo(p)
        if x in xdict:
            r = [dict[xdict.index(x)]]
            if n + len(p) != len(str):
                rs = bar(dict, xdict, str, n + len(p))
                if len(rs) != 0:
                    return r + rs
            else:
                return r
    return []

def unscramble(dict, str):
    xdict = map(foo, dict)
    return " ".join(bar(dict, xdict, str, 0))
d = ['llo', 'hell', 'ocrazyworl', 'ld', 'crazyw', 'ell', 'llocra', 'azyworl', 'll', 'razywo', 'ocr', 'zywo', 'ocrazyw', 'ywor', 'azywo', 'llocrazyworl', 'razywor', 'ellocr', 'yw', 'zyworl', 'lloc', 'ellocrazyw', 'h', 'l', 'cra', 'ywo', 'el', 'llocrazywo', 'ocrazywo', 'crazywo', 'hellocra', 'zy', 'cr', 'ocrazy', 'azyw', 'locraz', 'wor', 'ra', 'rl', 'llocrazy', 'hellocraz', 'razyworl', 'llocr', 'wo', 'ocra', 'ellocrazy', 'orl', 'llocrazywor', 'llocraz', 'razy', 'c', 'ocraz', 'hellocrazy', 'llocrazyw', 'oc', 'o', 'hellocrazyw', 'w', 'lo', 'or', 'hellocrazywo', 'hellocr', 'locrazyworl', 'loc', 'elloc', 'craz', 'hel', 'locrazy', 'hellocrazywor', 'locrazywo', 'locrazyw', 'crazyworl', 'he', 'locr', 'ellocraz', 'worl', 'azy', 'r', 'z', 'crazy', 'helloc', 'azywor', 'ellocrazyworl', 'az', 'raz', 'locrazywor', 'crazywor', 'zywor', 'hellocrazyworl', 'ellocra', 'ellocrazywo', 'ello', 'ocrazywor', 'a', 'locra', 'razyw', 'yworl', 'e', 'ellocrazywor', 'zyw', 'y', 'hello']
str = "hellocrazyworld"
assert(unscramble(d, str) == "h e l l o c r a z y w o r ld")

adr
May 27, 2018
def match(s, p):
    n = len(s)
    m = len(p)
    ms = [[False] * (m + 1) for _ in xrange(n + 1)]
    ms[0][0] = True
    for j in xrange(1, m+1):
        ms[0][j] = ms[0][j-1] and p[j-1] in ['*', '+']
    for i in xrange(1, n+1):
        for j in xrange(1, m+1):
            if p[j-1] == '*':
                ms[i][j] = ms[i][j-1] or ms[i-1][j]
            elif p[j-1] == '+':
                ms[i][j] = ms[i][j-1] or ms[i-1][j-1]
            else:
                ms[i][j] = p[j-1] == s[i-1] and ms[i-1][j-1]
    return ms[n][m]

adr
May 27, 2018
No sorting or priority queue is required. It is cheaper to just find the median of the difference-value array using a selection algorithm like quickselect (linear time on average) and assign all people with differences less than the median to NY.
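A sketch of the assignment (names are mine; sorted() stands in for the median/quickselect step, which is what would make this O(n) on average):

```python
def assign_cities(costs):
    # costs: (toNY, toSF) pairs, even length; the half with the smallest
    # toNY - toSF difference flies to NY, the rest to SF.
    n = len(costs) // 2
    order = sorted(range(len(costs)), key=lambda i: costs[i][0] - costs[i][1])
    ny = set(order[:n])
    return sum(c[0] if i in ny else c[1] for i, c in enumerate(costs))
```

For the classic example [(10,20), (30,200), (400,50), (30,20)] this yields the optimal total of 110.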
 adr September 23, 2018