Chris
A new species has been discovered and we are trying to understand their form of communication. So we start by identifying their alphabet. Mr. Brown managed to steal a vocabulary whose words are in ascending order, and a team has already identified the letters. So we now have the ordered words as sequences of numbers; print every letter once, in ascending order.
- Chris in United States
[3,2]
[4,3,2]
[4,1,1]
[1, 2]
[1, 1]
alphabet: [3, 4, 2, 1]
@dizhu2016: geeks4geeks, leetcode, topcoder, or, more academic, there are good CS courses on MIT OCW (OpenCourseWare) which I found very inspiring. Wikipedia sometimes has good explanations, and not to forget CLRS (Introduction to Algorithms; Cormen, Leiserson, Rivest, Stein). And if you like math and consider yourself really smart, Knuth's The Art of Computer Programming is probably worth it (I'd take a year or so in a very quiet place to fully go through volume 1, I think...)
- Chris July 21, 2017
You can create a priority queue (max-heap) where the key is the rounded-up number of tasks per machine, that is (a[i]+b[i]-1)/b[i], assuming a[i] is the number of tasks for i, 0 <= i < N, i is an additional attribute of the queue item, and initially b[i] = 1 for all i.
Then you can distribute the B-N remaining machines by removing the top element from the queue and re-inserting it after b[i] was incremented. After these B-N operations the key of the top element is the answer. With a heap, each of these operations costs O(lg(N)), so the total is O(N + (B-N)*lg(N)).
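The described heap distribution could be sketched like this in Python 3 (my own naming; `a` holds the per-group task counts and `extra` is the B-N machines left to hand out):

```python
import heapq

def min_max_tasks(a, extra):
    # max-heap keyed by ceil(tasks / machines); heapq is a min-heap, so negate the key
    heap = [(-tasks, tasks, 1) for tasks in a]  # (-load, tasks, machines), b[i] = 1 initially
    heapq.heapify(heap)
    for _ in range(extra):  # hand out the remaining B-N machines one by one
        _, tasks, machines = heapq.heappop(heap)  # group with the highest load
        machines += 1
        load = -(-tasks // machines)  # ceil division
        heapq.heappush(heap, (-load, tasks, machines))
    return -heap[0][0]  # the key of the top element is the answer
```

Each pop/push pair costs O(lg(N)), which is where the extra log factor comes from.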
This question can be separated into two categories:
a) discrete time:
the resolution of time is coarse enough that one can count per "time unit",
e.g. people entering per hour, minute, second, millisecond
b) continuous time:
e.g. start and end times are floating point values
if you have continuous time, a good approach is to separate start
and end times and sort both arrays. Start at the beginning of both
arrays and always take the smaller of the two next values, incrementing the
number of people in the room if the next smallest value is in the start
array and decrementing if it's in the end-times array.
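The continuous-time sweep just described could look like this (a sketch with my own naming; intervals are (start, end) pairs, and on a tie the departure is processed first):

```python
def max_people_continuous(intervals):
    # separate start and end times and sort both arrays
    starts = sorted(i[0] for i in intervals)
    ends = sorted(i[1] for i in intervals)
    people = max_people = 0
    s = e = 0
    while s < len(starts):
        if starts[s] < ends[e]:  # next event is an arrival
            people += 1
            max_people = max(max_people, people)
            s += 1
        else:                    # next event is a departure
            people -= 1
            e += 1
    return max_people
```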
this question has coarse-grained discrete time, so per hour just
collect how many are leaving (x) and how many are entering (y) as
change = y - x, then go through the array and count...
in python 3, assuming time values 0 <= time_value < 24
def max_people_in_room(intervals):
    changes = [0 for i in range(24)]
    for i in intervals:
        changes[i[0]] += 1
        changes[i[1]] -= 1
    people = 0
    max_people = 0
    for c in changes:
        people += c
        max_people = max(max_people, people)
    return max_people

print(max_people_in_room([(1,4), (3,5), (2,7), (5,10)]))
@sxs155933: Dijkstra is good on graphs with a weight function that results in different positive weights for edges. If the weights are always the same, Dijkstra doesn't help; it's actually just an overcomplicated BFS - so it doesn't hurt either, except maybe for the performance hit of the heap in the priority queue... A* with its heuristic is a natural enhancement of Dijkstra because it does not only look at the already explored paths and pick the shortest of those, it also "guesses" the unexplored part and uses that hint when selecting the best node. That's all the magic... I mean, if you remove all the boilerplate and comments from my code above, it's maybe 10 lines of code... and once you understand it, it kind of hurts seeing people running BFS or Dijkstra on a map problem... that's why I tried to print the map with the explored nodes; compare it to BFS, you'll be surprised.
But anyway, for the interviews you probably don't need it, so it's a question of how to optimize your personal strategy.
@NoOne: it's then just linear runtime O(dividend/divisor), whereas aone actually placed a logarithm in there (O(lg(dividend/divisor))).
1. terminology:
quotient = dividend / divisor
2. assumptions, constraints:
a) integer values (including signs)
b) positive and negative
c) since this is Python, integers are 'unbounded' so there is no need to check for overflow, but with Java or C++ you really should check for overflow!
3. solution
1) turn the signs around until only positive integers remain
2) now do a 'high school division on paper' kind of thing, in base 2, e.g.
1001001101 : 101 (that is 589 / 5, since 1001001101 = 512 + 64 + 8 + 4 + 1)
1) 1001 / 101 = 1 remainder 100
2) 1000 / 101 = 1 remainder 11
3) 110 / 101 = 1 remainder 1
4) 11 / 101 = 0 remainder 11
5) 111 / 101 = 1 remainder 10
6) 100 / 101 = 0 remainder 100
7) 1001 / 101 = 1 remainder 100
= 1110101 Remainder 100
def division(dividend, divisor):
    # division by zero case
    if divisor == 0: raise ZeroDivisionError()
    # turn signs to get rid of -dividend, -divisor
    if dividend < 0: return -division(-dividend, divisor)
    if divisor < 0: return -division(dividend, -divisor)
    # sdiv = divisor * 2^n, with the smallest n such that sdiv >= dividend
    sdiv = divisor
    while sdiv < dividend:
        sdiv <<= 1
    # remove divisor * 2^i from the dividend while shifting sdiv down to divisor * 2^0;
    # the quotient is doubled in each iteration
    quotient = 0
    while sdiv >= divisor:
        quotient <<= 1
        if dividend >= sdiv:
            dividend -= sdiv
            quotient |= 1
        sdiv >>= 1
    return quotient

print(division(981, 12) == 81)
print(division(11, 12) == 0)
print(division(13, 13) == 1)
print(division(99191, 15) == 6612)
print(division(-1, 1) == -1)
print(division(1, -2) == 0)
print(division(2, -1) == -2)
print(division(-9, -3) == 3)
@sxs155933: this is correct, although you need a way to track, for each origin, whether you already visited a field. So the count is good for detecting completion, but you still need a per-origin "visited" flag...
Something like the following; I hope I didn't type in too many bugs, but it should definitely convey the idea... this is for a generic graph, one would need to adapt it to the matrix setting.
from collections import deque

def find_center(adjacents, origins):
    # adjacents: dictionary of lists, e.g. {0: [1, 2, 3], 4: [1, 3], 5: [1]} for a directed graph
    # origins: a list of vertices, e.g. [0, 2]
    queue = deque([(o, o, 0) for o in origins])  # (original origin, current node, distance)
    visited = {o: {o: 0} for o in origins}  # vertex as 1st key, origin as 2nd key, distance as value
    while queue:
        origin, current, dist = queue.popleft()
        if len(visited.get(current, {})) == len(origins):
            return sum(visited[current].values())  # done, return sum of path lengths
        adjs = adjacents.get(current)
        if adjs is not None:
            for nxt in adjs:
                vis = visited.setdefault(nxt, {})
                if origin not in vis:  # first time this origin reaches nxt
                    vis[origin] = dist + 1
                    queue.append((origin, nxt, dist + 1))
    return -1  # no vertex is reachable from all origins
@PraTrick:
1) how can you reply? I can't... pressing "submit" just doesn't do anything.
2) your question: well, 2+1+1+1*9 is not (2+1+1+1)*9 but (1*9)+2+1+1 because * takes precedence over +; that's why it should use 2*1*1*1*9 ...
Of course you can simplify by saying you ignore operator precedence, but then you really must state that; it's a major simplification and a "broad" assumption.
The whole thing would be much easier if precedence didn't matter, but I thought I'd want to stay with basic arithmetic.
@PraTrick: just saw, you removed your question ;-) never mind :-)
Assumptions:
- add a single {+,-,*,/} between two numbers forming an algebraic expression that should be maximized: (n-1) choices
- *,/ takes precedence over +,-
- the order in which the numbers occur is given by the array, no re-ordering
- positive and negative numbers occur
Solution:
First, I just write down the recursion, assuming x/0 is not defined and
thus not valid in integer space (one could also argue x/0 = +infinity...)
rec(prod, i) =
    max [
        prod + rec(array[i], i + 1),
        prod - rec(array[i], i + 1),
        rec(prod * array[i], i + 1),
        rec(prod / array[i], i + 1) if array[i] != 0
    ] if i < n
    prod if i = n
this will do it... it's exponential: O(4^n).
Greedy choice won't help because e.g.:
-5 * 5 * 5 * -1 > -5 + 5 + 5 + -1
but
-5 * 5 < -5 + 5
etc...
or
2 * 1 * 4 > 2 + 1 + 4
but
2 * 1 * 2 < 2 + 1 + 2
etc...
now one can go and chase down cases where fewer
choices will do in the recursion. For example, it's not immediately
obvious when division helps at all, and there might be cases where one
doesn't need to explore the division branch.
The recursion has two arguments, so I'd create a table like the one
below and analyze further...
case |prod | array[i] | {operations}, comment
----------------------------------------------------------
1.a |= 0 | = 0 | {+}, doesn't matter, + will do it
b | | = 1 | {+,-}, +: e.g. 0+1*8, -: e.g. 0-1*-8
c | | = -1 | {+,-}, see above
d | | > 1 | {+,-}, see above
e | | < -1 | {+,-}, see above
----------------------------------------------------------
2.a |= 1 | = 0 | {+}, (see case 1.a)
b | | = 1 | {+,-}, +: e.g. 1+1*8, -: e.g. 1-1*-8
c | | = -1 | {+,-}, see above
d | | > 1 | {+,-}, see above
e | | < -1 | {+,-}, see above
----------------------------------------------------------
3.a |= -1 | = 0 | {+} (see case 1.a)
b | | = 1 | etc.
c | | = -1 | ..
d | | > 1 |
e | | < -1 |
----------------------------------------------------------
4.a |> 1 | = 0 |
b | | = 1 |
c | | = -1 |
d | | > 1 |
e | | < -1 |
----------------------------------------------------------
5.a |< -1 | = 0 |
b | | = 1 |
c | | = -1 |
d | | > 1 |
e | | < -1 |
there are many cases; I'm not sure this leads to an elegant solution. This is,
however, how one would figure out whether '/' is useful at all...
Or, if I were more familiar with the known NP-complete and NP-hard problems,
I would start a discussion to prove the thing is NP-complete or NP-hard.
Or I would ask for a hint, but I'm not quite sure one exists.
Or I'd start to add constraints, like no 0s, no negative numbers, no 1s etc., to
simplify the problem.
anyway, here the exponential, recursive version in python 3 with an incomplete
list of tests.
def max_number(arr):
    def sub_problem(prod, i):
        if i == len(arr):
            return prod
        c = max(prod + sub_problem(arr[i], i + 1),
                prod - sub_problem(arr[i], i + 1),
                sub_problem(prod * arr[i], i + 1))
        if arr[i] != 0:
            return max(c, sub_problem(prod / arr[i], i + 1))
        else:
            return c
    return sub_problem(0, 0)
print(max_number([2,1,1,1])) #2+1+1+1 = 5
print(max_number([2,1,1,1,9])) # 2*1*1*1*9 = 18
print(max_number([2,1,-2])) #2+1--2 = 5
print(max_number([2,0,2])) # 2+0+2 = 4
print(max_number([-5,1,1,1,1,-3])) # 15
I assume, target is the sum of all fields visited by traveling from an arbitrary start to an arbitrary end coordinate using any possible path in the matrix.
The question needs clarifications like:
- which movements to form a path (what is adjacent to a field, e.g. left, right, top, down)
- can a field be visited more than once e.g. by going forth and back between two fields
- are there negative values in the matrix
approaches I thought of:
1) think about visiting all fields in a zig zag pattern, then start by looking for a sub-array in this array.
This would be possible in O(n*m) but it would reduce allowed paths
2) if the elements are unique, it's a subset-sum problem, which is NP-complete; however, not every subset will form a path. There is a pseudo-polynomial algorithm to solve the subset-sum problem, but then one would need to check whether any of the subsets summing up to target forms a path. Feasible but dirty to code.
3) try all the paths that cover the whole field and then find a sub-array of the path that results in the desired sum. The problem here is that there are many paths, roughly 3^(n*m) from a single starting point, and all starting points must be tried. So there are O(3^(n*m)*n*m) paths to explore. I'd go for that one, although I'm not happy with it...
Since not the whole path is used but only a sub-path of it, the code below continuously checks whether any sub-path of the currently explored path satisfies the target. The method is known as finding a sum in a sub-array and can be done in O(length(array)).
The idea is: to achieve a target value with a sub-array ending at index i, there is a part of the array on the left (maybe empty) that is not used. If I continuously collect the running sum and put it into a hash set, I'll find that left part if it exists:
def has_subarray_with_sum(array, target):
    run_sum = 0
    sums = set([0])
    for x in array:
        run_sum += x
        if run_sum - target in sums:
            return True
        sums.add(run_sum)
    return False
The rest is DFS using recursion; a stack implementation is possible (it was actually my first attempt) but I found it slightly harder to read.
in python 3
def find_sum_in_matrix(matrix, target):
    def dfs(pos, run_sum):
        # visit matrix at pos and check if the sum needed to reach target was encountered
        run_sum += matrix[pos[0]][pos[1]]
        if run_sum - target in sums:
            return True
        # check if a new sum has been built that is a sub-array on the left of the current path
        new_sum = run_sum not in sums
        if new_sum:
            sums.add(run_sum)
        # I can't visit that node again, since I only check paths that visit a given node once
        visited.add(pos)
        for move in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
            next = (pos[0] + move[0], pos[1] + move[1])
            if 0 <= next[0] < n and 0 <= next[1] < m and next not in visited:
                if dfs(next, run_sum):
                    return True
        # if the running sum was added on this level, need to remove it
        if new_sum:
            sums.remove(run_sum)
        # this branch of the DFS is finished, so I free up this node
        visited.remove(pos)
        return False

    n = len(matrix)                      # rows
    m = len(matrix[0]) if n > 0 else 0   # columns
    if m == 0: return False
    sums = set([0])
    visited = set()
    for x in range(0, n):
        for y in range(0, m):
            if dfs((x, y), 0):
                return True
    return False
print(find_sum_in_matrix(\
[[1,1,1],\
[1,1,1],\
[1,1,1]],\
10)) # False
print(find_sum_in_matrix(\
[[1,1,1],\
[1,1,1],\
[1,1,1]],\
9)) # True
print(find_sum_in_matrix(\
[[1, 5, 10],\
[1, 1, -1],\
[1, 1, 5]],\
20)) # True
print(find_sum_in_matrix(\
[[1, 5, 10],\
[1, 2, -2],\
[1, 1, 5]],\
25)) # True (must start at (2,2) or (0, 2))
Assumption for simplicity:
1) every letter is used at most once in the upper cased input string
"AaB" wouldn't be a valid input, "AbC" is ok
2) input string is a mix of lower and upper case letters (no numbers, etc.)
do permutations by swapping elements in a recursion:
recursion(string, pos) = string is a permutation, if pos = 0
recurse(swap(string, pos, j), pos - 1),
for j = pos - 1 to 0
swap(string, i, j): creates a new string that is equal to string with
characters at position i and j exchanged.
the recursion tree would look like this:
pos = 2 "ABC"
------------------------------------
/ | \
pos = 1 "ABC" "BAC" "CBA"
------ ------ ------
/ \ / \ / \
pos = 0 "ABC" "ACB" "BAC" "BCA" "CBA" "CAB"
now, adding the upper / lower case is just similar, the recursion then
is:
recursion(string, pos) = string is a permutation, if pos = 0
recurse(swap(string, pos, j), pos - 1),
for j = pos - 1 to 0
recurse(swap_lower(string, pos, j))
for j = pos - 1 to 0
swap_lower is like swap, but it will convert the character that is
placed at position pos to lower case.
It assumes the input string is all upper case.
this is O(n!*2^n), where n is the length of the string or other way
to think of, there are n! permutations and for each of this every
letter can be upper or lower case, so another 2^n
in python 3
def permutation_case(input):
    def aux(i):
        if i == n:
            print(''.join(map(chr, input_a)))
        else:
            for j in range(i, n):
                # swap to get a changing prefix and recurse
                input_a[i], input_a[j] = input_a[j], input_a[i]
                aux(i + 1)
                # change upper/lower case of the just swapped element in the prefix and recurse
                temp = input_a[i]
                input_a[i] = input_a[i] + (ord('a') - ord('A'))
                aux(i + 1)
                # set the old 'string' back
                input_a[i], input_a[j] = input_a[j], temp
    n = len(input)
    input_a = bytearray(input, 'ascii').upper()
    aux(0)

permutation_case('aBc')
I assume the question is, where do I need to place the boss, so that it minimizes the total distance he has to walk if he walks to each office space once.
Or, more formal, find the field that minimizes the sum of the shortest path from this field to each office space.
The first approach is to run a BFS simultaneously (incrementing distance one by one) from each office space. The field that is first visited by all BFSs is the place for the boss.
This is inefficient but guarantees the best solution. More efficient solutions are thinkable, such as using heuristics to cover only parts of the map (like in A*). Another approach might be to run A* towards the center of gravity of all office spaces, assuming that there are no closed-off areas (like rectangles surrounded by walls). Maybe one can even adapt the A* behavior by updating the center of gravity based on the progress of the simultaneously running A*s, etc.
A similar but less difficult question was:
https://www.careercup.com/question?id=5084886537863168
which gives an idea about A*.
However, in an interview I'd probably stay on safe ground and use BFS from multiple origins.
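A minimal sketch of that safe-ground variant (my own map encoding: 'H' office space, '#' wall, ' ' free; one BFS per office, then pick the free field reached by all offices with the smallest distance sum):

```python
from collections import deque

def best_boss_position(grid):
    # grid: list of equal-length strings; returns the (row, col) of the best free field
    rows, cols = len(grid), len(grid[0])
    houses = [(r, c) for r in range(rows) for c in range(cols) if grid[r][c] == 'H']
    total = [[0] * cols for _ in range(rows)]    # summed distances per field
    reached = [[0] * cols for _ in range(rows)]  # how many offices reached the field
    for h in houses:
        dist = {h: 0}
        queue = deque([h])
        while queue:  # plain BFS from this office
            r, c = queue.popleft()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols \
                        and grid[nr][nc] != '#' and (nr, nc) not in dist:
                    dist[(nr, nc)] = dist[(r, c)] + 1
                    queue.append((nr, nc))
        for (r, c), d in dist.items():
            total[r][c] += d
            reached[r][c] += 1
    best = None
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == ' ' and reached[r][c] == len(houses) \
                    and (best is None or total[r][c] < total[best[0]][best[1]]):
                best = (r, c)
    return best
```

Running the BFSs truly simultaneously (level by level) would allow stopping at the first field all frontiers meet; the version above trades that for simplicity.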
Creating permutations of a number looks easy, but actually we should
take care of special cases. Let's assume it's about positive
integers (no decimals and no sign). Let's further assume the integer
may contain a repeated digit (e.g. 112) and it's desired to get
only unique permutations.
I work with strings here because they are simpler to understand and work with.
The problem becomes: find all unique permutations of a string, where
I tolerate leading '0's in input and output.
1) let's assume the string has only unique digits (e.g. "123") in
it. The recursion would look:
recursion(string, pos) = string is a permutation, if pos = 0
recurse(swap(string, pos, j), pos - 1),
for j = pos - 1 to 0
swap(string, i, j): creates a new string that is equal to string with
characters at position i and j exchanged.
the recursion tree would look like this:
pos = 2 "123"
------------------------------------
/ | \
pos = 1 "123" "213" "312"
------ ------ ------
/ \ / \ / \
pos = 0 "123" "132" "213" "231" "312" "321"
2) ensure uniqueness of the result if the input contains repeating digits:
one way to do that is by maintaining a set of generated results and then
only create a result if it's not already in that set. That's fine, but
looking up the hash table will require to create a hash and thus it
will not behave ideally at runtime to O(n*n!) instead of O(n!)
It's okay, but not elegant. A more elegant approach for me was to better
understand the recursion and prevent duplication out of this understanding.
What happens with the recursion is, that it starts with a suffix (or
prefix in 'illustrations' above) of size 0 and grows that prefix
("", "1", "12", "123") while it recurses on the suffix which creates all
permutations of that suffix. That said, if a prefix repeats,
its suffix will be a permutation of another suffix of the same prefix,
which results in duplicates. So it seems it's enough to avoid repeating
prefixes, since for the suffixes all permutations will be created anyway.
Avoiding a repeated prefix is simple: just avoid placing the same
character twice at the same position when extending a prefix.
e.g. when extending "" with "1" it's "1" and if I extend it again with "1", I
have a repeating prefix:
pos = 2 "112"
-----------------------------------------
/ X: 2nd 1 \
pos = 1 "112" "211"
------ ------
/ \ / X: 2nd 1
pos = 0 "112" "121" "211"
or an other example
pos = 2 "100"
-----------------------------------------
/ | X
pos = 1 "100" "010"
----- ------
/ X / \
pos = 0 "100" "010" "001"
this will also give a better runtime in the case n > E, where
E is the number of characters in the alphabet (10 in the case of digits)
and n is the length of the string,
because it's O(min(E^n, n!)); for reasonably small alphabets, for example
binary, this is then O(2^n) vs. O(n!) - an improvement.
the code in python 3
def unique_permutations(str):
    def aux(pos):
        if pos == 0:
            # create a string from the bytearray and append it to the results
            result.append("".join(map(chr, bstr)))
        else:
            used_chars = set()
            for j in range(pos, -1, -1):
                if bstr[j] in used_chars:
                    continue
                used_chars.add(bstr[j])
                bstr[pos], bstr[j] = bstr[j], bstr[pos]  # swap
                aux(pos - 1)
                bstr[pos], bstr[j] = bstr[j], bstr[pos]  # swap back
    result = []
    bstr = bytearray(str, 'ascii')
    aux(len(bstr) - 1)
    return result
print(unique_permutations("123"))
print(unique_permutations("111"))
print(unique_permutations("100"))
print(unique_permutations("1010101"))
print(unique_permutations("0000001"))
#output:
#['123', '213', '132', '312', '321', '231']
#['111']
#['100', '010', '001']
#['1010101', '0110101', '1100101', '1001101', '0101101', '0011101', '1011001', '0111001',
# '1101001', '1110001', '1010011', '0110011', '1100011', '1001011', '0101011', '0011011',
# '1000111', '0100111', '0010111', '0001111', '1010110', '0110110', '1100110', '1001110',
# '0101110', '0011110', '1011010', '0111010', '1101010', '1110010', '1011100', '0111100',
# '1101100', '1110100', '1111000']
#['0000001', '0000010', '0000100', '0001000', '0010000', '0100000', '1000000']
@NoOne, interpreting your code:
1) create a set of all seen letters
2) create a list from that set
3) sort that list - according to what? Here it seems a topological sort (of a DAG) is needed... since from the vocabulary you only get pairs like 3 --> 4, 3 --> 1, 4 --> 1 and 2 --> 1,
so 3 --> 4 --> 2 --> 1, if --> indicates "comes before"
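That topological sort can be sketched with Kahn's algorithm (my own naming; `words` are the stolen vocabulary entries as sequences of letter numbers, and only the first differing letter of each adjacent pair yields an edge):

```python
from collections import defaultdict, deque

def alphabet_order(words):
    letters = {c for w in words for c in w}
    succ = defaultdict(set)               # letter -> letters known to come after it
    indeg = {c: 0 for c in letters}       # number of known predecessors
    for w1, w2 in zip(words, words[1:]):  # adjacent words in the sorted vocabulary
        for a, b in zip(w1, w2):
            if a != b:                    # first differing letter decides the order
                if b not in succ[a]:
                    succ[a].add(b)
                    indeg[b] += 1
                break
    queue = deque(c for c in letters if indeg[c] == 0)
    order = []
    while queue:                          # Kahn's algorithm: repeatedly emit sources
        c = queue.popleft()
        order.append(c)
        for d in succ[c]:
            indeg[d] -= 1
            if indeg[d] == 0:
                queue.append(d)
    return order
```

For the vocabulary in the question this derives exactly the pairs above (3 before 4, 3 before 1, 4 before 1, 2 before 1); note that the relative position of letters never compared (here 2 vs. 3) is not determined by the data, so [3, 4, 2, 1] is one of several consistent answers.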
cool problem, twists one's head a bit, especially after a birthday party
Clarification:
- string with single-digit numbers between 0 and 9, no negative numbers (no '-' in the string)
- algebraic interpretation is infix, where * takes precedence
- the order in which the numbers occur is given by the string sequence (no re-ordering)
Solution:
- 0: a * 0 = 0 but n + 0 = n, always pick '+' before and after a '0'
- 1: a * 1 < a + 1, but 9*1*5 > 9+1*5, given that multiplication takes precedence
Samples:
- 1*2 = 2
1+2 = 3
- 5+1+1+2 = 9
5*1*1*2 = 10
Therefore, the rule is not as simple as placing '+'-signs around the 0s and 1s.
the working recursion, supposing array[i] contains an integer with 0 <= array[i] <= 9, is:
- sub_problem(prefix_prod, i) =
      max [
          prefix_prod + sub_problem(array[i], i + 1),
          sub_problem(prefix_prod * array[i], i + 1)
      ] if i < n
      prefix_prod if i = n
- this recursion works for 0's and 1's automatically, but it's inefficient
because for every number it will make the choice '+' or '*' which will
create two branches at each level, thus leading to O(2^n) runtime -
not good -
- special case 1:
if num is a '0' we do not need to explore the '*' branch because it will lose;
add
prefix_prod + sub_problem(array[i], i + 1) if i < n and array[i] = 0
to the recursion to handle this special case more efficiently
- special case 2:
if num is > 1 and prefix_prod > 1 it's clear that multiplication will win
add
sub_problem(prefix_prod * array[i], i + 1) if i < n and array[i] > 1 and prefix_prod > 1
to the recursion to handle that special case
- special case 3:
finally, if num is 1 and prefix_prod is 1, multiplication will clearly lose,
because adding the 1 gains 1 while multiplying by 1 gains nothing;
add
prefix_prod + sub_problem(array[i], i + 1) if i < n and array[i] = 1 and prefix_prod = 1
things like "3111111111111111111112" will still lead to exponential behaviour depending on
the length of the '1's sequence. Of course one can improve this with more sophisticated
"look ahead" etc.
in python 3
def max_number(str):
    def sub_problem(prefix_prod, i):
        # base case
        if i == n:
            return prefix_prod
        # since the input is a string, num = array[i] from above
        num = ord(str[i]) - ord('0')
        # special cases 1 and 3 (see above)
        if num == 0 or (num == 1 and prefix_prod == 1):
            return prefix_prod + sub_problem(num, i + 1)
        # special case 2 (see above)
        if num > 1 and prefix_prod > 1:
            return sub_problem(prefix_prod * num, i + 1)
        # recursion if it's not decidable whether to add or multiply
        return max(prefix_prod + sub_problem(num, i + 1), sub_problem(prefix_prod * num, i + 1))
    n = len(str)
    return sub_problem(0, 0)
print(max_number('2111')) #2+1+1+1 = 5
print(max_number('21119')) # 2*1*1*1*9 = 18 vs. 2+1+1+1+9 = 14
print(max_number('212')) # 2+1+2 = 5 vs. 2*1*2 = 4
print(max_number('202')) # 2+0+2 = 4 vs. 2*0*2 = 0 or 2+0*2 ...
print(max_number('89')) # 72
output:
5
18
5
4
72
@missing: how can you pass 1,1,1,1,1,1,0,0,0,0? Please provide the jumps.
The best I see from your description is 0 --> 2, 2 --> 5, 5 --> 9, but index 9 is 0, so it can't pass.
Another interpretation of your question would be 0 --> 3, 3 --> 7, but 7 is 0, so it can't pass either.
I don't see how you can pass that array with the problem statement given; please clarify.
EDITED July 7th '17:
- changed the assumption about start condition
- added the decrease jump into the recursion
- specified more clearly assumption what "jump of 1" means
- corrected some typos
Assumption:
- the first field must be 1, because that's where it starts, then one can decide to
jump to the 2nd field (keep jump of 1) or the 3rd (increase)
- jump of 1 means jumping from e.g. index 0 to 1, increase that jump by 1, it jumps
from 1 to 3 etc.
Solution:
First start by writing down the recursion of the problem:
dp(i, j) = max [
    dp(i + j, j),          // if i < n and arr[i]
    dp(i + j + 1, j + 1),  // if i < n and arr[i]
    dp(i + j - 1, j - 1)   // if i < n and arr[i] and j > 1
]  // max just acts as 'or', since dp(...) returns True / False
dp(i, j) = False, if i < n and not arr[i]
dp(i, j) = True, if i >= n
i = position, j = jump, n = array length, arr = input array
passable = dp(0, 1)
this has O(3^n) time complexity, because each recursion step spawns up to three
calls, so the number of calls triples at every step, and there are up to n steps.
Space is O(n), used by the call stack. There is a tighter bound, though.
An improvement is memoization, which brings it down to O(n^2) time complexity.
Alternatively, an iterative solution can be created with O(n^2) time and
space complexity.
in python 3
# the recursive solution
def is_passable(arr):
    def dp(i, j):
        if i >= n:
            return True  # jumped past the end
        if not arr[i]:
            return False  # landed on 0
        key = i * (n + 1) + j
        val = memo.get(key)
        if val is not None:
            return val
        val = dp(i + j, j)  # try with the same jump distance
        if not val:  # if not successful, try increasing the jump
            val = dp(i + j + 1, j + 1)
        if not val and j > 1:  # if not successful and the jump can be decreased
            val = dp(i + j - 1, j - 1)
        memo[key] = val
        return val
    n = len(arr)
    memo = {}
    return dp(0, 1)

# the more elegant iterative solution
def is_passable_itr(arr):
    n = len(arr)
    dp = [[False for i in range(n + 2)] for j in range(n)]
    dp[0][1] = True  # start condition
    for i in range(1, n):
        if not arr[i]:
            continue
        for j in range(1, i + 1):  # this upper bound is loose; the maximal jump at i grows much slower
            if dp[i - j][j] or dp[i - j][j - 1] or dp[i - j][j + 1]:
                dp[i][j] = True
                if i + j + 1 >= n:
                    return True
    return False
print(is_passable([1,1,0,1,0,0,1,0,0,1])) # sample with increasing steps
print(is_passable_itr([1,1,0,1,0,0,1,0,0,1]))
print(is_passable([1,1,1,1,1,1,0,0,0,0])) # sample given by missing, that shouldn't work according to my understanding of the problem
print(is_passable_itr([1,1,1,1,1,1,0,0,0,0]))
print(is_passable([1,0,1,0,0,1,0,1,0,0,1])) # sample with increasing steps, decreasing and increasing step
print(is_passable_itr([1,0,1,0,0,1,0,1,0,0,1])) # sample with increasing steps, decreasing and increasing step
#output:
#True
#True
#False
#False
#True
#True
With the described DS, string concatenation requires O(n+m) per concatenation operation, where n is the length of one string and m the length of the other: it creates a new array and copies the two existing ones into it. When appending many strings to one string that keeps growing, m << n. Here it's worth implementing some kind of smart buffering, where the concatenated string allocates more memory than needed for a single append operation, so that a number of concatenation or append operations can be done without reallocation. Of course that needs balancing (wasted memory vs. reallocation and copying). Having an immutable string object and a mutable string builder seems to make sense. With the right memory-allocation strategy one gets an amortized O(m) per append, with m << n.
Another alternative might be a concatenated-string class, which is a list that holds only pointers to the original, immutable string objects...
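A toy sketch of such a string builder in Python 3 (illustrative only; class and method names are mine):

```python
class StringBuilder:
    # keeps a bytearray with spare capacity and doubles it when full,
    # so n appended bytes cost amortized O(1) each instead of O(n) per append
    def __init__(self):
        self.buf = bytearray(16)
        self.length = 0

    def append(self, s):
        data = s.encode('ascii')
        while self.length + len(data) > len(self.buf):
            self.buf.extend(bytes(len(self.buf)))  # double the capacity
        self.buf[self.length:self.length + len(data)] = data
        self.length += len(data)

    def __str__(self):
        # materialize the immutable result once, in O(n)
        return self.buf[:self.length].decode('ascii')
```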
If it was one start and one end-point in a graph, one would do
BFS either from the start, or two BFSs from start and end, which is
considerably faster (one origin vs. two origins).
But this question has multiple end-points from a single
start on a map. Let's assume one can go left, right, up, down
(from 0,0 to 1,1 is two steps, not one --> Manhattan distance).
The problem with BFS on a map is that it explores in all
directions with equal priority, which builds up a big frontier
very quickly and therefore explores a lot of space. A* helps here
because it introduces a heuristic that pulls the
exploration towards the target (similar to hill climbing, with
similar issues). If the heuristic is guaranteed not to overestimate
the effective distance, it will find the shortest path
much faster. If the heuristic overestimates, it gets faster
(explores less left and right) but is not guaranteed to find
the shortest path.
The question with many destinations (houses) and a single
start, what is the best approach?
1. Single-origin BFS vs. BFS originating at all houses and the start:
if many houses form a narrow cluster in the map and the start is
far away, that's the best case for a single origin, because it will
more or less discover all houses together.
But the area a single-origin search explores grows quadratically with the
distance from the origin as the frontier advances, vs. a constant factor for
each house. Multiple origins are faster.
2. Multi-origin vs. A*:
A* can't simply be run from both start and end here, because there is no
guarantee the two frontiers will meet and form a shortest path.
It will depend on the size of the map and the number of trees. A*
in the worst case is worse than BFS due to the priority queue.
One can construct examples that are significantly slower with A*.
the core of the A* version is (fully commented version further below):
def find_shortest_path_to_all(map):
    def estimated_distance(start, end):
        return abs(start[0] - end[0]) + abs(start[1] - end[1])
    tot_distance = 0
    for house in map.houses:
        distance_to_house = -1
        explored = {}
        queue = [(0, 0, map.start)]
        while queue:
            top = heapq.heappop(queue)
            if top[2] == house:
                distance_to_house = top[1]
                break
            for v in map.get_adjacents(top[2]):
                d_v = top[1] + 1
                d_v_tot = d_v + estimated_distance(v, house)
                current_d_to_v = explored.get(v)
                if current_d_to_v is None or current_d_to_v > d_v:
                    explored[v] = d_v
                    heapq.heappush(queue, (d_v_tot * 100000 - len(queue), d_v, v))
        if distance_to_house == -1:
            return -1
        else:
            tot_distance += distance_to_house
    return tot_distance
Below, an implementation using a simple A*, printing the map and the explored
coordinates.
in python 3
import random
import heapq

def find_shortest_path_to_all_astar(map):
    def estimated_distance(start, end):
        return (abs(start[0] - end[0]) + abs(start[1] - end[1])) * 1  # replace 1 with 2 to see the effect of overestimating
    tot_distance = 0
    for house in map.houses:
        distance_to_house = -1
        explored = {}
        queue = [(0, 0, map.start)]  # start-point, with highest prio and distance 0 from origin (start)
        while queue:
            top = heapq.heappop(queue)  # most promising option to further explore
            if top[2] == house:  # reached the house?
                distance_to_house = top[1]
                break
            explored[top[2]] = -1  # mark node (coordinate) as explored
            for v in map.get_adjacents(top[2]):
                d_v = top[1] + 1  # distance to next node v = distance to current + 1
                d_v_tot = d_v + estimated_distance(v, house)  # distance to v + estimated distance to house
                visit_d = explored.get(v)  # check if v was already explored or queued for exploring
                if visit_d is None or visit_d > d_v:  # only if we can reach v with a shorter distance
                    explored[v] = d_v  # distance needed to reach v
                    heapq.heappush(queue, (d_v_tot * 100000 - len(queue), d_v, v))
                    # - len(...) is a trick to prefer recently added coordinates over existing ones with the
                    # same estimate to the house; this forces the algorithm to stay with one branch instead
                    # of exploring all branches of the same estimated distance in parallel...
        if distance_to_house == -1:
            print('cant reach house {0}, map:'.format(house))
            map.print(explored)
            print('')
            return -1  # abort, return -1
        else:
            tot_distance += distance_to_house
            print('{0} to reach house {1}, map:'.format(distance_to_house, house))
            map.print(explored)
            print('')
    return tot_distance
# represents the map (houses and trees)
class Map:
    # generates a reproducible, random map
    def __init__(self, m, n, no_trees, no_houses):
        self.m = m
        self.n = n
        self.matrix = [[' ' for j in range(n)] for i in range(m)]
        self.houses = []
        random.seed(1317)
        for i in range(no_trees):
            pos = self._get_random_unused_coordinate()
            self.matrix[pos[0]][pos[1]] = 'T'
        for i in range(no_houses):
            pos = self._get_random_unused_coordinate()
            self.houses.append(pos)
            self.matrix[pos[0]][pos[1]] = 'H'
        self.start = self._get_random_unused_coordinate()
        self.matrix[self.start[0]][self.start[1]] = 'S'

    def _get_random_unused_coordinate(self):
        while True:
            x = random.randrange(0, self.m)
            y = random.randrange(0, self.n)
            if self.matrix[x][y] == ' ':
                return (x, y)

    def print(self, explored=None):
        for y in range(self.n):
            for x in range(self.m):
                if self.matrix[x][y] == ' ':
                    state = explored.get((x, y))
                    if state == -1:
                        print('e', end='')  # explored
                    elif state is not None:
                        print('´', end='')  # in the queue but not explored
                    else:
                        print(' ', end='')
                else:
                    print(self.matrix[x][y], end='')
            print('')

    def get_adjacents(self, pos):
        adj = []
        for d in [(0, 1), (0, -1), (1, 0), (-1, 0)]:
            nx = pos[0] + d[0]
            ny = pos[1] + d[1]
            if nx >= 0 and ny >= 0 and nx < self.m and ny < self.n and self.matrix[nx][ny] != 'T':
                adj.append((nx, ny))
        return adj
m = Map(80, 25, 150, 5)
find_shortest_path_to_all_astar(m)
@funk
reading your posts, a solution the interviewer was trying to push you towards was to store pairs of (index, value) in a vector sorted by index, so A = [15, 0, 0, 0, 2, 3] would become [(0, 15), (4, 2), (5, 3)]
if you have two of them, the dot product is something like a merge of two sorted lists (A, B are such vectors of pairs sorted by index and R is the result with the same properties)
a = 0
b = 0
R = []
while a < len(A) and b < len(B):
    if A[a][0] > B[b][0]:
        b += 1
    elif A[a][0] < B[b][0]:
        a += 1
    else:
        R.append((A[a][0], A[a][1] * B[b][1]))
        a += 1
        b += 1
It has the same O-properties as your solution but I'd expect it to run faster in practice because the constants involved in the O(n) are smaller (less work than a hash lookup, sequential memory reads (cache))...
- Chris July 06, 2017
Solution:
- Two reasonably easy solutions exist with O(n^2). One uses extra space
and the other doesn't.
- Input is an array arr of length n
- Sum to reach is x
1. Solution with hash table: O(n^2) time, O(n) space
- arr[a] + arr[b] + arr[c] = sum, where 0 <= a < b < c < n
- hash table val_idx contains all the values of arr
with the last index at which they occurred
- Try all combinations for a and b and check if sum-arr[a]-arr[b]
is in the HT with an index of last occurrence > b
2. Solution with sort: O(n^2) time, O(1) space:
- sort the array, now arr[a] <= arr[b] <= arr[c] if 0 <= a < b < c
- pick a in order, so the range [a+1..n) contains b and c
- search in the (sorted) range [a+1..n) for y = sum - arr[a]:
(the well-known sum-of-two-in-sorted-array question)
start with b = a+1 and c = n-1
while b < c:
    # if the smallest and largest are too big,
    # the largest can't be part of a solution
    if arr[b] + arr[c] > y: c -= 1
    # if the smallest and largest are too small,
    # no combination with the smallest will
    # work out
    elif arr[b] + arr[c] < y: b += 1
    else: found 3-sum
notice that some work is still done multiple times.
So it seems further optimization is possible - but not trivial.
python 3
def sum_of_three_ht(arr, x):
    val_idx = {}
    for a, va in enumerate(arr):
        val_idx[va] = a
    for a, va in enumerate(arr):
        for b in range(a + 1, len(arr)):
            c = val_idx.get(x - va - arr[b])
            if c is not None and c > b:
                return (True, (x, va, arr[b], arr[c]))
    return (False, (x, None, None, None))

def sum_of_three_sort(arr, x):
    n = len(arr)
    arr.sort()
    for a, va in enumerate(arr):
        b = a + 1
        c = n - 1
        y = x - va
        while b < c:
            vbc = arr[b] + arr[c]
            if vbc < y:
                b += 1
            elif vbc > y:
                c -= 1
            else:
                return (True, (x, va, arr[b], arr[c]))
    return (False, (x, None, None, None))
Normally you would use a set for that. But if you can restrict the range of characters to ASCII (vs. Unicode for example) you can use an array of booleans or a bit vector. Just mark every character as seen[s.charAt(i)] = true after checking seen[s.charAt(i)] ... I think you get the idea. This has time complexity O(n) and space complexity O(n) for the set (O(1) with the fixed-size boolean array).
Your solution has space complexity O(1) but time complexity O(n^2). That square comes from contains, which will loop from 0 to i, where i is looped from 0 to n. So n^2 / 2 to be exact...
edit:
no, it's not O(n^3) because substring(..).contains() is not nested (although there's a bit of unnecessary copying...)
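The boolean-array variant described above can be sketched as follows (a minimal sketch assuming plain ASCII input; `has_unique_chars` is a made-up name):

```python
def has_unique_chars(s):
    seen = [False] * 128  # one flag per ASCII code point
    for ch in s:
        code = ord(ch)
        if seen[code]:
            return False  # this character appeared before
        seen[code] = True
    return True
```

Replacing the list with a set gives the general Unicode version, at the cost of O(n) space.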
if all numbers are positive: a*b*c is largest if you pick the three largest numbers
if negative numbers are allowed, you have two cases: the product of the two smallest and the largest, or the product of the three largest
to find the three largest and two smallest numbers, iterate over the array and pick these numbers out (as if you were only scanning for max or min). This is k-selection and leads to O(n*k), but if k is constant, it's O(n), and since k is small it's reasonably efficient (among others due to caches on the CPU)
then calculate the two cases and return the max of them.
done.
if you need coaching... ;-)
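A quick sketch of the two-case idea (using heapq.nlargest/nsmallest for the k-selection; a hand-rolled single scan would do the same; the function name is made up):

```python
import heapq

def max_product_of_three(arr):
    # k-selection: three largest and two smallest, O(n) for constant k
    largest = heapq.nlargest(3, arr)
    smallest = heapq.nsmallest(2, arr)
    # case 1: three largest; case 2: two most negative times the largest
    return max(largest[0] * largest[1] * largest[2],
               smallest[0] * smallest[1] * largest[0])
```

For [-10, -10, 1, 3, 2] the two cases are 6 and 300, so 300 wins.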
a number can be a real, rational, integer or natural number:
let's assume we want at least integers:
- a decimal representation (slower, has little advantage, maybe printing is easier). Here we typically use 4 bits per decimal digit.
- a binary representation where you use a sequence of machine words that is dynamically grown if more space is needed (the number gets bigger)
- in either case, you need a bit/flag for the sign
With rational numbers, that's just having a numerator and denominator of type integer, and you get along with high-school math for all operations; maybe you need to look up the gcd algorithm.
real numbers grow in size and precision if you apply division. I'm not fit enough in number theory to answer how you need to implement the division so you can keep the precision. For a single division it's relatively simple, because you can easily identify the repeating sequence. But for example if you start calculating pi from a very long series ...?
how is addition done: basically how we learned it in school, digit by digit, carrying the overflow and considering signs. Subtraction is very similar. However, with the binary representation, instead of a decimal place you use a machine word and the add/sub on the ALU of the CPU, which is faster than shifting around manually.
Multiplication is similar to the early high-school method, as is division. But both division and multiplication have more efficient algorithms reducing the number of machine-word multiplications/divisions dramatically. Multiplication (e.g. Karatsuba, or FFT-based methods) is relatively easy to understand, whereas division is more complex (it is usually reduced back to multiplication).
MIT OCW 6.006, Lecture 11 found on [http]ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/lecture-videos/lecture-11-integer-arithmetic-karatsuba-multiplication/ has some good first steps into the topic
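As a concrete illustration of the word-by-word schoolbook addition described above (a sketch only: base 10**9 stands in for the machine word, little-endian word order, unsigned values; `big_add` is a made-up name):

```python
def big_add(a, b):
    # a, b: little-endian lists of base-10**9 "machine words"
    BASE = 10 ** 9
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        s = carry
        if i < len(a):
            s += a[i]
        if i < len(b):
            s += b[i]
        result.append(s % BASE)  # the word that stays in this position
        carry = s // BASE        # the overflow carried into the next word
    if carry:
        result.append(carry)     # the number grew by one word
    return result
```

Signs would be handled by a separate flag and by dispatching to a subtraction routine, as sketched in the text.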
Simply reversing the whole string produces "dlroW olleH", which isn't the right answer; but that wasn't the question. The question was to reverse each word while keeping the order of the words.
However, the question leaves open whether it's a java LinkedList (which is doubly linked) or a custom singly linked list.
One way to do it, if space doesn't matter, is to traverse the list, put the characters of a word on a stack and, when a space is reached, fill a new list with the elements from the stack by popping them (which reverses the order of this word)
however, a proper reverse would be in place without a stack, which is possible as well.
As much space complexity as possible? I'm not sure I understood that requirement; maybe it meant use as much space as you like.
A working, not so clean solution would be (issues: redundant code, use of a stack, creation of a new list)
import java.util.Stack;

class Node {
    Node next_ = null;
    char value_;

    public Node(char value) {
        value_ = value;
    }

    public static Node getReverseWords(Node n) {
        Node result = null;
        Node last = null;
        Stack<Character> s = new Stack<Character>();
        while (n != null) {
            // read word onto stack
            while (n != null && n.value_ != ' ') {
                s.push(n.value_);
                n = n.next_;
            }
            // put reversed word into the new list
            while (!s.empty()) {
                if (last != null) {
                    last.next_ = new Node(s.pop());
                    last = last.next_;
                } else {
                    result = new Node(s.pop());
                    last = result;
                }
            }
            // copy spaces, whether they are trailing or leading
            while (n != null && n.value_ == ' ') {
                if (last != null) {
                    last.next_ = new Node(' ');
                    last = last.next_;
                } else {
                    result = new Node(' ');
                    last = result;
                }
                n = n.next_;
            }
        }
        return result;
    }
}
Very open question, I'd cover these aspects:
1. What is the purpose of this test? Verify the search function, verify correct rendering, ... what is already tested? A single test should fit into some kind of test concept which gives an idea of what to test. A reasonable assumption could be that search has already been tested on different known samples and information is around on how the search performed compared to this fixed set of data.
(notice, there is no perfect search result, but there will be indications whether it improved or not compared to other "generations")
2. Define what must be tested, based on 1), e.g. verify that each item's price is <= that of the next, maybe for the first 20 pages, supposing it is very expensive or impossible to fetch all results. However, we could still miss a very low priced element that would only come after position 1000, or maybe even last due to an error. Now, several questions:
- sometimes search is designed to be fast vs. "perfectly accurate"; that means, in the case of Google for example, that servers that participate in a query and do not respond within a certain time will be ignored. Usually this is the case for not so important pages that have less redundancy etc. So a certain error might even be tolerated.
3. Maybe we have statistical data which gives a price distribution for the first 1000 items of a search and information on how this distribution changed over time. Maybe we only want to accept a certain change in this distribution
4. Maybe the test should be around performance, e.g. how fast the result is served or how fast the result is rendered. Here we'd need more information, like an SLO (service level objective) where we'd say: 99.9% of the time the first 50 results must be served within 100 ms.
5. Other tests could be that the system withstands "bad searches", which are searches that would tear down a server, etc.
input: input string
1. keep a set with all characters in a window [s, e]
2. start with s = 0
3. increment e as long as input[e] is not already in the set
[s, e) is now a candidate, maximize it
now increment s until input[s] == input[e], removing each input[s] from the set
go to 3
runs in O(n) time with O(n) space for the result
def find_longest_substring_wo_repeating_char(input):
    n = len(input)
    s = 0
    e = 0
    chars = set()
    result = ''
    while e < n:
        while e < n and input[e] not in chars:
            chars.add(input[e])
            e += 1
        if len(result) < e - s:
            result = input[s:e]
        if e < n:
            while input[s] != input[e]:
                chars.remove(input[s])
                s += 1
            chars.remove(input[s])
            s += 1
    return result
print(find_longest_substring_wo_repeating_char('aabbcde'))
print(find_longest_substring_wo_repeating_char('a'))
print(find_longest_substring_wo_repeating_char('ab'))
print(find_longest_substring_wo_repeating_char('abc'))
print(find_longest_substring_wo_repeating_char('abccc'))
print(find_longest_substring_wo_repeating_char('abcccdefg'))
# output:
# bcde
# a
# ab
# abc
# abc
# cdefg
decimal multiplication on strings does it, although converting the string to binary and doing a binary multiplication would obviously be more efficient
in python 3
def string_multiplication(a, b):
    result = []
    result_start_idx = 0
    for ai in map(lambda x: ord(x) - ord('0'), reversed(a)):
        overflow = 0
        result_idx = result_start_idx
        for b_idx in range(len(b) - 1, -1, -1):
            m = (ord(b[b_idx]) - ord('0')) * ai + overflow
            if result_idx < len(result):
                m += result[result_idx]
                result[result_idx] = m % 10
            else:
                result.append(m % 10)
            overflow = m // 10
            result_idx += 1
        if overflow > 0:
            result.append(overflow)
        result_start_idx += 1
    return ''.join(map(lambda x: chr(ord('0') + x), reversed(result)))
number_one = 193283492420348904832902348908239048823480823
number_two = 3248234890238902348823940990234
print(number_one*number_two)
print(string_multiplication(str(number_one), str(number_two)))
as a mysql query (the invoices with the smallest amount in 2016)
SELECT id, amount, supp_id FROM Invoice
WHERE year(date) = 2016 AND amount = (SELECT MIN(amount) FROM Invoice WHERE year(date) = 2016);
and the DDL to create the schema
CREATE TABLE Supplier
(
id INT8 NOT NULL AUTO_INCREMENT,
name VARCHAR(32),
PRIMARY KEY (id)
);
CREATE TABLE Invoice
(
id INT8 NOT NULL AUTO_INCREMENT,
supp_id INT8,
date datetime,
amount decimal(6, 2),
-- payment_date datetime,
-- paid_amount decimal(6, 2),
PRIMARY KEY (id)
);
INSERT INTO Supplier(name) VALUES('chris');
INSERT INTO Supplier(name) VALUES('frank');
INSERT INTO Supplier(name) VALUES('jane');
INSERT INTO Invoice(supp_id, date, amount) VALUES ((select id from Supplier WHERE name = 'chris'), STR_TO_DATE('2017-2-21 16:00','%Y-%m-%d %H:%i'), 110.5);
INSERT INTO Invoice(supp_id, date, amount) VALUES ((select id from Supplier WHERE name = 'chris'), STR_TO_DATE('2016-5-11 16:35','%Y-%m-%d %H:%i'), 95.15);
INSERT INTO Invoice(supp_id, date, amount) VALUES ((select id from Supplier WHERE name = 'chris'), STR_TO_DATE('2016-4-09 10:19','%Y-%m-%d %H:%i'), 85.95);
INSERT INTO Invoice(supp_id, date, amount) VALUES ((select id from Supplier WHERE name = 'chris'), STR_TO_DATE('2016-7-09 9:00','%Y-%m-%d %H:%i'), 95.15);
INSERT INTO Invoice(supp_id, date, amount) VALUES ((select id from Supplier WHERE name = 'jane'), STR_TO_DATE('2016-1-09 08:15','%Y-%m-%d %H:%i'), 95.15);
The problem is somewhat ambiguous. What's a redundant cash flow path?
I assume we just want to minimize the number of transactions.
After all transactions have been completed, every friend has a
net cash flow delta. That is either positive, 0 or negative.
I put them into a dictionary, where the balance is the key and the person
is the value.
Then I need to match positives to negatives until everything sums to 0. This
seems not trivial though.
let's say positives are: 1, 9, 13
and negatives are: -14, -2, -7
one way is (arrow indicates cash flow, value in parentheses is the new
value after the transaction):
1 (0) --> -7 (-6)
9 (3) --> -6 (0)
3 (1) --> -2 (0)
1 (0) --> -14 (-13)
13 (0) --> -13 (0)
but more efficient is:
13 (0) --> -14 (-1)
9 (2) --> -7 (0)
2 (0) --> -2 (0)
1 (0) --> -1 (0)
the problem here is to find the best solution, which seems not trivial.
I've implemented one that uses a greedy strategy which obviously won't
find the best solution all the time.
Appreciate a comment with the best solution; it seems like an NP problem.
But it certainly removes "redundant transactions", whatever that might
be exactly.
import heapq
from collections import defaultdict
def minimize_transaction(transactions):
    # transactions are tuples of (from, to, amount)
    # perform all transactions, which gives the net flow to and from each friend
    neto_flow = defaultdict(int)
    for transaction in transactions:
        neto_flow[transaction[0]] -= transaction[2]
        neto_flow[transaction[1]] += transaction[2]
    # negatives and positives are min-heaps to be populated
    negatives = []
    positives = []
    for person, amount in neto_flow.items():
        if amount < 0:
            heapq.heappush(negatives, (amount, person))
        elif amount > 0:
            heapq.heappush(positives, (-amount, person))  # python has a min heap per default...
    # optimize the flow using the two heaps
    while positives and negatives:
        pos = heapq.heappop(positives)
        neg = heapq.heappop(negatives)
        amount = max(pos[0], neg[0])  # both are negative values
        print("{0} sends ${1} to {2}".format(neg[1], -amount, pos[1]))
        if pos[0] - amount < 0:  # positive balance remains, push the remainder back
            heapq.heappush(positives, (pos[0] - amount, pos[1]))
        elif neg[0] - amount < 0:  # negative balance remains, push the remainder back
            heapq.heappush(negatives, (neg[0] - amount, neg[1]))
minimize_transaction([('jane', 'pete', 100),
('pete', 'tiffany', 80),
('tiffany', 'chris', 60),
('tiffany', 'jane', 10),
('chris', 'jane', 10),
('chris', 'pete', 10)])
#output
#jane sends $40 to chris
#jane sends $30 to pete
#jane sends $10 to tiffany
As Fernando pointed out, the problem is solvable in O(n).
More generally, the problem is a k-th element selection
problem that occurs in different flavours.
If k is small compared to n, it's worth keeping the k elements
in sorted order to select the kth in O(1) at the end of the algo.
time complexity: O(n*k), that is O(n) if k is constant (e.g. 2)
space complexity: O(n)
n is the number of characters in the input string
from collections import defaultdict
def kth_most_repeating_word(str, kth):
    i = 0
    n = len(str)
    word_freq = defaultdict(int)
    kcount = [(0, '') for i in range(kth)]
    while i < n:
        # extract word
        while i < n and str[i] == ' ': i += 1  # skip leading spaces
        if i == n: break
        j = i + 1
        while j < n and str[j] != ' ': j += 1  # find the end of the word
        word = str[i:j]
        i = j + 1
        # count word occurrence
        count = word_freq[word] + 1
        word_freq[word] = count
        # place the element in the sorted list of length k
        if kcount[-1][0] < count:
            # check if word is already in the top k (can be done more efficiently)
            k = len(kcount) - 1
            for m in range(kth):
                if kcount[m][1] == word:
                    k = m
                    break
            # do "insertion sort"
            kcount[k] = (count, word)
            while k > 0 and kcount[k - 1][0] < kcount[k][0]:
                kcount[k - 1], kcount[k] = kcount[k], kcount[k - 1]  # swap
                k -= 1
    # return result
    return kcount[-1][1] if kcount[-1][0] > 0 else None
assume n is the length of the sequence, potentially infinite; p is the given percentile
This question is quite interesting; as NoOne pointed out, it's not just selecting the kth element, because:
- n is not known until the iterator finally terminates
- there is no random access to the elements
- the number of elements may not fit into memory
- k is not known if n is not known, because k is given as a percentile
Options:
1) convert to an array and find the k-th element: does not deserve further comments. But it's O(n) space and O(n) runtime with quickselect.
2) select the kth element (with an array, binary tree, etc.): takes O(n*p) space and O(n*p*lg(n*p)) time; since p < 1 that's O(n) and O(n*lg(n)), so still no better in space if n gets close to infinite
3) probabilistic approach:
model a PMF (probability mass function) that represents the data seen so far. If the values in the stream have a small range, I'd count the occurrence of every value. Alternatively I'd create intervals, which creates a histogram. I can calculate the "estimated" p*n-th element by going to the right interval. I can be more exact by interpolating the value using linear, spline, etc. interpolation within the interval of interest. Interpolation only works if the PMF is nearly continuous.
The size of the intervals will define the memory consumption and the error.
If I know the data distribution in advance, e.g. that it's normal, poisson, ... I could even get along with only the mean and standard deviation, both of which can be computed online
in O(1)
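Option 3 above can be sketched with fixed-width buckets (a sketch under the assumption that the value range [lo, hi) is known up front; the class name is made up):

```python
class StreamingPercentile:
    # histogram-based estimate over a known value range [lo, hi)
    def __init__(self, lo, hi, buckets=1000):
        self.lo, self.hi = lo, hi
        self.width = (hi - lo) / buckets
        self.counts = [0] * buckets
        self.n = 0

    def add(self, x):
        # clamp into the last bucket in case x == hi
        idx = min(int((x - self.lo) / self.width), len(self.counts) - 1)
        self.counts[idx] += 1
        self.n += 1

    def estimate(self, p):
        # walk the histogram until p * n samples are covered
        target = p * self.n
        seen = 0
        for i, c in enumerate(self.counts):
            seen += c
            if seen >= target:
                return self.lo + (i + 0.5) * self.width  # bucket midpoint
        return self.hi
```

The bucket count trades memory against estimation error, exactly as described in the text; interpolation inside the winning bucket would sharpen the estimate further.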
the only slightly interesting thing about a singleton is initialization with multiple threads. The trick (double-checked locking) is to first check the instance variable; if it is not null, return it. If it is null, lock a critical section, check again whether the instance variable is still null, and only initialize if it is. Just don't do this in C++ (pre-C++11, without atomics/memory barriers) on multi-core platforms, because it won't work properly.
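That double-checked pattern can be sketched in Python like this (a sketch; in CPython the pattern is safe because attribute assignment under the GIL is effectively atomic, whereas the C++ caveat above is exactly about missing memory barriers):

```python
import threading

class Singleton:
    _instance = None
    _lock = threading.Lock()

    @classmethod
    def instance(cls):
        # fast path: no locking once initialized
        if cls._instance is None:
            with cls._lock:
                # re-check inside the critical section:
                # another thread may have initialized meanwhile
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance
```

Every caller gets the same object, and the lock is only ever taken during the first initialization.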
- Chris June 19, 2017
hum, it's a parallel computing pattern. In its most primitive form it's a semaphore that is released by the producer and waited for by the consumer. A bit more sophisticated would be to do the same with a variable that carries some information. The next level would be a buffer in between, and then the interesting questions come along: what if the buffer is full, etc.
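The bounded-buffer variant can be sketched with Python's queue.Queue, which handles the full/empty blocking internally (names are made up; `None` serves as an end-of-stream sentinel):

```python
import queue
import threading

def producer(q, items):
    for item in items:
        q.put(item)          # blocks if the buffer is full
    q.put(None)              # sentinel: no more items

def consumer(q, out):
    while True:
        item = q.get()       # blocks if the buffer is empty
        if item is None:
            break
        out.append(item)

q = queue.Queue(maxsize=2)   # bounded buffer of size 2
out = []
t1 = threading.Thread(target=producer, args=(q, [1, 2, 3, 4]))
t2 = threading.Thread(target=consumer, args=(q, out))
t1.start(); t2.start()
t1.join(); t2.join()
```

With maxsize=2 the producer really does block until the consumer drains items, which is the "buffer is full" case mentioned above.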
- Chris June 19, 2017
the recursion:
Rec(n, f1, f2):
    if n == 0: return 0
    c = 1 + Rec(n-1, BreadButter, f1)
    if f1 != Pizza: c += 1 + Rec(n-1, Pizza, f1)
    if f2 != Burger and f1 != Burger: c += 1 + Rec(n-1, Burger, f1)
    return c
then the solution is:
Rec(n, No, No)
this is O(3^n), can be O(n) if implemented with memoization or
as an iterative, array-based solution
if you are fit with combinatorics, you might create a closed-form
formula. Maybe someone takes the time for that; I found it a bit
complicated, given I don't like pizza or burger for breakfast...
No = 0
BreadButter = 1
Pizza = 2
Burger = 3

def numberOfWays4Breakfast(n):
    memo = {}
    def recursion(n, f1, f2):
        if n == 0: return 0
        key = n * 16 + f2 * 4 + f1
        c = memo.get(key)
        if c is not None: return c
        c = 1 + recursion(n - 1, BreadButter, f1)
        if f1 != Pizza: c += 1 + recursion(n - 1, Pizza, f1)
        if f2 != Burger: c += 1 + recursion(n - 1, Burger, f1)
        memo[key] = c
        return c
    return recursion(n, No, No)

print(numberOfWays4Breakfast(3))
Well the question is what is changed, if there is a key / value pair, just locate the node and overwrite the value. If there's only one value that defines the trees structure, assuming it's a BST, this will always affect a large part of the tree. B-tree might be a solution then.
- Chris June 19, 2017
1. do word stemming (eat, eating, eats, ate --> eat) for each word in a twitter message
2. for each word-id (an integer now), increase the occurrence count when a message is sent
3. for each word-id, decrease the occurrence count when a message is older than 24 hours (e.g. a batch job identifying these messages)
4. over this huge vector (hashtable) either iterate frequently to pick the 10 biggest and cache that result, or maintain the order using a tree (hashtable -> tree item, tree sorted by occurrence)
this has the advantage that it distributes well: k servers can handle the m words; if you want to know the top 10 words, ask each server for its top 10 words and take the top 10 out of those responses...
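Steps 2 and 4 in miniature, on a single server (a sketch; the 24-hour expiry of step 3 is assumed to have already filtered the messages, and the function name is made up):

```python
from collections import Counter

def top_k_words(messages, k=10):
    # messages: iterable of already-stemmed word lists inside the 24h window
    counts = Counter()
    for words in messages:
        counts.update(words)            # step 2: increase occurrence counts
    return [w for w, _ in counts.most_common(k)]  # step 4: pick the k biggest
```

The distributed version would run this per shard and merge the per-server top-10 lists, as described above.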
This can mean anything: the system was slow from the very beginning with one user, it got slow after introducing a new feature, it got slow after no obvious change, it got slow overnight, ... there are a thousand possible causes, ranging from classical scalability problems, over attacks and wrong configurations, to programming errors or unexpected user behavior, etc.
The approach to bring light into the dark would be:
1) Verify and understand the objectives
2) Based on data, create a hypothesis of what's wrong and prove it.
3) Solve it.
1: what does "slow" mean, what is the actual objective (e.g. 50 ms end-to-end in 99.9% of cases, from clients not further away than 3000 miles; or it may be the serving time, from the moment the request starts to the moment the response is sent out completely, etc.) Being unspecific about the service level objective is usually the first cause of many downstream problems. So, figure out what's needed and what should be measured first.
2: If there is monitoring information available, try to understand which measurement changed in correlation with the "slow" response time; depending on the situation, other potential evidence may be available.
Another approach may be to gather the people who built it, get their intuition on what's wrong, create a hypothesis and try to prove it. If that's not available either, start with standard monitoring information from the webserver and the database, narrow down an isolated and reproducible case that is "slow", understand it, maybe instrument code and analyze accumulated runtimes in call trees, try to identify the component that causes the most delay, etc. (work your way downwards)
I think the key is: do not change anything until you understand what the goal is and what causes the problem. Too many people start out based on their personal experience, which might or might not apply in a specific situation.
do manual division, and if the remainder reappears, the pattern will repeat. The code below will put the repeating sequence in "()", e.g. 1/3 = 0.(3), 4/3 = 1.(3), 1/6 = 0.1(6)
public String fractionToDecimal(int numerator, int denominator) {
    StringBuilder sb = new StringBuilder();
    if ((numerator < 0) ^ (denominator < 0)) sb.append("-");
    long num = Math.abs((long)numerator);
    long den = Math.abs((long)denominator);
    long rem = num % den;
    sb.append(num / den);
    if (rem > 0) {
        sb.append(".");
        HashMap<Integer, Integer> seenRemDigitIdx = new HashMap<>();
        while (rem != 0) {
            int lastSeen = seenRemDigitIdx.getOrDefault((int)rem, -1);
            if (lastSeen != -1) {
                sb.insert(lastSeen, '(');
                sb.append(')');
                break;
            }
            seenRemDigitIdx.put((int)rem, sb.length());
            rem *= 10;
            sb.append(rem / den);
            rem = rem % den;
        }
    } else if (sb.toString().equals("-0")) {
        return "0";
    }
    return sb.toString();
}
I didn't really find a solution to get ALL possible combinations, maybe someone has an idea. However, the solution below will produce SOME combinations (as was asked for).
time complexity is O(#edges) until the first results are produced; after that it is linear in the number of results.
# Assumptions:
# 1) the example describes a DAG with one connected component. I assume the
# graph can be characterized in general like this: directed, a-cyclic, only
# one connected component, minimal two nodes
# 2) print some (and not all) possible topological orders
# 3) for simplicity the graph is given by edges [["Shirt", "Tie"], ["Belt", "Trousers"]]
#
# Solution:
# for any graph using DFS with arrival / departure time of a node can be used
# to analyze the graph and perform a topological sort. The problem is, this
# traversal creates only one possible topological sort. If I want an other,
# I have to change the start and the order in which I visit the adjacents.
# keeping track of this without doing way too much work is not trivial.
#
# Since it is a DAG there is an easier way to get some topological orders:
# 0. prepare the graph as adjacency list representation
# 1. find the nodes with in-degree = 0 (roots), give them order 0
# and start a BFS with those nodes as frontier
# 2. expand the frontier, do not check if a node was already visited
# just label the node with the distance from the root (no cycles)
# 3. create any order where the nodes distance from the root increase
from itertools import permutations
from collections import defaultdict

def allTopologicalOrders(edges):
    # 0. create an adjacency list representation of the graph given by edges
    nodes = defaultdict(set)
    for e in edges:
        nodes[e[0]].add(e[1])
    # 1. find roots
    roots = set(nodes.keys())
    for u, adjs in nodes.items():
        roots.difference_update(adjs)
    # 2. expand until expansion is no further possible
    #    (loop will not terminate with cycles)
    maxD = 0
    nodeOrder = defaultdict(lambda: 0)
    stack = list(roots)
    while len(stack) > 0:
        u = stack.pop()        # node to expand from
        du = nodeOrder[u]      # distance of that node from "start"
        for v in nodes[u]:     # for all adjacent nodes v
            dv = nodeOrder[v]  # current distance of v from "start"
            if dv <= du:       # if v is currently at a shorter distance from "start"
                maxD = max(maxD, du + 1)
                nodeOrder[v] = du + 1
                stack.append(v)
    # 3. create an array that has a list of nodes on each
    #    index representing the distance from "start"
    #    and create all possible permutations for each distance group...
    d2Node = [[] for i in range(maxD + 1)]
    for u, du in nodeOrder.items():
        d2Node[du].append(u)
    result = [[]]
    for ng in d2Node:
        nr = []
        ngp = list(permutations(ng))
        for r in result:
            for p in ngp:
                nr.append(r + list(p))
        result = nr
    return result
print(allTopologicalOrders([["Shirt", "Tie"], ["Shirt", "Belt"], ["Trouser", "Belt"], ["Trouser", "Socks"], ["Socks", "Shoes"]]))
#[['Trouser', 'Shirt', 'Tie', 'Belt', 'Socks', 'Shoes'],
# ['Trouser', 'Shirt', 'Tie', 'Socks', 'Belt', 'Shoes'],
# ['Trouser', 'Shirt', 'Belt', 'Tie', 'Socks', 'Shoes'],
# ['Trouser', 'Shirt', 'Belt', 'Socks', 'Tie', 'Shoes'],
# ['Trouser', 'Shirt', 'Socks', 'Tie', 'Belt', 'Shoes'],
# ['Trouser', 'Shirt', 'Socks', 'Belt', 'Tie', 'Shoes'],
# ['Shirt', 'Trouser', 'Tie', 'Belt', 'Socks', 'Shoes'],
# ['Shirt', 'Trouser', 'Tie', 'Socks', 'Belt', 'Shoes'],
# ['Shirt', 'Trouser', 'Belt', 'Tie', 'Socks', 'Shoes'],
# ['Shirt', 'Trouser', 'Belt', 'Socks', 'Tie', 'Shoes'],
# ['Shirt', 'Trouser', 'Socks', 'Tie', 'Belt', 'Shoes'],
# ['Shirt', 'Trouser', 'Socks', 'Belt', 'Tie', 'Shoes']]
print(allTopologicalOrders([["Shirt", "Tie"], ["Shirt", "Belt"], ["Trouser", "Belt"], ["Trouser", "Socks"], ["Socks", "Shoes"], ["Shoes", "Shirt"]]))
#[['Trouser', 'Socks', 'Shoes', 'Shirt', 'Tie', 'Belt'], ['Trouser', 'Socks', 'Shoes', 'Shirt', 'Belt', 'Tie']]
# I assume you want to know all unique combinations where k characters
# are replaced by the integer value k in a string. Thereby you can replace
# multiple runs of characters, but the integers placed into the string have
# to be separated by at least a single character...
# for simplicity I assume the initial text doesn't contain numbers (0..9)
# the recursion, assuming the word does not contain numeric values:
# g(text) = toString(k) + substring(text, k, k + 1) ** g(substring(text, k + 1, n - 1)), for all 0 < k < n - 1, n = length of text
#           ++ substring(text, 1, 1) ** g(substring(text, 1, n - 1))
# **: every result of g is prefixed with toString(k) + substring ...
# ++: the two lists are concatenated
def anotateWithNumCharacterPlaceholder(text):
    n = len(text)
    if n <= 0: return [""]
    result = [str(n)]
    for k in range(1, n - 1):
        pfx = str(k) + text[k : k + 1]
        for sfx in anotateWithNumCharacterPlaceholder(text[k + 1:]):
            result.append(pfx + sfx)
    pfx = text[:1]
    for sfx in anotateWithNumCharacterPlaceholder(text[1:]):
        result.append(pfx + sfx)
    return result
print(anotateWithNumCharacterPlaceholder("ABCD"))
# output
# ['4', '1B2', '1BC1', '1BCD', '2C1', '2CD', 'A3', 'A1C1', 'A1CD', 'AB2', 'ABC1', 'ABCD']
edited: of course memoization would avoid repetition of work (different prefix, same suffix) and speed things up
- Chris June 06, 2017
you mean split, sort and merge (concatenation wouldn't necessarily produce a sorted result)
first pass: read 10 * 1GB sort in memory, write to disk (10 files)
second pass: open the 10 1 GB files (as streams), create a new file and merge the content of the other files into the one target file (pick the smallest record on top of each stream and write it to the target) until done. then delete the 10 intermediary files.
first pass: O(n*lg(n)) second pass, O(n): total O(n*lg(n)) but due to disc activity, a bit slow
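The two passes in miniature (a sketch: in-memory lists stand in for the ten 1 GB files; heapq.merge keeps only one record per stream in memory, which is the point of the second pass):

```python
import heapq

def external_sort(chunks):
    # first pass: sort each chunk individually
    # (on disk this would write ten sorted intermediary files)
    sorted_runs = [sorted(c) for c in chunks]
    # second pass: k-way merge of the sorted runs; heapq.merge consumes
    # the runs lazily, always picking the smallest head element
    return list(heapq.merge(*sorted_runs))
```

With real files, each `sorted_run` would be a generator reading one file line by line, so peak memory stays at roughly one chunk for pass one and one record per stream for pass two.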
I assumed:
- removeLastElement removes the last element added.
- getLastElement returns the last element added.
- addElement adds an element
so, in stack terminology, removeLastElement is pop, getLastElement is top and addElement is push. How to deal with getMin() in O(1)? Calculate the minimum on the go: currentMin = min(currentMin, newElement). If an element is popped from the stack, the minimum that was valid before the last element was pushed should be restored. So, just store the current minimum along with each element on the stack and restore it from there when popping.
class SpecialContainer:
    def __init__(self):
        self.stack = []  # list of pairs: first the value, second the minimum before this item
        self.min = None

    # returns the last element or None
    def getLastElement(self):
        e = self.__top()
        if e is None: return None
        return e[0]

    # returns the current minimum or None if the container is empty
    def getMin(self):
        return self.min

    # adds value
    def addElement(self, value):
        self.stack.append([value, self.min])
        if self.min is None or value < self.min: self.min = value

    # removes the last element or does nothing if the stack is empty
    def removeLastElement(self):
        e = self.__top()
        if e is not None:
            self.min = e[1]
            self.stack.pop()

    def __top(self):
        if len(self.stack) == 0: return None
        else: return self.stack[len(self.stack) - 1]
sc = SpecialContainer()
print(sc.getMin()) #None
sc.addElement(1)
print(sc.getMin()) #1
sc.removeLastElement()
print(sc.getMin()) #None
sc.addElement(2)
print(sc.getMin()) #2
sc.addElement(1)
print(sc.getMin()) #1
sc.addElement(6)
print(sc.getMin()) #1
sc.removeLastElement()
print(sc.getMin()) #1
sc.removeLastElement()
print(sc.getMin()) #2
# 1) assume the BST constraint is:
#    all values in the left sub-tree (L) < value of root node < all values in the right sub-tree (R)
#    a preorder traversal looks like this: T{L}{R}
#    an inorder traversal looks like this: {L}T{R}
#    {L} is a sequence of integers that satisfy L < T, ...
# 2) use the array both as source (with index s) and destination (with index d), starting with s=0, d=0
#    let aux(s, d, pv) be a function that returns s, d after a sub-tree has been re-ordered
#    to in-order, with pv = value of the parent node
#    - start condition: aux(0, 0, +infinity)
#    - aux:
#      > let v = array[s] (value of the node/sub-tree) and advance s
#      > if array[s] < v, recurse on (s, d, v) for the left sub-tree
#      > place array[d] = v (inorder: the value goes between left and right sub-tree)
#      > if array[s] < pv (the value of the parent node): recurse on (s, d, pv) for the right sub-tree
# 3) done
#
# space complexity: O(h), where h is the height of the BST (stack size)
# time complexity: O(n)
def fromPreOrderToInOrder(nums):
    def aux(nums, s, d, pv):  # s: source index, d: destination index, pv: parent value
        v = nums[s]
        s += 1
        if s < len(nums) and nums[s] < v:  # next value belongs to the left sub-tree
            s, d = aux(nums, s, d, v)
        nums[d] = v  # inorder: place the value between left and right sub-tree
        d += 1
        if s < len(nums) and nums[s] < pv:  # next value belongs to the right sub-tree
            s, d = aux(nums, s, d, pv)
        return s, d
    aux(nums, 0, 0, float('inf'))
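To see it work end to end, here is the routine repeated with a quick self-contained check; the example tree (a balanced BST over 1..7 with root 4) is my own choice, any valid BST preorder works:

```python
def fromPreOrderToInOrder(nums):
    def aux(nums, s, d, pv):  # s: source index, d: destination index, pv: parent value
        v = nums[s]
        s += 1
        if s < len(nums) and nums[s] < v:   # left sub-tree
            s, d = aux(nums, s, d, v)
        nums[d] = v                         # place node value in inorder position
        d += 1
        if s < len(nums) and nums[s] < pv:  # right sub-tree
            s, d = aux(nums, s, d, pv)
        return s, d
    aux(nums, 0, 0, float('inf'))

nums = [4, 2, 1, 3, 6, 5, 7]  # preorder of the BST with root 4
fromPreOrderToInOrder(nums)
print(nums)  # [1, 2, 3, 4, 5, 6, 7]
```

Note that the conversion happens in place: the destination index d always trails the source index s, so no value is overwritten before it has been read.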
if I remember right, that overlap handling is memmove, not memcpy: memmove checks on which end the buffers overlap and then copies from start to end or from end to start accordingly, while memcpy's behavior is undefined for overlapping buffers... What was the question anyway?
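The direction choice memmove makes can be sketched in a few lines; `overlap_copy` is a made-up name for illustration, copying n bytes inside one buffer:

```python
def overlap_copy(buf, dst, src, n):
    # Copy n bytes inside buf from offset src to offset dst, choosing the
    # copy direction so an overlapping copy still gives the right result
    # (this is the memmove behavior; a naive forward copy would clobber
    # source bytes when dst > src and the ranges overlap).
    if dst < src:
        for i in range(n):            # forward copy is safe
            buf[dst + i] = buf[src + i]
    elif dst > src:
        for i in reversed(range(n)):  # backward copy is safe
            buf[dst + i] = buf[src + i]

b = bytearray(b"abcdef")
overlap_copy(b, 2, 0, 4)  # overlapping ranges, dst > src
# b is now bytearray(b"ababcd")
```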
- Chris July 21, 2017