david.sicong.zhou
BAN USER
according to the master theorem, it reduces to big theta (n)?!
there are some small n log n issues which occur in the real world.
edit: when i say becomes n from n log n,
n is the size of the fully merged array
gee whiz i sure hope you're an avid reader! theres 2 sections...
assumptions:
strategy is a 1 thread can merge a pair of arrays, and thus multiple threads can do some simultaneous merging (and we preserve the sorted quality obviously)
for convenience in thinking, assume k arrays are equal length
assume we have more threads and cpus than we will ever need
=========================
improve the time complexity: just theory section
A single thread does mergesort in n log n
we derived that from the master theorem which DOES account for the halving of array size in the recursion formula.
I would like to account for the changing array size. lets use the master theorem.
(if you ignore changing array size you get O(log n) for magically uniform merges hahaha)
n comparisons, split the thing in half, etc for merge sort.
Instead of T(n) = 2T(n/2) + n with a single thread
since we do it in parallel the split is done simultaneously
I imagine the formula becomes T(n) = T(n/2) + n
b still = 2
a now = 1 from 2
now n^(log_2 1) = 1 which is smaller than f(n) = n.
running time is dominated near the root
we get case 3 of the master theorem
the regularity condition easily holds when a = 1 and b is not negative
log_2 1 = 0, which is smaller than c.
case 3 says the time complexity = big theta (f(n)) = big theta (n)
woah. thats definetly an improvement!
=========================
"if so, how?": an implementation
implementation wise i'd have a queue of arrays that need to get worked on, new arrays fresh from a merge get reinserted. a nice way of dealing with unbounded buffer, producer-consumer situations is to employ a semaphore.
this introduces some rare small O(n log n) situations...
in the super rare case where threads finish their merges simultaneously they will have to insert/pop the queue ONE thread at a time as the semaphore's counter must not have a race condition. There are still O(n log n) times that threads can get blocked from locking
also dammit we need to tell when we're finished.
so then we'd need a counter for how many merges we do in order to deduce that we've finished, which would also be incremented O(n log n) times, and O(n) to actually calculate the number of merges done when complete.
Again these are really small things but they could dominate eventually haha. i'd test that by getting employed and using company infrastructure if it ever practically got that big ;)
- david.sicong.zhou May 21, 2017