jvermette
BAN USERYeah... I tried to explain above in my previous comment, is there something specific missing there? I'm not sure how it translates to the NP-complete vertex cover problem, and the only two options there are that either it doesn't or my solution is incomplete. Either that or I just changed the face of computer science ;). My guess is that (assuming my solution works) this translates at most to a limited subset of that problem, which has an easier solution.
The basic idea here is that in $disparates we're keeping a list of the non-intersecting sets. With each fan bitstring we "and" it to each item already in $disparates. If at any time we get a non-zero answer, there's at least one bit in the new bit string which intersects with that group. We replace that spot in $disparates with the and-result.
If we reach the end of the current list of $disparates, it means this fan likes players which don't intersect any encountered set. It's a disparate set of players.
At the end, we have in $disparates a list of non-intersecting sets of players. With max possible length of the number of players. Select one from each, that's the answer.
This works because for each bit string if they fit into a group (an item in $disparates) then they share a common bit. And by fitting in, that bit string shrinks the group if they represent a subset of that group.
take A:10101, B:10100, C:00001. In three passes $disparates is {10101} (A), then {10100} (A&B), then {10100, 00001} (A&B, C).
You see from this that the possible answers are 10001 and 00101 (take one bit from each ending item in $disparates). Note that in step 2 {10100} we lose bit 5 because the algorithm found that A and B intersect only in bits 1 and 3. C happens to contain bit 5 so it's then added back in as a disparate intersection. At that point $disparates contains an entry with just one bit (00001) which means that that bit (bit 5 here) is required to be in the answer. That's why the total size of $disparates is max at the total number of bits -- if we ever get to that size, each entry contains just one bit and any new bit string will match at least one... and for that matter we can stop since we'd know that all bits need to be in the answer.
So... could you explain how this problem converts to vertex cover? That would be an interesting thing to hear, since there has to be something wrong on one side or the other.
Thanks -- I was just having a hard time believing I was analyzing tha right.
I think this can be done in 2^(2n) or less. If you're able to gauge it clearly would you look at my solution posted below and see if I got that bound estimate right? I have been thinking it might actually be 2^(n logn) since the length of the 2^n digit permutations range evenly from 1-n.
Both substrings always need to have the same length so it's the more simple Hamming distance -- substitution only, no insertion or deletion. "Dissimilarity for each set S is measured by the number of index positions where characters of both strings do not match"
Also note that for any substring of one which is shorter in length than D, it's a valid set with every other same-length substring of the other. So the larger D is compared to the string lengths, the simpler the problem becomes.
Do you know what the time bound on this is? I'm a moron with time bounds, but it seems like for each digit you're executing the function three times on the remaining digits and so forth. Is that O(3^n)? Please don't ream me for suggesting the idiotic, I'm having a hard time with this one.
- jvermette October 29, 2013I did it in php because it's cheap and dirty. First calculates a possible permutation of the digits, and only at that point does it calculate the permutations of the +/-. That should save a lot of operations, since other solutions which calculate the permutations along with the +/- at the same time will be calculating the same digit permutation over and over again with different combinations of +/-.
This problem is NP and huge, yes? it seems like for each of the 2^n digit permutations there are 2^n permutations of +/- on the digits. At the risk of asking something really stupid, is that O(2^(2n)) bound?
It uses O(n) extra space though. There's commented code in there to return an answer array which would use much more, but this version just prints. I'm aware this doesn't fit the output requirements of the problem.
This algorithm works up to the max int, could be made to work for much larger numbers using GMP. But the universe might end before that completes.
<?php;
$tests = array(
'12345',
'123412341234',
'3462',
);
function isWallPrime($int) {
if (($int % 2) == 0) return 2;
if (($int % 3) == 0) return 3;
if (($int % 5) == 0) return 5;
if (($int % 7) == 0) return 7;
}
function getWallPrimes($str) {
$localAnswers = array();
$runningNumbers = array();
h_getWallPrimes(intval($str), $localAnswers, $runningNumbers);
return $localAnswers;
}
function h_getWallPrimes($int, &$localAnswers, &$runningNumbers) {
if ($int == 0 ) {
$runningAnswer = array();
h_getWallPrimes_arith(0, 0, $runningNumbers, $localAnswers, $runningAnswer);
return;
}
$int10 = 10 * $int;
$rnLen = count($runningNumbers);
for ($mod = 10 ; $mod < $int10 ; $mod *= 10) {
$runningNumbers[] = $int % $mod;
h_getWallPrimes(floor($int/$mod), $localAnswers, $runningNumbers);
$runningNumbers = array_slice($runningNumbers, 0, $rnLen);
}
}
function h_getWallPrimes_arith($runningNum, $index, &$runningNumbers, &$localAnswers, &$runningAnswer) {
if (! isset($runningNumbers[$index])) {
$ret = isWallPrime($runningNum);
if ($ret) {
//$localAnswers[] = array($runningNum, array_reverse($runningAnswer), $ret);
echo 'Wallprime ', $runningNum, "\t", 'divisible by ', $ret, "\t", 'Permutation ', json_encode(array_reverse($runningAnswer)), "\n";
}
return;
}
$newNum = $runningNumbers[$index];
$runningAnswer[] = $newNum;
h_getWallPrimes_arith($runningNum+$newNum, $index+1, $runningNumbers, $localAnswers, $runningAnswer);
array_pop($runningAnswer);
$runningAnswer[] = 0-$newNum;
h_getWallPrimes_arith($runningNum-$newNum, $index+1, $runningNumbers, $localAnswers, $runningAnswer);
array_pop($runningAnswer);
}
//=======================================
//=======================================
echo "\n\n";
foreach ($tests as $str) {
$ret = getWallPrimes($str);
echo "\n\n";
}
?>
That seems an interesting point -- if the problem truly is that you "get the input on a single line" that suggests its a string of numbers and spaces. In that case could you avoid having to search the string for all the spaces to split out the numbers? I think that makes it a linear left to right search of O(n) in the obvious way, since you have to do that to even get the input in list form. Any better solution in this case?
If we take it as a list, clearly Urik's solution is best and looks like O(log n).
Ah you're right. It breaks when there's a max which never gets caught by an equal or greater max on the right but which has mini-troughs in it. I don't think I have time to adjust, but I'm sure it could be fixed in the same time bound by using a stack for $lastMax instead of a simple value. The concept remains the same.
- jvermette October 26, 2013Seems as simple as traversing left to right and recording a "potential store" based on when the bars start decreasing. When you next get to a bar the same height or higher than the previous highest, save the store as fact and start with that next height. Repeat. If you end without recording a potential store as fact, discard it because that would be a hill running off to the end.
Working code, O(n):
<?php ;
$bars = array(4, 1, 6, 4, 1, 6, 3, 1, 4, 6);
$lastMax = 0;
$potentialStore = 0;
$finalStore = 0;
foreach ($bars as $height) {
//End of trough?
if ($height >= $lastMax) {
$finalStore += $potentialStore;
$potentialStore = 0;
$lastMax = $height;
}
//Not the end. Do we have a last max or are we in initial runoff?
if (! $lastMax) continue;
//We're in a trough. Add the diff
$potentialStore += $lastMax - $height;
}
echo "\n\n", $finalStore, "\n\n";
?>
The way I read {4, 1, 6, 4, 1, 6, 3, 1, 4, 6}, total water stored is 20. (zero indexed) spot 1 stores 3, spot 3 stores 2, spot 4 stores 5, spot 6 stores 3, spot 7 stores 5, spot 8 stores 2. 3+2+5+3+5+2=20. I think this solution doesn't account for multiple troughs.
Assuming I understand the problem correctly -- I just submitted a very small O(n) solution that handles that.
First I must apologize, in that last comment I kept saying "bitwise-or" but as you can see from the original post and code it's "bitwise-and". I just fixed the comment.
Then to your question -- 101,110,001 would yield disparates of 100,001 since the first two combine to 100 and the third is non-intersecting at 001. So the answer is players 1 and 3. And this is the correct answer since all fans like either player 1 or player 3.
Sure. In the test case above with
'11111',
'01000',
'00100',
'11110',
'00001'
we are running through this fan list item by item and doing bitwise-and.
With the first item 11111 there is nothing to compare it to -- just put it in $disparates
Second item 01000 we run through disparates until we get a bitwise-and match or get to the end "11111" is the only thing in disparates. bitwise-and yields 01000. We replace that spot in $disparates with 01000.
Third item 00100 we compare in $disparates in the same way. The only item is 01000. Bitwise-and yields 00000 -- no match. Add 01000 as a second item.
Fourth item 11110 yields a match with the first item 01000 in $disparates. It yields 01000, we "replace" again but it's a no-op since they're the same.
Fifth item 00001 yields 00000 bitwise-and match with both items in $disparates. Add it as a third item.
We're now done with the fan list. We have three items in $disparates (01000, 00100, 00001). These represent the groups of fans who like non-intersecting ("disparate") groups of players. In this case that represents players 2, 3, 5. So that's our answer -- players 2, 3, 5.
If we run this with the second, commented out test case, it yields $disparates having "10000", "00001", and "01010". The answer would be players (1, 2, 5), or (1, 4, 5).
The biggest that $disparates can get is 5 items, since if it's 5 items then they're 10000, 01000, 00100, 00010, and 00001. All other bitwise-and's will match one of those. In fact if it ever gets to 5 items we can stop since all players are needed.
So we're doing one pass of the fan list, O(N), and for each one we're checking at worse K items in $disparates, O(K). Total O(KN).
Run through the list and do a bitwise 'and', and maintain a list of the non-intersecting results. If the bit-and intersects any of the previously non-intersecting items , update the item with the new bit-and result. If it doesn't, add that fan to the non-intersecting list. At the end you'll have a list of all combinations of players which don't intersect with each other. Choose one player from each item, and you're done.
You have to run through the raw fan list once at O(K). The largest the non-intersecting list can possibly get is N. In the worst case the first n-1 fans individually only like the first N-1 players, and the rest of the fans only like the Nth player -- O(N) for each fan for the non-intersecting checks. Making this O(KN).
Here is working PHP code:
<?php
$test = array(
'11111',
'01000',
'00100',
'11110',
'00001'
/*
'10101',
'01010',
'00001',
'10000',
'01110'
*/
);
function compactAnswer(&$arr) {
$disparates = array();
foreach ($arr as $i) {
$i = bindec($i);
$match = false;
foreach ($disparates as $dNum => $dItem) {
$matchNum = $i & $dItem;
if ($matchNum) {
$disparates[$dNum] = $matchNum;
$match = true;
break;
}
}
if (! $match) {
$disparates[] = $i;
}
}
return $disparates;
}
echo "\n\n\n";
echo 'Original: ';
foreach ($test as $i) echo $i, ', ';
echo "\n";
$test = compactAnswer($test);
echo 'Compacted: ';
foreach ($test as $i) echo decbin($i), ', ';
echo "\n", 'need ', count($test), ' players', "\n\n\n";
?>
#include <iostream>
#include <list>
#include <string>
#include <utility>
using namespace std;
typedef pair<string,string> rePair;
typedef list<rePair> rePairList;
bool doMatch(string re, string str) {
int rePos = re.size() - 1;
int strPos = str.size() - 1;
while (rePos >= 0 && strPos >= 0) {
char reChar = re[rePos];
char strChar = str[strPos];
//Wildcard
if (reChar == '*') {
//See if it matches without the wildcard
bool ret = doMatch(string(re, 0, rePos-1), string(str, 0, strPos+1));
if (ret) return true;
//No match without it. match one by one and check again
//if it matches without the wild.
while (str[strPos] == re[rePos-1]) {
strPos--;
ret = doMatch(string(re, 0, rePos-1), string(str, 0, strPos+1));
if (ret) return true;
}
//Pass the wildcard and loop.
rePos -= 2;
continue;
}
//dot and literal matches
if (reChar == '.' || (reChar == strChar)) {
rePos--;
strPos--;
continue;
}
//Not a match!
return false;
}
//Left at the beginning of both?
return (rePos < 0 && strPos < 0);
}
int main() {
rePairList tests;
tests.push_back(rePair("az*b", "azzzzzb")); //yes
tests.push_back(rePair("azz*zb", "azzzzzb")); //yes
tests.push_back(rePair("az*.....z*b", "azzzzzb")); //yes
tests.push_back(rePair("az*.z*b", "azzzzzb")); //yes
tests.push_back(rePair("az.b", "azxb")); //yes
tests.push_back(rePair("azxb", "azxb")); //yes
tests.push_back(rePair("azxb", "azab")); //no
for (rePairList::iterator it = tests.begin() ; it != tests.end() ; it++) {
bool ret = doMatch(it->first, it->second);
cout << it->first << " / " << it->second << " --> " << ret << endl;
}
cout << endl << endl << "DONE" << endl;
}
I assume you mean sort the lines of the file. It seems like they're driving toward the notion that swaps are very expensive -- being a file, and there being no extra space, every time you move something you have to move every character between the removal point and the insertion point. I'd guess Selection sort, which is O(n^2) complexity but has O(n) swaps.
- jvermette October 21, 2013
Vinit -- naw, it gives 00011. After the second item $disparates contains {00001, 00010} and every item after that matches one of them via bitwise-and. So it gives the right answer.
- jvermette November 14, 2013