Google Interview Question
SDE-2sCountry: India
Interview Type: Phone Interview
I would like to (formally) prove that, choosing start index to be the index of first 'b' leads to an optimal solution. (Note, we consider start index only)
Let u be the index of the first 'b' from left to right, S[u] = 'b'. If no such 'b', u is set to zero: u = 0.
Claim: A greedy solution G with start index = u can be an optimal solution, considering start index only.
Proof:
Suppose there exists an optimal solution S* in which the start index st is different from u. This optimal solution has some optimal end index ed -- we don't really care about it in this proof.
The swapped string for that optimal solution is:
S* = S[0,st-1] + Swap[st,ed] + S[ed+1,n); // n = length of S
where Swap[st,ed] is the swapped sub-string from index st to index ed of the input string S.
And, the '+" operation is string concatenating operation (thus, order of the terms is important).
Case 1: If st comes before u, i.e. st < u
=========================================
Then we have S[st,u-1] = 'aaa..a' as u is defined to be the index of first 'b' from left.
Construct another solution S1 from S* as following
S1 = S[0,u-1] + Swap[u,ed] + S[ed+1,n); // swap at index u, instead of st
We have:
S1 = S[0,st-1] + S[st,u-1] + Swap[u,ed] + S[ed+1,n);
= S[0,st-1] + 'aaa..a' + S'[u,ed] + S[ed+1,n);
= S[0,st-1] + NewStr + S[ed+1,n);
Now compare the substring Swap[st,ed] in the optimal solution with the string NewStr = ('aaa..a' + Swap[u,ed]) of the newly constructed solution (these two substrings have equal length).
We have:
Swap[st,ed] = Swap[u,ed] + Swap[st,u-1];
NewStr = 'aaa..a' + Swap[u,ed];
We can see that NewStr <= Swap[st,ed], since NewStr starts with 'aaa..a', and ends with the same start portion of Swap[st,ed];
Thus, S1 <= S*. This means that, the newly constructed solution S1 is not worse than the optimal.
Case 2: If st comes after u, i.e. st > u
========================================
We have S[u-1] = 'a', S[u] = 'b'.
S* = S[0,st-1] + Swap[st,ed] + S[ed+1,n)
= S[0,u-1] + S[u] + S[u+1,st-1] + Swap[st,ed] + S[ed+1,n);
= S[0,u-1] + 'b' + S[u+1,st-1] + Swap[st,ed] + S[ed+1,n);
Case 2a: there's an 'a' in S[u+1,st-1] at position k. That is, S[k] ='a', k in [u+1,st-1]
Construct a solution S2a as following:
S2a = S[0..u-1] + Swap[u,k] + S[k+1,n); // swap at [u,k] instead of [st,ed]
= S[0..u-1] + S[k] + Swap[u,k-1] + S[k+1,n);
= S[0..u-1] + 'a' + remainingS2a;
Compare S2a with the S*, we easily see S2a < S*. Thus S* can't be an optimal solution. In other words, this case 2a doesn't happen.
Case 2b: there's no 'a' in S[u+1,st-1]. Thus, S[u+1,st-1] = "bbb..b";
Construct another solution S2b from S* as following
S2b = S[0,u-1] + Swap[u,ed] + S[ed+1,n); // swap at u instead of st;
We have:
S2b = S[0,u-1] + Swap[u,ed] + S[ed+1,n)
= S[0,u-1] + Swap[st,ed] + Swap[u,st+1] + S[ed+1,n)
= S[0,u-1] + NewStr + S[ed+1,n);
where NewStr = Swap[st,ed] + Swap[u,st+1];
On the other hand, for S*:
S* = S[0,u-1] + 'b' + S[u+1,st-1] + Swap[st,ed] + S[ed+1,n);
= S[0,u-1] + 'b' + "bbb..b" + Swap[st,ed] + S[ed+1,n);
= S[0,u-1] + "bbbb..b" + Swap[st,ed] + S[ed+1,n);
= S[0,u-1] + B_start_String + S[ed+1,n);
where, B_start_String = "bbbb..b" + Swap[st,ed];
Now compare the substring B_start_String from optimal solution with the substring NewStr of newly contructed solution S2b (these two substrings have equal length):
We can see that: NewStr <= B_start_String, since the B_start_String string starts with "bbb..b" and ends with the same start of NewStr;
Thus, S2b <= S*. This means that, the newly constructed solution S2b is not worse than the optimal.
Overall in all cases, the greedy solution G is not worse than any optimal solution. (QED)
Back to the solution I provided:
=======================
The start index is u;
The end index is one of index of 'a' after u. (It is obviously that, swapping the first 'b' with an 'a' after u is better than swapping 'b' with an other 'b'.)
Thus, a brute force solution is: checking all possible values of the end index, do the swap, and record/compare with the best string so far.
This must be a correct solution ??
This is correct approach. I have tested it on online judge and the best I've seen for this problem.
Those who are thinking for the solution based on the suffix array like :
After finding the first b at position i;
Reverse the whole string from i to last index.
Then find the first string in the suffix array.
NOTE : make sure this is not passing the following case
baababaa
According to this algo :
Suffix array will be like :
aab
aababaab
-------------
---------------
------------------
so if you chose first :
Resultant string will be :
aabbabaa
While if you chose 2nd
Resultant string will be :
aababaab (THE CORRECT ONE)
SO DEVISE AN ALGORITHM THAT PASSES THIS TEST CASE ,IF YOU ARE TRYING IT BY SUFFIX ARRAY
simple greedy approach will work:
1. start from left side find the first occurence of b that is i; if not found i=0 , j=0;
2.from that i find the point where max number of consecutive a's are that point is j.
for ex,
aabbbaabaaabaaaabb now i=2 and j=15.
This is a simple dynamic programming algorithm. I added comments to make it readable. The output for "aaaa" is 3,3, which different from the results given by the example, which is 0,0. But it actually does not matter because reverse (0,0) and (3,3) all results in the same string.
void maxSwap(string s) {
if(s.length() == 0)
return;
//dynamic programming algorithm
//This matrix records if the substring of s[j]....s[i] should be reversed or not
vector<vector<bool>> goodSwap(s.length(), vector<bool>(s.length(), false));
string minStr = s;//records the smallest string we can achieve
int start = 0;
int end = s.length() - 1;
for(int i = 0; i < s.length(); i++) {
for(int j = 0; j <= i; j++) {
bool good;
if(i - j < 2)
good = true;//if i == j or i = j + 1, always good to reverse
else {
if(s[j] == 'b' && s[i] == 'a')//always good to reverse
good = true;
else if (s[j] == 'a' && s[i] == 'b')//always bad to reverse
good = false;
else// if (s[i] == s[j]), depends on the reversability of s[j + 1]....s[i - 1]
good = goodSwap[i - 1][j + 1];
}
if(good = true){//if it is good to reverse
goodSwap[i][j] = true;
string newStr = s;
reverse(newStr.begin() + j, newStr.begin() + i + 1);//reverse the string
if(newStr <= minStr){//if the reversed string is smaller
start = j;
end = i;
minStr = newStr;//records the smallest string we can achieve
}
}
}
}
cout << start << end << endl;
}
int _tmain(int argc, _TCHAR* argv[])
{
maxSwap("abab");
maxSwap("abba");
maxSwap("bbaa");
maxSwap("aaaa");
maxSwap("babaabba");
return 0;
}
It is not better than brute force but we can make it efficient by removing unwanted branches as early as possible .
Then make the brute force solution more efficient by "removing unwanted branches as early as possible". I mean you can apply almost the same filter for the brute force solution in this problem.
def maxswap(clist):
clen = len(clist)
i, j = 0, 0
while i < clen:
if clist[i] == 'b':
break
i += 1
if i == clen:
return [0, 0]
val = -(clen - i - 1)
rj = i
cval = 0
for j in range(i + 1, clen):
if clist[j] == 'a':
cval += 1
else:
cval -= 1
if cval > val:
rj = j
val = cval
if rj == i:
return [0, 0]
return [i, rj]
print(maxswap('abab'))
print(maxswap('abba'))
print(maxswap('bbaa'))
print(maxswap('aaaa'))
print(maxswap('babaabba'))
This is O(N) DP solution and it's the best so far.
I would try to explain it as following:
1. find first 'b'
2. from the next position after the first 'b', find the continuous sub list with maximum value ('a' = 1, 'b' = -1)
3. reverse the first 'b' and the following continous sub array (if there's one)
This is O(N) DP solution and it's the best so far.
I would try to explain it as following:
1. find first 'b'
2. from the next position after the first 'b', find the continuous sub list with maximum value ('a' = 1, 'b' = -1)
3. reverse the first 'b' and the following continous sub array (if there's one)
I think it can't pass the case "baba"
the result of your code is [0, 1]
but I think the correct answer is [0, 3]
Sure, thanks JFantasy90, a little change is needed.
if cval > val:
should be
if cval >= val:
As I understand, you count the difference in the number of a, b in the remaining string after first b. Then you choose the "best" continuous subsequence right after the first b.
Isn't it your idea?
I think there's something wrong with this approach. Your scoring system doesn't care about the order of a,b in the string.
For example, these two string have same score:
baaabaabbbbbb
baabaaabbbbbb
(Only the order of the middle b is changed 1position)
However, they should be swapped at different positions and have different optimal results.
aaab||baabbbbbb
aaabaa||bbbbbbb
inhnnsoc , you are absolutely right. I think there a minor defect in a icomputational solution.
Score should start from zero every time we encounter a 'b' , this is how we can determine the longest sequence of consecutive 'a' .
So the algorythm should be
1. find first 'b'
2. from the next position after the first 'b', find the maximum continuous sub list of 'a'
3. if it is longer than prefix sequence (from start to first b) then reverse the first 'b' and the end of maximal subsequence of 'a'
here is correct code and results for your example
def maxswap(clist):
clen = len(clist)
i, j = 0, 0
while i < clen:
if clist[i] == 'b':
break
i += 1
if i == clen:
return [0, 0]
val = - i # first fix, initial val should represent length of starting
# 'a' sequence
rj = i
cval = 0
for j in range(i + 1, clen):
if clist[j] == 'a':
cval += 1
else:
cval = 0 # second fix , we should start over calculation
# when we see 'b'
if cval > val:
rj = j
val = cval
if rj == i:
return [0, 0]
return [i, rj]
print ( maxswap ( 'baaabaabbbbbb' ) )
print ( maxswap ( 'baabaaabbbbbb' ) )
output:
kirillmoizik$ python test.py
[0, 3]
[0, 6]
Here is the dynamic programming solution for this problem...
Lets C[] be the given char array, containing string.
Lets R[sizeof(C)][sizeof(C)] = {-1} : be the array, storing the optimal lexicographical compuration result within the frame of (i, j).
Lets reverse_loxicograph(C, i, j) : reverse the substring i to j and returns its lexicograph value.
int find_min_lexicograph (char C[], int i, int j) {
if(R[i][j] != -1) {
return R[i][j];
} else {
R[i][j] = min {find_min_lexicograph(C, i+1, j-1),
find_min_lexicograph(C, i, j-1),
find_min_lexicograph(C, i+i, j),
reverse_lexicograph(C, i, j)};
return R[i][j];
}
}
#include <iostream>
using namespace std;
int main() {
int t;
cin>>t;
for(int ii=0;ii<t;++ii){
string str;
cin>>str;
string mn = "";
int sidx = 0, eidx = 0;
for(int i=0; i< (int)str.size(); ++i){
if ( str[i] == 'b'){
mn = str[i];
string cur ="";
cur+=str[i];
for(int j=i+1;j<(int)str.size(); ++j){
cur=str[j]+cur;
if(cur<mn){
mn = cur;
eidx = j;
sidx = j-mn.size()+1;
}
}
break;
}
}
cout<<sidx<<" "<<eidx<<endl;
}
return 0;
}
while(arr[a++] =='a');
if(k==arr.length())
{ cout<<"0,0"<<endl;
rt 0;
}
FORR( i ,arr.length()-1 , 0){
k=i;
while(k>=a && arr[k--]=='a');
if(k==a)k++;
if(i-k < maxc.first)continue;
if(i-k > maxc.first)
{
maxc.first = i-k;
int o = k;
while(k>=a && arr[k--]=='b');
if(k==a)k++;
maxc.second = o-k;
first =k;
sec = i;
}
else if(i-k == maxc.first)
{
int o = k;
while(k>=a && arr[k--]=='b');
if(k==a)k++;
if(o-k > maxc.second)
{
maxc.second = o-k;
first =k;
sec = i;
}
}
}
string GetReverse(string strA, int start, int end)
{
string strTemp(strA);
while(end > start)
{
char temp = strTemp[end];
strTemp[end] = strTemp[start];
strTemp[start] = temp;
start++;
end--;
}
return strTemp;
}
void FindSmallest(string strA)
{
bool bFirstCharacterA = false;
bool bFirstCharacterB = false;
int iStart = 0;
int iEnd = 0;
int iSubStringStart = 0;
int iSubStringEnd = 0;
string minStr = strA;
for(int i = 0; i < strA.length(); i++)
{
//depending on first character we can solve case for a and b seperately
if(0 == i)
{
if(strA[i] == 'a')
{
bFirstCharacterA = true;
}
else
bFirstCharacterB = true;
}
else
{
if(bFirstCharacterA)
{
if(!iSubStringStart && strA[i] == 'b')
{
iSubStringStart = i;
}
if(iSubStringStart && strA[i] == 'a')
{
int len = (i - iSubStringStart) + 1;
string reversedString = GetReverse(strA, iSubStringStart, i);
if(reversedString.compare(minStr) < 0)
{
iStart = iSubStringStart;
iEnd = i;
minStr = reversedString;
}
}
}
else
{
if(strA[i] == 'a')
{
int len = (i - iSubStringStart) + 1;
string reversedString = GetReverse(strA, iSubStringStart, i);
if(reversedString.compare(minStr) < 0)
{
iStart = iSubStringStart;
iEnd = i;
minStr = reversedString;
}
}
}
}
}
cout<<"("<<iStart<<","<<iEnd<<")\n";
}
int _tmain(int argc, _TCHAR* argv[])
{
FindSmallest("abab");
FindSmallest("abba");
FindSmallest("bbaa");
FindSmallest("aaaa");
FindSmallest("babaabba");
FindSmallest("abbb");
FindSmallest("a");
return 0;
}
O(n*logn*logn) solution:
1. Find first i such that s[i]='b'.
2. Find lexicographically minimum substring starting from j (j>i) and ending at i. (Use suffix array like DP to find such a j in n*logn*logn time, can be reduced to n*logn if radix sort is used.)
#include <iostream>
#include <string>
#include <cstring>
#include <vector>
#include <algorithm>
#define fst first
#define snd second
#define pii pair<int,int>
#define piii pair<pii, int>
#define mp make_pair
using namespace std;
int sm(char* s, int len){
if (len==0) return 0;
int* ranks = new int[len];
for (int i=0; i<len; i++){
ranks[i] = s[i];
}
int sz=1;
while (sz<len){
vector<piii > ts;
for (int i=0; i<len; i++){
ts.push_back(mp(mp(ranks[i],i-sz>=0?ranks[i-sz]:99999999), i));
}
sort(ts.begin(), ts.end());
int cr=0;
for (int i=0; i<len; i++){
if (i>0 && ts[i].fst!=ts[i-1].fst) cr++;
ranks[ts[i].snd] = cr;
}
sz*=2;
}
int mi=0;
for (int i=0; i<len; i++){
if (ranks[i]<ranks[mi]) mi=i;
}
delete ranks;
return mi;
}
pair<int,int> solve(string s){
for (int i=0; i<s.size(); i++){
if (s[i]=='b'){
return make_pair(i, i+1+sm(&s[i+1], s.size()-(i+1)));
}
}
return make_pair(0,0);
}
int main(){
string s;
cin>>s;
pii ans = solve(s);
cout<<ans.fst<<" "<<ans.snd;
}
Here is a DP based solution running in O(N).
First step is to ignore the first set of 'a' s until we hit the first b.
'aabbaba' => 'bbaba'
Second step is to parse the string and divide it into runs of 'bbbaaa' represented as tuples.
E.g., 'bbaba' => (2, 1), (1, 1)
The rest is done recursively:
Let's say we have the following pattern now (B1, A1), (B2, A2), ..., (Bm, Am)
Let j be the best run to do the reverse, i.e., the smallest (lexicographically) string is
Aj Bj Aj - 1 Bj - 1 ... A1 B1 Bj +1 Aj + 1....Bm Am.
Then my claim is that "j" is either "1" or the best run for the suffix starting at B2.
Now all we need to do is to determine whether or not A1 B1 B2 A2 ... or Aj Bj .... is the smaller string.
This can be done in O(max (B1 + A1, Bj + Aj) ) .
Therefore, the overall complexity is O(N).
here is the full code:
public static Tuple<Integer, Integer> minReversedSubString(String s) {
StringBuilder sb = new StringBuilder();
// removing first couple of 'a' characters
for (int i = 0; i < s.length(); i++) {
if (sb.length() == 0) {
if (s.charAt(i) == 'b') sb.append('b');
}
else sb.append(s.charAt(i));
}
if (sb.length() == 0) return new Tuple<Integer, Integer>(0, 0);
String str = sb.toString();
// now our string starts with 'b'
//get different runs of 'bbbbaaaa'
// ex: 'bbaabbbba' has runs of (2, 2) and (3, 1)
ArrayList<Tuple<Integer, Integer>> runs = getRuns(str);
int index = bestReverse(runs, 0);
int pos = 0;
for (int i = 0; i < index + 1; i++) {
pos += runs.get(i).first();
pos += runs.get(i).second();
}
int first_b = s.length() - str.length();
return new Tuple<Integer, Integer>(first_b, first_b + pos - 1);
}
public static int bestReverse(ArrayList<Tuple<Integer, Integer>> runs, int i) {
if (i == runs.size() - 1) return i;
int index = bestReverse(runs, i + 1);
if (runs.get(i).second() > runs.get(index).second()) return i;
else if (runs.get(i).second() < runs.get(index).second()) return index;
else {
if (runs.get(i).first() < runs.get(index).first()) return i;
else return index;
}
}
public static ArrayList<Tuple<Integer, Integer>> getRuns(String str) {
ArrayList<Tuple<Integer, Integer>> runs = new ArrayList<>();
int b = 0, a = 0;
boolean runcomplete = false;
for (int i = 0; i < str.length(); i++) {
if (str.charAt(i) == 'b') {
if (a > 0) {
runs.add(new Tuple<>(b, a));
a = 0;
b = 0;
i--;
}
else {
b++;
}
} else {
a++;
}
}
if (b > 0) {
runs.add(new Tuple<>(b, a));
}
return runs;
}
If the string only contains ‘a’, return [0,0];
We only could reverse of a substring begin with ‘b’ in order to get a smaller string.
Find the first ‘b’ from the beginning
Check each substring begin with ‘b’, count how many ‘a’ after ‘b’ in each substring.
Suppose we have the following string (eliminate the heading ‘a’)
b{a*c1}b{a*c2}b{a*c3}….b{a*c_n}
If we reverse the first substring b{a*c1}, the resulting substring would be
{a*c1}bb{a*c2}b{a*c3}….b{a*c_n}——–(1)
If we reverse the first two substrings b{a*c1}b{a*c2}, the resulting substring would be
{a*c2}b{a*c1}bb{a*c3}….b{a*c_n}——–(2)
(2) is smaller than (1) c2 > c1
If we reverse the first three substrings b{a*c1}b{a*c2}b{a*c3}, the resulting substring would be
{a*c3}b{a*c2}b{a*c1}b….b{a*c_n}——–(3)
(3) is smaller than (2) c3 > c2 (c2 > c1)
(3) is smaller than (1) c3 > c1 (c1 > c2)
Therefore, every time we reverse the string to get a smaller substring as long as we have a bigger cn.
class OptimalSubstringReversal{
public:
void driver(string str){
int i = 0;
while(i < str.length()){
if(str[i] == 'b'){
break;
}
i++;
}
if(i == str.length()) {
cout << "[0,0]";
return;
}
int pre_count = 0;
int pre_end = 0;
int start = i;
int end = i;
int count = 0;
for(; i < str.length(); i ++){
if(str[i] == 'a'){
end = i;
count ++;
}else{
if(count > pre_count){
pre_count = count;
pre_end = end;
}
count = 0;
}
}
if(count > pre_count){
pre_end = end;
}
cout <<"["<<start <<"," << pre_end << "]" << endl;
}
};
I crafted a solution of O(N) algorithm complexity and O(1) space complexity. It takes care of all the examples that posted on this thread so far. I have post the code blow but I don't bother to explain why it is done like this, because it takes quite a bit time. If interested, please refer to here with detailed explanation of this solution: cpluspluslearning-petert.blogspot.co.uk/2014/11/data-structure-and-algorithm.html
#include <string>
#include <cassert>
struct AabbResult {
AabbResult(int s, int e)
: start(s), end(e)
{}
int start;
int end;
};
AabbResult AabbSolution(const std::string& str)
{
if (str.empty()) {
return AabbResult{ -1, -1 };
}
size_t firstB = str.find_first_of('b');
if (firstB == std::string::npos) {
// all are 'a', so need to reverse
return AabbResult{ 0, 0 };
}
bool foundA = false;
size_t rFirstA = 0;
size_t rNextA = 0;
size_t MaxNumOfContinousA = 0;
for (size_t index = str.length() - 1; index > firstB; --index)
{
if (!foundA && rFirstA == 0 && str[index] == 'a') {
foundA = true;
rFirstA = index;
continue;
}
if (foundA && str[index] != 'a') {
// find the first continuous 'a' in the reverse order
// and record its length
foundA = false;
MaxNumOfContinousA = rFirstA - index;
}
if (!foundA && rFirstA > 0 && str[index] == 'a') {
// find the start of the 2nd continuous 'a' in the
// reverse order
rNextA = index;
break;
}
}
if (rFirstA == 0) {
// not found any 'a' after first 'b'
// eg: aaabbbb
return AabbResult{ 0, 0 };
}
if (rNextA == 0) {
// only one continuous 'a' between firstB and rFirstA
// aaabbbaaaabb
return AabbResult( static_cast<int>(firstB),
static_cast<int>(rFirstA) );
}
// There are multiplie multipile continuous 'a' after firstB
// eg: aaabbbaaabbaaabbaabb
size_t strLenBetweenLongestA = 0;
size_t rLongestAIndex = rFirstA;
size_t numOfContinuousA = 0;
bool sameMaxLengthContinousAFound = false;
size_t numToContinueComparing = rFirstA - rNextA;
foundA = false;
size_t tempIndex = 0;
for (size_t index = rNextA + 1; index > firstB; --index, ++tempIndex) {
if (sameMaxLengthContinousAFound) {
// Step 4 on Scenario 4
if (numToContinueComparing > 0) {
if (str[index - 1] > str[rLongestAIndex - MaxNumOfContinousA - tempIndex]) {
// keep the continuous 'a' as it is still the least lexicographical one
// Case of booooobb > byyyyyyb
sameMaxLengthContinousAFound = false;
}
else if (str[index - 1] < str[rLongestAIndex - MaxNumOfContinousA - tempIndex]) {
// Case of booooobb < byyyyyyb
// The newly found continuous 'a's is the least lexicographical one.
// Update the record
rLongestAIndex = rFirstA;
}
--numToContinueComparing;
}
else {
// keep the continuous 'a' as it is still the least lexicographical one
// Case of booooobb = byyyyyyb
sameMaxLengthContinousAFound = false;
}
}
if (!foundA && str[index - 1] == 'a') {
strLenBetweenLongestA = rLongestAIndex - (index - 1);
rNextA = index - 1;
numOfContinuousA = 1;
foundA = true;
continue;
}
if (foundA) {
if (str[index - 1] == 'a') {
++numOfContinuousA;
if (numOfContinuousA > MaxNumOfContinousA) {
// found a even longer continuous 'a's
// clear the flag to cancel the comparison of Step 4 on Scenario 4
sameMaxLengthContinousAFound = false;
}
}
else {
if (numOfContinuousA > MaxNumOfContinousA) {
// found a even longer continuous 'a's
// update the record
rLongestAIndex = rNextA;
MaxNumOfContinousA = numOfContinuousA;
sameMaxLengthContinousAFound = false;
}
else if (numOfContinuousA == MaxNumOfContinousA) {
// found two continuous 'a' with the same number of 'a'
// Set the flag to start the comparison of Step 4 on Scenario 4
sameMaxLengthContinousAFound = true;
numToContinueComparing = strLenBetweenLongestA - MaxNumOfContinousA;
tempIndex = 0;
rFirstA = rNextA;
}
foundA = false;
}
}
}
if (sameMaxLengthContinousAFound) {
assert(numToContinueComparing > 0);
// found two continuous 'a' with the same number of 'a'
// not finished Step 4 on Scenario 4 yet before hitting firstB
// conitnue to the reversed part
for (size_t index = 0; index < numToContinueComparing; ++index, ++tempIndex) {
if (str[rFirstA + index + 1] >
str[rLongestAIndex - MaxNumOfContinousA - tempIndex]) {
// Case of booooobb > byyyyyyb, and
// Case of booooobb = byyyyyyb
return AabbResult{ static_cast<int>(firstB),
static_cast<int>(rLongestAIndex) };
}
else if (str[rFirstA + index + 1] <
str[rLongestAIndex - MaxNumOfContinousA - tempIndex]) {
// Case of booooobb < byyyyyyb
return AabbResult{ static_cast<int>(firstB),
static_cast<int>(rFirstA) };
}
}
}
return AabbResult{ static_cast<int>(firstB),
static_cast<int>(rLongestAIndex) };
}
Time complexity: O(n)
Space complexity: O(1) not including input storage
First some reasoning to simplify the algorithm:
- if we have a word of a form 'a*b*' then the result is (0, 0)
- otherwise word is of a form 'a*b+a.*'
- let us mark the position of the first 'b' as b1 (0-base index)
- let us mark the position of the first 'a' after b1 as a2
- let (X, Y) be the best solution
- this best solution must not change the leading b1 'a' characters
or (0, 0) would result in a word with lower lexicographic ordering
- this best solution must also change the leading 'b' character
into 'a', or (b1, a2) would result in a word with lower lexicographic
ordering
==> X <= b1
- let us mark the position Y - (b1 - X) as Y1 - that is the position
that will be reversed over position b1, and so we know that
word[Y1] == 'a'
- since we know the leading 'a' character must not be altered by
the best solution's character reversal, we know that
word[Y1 + 1..Y] must all be 'a'
- if word[Y1 + 1] == 'a' then (b1, Y1 + 1) is a better solution than
(X, Y) as its reversal results in more leading 'a' characters
==> word[Y1 + 1] == 'b' or word[Y1 + 1] not defined
==>
Y1 == Y as all characters in word[Y1 + 1..Y] must be 'a' but
word[Y1 + 1] must be either 'b' or not defined
==> b1 == X
==>
X == position of the first 'b' character in the string
word[Y] == 'a'
word[Y] is the last 'a' in its group
Now we just need to find the 'a' character, after the first 'b' character for which reading backwards from it results in a word with the minimal lexicographic ordering.
Here's how the algorithm below achieves this:
- as a helper we split up the input into 'b...ba...a' blocks which, when
reversed, can be compared lexicographically as follows:
block1 < block2
<==> block1 has < #a or (== #a and > #b)
<==> (#a in block1, -#b in block1) > (#a in block2, -#b in block2)
- that means that if each block is assigned weight (#a, -#b) blocks with
higher weight are 'better', i.e. they come first in the lexicographic order
when reversed
- we start with a block ending with the last 'a' character and work our way
backwards, looking for a block with the highest weight
- if there is one such block - we are done and that is the last block that
needs to be reversed
- if there is more than one block with matching weights then we need to
compare blocks in front of them in order, i.e. those coming after them
since we are moving backwards
- if we find a difference - we discard the candidate with the less
weighing block
- if they match all the way to the end, we use the first encountered
candidate
- if we encounter three such max weight candidate blocks - the middle one
can be safely discarded, i.e. we need to track & check at most two
candidates at any time
Here's the actual solution code:
def scan_block(word, start, last_in_block):
"return the block's weight and the position in front of the block"
prev_b = word.rfind("b", start, last_in_block)
count_a = last_in_block - prev_b
prev_a = word.rfind("a", start, prev_b)
if prev_a < 0:
prev_a = start - 1
count_b = prev_b - prev_a
weight = count_a, -count_b
return weight, prev_a
def solution(word):
x = find(word, "b") # first 'b'
current = word.rfind("a", x + 1) # last 'a' after the first 'b'
candidate = []
best_weight = 0, 0
while current > x:
weight, next = scan_block(word, x, current)
if weight > best_weight:
# new best weight block - discard any previous candidates and start
# afresh from here
candidate = [(current, next)]
best_weight = weight
else:
if len(candidate) > 1:
# we already have 2 candidates - see if we can discard one
candidate0_weight, candidate0_current = scan_block(
word, x, candidate0_current)
if weight > candidate0_weight:
candidate[:1] = [] # new candidate is better
elif weight < candidate0_weight:
candidate[1:] = [] # old candidate is better
if weight == best_weight:
# new best weight block - use it as the second candidate,
# possibly discarding the previous second candidate in the
# process
candidate[1:] = [(current, next)]
candidate0_current = candidate[0][1]
current = next
return (x, candidate[0][0]) if candidate else (0, 0)
And here's some test code implemented using pytest:
import pytest
@pytest.mark.parametrize("word, expected", (
("a", (0, 0)),
("b", (0, 0)),
("ab", (0, 0)),
("ba", (0, 1)),
("aaaa", (0, 0)),
("bbbb", (0, 0)),
("abab", (1, 2)),
("abba", (1, 3)),
("bbaa", (0, 3)),
("abaa", (1, 3)),
("aaaabbbb", (0, 0)),
("babaabba", (0, 4)),
("aabaabaabaaba", (2, 10)),
("aabaabaaabaaba", (2, 8)),
("aabaaabaaabaaba", (2, 9)),
("aabaaaabaaabaaba", (2, 6)),
("aabaabaabbaaba", (2, 7)),
("aabaabbaabaaba", (2, 11)),
("aabbaabaabaaba", (2, 11)),
("aabbaabbaabaaba", (2, 12)),
("bbbaaabbbaabbbaaa", (0, 16)),
("abbbaaabbbaabbbaaa", (1, 17)),
("aaaabbbaaabbbaabbbaaa", (4, 20)),
("bbaabbabbaabbabbbaaba", (0, 10)),
("bbaabbabbaabbaabbabbbaaba", (0, 14)),
("baabbaaababbaaa", (0, 7))))
def test_solution(word, expected):
assert solution(word) == expected
After reading the solutions, I summarize my solution:
Let's say the string is str.
1. find the latest b, let's say the position is i.
2. We should find j, where j is larger than i, and smaller than length of string. j is somewhere in str[i+1,...,length-1].
The essence of finding j, is to find the longest 'a' group. Among those group, do reverse and compare, find the smallest reverse.
For example,
aabaabab, the length of longest group is 2, the candidate is str[2,...,4].
aabaabaababaa, the length longest groups is 2, the candidates are str[2,...,4], str[2,...,5], str[2,...,12].
aabaaabaabaaa, the length of longest group is 3, the candidates are str[2,...,5], str[2,...12]
As for the time complexity, I would say, it is not O(n), but O(n^2). Because we should consider the complexity of the comparison for the reverse. It takes O(n) time comparision. And traversal j from i+1 to length-1 takes another O(n).
The space is O(1), because it only saves the best result, which we can find some way to store it as a integer number.
#include <stdio.h>
int main() {
int t;
scanf("%d",&t);
while(t--)
{
int i,j,flag=1;
int start,end,count=0,max;
int list[2000]={-1};
int cons[2000]={-1};
char temp[2000];
scanf("%s",temp);
for(i=0;temp[i];i++)
{
list[i]=temp[i]-'a';
printf("%d",list[i]);
}
for(j=0;list[j]!=-1;j++)
{
if(list[j]==1 && flag!=-1)
{
flag=-1;
start=j;
printf("%d %d",j,list[j]);
}
if (flag=-1)
{
if(list[j]==0)
count+=1;
else if(list[j]==1)
count=0;
else;
cons[j]=count;
}
}
printf("%d",flag);
if(flag!=-1)
{
start=0;
end=0;
printf("hi");
}
/*
printf("%d",cons[0]);
printf("%d",cons[1]);
printf("%d",cons[2]);
for(i=0;cons[i]!=-1;i++)
{
printf("%d",cons[i]);
}
max=cons[0];
for(i=0;cons[j]!=-1;i++)
{
if(max<=cons[j])
{
max=cons[j];
end=j;
}
}
printf("%d,%d",start,end);
*/
}
return 0;
}
#include <stdio.h>
int main() {
int t;
scanf("%d",&t);
while(t--)
{
int i,j,flag=1;
int start,end,count=0,max;
int list[2000]={-1};
int cons[2000]={-1};
char temp[2000];
scanf("%s",temp);
for(i=0;temp[i];i++)
{
list[i]=temp[i]-'a';
printf("%d",list[i]);
}
for(j=0;list[j]!=-1;j++)
{
if(list[j]==1 && flag!=-1)
{
flag=-1;
start=j;
printf("%d %d",j,list[j]);
}
if (flag=-1)
{
if(list[j]==0)
count+=1;
else if(list[j]==1)
count=0;
else;
cons[j]=count;
}
}
printf("%d",flag);
if(flag!=-1)
{
start=0;
end=0;
printf("hi");
}
/*
printf("%d",cons[0]);
printf("%d",cons[1]);
printf("%d",cons[2]);
for(i=0;cons[i]!=-1;i++)
{
printf("%d",cons[i]);
}
max=cons[0];
for(i=0;cons[j]!=-1;i++)
{
if(max<=cons[j])
{
max=cons[j];
end=j;
}
}
printf("%d,%d",start,end);
*/
}
return 0;
}
1. find index of first 'b' in the string, lets that position is 'i'
2. traverse from end and find the longest consecutive 'a' sequence and that start index from end is 'j' traverse till index 'i'
3. now reverse the substring (i,j)
The very first solution comes to mind , but unfortunately it is wrong :
Example:
baabaabbaa
According to your algorithm :
1).
baa||baabbaa
aab||baabbaa
2).
baabaa||bbaa
aabaab||bbaa (ONLY THIS IS CORRECT )
3).
baabaabbaa||
aabbaabaab
HOW CAN YOU RESOLVE THESE CONFLICTS ?
HINT : DON'T TRY TO COUNT NUMBER OF 'b' AFTER 'a'.CONFLICT MAY AGAIN ARISE AND VERY INEFFICIENT.
this conflict occurs only when there are more than one longest consecutive 'a' sequence exists
baabaabbaa for this input there exists 3 possible strings
1. aabbaabaab (baabaabbaa||)
2. aabaabbbaa (baabaa||bbaa)
3. aabbaabbaa (baa||baabbaa)
1. traverse from end and store first possible string in an array
2. when second possible string comes then compare both strings, update the array with smaller string
3. continue till first index of 'b' that is 'i' location
import java.util.Scanner;
public class SwapSubStringToGetSmallest {
/**
* @param args
*/
public static void main(String[] args) {
Scanner s = new Scanner(System.in);
int t = s.nextInt();
String[] strArr = new String[t];
for(int i =0;i<t;i++)
strArr[i] = s.next();
for(int i =0;i<t;i++)
swapAndFindSmallestString(strArr[i]);
}
private static void swapAndFindSmallestString(String str) {
int len = str.length();
char[] chArr = str.toCharArray();
int i=0;
while(i<len && chArr[i++]=='a');
if (i==len){
System.out.println("0,0");
return;
}
int startIndex=i-1;
int endIndex = 0;
int maxAs=0;
int curMaxAs = 0;
while(i<len){
if (chArr[i]=='a'){
curMaxAs++;
if (curMaxAs>maxAs)
endIndex=i;
}
else{
if (curMaxAs>maxAs){
maxAs=curMaxAs;
endIndex = i-1;
}
curMaxAs=0;
}
i++;
}
System.out.println(startIndex + "," + endIndex);
}
}
The "babaabba" example should really be 0,7 imo. My solution outputs that and I agree. O(n^2) though unfortunately. I just don't think it can be improved beyond that...
void FindMoves(const char* in, std::string& out)
{
out = "";
if(in)
{
bool foundIt = false;
std::string inStr(in);
const int size = inStr.size();
for(int i = 0; i < size; i++)
{
char c = inStr.at(i);
if(c > 'a')
{
std::list<std::pair<std::string, int> > alt;
for(int j = i+1; j < size; j++)
{
char c2 = inStr.at(j);
if(c2 < c)
{
std::string newStr = inStr;
newStr.at(j) = c;
newStr.at(i) = c2;
alt.push_back(std::make_pair(newStr, j));
}
}
std::string best = inStr;
int bestSwapIdx = i;
for(std::list<std::pair<std::string, int> >::const_iterator k = alt.begin(); k != alt.end(); k++)
{
if(k->first.compare(best) < 0)
{
best = k->first;
bestSwapIdx = k->second;
}
}
if(bestSwapIdx != i)
{
foundIt = true;
char buffer[128] = {0};
sprintf(buffer, "%d,%d", i, bestSwapIdx);
out += buffer;
break;
}
}
}
if(!foundIt)
{
out = "0,0";
}
}
}
Time complexity: O(n)
Space complexity: O(1) not including input storage
- for input a*b* the result is (0, 0)
- otherwise reversal always starts at the first 'b' and ends with some 'a'
- the final reversed 'a' must be the last character in the string or followed by 'b'
- reading backwards, we want the 'a' that starts the longest 'a' run followed by the shortest 'b' run, and if there are multiple such 'b..ba..a' runs (same number of 'a' characters, same number of 'b' characters) choose the first encountered
def next(word, char, start=None):
"""like word.find() but return len(word) if not found"""
return word.find(char, start) % (len(word) + 1)
def solution(word):
x = next(word, "b")
y = 0
best_count = 0, 0
next_b = x
while next_b != len(word):
next_a = next(word, "a", next_b + 1)
count_b = next_a - next_b
next_b = next(word, "b", next_a + 1)
count = next_b - next_a, -count_b # count_a, -count_b
if count >= best_count:
best_count = count
y = next_b - 1
return (x, y) if y else (0, 0)
Here is some example data:
A B C D E F G
bb aa bb a bb aa bb aa bb a bbb aa b a
A: aa bb bb a bb aa bb aa bb a bbb aa b a <-- (a * 2; b * 2) - non-initial
B: a bb aa bb bb aa bb aa bb a bbb aa b a <-- (a * 1; b * 2) - too few a
C: aa bb a bb aa bb bb aa bb a bbb aa b a <-- (a * 2; b * 2) - non-initial
D: aa bb aa bb a bb aa bb bb a bbb aa b a <-- (a * 2; b * 2) - !!the best!!
E: a bb aa bb aa bb a bb aa bb bbb aa b a <-- (a * 1; b * 2) - too few a
F: aa bbb a bb aa bb aa bb a bb aa bb b a <-- (a * 2; b * 3) - too many b
G: a b aa bbb a bb aa bb aa bb a bb aa bb <-- (a * 1; b * 1) - too few a
Here are some test cases implemented using pytest:
import pytest
@pytest.mark.parametrize("word, expected", (
("a", (0, 0)),
("b", (0, 0)),
("ab", (0, 0)),
("ba", (0, 1)),
("aaaa", (0, 0)),
("bbbb", (0, 0)),
("abab", (1, 2)),
("abba", (1, 3)),
("bbaa", (0, 3)),
("abaa", (1, 3)),
("aaaabbbb", (0, 0)),
("babaabba", (0, 4)),
("aabaabaabaaba", (2, 10)),
("aabaabaaabaaba", (2, 8)),
("aabaaabaaabaaba", (2, 9)),
("aabaaaabaaabaaba", (2, 6)),
("aabaabaabbaaba", (2, 7)),
("aabaabbaabaaba", (2, 11)),
("aabbaabaabaaba", (2, 11)),
("aabbaabbaabaaba", (2, 12)),
("bbbaaabbbaabbbaaa", (0, 16)),
("abbbaaabbbaabbbaaa", (1, 17)),
("aaaabbbaaabbbaabbbaaa", (4, 20)),
("bbaabbabbaabbabbbaaba", (0, 10)),
("bbaabbabbaabbaabbabbbaaba", (0, 14))))
def test_solution(word, expected):
assert solution(word) == expected
Hope this helps someone.
Bah, found a test case where it fails... back to the drawing board :-(
for "baabbaaababbaaa" it should return (0, 7), but returns (0,14)
We can not detect which 'a' to use for the Y position based simply on the number of 'a' chars in its series and the number of 'b' chars in the series before :-(
[EDIT: Unfortunately I could not remove or edit the original comment as it was added as an anonymous user and is not connected to my current account, so I just give this comment a -1 and added the correct answer in a separate comment. Please give negative scores to this one so it sinks down the dung pile. :-)]
I dont think this is any DP question. A greedy approach will be needed.
2 cases :
1. string starts with 'a'
if string starts with 'a', try to find the next 'b' , lets say 'b' index is i . from i to len(str) , get the largest sequence of 'a'. end of largest sequence is answer.
e.g. ababaabba => 1, 5
2. string starts with 'b', same as (1), but no need to find next 'b'. just print the index of largest sequence of 'a'.
e.g. babaabba => 0, 4
Check out the test cases in my answer and you'll see failing ones with your logic - those that have multiple 'largest sequences of a'.
The main problem is in deciding which 'longest sequence' to use. :-)
Thanks for pointing this out. now I appreciate the problem. I think, in interview , if you simply derive one condition like
for all i > 0 bi < ai and try all combinations after fail to look at most number of 'a' streams will be optimal.
Since it can only lexicographically small if index of b is lesser to a , we can take all such combination and print. It will be O(n^2) solution. I believe, we can find some pattern to optimize it even further. since larger pattern will be dependent on smaller pattern result.
See the O(n) solution in my comment (be careful - there are two comments - one correct and one not so much that I unfortunately can not edit or remove :-)). It basically implements a 'smart way' to decide which of the 'longest a' groups is the best one. It depends on the data that comes before that group and it is possible to greedily decide which one is the best by scanning the string from the back.
I think this can be solved like this. (correct me if I am wrong)
First of all, the start reverse point must happen at the first 'b', from that point forward, the second point must happen at the last 'a' in consecutive 'a's.
vector<int> reverseStrMin(string s){
int len = (int)s.length();
vector<int> v(2,0);
if(len<=1) return v;
int f1 = 0;
while(f1<len && s[f1] == 'a'){
f1++;
}
if(f1==len) return v;
string result = s;
int f0 = f1;
v[0]=f0;
while(f1<len){
while(f1<len && s[f1]=='b'){
f1++;
}
if(f1==len) return v;
while(f1<len && s[f1]=='a'){
f1++;
}
if(f1==len){
string s2=s;
reverseStr(s2, f0, f1-1);
if(result > s2){
v[1]=len-1;
}
return v;
}
string s1=s;
reverseStr(s1, f0, f1-1);
if(result > s1){
result = s1;
v[1]=f1-1;
}
}
return v;
}
void reverseStr(string &s, int i, int j){
if(i==j) return ;
while(i<j){
char tmp = s[i];
s[i]=s[j];
s[j]=tmp;
i++; j--;
}
}
Use a Different Logic as You will get Stuck in The case of
baabaabbaa
your output:aabbaabbaa
correct output :aabaabbbaa
For the required input size, I think O(n^2) solution should pass (n = length of the input string).
(I don't think there is a better solution than O(n^2) for this problem -- tell me if i am wrong!)
I see some DP solutions posted here, but I'm not sure whether it's O(n^2) time/space.
My idea is following: (brute force O(n^2) time, O(n) extra space):
The first index should be the index of the first 'b' from left.
The second index can be one of the indices of 'a' from the right of the first index. So for each possible value of second index, do the swap and compare with the best so far.
Code in C++:
- ninhnnsoc October 10, 2014