## McAfee Interview Question

Software Engineer / Developers**Team:**Sustainability Engineering

**Country:**India

**Interview Type:**Written Test

@tpcz, agreed, I'm in the same place on this.

The naive approach that I first considered is actually O(n^3), when you simply iterate all O(n^2) possible substrings and then compute the suitability of each solution in O(N) time. But you can then back to O(N^2) by precomputing a matrix of substring sums using DP.

For the iteration, my first thought is to try to consider the long lists first, since you can at least return early as soon as you find a substring that satisfies the condition. So, for a 6-digit input, consider the substrings in this order:

123456

1234

2345

3456

12

23

34

45

56

I don't see a clever way to prune the iteration. For example, among the foursomes, does something about one foursome's result allow you to eliminate other foursomes? I'm not seeing it yet.

A simple way to make the sum-of-a-range computation work in constant time is to precompute the function S(n) for all values n in linear time, where S(n) represents the sum of all elements to the left of an element:

S(n) = a[0] + a[1] + a[2] + ... + a[n-1]

so S(n) = S(n-1) + a[n-1]

Then to compute the sum of the range a[low:high], you take S(high) - S(low), using the Python convention that high is the first number NOT in the range.

@showell. There is another simple way to make it O(n^2).

For each possible center, you can compute the maximum required string in O(n) time (just send two pointers outward). Since there are O(n) such centers, this is O(n^2) time (and O(1)) space.

@Anonymous, yeah, the only drawback of the centers-outward approach is that you can't start by evaluating the longest possible candidates first, and therefore exiting early as soon as you find a match. If the likelihood of palindromes is fairly high, it's a significant benefit to proceed in reverse order. For the centers-outward approach, you are basically committed to evaluating all N*(N-2) sums. If the likelihood of a palindrome is reasonably high, this is inefficient.

With the precompute solution, your best-case scenario (the whole string is equally balanced) will run in linear time.

You could start in the middle of the string and evaluate from there outward, and at each step you could use the previous left and right lengths, for the left length subtract the element to the right of the middle and for the right length subtract the two rightmost elements and add the element to the right of the middle.

You would be working your way through the string largest-substrings first.

@Anonymous, If you do center-inward the naive way, then it's no better than the center-outward approach, because you're not traversing the sums in a way that allows you to terminate early. Now you can do a center-inward search in a breadth-search manner, and keeping the outward sums would allow you to quickly compute the next-inward sums when you got back to them in a breadth search, but then you're facing a lot of work. The whole idea of caching the long sums so that you could subtract off elements to get the next-longest sums, rather than recomputing a long sum, is what got me to the S() function epiphany in the first place. If you use the S() function, then you get all these advantages:

1) You only ever compute n sums.

2) You can iterate through the results in descending order.

3) To evaluate each potential result you merely compute a difference between two S() values in constant time.

4) You don't have to maintain any complicated data structures. It's just O(N) storage for the values of S().

Here's the implementation, by the way:

```
def test():
# test the string from the web page
s = '986561517416921217551395112859219257312'
assert 36 == get_equal_sum_substring(s)
# test with a longer string
s = '9' + ('5' * 8000) + '9' + ('6' * 7999) + '8'
assert 8002 == get_equal_sum_substring(s)
def get_equal_sum_substring(s):
s = map(int, s)
S = [0]
sum = 0
for i in s:
sum += i
S.append(sum)
def sub_sum(low, high):
return S[high] - S[low]
# Set slen to be provisional length of half of
# the substring. We start with the highest possible
# value of slen so that we can return as soon as
# find and equal-sum substring.
slen = len(s) // 2
while slen > 0:
for i in range(len(s)-2*slen):
lsum = sub_sum(i, i+slen)
rsum = sub_sum(i+slen, i+slen*2)
if lsum == rsum:
return 2 * slen
slen -= 1
return 0
test()
```

Python Solution with Cache

This Python code iterates through all the possibilities in descending order of length, and it avoids recomputing partial substrings by caching intermediate sums.

```
def test():
# test the string from the web page
s = '986561517416921217551395112859219257312'
assert 36 == get_equal_sum_substring(s)
# test with a longer string
s = '9' + ('5' * 500) + '9' + ('6' * 499) + '8'
assert 502 == get_equal_sum_substring(s)
def get_equal_sum_substring(s):
# Turn string of digits to an array of ints.
s = map(int, s)
# For a small s, the substring_cache is not
# really worth it, but for longer strings it
# makes sense.
substring_cache = {}
def sub_sum(low, high):
if low + 1 == high:
return s[low]
n = substring_cache.get((low, high), None)
if n is not None:
return n
n = s[low] + sub_sum(low+1, high)
substring_cache[(low, high)] = n
return n
# Set slen to be provisional length of half of
# the substring. We start with the highest possible
# value of slen so that we can return as soon as
# find and equal-sum substring.
slen = len(s) // 2
while slen > 0:
for i in range(len(s)):
if i+slen*2 > len(s):
break
lsum = sub_sum(i, i+slen)
rsum = sub_sum(i+slen, i+slen*2)
if lsum == rsum:
return 2 * slen
slen -= 1
return 0
test()
```

I got it to work recursively and with showell30's clever summing shortcut. Runtime is somewhat slow (3-4 seconds) on the second test case.

There must be a DP solution since there are overlapping subproblems but I'd need a 3D whiteboard to figure it out.

My Java code:

```
public class FindMatchingSums {
private static int S[];
private static int n;
public static void main(String args[]) {
String s1 = "123231";
initializeS(s1);
int sum1 = comparehalves(0, s1.length()-1);
System.out.println("sum1=" + sum1);
String s2 = "986561517416921217551395112859219257312";
initializeS(s2);
int sum2 = comparehalves(0, s2.length()-1);
System.out.println("sum2=" + sum2);
}
private static void initializeS(String number) {
// Precompute sums (thanks, showell30)
n = number.length();
S = new int[n];
S[0] = Integer.parseInt(number.substring(0,1));
for(int i = 1; i < n; i++)
S[i] = S[i-1] + Integer.parseInt(number.substring(i,i+1));
}
private static int comparehalves(int i, int j) {
// compare the sums of the two halves of the range of numbers from i to j
// i.e. split the range of numbers from i to j into two equal subranges, sum each and compare the sums
// if they are equal, return the length of the whole range, otherwise return 0
if(i < 0)
return 0;
if(j >= n)
return 0;
if(i >= j)
return 0;
if(((i - j + 1) & 1) == 1) {
// Can't split an odd length string in half
int max = 0;
int s1 = comparehalves(i+1,j);
int s2 = comparehalves(i,j-1);
return Math.max(s1, s2);
}
// Compare halves of this range
int end_of_first_half = ((j-i+1)/2)+i-1;
if(sum_of_range(i, end_of_first_half) == sum_of_range(end_of_first_half+1, j))
return j - i + 1;
// Compare even length substrings
int max = 0;
int s1 = comparehalves(i+2, j);
int s2 = comparehalves(i+1, j-1);
int s3 = comparehalves(i, j-2);
return Math.max(s1, Math.max(s2, s3));
}
private static int sum_of_range(int k, int l) {
int sum = S[l] - (k > 0 ? S[k-1] : 0);
return sum;
}
}
```

@showell30@yahoo.com Thanks for the upvote.

I thought your summing method was just a clever shortcut but now I suspect it is actually the layer of DP this problem needed. It is a table that stores the results of the overlapping subproblems, i.e. the sums of all the overlapping ranges of numbers. In other words it meets all the requirements of DP. Or at least I think it does, I'm still new to this.

BTW I'm interviewing with some big companies this week and it would mean a lot if you wished me good luck! :)

Good luck! I came upon the clever shortcut trying to think along DP lines--basically, structuring your computations so that recursive relations make things really simple--but I really struggled for a while. I think I've probably seen this technique in some other context, because it was obviously lurking in my subconscious. In some ways it's analogous basic Calculus--compute the integral function so that computing the sum of diffs across a range becomes simple subtraction.

This can be solved using DP.

Try the following code. The data is needed to be preprocessed before using the following code. Convert the string to array.

```
int find(int array[], int size)
{
int max = 0;
int temp[size][size];
for(int i = 0; i < size; i++)
for(int j = i; j < size; j++)
{
if(i == j)
{
temp[i][i] = array[i];
if(max < array[i])
max = array[i];
}
else
temp[i][j] = 0;
}
for(int L=2; L<=size; L++)
for(int i=0; i<size-L+1; i++)
{
int j = i + L - 1;
if(L%2 == 0)
{
int mid = (i+j)/2;
temp[i][j] = temp[i][mid] + temp[mid+1][j];
if(temp[i][mid] == temp[mid+1][j])
if(temp[i][j] > max)
max = temp[i][mid];
}
else
{
int mid = (i+j)/2;
temp[i][j] = temp[i][mid-1] + temp[mid][mid] + temp[mid+1][j];
// If odd length of substring is also valid then uncomment the following code
/* if(temp[i][mid-1] == temp[mid+1][j])
if(temp[i][j] > max)
max = temp[i][j];*/
}
}
return max;
}
```

simple solution is :

```
for i = 2 to n
for j = i-1 to 0
sum elements from j to i and from i to 2*i-j and find out weather sum is equal or not ?
```

this require O(n*n) time and O(1) space, similar to maximum length palindrome.

It's N cubed. You have two outer loops around an O(N) sum. To make this faster you need to avoid recomputing partial sequences, by using a cache or some kind of precomputed matrix (DP). See "Python Solution with Cache" for a caching approach. N cubed gets slow real fast. Try coding your naive algorithm for N=1000.

P.S. In actual wall time it's hard to outperform the naive algorithm with a caching strategy, because the "constant" time of a cache hit is significantly higher than just summing up numbers over a small range. It's also tricky to optimize for cache hits, because your strategy for breaking down sums can lead to overlapping ranges, so you would probably prefer some kind of DP-ish strategy that primes the cache, or use some sort of skip-list scheme.

P.P.S. See my response to @tpcz for a more efficient way to speed up the sum-of-range computation than using a cache.

Suppose you have the string "34122151" and you want to see if it's a getEqualSumSubstring. First you would say, OK, we should consider length 4 substrings (since 4*2 = 8 <- length of "34122151"). So you compute:

3+4+1+2 and 2+1+5+1 - this results in an O(N) scan of the entire string.

You then say "nope", that didn't work. Let's try ones of length 3:

3+4+1 == 2+2+1? But wait, the LHS is simply the previous computation (3+4+1+2) but subtract 2. Similarly 2+2+1 = (2+1+5+1) - 5. Therefore, there is no point performing an (almost) complete scan of the entire string when you could have just cached this sum from before (an O(1) lookup with a single subtraction will still be O(1)).

This way, instead of having an O(N) scan on the innermost loop, you have an O(1) (amortised) lookup.

(The way showell implemented it the first time in "Python Solution with Cache" is that when you say sum(3, 4, 1, 2) you evaluate it recursively as 3 + sum(4, 1, 2) and each call to sum() is placed within a dict (which has an O(1) lookup)).

Please feel free to correct this if I am wrong, my complexity knowledge is not great.

@eugene, Sorry that this thread is kind of scattered, but there are several solutions under consideration:

1) With a totally naive but simple solution, you iterate all N-squared solutions and compute each square in O(N) time for N cubed time. While slow as heck, this is easy to read.

2) In the center-inward and center-outward approaches, you go about examining you ranges in a way that allows each left-hand sum to be computed easily from the prior left-hand sum by just adding one element. This is a nice solution, because it's N-squared and requires no extra data structures. Its only drawback is that you have to search the solution space exhaustively.

3) If you suspect that there are valid solutions for long substrings, you can modify the N-cubed solution to examine long substrings first, so that you can exit as soon as you find your conditions satisfied. This is still N-cubed for the worst case, but it's linear-N for the best case.

4) You can modify solution #3 to use a cache for partial sums. This was my original solution, which I found unsatisfactory for several reasons.

5) Lastly, you can precompute the S() function to make all subsequent sum conditions be constant time. Then you proceed through the O(N*N) possible answers in order of reverse length, which allows you to exit early as soon as your find left == right.

public static void main(String[] args) {

String str = "986561517416921217551395112859219257312";

getEqualSumSubString(str);

}

public static void getEqualSumSubString(String str){

int len = str.length();

int i = len;

if(len%2 != 0){

i--;

}

int count = 0;

String firstSub = str.substring(0, i/2);

String lastSub = str.substring(i/2, i);

int firstSum = getSum(firstSub, i/2);

int lastSum = getSum(lastSub, i/2);

while(true){

if(firstSum == lastSum){

System.out.println(i);

return;

}else if(i >= 2){

if((count + i) <= len){

firstSub = str.substring(count, count+i/2);

lastSub = str.substring(count+i/2, count+i);

firstSum = getSum(firstSub, i/2);

lastSum = getSum(lastSub, i/2);

//System.out.println(i + " | " + count + " | " + firstSum + " | " + lastSum);

count++;

}

else{

i-=2;

count = 0;

}

}else{

System.out.println("0");

return;

}

}

}

public static int getSum(String str, int len){

int sum = 0;

for(int i = 0; i < len; i++){

sum += (str.charAt(i) - 48);

}

return sum;

}

--------------------------------------------------written code in C-----------------------------------

#include<stdio.h>

#include<conio.h>

int main()

{

char a[100],b[100],c[100],choice;

int length=0,n,k,sum1=0,sum2=0,i,j,l;

printf("\nenter the required string to compute:\n");

gets(a);

length=strlen(a);

printf("the length is %d:",length);

printf("\ndo u want to continue:\n",choice);

scanf("%c",&choice);

if(choice=='y')

{

if(length%2==0)

{

printf("\nthe string length is even:perform the operation\n");

}

}

else

{

printf("\nthe string length is ODD:so cannot perform\n");

exit();

}

n=length/2;

l=n-1;

printf("\nthe first half will be from 0 to %d\n",(n-1));

k=n;

printf("the value of k:%d",k);

printf("\nthe second half will be from %d to %d\n",k,(length-1));

for(j=0;j<=(n-1);j++)

{

b[0]=a[j];

b[1]='\0';

sum1=(sum1+(atoi(b)));

}

printf("\nthe sum1 is :%d\n",sum1);

for(i=k;i<=(length-1);i++)

{

c[0]=a[i];

c[1]='\0';

sum2=(sum2+(atoi(c)));

}

printf("\nthe sum2 is :%d\n",sum2);

if(sum1==sum2)

{

if(sum1==length)

{

printf("\nthe longest string==the sum of the half of the substring\n");

}

}

else

{

printf(" \nthe longest string is not equal to the sum of the half of the substring\n");

}

return 0;

}

```
public static int getEqualSumString(string s)
{
// get the lenght of the string
int strLen = s.Length;
int i = strLen;
// is a an odd string the find the even string length
if (strLen % 2 != 0)
{
i--;
}
while (i >= 2) // smallest 2*N sized string
{
for (int strStart = 0; strStart <= strLen - i; strStart++)
{
string subString = s.Substring(strStart, i);
bool isFound = isEqualSum(subString);
if (isFound)
{
return i;
}
}
i -= 2;
}
return 0;
}
public static bool isEqualSum(String str)
{
int sumLeftN = 0;
int sumRightN = 0;
int len = str.Length / 2;
char[] chars = str.ToCharArray();
for (int i = 0; i < len; i++)
{
sumLeftN += Convert.ToInt32(chars[i].ToString());
}
for (int i = len; i < str.Length; i++)
{
sumRightN += Convert.ToInt32(chars[i].ToString());
}
return sumLeftN == sumRightN;
}
```

```
// c# code
public static int getEqualSumString(string s)
{
// get the lenght of the string
int strLen = s.Length;
int i = strLen;
// is a an odd string the find the even string length
if (strLen % 2 != 0)
{
i--;
}
while (i >= 2) // smallest 2*N sized string
{
for (int strStart = 0; strStart <= strLen - i; strStart++)
{
string subString = s.Substring(strStart, i);
bool isFound = isEqualSum(subString);
if (isFound)
{
return i;
}
}
i -= 2;
}
return 0;
}
public static bool isEqualSum(String str)
{
int sumLeftN = 0;
int sumRightN = 0;
int len = str.Length / 2;
char[] chars = str.ToCharArray();
for (int i = 0; i < len; i++)
{
sumLeftN += Convert.ToInt32(chars[i].ToString());
}
for (int i = len; i < str.Length; i++)
{
sumRightN += Convert.ToInt32(chars[i].ToString());
}
return sumLeftN == sumRightN;
}
```

Another solution:

public class FindMatchingSum{

private static int S[];

private static int n;

public static void main(String args[]) {

String s1 = "123231";

initializeS(s1);

int result2n=Find2nLength();

System.out.println("Length of required substring:" + result2n);

}

private static void initializeS(String number)

n = number.length();

S = new int[n];

S[0] = Integer.parseInt(number.substring(0,1));

for(int i = 1; i < n; i++)

S[i] = S[i-1] + Integer.parseInt(number.substring(i,i+1));

}

public static int Find2nLength(){

temp =0;

For(int i=1;i<n;i++)

{

if(S[i]==S[2i+1]/2)

temp=2*i+1+1;

}

return temp;

}

}

It could be done in O(n^2) with trivial algorithm, but I gues, there will be linear solution using some trick with cummulative sum..

- tpcz March 17, 2013