## Google Interview Question

SDE1s**Country:**United States

**Interview Type:**In-Person

This solution is not valid because it does not take linear time. The question specifically specified "Find the LCP lengths of *all substrings* in O(n)".

This one clearly is at least O(n^2) time. There's a loop calling LCP n times, and LCP itself both has a loop and a call to startsWith (which is O(n) itself).

C

```
#include <stdio.h>
#include <memory.h>
#define MAX_STRING 256
int main() {
char s[] ="ababac";
int lcps[MAX_STRING];
for (int i=0; i<strlen(s); i++) {
lcps[i] = 0;
if (s[i] == s[0]) {
lcps[i]++;
}
for (int j=0; j<i; j++) {
if (lcps[j] == i && s[i] == s[lcps[j]]) {
lcps[j]++;
} else if (lcps[j] > 0) {
if(s[i] == s[lcps[j]] && (j+lcps[j]) >= i) {
lcps[j]++;
}
}
}
}
for(int i = 0; i < strlen(s); i++) {
printf("len(LCP(s(%d,%d),s)) = %d\n", i, strlen(s), lcps[i]);
}
return 0;
}
```

C

```
#include <stdio.h>
#include <memory.h>
#define MAX_STRING 256
int main() {
char s[] ="ababac";
int lcps[MAX_STRING];
for (int i=0; i<strlen(s); i++) {
lcps[i] = 0;
if (s[i] == s[0]) {
lcps[i]++;
}
for (int j=0; j<i; j++) {
if (lcps[j] == i && s[i] == s[lcps[j]]) {
lcps[j]++;
} else if (lcps[j] > 0) {
if(s[i] == s[lcps[j]] && (j+lcps[j]) >= i) {
lcps[j]++;
}
}
}
}
for(int i = 0; i < strlen(s); i++) {
printf("len(LCP(s(%d,%d),s)) = %d\n", i, strlen(s), lcps[i]);
}
return 0;
}
```

The algorithm is pretty much to take the substrings of the main string (prefixes) and use two pointers to scan the two strings and move them at the same time until the characters do not match. I present a concise solution in Python below.

```
def lcpOfAllSubstrings(s):
for i in range(len(s)):
count = 0
for l1, l2 in zip(s, s[i:]):
if l1 != l2:
break
count += 1
print('{}: len(LCP(s({}, {}), s)) = {}'.format(i + 1, i + 1, len(s), count))
# Test code
lcpOfAllSubstrings("ababac")
```

Output

```
1: len(LCP(s(1, 6), s)) = 6
2: len(LCP(s(2, 6), s)) = 0
3: len(LCP(s(3, 6), s)) = 3
4: len(LCP(s(4, 6), s)) = 0
5: len(LCP(s(5, 6), s)) = 1
6: len(LCP(s(6, 6), s)) = 0
```

```
public String helper(String s1, String s2) {
StringBuilder sb = new StringBuilder();
int mLen = min(s1.length(), s2.length());
for (int i = 0; < mLen; i++) {
if(s1.charAt(i) != s2.charAt(i)) {
break;
}
sb.append(i);
}
return sb.toString();
}
public String LPS(String arr[], int low, int high) {
if(low == high) return arr[low];
if(high > low) {
int mid = low + (high - low) / 2;
String s1 = LPS(arr, low, mid);
String s2 = LPS(arr, mid + 1, high);
return (helper(s1, s2));
}
return null;
}
```

```
public String helper(String s1, String s2) {
StringBuilder sb = new StringBuilder();
int mLen = min(s1.length(), s2.length());
for (int i = 0; < mLen; i++) {
if(s1.charAt(i) != s2.charAt(i)) {
break;
}
sb.append(i);
}
return sb.toString();
}
public String LPS(String arr[], int low, int high) {
if(low == high) return arr[low];
if(high > low) {
int mid = low + (high - low) / 2;
String s1 = LPS(arr, low, mid);
String s2 = LPS(arr, mid + 1, high);
return (helper(s1, s2));
}
return null;
}
```

A solution in Go. I'm not convinced it's actually O(n) time though. I think it ends up being O(kn) where k is the average length of a substring (excluding s being a substring of itself).

That's because on each loop through, on average it loops through 'k' candidates to check for progress in my solution.

I don't know how you'd do this in O(n); none of the current solutions appear to be that efficient.

```
package main
import (
"fmt"
"sort"
"strings"
)
func main() {
s := "ababac"
result := LongestPrefixesFrom(s)
fmt.Println(FormatResult(s, result))
}
// Takes a string. Returns a map from start index to length for substrings of s
// that are also prefixes of s.
func LongestPrefixesFrom(s string) map[int]int {
// HashMap of strings from their start point to length that are not ongoing
// in our current scan.
// Pre-insert 's' since it's always a substring of itself of length s
res := map[int]int{
0: len(s),
}
// HashSet of candidate. As we move through indexes, each candidate in this
// set is valid for index i-1, starting at the index recorded in the hashset.
// That is to say, with the example string 'ababc', the hashset might contain
// '2' to indicate a valid substring starts at index '2'.
// On the next index, 3, it would be checked this substring is still valid by
// checking if the value at index 3 (1 away from the start of 2) matches the
// value 1 from the start.
// In this example it does match. The check will repeat at index 4 (2 away
// from the start at index 2). Since the value at index 2 doesn't match, that
// means the last valid character in the substring was the previous one, so
// it can now be removed from the currently tracked substrings and moved into
// the result set.
candidates := map[int]struct{}{}
for i := 0; i < len(s); i++ {
for candidate := range candidates {
// candidate is something like 'index 5' indicating that a prefix of s starts at 5.
// To check if this candidate is still valid from 5 to i, we just check
if s[i-candidate] != s[i] {
delete(candidates, candidate)
res[candidate] = i - candidate
}
}
if s[0] == s[i] {
candidates[i] = struct{}{}
} else {
res[i] = 0
}
}
// add any existing candidates; those run until the end of the string
for candidate := range candidates {
res[candidate] = len(s) - candidate
}
return res
}
func FormatResult(s string, res map[int]int) string {
parts := []string{}
for key, val := range res {
// +1 for 1 index instead of 0 index to match the question
str := fmt.Sprintf("len(LCP(s(%v,%v)) = %v", key+1, len(s), val)
parts = append(parts, str)
}
// sort because map iteration is random in go
sort.Strings(parts)
return strings.Join(parts, "\n")
}
```

Hi DaveX, could we maybe try to use a Trie for this situation?. Like insert the word "ababac" and just take substrings of it by walking down the trie? I could be wrong, but wouldn't it be O(n) time then? What are your thoughts?

Let's say you have the worst case string, which is "aaaaa...".

Your trie is now just "a -> a -> a -> a ..." (which is identical your list of "a"s really).

Let's look at the runtime of lookups in that trie: To lookup the first character, you walk n characters in the trie. For the second, you walk n-1 characters, then n-2, n-3, etc.

That ends up being the sum of 0..n, which is O(n^2).

I don't think a trie is actually any more efficient than searching the list itself because there's only one prefix to search, not a bunch, and tries only really help if there are a bunch of prefixes.

Python3

inp = 'ababac'

sdict = {}

for i in range(len(inp)):

sdict[i] = inp[i:]

count = 0

LCP = {}

for sub_s in sdict.values():

for i in range(1,len(sub_s)+1):

if sub_s[:i] == inp:

LCP[sub_s] = len(inp)

count = 0

break

elif inp.startswith(sub_s[:i]):

count += 1

else:

LCP[sub_s] = count

count = 0

break

for key,value in LCP.items():

print('{}: {}'.format(key,value))

Something like this should be ok..

The program should be pretty much self explanatory

Hope this helps

- PeyarTheriyaa December 02, 2018