Google Interview Question
Software Engineer / DevelopersCountry: India
Interview Type: Written Test
There is one little mistake in your code. Try the string "BCDEBCDBCC"
Below is the corrected code.
#include<iostream>
#include<string>
std::string lex_min(const std::string&);
int main() {
std::cout<<lex_min("BCDEBCDBCC");
return 0;
}
std::string lex_min(const std::string& str) {
int size = str.size();
std::string input = str + str;
int offset = 0;
int answer = -1;
for (int i = 0; i < size * 2; i++) {
if (answer == -1)
{
// First character
answer = i;
}
else if (input[i] < input[answer]) {
// We definitely have a new answer here
answer = i;
// Reset the offset for future tie
offset = 0;
}
else if (input[i] == input[answer + offset]) {
// We might have another answer here
// move the offset forward
offset++;
}
else if (input[i] < input[answer + offset]) {
// we have found something even better
// than what we had before
// Set marker for new answer
answer = i - offset;
offset = 0;
if(input[i] == input[answer]) /* THIS IS THE CORRECTION*/
offset=1;
}
else {
offset = 0;
}
}
return input.substr(answer, size);
}
A better solution with time complexity O(n) and space complexity O(1).
Use the above algorithm without concatenating the string. (i.e. traverse only the original string)
But run the 'for' loop 2*size (same as above).
Just use index as ( i mod size ) and ((answer+offset) mod size). It will automatically make indices within the range.
I read Pratyay's code and Tarang's comments and came up with this C# version.
public static void LexicalMin(string word)
{
int idx = 0;
for (int i = 1; i < word.Length; i++)
{
int offset = 0;
for (int j = i; (j+1)%word.Length != i; j = (j + 1) % word.Length)
{
if (word[j] < word[(idx+offset)%word.Length])
{
idx = i;
break;
}
else if (word[j] > word[idx + offset])
{
break;
}
offset++;
}
}
//idx now is the beginning of the word that we need
Console.WriteLine(idx+" "+word[idx]);
}
Given string S.
For String S' = SS (append S to itself).
Compute suffix tree (ST) of S.
Now do a depth first search of ST, picking the children in lexicographic order.
Pick the first node you find, at depth |S|.
Linear time algorithm.
Obviously, this ignores the 'write code' part, but the question was unclear about working code, time limits etc.
The algorithm does work, and might be a good thing to discuss with your interviewer.
Manan, you don't have to downvote because of that.
(Or is there a mistake?).
We can first identify the lexicograpically first character and then start building suffix tree neglecting other starting characters. Also to be frank we don't need to build whole trie we can just keep replacing an array of characters as we are only interested in knowing the leading branch. Let me know if this is correct?
Easy to find one Omega(nlogn) solution. n = |s|
(1) s = s + s, for example, s="abc", then s = "abc"+"abc"="abcabc"
(2) for any integer i, t[i]=s[i,i+1...i+n-1]
(3) we want to find such i that t[i] is lexicographic mininum.
now we can not store all t[i] (the space may be O(n^2))
so we want to write one function cmp(i,j)
to compare t[i] and t[j], but how?
First of all, we calculate the Hash array. O(2n)=O(n)
Hash[i]=s[0]*p^(i)+s[1]*p^(i-1)+...+s[i-1]*p^1+s[i]
If we want to get the Hash value of the substring s[x..y], we can simply have:
Hash(x..y) = Hash[y]-Hash[x-1]*pow(p,y-x+1) O(1) // we can calculate pow(p,*) at first in O(2n)=O(n) time
Then for 2 given integer i and j, we wanna compare t[i] and t[j]
(1) we find LCP(t[i],t[j]). Binary search with Hash O(logn)
(2) compare the (LCP(t[i],t[j])+1) -th value;
for example:
t[i]="abcd"
t[j]="abef"
LCP(t[i],t[j])=2
so we compare t[i][3] = 'c' and t[j][3] = 'e'
then we get the result.
Now we can compare t[i], t[j] in O(logn) time
Then we use one random algorithm to find the k-th "i". O(n * logn) (The algorithm is something like the quick-sort)
Algorithm
maintain the starting index of the minimum string and the ending index from the start, which contains duplicate characters with the substring ending at the current index
if the character at the current index is equal to the character at the ending index, increment the ending index
if the character at the current index is smaller than the character at the ending index, set the ending index to zero
if the character at the current index is bigger than the character at the ending index, set the start to the index where the duplicate substring ending at the current index starts
Time: O(n)
public String minString(String a)
{
int n = a.length();
int start = 0; // the starting index of the minimum string
int i = 0; // the ending index from start, which contains duplicate characters
for(int k = 1; k < n || i % n > 0 ; k++) {
if(a.charAt((start + i) % n) == a.charAt((start + k) % n)) {
i++;
} else if(a.charAt((start + i) % n) < a.charAt((start + k) % n)) {
i = 0;
} else {
start = (start + k - i) % n;
k = (k - i - 1) % n;
i = 0;
}
}
return a.substring(start) + a.substring(0, start);
}
public String minArr(String s)
{
char min = 'Z';
for(int i = 0; i < s.length(); i++)
if(s.charAt(i) < min)
min = s.charAt(i);
ArrayList<Integer> pos = new ArrayList<Integer>();
for(int i = 0; i < s.length(); i++)
if(s.charAt(i) == min)
pos.add(i);
int p = getMinPos(pos, s);
return s.substring(p) + s.substring(0, p);
}
public int getMinPos(ArrayList<Integer> pos, String s)
{
int add = 1;
int len = s.length();
s += s;
while(pos.size() > 1 && add < len)
{
for(int i = 1; i < pos.size(); )
{
if(s.charAt(pos.get(i)+add) < s.charAt(pos.get(i-1)+add))
pos.remove(i-1);
else if(s.charAt(pos.get(i)+add) > s.charAt(pos.get(i-1)+add))
pos.remove(i);
else
i++;
}
add++;
}
return pos.get(0);
}
1) Use a trie and insert in it all the N length strings like BCABDADAB,CABDADABB etc. Find the left most path in this trie.
O(N*N) , O(N*N)
2) Find out all the rounded strings of length N (the strings used in 1)) that start with the smallest char i.e ABDADABBC,ADABBCABD,ABBCABDAD (all start with A, ans would be one of these) Find lexicographical min in O(N).
3) Or instead of copying into temp strings (3 O(N) strings)) use 3 pointers with start address of 3 A's. Max O(N) space for index pointers.
for(l=0;l<N;l++)
{
for(k=0;k<ptListLen,k++) //len =3
find the min of Str[(ptList[k]+l)%N] (when l=1, you will be comparing B,D,B 1st index in the three strings.
if(multiple equal mins (like B) discard other pointer indices other that min and continue searching till you get only only 1 min.
}
T : O(N*N), S=O(N) : time can be reduced is getting the count of mins can be reduced somehow
since we're comparing strings with the same length, assuming there are N letters in the alphabet, you'd suffice to transform the string and it's rotated versions to an integer by changing from base N to for example base 10 and keep a record of the smallest.
This can be done on O(n) time and O(1) space.
Brute force would be O(n*n). That is comparing all possible permutation.
We can do little better if we calculate permutation only starting with minimum element i.e. A. In this example we have 3 permutations for that.
Treat the string as a number in base 27.
for (i = string.length-1; i >= 0; i--)
number = string[i] + number * 27;
min_location = string.length - 1;
min_number = number;
for (i = string.length-1; i >= 0; i--) {
new_number = (number - (string[i] * pow (27 , 4))) * 27 + string[i];
if (new_number < min_number) {
min_location = i;
min_number = new_number;
}
}
You will quickly overflow your int. If you try to take care of that, it essentially becomes brute force.
You can divide everything by 100 or 1000. Your 27 will become 27 * 10^-3 and you can scale all the char values respectively.
@Dad, Glad I didn't get some eating habits from you.
If you cannot understand the comment, you are not worth wasting time. Goodbye.
LOL @amir.
As to explanation of the comment: Suppose the string had length 1000. Now you try to compute 27 to the power of 999. Guess how many bits it requires?
This is an interesting approach but it has limitation but I think amir will be given some points for this in real interview.. better than people like CuriousCat who just give non-contructive comments in all the posts...
@Manan: You are kidding, right? There is really no point in converting to base 27. In fact, you have already been given the number (as an array) in base-27, if you think about it carefully. All you do by converting to base-27 is waste some time and space (you aren't saving any space either!).
CuriousCat's comments are spot-on in this thread. And, perhaps a surprise to you, those comments are for _your_ benefit and i find them constructive. If you do (just) brute force, interviewers at Google don't look upon you kindly(I believe Gayle might have mentioned this in one of her blog posts). You might get some points, but remember, you already have a 1 in 10 chance of getting in with much better candidates out there. Crap like this will easily put you into the 90% bucket and you are just shooting yourself in the foot.
As to the idea, yes, it might be interesting, but for a different problem. (Read up on rolling hashes).
@Manan: And as to the point about constructive comments. Please give a reason for your downvote to my answer.
Hypocrite.
Anonymous : I totally agree with what you are saying. The suffix tree solution is a better solution if you want to handle arbitrary strings. I wrote this solution because it is more interesting to me and I knew from the beginning that bit size is going to be a problem. :)
@amir: LOL. I don't have any worries about other people being dumb!
You can fool yourself as you please, but your solution is still crap.
I dont know why all people are so dumb,.. Wish all people could be as good as me... sigh!
Another "brute-force" approach, but hopefully less brute-force than others. Time complexity is still O(n^2).
int minStart(char[] arr) {
int len = arr.length;
char[] repeat = new char[2*len];
System.arraycopy(arr, 0, repeat, 0, len);
System.arraycopy(arr, 0, repeat, len, len);
int minStart = 0;
for(int i=1; i<len; i++) {
for(int j=0; j<len; j++) {
if(repeat[minStart+j] < repeat[i+j])
break;
if(repeat[minStart+j] > repeat[i+j]) {
minStart = i;
break;
}
}
}
return minStart;
}
Here is a O(n) solution :
- Adrien November 22, 2012We concat the string twice, then browse the string from left to right,
keeping the the start position of the current answer seen so far.
When we see a letter that would possibly be the start pos of a new answer, we do
not update it yet, instead we keep an offset that defines the size of the prefix
that has been checked for comparison of the current best answer and the possible
new answer.
std::string lex_min(std::string& str)
{
int size = str.size();
std::string input = str + str;
int offset = 0;
int answer = -1;
for (int i = 0; i < size * 2; i++)
{
if (answer == -1)
{
// First character
answer = i;
}
else if (input[i] < input[answer])
{
// We definitely have a new answer here
answer = i;
// Reset the offset for future tie
offset = 0;
}
else if (input[i] == input[answer + offset])
{
// We might have another answer here
// move the offset forward
offset++;
}
else if (input[i] < input[answer + offset])
{
// we have found something even better
// than what we had before
// Set marker for new answer
answer = i - offset;
offset = 0;
}
else
{
offset = 0;
}
}
return input.substr(answer, size);
}