You have to compress a string

Flipkart Interview Question for SDE-2s

3

of 3 votes

19
Answers
You have to compress a string in the following format.
eg 1- : input : aasasatb
output : 2a2sa1t1b
eg2 -: input: abcdbcdff
output :- 1a2bcd2f
- neer.1304 May 11, 2015 in India | Report Duplicate | Flag | PURGE
Flipkart SDE-2 Algorithm

Email me when people comment.

An error occurred in subscribing you.

Country: India
Interview Type: In-Person

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 1 vote

Greedily tries to match as many characters as possible, which is admittedly not optimal for some strings.

std::string encode(char* inStr)
{
   std::string rval;
   while (*inStr)
   {
      int dups = 0;
      int tryLen = std::max((int)strlen(inStr)/2, 1);
      for (; tryLen && !dups; tryLen--)
         for (; !strncmp(inStr + tryLen*(dups+1), inStr, tryLen); dups++);
      rval += std::to_string(dups+1) + std::string(inStr,tryLen+1);
      inStr += (tryLen+1)*(dups+1);
   }
   return rval;
}
void main()
{
   char* tests[] = {"aasasatb", "abcdbcdff"};
   for (int i = 0; i < sizeof(tests)/4; i++)
      std::cout << tests[i] << " -> " << encode(tests[i]) << '\n';
   getch();
}

output:

aasasatb -> 2a2sa1t1b
abcdbcdff -> 1a2bcd2f

- tjcbs2 May 12, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

/*	output:
aaaasasatb = 4a2sa1tb
aaaaabaaaaabaaaaabaaaaab = 4aaaaab
aaaaabaaaaabaaaaabZaaaaab = 3aaaaab1Z5a1b
abcdbcdff = 1a2bcd2f
aaaabbbbabababcdabcdeeeffffefef = 4a4b2ab2abcd3e4f2ef
*/

var shrinkHash = {};

function shrink(str) {
    //    find best cycle
    if (shrinkHash[str]) return shrinkHash[str];
    for (var size = 1; size <= str.length / 2; size++) {
        var chunk = str.substr(0, size);
        var isCycle = true;
        for (var i = chunk.length; i < str.length; i += chunk.length) {
            if (chunk != str.substr(i, chunk.length)) {
                isCycle = false;
                break;
            }
        }
        if (isCycle) {
            return shrinkHash[str] = (str.length / chunk.length) + chunk;
        }
    }
    return shrinkHash[str] = "1" + str;
}

function compress(str, record) {
    if(shrinkHash[str]) {
        return shrinkHash[str];
    }
    if (!str.length) return "";
    var minString = "1" + str;
    if (!record) record = minString.length;
    for (var i = 1; i <= str.length; i++) {
        var left = shrink(str.slice(0, i));
        if (left.length > record) {
            continue;
        }
        var right = compress(str.slice(i), record);
        var finalString = left + right;
        if (finalString.length < minString.length) {
            minString = finalString;
            record = minString.length;
        }
    }
    return shrinkHash[str] = minString;
}

function solve(str) {
    var html = "<b>" + str + "</b> = <i>" + compress(str) + "</i>";
    var div = document.createElement("div");
    div.innerHTML = html;
    document.body.appendChild(div);
}

solve("aaaasasatb");
solve("aaaaabaaaaabaaaaabaaaaab");
solve("aaaaabaaaaabaaaaabZaaaaab");
solve("abcdbcdff");
solve("aaaabbbbabababcdabcdeeeffffefef");

- Jack Le Hamster May 13, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

I thought I'd describe the algorithm above.
Basically, we keep cutting out a chunk of the string, and recursively call solve on the rest of it:
So for aaaasasatb, cut "a" and call compress on "aaasasatb".
Then cut "aa", compress("aasasatb")
...
On each chunk that we cut, we see if we can shrink it, using the shrink function.
So we'll get
"1a" + compress("aaasasatb")
"2a" + compress("aasasatb")
"3a" + compress("asasatb")
"4a" + compress("sasatb")
"1aaaas"+compress("asastb") <= note that it's unable to shrink "aaaas".
...
"1aaaasasat"+compress("b")

Out of all the result above, we'll look for the shortest string, which is going to be "4a"+compress("sasatb").

The shrink function work as follow: Try to see if there's a cycle until the end with the first letter, then with the first 2 letter, first 3 letters ...
So for "sasa", see if there's a cycle with "s". NO
See if there's a cycle with "sa". YES. So turn "sasa" into "2sa".

It seems like a lot of computation, but thanks to memorization (via shrinkHash), it's quite fast!

- Jack Le Hamster May 13, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Not done, but this one is taking a while, and some interesting edge cases popped up. This doesn't seem realistic in terms of time for a normal code-interview, unless it's an offline interview.

Java:

import org.junit.Test;

public class Test1 {
    @Test
    public void compress() {
        System.out.println(compress("aasasatb"));
        System.out.println(compress("aaaaabaaaaabaaaaabaaaaab"));
        System.out.println(compress("aaaaabaaaaabaaaaabZaaaaab"));
        System.out.println(compress("abcdbcdff"));
        System.out.println(compress("aaaabbbbabababcdabcdeeeffffefef"));

//        output:
//        2a2sa1t1b
//        2aaaaabaaaaab
//        3aaaaab1Z2aa1a1b
//        1a2bcd2f
//        2aa2bb3ab1c1d1a1b1c1d3e2ff2ef
    }

    public String compress(String value) {
        int length = value.length();
        int cursor = 0;

        String compressed = "";//TODO stringbuilder

        while (cursor < length) {
            int end = (cursor + (length - cursor) / 2);

            int matchIndex = findMatchingCompressionIndex(value, cursor, end);

            int quantity = 1;
            if (matchIndex != -1) {
                quantity++;
                int matchLength = matchIndex - cursor;
                int next = matchIndex + matchLength;

                while(next+matchLength < length && subStringMatches(value, cursor, next, matchLength)) {
                    next += matchLength;
                    quantity++;
                }
                compressed += quantity + value.substring(cursor, cursor+ matchLength);
                cursor += quantity*matchLength;
            } else {
                compressed += quantity + ""+ value.charAt(cursor);
                cursor ++;
            }
        }

        return compressed;
    }


    public int findMatchingCompressionIndex(String value, int startIndex, int end) {
        for (int i = end; i > startIndex; i--) {
            if (subStringMatches(value, startIndex, i, i-startIndex)){
                return i;
            }
        }
        return -1;
    }

    public boolean subStringMatches (String value, int start1, int start2, int length) {
        for (int i = 0; i < length; i++) {
            if (value.charAt(start1+i) != value.charAt(start2+i)) {
                return false;
            }
        }
        return true;
    }
}

Of the interesting edge-cases, you'll notice that test #2 returned "2aaaaabaaaaab" when "4aaaaab" would have been more appropriate. Maybe I'll fix that later. I feel that "if (matchIndex != -1)" piece could be refactored to be more readable and modular. Beyond that my personal timer ran out.

- psyfireDev May 12, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

import java.util.HashSet;
import java.util.Set;

public class TestJava {
	
	public static void main(String[] args){
		
		Set<String> setstring = new HashSet<>() ;
		String input1 = "AABBBCCCCD";
		String output = "";
		int count =0;
		
		String input = input1.toLowerCase().trim();
		
		for(int i= 0;i<input1.length();i++){
			
		
			setstring.add(String.valueOf(input.charAt(i)));
			
			
		}
		
		setstring.toString().toLowerCase();
		
		System.out.println(setstring.toString());
		
		for (int j=0;j<setstring.size();j++){
			
			
			for(int k= 0;k<input.length();k++)
			{
			if(String.valueOf(input.charAt(k)).equals(setstring.toArray()[j].toString())){
				count ++;
			
				
					
			}
			
			}
			
			output = output+setstring.toArray()[j].toString()+count;
			count = 0;
		
			
		}
		System.out.println(output);
	}
	
}

- Satyam May 12, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

IIRC its implementation of lempel ziv compression algorithm.

- Manku May 12, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

public class CountStrLenght {

	public static void main(String[] args) {
		
			// TODO Auto-generated method stub
			String input = "aabbhmhmhmhmhmhmxccddklklmnmnmnoaadmkbcdbcd";
			String finalString = input;
			int strLen = input.length();
			Set<Integer> oPositionSet = new LinkedHashSet<Integer>();
			HashMap<String, String> oMap = new LinkedHashMap<String, String>();

			int subLen = 1;
			while (subLen <= strLen / 2) {
				int preIndex = 0;
				while (preIndex + subLen < strLen) {

					String cmpStr = input
							.substring(preIndex, preIndex + subLen);
					String remStr = input.substring(preIndex + subLen, strLen);
					int wordcount = 0;

					while (remStr.length() >= cmpStr.length()
							&& (remStr.startsWith(cmpStr))) {
						if (wordcount == 0) {
							preIndex = preIndex + subLen;
						}
						wordcount++;
						preIndex = preIndex + subLen;
						remStr = input.substring(preIndex, strLen);

					}
					if (wordcount > 0) {
						String newStr = String.valueOf(wordcount + 1) + cmpStr;
						String toReplace = cmpStr;
						while (wordcount > 0) {
							toReplace += cmpStr;
							wordcount--;
						}

						int strIndex = input.indexOf(toReplace, preIndex
								- toReplace.length());
						int strEndIndex = strIndex + toReplace.length();
						
						boolean existPosition = true;
						for (int k = strIndex; k < strEndIndex; k++) {
							if (!oPositionSet.add(k))
								existPosition = false;
						}
						if (existPosition)
							oMap.put(toReplace, newStr);

					} else {
						preIndex++;
					}

				}
				subLen++;

			}

			finalString = input;
			for (int k = 0, i = 0; k < input.length(); k++, i++) {
				if (!oPositionSet.contains(Integer.valueOf(k))) {
					String tonewReplace = "1" + String.valueOf(input.charAt(k));
					finalString = finalString.substring(0, i)
							+ tonewReplace
							+ finalString
									.substring(i + 1, finalString.length());
					i++;
				}
			}

			Iterator<String> oIter = oMap.keySet().iterator();
			while (oIter.hasNext()) {
				String key = oIter.next();
				finalString = finalString.replaceAll(key, oMap.get(key));
			}
			System.out.println(finalString);

		
	}
}

- Raveesha May 12, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

can you please provide algorithm or approach first before pasting code here

- siva.sai.2020 May 13, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

public class CountStrLenght {

	public static void main(String[] args) {
		
			// TODO Auto-generated method stub
			String input = "aabbhmhmhmhmhmhmxccddklklmnmnmnoaadmkbcdbcd";
			String finalString = input;
			int strLen = input.length();
			Set<Integer> oPositionSet = new LinkedHashSet<Integer>();
			HashMap<String, String> oMap = new LinkedHashMap<String, String>();

			int subLen = 1;
			while (subLen <= strLen / 2) {
				int preIndex = 0;
				while (preIndex + subLen < strLen) {

					String cmpStr = input
							.substring(preIndex, preIndex + subLen);
					String remStr = input.substring(preIndex + subLen, strLen);
					int wordcount = 0;

					while (remStr.length() >= cmpStr.length()
							&& (remStr.startsWith(cmpStr))) {
						if (wordcount == 0) {
							preIndex = preIndex + subLen;
						}
						wordcount++;
						preIndex = preIndex + subLen;
						remStr = input.substring(preIndex, strLen);

					}
					if (wordcount > 0) {
						String newStr = String.valueOf(wordcount + 1) + cmpStr;
						String toReplace = cmpStr;
						while (wordcount > 0) {
							toReplace += cmpStr;
							wordcount--;
						}

						int strIndex = input.indexOf(toReplace, preIndex
								- toReplace.length());
						int strEndIndex = strIndex + toReplace.length();
						
						boolean existPosition = true;
						for (int k = strIndex; k < strEndIndex; k++) {
							if (!oPositionSet.add(k))
								existPosition = false;
						}
						if (existPosition)
							oMap.put(toReplace, newStr);

					} else {
						preIndex++;
					}

				}
				subLen++;

			}

			finalString = input;
			for (int k = 0, i = 0; k < input.length(); k++, i++) {
				if (!oPositionSet.contains(Integer.valueOf(k))) {
					String tonewReplace = "1" + String.valueOf(input.charAt(k));
					finalString = finalString.substring(0, i)
							+ tonewReplace
							+ finalString
									.substring(i + 1, finalString.length());
					i++;
				}
			}

			Iterator<String> oIter = oMap.keySet().iterator();
			while (oIter.hasNext()) {
				String key = oIter.next();
				finalString = finalString.replaceAll(key, oMap.get(key));
			}
			System.out.println(finalString);

		
	}
}

- Raveesha May 12, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Tried few test cases but got wrong answer for them

1. String input = "aasasatb"; O/p - 2asas1a1t1b Exp O/p - 2a2sa1t1b
2. String input = "aaaaabaaaaabaaaaabaaaaab" 5ab5ab5ab5ab Exp O/p- 4aaaaab
3. String input = "aaaaabaaaaabaaaaabZaaaaab" 5ab5ab5ab1Z5a1b Exp O/p - 3aaaaab1Z4a1b

- neer.1304 May 13, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

A functional O(n^2) approach:

private static class Chunk{
    Chunk next;
    ArrayList<Character> charArr;
    int count;

    private Chunk(){
        this.char = new ArrayList<Character>();
    }
}

public static String encode(String input){
    //handle simple outputs
    if(input == null || input.isEmpty()){
        return "";
    }
    /get the input
    char[] arr = input.toCharArray();
    //build a head for the Chunking chain
    Chunk head = new Chunk();
    head.charArr.add(arr[0]);
    head.count++;
    //grab a pointer to the current chunk being processed
    Chunk currentChunk = head;
    int inputIndex = 1;
    //while there is content to be processed
    while(inputIndex < arr.length){
        int chunkLocation = 0;
        int inputIndexRunner = inputIndex;
        //send a runner to see if the content of the string matches the 
        //current chunk
        boolean matches = true;
        while(chunkLocation < currentChunk.charArr.size()){
            if(arr[inputIndexRunner] != currentChunk.charArr.get(chunkLocation)){
                matches = false;
                break;
            }
            chunkLocation++;
            inputIndexRunner++;
        }
        //if it matches, then just up the count on this chunk
        if(matches){
            currentChunk.count++;
            inputIndex += currentChunk.charArr.size();
        }
        else{
           //if the current chunk only passes through once, 
           //add the character to the chunk
           if(currentChunk.count == 1){
               currentChunk.charArr.add(arr[inputIndex]);
           }
           //otherwise this has to be a new chunk
           else{
               Chunk next = new Chunk();
               currentChunk.next = next;
               currentChunk = currentChunk.next;
               currentChunk.charArr.add(arr[inputIndex]);
               currentChunk.count = 1;
           }
           inputIndex++;
        }
    }

    //process the chunks to output
    return processChunksToString(head);
}

private static String processChunksToString(Chunk chunk){
    StringBuilder builder = new StringBuilder();
    while(chunk != null){
        builder.append(chunk.count);
        for(Character c : chunk.charArr){
            builder.append(c);
        }
        chunk = chunk.next;
    }
    return builder.toString();
}

- zortlord May 12, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Simple brute force attack.

public class Solver {


    public static void main(String... args) {

        Solver solve = new Solver();
        System.out.println(solve.solve("aasasatb"));
        System.out.println(solve.solve("abcdbcdff"));
    }

    public String solve(String a) {
        StringBuilder sb = new StringBuilder();

        int foundCounter;
        int curSize;
        for (int index = 0; index < a.length(); index = index + (foundCounter * (curSize+1))) {
            foundCounter = 1;
            for (curSize =Math.max( (a.length() - index)/ 2,1) ; curSize >= 1 && foundCounter == 1; curSize--) {
                for (int i = index + curSize; i < a.length(); i = i + curSize) {
                    if (a.substring(index, index + curSize).equals(a.substring(i, Math.min(i + curSize, a.length())))) {
                        foundCounter++;
                    } else {
                        break;
                    }
                }
                if (foundCounter > 1 || curSize == 1) {
                    sb.append(foundCounter).append(a.substring(index, index + curSize));
                }
            }
        }
        return sb.toString();

    }


}

- Gil May 13, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

public class CountStrLengthNew {

	public static void main(String[] args) {
		// TODO Auto-generated method stub
		System.out.println("ttllmmnn                  ------------After Compress ---------"+compress("ttllmmnn"));
		System.out.println("aaaaabaaaaabaaaaabZaaaaab ------------After Compress ---------"+compress("aaaaabaaaaabaaaaabZaaaaab"));
		System.out.println("aasasatb                  ------------After Compress ---------"+compress("aasasatb"));
		System.out.println("ababababababababab        ------------After Compress ---------"+compress("ababababababababab"));
		System.out.println("aabbaabba                 ------------After Compress ---------"+compress("aabbaabba"));
		System.out.println("abababababababababkc      ------------After Compress ---------"+compress("abababababababababkc"));
		System.out.println("aaaaabaaaaabaaaaabZaaaaab ------------After Compress ---------"+compress("aaaaabaaaaabaaaaabZaaaaab"));
		
	}

	public static String compress(String input) {
		int inputLength = input.length();
		int compLength = input.length() / 2;
		String compressedString = "";
		for (int startIndex = 0; startIndex < input.length(); startIndex++) {
			int wordcount = 0;
			compLength = (input.length() - startIndex) / 2;

			if (input.length() - startIndex == 1) {
				compressedString = compressedString + 1
						+ String.valueOf(input.charAt(inputLength - 1));
				break;
			}

			while (compLength != 0) {
				String cmpStr = input.substring(startIndex, startIndex
						+ compLength);
				String remStr = input.substring(startIndex + compLength,
						inputLength);

				while (remStr.length() >= cmpStr.length()
						&& (remStr.startsWith(cmpStr))) {
					if (wordcount == 0) {
						startIndex = startIndex + compLength;
					}
					wordcount++;
					startIndex = startIndex + compLength;
					remStr = input.substring(startIndex, input.length());
				}
				if (wordcount != 0) {

					int newInterVals = getInternalPatternLen(cmpStr);
					wordcount = (wordcount + 1) * newInterVals;
					int newPatterLength = cmpStr.length() / newInterVals;
					String newString = cmpStr.substring(0, newPatterLength);

					if (remStr != null && remStr.length() >= newPatterLength
							&& remStr.startsWith(newString)) {
						wordcount++;
						startIndex = startIndex + newPatterLength;
					}

					compressedString = compressedString + wordcount + newString;

					startIndex--;
					break;
				} else {
					compLength--;
				}
			}
			if (wordcount == 0 && compLength == 0) {
				compressedString = compressedString + 1
						+ input.substring(startIndex, startIndex + 1);
			}
		}

		return compressedString;
	}

	public static int getInternalPatternLen(String cmpStr) {
		int intervalcounter = 1;
		int count = 1;
		if (cmpStr.length() == 1)
			return intervalcounter;
		while (count <= cmpStr.length() / 2) {
			intervalcounter = 1;
			for (int k = count; k < cmpStr.length(); k += count) {
				if (cmpStr.substring(0, count).equals(
						cmpStr.substring(k, count + k))) {
					intervalcounter++;
				} else {
					intervalcounter = 1;
					break;
				}
			}
			if (intervalcounter != 1)
				break;
			count++;
		}
		return intervalcounter;
	}

}

- raveeshajoshi May 14, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

You can see my thoughts below
1) keep starting index 0, take half of the string (compare length)and start comparing with remaining
2) decrement compare counter length if there is no mach in previous compare length
3) If there is a match find count of the match
4) if no matches till it reaches count 1 , mark it as 1<first character> , move index by 1
5) Repeat these steps until index moves to input string length
6) There is a extra method call added once number of matching words found to find out if any internal matchings available. ex:abababab ,

- raveeshajoshi May 14, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

public class StringCompress
{
    public static String compress(final String source)
    {
        String str = source;
        final StringBuilder sb = new StringBuilder();
        int i = 0;
        int max = 1;
        max = firstSubstringWithoutDuplication(source);
 
        String last = str.substring(i,i+max);
        int j=i+1;
        i = i+max; 
        int count = 1;
        boolean flag = true;
        String current = "";
        while (i < str.length())
        {
            if(i+max<=str.length())
               current = str.substring(i,i+max);
            else 
                current = str.substring(i);
            if (current.equals(last))
            {
                count++;
                i = i+max;
                flag = false;
            }
            else if(flag)
            {
                sb.append(count);
                sb.append(last.charAt(0));
                count = 1;
                max = firstSubstringWithoutDuplication(source.substring(j));
                if(j+max<=str.length())
                   last = str.substring(j,j+max);
                else
                   last = str.substring(j);
                i=j+max;
                j=j+1;
            }
            else
            {
                sb.append(count);
                sb.append(last);
                count = 1;
                max = firstSubstringWithoutDuplication(source.substring(i));
                last = str.substring(i,i+max);
                i=i+max;
 
            }
        }
        sb.append(count);
        sb.append(last);
        return sb.toString();
    }
 
    public static void main(final String[] args)
    {
       System.out.println(compress("aab"));
       System.out.println(compress("aasasatb"));
       System.out.println(compress("abcdbcdff"));
       System.out.println(compress("xyzabcdbcdff"));
    }
 
    public static int firstSubstringWithoutDuplication(String str) {
        int maxLength = 1;
 
        int checker = 0;
        for (int i = 0; i < str.length(); i++) {
            int val = str.charAt(i) - 'a';
            if ((checker & (1 << val)) > 0){ return i; }
            checker |= (1 << val);
        }
 
        return maxLength;
    }
}

- shaileshagarwal1 October 13, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

With smallest use of memory and O(n^2) .

- shaileshagarwal1 October 13, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

String S= "aaabbbbccdeee";
 int count=1; 
 for(i=1;i<S.length;i++)
{ if(S.charAt(i)!=S.charAt(i-1))
    { 
      S1.append(count).append(S.charAt(i))
      count=1;
    }
else
     count++;

- rohit November 21, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

string& RunLengthEncodedString(char* pString)
{
int len = strlen(pString) - 1;
char *last = pString;
string myStr = "";
int count = 0;
for (auto i = 0; i < len; i++)
{
if (pString[i] == *last)
{
count++;
}
else
{
myStr += (*last);
myStr += count;
count = 1;
*last = pString[i];

}
}

myStr += (*last);
myStr += count;
return myStr;

}

- Prabhat Kumar Srivastava May 12, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

import java.util.HashSet;
import java.util.Set;

public class TestJava {

public static void main(String[] args){

Set<String> setstring = new HashSet<>() ;
String input1 = "AABBBCCCCD";
String output = "";
int count =0;

String input = input1.toLowerCase().trim();

for(int i= 0;i<input1.length();i++){

setstring.add(String.valueOf(input.charAt(i)));

}

setstring.toString().toLowerCase();

System.out.println(setstring.toString());

for (int j=0;j<setstring.size();j++){

for(int k= 0;k<input.length();k++)
{
if(String.valueOf(input.charAt(k)).equals(setstring.toArray()[j].toString())){
count ++;

}

}

output = output+setstring.toArray()[j].toString()+count;
count = 0;

}
System.out.println(output);
}

}

CareerCup

Flipkart Interview Question for SDE-2s

Books

Videos

Resume Review

Mock Interviews