Microsoft Interview Question
Applications DevelopersCountry: India
Interview Type: Written Test
Each integer has 32 bits. So, it can accommodate at-least 9 digits.
The idea is club as many numbers as possible to a single number. While clubbing we have to maintain the number of digits of each number being clubbed. Hence, if input number is 'xyz', then in output we have to write 4xyz. While decoding, use these extra digits as delimiters.
Example:
If we have [1,2,3,4,5,20,123], we can convert into [11121314,152203123] .
Initial memory required=7*4=28 bytes
After compression, memory required=2*4=8 bytes.
I dont think MS wants to test our Data Compression Skill set. I would go for Run Length Encoding which is the simplest.
This is my approach
1. Create a int array of size 10 +1 for separator.
2. Loop through each integer and update its frequency for [111,1923] => 0,4,1,1,0,0,0,0,0,1,1
3. Construct Huffman tree with these most frequently occurred near root.
4. Compress the data and return byte[].
5. Do the inverse in uncompress take byte[] and return int[].
Write two functions:
First function takes in an array and returns the compressed data in any format.
Second function takes in the compressed data and returns the array in original format.
i consider most of these answers as over engineering. Almost all of them have missed the fact that the given list is a "sequence". All you need to do is write the first digit of each sequence and the actual sequence length and keep repeating.
For e.g. 1, 2, 3 7, 8, 13, 17, 18, 19, 20 becomes 1,3,7,2,13,1,17,5
I meant.. 1,3,7,2,13,1,17,4.
But you get the point. Larger the sequence, better is the result.
I think you're misinterpreting the usage of the word "sequence" here. I doubt that the term "sequence" in the problem statement was meant to mean anything special. I think the question is just "Given an array of integers, how would you compress that?"
Another approach is to change from base 10 to larger base like base 67.
- Anonymous November 16, 2012