Amazon Interview Question
Software Engineer / DevelopersTeam: RCX
Country: United States
Interview Type: In-Person
but unicode character may consist of not only english alphabets!!
Do you think an array of length 26 will work??
Of course it won't work. You need an array of length 2^16-1. Whether or not this is acceptable depends on the situation.
1) Initially sort the array
After sorting the following code can solve this
main()
{
char a[]={a,a,a,a,b,b,c,c,d,d,e,e,e}; // Can be sorted in O(nlogn)
int max=0,curr=1,i;
char f;
for(i=0;i<9;i++)
{
if(a[i]==a[i+1])
{
curr++;
}
else
{
if(curr>max)
{
max=curr;
curr=1;
f=a[i];
}
}
}
printf("The max occuring character is %c",f);
}
This assumes you can destroy the original array, which you should ask the interviewer about.
1. Sort the array - O(n lg n)
2. Do a linear scan over the array. For each character, store the frequency in the array itself. So for example:
a,b,b,b,c,c,c,c,c,d,e,e,f,f,f will result in 1, b,3, c,5, e,2, f,3
3. Now scan the array once up to n-1 and:
a. If arr[i] is a number, i+1 // this means that this character had a frequency of 1
b. If arr[i] is a char && arr[i+1] > maxFrequency, update maxFrequency, i+2
c. else i+1;
Code assumes that at least one character repeats more than once.
Also, if n is known to be small, we can do Counting Sort and make the whole program run in O(n) time
Sorting the array has overhead: time for sure + space, if you are be allowed to mess up the original array!
For most "fast" sorts O(n log n) has a big constant multiplier for small Ns. So it might or might not be feasible. For small Ns a simple (e.g. insertion) sort with worse asymptotic behaviour could be better.
Can we use first part of the distribution counting here?
We can create a distribution table for the unicode characters i.e. D[0..255] and store the frequency of each character in D. Then, find the largest number in D. I agree this will be similar to Hash table, but what if n is very small?? This would be most efficient in such cases since its efficiency is linear.
Iterate array for the element and compare.. whenever comparison done , shift the values to left so as to avoid comparison for other disticnt element:
Forexample: a, b, a, c, a, b
1st iteration: comparing a , will change the array as: a ,b,c,b . Store the max(a) count to be in some temp variable.
similar for b iteration: a, b, c . Now check the max of count and set the variables as required.
By this we can avoid many comaprison. However , in worst case, such as a, b ,c ,d, e
complexity will be O(n2)
1. Make a sorted copy of the array
2. Run longest common sub-sequence to get the most repeated character
That's one way I can think of avoiding hash tables, but then sorted copy of array will take O(n) space.
This is actually a very unique way of finding max character! However, it will take up extra space, which kinda defeats the purpose of the question.
Hash tables are quite generic stuff. You can use a hash table but the hash function should be a bit more sophisticated than the character value (with Unicode this is 16 bits, resulting in a potentially huge hash table). A hash functions to add up/XORing the two bytes of the unicode char will do, size of hash table is reasonable (256 items). This case the items are linked lists of (char, count) pairs.
1) Create a new array with length 26. Initialize all the elements with 0. Let's call it int[] countArray
- Msingh January 06, 20122) Iterate the given char array and if char is 'a' then increment the value at index 0, if it is 'b' then increment the value at index '1' and so on... When reading the given array is finished iterate the countArray and print the character corresponding to highest index value