CareerCup

Amazon Interview Question for Software Engineer / Developers


29 Answers
A stream has two methods:

char getNextChar()

bool hasNextChar()

The stream is expected to contain about 1 million characters, and your application can't store them all.

Find the first unique (non-repeated) character in the stream.
- king Zidane November 15, 2012 in India for Aggregation
Amazon Software Engineer / Developer Data Structures


Team: Aggregation
Country: India
Interview Type: Phone Interview



Use a hash (a state array indexed by character). Each character is in one of 3 states:
0 - Not present in the stream at all
1 - Present once in the stream
2 - Present more than once in the stream
At the end, the characters in state 1 are unique, but we want the first unique character. For this we can keep an ascending heap (min-heap) keyed by the index of each character's first occurrence in the stream. The heap must not contain duplicate characters, so keep another array recording whether a character has already been added.

After all the characters in the stream have been read, take nodes from the root and check whether the character's state in the hash array is 1. If not, go to the next minimum node in the heap and repeat.

Time complexity:
O(n) for hashing into the array.
Insertion into the heap happens only once per distinct character.

So O(n) overall should be the answer.

- Mani November 15, 2012
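
The state array plus min-heap idea above could be sketched roughly as follows. This is only an illustration under assumptions: a single-byte character set, and a hypothetical in-memory `StringStream` standing in for the real stream (only the method names `hasNextChar` and `getNextChar` come from the question). The function name `firstUniqueWithHeap` is also made up for this sketch.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for the stream; only the two method
// names come from the question.
struct StringStream {
    std::string data;
    std::size_t pos;
    explicit StringStream(const std::string& s) : data(s), pos(0) {}
    bool hasNextChar() const { return pos < data.size(); }
    char getNextChar() { return data[pos++]; }
};

// Returns the first unique character as a value in 0..255, or -1 if none.
int firstUniqueWithHeap(StringStream& in) {
    int state[256] = {0};  // 0 = unseen, 1 = seen once, 2 = repeated
    typedef std::pair<long, int> Entry;  // (index of first occurrence, character)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry> > heap;
    long index = 0;
    while (in.hasNextChar()) {
        int c = static_cast<unsigned char>(in.getNextChar());
        if (state[c] == 0) {
            state[c] = 1;
            heap.push(Entry(index, c));  // pushed once per distinct character
        } else {
            state[c] = 2;
        }
        ++index;
    }
    // Pop from the root until a character in state 1 turns up.
    while (!heap.empty()) {
        int c = heap.top().second;
        if (state[c] == 1) return c;
        heap.pop();
    }
    return -1;
}
```

With 256 possible characters the heap never holds more than 256 entries, so the heap work is bounded by a constant and the scan stays linear in the stream length.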


We can just keep an int array sized to the number of Unicode characters. Since an upper limit of 1 million characters is given, an int is wide enough to hold any position. Also keep a count, incremented as each character comes in (starting from 1, so that zero can mean "not seen yet"). If the array entry for the character is zero, set it to the count. If it already holds a value > 0, the character has repeated, so set it to -1 (resetting it to zero would make a third occurrence look like a first one). Once the stream is finished, just iterate through the int array to find the lowest positive count. That index gives the character.

- Anjana November 16, 2012


@mani: if all the characters in the stream are distinct, the insertions into the heap cost O(n log n) by the end of the stream.

- bharat November 19, 2012


@bharat: Technically, in this example the characters are known to be of type "char", which (assuming an 8-bit type whose full range is used) has only 256 possible values. Even assuming Unicode values, the number of distinct characters is (relatively) small compared with the total in the stream.

Still, for this case it is reasonable to argue that a better data structure than a heap exists (even a priority queue, given the 8-bit constraint). But that is not enough computation or memory usage (in most cases) to be overly worried about performance.

- Anthony November 28, 2012


@anthony: UTF-16 encoding can represent 1,112,064 characters.

- bharat November 28, 2012


Why a heap? A queue is sufficient to keep track of the first occurrence of each letter. Keep a count map as well; at the end, just dequeue and check in the count map whether the character's count is 1.

- Tony January 13, 2013
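
The queue plus count map suggestion might look roughly like this. It is a sketch under assumptions: single-byte characters, a plain array serving as the count map, and a hypothetical `StringStream` in place of the real stream. The name `firstUniqueWithQueue` is invented here.

```cpp
#include <cstddef>
#include <queue>
#include <string>

// Hypothetical in-memory stand-in for the stream; only the two method
// names come from the question.
struct StringStream {
    std::string data;
    std::size_t pos;
    explicit StringStream(const std::string& s) : data(s), pos(0) {}
    bool hasNextChar() const { return pos < data.size(); }
    char getNextChar() { return data[pos++]; }
};

// Returns the first unique character as a value in 0..255, or -1 if none.
int firstUniqueWithQueue(StringStream& in) {
    long count[256] = {0};   // count map: occurrences per character
    std::queue<int> order;   // distinct characters, in order of first occurrence
    while (in.hasNextChar()) {
        int c = static_cast<unsigned char>(in.getNextChar());
        if (count[c]++ == 0) order.push(c);  // enqueue only on first sight
    }
    // Dequeue until the front character has a count of exactly 1.
    while (!order.empty()) {
        if (count[order.front()] == 1) return order.front();
        order.pop();
    }
    return -1;
}
```

Because characters enter the queue in order of first occurrence, the queue is already sorted by position and no heap ordering is needed.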


OK, I see an optimization possible with the heap. When you see that a character is repeated, you can go and delete it from the heap. In the end, you can simply return the root.

- Tony January 13, 2013


Deletion will take O(n) time in this case. Instead, you can use a circular array where head points to the earliest character in the stream that is still unique, and each character seen for the first time is added at the tail. When a character is repeated, mark its position in the circular array as dirty by writing -1; if that position is the one pointed to by head, advance the head pointer until it reaches a non -1 value.

So: one array (hash) holding the current status of each character, and one circular array whose head points to the first non-repeated character.

- Psycho July 12, 2013
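
A possible sketch of the head/tail array described above, assuming single-byte characters. Since each of the 256 characters is appended at most once, this version never needs to wrap around, so a plain 256-slot array is used instead of a truly circular one. `FirstUniqueTracker`, `StringStream` and `firstUniqueWithOrderArray` are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical in-memory stand-in for the stream; only the two method
// names come from the question.
struct StringStream {
    std::string data;
    std::size_t pos;
    explicit StringStream(const std::string& s) : data(s), pos(0) {}
    bool hasNextChar() const { return pos < data.size(); }
    char getNextChar() { return data[pos++]; }
};

class FirstUniqueTracker {
public:
    FirstUniqueTracker() : head_(0), tail_(0) {
        for (int i = 0; i < 256; ++i) { slot_[i] = kUnseen; order_[i] = kDirty; }
    }
    void feed(char ch) {
        int c = static_cast<unsigned char>(ch);
        if (slot_[c] == kUnseen) {           // first occurrence: append at tail
            slot_[c] = tail_;
            order_[tail_++] = c;
        } else if (slot_[c] != kRepeated) {  // second occurrence: dirty its slot
            order_[slot_[c]] = kDirty;
            slot_[c] = kRepeated;
        }
        // Advance head past dirty slots (each slot is skipped at most once).
        while (head_ < tail_ && order_[head_] == kDirty) ++head_;
    }
    // First unique character so far (0..255), or -1 if there is none.
    int current() const { return head_ < tail_ ? order_[head_] : -1; }
private:
    enum { kUnseen = -1, kRepeated = -2, kDirty = -1 };
    int slot_[256];   // per character: index into order_, or a marker
    int order_[256];  // characters in order of first occurrence
    int head_, tail_;
};

int firstUniqueWithOrderArray(StringStream& in) {
    FirstUniqueTracker tracker;
    while (in.hasNextChar()) tracker.feed(in.getNextChar());
    return tracker.current();
}
```

One property of this layout is that `current()` can be queried at any point during the stream, not only at the end.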


Like the other solutions, keep a HashMap. As each character is encountered, add it to the HashMap with key = character and value = index where it was found (you can keep an int counter as you iterate over the stream). Once you find that a character is a duplicate, set its value in the HashMap to -1. Then, at the end, simply iterate over the HashMap (which holds at most as many entries as there are characters in the character set) and select the character with the lowest non-negative value.

- Steve September 12, 2013
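
A rough C++ analogue of the HashMap approach, using `std::unordered_map` in place of Java's `HashMap`. The `StringStream` type and the function name `firstUniqueWithMap` are assumptions made for this sketch, not part of the question.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical in-memory stand-in for the stream; only the two method
// names come from the question.
struct StringStream {
    std::string data;
    std::size_t pos;
    explicit StringStream(const std::string& s) : data(s), pos(0) {}
    bool hasNextChar() const { return pos < data.size(); }
    char getNextChar() { return data[pos++]; }
};

// Returns the first unique character as a value in 0..255, or -1 if none.
int firstUniqueWithMap(StringStream& in) {
    std::unordered_map<char, long> firstIndex;  // value -1 marks a duplicate
    long index = 0;
    while (in.hasNextChar()) {
        char c = in.getNextChar();
        auto it = firstIndex.find(c);
        if (it == firstIndex.end()) firstIndex[c] = index;  // first occurrence
        else it->second = -1;                               // duplicate
        ++index;
    }
    // Select the entry with the lowest non-negative index.
    int best = -1;
    long bestIndex = -1;
    for (const auto& kv : firstIndex) {
        if (kv.second >= 0 && (bestIndex < 0 || kv.second < bestIndex)) {
            bestIndex = kv.second;
            best = static_cast<unsigned char>(kv.first);
        }
    }
    return best;
}
```

The final pass visits at most one entry per distinct character, so its cost does not depend on the stream length.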


Assuming a single-byte encoded stream, you can solve this problem in linear time
and constant space by keeping, as in counting sort, an array with one entry per
character value that stores the count and the first position of that character.
Once the stream has no more characters in it, we exit the while loop, which has
already built the stream statistics for us: the count of each character
and the position of its first occurrence in the stream.

So, the problem is reduced to finding the character that has occurred only once
and that has the least occurrence position (the first unique).
================================================================================

#include <cstdio>

/* hasNextChar() and getNextChar() are provided by the stream, as stated in the question */

static const int STAT_SIZE = 256;
static const int INVALID_POS = (-1);

struct char_pos {
    int count;
    int position;
    char_pos() : count(0), position(INVALID_POS) {}
};

int main()
{
    char_pos stream[STAT_SIZE];
    int pos = 0;
    /* Accumulate the stream statistics */
    while (hasNextChar()) {
        /* Cast so that bytes >= 0x80 do not become negative indices */
        unsigned char ch = (unsigned char) getNextChar();
        char_pos *chstat = &stream[ch];
        chstat->count++;
        if (chstat->position == INVALID_POS)
            chstat->position = pos;
        ++pos;
    }
    /* Iterate over the statistics and find the first unique character */
    int unique = INVALID_POS;   /* index of the best candidate so far */
    for (int i = 0; i < STAT_SIZE; ++i) {
        /* Keep the candidate only if it occurs once and earlier than the current best */
        if (stream[i].count == 1 &&
            (unique == INVALID_POS || stream[i].position < stream[unique].position))
            unique = i;
    }

    if (INVALID_POS == unique)
        printf("No unique character found in the stream\n");
    else
        printf("The first unique character: 0x%02x found at position %d\n", unique, stream[unique].position);

    return 0;
}

- ashot madatyan November 17, 2012


Maintain two bits and the first position for each char (there are at most 256 chars). Since we expect 1M characters, a position fits easily in a 32-bit int, so we can use a 256-entry array of 32-bit ints, using the last two bits of each to hold the count (0, 1, or >1) and the remaining 30 bits for the position.

Once the array is filled, finding the required char is a single pass over it: pick the entry with count 1 and the smallest position.

- Anonymous November 18, 2012
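
The packing described above might be sketched like this, assuming single-byte characters and fewer than 2^30 characters in the stream. The layout chosen here (position in the upper 30 bits, count in the low 2 bits) is one possible reading of the comment, and `StringStream` and `firstUniquePacked` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical in-memory stand-in for the stream; only the two method
// names come from the question.
struct StringStream {
    std::string data;
    std::size_t pos;
    explicit StringStream(const std::string& s) : data(s), pos(0) {}
    bool hasNextChar() const { return pos < data.size(); }
    char getNextChar() { return data[pos++]; }
};

// Returns the first unique character as a value in 0..255, or -1 if none.
int firstUniquePacked(StringStream& in) {
    // Per entry: upper 30 bits = first position,
    // low 2 bits = count (0, 1, or 2 meaning "more than once").
    std::uint32_t table[256] = {0};
    std::uint32_t position = 0;  // assumption: stream shorter than 2^30
    while (in.hasNextChar()) {
        int c = static_cast<unsigned char>(in.getNextChar());
        std::uint32_t count = table[c] & 3u;
        if (count == 0) table[c] = (position << 2) | 1u;
        else if (count == 1) table[c] = (table[c] & ~3u) | 2u;
        ++position;
    }
    int best = -1;
    std::uint32_t bestPos = 0;
    for (int c = 0; c < 256; ++c) {
        if ((table[c] & 3u) != 1u) continue;  // skip unseen and repeated
        std::uint32_t pos = table[c] >> 2;
        if (best < 0 || pos < bestPos) { best = c; bestPos = pos; }
    }
    return best;
}
```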


1. Take two integer arrays A and B of size 256 (or as many entries as there are possible characters).
2. For each character in the stream, increment its count in array A and then do one of the following:
a. If the count is 1, store the position of the character in B
b. If the count is 2, reset the position of the character in B (e.g. to -1)
c. If the count is > 2, do nothing

At the end of the stream, get the index of B with the minimum valid position value.

- gona November 19, 2012


Instead, take 2 arrays: one boolean and one int.
If a character's count becomes > 1, set boolean[ch] = false and position[ch] = -1.
Then at the end print the character with the minimum position[ch] (>= 0).

- king Zidane November 19, 2012


Where do you store the count? You cannot start the boolean array with a default of false, because then you cannot differentiate between count = 0 and count > 1.

- gona November 19, 2012


count = false initially. When a char comes in for the first time, set position = current_pos and count = true. The next time that char comes in, count is already true, so set position = -1 and keep count = true for all subsequent visits.

- king November 19, 2012


We can actually solve this with one integer array A, initialized to -1.
If we are seeing a char for the first time (A[char] == -1), set A[char] = position.
If we are seeing a char for the second time (A[char] > -1), set A[char] = -2.
If we have already seen a char more than once (A[char] == -2), do nothing.

Take the min position, ignoring all elements < 0.

- gona November 19, 2012
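
The single-array scheme with the -1 and -2 sentinels could be sketched as below, assuming single-byte characters. The `StringStream` type and the name `firstUniqueSentinel` are hypothetical and exist only to make the example self-contained.

```cpp
#include <cstddef>
#include <string>

// Hypothetical in-memory stand-in for the stream; only the two method
// names come from the question.
struct StringStream {
    std::string data;
    std::size_t pos;
    explicit StringStream(const std::string& s) : data(s), pos(0) {}
    bool hasNextChar() const { return pos < data.size(); }
    char getNextChar() { return data[pos++]; }
};

// Returns the first unique character as a value in 0..255, or -1 if none.
int firstUniqueSentinel(StringStream& in) {
    long A[256];
    for (int c = 0; c < 256; ++c) A[c] = -1;  // -1 = not seen yet
    long position = 0;
    while (in.hasNextChar()) {
        int c = static_cast<unsigned char>(in.getNextChar());
        if (A[c] == -1) A[c] = position;      // first time: remember position
        else if (A[c] > -1) A[c] = -2;        // second time: mark as repeated
        // A[c] == -2: seen more than once already, do nothing
        ++position;
    }
    // Take the min position, ignoring all elements < 0.
    int best = -1;
    for (int c = 0; c < 256; ++c)
        if (A[c] >= 0 && (best < 0 || A[c] < A[best])) best = c;
    return best;
}
```

Using two distinct negative markers is what keeps a third occurrence from being mistaken for a first one.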


In this case you will be checking whether the value is -2 or not; the same thing is achieved by the boolean array. The approach is the same, my friend.

- king November 19, 2012


I think we should first determine the encoding of the stream (i.e. the number of bits that represent 1 character).
If the encoding is single-byte (extended ASCII), we can use two bitmaps (256 bits each) to solve this problem. Bitmap A records whether the character has appeared at least once, and bitmap B records whether the character has appeared more than once. We need another integer array (capacity 256) to remember the position at which each character last appeared; for a character that appears exactly once, that is also its first position, so the problem is solved. Time complexity is O(n), space complexity is O(1) (256 bits * 2 + sizeof(int) * 256).
If the encoding needs many bits to represent a character, we can use a multi-level bitmap structure to reduce the memory cost.

- bcp1989 November 20, 2012
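
For the single-byte case, the two-bitmap idea might be sketched with `std::bitset`, as below. This assumes 256 possible character values; `StringStream` and `firstUniqueBitmaps` are hypothetical names, and the multi-level bitmap variant is not covered.

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Hypothetical in-memory stand-in for the stream; only the two method
// names come from the question.
struct StringStream {
    std::string data;
    std::size_t pos;
    explicit StringStream(const std::string& s) : data(s), pos(0) {}
    bool hasNextChar() const { return pos < data.size(); }
    char getNextChar() { return data[pos++]; }
};

// Returns the first unique character as a value in 0..255, or -1 if none.
int firstUniqueBitmaps(StringStream& in) {
    std::bitset<256> seen;      // bitmap A: appeared at least once
    std::bitset<256> repeated;  // bitmap B: appeared more than once
    long last[256] = {0};       // position of the latest occurrence
    long position = 0;
    while (in.hasNextChar()) {
        int c = static_cast<unsigned char>(in.getNextChar());
        if (seen[c]) repeated[c] = true;
        seen[c] = true;
        last[c] = position++;
    }
    // For a character seen exactly once, last[c] is also its first position.
    int best = -1;
    for (int c = 0; c < 256; ++c) {
        if (seen[c] && !repeated[c] && (best < 0 || last[c] < last[best])) best = c;
    }
    return best;
}
```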


1) First, store the index of each unique character in an array of fixed size.
2) Then find the smallest index in this array.

Here is the code

#include <limits.h>

/* hasNextChar() and getNextChar() are provided by the stream, as stated in the question */

/* Returns the 1-based index of the first unique character, or INT_MAX if there is none */
int find_first_unique_index()
{
        int index, small_idx = INT_MAX, temp;
        int array[256];         /* int, not char: indices go up to about 1M */
        unsigned char c;        /* unsigned, so it is always a valid array index */
        for (index = 0; index < 256; index++) {
                array[index] = 0;
        }
        index = 1;
        while (hasNextChar()) {
                c = (unsigned char) getNextChar();
                if (array[c] > 0) {
                        /* Invalidate the index */
                        array[c] = -array[c];
                } else if (array[c] == 0) {
                        array[c] = index;
                }
                index++;
        }
        /* Now find the smallest valid index */
        for (index = 0; index < 256; index++) {
                temp = array[index];
                if (temp > 0 && temp < small_idx)
                        small_idx = temp;
        }
        return small_idx;
}

- arun.edarath November 23, 2012


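import java.util.LinkedHashSet;
import java.util.Set;

// Returns the first unique character, or null if there is none.
// "stream" is the object exposing hasNextChar() / getNextChar() from the question.
Character firstUnique() {
	int[] state = new int[Character.MAX_VALUE + 1];  // 0 = unseen, 1 = once, 2 = repeated
	Set<Character> unique = new LinkedHashSet<>();    // keeps insertion order
	while (stream.hasNextChar()) {
		char c = stream.getNextChar();
		if (state[c] == 0) {
			state[c] = 1;
			unique.add(c);
		} else if (state[c] == 1) {
			state[c] = 2;
			unique.remove(c);
		}
	}
	// The first element of the LinkedHashSet is the first unique character
	return unique.isEmpty() ? null : unique.iterator().next();
}

Since the number of possible characters is constant (ASCII: 128, extended ASCII: 256, a 16-bit char: 65,536), the storage space required for the linked hash set and the array is constant.

- Illusion November 24, 2012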
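
C++ has no direct counterpart of a linked hash set, but for single-byte characters a similar effect can be sketched with a `std::list` plus an array of iterators, which gives O(1) removal by character. Everything below (`StringStream`, `firstUniqueLinked`) is a hypothetical illustration under that assumption, not a translation of a library API.

```cpp
#include <cstddef>
#include <list>
#include <string>

// Hypothetical in-memory stand-in for the stream; only the two method
// names come from the question.
struct StringStream {
    std::string data;
    std::size_t pos;
    explicit StringStream(const std::string& s) : data(s), pos(0) {}
    bool hasNextChar() const { return pos < data.size(); }
    char getNextChar() { return data[pos++]; }
};

// Returns the first unique character as a value in 0..255, or -1 if none.
int firstUniqueLinked(StringStream& in) {
    int state[256] = {0};                 // 0 = unseen, 1 = once, 2 = repeated
    std::list<int> unique;                // characters seen once, in arrival order
    std::list<int>::iterator where[256];  // list node of each state-1 character
    while (in.hasNextChar()) {
        int c = static_cast<unsigned char>(in.getNextChar());
        if (state[c] == 0) {
            state[c] = 1;
            where[c] = unique.insert(unique.end(), c);
        } else if (state[c] == 1) {
            state[c] = 2;
            unique.erase(where[c]);       // O(1); other iterators stay valid
        }
    }
    return unique.empty() ? -1 : unique.front();
}
```

As with the linked hash set, the front of the list is the first unique character at every point in the stream, so no final scan is needed.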


I have a solution for this.
We can maintain an array (of size 256) in which a character is inserted if not present and deleted if already present (shifting the remaining elements after a deletion). A per-character "seen" flag is also needed, so that a third occurrence is not inserted again.
When all 1M chars have been scanned, the element at index 0 of the array will be the first unique element.
This method has higher complexity, since every element in the stream needs up to 256 comparisons in the worst case.

- Maverick December 12, 2012


A LinkedHashMap would be the perfect data structure for this question.

- xiaolong_fighting January 28, 2013


Use a LinkedHashMap: insert the character as the key and its count as the value. After the stream ends, the first character in iteration order with count 1 is the first unique character.

- Anonymous May 13, 2013


I think a min-heap is the answer.
Just let the key be the number of times a character is repeated.

- siddharth May 19, 2013


There can be 3 states for each character:

0 - Not present in the stream at all
1 - Present once in the stream (this is the one we are interested in)
2 - Present more than once in the stream

To represent the above 3 values, 2 bits are enough.
So allocate an array of (256*2) bits, if characters are represented by their ASCII values.

For any character 'x' in the stream, bits 2*asciivalueof('x') and 2*asciivalueof('x')+1 of the array represent its appearance state.

1. Scan the full stream one character at a time, setting these 3 values properly in the array.
2. Then scan the array to find an entry with the value 1 and return its index.

Space needed: constant, O(1)
Time needed: O(n)

- Vin November 15, 2012
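
The 2-bits-per-character state array could be sketched as follows, assuming single-byte characters packed four to a byte. `TwoBitStates`, `scanStates` and `StringStream` are hypothetical names. Note that this records only the three states; on its own it does not say which state-1 character appeared first in the stream.

```cpp
#include <cstddef>
#include <string>

// Hypothetical in-memory stand-in for the stream; only the two method
// names come from the question.
struct StringStream {
    std::string data;
    std::size_t pos;
    explicit StringStream(const std::string& s) : data(s), pos(0) {}
    bool hasNextChar() const { return pos < data.size(); }
    char getNextChar() { return data[pos++]; }
};

class TwoBitStates {
public:
    TwoBitStates() { for (int i = 0; i < 64; ++i) bits_[i] = 0; }
    // State of character c (0..255): 0 = absent, 1 = once, 2 = more than once.
    int get(int c) const { return (bits_[c / 4] >> ((c % 4) * 2)) & 3; }
    void set(int c, int state) {
        int shift = (c % 4) * 2;
        bits_[c / 4] = static_cast<unsigned char>(
            (bits_[c / 4] & ~(3 << shift)) | (state << shift));
    }
private:
    unsigned char bits_[64];  // 256 characters * 2 bits = 512 bits = 64 bytes
};

// Scans the whole stream and returns the final state of every character.
TwoBitStates scanStates(StringStream& in) {
    TwoBitStates states;
    while (in.hasNextChar()) {
        int c = static_cast<unsigned char>(in.getNextChar());
        int s = states.get(c);
        if (s < 2) states.set(c, s + 1);  // 0 -> 1 -> 2, then stay at 2
    }
    return states;
}
```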


By index you mean the min index, right?

- king Zidane November 15, 2012


Could you please explain how your method ensures that the index is that of the "first unique" element of the stream?

- novice November 15, 2012


Using a single array can only tell us whether a character was repeated or not. It won't help us find which is the first unique character.
So I suggested that, in addition to this array, we build an ascending heap (a binary tree with the lowest value at the root). As you read the stream, whenever a character is encountered, insert that character's position (its index in the stream, which according to the problem is 0, 1, 2, ... up to a million).

struct Node
{
    unsigned long int index; // This is the key
    char c;
};

When the same character is encountered again in the stream, do not insert it into the heap. To do this we can keep an array of bools recording whether the character has already been added.

Now, after all the characters in the stream have been read, take the node at the root of the heap and check whether that character was unique or repeated. If it was repeated, delete that node from the heap and take the next smallest node. Again check in the array whether it was repeated or not. Continue until you find a character that is unique. That is your answer.

- Mani November 15, 2012


If you want to use an array, then instead of storing the count for each character, store its index in the stream as the value. If you find a character repeating (i.e. its value in the array is already non-negative), set the value to -1. After the entire stream is over, the minimum value >= 0 in the array is the index of the first non-repeating character.

- praveen November 15, 2012
