There is a large file( 1TB) co

Amazon Interview Question for SDE1s

2

of 2 votes

32
Answers
There is a large file( 1TB) containinig braces. Question is to check for their balance. I said will use a counter, will increment on an open brace and decrement on an close brace. If counter goes negative or counter is non zero at the end of the file, braces are not balanced. Otherwise balanced. Followup question was to make this process parallel ( meaning to see if this problem can be solved through parallelism, like dividing the the problem into sub problem....) Remember the file is very large.
- gns January 23, 2014 in India | Report Duplicate | Flag | PURGE
Amazon SDE1 Problem Solving

Email me when people comment.

An error occurred in subscribing you.

Country: India
Interview Type: In-Person

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 8 vote

If "( = 1" and ") = -1". We should check the same two conditions:
1. The total sum of the brackets is 0
2. We never go negative at any point of time.

This can easily be done with one chunk, if we will do it on multiple chunks, then there is two numbers that we need to report per chunk:
1. Sum of all brackets.
2. Lowest negative number that was found during processing this chunk.

The last step is to "merge" the results. We need to sequentially add the total sum of the brackets, and we also need to check that at any chunk, the previousChunksSum + CurrentLowestNegativeNumber is always above zero.

- Mahmoud January 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

we can optimize this further by:
During processing individual chunk, you can let the partial sum go -ve only during start, while you are encountering initial "}"s. once you encounter a "{" then onwards the sum should not go below your original -ve value.

sum = 0
while (ch == '}') sum--;
leastValue = sum;
// at this point you got a -ve number, we should never go below this during further proessing of the chunk
while (ch = getChar())
{
if (ch == '{') sum++;
if (ch == '}') sum --;
if (sum < leastValue)
{
// the file is not balance
// stop processing any further.
}
}

- Anonymous January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Perfect !!!

- Ajeet January 24, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 6 vote

The OP's approach for the initial problem was correct. Namely, something like

int counter = 0;
for (int i=0; i<string.Length; i++)
{
    char c = string[i];
    if (c == '{')
    {
        counter++;
    }
    else if (c == '}')
    {
        counter--;
        if (counter < 0)
        {
            return false;
        }
    }
}
return (counter == 0);

This is correct because we are counting how many unclosed { we have at any moment. When we see a new { we add 1, and when we see a }, we immediately pair it with the last unclosed {, therefore using up one of them and lowering the count by 1. If there are no unclosed { when we see a }, the } does not have a matching { and the counter becomes negative, at which point we return false. At the end, if there are any unclosed { remaining, those { do not have a matching } and we must return false. If the unclosed { count is 0, on the other hand, we return true.
--------------------------

The first instinct to solving the parallel problem might be to split the work up into equal-sized chunks for each processor. If there are N characters in the file and K processors, give each processor a chunk of N/K characters. When processing a chunk, we can no longer simply return false when the count drops below 0, because there may be more opening braces in an earlier chunk processed by a different processor. We also can't return false if the counter is not 0 at the end, because it may be that there are more closing braces in later chunks. What we should calculate for each chunk, instead, is:

1. If we were doing a global brace counter as in the single-threaded approach, by how much would this chunk change the counter? This can be a positive or negative number. Call this the "brace delta". This number is equal to the number of opening braces minus the number of closing braces in the chunk.

2. While we are calculating the "brace delta", what is the lowest point to which the brace delta drops anytime during the processing of the chunk? Call that the "brace delta minimum". This number is important because this is the point in the chunk where the global counter would be the smallest. We can check whether the global counter would have ever been negative by seeing whether (global counter before chunk) + (brace delta minimum in chunk) is < 0 for any chunk.

These two things would be calculated by:

ChunkResult BraceDeltaAlgorithm (int chunkStartPos, int chunkEndPos, String input)
{
    int braceDelta = 0;
    int minBraceDelta = braceDelta;
    for (int i=chunkStartPos; i<chunkEndPos; i++)
    {
        char c = input[i];
        if (c == '{') 
        {
            braceDelta++;
        }
        else if (c == '}')  
        {
            braceDelta--;
            if (braceDelta < minBraceDelta)
            {
                minBraceDelta = braceDelta;
            }
        }
    }
    return new ChunkResult (braceDelta, minBraceDelta);
}

The complexity of the above code is O(N/K) for each of K processors. However, we can consider it O(N/K) time because all the processors can work in parallel.

After we process each chunk in parallel, we combine the results in a single-threaded way. We use the brace deltas to calculate the value of the global counter at the beginning of every chunk, and we use the brace delta minima of the chunks to verify that the global counter never drops below 0. The global counter value should be 0 at the end of all the chunks as before.

int globalCounter = 0;
for (int i=0; i<K; i++)
{
    ChunkResult result = childThreads[i].getResult();
    if ( globalCounter + result.minBraceDelta < 0)
    {
        return false;
    }
    globalCounter += result.braceDelta;
}
return (globalCounter == 0);

The non-parallel part takes O(K) time, making the overall time for the solution O(N/K + K).
---------

The above approach is a very good practical solution, because in most cases the number of processors K will be fairly small, and the "+ K" portion of the time complexity would be negligible. Let's assume however that we have unlimited processors now.

We can see that K = order of sqrt(n) is the best choice for K if we want to minimize the
value of N/K + K. So even with unlimited processors, the time taken by the above approach is O(n / sqrt(n) + sqrt(n)) = O(sqrt(n)).

We can do better by giving a divide and conquer approach:

1. The current processor splits the input into 2 chunks, and hands each of 2 processors one chunk each. Pointers would be passed instead of copying data, so this is O(1).
2. Each of these 2 processors recursively solve this problem, again delegating to yet other processors. If the problem size has been reduced to a small constant, the processor uses the BraceDeltaAlgorithm from before as a base case.
3. The solutions obtained by the 2 processors in step 2 are combined into a single ChunkResult, and that result is returned.

ChunkResult solve (int start, int end, String input)
{
    if (end - start <= SOME_SMALL_CONSTANT)
    {
        return BraceDeltaAlgorithm (start, end, input);
    }
    int mid = start + (end-start)/2
    start new thread t1 to run "solve (start, mid, input)"
    start new thread t2 to run "solve (mid, end, input)"
    ChunkResult r1 = WaitForThreadToFinishAndGetResult(t1); 
    ChunkResult r2 = WaitForThreadToFinishAndGetResult(t2); 
    return new ChunkResult (r1.braceDelta + r2.braceDelta, Math.min(r1.minBraceDelta, r1.braceDelta + r2.minBraceDelta))
}

public bool hasCorrectBraces (String input)
{
    ChunkResult result = solve (0, input.Length, input);
    return (result.braceDelta == 0 && result.minBraceDelta >= 0);
}

Everything here is O(1) except the time spent waiting for the recursive solve() invocations. Since both calls to solve are processed in parallel, we only bill for one of them. So our recurrence relation is T(n) = T(n/2) + O(1) => T(n) = O(log n). This improves over the previous bound of O(sqrt(n)).

- eugene.yarovoi January 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Should be Math.min(r1.minBraceDelta, r1.minBraceDelta+ r2.minBraceDelta)
ain't it?

- kannan April 29, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@kannan: That part of the expression tells you the minimum delta that occurs at any point in the combined chunk. This is the smaller of the minimum delta in the first chunk, and the overall delta caused by the first chunk (r1.braceDelta, not r1.minBraceDelta) plus the minimum delta anywhere in the second chunk (r2.minBraceDelta). So the expression should be Math.min(r1.minBraceDelta, r1.braceDelta + r2.minBraceDelta), as posted. Let me know if further clarification of this point is needed.

- eugene.yarovoi April 29, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

This required a stack data structure to validate the structure. Because with count we will not be able to differentiate between - "{123*}23 }{" & "{123*{23 }}". First string is in invalid.
Algorithm:

1. Use a stack - implemented using linkedlist.
2. If "{" comes push it to stack.
3. If "}" comes pop from the stack.
4. End of file - if stack is empty, it is a valid file.
Note: If there is any chance  that stack can throw OutOfMemory (there are too many push) in that scenario we can implement this stack using a BitVector (to reduce the space).

- Ajeet January 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

The OP's method is correct -- see my comment under gns's answer for an explanation. The OP's method would be able to detect the first string is invalid because, to quote the OP, "If counter goes negative or counter is non zero at the end of the file, braces are not balanced". The counter goes negative in your first example. It first goes up to 1, then down to 0, then to -1. At that point it is negative and the algorithm would return false.

The main challenge of this problem lies in figuring out a way to parallelize this efficiently.

- eugene.yarovoi January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@eugene.yarovoi

If we suppose the file is divided into serial chunks with IDs assigned to them from 1 to n. How about computing the counter for each of the chunk in parallel and store them separately. In the end, we can check that the cumulative sum of the counters from chunk 1 to chunk n never becomes negative. Would this work?

- Murali Mohan January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I mean, we need to verify:
counter_of_chunk1 >=0
counter_of_chunk1 + counter_of_chunk2 >=0
...
counter_of_chunk1 + counter_of_chunk2 + .. + counter_of_chunkn >=0

- Murali Mohan January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@eugene.yarovoi
Yup, I agree, we can do it using a counter .....thanks for the correction.

- Ajeet January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

@Ajeet..

Infact, the problem is very clear that it is a very large file, so stacks and linkedlists will runout of space. Can you explain more how to use bitvector?

- gns January 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

Simply counting is not gonna work, I think that's clear enough for this question.

In terms of the "big data", I think it's possible with bit array (or bit vector).
Assume you the 1 TB file contains N characters
0. you need a N length bit array: x
1. when met a left bracket, shift left by 1 bit and then do an AND operation with the shifted array: (x << 1) & 1; it is like pushing an element on to the right.
2. when met a right bracket, shift right by 1 bit x >> 1; So alternatively, it feels like poping up an element.
3. do the same stack balance check before the 2nd step, if the bit array is actually 0, we done.

- suzker January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Simply counting braces as the OP described does work for the non-parallel case. Each } closes off the most recent {, so all you need to know is whether you have any { to close off whenever you hit a }, and whether all { have been closed when the string ends. Think about it this way -- if you had a stack, every element of the stack would always be {, simply because as soon as you get } you immediately pair it with one of the { on the stack, so no } or any other symbols ever go on the stack. Since your stack will never contain anything other than {, you can just maintain a count of how many of them you have rather than maintaining the stack in explicit form. From these realizations we can prove the OP's approach correct.

- eugene.yarovoi January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@eugene.yarovoi Thank you and I agree with you that my statement of "simply counting is not gonna work" is kinda reckless. I am actually referring to "simply just checking the final count equals to 0 or not is not gonna work." The techniques you described is more of a run-length encoding for the stack. It's true that we can compress the "stack" like: "{{{{{{" into "{6". I got no problem with that. When you meet a right bracket, simply just minus one like popping from a stack. But I think the OP is describing another "counting", and I quote, " If counter goes negative or counter is non zero at the end of the file, braces are not balanced.". With this technique, if we really want to go parallel, we can actually do this for each chunk: 0. keep a records of each chunk while splitting the file in a sequence order. 1. keep a run-length style record for each chunk, for an input like "{{{}{", it goes like "{3" then "{2" then finally "{3". And for input like "}}}" it goes like "}3" 2. Now do a linear traverse over the run-length encoding outputs from each chunk in the original order, now try to combine the output in such order see if we have an empty "stack" left at the end of the process or we have a mismatch during the process. You can still call it counting, but it's not just "simply counting at the end". - suzker January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I think the OP meant the sentence as (If counter goes negative) or (counter is non zero at the end of the file), braces are not balanced

and not

If ((counter goes negative) or (counter is non zero)) at the end of the file, braces are not balanced

Since the language is ambiguous, it probably makes sense to give the benefit of the doubt and assume the interpretation that would make the solution correct.

- eugene.yarovoi January 24, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 2 vote

If the braces are balanced, (going by the counter method by @ajeet) you will end up with sum 0.
Hence even you may divide the file into chunks, with help of some marker method or use a file system which gives you chunks of file and then put each chunk for counting the braces. Each parallel process will end up in a number. These numbers again should add up to 0.

- satyajeettripathy January 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

Assuming the file is divided in 2 chunks, the first chunk has {{} and the second chunk has {}}. For this one, by going with counters, finally they would sum-up to 0. But what if the first chunk has {}} and second one has {{}, assuming that the original file has "{}}{{}" ). Though the sum is 0, the braces are not balanced. Anymore suggestions?

- gns January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@SatyajeetTripathy is going in the right direction. Why not take 2 counters, say, left_paren_count and right_parent_count? Ensure that the cumulative sum of these variables for all but the last chunk is such that: sum(left_paren_count) >= sum(right_paren_count). By the last chunk the cumulative sums should be such:
sum(left_paren_count) = sum(right_paren_count)

- Murali Mohan January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@Murali Mohan, try this test case,

input: "}{{}" --> not balanced

first chunk: }{, left_parent_count = 1, right_parent_count = 1
second chunk: {, left_parent_count=2, right_parent_count = 1
last chunk: }, left_parent_count=2, right_parent_count = 2

sum_left = sum_right, but they are not balanced.

- suzker January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@Suzker
Good test case, actually the algo needs modification. the cumulative sums need to be checked as soon as a right parenthesis is encountered and not at the end of the reading the chunk. Chunks are useful only in retrieving parts of a large file.

- Murali Mohan January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

We can do this with stacks and in parallel as well. Using the bit-vector still won't solve it as the file size is huge. 1TB / 8 = 128GB, so even with bit-vector you would need RAM to be bigger than 128GB.

So, I am using the map-reduce model here:

The huge file will be split into small parts (Hadoop does this for us) that can easily fit into the main memory.

Mapper pseudo-code:
Input: (k1, v1) where k1 is byte-offset in the file, v1 is the text
Output: (k2,v2) where k2 is the byte-offset in the file, v2 is a stack which has braces in it

create stack;
while(more chars) {
	if ("{")
		stack.push("{");
	if ("}")
		stack.pop();
}

return stack;

Reducer pseudo-code:
Input: (k1, v1) where k1 is byte-offset, v1 is stack (actually its a list of values we get but here byte offsets are unique so those will be one item lists)
Output: (k2, v2) k2 is smallest byte-offset from the k1's we got, v2 is final merged stack

sort input key-value pairs by byte offset;
create stack S;
while(more keys) {
	read stack s associated with the current key;
	if (s.pop() == "{")
		S.push("{");
	if (s.pop() == "}")
		S.pop();
}

write (smallest input key, S) to output file;

Finally, read the reducer's output. If the output stack is empty we have a valid input file.

- DevGuy January 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Some mistakes in this solution, - be more careful with push and pops - need to invert every stack after the first one in reducer, to do the merge correctly Here is a simple simulation using threads: {{{ public class ConcurrentParensValidator { private String parens = ""; private ExecutorService service; private void generateRandomParens(int length) { StringBuilder builder = new StringBuilder(); Random rnd = new Random(); for (int i = 0; i < length; i++) { if (rnd.nextBoolean()) { builder.append("{"); } else { builder.append("}"); } } parens = builder.toString(); } private void useAValidOne() { this.parens = "{{{}{{}}}}{}{{}}"; } public String getParens() { return parens; } class Worker implements Callable<Stack<String>> { private final String parens; public Worker(String parens) { this.parens = parens; } @Override public Stack<String> call() throws Exception { System.out.println("Worker-" + Thread.currentThread().getId() + " working on " + parens); Stack<String> stack = new Stack<String>(); for (int i = 0; i < parens.length(); i++) { char paren = parens.charAt(i); if (paren == '{') stack.push("{"); if (paren == '}') { if (stack.isEmpty()) stack.push("}"); else { if (!stack.isEmpty() && stack.peek().equals("{")) stack.pop(); else if (!stack.isEmpty() && stack.peek().equals("}")) stack.push("}"); } } } System.out.println("Worker-" + Thread.currentThread().getId() + " generated " + stack); return stack; } } public boolean validate(int numThreads) throws InterruptedException, ExecutionException { service = Executors.newFixedThreadPool(numThreads); Stack<String> stack = new Stack<String>(); List<Future<Stack<String>>> results = new ArrayList<Future<Stack<String>>>(); for (int i = 0; i < numThreads; i++) { results.add(service.submit(new Worker(parens.substring(i * (parens.length() / numThreads), (i + 1) * (parens.length() / numThreads))))); } Stack<String> temp = new Stack<String>(); int count = 0; for (Future<Stack<String>> result : results) { Stack<String> s = result.get(); invertStackIfNotFirstOne(temp, count++, s); while (!temp.isEmpty()) { String paren = temp.pop(); if (paren.equals("{")) stack.push(paren); if (paren.equals("}")) { if (stack.isEmpty()) stack.push(paren); else { if (!stack.isEmpty() && stack.peek().equals("{")) stack.pop(); else if (!stack.isEmpty() && stack.peek().equals("}")) stack.push(paren); } } } } service.shutdown(); return stack.isEmpty(); } private void invertStackIfNotFirstOne(Stack<String> temp, int count, Stack<String> s) { if (count++ == 0) { temp.addAll(s); } else { while (!s.isEmpty()) { temp.push(s.pop()); } } } public static void main(String[] args) throws InterruptedException, ExecutionException { ConcurrentParensValidator validator = new ConcurrentParensValidator(); validator.generateRandomParens(16); // hard code one valid string - "{{{}{{}}}}{}{{}}" // validator.useAValidOne(); System.out.println("Randomly generated parentheses string is " + validator.getParens()); System.out.println("Generated parentheses string is " + (validator.validate(4) ? "valid" : "invalid")); } } }}} - DevGuy January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

OK, something messing up my replies. I will not paste code for this now :)

Formatting issue with earlier reply. Some mistakes in this solution:
- be more careful with push and pops in the stacks

- need to invert every stack after the first one in reducer, to do the merge correctly

- if the first stack (ordering by byte-offset in file) itself starts with a closing parenthesis, then there is simply no chance of file being valid. Mappers need to maintain a static volatile boolean fag to mark this occurrence (to avoid unnecessary computation).

- DevGuy January 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Break the 1TB file into number of cores system has, eg if system has 4 cores than 250 GB each file block size

Now make a thread which will process each file block
Within the thread
Use ArrayList
while file encounters character either '{' or '}'
if last index value is '{' and current value is '}; then remove both of them from list
else add the value to list;
return list;

Once these threads have runned on the different files they will produce Arraylists equal to number of cores say eg 4

Then check the ArrayLists
if all arraylists are empty : then print balanced
else
for i=0 to arraylist.size
check for pairs of bracket: if found remove the index from list
if arraylist is empty now then : print balanced
else
print unbalanced;

- pathak.ravi1989 January 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

I have two doubts regarding your solution:

1) Processing 250GB files is going to need hugely beefed up server machines. Else, you will be doing a lot of round trips to the disk which will degrade performance greatly.

2) Another issue would occur if the file is mostly parentheses, so then some of your arraylists can be huge if you have unlucky sequences.

- DevGuy January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

1st doubt: 1TB itself is a huge size and assuming a memory rich server is reasonable for it.
2nd doubt: Huge servers have huge RAM space and thus big compiler memory heaps. Although taking your point on unlucky sequence: we can prefer compressing data.
an elegant solution will be

Long countx=0,county=0;filesegment=index;//filesegment defines which section of file eg 4 four cores we have 4 indexes namely {0,1,2,3}
while !EOF
if current value is '} then
{
if countx>0 // number of '{' are more then
countx--;
else if (filesegment==0 && count==0)// 1st index file cannot have '}' in absence of '{'
print unbalanced; break; System.exit();
else if filesegment!=0// else we will add in count '}'
county++;
}
if last index value is '{' and current value is '{' then countx++;
end while
return (countx+1,county);//countx+1 because when countx=0 it will have one '{' because of if statement

After thread completes tasking we join the output from all the threads

O1=sum(countx1,countx2,...)//taking sum of all countxs
O2=sum(county1,county2,....)//taking sum of all countys
for each filesegments
if i=filesegments.length-1 then
if(O1>O2)// one before last file index we have more '{' and less '}' then unbalanced
print unbalanced;BREAK;System.exit();
if((countx(i)>=county(i+1))&& i!=filesegments.length-1)
O1=O1-county(i+1); O2=O2-county(i+1);
if((countx(i)<county(i+1))&&i!=filesegments.length-1)// unbalanced paranthesis
print unbalanced;BREAK;System.exit();
end for
if O1==O2==0 print balanced

This is fast compared to the above one as if these unlikely sequences follows illegal constructs can be handled fast. No need of having bulky datastuctures

- pathak.ravi1989 January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@pathak.ravi1989 sure you could arrange for a machine with such huge RAM and yes, your compression idea will work for some time. Yet, I think you missed my point.

What I am trying to say is that you are going for vertical scaling and that has limits. Soon, there could be requirement to process many more such huge files. Maybe, even bigger files can come up like 1PB (not unlikely in today's age). What if there are requirements on availability? If your machine goes down, you face angry customers ;)

The preferred way is to scale horizontally i.e. use a large number of commodity machines to process splits of full data. This is what Google and Amazon do for this kind of stuff. Large number of machines working in tandem also helps avoid single points of failure.

- DevGuy January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

1) break the 1TB data into n number of chunks.
2) for each chunk call a function say CHECKCOUNT to check difference between opening and closing braces.
NOTE : always start comparing after first opening braces.
3) return value of this number is
0 if number of opening braces equal to number of closing braces.
(-ve value) if number of closing braces are greater then opening braces.
(+ve value) if number of opening braces are greater then closing braces.
4) store the return values into some arrays.
5) now you have a array with zero's, +ve and -ve number
6) user devide and conquere technique to add adjacent numbers
7) in the end if you have zero means success otherwise failure

- abdul lateef January 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

How about this:

// divide the file into N blocks, for each block do the following:
b.unmatch_minux = 0;
b.unmatch_plus = 0;

while (c = read a char) {
  if (c == '(')
    b.unmatch_plus = 0;

  if (c == ')')
    if (unmatch_plus > 0)
      b.unmatch_plus--;
    else 
      b.unmatch_minus--;
}

// combine the blocks in sequence
sum = 0;
for (each block b) {
  sum -= b.unmatch_minus;
  if (sum < 0)
    return "unbalanced";

  sum += b.unmatch_plus;
}

- Jason January 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Fix a typo:

// divide the file into N blocks, for each block do the following:
b.unmatch_minux = 0; // number of unmatched ")" on the left
b.unmatch_plus = 0;  // number of numatched "(" on the right

while (c = read a char) {
  if (c == '(')
    b.unmatch_plus++;

  if (c == ')')
    if (unmatch_plus > 0)
      b.unmatch_plus--;
    else 
      b.unmatch_minus--;
}

- Anonymous January 23, 2014 | Flag

Comment hidden because of low score. Click to expand.

-1

of 1 vote

Formatting issue with earlier reply. Some mistakes in this solution, - be more careful with push and pops - need to invert every stack after the first one in reducer, to do the merge correctly Here is a simple simulation using threads: {{{ public class ConcurrentParensValidator { private String parens = ""; private ExecutorService service; private void generateRandomParens(int length) { StringBuilder builder = new StringBuilder(); Random rnd = new Random(); for (int i = 0; i < length; i++) { if (rnd.nextBoolean()) { builder.append("{"); } else { builder.append("}"); } } parens = builder.toString(); } private void useAValidOne() { this.parens = "{{{}{{}}}}{}{{}}"; } public String getParens() { return parens; } class Worker implements Callable<Stack<String>> { private final String parens; public Worker(String parens) { this.parens = parens; } @Override public Stack<String> call() throws Exception { System.out.println("Worker-" + Thread.currentThread().getId() + " working on " + parens); Stack<String> stack = new Stack<String>(); for (int i = 0; i < parens.length(); i++) { char paren = parens.charAt(i); if (paren == '{') stack.push("{"); if (paren == '}') { if (stack.isEmpty()) stack.push("}"); else { if (!stack.isEmpty() && stack.peek().equals("{")) stack.pop(); else if (!stack.isEmpty() && stack.peek().equals("}")) stack.push("}"); } } } System.out.println("Worker-" + Thread.currentThread().getId() + " generated " + stack); return stack; } } public boolean validate(int numThreads) throws InterruptedException, ExecutionException { service = Executors.newFixedThreadPool(numThreads); Stack<String> stack = new Stack<String>(); List<Future<Stack<String>>> results = new ArrayList<Future<Stack<String>>>(); for (int i = 0; i < numThreads; i++) { results.add(service.submit(new Worker(parens.substring(i * (parens.length() / numThreads), (i + 1) * (parens.length() / numThreads))))); } Stack<String> temp = new Stack<String>(); int count = 0; for (Future<Stack<String>> result : results) { Stack<String> s = result.get(); invertStackIfNotFirstOne(temp, count++, s); while (!temp.isEmpty()) { String paren = temp.pop(); if (paren.equals("{")) stack.push(paren); if (paren.equals("}")) { if (stack.isEmpty()) stack.push(paren); else { if (!stack.isEmpty() && stack.peek().equals("{")) stack.pop(); else if (!stack.isEmpty() && stack.peek().equals("}")) stack.push(paren); } } } } service.shutdown(); return stack.isEmpty(); } private void invertStackIfNotFirstOne(Stack<String> temp, int count, Stack<String> s) { if (count++ == 0) { temp.addAll(s); } else { while (!s.isEmpty()) { temp.push(s.pop()); } } } public static void main(String[] args) throws InterruptedException, ExecutionException { ConcurrentParensValidator validator = new ConcurrentParensValidator(); validator.generateRandomParens(16); // hard code one valid string - "{{{}{{}}}}{}{{}}" // validator.useAValidOne(); System.out.println("Randomly generated parentheses string is " + validator.getParens()); System.out.println("Generated parentheses string is " + (validator.validate(4) ? "valid" : "invalid")); } } }}} - DevGuy January 23, 2014 | Flag Reply

CareerCup

Amazon Interview Question for SDE1s

Books

Videos

Resume Review

Mock Interviews