## eugene.yarovoi

BAN USERUsually, image compression is done in a lossy way, meaning information is lost during the compression process. Formats that do lossy compression, such as JPEG, retain the information that is in some sense most critical to the overall visual perception of the image and lose details that contribute less. Huge space savings are often attained this way.

There are also formats that are lossless, meaning that you can recover the exact, unaltered original from the compressed version. These usually also exploit certain properties that are frequently true about image data. For example, adjacent pixels often have similar colors, so to take a very simple scheme as an example, you could store an image as a series of mostly small numbers indicating the change in the color from one pixel to the next. Suppose you have a row of pixels whose values are 123 125 124. You can compress this by writing the values as 123 2 -1. 2 and -1 require fewer bits to represent than 125 and 124. This example is just a very simple compression scheme and more complicated schemes are possible.

Even with no special image properties to exploit, you can halve the use of space because the image in this question only has 16 colors. If a pixel can have one of 16 colors, it can be represented with 4 bits per pixel (since there are 16 different possible sequences of 0 and 1 of length 4). However, we're told in the problem that the image is presently represented with 1 byte (8 bits) per pixel instead. We can re-encode the image to use 4 bits per pixel to halve the space usage.

@jitten: bidirectional bfs is a generalization of choice 1. Choice 1 is a special case of bidirectional bfs (when the distance is limited to a maximum of 2).

- eugene.yarovoi February 27, 2013In some languages, List is a concrete class. Don't assume everyone is using Java.

- eugene.yarovoi February 27, 2013I'm not sure I'm 100% clear on what you're saying, but that sounds about right. Trying every combination is probably a bad idea. Use dynamic programming.

- eugene.yarovoi February 26, 2013You can just solve this problem with an array and some sorting. Consider that the problem can be converted to a instance of a problem where we have a list of intervals and we ask how many pairs of intervals overlap.

- eugene.yarovoi February 25, 2013@Jack: I think the idea is to say the complexity is O(n log n + R) where R is the number of intersections. You're right that the number of intersections could be quadratic, but the idea is that this algorithm is reasonable if R is small.

It's true that the problem can be solved in O(n log n) overall, regardless of R.

Here's a famous scene from the movie "Labyrinth" that features this problem: youtube DOT com/watch?v=2dgmgub8mHw

- eugene.yarovoi February 24, 2013@Last Anonymous: I don't understand your argument. The liar and truthteller will point to the same door, regardless which one you ask. That allows you to determine the correct door.

- eugene.yarovoi February 24, 2013Simple set intersection between the two people's friend lists will work here.

A while back I answered a similar but more general question: given two people in a social network, what is the shortest path of friend connections from one to the other? Discussed here: careercup DOT com/question?id=14087784

The question only asks for the number of intersecting circles. You don't have to print out which ones intersect.

- eugene.yarovoi February 24, 2013Do a modified version of the merge part of mergesort. Advance through the two arrays the same way you would in mergesort (but don't output a merged array), and at each comparison between an element of A and an element of B, additionally check whether the two elements being compared are equal, and if so, print out one of them.

- eugene.yarovoi February 22, 2013@Rahul: The original answer ("Use a hashset?") doesn't describe in what way a HashSet would be used. I was not disputing the correctness of the LinkedHashSet comment, only offering another possible approach.

- eugene.yarovoi February 22, 2013Review the definition of "stable sort". A "stable" sort does not preserve the order of all elements, only elements with equal sorting keys. Since we just have an array of integers here, whether or not we use a stable sort is irrelevant. Any sort will destroy the order of elements and therefore not satisfy the requirements of this problem.

- eugene.yarovoi February 21, 2013Or just start with an empty HashSet, and for each element of the array, check to see if it's already in the hashset. If yes, don't print it or do anything else; if no, add it to the hashset and print it. This way each element will be printed the first time it occurs.

- eugene.yarovoi February 21, 2013Remember, whenever the questions are real questions, people typically write them down from memory. You can't expect a flawless description. The core of the question might have been real, but it may be interspersed with comments from the person who posted the question. That person was probably unable to figure out the algorithm during the interview, so they came here to post the question and ask for pseudocode, all with the goal of understanding what approach they should have taken.

- eugene.yarovoi February 21, 2013Yes, it's valid to frame it that way. If N is the perimeter of the rectangle, the algorithm is O(N).

- eugene.yarovoi February 20, 2013@Anonymous: Agreed that the algorithm will work for an arbitrary rectangular matrix. The complexity analysis was just for the square matrix case. Also agreed that the complexity for the rectangular case is O(max(r,c)), or equivalently O(r + c).

- eugene.yarovoi February 20, 2013Anything you mention on your resume is fair game. That said, unless the position is specifically for someone skilled in a particular technology, interviewers usually ask only for core parts of a language when testing coding skill. Not knowing Spring and Hibernate too well is probably just fine if the position isn't for a Spring or Hibernate developer.

I would note that you should know more than just core Java and data structures if you want to get hired, though. You should also be good at algorithms, design, and testing/debugging code. If you're an experienced candidate, having some domain knowledge relevant to the position you're applying for is sometimes important too.

+1, Good answer.

Why do you say using immutable objects reduces the garbage collection overhead, though? Since you have to discard an immutable object and create a new one every time you want the value to change, I would generally expect the effect on garbage collection to be somewhat the opposite.

Marking it as final doesn't make it a constant. It just means you can't assign to the member, a goal which was already being accomplished by the use of the private keyword and the lack of a setter. You can't reassign what object the "date" field points to, but the object itself can still be changed. When getDate() returns the instance of the Date class, that instance can still be modified.

For ways of solving this problem, see my and HG's comments to Rage's answer.

Yes, if N is the total number of elements, the algorithm is O(sqrt(N)) for a square matrix. The analysis that concluded this algo was O(N) had N as the number of rows in a square matrix, and there are sqrt(number of elements) rows in a square matrix.

- eugene.yarovoi February 20, 2013We haven't really defined what N is in this problem.

- eugene.yarovoi February 20, 2013If Date is a mutable reference type (as seems to be the case here), your getDate() method leaks the reference and allows instances of the Data class to be modified.

- eugene.yarovoi February 19, 2013@zyfo2: though it cuts the matrix in half, it creates two sub-cases whereas the algorithm presented in this answer only creates one. You can't compare the two algorithms just by arguing that one of them results in subcases that cut the input in half. The number of subcases is very important too.

- eugene.yarovoi February 19, 2013@dadakhalandhar: it can't be better than O(n) if it involves checking m + n -1 elements. Just checking n elements is already O(n).

- eugene.yarovoi February 19, 2013We know it's a reference type due to the use of "class" rather than "struct".

I see your point though. You're thinking about the question of what's allocated when just this code runs. I was thinking about the question of what's allocated when this code is used to instantiate objects.

We were just shown the declaration of the class. There may well be code that uses the class that's not shown.

- eugene.yarovoi February 18, 2013EDIT: ignore this answer. I misread the question as asking about C++. Can't delete the post for some reason.

Where the memory for a class is allocated depends on how you allocate it. The memory for the variables of a class is generally placed on the stack if you create the instance with notation like

`Test myInstance;`

The memory will be placed on the heap if you use the "new" keyword, like:

`Test* myInstance = new Test();`

Some classes may have internals that do dynamic memory allocation. In that case, even if the object is created using the first notation above and its variables are placed on the stack, the variables may point to memory that is dynamically allocated and hence on the heap. For example, consider a "vector" class whose variables might be:

```
int size;
int capacity;
T* backingArray;
```

While all 3 of the above variables will be stored on the stack if the vector is created on the stack, backingArray will still point to a location on the heap where the dynamically resized array begins. backingArray itself (the pointer) will be on the stack.

- eugene.yarovoi February 18, 2013There's no reason for doubt here. Google asks questions of a wide range of difficulties, and this question is certainly within at least the lower bounds of what can be asked.

You might be surprised at how many candidates would strike out on this question. Remember, for a simple question like this, you'd often be asked to write the code, so it would also be about whether you're able to write clean, efficient, and relatively bug-free code, all in a short amount of time, to solve this problem. The question is not difficult, but it's sufficiently nontrivial for some people to fail it.

@Alva0930: for that case, we could use a hash table instead of a bitset. This would still preserve the O(n) runtime (at least in terms of expected runtime). If we wanted worst-case guarantees we could store the remainders in a balanced tree and achieve a worst-case of O(n log n).

- eugene.yarovoi February 18, 2013@Alva0930: I see. I don't know where the person who posted the problem statement is getting that formula. Maybe they just got it wrong, or maybe they interpreted the question differently.

- eugene.yarovoi February 17, 2013Except in my last paragraph, I was mostly talking about the case of basic synchronization: just making it thread-safe, as opposed to having it be concurrent in any particularly optimized way. Basic thread-safety is easily doable for both hashmaps and treemaps.

In my last paragraph, I was thinking about how there are at least some improvements for both trees and hashmaps. For one, we can have a strategy for both HashMaps and trees where we allow an arbitrary number of concurrent reads, and only one concurrent write (that blocks reads and is blocked by reads).

Hash map synchronization is complicated too. A hash map probably has collision resolution in the form of linked-list chaining, and the linked lists can be in the process of being manipulated when you're doing a search. You might have a lock for each bucket to solve this problem, but then you must still come up with a strategy for how to handle the resizing of the hash map in a thread-safe way.

You're right that supporting concurrent writes for trees is a lot harder, though.

"BST gives good serialization when stored in PreOrder format but not in postorder or inorder."

I don't know why you say that.

"HashTable can be serialized only if one makes sure to overwrite the hash function with the keys in the has table during serialization."

I have no idea what you mean.

"I dont think BST basically supports much of concurrency stuff..Hashtables when moved to the concepts of concurrent hashmaps they do give concurrency support"

The Collections.synchronizedMap(...) method in Java, for instance, can make TreeMaps synchronized as easily as HashMaps. Whether a map is synchronized and whether it's backed by a tree or hash table don't have to be related.

"but as such the hash tables are synchronized so I dont think they would support concurrency"

I believe something like Collections.synchronizedMap(...) just locks the data structure every time it's doing a read or write, but more sophisticated implementations are definitely possible for both trees and hash tables.

Why do you say a hash table is any more thread-safe than a tree?

- eugene.yarovoi February 17, 2013I've seen this page before. The algorithm doesn't come with a proof that its worst-case bound is O(n),although it probably is. It's slightly better in the particular tests that were run, on the particular machine it was tested on.

- eugene.yarovoi February 14, 2013This will be an NFA that looks exactly like a trie, right? This NFA will have up to O(max pattern size) active states at any given time, and will therefore require O(k * max pattern size) time to check.

- eugene.yarovoi February 14, 2013How will you construct the automata? Will you make 1 automaton per pattern, and run each one individually when matching? In that case, the complexity would be at least O(k * n) where n is the number of patterns. Also, matching a pattern of length k with an automaton is not O(k) because your automaton is going to be an NFA here, not a DFA.

- eugene.yarovoi February 14, 2013JerryC's solution is not O(1). No solution can be O(1) because in some cases, it will be necessary to scan the input for at least max (prefix length over all prefixes) characters before determining whether there's a match. The only way it can be O(1) is if we assume all prefixes are of a size bounded by a constant. Then there are solutions that are O(1) because they are only dependent on the size of the longest prefix and independent of the number of patterns we have to match. We can use a prefix tree to obtain one such solution.

- eugene.yarovoi February 14, 2013Not really O(1) time. It's O(max length of pattern)...I suppose perhaps you're assuming that the patterns are all O(1) in size, and you just don't want the runtime to be proportional to the number of patterns?

- eugene.yarovoi February 13, 2013It has the disadvantage of backfiring if you end up not doing so well. "Even when he worked hard, he couldn't meet our standards."

I would naturally assume everyone coming in to a job interview is motivated. There should be no need to assert this, as it should be evident from your insightful questions that you have done your research on the company and that you invested time in preparing.

Your original question offers a false choice. It's neither a secret nor something I would necessarily mention unless the conversation went in that direction.

There's no particular reason to tell your interviewer how hard you studied. I don't view it as a positive or negative, but why tell them? Show them by acing their questions!

- eugene.yarovoi February 13, 2013You certainly could. I think earlier I was just trying to show that the space required doesn't have to be any more than the space needed for a hash table of integers.

- eugene.yarovoi February 13, 2013I certainly wasn't planning on adding them together. There are 256 (or 128 depending on whether you consider the extended set) ASCII characters, so you'd encode them by some formula like char0 + 256*char1 + 256*256*char2

- eugene.yarovoi February 13, 2013You'd then also need to change the last line to return buf.reverse().toString(); since you still need to reverse the output.

- eugene.yarovoi February 13, 2013@amritaansh123: why would that happen?

- eugene.yarovoi February 13, 2013@zyfo2: As far as I know, no faster algorithm than O(n) is known. Please provide specific details if you want to claim a better algorithm.

- eugene.yarovoi February 13, 2013That applies only for a connected graph. If a graph is not connected, it could have fewer than N-1 edges and still have cycles.

- eugene.yarovoi February 13, 2013This solution assumes the strings are stored as the reverse of how they're stored according the problem description. Solution looks right at least conceptually aside from that.

- eugene.yarovoi February 12, 2013Looks good.

- eugene.yarovoi February 12, 2013

Rep**Gayle L McDowell**, CEO at CareerCupGayle Laakmann McDowell is the founder / CEO of CareerCup, which provides programming interview prep for candidates interviewing with Microsoft, Google ...

Rep**Loler**, Employee at Google

Rep

**CareerCup**is the world's biggest and best source for software engineering interview preparation. See all our resources.

Open Chat in New Window

LZMA doesn't guarantee the best possible compression.

- eugene.yarovoi February 28, 2013in your first paragraph, if you don't put the count when none of the adjacent bytes are repeated, how will you later determine whether 11 10 encodes a pixel whose value is 11 and then a second pixel whose value is 10, or 10 pixels all of whose values are 11?