Bazaarvoice Interview Question for Software Engineer / Developers






Comment hidden because of low score. Click to expand.
0
of 0 vote

If the list of email addresses is not sorted, then Put the e-mail addresses into a Balanced Binary search tree( this will remove the duplicates from the list). now serialize the binary tree and send it to the server , on the server end it can be de-serialized to get the original tree with no duplicates.

- Addy September 30, 2009 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

@Addy: solution seems fine.
what was your answer? @bv

- manoj September 30, 2009 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Do you allow to remove duplicate email address?

- Anonymous September 30, 2009 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

I think this is the right answer. Many of the BV questions seem to be incompletely specified,like they are seeing if you will ask questions to further define the end goal, instead of just going to town with the partial requirement spec.

- Anonymous September 14, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

I haven't done a complete analysis on this exact problem, but have on a very similar one, and the best data structure for string based sorting/searching was a 'trie' (retrieval tree). For de-duping, which isn't the normal use for 'tries', the insert function would be the flavor that only inserts, and resolves partial collisions, to the minimum depth for uniqueness. This ends up with an O(n) with a worst case coefficient/multiplier of the longest length string/email, and the average case multiplier being the average number of compares for inserts, and average depth for retrievals.

- Anonymous September 15, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

Hash has the least complexity.
Create the hash of each email address. If there is a duplicate hash, check if emails are duplicate. If yes, consider only one of them. Otherwise include both the email addresses.
Complexity O(n)

- Avinash October 01, 2009 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

I think the complexity is not O(n) but O(n log(n)).
You have to iterate over the list of email addresses (O(n)) and to check the hashes of each email address against some data structure like a hashset (O(log(n)). The overall complexity should be O(n log(n)).

- Anonymous February 05, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

(Reposted this, as the first one went to the wrong comment)

I haven't done a complete analysis on this exact problem, but have on a very similar one, and the best data structure for string based sorting/searching was a 'trie' (retrieval tree). For de-duping, which isn't the normal use for 'tries', the insert function would be the flavor that only inserts, and resolves partial collisions, to the minimum depth for uniqueness. This ends up with an O(n) with a worst case coefficient/multiplier of the longest length string/email, and the average case multiplier being the average number of compares for inserts, and average depth for retrievals.

- Anonymous September 15, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Additionally, the trie results in much less processing. The question involves strings, which can be operation intensive. The 'trie' only does compares/operations as far into the string as necessary. Hashing strings, not. The trie behaves like an auto-optimizing hash for the number of string elements to use for the hash to minimized processing, as well as self-sort the collision chain.

- Anonymous September 15, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

It should be obvious that the dupe count for a node can be incremented when inserting, then decremented when retrieving, so that the first batch is the super set of unique emails, and each subsequent traversal generates a set of emails with increasing dupe counts.

Whether this is the answer they were looking for can't really be known, since there isn't enough info about the problem to be solved.

- Anonymous September 15, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

but wouldnt this defeat the purpose of the question... where u are asked send out emails using minimum number of batches. The question has no restriction on the number of emails per batch, so if you remove duplicate entries you can send all emails in one batch. So I think the question is to be interpreted in different way.

- Anonymous October 07, 2009 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Yes, I think this is the right answer (see above similar comment). They seem to want to see if you realize the task is incompletely specified. I have seen this pattern several times with BV 'test problems'.

- Anonymous September 15, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

It's kind of like bin packing. There will be multiple hashes corresponding multiple bins. Scan each email address one by one and lookup the hashes one by one. If it occurs say in the first bin then check in the second bin. Now say there are only 2 bins. In this case create another bin(hash) and put it into this.

- Anonymous February 10, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

What you are essentially saying, in a more complicated way, is already part of a hash- the collision chain. After a complete scan of the emails, the hash table just needs to generate as many batches as the highest order collision chain. There are many markup strategies for ID'ing the hash members/collision chains.

- Anonymous September 15, 2012 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

I am an IT Recruiter looking for some Software Engineers in the Austin area contact me directly--client has asked that I recruit specifically for Bazaarvoice candidates suzie.jimenez@thehtgroup.com 512.345.9300

- suziejimenez3 September 25, 2014 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More