## Facebook Interview Question

Software Engineer / Developer

To find whether there are any duplicate elements across two sorted lists, how come the complexity is O(n)?

There are two lists, A and B. Put all elements of A into a hash table, which takes O(n) time. Then check each element of B against the hash table, also O(n) time.
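A minimal sketch of the hash-table approach (the function name is illustrative):

```python
def have_common_element(a, b):
    """Return True if lists a and b share any element.

    Building the set is O(n); each membership probe is O(1) on average,
    so the whole check runs in O(n) time overall.
    """
    seen = set(a)                       # put all elements of A into a hash table
    return any(x in seen for x in b)    # probe for each element of B
```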

We can keep a hash map of friends for each user, with the email address (or any other unique id) as the key and a reference to the user (user id or DB key) as the value.

The advantage is that we can look up whether the target person is a friend of the user (a null lookup result means they are not).

This reduces the problem to something like BFS in a graph. Along with this we can pass a cutoff value (the maximum level up to which the search will look); the cutoff decreases at each level, and at 0 we stop the search. This way we avoid looping forever and have a valid exit point for the algorithm.
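A sketch of the level-limited BFS described above, assuming `friends_of` maps a user id to its friend ids. Besides the level cutoff, it also keeps a visited set, which skips already-seen users rather than relying on the cutoff alone to break cycles:

```python
from collections import deque

def find_within_levels(start, target, friends_of, max_level):
    """Breadth-first search from `start`, stopping after `max_level` hops.

    Returns the number of hops to `target`, or None if `target` is not
    reachable within `max_level` levels.
    """
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        user, level = queue.popleft()
        if user == target:
            return level
        if level == max_level:
            continue  # cutoff reached: do not expand further
        for friend in friends_of.get(user, ()):
            if friend not in visited:
                visited.add(friend)
                queue.append((friend, level + 1))
    return None
```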

We can store the result (the friend path from one user to the other) whenever we find the target user. This also handles multiple friendship paths, since we are doing BFS and the only exit point is the level cutoff.

Assign an int ID to each user. Each user keeps a sorted list of friend IDs as the friend list. To find whether A and B are connected, the worst case is O(n^k * log n), where n is the number of friends per user and k is the number of levels to search. Compared with the hash-table solution, this one has worse time scalability but better space scalability.
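The log n factor in that bound comes from testing membership in a sorted friend list with binary search, e.g.:

```python
import bisect

def is_friend(sorted_friend_ids, user_id):
    """O(log n) membership test in a sorted friend list via binary search."""
    i = bisect.bisect_left(sorted_friend_ids, user_id)
    return i < len(sorted_friend_ids) and sorted_friend_ids[i] == user_id
```

Each of the up-to-n^k users visited during the level search pays this O(log n) lookup, giving O(n^k * log n) overall.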

Backend:

> take snapshots of graph every 6 hours

> compute friends-of-friends on this graph using Pregel / other custom two-hop traversals

>> maintain evidence of distance 2 (say, IDs of common friends)

> maintain a list of edits (friend adds and deletes; rare, but they happen!) at each end vertex

Webapp:

Case 1: distance 2 in last precomputation

> lookup latest precomputed distance 2 evidence

> see if edits at source / destination invalidate the evidence

Case 2: distance 2 due to recent additions

> get friend lists for source and destination

> join on additions
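A sketch of the Case 2 join, assuming the webapp can fetch both current friend lists (function and variable names are illustrative): any id in the intersection is a mutual friend, i.e. live distance-2 evidence that covers friendships added after the last snapshot.

```python
def fresh_distance2_evidence(src_friends, dst_friends):
    """Join the two current friend lists; any common id is a mutual
    friend, i.e. evidence that source and destination are at distance 2."""
    return sorted(set(src_friends) & set(dst_friends))
```

Case 1 would instead read the precomputed evidence and filter it against the edit log (a delete at either endpoint invalidates a common friend).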

If user A is logged in and searches for user B,

start by searching user A's friends, then friends of friends of user A, and so on...

The complexity is huge: O(n^k), where k is the number of levels of friends you want to search.

You can make the friend search faster by having a hash table for each user that answers whether a given name is that user's friend. This takes up huge memory, as each of the thousands of users will have a hash table, but the cost of checking a friendship at each level drops to O(1). So the search order will be O(k), where k is the number of levels you want to search.

any other ideas?

I don't think you are right: you only reduce the checking time at each level, but if you don't find B at this level, you still have to go through each vertex in the level, so there is no advantage to using a hash table, I guess. The running time will still be n^(k-1), which is O(n^k).

If we start the search from both the source and the destination, the search can be minimized.
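A sketch of that bidirectional idea: expand a frontier from each end and stop when the two meet, so each side only explores about half the depth (roughly 2·n^(k/2) work instead of n^k). `friends_of` and the level budget are assumptions:

```python
def bidirectional_meet(src, dst, friends_of, max_levels):
    """Alternately expand one frontier per round (always the smaller one);
    touching a node the other side has seen means src and dst connect."""
    if src == dst:
        return True
    front_a, front_b = {src}, {dst}
    seen_a, seen_b = {src}, {dst}
    for _ in range(max_levels):
        # grow the smaller frontier to keep the work balanced
        if len(front_a) > len(front_b):
            front_a, front_b = front_b, front_a
            seen_a, seen_b = seen_b, seen_a
        nxt = set()
        for user in front_a:
            for f in friends_of.get(user, ()):
                if f in seen_b:
                    return True  # the two searches met
                if f not in seen_a:
                    seen_a.add(f)
                    nxt.add(f)
        front_a = nxt
    return False
```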

- deepak November 14, 2010

The assumption is that all friend lists are stored in sorted order. To find whether A and B are at distance 2, get the friend lists of both A and B and find whether the two lists share any element. Complexity is O(n).
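With both lists already sorted, the O(n) duplicate check needs no hash table; a single merge-style two-pointer pass suffices:

```python
def sorted_lists_intersect(a, b):
    """Two-pointer scan over sorted lists a and b; O(len(a) + len(b))."""
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            return True      # common element found: A and B share a friend
        if a[i] < b[j]:
            i += 1           # advance the pointer at the smaller value
        else:
            j += 1
    return False
```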

Finding whether A and B are at distance 4: merge all the friends-of-friends lists and check for duplicates across them. Complexity is O(n^2) {merging is O(n^2) and searching is O(n^2)}.

Thus the complexity will be:

1 level deep - O(log n) {binary search}

2 levels deep - O(n)

4 levels deep - O(n^2)