Facebook Interview Question


Country: United States




Comment hidden because of low score. Click to expand.
0
of 0 vote

edited (mixed up terms)
I would try to approach the question in a structured way, where do these queries come from, what do we expect to ask tomorrow, so we don't have to change this structure all the time. Anyway, for the exact question:

page(s) visited by 100 users:
- would store for each number of unique users per day the pages
- as a b-tree with #of unique users as key and page id and number of total visits as value (I would loose which user it was with this aggregation), but i could have fast statistics for a page ranking based on user visits.
- to ask which was visited 100 times, I had a O(lg(m)+k) query. m is the different page frequencies per day, k the number of pages with exactly 100 users per day. Instead of b-tree I could use a HT with the frequency as index, so it would be O(k), how ever, I'd expect a query > 100, > 1000 or >100 < 200 etc...

pages visited by 1 user 15 times:
- same as above, I would get all pages visited by 1 unique user and ask which one had only 15 visits in total. if that's not good enough, because there are many pages with 1 unique user a day, I would use a different index on the b-tree of first query.
that index would consist of two integers, the #unique users and the #of total page visits (ordered primarily on unique users, secondary on total page visits) So I could ask both questions on the same index with logarithmic upper bounds (depends what I want exactly, e.g. one result or multiple results)

page visited by u3 more than 20 times (may be more than one...first, all > 20?, any if exists):
- I would assume the vector describing #visits / page for a single user and day is sparse.
- hash-table with user id as key and a vector as value
- the vector contains the tuple page-id, #visits
- to find I would have O(1) to get to the vector and a linear scan on the vector, If I sort the vector, I had O(lg(n)) to find the min(>20), O(1) to find any or none, or O(n) to find all > 20

- ChrisK August 13, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I think a hashmap for each of those query categories with the help of a master hashmap(one that keeps track of all enteries) would solve this specific question. What do you guys think?

- Itsme August 13, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I would use a hash in which the keys are the pages and the values the list of users that visited them. Next the code in Python that illustrates it showing how to solve the queries using the proposed data structure

from collections import defaultdict

def build_query_structure(data):
        d = defaultdict(list)
        for x, y in data:
                d[y].append(x)
        return d

query_data = [('u1', 'p4'), ('u3', 'p2'), ('u7', 'p9')]
query = build_query_structure(query_data)

# Query 1: Pages visited exactly 100 times
print filter(lambda x: len(query[x]) == 100, query.iterkeys())

# Query 2: Pages visited by only one user exactly 15 times
print filter(lambda x: len(query[x]) == 15 and len(set(query[x])) == 1,\
             query.iterkeys())

# Query 3: Page visited by u3 more than 20 times
print filter(lambda x: query[x].count('u3') >= 20, query.iterkeys())

- Fernando August 14, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I'm probably missing something here, but there's my approach depending on efficiency priority:
Efficiency for queries: 3-dimensional matrix with userId, pageId, visitCounter; indexed/sorted by all of them
Efficiency for inserts: 2-dimensions userId, pageId, allowing duplication

- Andres MJ August 14, 2017 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More