Palantir Technology Interview Question
SDE1s, Country: India
Interview Type: Written Test
Implement a simple, persistent, thread-safe cache, which should ideally be able to store up to 1 million product names.
Define:
Cache: Key -> value. get, write
Persistent: Write to a database
Thread-safe: Multiple requests to get and write simultaneously
Store 1 million product names
Constraints:
Space: locking, data structure to store information
(alternative data structures, like trees, may require less locking;
trade-off space vs. time)
Request return time
thread-safe
Brute force (don't optimize; don't consider the constraints): unlimited space
data storage:
hash map
startup:
initialize cache (possibly have a serialized cache in the database; load it into the hash map)
event handlers
Maintain queue of requests.
Buffer queue of requests - handle sequentially
shut down:
write all data in the hash map to the database
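The brute-force design above (hash map plus sequentially handled requests) might be sketched roughly as below; the class name `SimpleCache`, the single global lock, and modeling the database as a plain dict are illustrative assumptions, not part of the question:

```python
import threading

class SimpleCache:
    """Brute-force cache: one hash map, one lock, requests handled sequentially."""

    def __init__(self, persisted=None):
        # startup: load any serialized cache state from the "database"
        # (modeled here as a plain dict)
        self._data = dict(persisted or {})
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:  # one lock makes every request effectively sequential
            return self._data.get(key)

    def write(self, key, value):
        with self._lock:
            self._data[key] = value

    def shutdown(self):
        # shut down: write all data in the hash map back to the database
        with self._lock:
            return dict(self._data)

cache = SimpleCache({"sku-1": "widget"})
cache.write("sku-2", "gadget")
print(cache.get("sku-2"))  # gadget
```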
test cases:
add individual features:
Parallelize requests:
New architecture:
load balancer
shared state:
Master slave, atomic commit protocol
sequence of get requests : trivially distribute to other threads
get(x), write(x', y)
x != x'
- parallelize
x == x':
sequential.
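The x != x' / x == x' rule above amounts to a commutativity check; a minimal sketch (the tuple encoding of operations is an assumption for illustration):

```python
def can_parallelize(op1, op2):
    # Ops are (kind, key) tuples, kind in {"get", "write"}; illustrative only.
    # Two operations commute if they touch different keys, or if both
    # are reads of the same key.
    (kind1, x1), (kind2, x2) = op1, op2
    if x1 != x2:
        return True
    return kind1 == "get" and kind2 == "get"

print(can_parallelize(("get", "a"), ("write", "b")))  # True  (x != x')
print(can_parallelize(("get", "a"), ("write", "a")))  # False (x == x')
```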
Thread safe:
Given a queue sequence.
perform topological sort on the queue and parallelize it.
write(x, y, t) depends on get(x, t') or write(x, y', t'') with largest t'/t''
computation of deps for static list:
given mp: optype -> x -> opId
and we have computed deps for operations until time 't'.
for operation at time 't+1'.
get(x, t+1) -> depends on mp[write][x]; then set mp[read][x] = t+1
write(x, y, t+1) -> depends on max(mp[write][x], mp[read][x]); then set mp[write][x] = t+1
Thus in O(1) we can compute the new dependency relation for any incoming operation.
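The O(1) dependency computation can be sketched as follows, with `mp` as the optype -> key -> opId map described above; the function name and the list-of-tuples input encoding are assumptions:

```python
def compute_deps(ops):
    # mp maps op type ("read"/"write") -> key -> last opId (timestamp).
    mp = {"read": {}, "write": {}}
    deps = {}  # opId -> opId it depends on (None if independent)
    for t, (kind, x) in enumerate(ops):
        if kind == "get":
            deps[t] = mp["write"].get(x)  # a read depends on the last write to x
            mp["read"][x] = t
        else:  # write
            cands = [c for c in (mp["write"].get(x), mp["read"].get(x))
                     if c is not None]
            deps[t] = max(cands) if cands else None  # last read or write, whichever is later
            mp["write"][x] = t
    return deps

ops = [("write", "a"), ("get", "a"), ("get", "b"), ("write", "a")]
print(compute_deps(ops))  # {0: None, 1: 0, 2: None, 3: 1}
```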
Now, given that we have topologically sorted the input into some directed graph G:
We need to do two things:
1. Define an update for the graph.
get(x, t+1): let n be the node it gets assigned to
if n == null:
a) create a new starting node n' in the graph G and update GM : nodeId -> node
nodeIds are, for now, the timestamps (problem: the space of timestamps is small; it could wrap around the address space, or ids are drawn from some pool and consumed nodes retire their ids)
attempt to acquire lock on n:
if n has been consumed then go to (a)
otherwise add current node as child of n.
release the lock
2. Traverse the graph and schedule events to corresponding nodes.
for each of the starting frontier:
try and acquire a node.
once you acquire the node,
dispatch for consumption to some worker from a thread pool
then mark as consumed
release the lock.
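Steps 1–2 above (acquire a frontier node, mark it consumed, dispatch to a thread pool, then treat its children as the new frontier) might look roughly like this; the `Node` class and the recursive `schedule` helper are illustrative assumptions:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class Node:
    def __init__(self, op):
        self.op = op
        self.children = []
        self.lock = threading.Lock()
        self.consumed = False

def schedule(frontier, results, pool):
    futures = []
    for n in frontier:
        with n.lock:          # try to acquire the node
            if n.consumed:
                continue
            n.consumed = True  # mark consumed under the lock, so a node is dispatched at most once
        futures.append(pool.submit(run_node, n, results, pool))
    for f in futures:
        f.result()

def run_node(n, results, pool):
    results.append(n.op)                 # "dispatch for consumption"
    schedule(n.children, results, pool)  # children become the new frontier

root = Node("write(a)")
root.children = [Node("get(a)"), Node("write(a,2)")]
results = []
with ThreadPoolExecutor(max_workers=4) as pool:
    schedule([root], results, pool)
print(results)  # root first, then its children in some order
```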
add pairs of features
add triples of features
etc.
Persistence:
On the simple example with a queue of requests:
Intersperse writes to the database.
Log information for each intermediate write.
Functionality: state has a rollback; interspersing means a high volume of writes,
but it reduces the database as a bottleneck.
The intermediate log helps recover not all, but some, information from failed or missed transactions.
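One way to realize the interspersed writes and the intermediate log is an append-only log replayed on recovery; a minimal sketch, assuming a JSON-lines log file (the file format and names are illustrative):

```python
import json, os, tempfile

class LoggedCache:
    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}

    def write(self, key, value):
        # append an intermediate log entry before applying the write,
        # so a crash loses at most the unflushed hash map
        with open(self.log_path, "a") as f:
            f.write(json.dumps({"k": key, "v": value}) + "\n")
        self.data[key] = value

    @classmethod
    def recover(cls, log_path):
        cache = cls(log_path)
        if os.path.exists(log_path):
            with open(log_path) as f:
                for line in f:  # replay logged writes into the hash map
                    rec = json.loads(line)
                    cache.data[rec["k"]] = rec["v"]
        return cache

path = os.path.join(tempfile.mkdtemp(), "cache.log")
c = LoggedCache(path)
c.write("sku-1", "widget")
recovered = LoggedCache.recover(path)
print(recovered.data)  # {'sku-1': 'widget'}
```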
Trivially parallelizable to alternate solution:
Concerns: Race conditions from simultaneous writes.
Order writes to same resource:
Enforced by traversal of the topologically sorted graph
Data storage ballpark:
10^6 records
record: key, value
key - 8 bytes
value - 1 MB (total: ~10^6 × 1 MB ≈ 1 TB)
Graph: (noderef, edge, time)
Graph Map: key -> noderef
Stored in database values:
1000 requests per second.
time required for write:
0.1s
time required for read:
0.0001s
x reads per second,
goal:
calculate time required to service above quota of requests.
time: max possible response time.
sequence of writes all to the same resource.
(1000 - x) writes, all to the same resource e,
preceded by x reads, all to e:
i.e., x reads then (1000 - x) writes, all contending on e.
maximum possible delay is ~99.9 seconds if all 1000 requests are writes (the last write waits 999 × 0.1 s for those ahead of it).
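The worst-case delay arithmetic can be checked directly (assuming purely sequential service of conflicting writes):

```python
write_time = 0.1    # seconds per write
read_time = 0.0001  # seconds per read
writes = 1000       # worst case: every request is a write to the same resource

# The last write waits for the 999 writes ahead of it.
last_wait = (writes - 1) * write_time
print(round(last_wait, 1))  # 99.9 seconds
```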
Average consumption of space, assuming each resource receives an equal number
of requests.
1000 requests, n resources.
1000/n requests per resource.
(1000/n - 1) × 0.1 is the maximum time required to respond to a request (the last request to a resource waits for those ahead of it).
maximum possible space consumed locally when servicing such requests.
space consumed:
1000 requests per second.
Graph: (noderef, edge, time)
Maximum number of elements in the graph over the course of a second.
In the beginning 0
n write requests, all to the same resource.
~990 requests are buffered; each entry is (8 bytes, 4 bytes, 4 bytes) = 16 bytes, so ~16 KB.
Graph Map: key -> noderef
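The buffered-request space figure works out as follows, taking each buffered graph entry to be the (noderef, edge, time) triple of 8 + 4 + 4 bytes:

```python
entry_bytes = 8 + 4 + 4  # (noderef, edge, time)
buffered = 990           # requests buffered behind the sequential writes
total = buffered * entry_bytes
print(total)  # 15840 bytes, i.e. roughly 16 KB
```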
- Anonymous October 26, 2015