Facebook Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: In-Person
Hash table with an interface to code/decode the hash values as "browser-friendly" ASCII strings. Base64 or Base56 are good choices for this.
Speed can be improved by using caching. The most visited URLs and their hashes would be maintained in a caching hash table that is kept in main memory.
Questions
How tiny?
If a URL is already shortened, do we shorten it again?
If a URL is a URL that does not lead anywhere, but is a made up URL, then do we still shorten it?
If the URL is already “short enough” do we shorten it further?
Do we shorten URLs that are short URLs generated by our short url system?
How many URLs are we talking about? -- Is it more than Long.MAX_VALUE?
Core Design
Insert URL to database and use something like base 62 [a-z,A-Z,0-9] encoding of the index column.
We look up if a URL is already shortened -- if it is then return that shortened version.
Go to URL and don’t create a short URL if you get a 404 or if DNS resolution fails (domain doesn’t even exist)
Check for malicious content.
Check against a blacklist of websites.
Send content to an API that tries to use AI to understand if malicious content.
When a user does a GET on a short URL, then do a 302 to redirect user to it.
Provide functionality to flag a URL.
Provide functionality to delete a URL?
There can be a potential problem here - If I insert a URL, delete it and then repeat this process a large number of times, I will extinguish available short URLs
Provide functionality to detect DOS attacks
Throttle number of requests per IP per URL.
Add frequently used URLs to cache.
Disk/Database -
--Replication
--Sharding - based on URL
--Disk striping (RAID0)
--Use SSD.
Cache
Distributed cache
Edge cache servers for regions.
Possible problems
If I insert a URL, delete it and then repeat this process a large number of times, I will extinguish available short URLs
-We can mark a URL deleted on delete, then while creating new URLs, first get the list of URLs marked deleted and replace the first one of those
-If no URL marked for delete, then insert new record into DB.
Dang it ... I meant to write:
- JSDUDE January 19, 2015Design a Tiny URL system.