Amazon Interview Question
Software Engineer in TestsCountry: United States
Interview Type: Phone Interview
c:\> .... implies windows shell
And in unix you would use find:
find <dir> -name "Amazon" [-type .. ]
print_amazon_file_names(directory)
{
list = directory_listing(directory)
whlie(x= iterate through list)
{
if(x is directory and is not "." nor "..")
print_amazon_file_names(x);
if(x is a filename )
if(substring(x, "Amazon")) print(x);
}
}
There is an upperlevel on pathname_size and "Amazon" is fixed, so even the naive sliding window substring function has O(1) complexity.
The above algorithm does O(1) when it meets a directory too.
Thus O(f + d) is overall efficiency.
The question asks if "Amazon" is present in the filename. So we need to check if there is a substring "Amazon" anywhere in the name. If the name has length N, this will take O(N), not O(1)
Total complexity is O(number_files * filename_length + number_folders)
Create a hash table of directory names where the value of each table entry is a string of all names that are appended. We can use a special character to denote the start of a new file name. Now, traverse the hash table and use KMP algorithm to search for term "Amazon" in each of these strings, output the file name that contains the term.
If appending is not an option, then each entry of the hash table can be considered as a linked list where each node is a file name under that particular directory.
The complexity would be O(mn) where m denotes number of directories and n is the maximum number of files in one these directories (worst-case).
Another constraint to check is how deep can the directory level get? If directory depth is too high, then a recursive solution may not be preferable.
print_amazon_file_names(directory)
{
add_dir_to_queue(directory);
whlie(queue not empty)
{
list = directory_listing(directory)
while (list not empty) {
if(x is directory and is not "." nor "..")
add dir to queue();
if(x is a filename ) {
if(substring(x, "Amazon"))
print(x);
}
}
}
}
Assume we are interested in only searching file names not folder names.
-Form a Suffix tree for all the file names. This will enable to search file name containing 'Amazon' and can work for any names. And also with this approach search is faster and adding new file entry is also simple
If the interviewer is also interested in knowing the folder having the file, then this can be handled using a separate data structure which contains the list of folder paths alone and tag them with unique id. Use this id in suffix tree leaf node, to indicate the folder they belong to.
Here is my Ruby implementation.
#!/usr/bin/env ruby
# I interpret this to mean "Recursively find all files and directories containing a
# particular case sensitive substring in their names."
def find_files(fdir,name,options = {})
a = []
defaults = {case_sensitive: true, ignore_hidden: true}.freeze
options = defaults.merge(options)
Dir.entries(fdir).each do |e|
next if defaults[:ignore_hidden] && e =~ /^\..*?/
f = fdir + File::SEPARATOR + e
m = options[:case_sensitive] ? (e =~ /.*?(#{name}).*?/) : (e =~ /.*?(#{name}).*?/i)
if m
a << f
end
if File.directory?(f) && !(f =~ /\.{1,2}/)
a += find_files(f,name,options)
next
end
end
a
end
fdir = ENV["HOME"]
p find_files(fdir,"mp4",case_sensitive: true)
Your first answer to this question should be another question.
- Ryan September 13, 2013"Can I use grep?"
Only if they say no, should you try to write an algorithm.