Yahoo Interview Question
Software Engineer / DevelopersAssumptions: perfect hashing, a keyword appears only once in all 100 text files.
Divide all the 100 files into keywords, suppose we will get some 1000 keywords.
Consider having hashtable(array representation) of 1000 entries. Each hash table entry can hold a struct pointer having fd and offset in the file pointing to particular paragraph.
____________
k1 | fd1,offset1|
|------------|
k2 | fd2,offset2|
|------------|
k3 | fd3,offset3|
|------------|
and soo on
FD represents a particular file and offset represent distance from starting of file to starting of paragraph in which keyword is present.
After preprocessing its O(1) time operation. with O(num of keywords space).
"If a particular keyword can be part of different paragraphs and also be different files. And have to return all the paragraphs in which the file a particular keyword occurs"
_______ ___________ _______ _______
k1| --|--|fd1,of1,of4|--|fd2,of2|--|fd3,of3|
|-------| _______
k2| ----|--|fd5,of5|
and so on
Fd1 represents a particular file and of1,of4 represents the offsets to two different paragraphs in which this keyword K1 occurred.
write a script to use grep on those 100 text files for the keyword
- chennavarri October 18, 2010