Scientific Games Interview Question
Dev Leads
Country: India
It involves 2 actions. First, read the entire file into memory.
Then start 32 threads working in parallel, each on its own chunk of roughly 300 MB.
Because searching is CPU-intensive, there is no point in creating more than 32 threads: the machine has 32 cores, and all of them will already be busy.
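A minimal sketch of the chunked search, assuming the file is already in memory as a `bytes` object and the pattern is a plain byte string. The names `find_in_chunk` and `parallel_search` are made up for illustration. Each worker looks `len(pattern) - 1` bytes past the end of its chunk so that a match straddling a chunk boundary is not missed. Note that in CPython the GIL keeps threads from using all cores for this kind of work, so treat this as the shape of the solution; a process pool, or a language with real parallel threads, would be needed to actually keep 32 cores busy.

```python
from concurrent.futures import ThreadPoolExecutor


def find_in_chunk(data: bytes, pattern: bytes, start: int, end: int) -> list[int]:
    """Return absolute offsets of all matches that begin in [start, end)."""
    hits = []
    # Look len(pattern) - 1 bytes past the chunk end so that a match which
    # begins in this chunk and spills into the next one is still found.
    limit = min(len(data), end + len(pattern) - 1)
    pos = data.find(pattern, start, limit)
    while pos != -1:
        hits.append(pos)
        pos = data.find(pattern, pos + 1, limit)
    return hits


def parallel_search(data: bytes, pattern: bytes, workers: int = 32) -> list[int]:
    """Split data into about `workers` chunks and search them concurrently."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    chunk = max(1, -(-len(data) // workers))  # ceiling division
    ranges = [(s, min(s + chunk, len(data))) for s in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda r: find_in_chunk(data, pattern, *r), ranges)
        # pool.map yields results in chunk order, so offsets come out sorted.
        return [offset for part in parts for offset in part]
```

Because each worker only reports matches that begin inside its own chunk, a match near a boundary is reported once, not twice.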
How to optimize searching further:
Build a trie for each chunk, where the node that ends a word stores the line numbers on which that word appears.
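A rough sketch of such a trie, assuming words are separated by whitespace and that a chunk is handed over as a list of lines together with the number of its first line. `TrieNode`, `build_trie` and `lookup` are hypothetical names, not from any library.

```python
class TrieNode:
    __slots__ = ("children", "lines")

    def __init__(self):
        self.children = {}  # next character -> TrieNode
        self.lines = []     # line numbers where a word ending at this node appears


def build_trie(lines, first_line_no=1):
    """Index every whitespace-separated word of a chunk by line number."""
    root = TrieNode()
    for line_no, line in enumerate(lines, start=first_line_no):
        for word in line.split():
            node = root
            for ch in word:
                node = node.children.setdefault(ch, TrieNode())
            # Record each line only once, even if the word repeats on it.
            if not node.lines or node.lines[-1] != line_no:
                node.lines.append(line_no)
    return root


def lookup(root, word):
    """Return the line numbers on which `word` appears as a whole word."""
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:
            return []
    return list(node.lines)
```

For example, `lookup(build_trie(["the cat sat", "a cat"]), "cat")` gives the lines containing "cat". Building the index costs time and memory up front, so it probably only pays off when the same file is searched for many different words.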
Is the file on disk? Then the bottleneck is disk I/O.
Is the file in memory? Then the bottleneck is memory bandwidth.
Is the file on multiple machines? Good, we can work in parallel: assign blocks of x GB to m machines, and take care that the blocks overlap so that at least one machine has the whole pattern ...
- Chris August 12, 2017
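The overlapping blocks could be planned like this. It is only a sketch, assuming a fixed-length pattern whose length is known up front; `overlapping_blocks` is a made-up helper, and how the blocks are shipped to the machines is left out. Each block is extended by `pattern_len - 1` bytes into the next one, so any occurrence of the pattern lies completely inside at least one block.

```python
def overlapping_blocks(total_size, block_size, pattern_len):
    """Return (start, end) byte ranges, end exclusive, one per machine."""
    if block_size <= 0 or pattern_len <= 0:
        raise ValueError("block_size and pattern_len must be positive")
    overlap = pattern_len - 1
    blocks = []
    start = 0
    while start < total_size:
        # Extend each block into its successor so a pattern that begins
        # near the end of the block is still fully contained in it.
        end = min(start + block_size + overlap, total_size)
        blocks.append((start, end))
        start += block_size
    return blocks
```

For a 10-byte file, 4-byte blocks and a 3-byte pattern this gives `[(0, 6), (4, 10), (8, 10)]`.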