Goldman Sachs Interview Question for Software Engineer / Developers


Country: United States
Interview Type: Phone Interview





We can do something like this:

Physical Data        Logical Data
2 GB   -->   Lines of file [1-10]
2 GB   -->   Lines of file [11-20]
2 GB   -->   Lines of file [21-30]
2 GB   -->   Lines of file [31-40]
2 GB   -->   Lines of file [41-50]
2 GB   -->   Lines of file [51-60]
2 GB   -->   Lines of file [61-70]
2 GB   -->   Lines of file [71-80]
2 GB   -->   Lines of file [81-90]
2 GB   -->   Lines of file [91-100]

We can then have a hash map that stores each key range and, in a logical manner, the physical contents it holds.

Map example:
{1$10, lines 1-10},
{11$20, lines 11-20}, etc.

This way there is no need to load the entire file in one go; instead, our map decides which portion of the file to load based on the line to be read.


Moreover, we can index non-redundant words to help search for phrases or derive meaning from the sentences.

E.g., "chemicals" can be indexed as occurring in the following keys of the hash map:

"31$40" & "81$90"

"Weapons" is another word found in key "81$90". Such combinations of non-redundant and important words can be used to search for important phrases in the file without loading the entire file.
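
A rough Java sketch of this chunk-map idea (the class name, chunk size, and the rescanning loader are illustrative; a real version would also record byte offsets per chunk so it could seek directly instead of rescanning from the start of the file):

{{{
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: keep only one line-range chunk of the big file in memory at a time.
public class ChunkedLineReader {
    private final String path;
    private final int linesPerChunk;
    private final Map<Long, List<String>> resident = new HashMap<>();

    public ChunkedLineReader(String path, int linesPerChunk) {
        this.path = path;
        this.linesPerChunk = linesPerChunk;
    }

    // Return the 1-based line, loading only the chunk that contains it.
    public String readLine(long lineNo) throws IOException {
        long chunkId = (lineNo - 1) / linesPerChunk;   // lines 1-10 -> chunk 0, etc.
        List<String> chunk = resident.get(chunkId);
        if (chunk == null) {
            chunk = loadChunk(chunkId);
            resident.clear();                          // evict the previous chunk
            resident.put(chunkId, chunk);
        }
        return chunk.get((int) ((lineNo - 1) % linesPerChunk));
    }

    private List<String> loadChunk(long chunkId) throws IOException {
        long first = chunkId * linesPerChunk + 1;      // first line of the chunk
        long last = first + linesPerChunk - 1;
        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(new FileReader(path))) {
            String line;
            for (long n = 1; (line = r.readLine()) != null && n <= last; n++) {
                if (n >= first) lines.add(line);       // keep only this chunk's lines
            }
        }
        return lines;
    }
}
}}}

The word index would then simply be a second map from each indexed word to the set of chunk keys in which it appears.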


- ARgho Chatterjee.

- acJavaFreak, September 04, 2013


+3 to this answer?

LOL.

this.site.teeming.with.morons = true;

- Anonymous, September 04, 2013

His concept is similar to memory-mapped files; he is just not able to express it the way it is actually supposed to be done.

- algorithmor, November 04, 2014

Split the input file into N chunks (in this case, 2 GB each) that total K bytes (10 GB in this case). Load chunks on request. When memory is full, unload the oldest chunk or chunks to make space for new ones. Among other possible policies, you can unload a chunk based on its reference count (i.e., unload when ref = 0).
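
A minimal Java sketch of the unload-the-oldest policy, using an access-ordered LinkedHashMap as the LRU store (the class name and the choice of byte[] chunks are illustrative):

{{{
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: hold at most maxChunks chunks; evict the least-recently-used
// one automatically when a new chunk would exceed the budget.
public class ChunkCache {
    private final Map<Integer, byte[]> chunks;

    public ChunkCache(final int maxChunks) {
        // accessOrder = true: iteration order is least-recently-used first
        this.chunks = new LinkedHashMap<Integer, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                return size() > maxChunks;   // unload the oldest chunk
            }
        };
    }

    public byte[] get(int chunkId)            { return chunks.get(chunkId); }
    public void put(int chunkId, byte[] data) { chunks.put(chunkId, data); }
}
}}}

A reference-counted variant would keep a counter per chunk and permit eviction only once the counter drops to zero.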

- ashot madatyan, September 04, 2013

Something like a cache, with a lazy-loading approach? Design pages as <position, content> pairs: keep about 2K pages in memory (1 MB per page, roughly 2 GB total), while the file spans about 10K pages; a page's position is its byte address divided by the page size.

- Cheng Q., September 04, 2013

A disk file is typically accessed sequentially and hence poses challenges for accessing a record at random.
Therefore, make a copy of the file in which the records form a doubly-linked list (i.e., each record also holds a pointer to the previous one). Use a double-ended queue (deque) holding about 2 GB of records, so that as records are accessed from first to last or vice versa, old records are dequeued from one end while new records are added at the other.
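
A small Java sketch of the sliding window using an ArrayDeque (for simplicity the window is sized in records rather than bytes, and the file name is invented):

{{{
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: slide a bounded window of records over the file, so at most
// WINDOW records (~2 GB worth) are ever resident in memory.
public class SlidingWindow {
    public static void main(String[] args) throws IOException {
        final int WINDOW = 100_000;                   // records kept in memory
        Deque<String> window = new ArrayDeque<>(WINDOW);
        try (BufferedReader r = new BufferedReader(new FileReader("records.dat"))) {
            String record;
            while ((record = r.readLine()) != null) {
                if (window.size() == WINDOW) {
                    window.pollFirst();               // drop the oldest record
                }
                window.addLast(record);               // admit the newest record
                // ... process the current window here ...
            }
        }
    }
}
}}}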

- Murali Mohan, September 04, 2013

Huffman encoding???

- Hector, September 04, 2013


If the data only needs to be stored, we may apply a good compression technique. Otherwise, we have to build a logical view of the 10 GB of data within 2 GB of memory. Please comment if I misunderstood the problem.

- Anonymous, September 04, 2013

The problem is ambiguous; not your fault. Your interpretation is *much more* interesting than the other idiotic interpretation of partially reading the file, etc.

- Anonymous, September 04, 2013

Use a memory-mapped file (mmap).

- dc, September 04, 2013

Folks, the entire file (or data) has to fit into 2 GB of memory. Partially writing/reading chunks of data is not an acceptable solution.

- curious, September 05, 2013

Write the data to the file. Once it reaches 1 GB, compress the file (GZIP) and give it a different name. GZIP can compress at a ratio of up to 1032:1.
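
A streaming GZIP sketch in Java (file names are invented; note that 1032:1 is DEFLATE's theoretical best case on extremely redundant input, so real-world ratios are far smaller):

{{{
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

// Sketch: compress a finished 1 GB segment without ever holding it
// all in memory; only the 64 KB buffer is resident at a time.
public class CompressSegment {
    public static void main(String[] args) throws IOException {
        try (FileInputStream in = new FileInputStream("segment-001.dat");
             GZIPOutputStream out = new GZIPOutputStream(
                     new FileOutputStream("segment-001.dat.gz"))) {
            byte[] buf = new byte[64 * 1024];
            int n;
            while ((n = in.read(buf)) > 0) {
                out.write(buf, 0, n);    // stream the segment through GZIP
            }
        }
    }
}
}}}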

- KPS, September 16, 2013

Hmm.
If I take the expansion of pi in hex, convert it to binary, and truncate it at 10 GB... I want to see the magic algorithm that can compress that to 2 GB and recover the data.

- bigphatkdawg, September 17, 2013

The binary value of 2 is 0010; hence the value of 10 is 1010.

- sam, January 23, 2014

Use MappedByteBuffer. It is used in existing trading applications and offers low latency.
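
A minimal MappedByteBuffer sketch (the file name and window size are illustrative; a single mapping is capped at 2 GB, so a 10 GB file would be covered by several mapped windows):

{{{
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Sketch: map a window of the file into virtual memory; the OS pages
// data in on demand, so the whole file never has to fit in the heap.
public class MappedRead {
    public static void main(String[] args) throws Exception {
        try (RandomAccessFile file = new RandomAccessFile("big.dat", "r");
             FileChannel channel = file.getChannel()) {
            long window = Math.min(channel.size(), 1L << 30);  // map up to 1 GB
            MappedByteBuffer buf =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, window);
            byte first = buf.get(0);   // random access without an explicit read
            System.out.println("first byte: " + first);
        }
    }
}
}}}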

- algorithmor, November 04, 2014

Try an LRU cache (or Redis) with a capacity of 2 GB, holding one or more chunks.
When one chunk is done processing, discard it and bring in the next chunk.

- Siddhi, October 02, 2017

Put it on Dropbox and store the path! Only 2 GB? Pshaw.

Otherwise, possibly discuss compression algorithms; the best choice might be domain-specific, etc.

- Anonymous, September 03, 2013

this.joke = true

- siddhi, October 02, 2017

lol

- pkenil96, May 24, 2020

