Microsoft Interview Question
Software Engineer / DevelopersCountry: India
Interview Type: In-Person
Use circular buffer of length n, where each node stores one line. At the end of file. Print data in buffer line by line
This will result in extra memory usage of "n".
While, Manpreet's answer is more optimized one :)
@play2win. You are right. But Manpreet's answer assumes you have access to different locations in a file (which is a reasonable assumption given the way the question is stated), while this does not. This assumes it is a stream. For instance, this will work with data you are receiving over a network.
Also, you will read over most of the file twice. While this algorithm will read it only once, and sequentially (which might be better for performance for cylinder-disk based files).
Assuming n is much smaller compared to the total number of lines, this algorithm will be much better.
If you want to avoid reading over the file twice, you will have to save the n offsets in file (which can also be done in the circular buffer solution, instead of saving the lines), so it is not really that different.
btw, if you are assuming disk based files, a better solution would to be start reading from the end of the file (after a seek...).
So an example algorithm would be:
If the expected length of the line is L (say fixed length records in a log file), then seek to end - nL and read the data, and see if you have gotten the n lines. If not, you seek appropriately and read again.
This way you will not read most of the file you don't need to read.
You could also seek constant bytes prior and read the lines in constant chunks and avoid reading no more than a constant number of bytes.
Which algo you choose will depend on the line structure.
Yup that's correct that choice of algorithm will depend on the file structure and line structure too.
But looking to the problem given, it seems to be simple text file which can be read from starting only. If data is coming from the network or its in a disk based files, problem will completely change. We shall have different problems for the specific scenarios.
Looking towards a simple and best solution, its easy to maintain two pointers. Let's not complex the problem without the need :) Else it will be like confusing the interviewer :P
Nice to see the brainstorming the problem!!
@play2win: I think reading from end (whether it is a simple text file or not) is a simple solution and not confusing at all. In fact, interviewers (in Microsoft/Google) actually look at such discussion as a good sign.
Yaa true reading from end shall work. But still someone has to parse the file till end. Whether it is done by you in code or the API of the language you are using.
Doubt, if that will be acceptable, otherwise anyone will think that as first solution but no one will post that..
@play2win: Have you heard of file seek? I am guessing not, as otherwise your last comments don't make any sense.
Please provide the code snippet in some language to understand the same.
Although I am not interested much in putting my energy into one question...
class PrintLastN
{
public void PrintLastNLinesOfFile(string fileName, int numberOfLines)
{
try
{
if (numberOfLines <= 0)
throw new Exception("Invalid Number of Lines: Number of lines have to be greater than 0");
int i = 0;
StreamReader streamReaderPointer1 = new StreamReader(File.Open(fileName, FileMode.Open, FileAccess.Read, FileShare.Read));
StreamReader streamReaderPointer2 = null;
while (streamReaderPointer1.ReadLine() != null)
{
i++;
if (i == numberOfLines)
{
streamReaderPointer2 = new StreamReader(File.Open(fileName, FileMode.Open, FileAccess.Read, FileShare.Read));
}
else if (i > numberOfLines)
{
streamReaderPointer2.ReadLine();
}
}
if (streamReaderPointer2 != null)
{
string line = string.Empty;
while ((line = streamReaderPointer2.ReadLine()) != null)
{
Console.WriteLine(line);
}
}
else
{
Console.WriteLine("Invalid Number of Lines: Number of lines are less than expected");
}
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
}
finally
{
Console.WriteLine("Press key to end...");
Console.ReadLine();
}
}
}
Build a queue of length n with first n lines. At each iteration enqueue the element read, and dequeue. Once the file end has been reached, dequeue and print lines, unless queue is empty.
Use two index/pointers, one lets say P1 & P2, Now enhance P1 to n lines, if it already reaches to end of file and then print the file with 2nd pointer. If not then
start both pointer, enhance both one by one, means enhance P2 by one line and then enhance P1 by one line(starting with first line),
when we P2 reaches to end of file, then we print lines with the help of P1.
#include<stdio.h>
#define N 33
void printfile(FILE *file,long offset,int line){
if(line > N ){
return;
}
if(fseek(file,-1*offset,SEEK_END)!=-1){
char ch=fgetc(file);
if(ch == '\n'){
line++;
}
printfile(file,offset+1,line);
printf("%c",ch);
}
}
void main(){
FILE *file = fopen("file.txt","r");
if(file == NULL){
//printf("file not found \n");
perror("Error");
}
printfile(file,0,1);
}
We can also think this problem similar to find the nth element from a singly linked list.
- Manpreet February 07, 2013We can have two variables:
- one for start point of file.
- one for end point of file, first time by moving 'l' lines ahead. (As we do not know the no of lines in a file, so to keep track of the size of buffer we are interested in, Like we used to do for finding nth element)
After this, move end point one line ahead and check is EOF?
If not, then also increase the start point one line ahead and so on...
When you will hot EOF with the end point, then you have the point from the start.
you then just have to print the lines on the way till end.
Hope this is what, we are exactly supposed to do.
Please add, If required.
Thank you..!!