Given a file of n lines, where

Amazon Interview Question for Software Engineer / Developers

0

of 0 votes

36
Answers
Given a file of n lines, where n-1 lines are identical and 1 line is different. Find the unique line in not more than one scan of the file.
- Nascent May 25, 2013 in India | Report Duplicate | Flag | PURGE
Amazon Software Engineer / Developer

Email me when people comment.

An error occurred in subscribing you.

Country: India

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 5 vote

First read three lines.
If they are all same, these lines do not contain the unique line. Save this line, and go on matching for the other line in the rest of the (n-3) lines.

If they are not all same, two of them must be same. The other one is the answer.

- Hello world May 25, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

But we only have the luxury of scanning the line once..If the three lines does'nt match then you have to again scan the lines to search for unique line.Correct me if I'am wrong

- Sibendu Dey June 30, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 6 vote

I can think of a simple solution..
XOR all lines and save the result in a variable 'res'.
If n is odd,
answer=res
If n is even,
if(line1.equals(line2))
answer=res^line1
else
answer=res^line3.

- teli.vaibhav May 25, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

This should work.

- aka May 26, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 2 votes

Actually we can solve the problem in another way:
1. Assume that in the current step of iteration you have three variables of string : previous, current, next
2. Check on each iteration :
2.1. if (previous == current) do noting
2.2. if (previous != current)
2.2.1. if (current == next) return previous
2.2.2. if (current != next) return current

Complexity : O(N * L) where L - max length of line
Auxiliary Memory : O(1)

Xor-solution.
Complexity : O(N * L) , where L - max length of line
Auxiliary Memory : O(L)

Here is an implementation of your idea on C#:

const int MAX_LINE_LENGTH = 100;
        static string GetUniqueLine(string[] lines)
        {
            if (lines.Length < 3) return null; // or handle it in another way

            char[] result = new char[MAX_LINE_LENGTH];
            foreach (string line in lines)
            {
                XorString(result, line);
            }

            if ((lines.Length & 1) == 0)
            {
                if (lines[0] == lines[1])
                {
                    XorString(result, lines[0]);
                }
                else
                {
                    XorString(result, lines[2]);
                }
            }

            return new string(result.TakeWhile(ch => ch != '\0').ToArray());
        }

        private static void XorString(char[] result, string line)
        {
            for (int i = 0; i < line.Length; i++)
            {
                result[i] ^= line[i];
            }
        }

- ZuZu May 26, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

LOL.

- Anonymous May 27, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 2 votes

XORing all the lines till the end may be little overwhelming. What if the different line was the 4th line in a million line file?

1. Compare 1st and 2nd lines.
a. If they both are equal, keep XOring the 1st line(or the 2nd line) with 3rd line onwards. Whenever the result of XOR operation !=0 then return the different line.
2. If they both are different XOR 1st line and 3rd line, XOR 2nd line and 3rd line. return whichever is nonzero

- Murali Mohan May 28, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 2 vote

Hash the current line and keep track of its count. After scanning all the lines find the line with count = 1 in the hash map.

- Anonymous May 26, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

But there is possiblilty that different lines can generate same hash value, at that time this logic will fail ...

- Atul Kumar June 12, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 2 vote

Store 3 lines at a time: previous, current & next. Initially, previous = line #1, current = line #2 and next = line#3.

while (next != EOF) {
if (!previous.equals(current) {
if (next.equals(current))
return previous;
else return current;
} else if (!current.equals(next)) {
return next;

previous = current;
current = next;
next = readnextLine();
}

- Murali Mohan May 25, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Is this a trick question of some sort ? But to me, it seems simple. maybe i am missing something.
However, we can do it this way too:
1. Have 2 counters initialized to 0 - counter1 similar lines and counter2 for the different line.
2. Begin scanning the file tokenized by \n.
3. Update the appropriate counters
4. Worst after about n-1 scans, the 2nd counter will be updated.
5. In either case, return current line when counter2 < counter1 && (counter1&&counter2 are both > 0)

- Anonymous May 25, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Also to add to the above,
won't the below suffice ?

uniq -u file.txt

- Anonymous May 25, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Amazon really likes hearing Unix ways to solve things FYI

- Farid June 04, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

- Hello world May 25, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

As hello outlined, just make sure that the first line is not unique, then use that as the comparison point. Java code below.

public class FindOddOneOut {
	public static void main(String[] args) {
		{
			String [] arr = {"hello", "hello", "hello", "hello", "bye", "hello", "hello", "hello"};
			System.out.println(oddOneOut(arr));
		}
		{
			String [] arr = {"bye", "hello", "hello", "hello",  "hello", "hello", "hello"};
			System.out.println(oddOneOut(arr));
		}
		{
			String [] arr = {"hello", "hello", "hello",  "hello", "hello", "bye"};
			System.out.println(oddOneOut(arr));
		}
		{
			String [] arr = {"hello", "hello", "hello",  "hello", "hello"};
			System.out.println(oddOneOut(arr));
		}
	}
	
	public static String oddOneOut(String [] lines)
	{
		if(lines.length <= 2)
			return null;
		else if(!lines[0].equals(lines[1]))
		{
			return lines[0].equals(lines[2]) ? lines[1] : lines[0];
		}
		else
		{
			for(int lineAt = 2; lineAt < lines.length; lineAt++)
			{
				if(!lines[lineAt].equals(lines[0]))
				{
					return lines[lineAt];
				}
			}
		}	
		return null;
	}
}

- Anonymous May 26, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Find the line which is unique, show it's index.
Short circuit the read if the unique line is found.
Must read no less than three lines if the line count is > 3.
1. Read each line of the file.
2. The first line is compared to each consecutive line, noting the line count.
3. When the line that is unique is encountered, report that line number and exit.

public void readFileLines(String fileName) {
		try {
			File file = new File(fileName);
			FileReader fileReader = new FileReader(file);
			BufferedReader bufRead = new BufferedReader(fileReader);
			String line1 = new String();
			String line2 = new String();
			String lineNext = new String();
			
			// Line count to identify which line is different than the rest.
			int count = 0;
			
			while ((lineNext = bufRead.readLine()) != null && !lineNext.isEmpty()) {
				++count;
				if (!line1.isEmpty() && !lineNext.equals(line1)) {
					line2 = lineNext;
					if ((lineNext = bufRead.readLine()) != null) {
						++count;
						if (lineNext.equals(line1)) {
							System.out.println("Line[" + (count-1) + "] is unique: \"" + line2 + "\"");
						}
						else if (lineNext.equals(line2)) {
							System.out.println("Line[" + (0) + "] is unique: \"" + line1 + "\"");
						}
						System.out.println("The other  Lines : \"" + lineNext + "\"");
					}
					break;
				}
				else {
					if (line1.isEmpty()) {
						line1 = lineNext;
					}
				}
			}
			if (!line1.isEmpty() && line2.isEmpty()) {
				System.out.println("All lines in input file are the same. There are " + count + " lines.");
				System.out.println("Each line is: \"" + line1 + "\"");
			}
		}
		catch (IOException e) {
			System.err.println("IOException: " + e.getMessage());
		}
	}

- msmtdg May 26, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

sort filename|uniq -c => gives the count of the lines and the line

Now using awk

sort filename|uniq -c |awk -F ' ' '{ if ($1==1) for (i=1; i<=NF; i++) print $i}'

- ccr May 26, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I think its a trick ques :) (The trick is the part where it says 1 scan of the file).. I think we can do it in log(n) time frame:

divide the file in 3 equal pieces. hash them and compare them or simply compare. 2 will be equal and 1 different.. take the different one and repeat the process till you find the diff line.. :)

- abhi May 26, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

This is O(log N)? LOL.

- Anonymous May 27, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

How are you comparing the pieces ?

- Ashish May 28, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

String findUniqueLine(File f)
{
	String l1,l2,current = null;
	int l1count,l2count =0;
	while(!eof(f))
	{
		current = f.readline();
		if(l1==null){l1 = current;l1count++;}
		elseif(l1=current){l1count++;}
		elseif(l2==null){l2 = current;l2++; break}
	}

	if(l1count == l2count)
	{
		if(f.readline() == l1){return l2;}	
		else{return l1};
	}
	elseif(l1count>l2count)
	{
		return l2;
	}
	else
	{
		return l1;
	}
}

- Muktesh May 27, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

String findUniqueLine(File f)
{
	String l1,l2,current = null;
	int l1count,l2count =0;
	while(!eof(f))
	{
		current = f.readline();
		if(l1==null){l1 = current;l1count++;}
		elseif(l1=current){l1count++;}
		elseif(l2==null){l2 = current;l2++; break}
	}

	if(l1count == l2count)
	{
		if(f.readline() == l1){return l2;}	
		else{return l1};
	}
	elseif(l1count>l2count)
	{
		return l2;
	}
	else
	{
		return l1;
	}
}

- Muktesh May 27, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

// THE PSEUDO CODE IS AS FOLLOWS :

firstline = line(1)
if n < = 2
// This means that there are not enough lines to find the unique line
Exit
else
{
for i = 2 to n
{
if firstline = line(i)
Continue
else
{
if i != 2 Then
Printf "Line %d is different " &i
else
' Means the difference lies in first and second line
if line(3) = line(1)
Printf "Line 2 is different "
else
Printf "Line1 is different"

exit for
}
}
}

- Michelle May 27, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

// THE PSEUDO CODE IS AS FOLLOWS : 

firstline = line(1)
if n < = 2 
	// This means that there are not enough lines to find the unique line 
	Exit 
else 
{
	for i = 2 to n 
	{ 
		if firstline = line(i) 
			Continue 
		else 
		{ 
			if i !=  2 Then 
				Printf "Line %d is different " &i
			else 
				' Means the difference lies in first and second line 	
				if line(3) = line(1) 
					Printf "Line 2 is different "
				else 
					Printf "Line1 is different"	
		
	
		exit for 
		} 		
	} 
}

- Michelle May 27, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

1. Take the first and second line compare with each other
2. If both are identical then compare with rest of the lines whenever you find the different string report as unique line.
3. If bother are different string then compare these two strings with 3rd line and report the unique string which is not matching with 3rd line

- Senthilmanikandan May 27, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I think a better approach will be to use hashing. Compute a hash value for the first line and store it in a local variable. Then, continue to compute hash values for the successive lines until you get a hash value other than that of the first. This way you can avoid string comparisons..
- Pradeep

- Anonymous May 27, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Thanks @Pardeep - That is a good possible solution.
The cost of computing the hashCode for a string in Java iterates over the string length should be less expensive than doing a string compare at least in memory useage .

- msmtdg May 28, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

String findUniqueLine(File f)
{
String l1,l2,current = null;
int l1count,l2count =0;
while(!eof(f))
{
current = f.readline();
if(l1==null){l1 = current;l1count++;}
elseif(l1=current){l1count++;}
elseif(l2==null){l2 = current;l2++; break}
}

if(l1count == l2count)
{
if(f.readline() == l1){return l2;}
else{return l1};
}
elseif(l1count>l2count)
{
return l2;
}
else
{
return l1;
}
}

- Muktesh May 27, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

public void findDupInFile() throws Exception {
BufferedReader br = new BufferedReader(new FileReader(new File("E:\\Application.log")));
Set<String> line = new LinkedHashSet<String>();
List<String> unq = new LinkedList<String>();

String str = "";

while((str = br.readLine())!=null){
if(line.add(str))
unq.add(str);
else
unq.remove(str);
}
System.out.println(unq.get(0));
}

- Rahul Singh Rathore May 29, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

In the first three lines, determine the common line.
If possible, determine the unique lndex and line
Early terminate the loop for the rest if unique line is found
sites.google.com/site/spaceofjameschen/home/file-and-iostream/given-a-file-of-n-lines-where-n-1-lines-are-identical-and-1-line-is-different-find-the-unique-line-in-not-more-than-one-scan-of-the-file-amzon

- Anonymous May 30, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

How about this??
Store the first line in a string.
Compare first line with second.If not equal then compare third line with first.If they are equal then second is unique else first.
If equal then iterate till you find the unique one.

- Sibendu Dey June 30, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

num=0;
while( line != NULL ) 
{
	if (num == 0 )
	{
		num=sumOfChar(line);		
	}
	else if ( num != sumOfChar(line))
	{
		return line ; 
	}
}

- omega May 25, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Your solution is wrong.
For example, file contains 5 lines :
ABCD
QWERTY
QWERTY
QWERTY
QWERTY

Your program will output "QWERTY" which is wrong in this case.

- ZuZu May 26, 2013 | Flag

Comment hidden because of low score. Click to expand.

-2

of 2 vote

wouldn't xor work here?
example:
abc abc abc
abc abc abc
cba cba cba
xor 1 and 2 line and you will find it will be zero and again do xor with 3 line will give non-zero and there you get your answer.
printf("%d %d\n", 'a'^'a','a'^'a'^'c');

- aka May 25, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

it would work if the given 'n' is odd..

- teli.vaibhav May 25, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

what if you have
cba cba cba
abc abc abc
abc abc abc
In your algorithm, you will not be able to identify which line is unique

- msmtdg May 26, 2013 | Flag

CareerCup

Amazon Interview Question for Software Engineer / Developers

Books

Videos

Resume Review

Mock Interviews