How will you remove duplicate

Amazon Interview Question for Interns

0

of 0 votes

48
Answers
How will you remove duplicate characters in a string
- enthusiast November 15, 2012 in United States | Report Duplicate | Flag | PURGE
Amazon Intern Algorithm

Email me when people comment.

An error occurred in subscribing you.

Country: United States

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 1 vote

///
public static void main(String[] args) {
String str = "abcabccbcbaasadfjhbuewfrhjhjdsaf";
HashMap<Character, String> strHashmap = new HashMap<Character, String>();
for (int i = 0; i < str.length(); i++) {
strHashmap.put(str.charAt(i), null);
}
str = "";
for (Character keys : strHashmap.keySet()) {
str += keys;
}
System.out.println(str);

}\\\

- Solution November 15, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

Use a LinkedHashMap as order is important...

- Anuj Agrawal February 04, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

My solution will be
1.Sort the string using ascii values.O(nlogn)
2.Check the consecutive characters,if they are equal remove it. O(n);
so time complexity will be O(nlogn) space complesity will be O(1)

- Kannan S December 29, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Build the char binary tree (Right side of tree contains char more than root char & left side of tree contains char less than root & if it is same then it is duplicate)

- Senthil November 15, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

best answer in terms of performance...hashtable implementation is too easy

- jumbo November 15, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Will work great on average, but you still need to code that part that actually builds a output string - that's what most of people do wrong :)
BTW, speed-wise it would be event better to create an array of booleans (65535 items) - then it takes just few cpu instructions to check whether you have that letter already or not :)

- ImmortalRat November 16, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

how is using a bst better? Hashtable completes the solution in O(n). using a bst requires O(nlogn) { log n to insert each element * n elements). You don't need anything to be sorted you just want to know if it is duplicated.

- Anonymous December 05, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Use bitvector instead of hash table ..memory efficient one...

- ranganath111 January 26, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Why not use a hashset?

- btb February 09, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

This in C# this should be able to delete the duplicate characters from a string of characters:

static void Main(string[] args)
{
//Delete duplicate characters from string
Console.WriteLine("Input your Value:");
var strInput = Console.ReadLine();
char[] cInputArray;
cInputArray = strInput.ToCharArray();
int iCount = cInputArray.Length;

strInput = "";

for (int i = 0; i < iCount; i++)
{
if( !strInput.Contains(cInputArray[i]))
{
strInput += cInputArray[i].ToString();
}
}
Console.WriteLine("OutPut {0}", strInput);
Console.ReadLine();
}

- kiriti November 17, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

C# later versions(with generics and LINQ support). Hope it helps!

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

 class Program
 {
        static void Main(string[] args)
        {
            StringParser stoParse = new StringParsingFunctions();
            string orString = "Thisis a duplicate, string,; with: character'srepeated.";
            string ndString = stoParse.NonDuplicateCharacters(orString);
            Console.WriteLine("Original String: [" + orString + "]");
            Console.WriteLine("non-duplicate String: [" + ndString + "]");
        }
}

     public class StringParser
    {
        // Summary:
        //     Method to remove duplicate characters in an unsorted string.
        //
        internal string NonDuplicateCharacters(string oString)
        {
            string[] aString = oString.Select(c => c.ToString()).ToArray();
            string ndString = "";
            Dictionary<string, int> sDict = new Dictionary<string, int>();
            int iCount = 0;
If(aString.Length==0) return ndString;
else
{
            for (int i = 0; i < aString.Length; i++)
            {
                if (!sDict.ContainsKey(aString[i]))
                {
                    iCount++;
                    sDict.Add(aString[i], iCount);
                    ndString += aString[i];
                }
            }
          }
           return ndString;
       }
  }

}

- O November 20, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

private void removeDuplicates() {
		// TODO Auto-generated method stub
		
		String str="oiuytrddd";
		boolean flag = false;
		char[] temp = str.toCharArray();
		char[] temp1 = new char[temp.length];
		
		int k=0;
		for (int i = 0; i < temp.length; i++) {
			
			flag=false;
			for (int j = 0; j < temp1.length; j++) {
				
				if(temp[i]==temp1[j]){
					
					//System.out.println("iiii==="+temp[i]);
					flag=true;
					break;
				}
			}
			
			if(flag==false){
				//System.out.println("kkk==="+temp1[k]);
				temp1[k]=temp[i];
				k++;
			}
			
		}
		
		//System.out.println("=========="+tempStr);
		
		for (int i = 0; i < k; i++) {
			System.out.print("===="+temp1[i]);
		}
		
	}

- suvrokroy November 21, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

void main()
{
char Hash[255];

char str[]="Maladydfsgys";
int a[]={9,6,5,0,2,1};
int L1,L2;
int t;
// L1=-1;
// L2=-2;
for(t=0;t<255;t++)
{
Hash[t]=0;
}
for(t=0;str[t]!='\0';t++)
{
Hash[str[t]]=1;
}
puts("the original string:");
puts(str);
puts("\nthe non duplicate chars:");
for(t=0;t<255;t++)
{
if(Hash[t])
printf("%c",t);
}
}
}

getch();
}

- Qrkv December 01, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

My soultion would be ..

for example: abaacdabb.

1. Take first character 'a'. Check for the other characters in the string. If 'a' exists remove it.
2. After first iteration we are left with abcdbb.
3. Take second character 'b' , compare with the other character , remove the other occurances of b
4. After 3rd step we are left with abcd.
5. Similiar steps for c.
6. Checking for d can be omiitted as its the last character .

Hope my solution help !

- yash December 10, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

initialize an array of size 255.

set index =-1 if char is found

private void removeDuplicate() {
		// TODO Auto-generated method stub
		int [] temp = new int[255];
		String str="iuytrertyui";
		String t="";
		char [] arr = str.toCharArray();
		for (int i = 0; i < str.toCharArray().length; i++) {
			if(temp[(int)arr[i]]!=-1){
				t+=arr[i];
				temp[(int)arr[i]]=-1;
			}
		}
		System.out.println("==="+t);
	}

- suvrokroy December 16, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

#include <iostream>
#include <string>

using namespace std;

void removeDuplicate(string &s);
int main(){
	string s = "aaaabbbbcad";
	removeDuplicate(s);
	cout<<s<<endl;

}

void removeDuplicate(string &s){
	string ns;
	for(int i = 0; i < s.size(); i++){
		signed int count = 0;
		bool dup = false;
		do{
			if(ns.size() == 0){
				break;
			}
			if(s[i] == ns[count]){
				dup = true;
				break;
			}
			count++;
		}while(count < ns.size());
		if(!dup)
			ns.append(1,s[i]);
	}
	s = ns;
	
}

- kmonk December 18, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

public static boolean isUniqChars(String inputString)
{
int countChar = 0;
int pointer = 0;
System.out.println("countchr" + Integer.toBinaryString(countChar));
while (pointer < inputString.length()) {
int value = inputString.charAt(pointer) - 97;
if ((countChar & 1 << value) > 0) {

return false;
} else {
countChar = countChar | (1 << value);
}
pointer++;
}
return true;

}

- Raj January 08, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

- Raj January 08, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I would iterate through the "String", using a for loop, from the first to last character.
Create a Hashmap<Character, String>, (value doesnt really matter).

use put(Character, String) for every character in the "String", then use map.keySet().

Lastly, convert this set to a string (Should contain no more duplicates since hashmaps overwrites duplicate keys)

- InnoIrving January 09, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

LinkedHashSet is enough.

- Shu January 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Scala::

def rmDuplicates(s: String): String = s.toList match {
 case Nil => ""
 case c :: rest => c + rmDuplicates(rest.filterNot(e => e == c).mkString)
}

- rbsomeg February 16, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Assuming only English alphabets present in string
1)Convert string to char Array
2)Create an Boolean array of 26 all values initializes to false
3)Parse through the char array, for each char do true in the Boolean array for that alpahbet
4)use string buider to add all char == true in array of 26 to output

- Sanket February 25, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

for(int i=0;i<temp.length();i++)
	{
		if(newStr.find(temp[i]) == string::npos)
		{
			newStr += temp[i];
		}
	}

	cout<<newStr;

- Simple C++ Implementation October 04, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

import javax.swing.JOptionPane;

public class RemoveDuplicates {

		public static void main(String []args)
		{
			System.out.println(new RemoveDuplicates().removeDuplicates(JOptionPane.showInputDialog("Enter input String :")));
		}
		
		public String removeDuplicates(String input)
		{
			
			for(int i = 0;i < input.length();i++)
			{
				
				if(input.lastIndexOf(input.charAt(i)) != i)
				{
					for(int j = i+1;j < input.length();j++)
					{
						if(input.charAt(i) == input.charAt(j))
						{
							input = input.substring(0,j) + input.substring(j+1,input.length());
							
						}
					}
				}
			
			}
			
			return input;
		}
}

- Buzz_Lightyear November 15, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

If you find a duplicate character , make the original string a concat of substrings before and after the duplicate character

- Buzz_Lightyear November 15, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

this is probably the slowest and the most GC-intensive method to remove duplicates

- ImmortalRat November 15, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Whats GC intensive ?

- Buzz Lightyear November 15, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

You can use a StringBuffer to append if character isn't a duplicate but the above method does'nt use an extra data structure and does it in-place.

- Buzz Lightyear November 15, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

GC - garbage collection. every concatenation of two strings creates a new string. but previous two do not disappear from memory - memory for string if always allocated in heap, and not in stack. which means that when you exit the method scope, all the strings that you have just allocated are all sitting in dynamic memory. They will get removed from memory during next garbage collection (unless you were doing all that on a large input string, which is 8kb or 4k characters - then it goes into the large objects heap and stays there longer), but they go create a huge fragmentation of the heap, and garbage collection will probably be forced even before you finish running the method - that blocks entire process (before .NET 4) and all threads have to wait until garbage collector reclaims all unused objects (which get created in a huge volume). Even if you use StringBuilder - you are still doing calls to Substring - that method creates a new string every time, and while you do reduce number of strings that get created - you are still creating a huge amount of work for GC. Try to run the following test - run your method for 10000 times while giving it really large string as an input (usual stuff for evaluating performance) - and check hom much memory the app uses (you can also check out counters for GC runs on generation 1).

- ImmortalRat November 15, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Thanks ! Understood the issue !

- Buzz_Lightyear November 15, 2012 | Flag

Comment hidden because of low score. Click to expand.

-1

of 3 vote

If duplicates mean same character multiple times in a row, then there are two options here:

private string RemoveDubs( string str )
{
    int length = str.Length;

    if (length < 2)
        return str;

    StringBuilder builder = new StringBuilder(length);

    char previous = (char)0;

    for (int i = 0; i < length; i++)
    {
        char current = str[i];

        if (current != previous)
        {
            builder.Append(current);
            previous = current;
        }
                
    }

    return builder.ToString();
}

(Use pointer-base implementation if extra 30% of performance is critical)

- ImmortalRat November 15, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

You are only checking if the previous character and the current character are not same !

What if I give an input like abcda ??

You ll only check if 'a' is different from 'd' before appending to builder !!

This method won't work !

- Buzz Lightyear November 15, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

you probably have not noticed my text right before the code :) No one ever mentioned in a task whether duplicates are treated as duplicates only if they are in sequence or not, so I specifically mentioned which kind of problem my code solves.
Imagine that you are on interview, and you have been asked to solve the problem. My solution with along my comment would get me way more credits that any other solution without a comment or clarification, as the task is intentionally unclear :)

- ImmortalRat November 15, 2012 | Flag

Comment hidden because of low score. Click to expand.

-1

of 1 vote

Void main()
{
char str[ ];
int i, n,j;
n=strlen(str);
for(j=0;i<n;j++)
{
if(str[i]==str[j])
{
count++;
}
if(count<2)
{
str2[]=str[i];
}
else
{

- Niks November 15, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
main()
{
char *a=(char *)malloc(1000);
char *b;
char *c;
char *d=a;
printf("enter string\n");
scanf("%s",a);
b=a;
while(*a!='\0')
{b=a+1;
while(*b!='\0')
{
if(*b==*a)
{c=b;
while(*c!='\0')
{
*(c)=*(++c);
} }
b++;

}
a++;
}
printf("%s",d);

getch();

}

- ankur November 16, 2012 | Flag

Comment hidden because of low score. Click to expand.

-1

of 1 vote

class RemoveDuplicate{

static int[] str = new int[256];

static{
for(int j=0; j<256; j++){
str[j] = 0;
}
}
public static void main(String[] str2){
String str1 = "eabaccdcde";
for(int i=0;i<str1.length();i++){
if(str[str1.charAt(i)] == 0){
System.out.print(str1.charAt(i));
str[str1.charAt(i)]=1;
}
}
}
}

- Anand Thakur November 15, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

void RemoveAllDuplicate(char* ptr)
{
bool arr[256] = {false};
char* it = ptr;
char* CopyIt = ptr;

while (*it)
{
if ( arr[*it] == false ){
*CopyIt = *it;
arr[*it] = true;
CopyIt++;
}
it++;
}
*CopyIt = '\0';
}

- Nishikant November 15, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

- Nishikant November 15, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

I would use an array of 26 Boolean values where first element corresponds to 'a' and the last element corresponds to 'z' to indicate if a character has previously seen in the input string.

- Anonymous November 15, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

string s = "This is a string .";
char[] a = s.ToCharArray();
string newString = "";
for (int i = 0; i < a.Length; i++)
{
if(newString.Contains(a[i]))
{
}
else
{
newString = newString + a[i];
}
}

- NewToCoding November 16, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

-2

of 2 vote

by placing all characters of a string in hashtable. group characters from hash table to form string back

- Anonymous November 15, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

i don't understand why it was downrated...hashing may be a good option to keep a track of chars already found if no space constraint is mentioned.

- The Artist November 15, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

That's because you lose the sequence. Even though it will be fast

- ImmortalRat November 16, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@immortalRat: how abt having a linked list of chars too, to which you will be appending the chars which isn't present in the hashmap in addition to adding it to the hashmap?

- Anonymous November 16, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

how about maintaining the hash-table only for checking if there was a duplicate, if yes, remove the duplicate from the given string in place?

- kat November 17, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@kat: string manipulation is a highly process (and memory) intensive task...my suggestion above (hashtable with linked list) would be a better option

- The Artist November 18, 2012 | Flag

Comment hidden because of low score. Click to expand.

-2

of 2 vote

maintain a bool visited array of 256 size.
bool visited[256]={0};
remove_duplicates(char* input)
{
int tial=0;
int len=strlen(input);
for(int i=0;i<len;i++)
{
if(visited[input[i]] == 0) {
input[tail++] = input[i];
visited[input[i]-'a'] = 1;
}
}
input[tail]='\0';
}

- bharat November 15, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Consider all unicode characters and your method will fail severely. A hash map would be better to keep track of already found chars and a linked list of chars or a string builder/buffer (in c#/java) to create the output string would be the best option to go for if there;s no space constraint.

- The Artist November 15, 2012 | Flag

CareerCup

Amazon Interview Question for Interns

Books

Videos

Resume Review

Mock Interviews