## VMWare Inc Interview Question

Interns

**Country:** India

**Interview Type:** Phone Interview

Note: If the value type is integer, then we can use the following algorithm to remove all duplicates in O(n) time.

Algorithm:

```
LinkedList removeDuplicates(LinkedList list) {
    Node pre = null;
    Node tmp = list.start;
    int bitarray = 0;            // one bit per possible value (works for values 0..31)
    while (tmp != null) {
        int value = tmp.value;
        /*
         * Left-shift the integer 1 by a number of places equal to the value.
         * If value = 4, x becomes 0001 0000.
         */
        int x = 1 << value;
        if ((bitarray & x) != 0) {
            System.out.println("Duplicate found in given linked list: " + value);
            /*
             * deleteNode removes the current node in O(1) by copying the next
             * node's fields into it. That trick cannot delete the last node,
             * so we need to check for that case separately.
             */
            if (tmp.next == null) {
                pre.next = null;
                return list;
            } else {
                deleteNode(list, tmp);  // tmp now carries the old next node's value
            }
        } else {
            /* Bitwise OR records the value in the bit array. */
            bitarray = bitarray | x;
            pre = tmp;
            tmp = tmp.next;
        }
    }
    return list;
}

void deleteNode(LinkedList list, Node node) {
    /* Copy the value from the next node into the current node, then unlink the next node. */
    Node tmp = node.next;
    node.value = tmp.value;
    node.next = tmp.next;
    tmp.next = null;
}
```

No, it is different: we are not using O(n) extra space, only O(1) space — a single integer used as a bit array. :)

At first x was 1, and the first encountered number was 4.

Then x = x << 4;

1 << 4 becomes 16, which is 0001 0000.

It does not depend on the value of x itself; it depends on the position of the set bit in x. Suppose the numbers 2, 3, 4, 4 are given. After the third iteration the bit array looks like ...0001 1100, so in the fourth iteration the & operator immediately shows that 4 is a duplicate:

0 & 1 = 0

1 & 0 = 0

1 & 1 = 1 → duplicate
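The bit-array trick described above can be seen end to end in a small runnable sketch (the class and method names are illustrative, not from the original post); a single `int` serves as the bit array, so it only works for values in the range 0..31:

```java
import java.util.ArrayList;
import java.util.List;

public class BitArrayDedup {
    // Removes duplicates, keeping first occurrences. Uses a single int as a
    // 32-entry bit array: bit i is set once value i has been seen.
    static List<Integer> dedup(int[] values) {
        List<Integer> result = new ArrayList<>();
        int bitarray = 0;
        for (int v : values) {
            int x = 1 << v;                // e.g. v = 4 -> x = 0001 0000
            if ((bitarray & x) != 0) {
                continue;                  // bit already set: v is a duplicate
            }
            bitarray |= x;                 // OR the bit in to record v as seen
            result.add(v);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(dedup(new int[]{2, 3, 4, 4, 2}));  // prints [2, 3, 4]
    }
}
```

With the input 2, 3, 4, 4, the bit array is ...0001 1100 after the third element, exactly as described above, so the second 4 trips the & test.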

All you are doing in your strategy is using a bit array as a type of sparse array to hold the found values in the linked list. This violates the constraint that no extra space be used. What if your integers were large numbers, or negative? Your strategy quickly breaks down.

I think it can be done using the following approach, though I have not tried it.

```
/* removeNode() is assumed to unlink the given node and return nonzero. */
void deleteDuplicate(list *head)
{
    int isListChanged = 0, count = 0, i;
    list *temp, *temp1;
    do
    {
        count++;
        isListChanged = 0;
        temp = temp1 = head;
        do
        {
            for (i = 0; i < count && temp1 != NULL; i++)
            {
                temp1 = temp1->next;
            }
            if (temp1 != NULL && temp->data == temp1->data)
                isListChanged = removeNode(temp1);
            temp = temp->next;
        } while (temp1);
    } while (isListChanged);
}
```

Basically, the idea is to use two pointers: temp, which advances one node at a time, and temp1, which runs 2 ahead in the first pass, 3 in the second, 4 in the third, and so on, until a full pass removes no node from the list.

If all the values are integers, then it's possible to do it in O(n) time without using extra space.

Steps:

- Declare a variable say test.

- Start traversing the linked list and do a XOR of 'test' with the value in that node.

- If the result is '0', that means 'test' has been XOR'd with the same value before, so just delete that node (which takes O(1) time) and retain the original value of 'test'.

- Else update the value of 'test' with the new XOR'd value and retain the node.

- Go to the next node in the linked list and repeat the above steps till the end of the list.

In the end we are left with the linked list with non-repeated elements.

I hope that's helpful.

Nice approach. But won't the time complexity be O(n^2), since there will be two loops: one over each element, and another taking the XOR against all the other elements? Please correct me if I am mistaken.

Thanks Alex.

There will be only one loop traversing the list. While traversing the list, I am updating the value of variable 'test' (depending on the conditions mentioned) and using the same value as the current value for the next traversed nodes.

So I am traversing the entire list only once, which makes it O(n).

Suppose the elements are 5, 4, 5.

The first XOR gives 101, then 001 (101 ^ 100), then 100 (001 ^ 101) — and this is not zero, is it?

xor will only delete duplicates
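That failing trace is easy to verify; a minimal sketch (hypothetical class name) printing the running XOR accumulator for the sequence 5, 4, 5:

```java
public class XorTraceDemo {
    // Shows why a running XOR cannot detect duplicates in general:
    // after 5, 4, 5 the accumulator is 4, not 0, so the repeated 5 is missed.
    static int runningXor(int[] values) {
        int test = 0;
        for (int v : values) {
            test ^= v;
            System.out.println("after " + v + ": test = " + Integer.toBinaryString(test));
        }
        return test;
    }

    public static void main(String[] args) {
        runningXor(new int[]{5, 4, 5});  // 101, then 1, then 100
    }
}
```

The accumulator only returns to zero when every value seen so far appears an even number of times, so a nonzero result tells us nothing about any individual duplicate.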

This question has some ambiguity so the problem solver will have to clarify at least one fact.

The ambiguity is whether the idea of "repeated elements" means repeated adjacent to each other, or repeated anywhere in the list. The former case is solvable in O(n) but the latter case is essentially a sorting problem (due to the fact that building a hashtable or map is not allowed) so it probably can't be solved in less than O(n*log(n)).

The most straightforward solution is to simply use a nested loop: the outer loop points at each value in the linked list, and the inner loop points at each remaining value in the list. If a duplicate is found, you have to remove the duplicate node, which requires careful work with pointers. The downside is that the complexity is O(n^2). The following C++ code ran successfully on VS2012.

```
#include <iostream>   // cout
#include <Windows.h>  // Sleep()
using namespace std;

struct node
{
    int val;
    node *next;

    node(int new_val) : val(new_val), next(nullptr)
    {
    }

    ~node()
    {
        if (next) delete next;
    }

    // Unlink and delete the node after this one.
    void removeNext()
    {
        if (next)
        {
            if (next->next)
            {
                node *n = next;
                next = next->next;
                n->next = nullptr;
                delete n;
            }
            else
            {
                delete next;
                next = nullptr;
            }
        }
    }

    void printList()
    {
        int i = 0;
        node *n = this;
        while (n)
        {
            cout << i << ": " << n->val << endl;
            n = n->next;
            i++;
        }
    }
};

void removeDups(node *root)
{
    node *c = root;
    while (c)
    {
        node *x = c;
        while (x)
        {
            if (x->next && c->val == x->next->val)
            {
                x->removeNext();
            }
            else
            {
                x = x->next;
            }
        }
        c = c->next;
    }
}

int main(int argc, char **argv)
{
    node *root;
    node *cur_node;
    root = cur_node = new node(2);
    cur_node->next = new node(4);
    cur_node = cur_node->next;
    cur_node->next = new node(6);
    cur_node = cur_node->next;
    cur_node->next = new node(4);
    cur_node = cur_node->next;
    cur_node->next = new node(2);
    cur_node = cur_node->next;
    cur_node->next = new node(2);
    cout << "Original singly linked list:" << endl;
    root->printList();
    removeDups(root);
    cout << "After dups are removed:" << endl;
    root->printList();
    Sleep(30000);
    return 0;
}
```

I'm not sure, but this is what I came up with. Since you can't use any extra space, you use the space given to you. How? You divide the list into a processed part and a non-processed part. The processed part runs from head to tail; this sub-list is where we maintain uniqueness. The non-processed part runs from tail to the end of the list.

You start with tail at the beginning of the list. As you go through each node, you check whether the node is already in the sub-list [head, tail]. If not, you advance tail, since the node is unique, so that the sub-list now includes the current node. If it is already in there, you delete the current node and check the next one. Notice that we don't advance tail in that case, because the current node was a duplicate and we only want unique nodes in the sub-list [head, tail].

The time complexity is O(n). Space is O(1). Can anyone confirm if I did this right? Thoughts?

```
// Search if the object is already in the list of unique characters that's been searched
boolean isUniqueSearchSubList(Node head, Node tail, Node obj){
    Node current = head;
    while ( current != tail.next ){
        if ( current == obj ) // we found a duplicate!
            return false;
        current = current.next;
    }
    return true;
}

void removeDuplicate(){
    if ( head == null || head.next == null ) return; // check empty and 1 element list
    Node prev = head;
    Node current = head.next;
    Node tail = head; // the end of the list of unique characters we gone through
    while ( current != null ) {
        // check if the current node is already in the list of unique objects ( head to tail )
        if ( isUniqueSearchSubList( head, tail, current ) == true ){
            tail = tail.next; // unique so include it in the list by incrementing tail
            current = current.next;
        }else{
            // delete the current one
            prev.next = current.next;
            current = current.next;
        }
    }
}
```

This will not be O(n), because the sublist's size starts at 1, then 2, and grows linearly, and we traverse the sublist every time. The total work is 1 + 2 + 3 + ... + n = n*(n+1)/2, so it is O(n^2) according to me.

I found another error in this code:

The line "if ( current == obj )" is not correct: you are comparing node addresses, not the values. BTW, you did a great job. This code will work for all data types; all you need is to use templates.

The corrected code and results are attached:

```
// Search if the object is already in the list of unique characters that's been searched
bool LinkedList::isUniqueSearchSubList(Node *tail, Node *obj)
{
    Node *current = head;
    while ( current != tail->next )
    {
        // if ( current == obj ) // we found a duplicate!
        // you are comparing nodes not the actual value, the code
        // should be changed to
        if ( current->data == obj->data ) // we found a duplicate!
            return false;
        current = current->next;
    }
    return true;
}

void LinkedList::RemoveDuplicate()
{
    if ( head == NULL || head->next == NULL ) return; // check empty and 1 element list
    Node *prev = head;
    Node *current = head->next;
    Node *tail = head; // the end of the list of unique characters we gone through
    while ( current != NULL ) {
        // check if the current node is already in the list of unique objects ( head to tail )
        if ( isUniqueSearchSubList(tail, current ) == true )
        {
            tail = tail->next; // unique so include it in the list by incrementing tail
            prev = current;    // added this line
            current = current->next;
        } else {
            // delete the current one
            prev->next = current->next;
            current = current->next;
        }
    }
}
```

Result:

Before: 1 3 7 5 8 8 8 2 3 4 5 5 6 7 7 7 9 8 9 3 2 10 1

After: 1 3 7 5 8 2 4 6 9 10

```
// you are missing line prev = current;
if ( isUniqueSearchSubList( head, tail, current ) == true ){
    tail = tail.next; // unique so include it in the list by incrementing tail
    prev = current;   // it will work if you add this line
    current = current.next;
    .........
```

Simplest possible code and approach:

```
while (current != NULL)
{
    prev = current;
    runner = current->next;   /* use a separate runner pointer instead of clobbering head */
    while (runner != NULL)
    {
        if (current->info != runner->info)
        {
            prev = runner;        /* keep prev one step behind runner */
            runner = runner->next;
        }
        else
        {
            prev->next = runner->next;
            free(runner);
            runner = prev->next;
        }
    }
    current = current->next;
}
```

Here's one approach which although not generic, but can accurately delete repetitive nodes with time complexity O(n) in a single linked list having maximum 4k nodes (4096 nodes.)

```
/*
 * Note: each value is encoded as a (row bit, column bit) pair, so two distinct
 * values that happen to share a row bit and a column bit with earlier values
 * can be falsely reported as duplicates.
 */
void deleteRepeatingNodes(node_t *head)
{
    long int row = 0, col = 0;
    char rshift = 0, cshift = 0;
    node_t *prev = NULL;
    if (head == NULL)
        return;
    while (head) {
        rshift = (head->data / 65) ? (head->data / 64 - 1) : 0;
        cshift = (head->data % 65) ? (head->data % 64 - 1) : 0;
        if ((row & (1L << rshift)) && (col & (1L << cshift))) {
            prev->next = head->next;
            free(head);
            head = prev->next;
        } else {
            row |= 1L << rshift;
            col |= 1L << cshift;
            prev = head;
            head = head->next;
        }
    }
}
```

Akshay is right in principle. However, the limit is not on the number of nodes; it is on the data. The idea is to set a bit flag for already-seen values. The maximum we can do in C++ with this scheme is 64 bits × 64 bits (a 2-D matrix of unsigned long long), which is 4096. Here is how I did it:

```
void LinkedList::RemoveDuplicates()
{
    class IntMap
    {
    private:
        unsigned long long m_Rows;
        unsigned long long m_Cols;
    public:
        IntMap()
        {
            this->m_Rows = 0;
            this->m_Cols = 0;
        }
        bool Contains(int data)
        {
            int quotient = (int)floor(data / 63.0F);
            int remainder = data % 63;
            // use 1ULL so the shift is performed in 64 bits
            unsigned long long rowMask = 1ULL << quotient;
            unsigned long long colMask = 1ULL << remainder;
            return (this->m_Rows & rowMask) == rowMask &&
                   (this->m_Cols & colMask) == colMask;
        }
        void Set(int data)
        {
            int quotient = (int)floor(data / 63.0F);
            int remainder = data % 63;
            unsigned long long rowMask = 1ULL << quotient;
            unsigned long long colMask = 1ULL << remainder;
            this->m_Rows |= rowMask;
            this->m_Cols |= colMask;
        }
    } intMap;

    Node* current = this->m_Start;
    Node* previous = NULL;
    while(NULL != current)
    {
        if(!intMap.Contains(current->m_Data))
        {
            intMap.Set(current->m_Data);
            previous = current;
            current = current->m_Next;
        }
        else
        {
            current = this->DeleteNode(current, previous);
        }
    }
}

LinkedList::Node* LinkedList::DeleteNode(LinkedList::Node* node, LinkedList::Node* previous)
{
    if(NULL == node)
    {
        return NULL;
    }
    if(this->m_Start == node)
    {
        this->m_Start = node->m_Next;
    }
    else if(NULL != previous)
    {
        previous->m_Next = node->m_Next;
    }
    Node* next = node->m_Next;
    --this->m_Count;
    delete node;
    return next;
}
```

Please note that we can extend the data limit by adding another dimension, e.g. 64×64×64, which extends the limit to 262144; further dimensions can be added if needed.

```
public static List<Integer> findDuplicatesLinkedList(LinkedList<Integer> sequence){
    int bitarray = 0;   // a single int as the bit array, so this only works for values 0..31
    Iterator<Integer> iterator = sequence.iterator();
    while(iterator.hasNext()){
        Integer num = iterator.next();
        int x = 1 << num;
        if((bitarray & x) != 0){
            System.out.println("Duplicate found in given linked list: " + num);
            iterator.remove();
        } else {
            bitarray = bitarray | x;
        }
    }
    System.out.println(sequence);
    return null;
}
```

Solved in O(n) time with O(1) memory, using a bit vector: set the bit at the position given by the int value to 1. However, this only works for integer values from 0 to 31.

```
public void removeDuplicates(MyNode<Integer> current)
{
    if(current == null)
    {
        return;
    }
    MyNode<Integer> temp = current;
    int buffer = 0;
    buffer |= 1 << temp.data;
    while(temp.next != null)
    {
        // test the single bit for temp.next.data, not the whole shifted value
        if(((buffer >> temp.next.data) & 1) == 1)
        {
            MyNode<Integer> temp1 = temp.next;
            temp.next = temp.next.next;
            temp1.next = null;
            continue;   // re-check the new temp.next before advancing
        }
        else
        {
            buffer |= 1 << temp.next.data;
        }
        temp = temp.next;
    }
}
```


This is O(n): I am traversing the list only once, with no extra space. The problem is that this code only works for numbers less than 32; with a few sensible modifications it can be made to work for greater numbers. I might be wrong about O(n), but it definitely traverses the list only once. Positive and worthwhile suggestions are welcome :)

```
/* Program to remove the duplicate elements from the linked list */
#include <stdio.h>
#include <stdlib.h>

struct node {
    int no;
    struct node *next;
};

typedef struct node *Nodep;

Nodep makeLinkedList(int);
void printList(Nodep);

int main() {
    int num, t = 0;
    printf("How many nodes you want in the linked list??\n");
    scanf("%d", &num);
    Nodep head = makeLinkedList(num);
    Nodep temp = head;
    Nodep p = head->next;
    t = t | 1 << (head->no - 1);
    while (p != NULL) {
        if (t & 1 << (p->no - 1)) {
            p = p->next;
            if (p == NULL) {
                temp->next = NULL;
                break;
            }
        }
        else if (temp->next == p) {
            temp = temp->next;
            t = t | 1 << (p->no - 1);
            p = p->next;
        }
        else {
            temp->next = p;
            t = t | 1 << (p->no - 1);
            if (p->next == NULL) {
                temp->next = p;
            }
            temp = p;
            p = p->next;
        }
    }
    printf("\nThe linked list after removing duplicate nodes is:\n");
    printList(head);
    return 0;
}

Nodep makeLinkedList(int n) {
    int exec = 1, i;
    Nodep start, p;
    printf("Enter the numbers:\n");
    for (i = 0; i < n; ++i) {
        if (exec) {
            p = (Nodep)malloc(sizeof(struct node));
            start = p;
            p->next = NULL;
            scanf("%d", &p->no);
            exec = 0;
        }
        else {
            Nodep q = (Nodep)malloc(sizeof(struct node));
            q->next = NULL;
            scanf("%d", &q->no);
            p->next = q;
            p = p->next;
        }
    }
    return start;
}

void printList(Nodep head) {
    while (head->next != NULL) {
        printf("%d ", head->no);
        head = head->next;
    }
    printf("%d", head->no);
}
```

```
public static Node removeDuplicatesFromUnsortedLinkedListWithoutExtraSpace(Node head){
    if(head == null || head.next == null)
        return head;
    Node prev = head;
    Node curr = head.next;
    Node temp = prev;
    while(prev.next != null){
        if(curr != null){
            if(prev.data == curr.data){
                temp.next = curr.next;
                curr = curr.next;
            } else {
                temp = curr;
                curr = curr.next;
            }
        } else {
            prev = prev.next;
            curr = prev.next;
            temp = prev;
        }
    }
    return head;
}
```

1. Keep three pointers besides head — first, temp, and last — pointing to a temporary first node, the scanning node, and the last node.

2. Initialize temp to head and first to head.

3. Set last->next = temp.

4. Run from temp to last, deleting all nodes that have the same value as the temp node. Once temp reaches last, move first forward and set temp to first.

5. Move the temp node forward.

6. Readjust the last node to point to the temp node.

7. Repeat from step 4 until first equals last.

I think this should achieve the required deletion in amortized O(n). Please feel free to poke holes in this algorithm.

In Python:

```
import json

x_inp = input("Please enter the list")
x = json.loads(x_inp)

def repeating_deleted(x):
    out = []
    for i in range(0, len(x)):
        for j in range(i + 1, len(x)):
            if x[i] == x[j]:
                out.append(j)
    result = []
    for i in range(0, len(x)):
        if i not in out:
            result.append(x[i])
    return result

repeating_deleted(x)
```

Take an array and initialize every element to "unvisited". While traversing the list, when a number is unvisited, mark that index (using the number as the index) as visited; when we find an index already visited, delete that node.
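A minimal sketch of that idea (the node class and names are illustrative), assuming every value is a valid index below some known bound, which costs O(maxValue) extra space:

```java
public class VisitedArrayDedup {
    static class Node {
        int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    // Marks each value's index as visited; unlinks nodes whose value was seen before.
    static void removeDuplicates(Node head, int maxValue) {
        boolean[] visited = new boolean[maxValue + 1];
        Node prev = null;
        for (Node cur = head; cur != null; cur = cur.next) {
            if (visited[cur.value]) {
                prev.next = cur.next;    // duplicate: unlink the current node
            } else {
                visited[cur.value] = true;
                prev = cur;              // first occurrence: keep it
            }
        }
    }

    public static void main(String[] args) {
        Node head = new Node(1);
        head.next = new Node(4);
        head.next.next = new Node(1);
        removeDuplicates(head, 10);
        for (Node n = head; n != null; n = n.next) {
            System.out.print(n.value + " ");   // prints: 1 4
        }
    }
}
```

Note that the very first node can never be a duplicate, so `prev` is always non-null by the time the unlink branch runs.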

Take a dictionary; the keys in the dictionary will be the values in the linked list.

Start traversing the linked list, adding each node's value as a key in the dictionary with value 1. For every subsequent element, first compare it against the existing keys: if the key exists, increment its value by 1 and delete that node from the list; if the key does not exist, add it to the dictionary.

e.g. list elements: 1, 4, 6, 9, 1

1 -> add to the dictionary with key 1 and value 1.

Move forward in the list and pick up the next element, which is 4; search the dictionary keys to see whether 4 already exists, and if not, add it.

4 -> add to the dictionary with key 4 and value 1.

6 -> add to the dictionary with key 6 and value 1.

9 -> add to the dictionary with key 9 and value 1.

1 -> key exists, so increment its value by 1 and delete the node.

C#:

```
// Removing duplicates from the list
public void RemoveDuplicate()
{
    Dictionary<T, int> theDictionary = new Dictionary<T, int>();
    Node<T> cur = head;
    theDictionary.Add(head.Data, 1);
    while (cur.Next != null)
    {
        if (!theDictionary.ContainsKey(cur.Next.Data))
        {
            theDictionary.Add(cur.Next.Data, 1);
            cur = cur.Next;
        }
        else
        {
            cur.Next = cur.Next.Next;
        }
    }
}
```

I was wondering if an O(n) algorithm is possible.

- Ehsan, August 24, 2013

Assume the input is a sequence of length n with numbers chosen from the set {1, 2, 3, ..., n}.

Let's assume there is an algorithm which goes through f(n) elements of the input and has S states. From state Si we observe the next input value and, based on it, move to another state Sj. After f(n) observations the algorithm can have followed at most S^f(n) distinct computation paths. If at some point we conclude that the remaining sequence is acceptable, the algorithm terminates.

There are n^n sequences in total, and we must be able to differentiate between any two of them, so S^f(n) >= n^n. If f(n) = O(n), then S = n^Omega(1), and therefore log2(S) = Omega(log(n)). If S = O(1), then f(n) = Omega(n log(n)).

I am not sure if my argument is correct, but my conclusion is that we either need Omega(log(n)) bits of memory, or Omega(n log(n)) iterations.
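The counting step above can be written out explicitly by taking logarithms of the inequality:

```
S^{f(n)} \ge n^n
\;\Longrightarrow\; f(n)\,\log_2 S \ge n \log_2 n
\;\Longrightarrow\; f(n) \ge \frac{n \log_2 n}{\log_2 S}
```

So with S = O(1) states (constant memory) this forces f(n) = Omega(n log n) observations, while f(n) = O(n) observations force log2(S) = Omega(log n) bits of state — the same trade-off stated above.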