Amazon Interview Question
Web DevelopersQuite a strange restriction. Hashtable allowed but not additional space. What does that mean?
A hash table solution (C# like pseudo-code):
size_t UniqueIndex(int [] arr) {
int uniqueIndex = -1;
HashTable<int, bool> dupes = new HashTable<int, bool>();
foreach(int idx in arr) {
dupes[idx] = true;
}
int count = 0;
foreach(int idx in arr) {
if (dupes.KeyExists(idx)) {
count++;
continue;
}
uniqueIndex = count;
break;
}
return uniqueIndex;
}
A direct method of about N*N/4 algorithm:
(
if (N== 1) return 0;
int No1 = a[0];
for(int i=0;i<N;i++) {
if ( No1 XOR a[j] != 0 )//not repeated
continue;
else
break;
}
if (No1 XOR a[N] != 0)
return 0; // a[0] is the first non-repeated no.
for (int i=1; i<N; i++)
{
int testNo = a[i];
if (i == N)
{
if (testNo == No1)
return -1; // no non-repeat index
else
return N; // last index
)
if (testNo == No1) continue;// skip it
int j= i+1;
do
{
if (a[j] == No1 ) continue;
if (testNo XOR a[j] != 0 )
{
if (j== N) // last guy
return i;
j++;
}
else // a[j] repeats testNo
{
a[j] = No1; // don't deal with a[j] again
break;
}
} while (j<= N);
)
You may be right. XOR is not needed.
HashTable solution:
Put the numbers in the array in 100 buckets according to their last two digits.
Exam the number of members in each bucket, if some bucket has only one member, it is the one and record its index. If the buckets has more than one, apply the above direct method to find the minimum index of non-repeated number. Finally, compare all indice and find the smallest index.
But in worst case, it is not better than the direct method.
Yes, in worst case hashtable is not better than direct method, but using a hashtable is a much better option that brute force quadratic.
If you are worried about worst case, we can do O(nlogn) with a balanced tree.
But there is a comment saying no extra space, so neither hashtable nor tree will work with that restriction.
Forget the no extra spaces requirement, it does not make sense for Hash Table.
The other way is doing a O{N*lgN) search{with indice) and then search the non-repetitve indice and find the minimum. It does require up to N extra spaces. I am not sure your balance tree solution. The direct method only needs space.
A C# like balanced Tree Solution:
int UniqueIndex (int [] array) {
// A balanced Tree for which any Insert is O(logn).
//(n = # of elements already there in the tree)
BalancedTree<int> tree = new BalancedTree<int>();
foreach (int arrElement in array) {
int count = 1;
// This is O(log n)
if (tree.TryGetValue(arrElement, out count)) {
count++;
}
// Assume overwrites existing value.
// and count is untouched if TryGetValue returns false.
// This is O(log n)
tree.Insert(arrElement, count);
} // This for loop is O (n log n)
int idx = 0;
foreach (int arrElement in array) {
int count = 0;
// This is O(nlogn)
if (tree.TryGetValue(arrElement, out count)) {
if (count == 1) {
// Found it!
return idx;
}
} else { // Value not in tree. Very strange.
throw new IDontKnowWhatToDoException("Help!"); // Something seriously wrong.
}
} // In worst case this for loop is O(n log n)
// All elements repeated.
return -1;
} // This is O(n logn) in worst case.
Hash Table solution - O(N)
The key in the hashmap is value at array[i] and the value in hashmap is a linked list. A node is added if the size is 0 or 1. Node holds the index i. Since we only care about duplicate once the size of the associated linked list with a key > 1 we can ignore adding to the link list when collision occurs. Thus the max size of linked list if 2.
After complete iteration of the array. Scan the Hashmap, collect all values where associated linked list is of size one. Return the minimum from that. Worst case - when no element is duplicate is also O(N) i.e to find the minimum.
To Anonymous who claims Hash Table is O(N).
Why can't you have a collision between two _different_ keys? That is what the whole collision issue is about. So worst case, it could still be O(N^2).
Hi, Subha,
The point is not O(N), it is the maxvalue and the bucket array.
Say { 1, 10, 1, 100, 10, 1000, 100, 10000, 1000, 100000 )
Find Maximum X: N;
Initial bucket: X (100000 in this case!)
Scan number in array[i]=a, and set flag[bucket[a]]: 2N
"MAX_VAL some constant which is not in the arr", how you find it!!!
I think you see the problem.
For some reason, I am not able to reply to a specific comment.
To Anonymous who responded to my post.
Of course the actual integer value is the key. But what does that hash to? The collisions occur _after_ you hash the keys. The linked list is for each _hash value_ not each key.
Think about hashing and collision independent of this problem. In usual implementations, you do the linked list chaining for keys which map to the same hash. So you could have different keys with data in the _same linked list_.
So for this problem, in worst case hashing is not O(N).
Assuming that the array elements are reasonably small enough, and the interviewer allows for use of hashtables, then hash table offers the best solution. go thru the array each time pickin an element and inserting it into the hash table as the key with a value of 1. The first time an attempt to insert an element finds a key == to the element, then that is the 1st repeating element. simply return the index of that in the array. this should be an O(n) efficiency.
Talking limited array and BIG O is like talking a shell and the sea. O(N) means there is a constant k, such that the algorithm is good(k*N) for ANY N!
You are changing the original problem, Mr. T. Why not just conside binary numbers:
{0, 1, 0, 0, 0, ..., 1} , then you don't even need any space.
int nri(int *a,int n)
{
int *b,int flag,j,i=0;
while(i!=n)
{
j=i;
j++;
b[j]=a[i];
if(a[i]!='#')
{
flag=0;
while(j!=n)
{
if(a[i]==b[j])
{
b[j]='#';
flag=1;
}
j++;
}
if(flag==0)
return i;
}
i++;
}
}
first one is something wrong correct solution is
#include<stdio.h>
int nri(int *a,int n)
{
int flag,j,i=0;
while(i!=n)
{
j=i+1;
flag=0;
if(a[i]!=0)
{
while(j!=n)
{
if(a[i]==a[j])
{
a[j]=0;
flag=1;
}
j++;
}
if(flag==0)
return i;
}
i++;
}
}
main()
{
int n[]={5,1,1,4,5};
printf("first nonrepeated index is %d\n",nri(n,5));
return 0;
}
The string problem is an exception.
The idea of Hash Table itself is use limited number of buckets to deal with unlimited possibilities and reduce the amount of computations.
In this case, using last two digits requires only 100 buckets. If the numbers are fairly distributed, another word, if it does not contain just a few repeat numbers, or all numbers has the same last two digits, then we have a O(N) solution.
You never want to pick the value itself as key and create huge buckets.
Now you are changing the assumptions of problem. LOL.
I think you have a reading comprehension problem, probably English is not your first language. Try to first understand what others are saying before running off to write one.
I won't respond to you further on this topic.
rajnesh, who did you direct this comment ("i think u should run it and then say") at?
btw, your code is O(n^2). Try aiming for O(n).
In fact, I say, you run your code yourself and see, when n is say 10 million, or 1 billion. See how long it takes.
In designing algorithms, correctness is not the only important factor to consider. Running time is crucial too, and frequently, space usage becomes important too.
Good English!
Who(m) did you direct this comment at(to)?
Your code(algorithm) is O(n^2).
...
Not sure what all the excitement is about. But, a linear algorithm is in order:
for each item in list
if table.contains(integer)
table.store(integer, -1)
else
table.store(integer, index)
set max = Integer.max_value
for each (integer, index) in table
if index > -1 and index < max
set max = index
return max
this is a O(2n)
HTF is this linear? Can you guarantee that table.contains() will run in O(1) time? You may be able to, if you have a perfect hash function! In which case, you don't need to do this elaborate procedure. You may as well use perfect hashing to hash the items in the list to the hash table and then check the table to find the first entry that doesn't have multiple values chained to it. (This works as perfect hashing would guarantee that for every x != y, x and y hash to different locations in the hash table, else they hash to the same location)
Start storing the elements in a hashmap from the end of the array. If the element already exists in HashMap then remove it from hashmap. Now again traverse the array from beginning. Output the first element you come across which is also present in the hashmap
this is also O(n)
Your solution will not work if the array contains the same element for odd no. of times :)
Consider following soln:
private static Object getUniqueElement(int[] arr) {
Map <Integer , Integer> inMap = new HashMap <Integer , Integer>();
//add all the elements to the hashmap
for(int input : arr){
Integer localVal = inMap.get(input) ;
if(localVal != null){
localVal++;
inMap.put(input, localVal);
}
else
inMap.put(input, 0);
}
for(int result : arr){
if(inMap.get(result) == 0)
return result;
}
return null;
}
this would work....
#include<iostream>
#include<map>
#include<vector>
using namespace std;
void insertmap(map<int,int> &moh,int n)
{
pair<map<int,int>:: iterator ,bool> ret;
ret=moh.insert(pair<int ,int> (n,1));//this will check if n is already in map or not.
//if it returns true no such key =n is present so inserted
if(!ret.second )
{
moh[n]++;
}
}
int main()
{
int n;
cout<<"no of elements in list";
cin>>n;
vector< int > num(n);
map<int,int> mohan;
for(int i=0;i<n;i++)
{
cin>>num[i];
insertmap(mohan,num[i]);
}
map<int,int>::iterator trav;
// starting from first insert
//so will give first element tat occurs only once
for(int i =0;i<n;i++)
if (mohan[num[i]] == 1)
{
cout<<"first element" << num[i];
break;
}
return 0;
}
why can't we approach using the XOR way. The pseudo-python code can be:
seen = set([])
temp = 0
for i in range(len(array)):
if array[i] not found in seen:
temp = array[i] XOR array[i+1]
seen.add(array[i])
if temp not 0:
print array[i]
here seen ensures that we're not goind to do XOR of same numbers more than once, the moment we've non-zero temp means we've one number that is different and that is the answer.
/**
* This function will return the first non-repeated number.
*
* Best case: O(n) and worst case: O(n2)
**/
int check(int[] arr){
int length =arr.length;
for(int i=0; i<length; i++){
for(int j=i+1; j<length; j++){
if(arr[i] == arr[j]){
continue;
}else if(length==j){
return arr[j];
}
}
}
}
import java.util.HashMap;
import java.util.Map;
public class FirstNonRept {
public static void main(String[] args) {
int arr[]=new int[10];
arr[0]=4;
arr[1]=2;
arr[2]=3;
arr[3]=4;
arr[4]=2;
arr[5]=10;
arr[6]=11;
arr[7]=3;
arr[8]=4;
arr[9]=0;
HashMap<Integer, Integer> hm =new HashMap<Integer,Integer>();
for(int i=0;i<arr.length;i++){
if(hm.containsKey(arr[i])){
hm.remove(arr[i]);
}
else{
hm.put(arr[i],i);
}
}
int min=Integer.MAX_VALUE;
for(Map.Entry<Integer, Integer> entry : hm.entrySet()){
System.out.println(" key "+entry.getKey()+" "+entry.getValue());
if(min>entry.getValue()){
min=entry.getValue();
}
}
System.out.println("Minimum index of the value is :"+min+" value of variable :"+arr[min]);
}
}
Here is my solution..
Its in O(n) for any situation according to me.
Feel free to comment if any
Output :
key 0 9
key 4 8
key 10 5
key 11 6
Minimum index of the value is :5 value of variable :10
public static void main(String[] args) {
Set<Integer> set = new TreeSet<Integer>();
List<Integer> l = new ArrayList<Integer>();
l.add(1);
l.add(-20);
l.add(1);
l.add(29);
l.add(9);
l.add(100);
l.add(29);
set.addAll(l);
int k = 0;
for(Integer i : set ){
if(l.indexOf(i) == l.lastIndexOf(i)) {
k = i;
break;
}
}
System.out.println("index of the first non-repetitive element in the array is "+l.indexOf(k)+" ; the element is "+k);
}
int arr[N];
map<int ,int > unique_char;
for(int i=0;i<N;i++)
{
cin>>arr[i];
unique_char[arr[i]]++;
}
map<int,int> ::iterator it;
for(it=unique_char.begin();it!=unique_char.end();it++)
cout<<(*it).first<<" : "<<(*it).second<<endl;
for(int i=0;i<N;i++)
{
if(unmap[arr[i]]==1)
{
cout<<"This is the digit "<<arr[i]<<"--"<<i<<endl;
break;
}
}
if space is not a constraint then we can solve it in O(n).
- Subha March 10, 2009a[N] input Array (assumption array has only positive numbers)
and say X is the Max value that we have in the input Array
flagArr[N] = {initialized to ZERO}
bucketArr[X+1] = { initialized to MAX_VAl } // MAX_VAL some constant which is not in the array
for(i=0;i<N;i++) {
if(bucketArr[a[i]] == MAX_VAL) {
bucketArr[a[i]] = i;
}
else {
flagArr[bucketArr[a[i]]] = 1; // 1 means repetition
}
}
for(i=0;i<N;i++) {
if(flagArr[i] == 0 ) {
return i;
}
}
I have assumed that the array has only positive numbers, if negative numbers are allowed then we can add min value to each number and follow the same procedure.
any bug found ? let me know