Google Interview Question
Software Engineer / DevelopersI am afraid this is not the expected answer. Without destroying the original values and spanning negative values also, a solution would be like:
for i from 0 to n
while ( a[i + 1] != i + 1)
if ( a[ i + 1] == a[ a[ i + 1] ])
return a[ i + 1] //bingo
swap a[ i + 1], a[ a[ i + 1] ]
return null //all unique aray
Idea is just to keep swapping until the slot finds its reserved value, however do not forget to check equalities in between.
Array might contain negative values, all we need is to adjust loop offset against that.
It is O(n) since each value is visited only once (worst case twice for an all-unique array).
Since we are merely negating some elements, we can always save the duplicate once we find it, then restore the array by setting each element to its absolute value, and finally returning the duplicate. Hopefully no one will argue that we can't save the answer because we aren't even allowed to use O(1) space.
Very clever solution, however strictly speaking it's still not using O(1) space, since changing the sign is not a 0-memory operation, it involves using 1 bit of presumed "unused" memory for each array element, and therefore it's using exactly n bits of unused space, so it's O(n) (to put it in programmatic terms this wouldn't work if n=256 and you're given an array of unsigned bytes. Or if n=MAXSHORT and the array is of unsigned short)
@random visitor
First, the memory is already allocated, so it's not using any extra space, the space is there...being wasted.
Second, no need to worry about the unsigned cases...read the original question carefully...
"An array of size n+1 has integers only from 1 to n"
integers
Range from 1 to n
Maybe I'm assuming too much, but that says to me the storage is a signed type, and the sign-bit will be unused since the range is 1 to n.
@yemre_ankara
Consider array [2 6 2 6 5 4]. Your method fails to find the first one which is 2 for the array but it returns 6 as answer.
2 6 2 6 5 4
6 2 2 6 5 4
4 2 2 6 5 6
6 2 2 4 5 6
return 6
I fail to see how daywalker's solution works. Given the high number of votes and praises, however, I must assume that I am mistaking something.
Consider the (zero-based-)array [1,2,3,3] of size 3+1 = 4. On the algorithm's first iteration, a[0] is checked and found positive (it is 1). So the value -1 is stored under a[a[0]] = a[1] which is where the value 2 is located. The updated array looks like this: [1,-1,3,3].
On the second iteration, a[1] is found negative (it is -1) and therefore -1 reported as a duplicate (which it is not).
This seems wrong to me. What am I missing?
keep swap the a[0] and a[a[0]] if a[0] != a[a[0]]. Then eventually there will be a[0] == a[a[0]], this is duplicate. Note that there is a way to swap without any space.
public int firstDuplicateNumberInArray(int[] array) {
int value = array[0];
while(value != array[value]) {
swap(array, 0, value); // swap
value = array[0];
}
return value;
}
#include<iostream>
using namespace std;
void find_dup(int *,int);
int main()
{
int test_arr[]={7,7,5,9,8,8,5,1,10,1,2};
find_dup(test_arr,10);
return 0;
}
void find_dup(int *arr,int n)
{
arr[arr[0]]+=n;
for(arr[0]=1;arr[0]<n+1;++arr[0])
{
if(arr[( (arr[arr[0]]%n) ? (arr[arr[0]]%n) : n)]>n)
{
cout<<"First duplicate at zero-indexed position "<<arr[0]<<endl;
cout<<"Duplicated value is "<<( (arr[arr[0]]%n) ? (arr[arr[0]]%n) : n)<<endl;
break;
}
else
{
arr[( (arr[arr[0]]%n) ? (arr[arr[0]]%n) : n)]+=n;
}
}
}
Side effect that the first value in the array can't be recovered. Let me think about that problem for a little while, I think I have a solution to that problem as well, will post after I've tested.
Yeah, the below will preserve the ability to get the original values back, if needed.
#include<iostream>
using namespace std;
void find_dup(int *,int);
int main()
{
int test_arr[]={7,7,5,9,8,8,5,1,10,1,2};
find_dup(test_arr,10);
return 0;
}
void find_dup(int *arr,int n)
{
arr[arr[0]]+=n;
for(arr[0]+=n;arr[0] < n*(n+1);arr[0]+=n)
{
if(arr[( (arr[arr[0]/n]%n) ? (arr[arr[0]/n]%n) : n)]>n)
{
cout<<"First duplicate at zero-indexed position "<<arr[0]/n<<endl;
cout<<"Duplicated value is "<<( (arr[arr[0]/n]%n) ? (arr[arr[0]/n]%n) : n)<<endl;
break;
}
else
{
arr[( (arr[arr[0]/n]%n) ? (arr[arr[0]/n]%n) : n)]+=n;
}
}
}
Actually missed a corner case when the first element of the array was n.
Fixed below:
#include<iostream>
using namespace std;
void find_dup(int *,int);
int main()
{
int test_arr[]={7,7,6,3,1,2,4,1};
find_dup(test_arr,7);
return 0;
}
void find_dup(int *arr,int n)
{
arr[arr[0]]+=n;
for(arr[0]+=((arr[0]/n) ? 0 : n);arr[0] < n*(n+1);arr[0]+=n)
{
if(arr[( (arr[arr[0]/n]%n) ? (arr[arr[0]/n]%n) : n)]>n)
{
cout<<"First duplicate at zero-indexed position "<<arr[0]/n<<endl;
cout<<"Duplicated value is "
<<( (arr[arr[0]/n]%n) ? (arr[arr[0]/n]%n) : n)<<endl;
break;
}
else
{
arr[( (arr[arr[0]/n]%n) ? (arr[arr[0]/n]%n) : n)]+=n;
}
}
}
Appreciate ur great effort. But, I'd more appreciate if you first give a glimpse of the underlying algorithm of your solution. I know it's harder to explain sometimes than to provide code. But, to my view, in the former approach the job seekers on this site would be much more be benefited. Thanks.
I second the opinion. It is very hard to deduce the logic from the code without spending considerable amount of time. Also
Sorry, I'll explain my efforts.
You have to notice a few concepts to understand my line of thinking.
1) The first item in the array can't be the first duplicate, you have to have at least a second item.
2) In c/c++(and many other languages you have 0 based indexing), this means that in a n+1 sized array you have indicies that range from 0 to n.
3) If the numbers are in the range 1 to n, you can create a modular group, ie let's say n=4, 1 mod 4 = 1, 2 mod 4 = 2, 3 mod 4 = 3, 4 mod 4 = 0, adding n to any number doesn't change where it is in the modular group, ie. 3+4=7 -> 7 mod 4 = 3
So I use the zero index item as a counter to go through the array. Adding n each time keeps the value the same in the modular group, but allows me to get an index value by then using integer division by n.
I use the other indicies as counters. Any time I see a 1 in the list I increment my 1's counter, in order to do that I add n to the value in the arr[1] position, that keeps the value at that position the same in the modular group, but also encodes that I've seen a 1.
As I iterate over the array, when I go to increment the counter if I see a value greater than n, I know I've already incremented that counter at least once. So I know I'm currently sitting at a duplicate. Since I break out of the loop at the first duplicate, I only find the first one.
Hopefully that was clear, I think I might be able to explain better with graphics. I will try to draw something up, and post a link.
I've had some time to put together a powerpoint explaining my thought process, hopefully better than my above post, you can download it here:
sites.google.com/site/kmddevelopment/Home/GoogleInterviewQuestion.ppt?attredirects=0&d=1
Division by n+1 gets rid of the conditional operator "?:"
void findDuplicate(int* arr, const int n) {
arr[arr[0]] += n + 1;
for (arr[0] = 1; arr[0] <= n; arr[0]++) {
if (arr[arr[arr[0]] % (n + 1)] > n) {
printf("position: %d, value: %d\n", arr[0], arr[arr[0]] % (n + 1));
return;
}
arr[arr[arr[0]] % (n + 1)] += n + 1;
}
}
The idea is interesting. It is the same as using a bit vector except that the bit vector is "stolen" from the original array.
As a result, tt doesn't work for all N. Does not work when N > 2^31 (on 32-bit machines with integer size of 32 bits). In such cases adding N will cause an overflow.
@Freddy: Good point, I can't believe I missed such a simple tweak.
@Anonymous: Agreed, while it's not a cure all, you could change the operation to subtraction instead of addition. Given all numbers will be in the range 1..n, and you're really only worried about changing the flags once, after a single subtraction you have numbers in the range (1-n)..0 or -n..-1 if you use the n+1 change.
Of course there's problems with that too, it can only be used on signed types, and it really doesn't fix the counter issue for iterating over the array.
if i am not wrong ur solution not working for this test case {1,1,1,1,2,2,3,0,0,0,1,1,4}; assume you have given number from 0 to n ..& why should ur solution works as thats finding only 1st repeating element but what if need to find the all repeating element in given constraints then have a look at WgpShashank's solution @bottom :)
Do notify me if i missed something :)
/**
* The basic idea is, the first time you see a number v=A[i], you mark A[v]
* as A[v]+i*N^2. Later on if you see v again, you mark A[v] as A[v]+N. for
* example, the third time you see v, the value in A[v] will be like
* orginalA[v]+i*N^2+j*N where i is where you see v first time and j is
* later on how many times you see it again.
*
* After one scanning of the array doing above, we start to look for the
* first duplicated number simply by decoding the wired format to get i and
* j
*
* @param A
* The array, size N+1, containing number from 1 to N
* @param N
* The N
* @return The first duplicate number in that array
*/
public static int FindDuplicate(int[] A, int N) {
int CurrentDupPositionSoFar = N;
int DupFoundSoFar = 0;
for (int i = 0; i <= N; i++) {
int value = A[i]; // declare value here for better code readability
// If value less than N^2, this is the first time i appears in the
// array; otherwise, i appeared before and a[i] has been transformed.
// so we need to restore it before using it as array index
if (value > N * N) {
value = value % (N * N);
// Now value is original+j*N
// If value greater than N, means i appears multiple times
// before, restore again
if (value > N) {
value = value % N;
// Now value is the original value in A[i]
}
}
if (A[value] < N * N) {
A[value] += N * N * (i + 1);
} else {
A[value] += N;
}
}
for (int i = 0; i <= N; i++) {
// Where does this number appear first
int PositionOfFirstAppearance = A[i] / (N * N);
// How many times it duplicates
int TimesBeenDuplicated = (A[i] % (N * N))/N;
if (PositionOfFirstAppearance > 0) {
if (TimesBeenDuplicated > 0) {
if (CurrentDupPositionSoFar > PositionOfFirstAppearance ) {
CurrentDupPositionSoFar = PositionOfFirstAppearance ;
DupFoundSoFar = i;
}
}
}
}
return DupFoundSoFar;
}
}
<pre lang="" line="1" title="CodeMonkey46833" class="run-this">/**
* The basic idea is, the first time you see a number v=A[i], you mark A[v]
* as A[v]+i*N^2. Later on if you see v again, you mark A[v] as A[v]+N. for
* example, the third time you see v, the value in A[v] will be like
* orginalA[v]+i*N^2+j*N where i is where you see v first time and j is
* later on how many times you see it again.
*
* After one scanning of the array doing above, we start to look for the
* first duplicated number simply by decoding the wired format to get i and
* j
*
* @param A
* The array, size N+1, containing number from 1 to N
* @param N
* The N
* @return The first duplicate number in that array
*/
public static int FindDuplicate(int[] A, int N) {
int CurrentDupPositionSoFar = N;
int DupFoundSoFar = 0;
for (int i = 0; i <= N; i++) {
int value = A[i]; // declare value here for better code readability
// If value less than N^2, this is the first time i appears in the
// array; otherwise, i appeared before and a[i] has been transformed.
// so we need to restore it before using it as array index
if (value > N * N) {
value = value % (N * N);
// Now value is original+j*N
// If value greater than N, means i appears multiple times
// before, restore again
if (value > N) {
value = value % N;
// Now value is the original value in A[i]
}
}
if (A[value] < N * N) {
A[value] += N * N * (i + 1);
} else {
A[value] += N;
}
}
for (int i = 0; i <= N; i++) {
// Where does this number appear first
int PositionOfFirstAppearance = A[i] / (N * N);
// How many times it duplicates
int TimesBeenDuplicated = (A[i] % (N * N))/N;
if (PositionOfFirstAppearance > 0) {
if (TimesBeenDuplicated > 0) {
if (CurrentDupPositionSoFar > PositionOfFirstAppearance ) {
CurrentDupPositionSoFar = PositionOfFirstAppearance ;
DupFoundSoFar = i;
}
}
}
}
return DupFoundSoFar;
}
}
</pre>
Your solution uses extra space though.
int CurrentDupPositionSoFar = N;
int DupFoundSoFar = 0;
for (int i = 0; i <= N; i++)
for (int i = 0; i <= N; i++)
All of those lines create an int. While it is O(1) space, the question specifically asks for an in place algorithm.
<pre lang="" line="1" title="CodeMonkey27155" class="run-this">def markitem(array, i):
array[i] = -abs(array[i])
def decodeitem(array, i):
return abs(array[i])
def ismarked(array, i):
return array[i] < 0
def findfirstrep(array):
for i in xrange(len(array)):
n = decodeitem(array, i)
if ismarked(array, n):
return n
markitem(array, n)
if __name__ == "__main__":
a = [10,9,9,7,6,5,4,3,2,1,1]
print findfirstrep(a)
</pre>
Modified Algorithm(Optimized As Well)
1.While Looping Down from Array Calculate the index of each array element ist clear as all elements will in range of 1 to n so no over flow or underflow conditions.? so index=a[i]%size;
2. then check if value at this index is zero or not if its zero then set its as INT_MIN or -size as you wants. a[index]=-size
3.if above condition fails then check if value at given index is in given range or
not if its in range then multiply the value at given index with -1.
4. if number is less then zero this means its already occurs so print it , multiply with -1 to reset & set value to this index to a[index]=value+size due to time complexity constraints & problem statement i used this when if a number occur say k times algorithm wont fail.
5.
#include
#include
void printRepeating(int arr[], int size)
{
int i,index=0;
printf("The repeating elements are: \n");
for(i = 0; i < size; i++) { index=abs(arr[i])%size; if(arr[index] == 0) arr[index]=-size; else if(arr[index]>0 && arr[index]
arr[index] = -arr[index];
else if(arr[index]<0)
{
printf(" %d ", abs(arr[i]));
arr[index]=-arr[index];
arr[index]+=size;// try to dry run for each number you will get it !!! :)
}
}
//restore previous array
for(i=0;i
arr[i]=abs(arr[i])%size;
}
int main()
{
int arr[] = {2,2,1,1,1,2,2,3,0,0,0,1,1,2,0,7,5};
int arr_size = sizeof(arr)/sizeof(arr[0]);
printRepeating(arr, arr_size);
getchar();
return 0;
}
Time Complexity O(N)
Space Complexity O(1)
idea can be extended to if we have array elemnmt in range of -n to n e.g. total 2n+1 elements :)
i have used a flag array ,which actually saves the status of the all the numbers from 1 to n (initially as false)
then from i=1 to n+1 i traverse the array and if flag[i] is true then breaks....
my code is :
#include<iostream>
#include<conio.h>
using namespace std;
void main(){
int arr[10],flag[10],n,i,j;
cout<<"enter the number of elements: ";
cin>>n;
for(i=1;i<10;i++){
flag[i]=false;
}
for(i=0;i<n;i++)
cin>>arr[i];
for(i=0;i<10;i++){
if(flag[arr[i]]==false)
flag[arr[i]]=true;
else{
cout<<"the the 1st repeated number is :"<<arr[i];
break;
}
}
_getch();
}
For index i, the element is a[i].
For every a[a[i]], if we multiply its value, then when we already see a -ve value, we can say that that element in its index position is first repeated. Order of N complexity
int findRepeation (int a [ ], int n) // where size of array n + 1
{
int i = 0;
for(int i = 0; i < size; i++)
{
if(a [ a [ i ] ] > 0)
a [ a [ i ] ] *= -1;
else
break; // break from the for loop
}
return i; // index of the first repeation
}
Without destroying the original values and spanning negative values also, a solution would be like:
for i from 0 to n
while ( a[i + 1] != i + 1)
if ( a[ i + 1] == a[ a[ i + 1] ])
return a[ i + 1] //bingo
swap a[ i + 1], a[ a[ i + 1] ]
return null //all unique aray
Idea is just to keep swapping until the slot finds its reserved value, however do not forget to check equalities in between.
Array might contain negative values, all we need is to adjust loop offset against that.
It is O(n) since each value is visited only once (worst case twice for an all-unique array).
Only a couple problems with that...First the variable "i" you use for your loop, as an extra variable. Read the problem very carefully...
Restrictions: O(n) algo required. Cannot use extra space(not even O(1)).
hi kevin,
now that i am terribly wondering how the solution from daywalker with 4 votes manages to "scan the array" without even simple looping or whatever, then. someone pls explain that or what "extra space (not even O(1) and loop variable)" means exactly.
Thanks...
Daywalker's solution is somewhat vague in how it loops through the array, in that sense, I think he/she should have been asked to explain further. However even Daywalker claims strictly O(1) space, disqualifying the answer. "extra space (not even O(1))" means you cannot declare any new variables, you have to solve the problem with only the memory of the array itself. You can't copy the array, you can't have a loop variable, you can't do a traditional swap(though an in-place swap would be fine), basically if you measured the memory use of your solution it should never be greater than that of the original array. Just slightly below Daywalker's post, I suggested an approach as well, that I believe meets the requirements but has several severe limitations on the data-type. At the time I was posting as KMD, but as questions about my thought process came up I opened an account as kevinday.home, I even posted a powerpoint explaining the process. Check it out, critique it, like it or hate it. Today if asked the question, I'd go with a slightly modified version of the one I posted.
Since the numbers are in range 1 to n we can do it in O(n) by,
1) scan through the array changing the sign of elements a[a[i]] = -1 * a[i] if a[i] is positive;
if a[i] is negative then it is a duplicate
O(n) time and strictly O(1) time complexity.
Since the numbers are in range 1 to n we can do it in O(n) by,
- daywalker March 30, 20111) scan through the array changing the sign of elements a[a[i]] = -1 * a[i] if a[i] is positive;
if a[i] is negative then it is a duplicate
O(n) time and strictly O(1) time complexity.