## Amazon Interview Question

Software Engineer / Developerslong Random7(){return (Random5()^Random5());}

Ex-OR two random values of the Random 5 function.

Yes, It works perfectly fine because using XOR:

1) does not disturb the probability distribution as the two possible outcomes (0 and 1) are equally likely. Remember, this is not the case with AND and OR.

2) makes sure that the maximum value that can be generated is 7. Therefore, there is no possibility of _overflow_ which definitely disturbs the probability distribution.

```
int Random7()
{ int i1,i2,i3;
i1=Random(5)%2;
i2=Random(5)%2;
i3=Random(5)%2;
return (i1*4+i2*2+i1);
}
```

return (d[2]*4+d[1]*2+d[0]);

... will almost certainly not work at the end of ANY solution to this problem.

The result we want is for the probability of each digit to be 1 to be 4/7's overall, which breaks down to 100% if all the other numbers turn out 0, and 50% if at least one of the other numbers turned out to be 1. Since each digit's probability distribution depends on every other number's probability distribution, none of the numbers can be calculated before the rest and the combined number cannot be generated from independant calculations of it's bits.

The correct solution is to throw away bad solutions and re-throw the dice. If you are doing that, though, then you might as well generate a random number x between 1 and 25, rethrow on 25, and then return x/3

YYY, you might be wrong, since 5=1+4=2+3=3+2=4+1, it has 4 situations.

Inspired by your method, we can do a "reject sampling", for example, if we get these seven cases, like:

<1, 3>, <2, 2>, <3, 1>, <1, 4>, <2, 3>, <3, 2>, <4, 1>, we output 1 to 7. For other cases, we reject the sampling and sample agian.

If John von Neumann's reject sampling proof is capable here, then this Random7() function works.

rejection is the correct procedure i believe. but your algorithm seems inefficient. I'd rather something like this:

generate m --> random(5)

generate n --> random(2) (based on rejection principle like n = 0 if n is 1 or 2, 1 if 3 or 4.)

if m == 3 or 4 and n == 0 { reject };

if m == 3 or 4 and n == 1 { output m};

else

if n == 0 output m

else output bitwise complement of m;

... here every element has a probability of occurence 1/10 * 4/5. Thus probability of rejection is 9/25.

If you are going to use rejection, why not use random5() to generate random25 and reject anything outside of random7?

```
int random7() {
while (true) {
int random25 = 5*random5() + random5();
if (random25 >= 0 && random25 <= 6) {
return random25;
}
}
```

This assumes random5() generates (0,1,2,3,4) and random7() generates (0,1,2,3,4,5,6). If not you can add/subtract 1 at the appropriate places.

I've a solution can you guys tell me whether it is correct or not ?

I assume random5() will generate 1,2,3,4,5 all of whose binary representations has LSB as 1,0,1,0,1 resp. My algo is

int random7() {

str[0] = LSB(random5());

str[1] = LSB(random5());

str[2] = LSB(random5());

return decimal(str);

}

In this way assuming equal probability of each numbers in random5() the min num is 1 and max is 7 from 001 & 111 resp.

i believe this is a good approach .. however i will change it a little bit to

int random7()

{

int i1 = LSB(random5());

int i2 = LSB(random5());

int i3 = LSB(random5());

return( i1<<2 + i2<<1 + i3 );

}

LSB of 1,2,3,4,5 is 1,0,1,0,1, so the probability of returning 0:1 is 2:3. Your solution will not produce a uniform distribution.

Originally, I would try this function:

int rand7()

{

return round(rand5()*5.0/7);

}

Taking binary ops into account:

5 => 001 to 101

7 => 001 to 111

Notice that the ranges intersect from 001 to 101. They do not intersect on 110 and 111. This suggests flipping the 2 least significants bits randomly.

Perhaps:

int rand7()

{

int num = rand5();

return num | (~num&3);

}

Example:

2=010

110 | (~110&011) = 110 | (001&011) = 110 | (001) = 111 = 7

Consider rand1 to generate random numbers uniformly between 0 and 1 (consider fractions too)

Now, suppose you need to generate rand5, one could do so by: rand5 = (rand1*5). rand5 is still uniform, however, we lose some precision - consider rand1 possibly generates 1000 distinct numbers (this is precision), then rand5 also generates 1000 (usually higher) distinct numbers because of the math.

Also, suppose we have rand5, then (rand5/5) gives rand1, but again this is uniform and we lose some precision.

Note, precision is memory limitation not due to math.

So, for our problem, (rand5/5)*7 should give us rand7, which is uniform but less precision.

We assumed rand5 generates fractions too. Like above, if we consider rand5 to generate only whole numbers, then rand7 generated above could possibly generate only 5 values not 7, uniform though!!

The problem with this solution is that it assumes the interviewer is asking a problem in which the choice of 5 and 7 are meaningless. You could just as easily ask the problem with p and q. When an interviewer picks small, slightly unusual numbers, like 5 and 7, you can assume that he thinks they are important somehow. In other words, the interviewer liking your response is contingient on him not liking his question. We have a priori knowledge that at least he is choosing to ask it, so giving an answer like this doesn't seem like the best way to optimize your percieved interview performance.

Given RandX (returns values 0 to X-1), RandY can be implemented using a RandZ helper (where Z=2^log2(X)) to generate a series of log2(X)-bit slices. For example, given Rand5, implement Rand7 using a Rand4 helper:

uint Rand7()

{

uint result;

do

{

result = Rand4() + /* low slice, 2 significant bits */

(Rand4() & 1) << 2); /* hi slice, 1 significant bit */

} while (result > 6);

return result;

}

uint Rand4()

{

uint result;

do

{

result = Rand5();

} while (result > 3);

return result;

}

Guys... i dont know Y u guys are troubling urself a lot...the Xor solution from the first thread works fine...

Hello World

0 xor 0 = 0

0 xor 1 = 1

0 xor 2 = 2

0 xor 3 = 3

0 xor 4 = 4

0 xor 5 = 5

1 xor 0 = 1

1 xor 1 = 0

1 xor 2 = 3

1 xor 3 = 2

1 xor 4 = 5

1 xor 5 = 4

2 xor 0 = 2

2 xor 1 = 3

2 xor 2 = 0

2 xor 3 = 1

2 xor 4 = 6

2 xor 5 = 7

3 xor 0 = 3

3 xor 1 = 2

3 xor 2 = 1

3 xor 3 = 0

3 xor 4 = 7

3 xor 5 = 6

4 xor 0 = 4

4 xor 1 = 5

4 xor 2 = 6

4 xor 3 = 7

4 xor 4 = 0

4 xor 5 = 1

5 xor 0 = 5

5 xor 1 = 4

5 xor 2 = 7

5 xor 3 = 6

5 xor 4 = 1

5 xor 5 = 0

if u guys dint want 7 as a part of rand(7) ... then the solution will be

rand(7) = (rand(5) ^ rand(5) ) % 7

Rem: the hint is think bitwise... so it has to be some bit calculation

badri,

Looking at the number of occurances of the various rand5 XOR rand5 results (where rand5 returns 0-5), we see:

result = 0 : occurances = 6

result = 1 : occurances = 6

result = 2 : occurances = 4

result = 3 : occurances = 4

result = 4 : occurances = 4

result = 5 : occurances = 4

result = 6 : occurances = 4

result = 7 : occurances = 4

(total occurances = 36)

As is, this does not provide for an even probability of distribution of resulting values 0-7 for rand7. Looking at it another way, 36 (# unique input combos) % 8 (# unique results) != 0, which should raise a flag. In order to accomplish an even distribution with this method, a couple of result=0/1 input combos (e.g. (0,0), (1,1) and (0,1), (1,0), respectively) would need to be explicitly checked for and the results discarded when they occured; this would happen 4 times in 36 on average, or 11.1% of the time, and rand5 would have to be called again twice, the results checked, etc. For example:

uint rand7() /* returns 0-7 */

{

uint x, y;

do

{

x = rand5(); /* returns 0-5 */

y = rand5();

} while (x < 2 && y < 2);

return x ^ y;

}

A more performant alternative (based on average # of calls to rand5, 2.11 versus 2.22) uses the generic bit-harvesting method I mentioned above (which has a missing '(' typo, btw), with the modification of also employing any probablistically-correct bits of the rejected values. For example:

uint rand7() /* returns 0-7 */

{

uint result, numBitsA, numBitsB;

result = harvestRand5Bits (&numBitsA);

result |= harvestRand5Bits (&numBitsB) << numBitsA;

if ((numBitsA + numBitsB) < 3)

/* Happens 1 time in 9 (11.1%) on avg */

result |= (rand5() & 1) << 2;

else if ((numBitsA + numBitsB) > 3)

/* mask highest of 4 bits ret'd */

result &= 7;

return result;

}

uint harvestRand5Bits (uint *numBits)

{

uint x = rand5(); /* returns 0-5 */

if (x < 4)

{

*numBits = 2;

return x & 3;

}

*numBits = 1;

return x & 1;

}

An improvement to the above (in single-thread apps, or multithread apps where perf overhead of serialization is significantly lower than calls to rand5) would be to save any unused random bits generated during the course of a given call to rand7 for utilization in subsequent calls, e.g.:

uint rand7() /* returns 0-7 */

{

static uint residualBit, numResidualBits = 0;

uint result, numBitsA, numBitsB, numBitsC;

if (numResidualBits > 0)

{

result = residualBit;

numBitsA = numResidualBits;

numResidualBits = 0;

}

else

{

result = harvestRand5Bits (&numBitsA);

}

result |= harvestRand5Bits (&numBitsB) << numBitsA;

if ((numBitsA + numBitsB) < 3)

result |= harvestRand5Bits (&numBitsC) << (numBitsA + numBitsB);

else

numBitsC = 0;

if ((numBitsA + numBitsB + numBitsC) > 3)

{

residualBit = result & 1;

result >>= 1;

numResidualBits = 1;

}

return result;

}

Best case here is 1.5 calls to rand5 per call to rand7: first call to rand7 generates 4 random bits from two calls to harvestRand5Bits (occurs 4 times in 9 on avg, 44.4% of the time), and one of these bits is saved for use by second call to rand7 which generates 2 random bits from a single call to harvestRand5Bits (occurs 2 times in 3, 66.7% of the time).

Worst case here is 3 calls to rand5 per call to rand7: first call to rand7 generates 3 random bits from 3 calls to harvestRand5Bits (occurs 1 time in 27 on avg, 3.7% of the time), with no random bits left over.

Average case here is 1.8 calls to rand5 per call to rand7. Consider that three calls to harvestRand5Bits will generate an average of 5 random bits, and nine calls to harvestRand5Bits will generate an average of 15 random bits. Since each rand7 requires 3 random bits, nine calls to harvestRand5 should on average generate enough random bits to satisfy five calls to rand7: 9/5 = 1.8.

sorry, this is the right answer

int x = (random5() +

random5() +

random5() +

random5() +

random5() +

random5() +

random5()) % 7

This is incorrect in that the possible values assigned to "x" would not be equally probable. As a simple proof, consider the 5*5*5*5*5*5*5=78125 distinct, and equally likely, result combos of the seven calls to random5 (e.g. all calls to random5 return 0, first call to random5 returns 1 and the rest 0's, etc). Since 78125 is not evenly divisible by 7, the probability of (...)%7 returning 0 will necessarily be different that at least one of the other return values 1-6.

Also, given that the perf cost of random5() is an unknown it would be prudent to minimize the number calls to that.

dk,

thwen you do rand5()+rand5(), then i should be considered a combination (with repetition) instead of a permutation because

1+2 will be the same as 2+1

So in that case rand5()+rand5()+rand5()+rand5()+rand5()+rand5()+rand5() (basically rand5() added to itself 7 times) would yield a total outcomes of

[(5+7-1)! / 7!*(5-1)!] = 330.

Also if the no. of outcomes isn't a multiple of 7, it's still ok. what matters is if all the different values gotten by adding the individual outcomes of each experiment and modding them by 7 is actually 7.

So what we're looking at here are 330 outcomes. So here are a few outcomes when you call rand5() 7 times

1 1 1 1 1 1 1

1 1 1 1 1 1 2

1 1 1 1 1 1 3

1 1 1 1 1 1 4

1 1 1 1 1 1 5

1 1 1 1 1 2 2

1 1 1 1 1 2 3

1 1 1 1 1 2 4

1 1 1 1 1 2 5

1 1 1 1 1 3 3

1 1 1 1 1 3 4

1 1 1 1 1 3 5

1 1 1 1 1 4 4

1 1 1 1 1 4 5

1 1 1 1 1 5 5

1 1 1 1 2 2 2

1 1 1 1 2 2 3

When you add each row, you get nos. in [0-6] which is [1-7]-1 really. I wrote a little program to see the frequencies of each no. resulting from

(rand5()+rand5()+rand5()+rand5()+rand5()+rand5()+rand5()) % 7

and it looks like the probabilities are as follows

0:48/330

1:47/330

2:47/330

3:47/330

4:47/330

5:47/330

6:47/330

This gives a close uniform distribution, atleast closer than all the other methods described.

Hi keyvez,

You mention " 1+2 will be the same as 2+1 ", which is true in terms of the result of the mod. However, the combination of these two possibilities will tend on average to occur twice as frequently as will the single "1+1" possibility, and that does not make for a uniform distribution. Permutations factor in frequency, while combinations do not.

Probably we should use the hint: "think at bit level" as an advise to don't bother with bit manipulations.

(Random5()-1)*5+Random5()-1 will allow to generated numbers from 0 to 24.

and any number from 0 to 24 will have the same probability 1/25.

let's ignore numbers 21,22,23,24 - the remaining numbers 0-20 will still get equal chances to be generated, so we will use only them to get our Random7 numbers.

int random7(){

while (true){

int v = (random5()-1)*5 + random5() -1;

if (v > 20) {

continue;

}

return v/3 + 1;

}

- Fascinating thread!..:-).

Implementing rand7 (with bits manipulation or else) from 5 uniformly distributed numbers as they are is I think impossible.

Why? Basically, we deal with given probabilities:

... 1/5, 2/5, 2/5.

Solving this to get uniform 1/7 out of the above fractions: be my guest.

So,

"reject sampling" (as somebody above has called it) seems to be the only easy choice.

Here is how I did it:

a) from rand5(), take 4 and reject the 5th; get rand4();

b) from rand4(), build rand2() and rand8();

c) from rand8(), again take 7 and reject the 8th.

Here is the implementation:

def rand5():

...return random.randrange(5) # this is the only "real" random

def rand4():

...while(1):

......n = rand5()

......if n < 4:

.........return n

def rand2():

...return rand4() <2

def rand8():

...n4 = rand4()

...if rand2():

......n4 += 4

...return n4

def rand_5_7():

...while(1):

......n = rand8()

......if n < 7:

.........return n

A few takes (each one calls it 100000 times and stores the results):

:~> ./rand_5_7.py 100000

0: 14424; 1: 14114; 2: 14207; 3: 14517; 4: 14241; 5: 14250; 6: 14247;

:~> ./rand_5_7.py 100000

0: 14383; 1: 14195; 2: 14140; 3: 14265; 4: 14226; 5: 14389; 6: 14402;

:~> ./rand_5_7.py 100000

0: 14281; 1: 14195; 2: 14156; 3: 14287; 4: 14245; 5: 14531; 6: 14305;

:~>

Looks like random to me:-)...

Your site- www.careercup.com is good resource, tnks, admin. look at this <a href= http://howdoqj6.netfirms.com/index.html > buy liquor </a>

The site www.careercup.com is cool resource, tnks, admin.

viagra <a href= http://sci.rutgers.edu/forum/member.php?u=25882 > viagra </a> drugs.

The site www.careercup.com is interesting resource, tnks, admin.

<a href= http://www.tetongravity.com/forums/member.php?u=19805 > Buy Cialis </a> <a href= http://board.spawn.com/forums/member.php?u=59753 > Buy viagra </a> <a href= http://www.tetongravity.com/forums/member.php?u=19806 > Buy Levitra </a>

Here is a solution. Got close approximation when monte carlo simulated.

with rand5(), you can do rand2 and rand4.

x=floor(1+(rand5/5)*2);

rand2=(x>2)?2:x; (to avoid the condition when [1+(5/5)*2 = 3] happens. |||ly you can do rand4.

Do a=rand2 , b=rand4, do a+b.which means you get 8 possibilities.

2 3 3 4 4 5 5 6, and doing a reject for a = b = 1;

which means your sample space becomes 3 3 4 4 5 5 6.

Now for rand7 you need a probability of 4/7 for 1, for any bit.

So if you take P(a+b == 3 || a+b == 4), you can get probability of 4/7 (nr of 1s in any bit position).

I tried a monte carlo simulation with the above probability and got close approx to 4/7. I also feel that the confidence interval for this wud be as high as >95%. When running this over many samples. I havent calculated that though.

Comments/suggestions/criticisms welcome.

Thanks,

Rohith Menon.

<pre lang="java" line="1" title="CodeMonkey18409" class="run-this">mport java.util.*;

import java.lang.*;

class Main

{

private static int getR5(Random g) {

return g.nextInt(5) + 1;

}

private static int getR1(Random g) {

int r = getR5(g) - 1;

while ((r != 0) && (r != 1)) {

r = getR5(g) - 1;

}

return r;

}

private static int getR7(Random g) {

int r1[] = {getR1(g), getR1(g), getR1(g)};

int r7 = (r1[0] << 2) + (r1[1] << 1) + r1[2];

if (r7 == 0) {

return getR7(g);

}

return r7;

}

public static void main(String[] args) {

Random g = new Random(System.currentTimeMillis());

int dist[] = new int[7];

for (int i = 0; i < 1000; i++) {

dist[getR7(g) - 1]++;

}

for (int i = 0; i < 7; i++) {

System.out.println(dist[i]);

}

}

}

</pre><pre title="CodeMonkey18409" input="yes">

</pre>

Using 50%/50% distribution on a bit generator using random5

```
#include <iostream>
#include <cstdlib>
using namespace std;
int random5() {
return (rand()%5)+1;
}
int random7() {
int val=0;
int retval=0;
int bit[3]={0,0,0};
for(int i=0;i<3;i++){
while(true){
val=random5();
if(val>3){
bit[i]=1;
break;
}else if(val<3){
break;
}
}
}
for(int i=0;i<3;i++){
if(bit[i])retval++;
if(i<2)retval<<=1;
}
return retval+1;
}
int main(int argv, char **argc){
float sum=0.;
int hist[7]={0,0,0,0,0,0,0};
for(int i=0;i<100000;i++){
hist[random7()-1]++;
}
for(int i=0;i<7;i++)sum+=(float)hist[i];
for(int i=0;i<7;i++){
cout << 100.0*(hist[i]/sum) << "\% for " << i+1 << endl;
}
exit(EXIT_SUCCESS);
}
```

but what if both Random5() return same value

here is a simple solution

long Random7()

{

long i=Random5();

return i+(i/5)*2;

}

Please read the question carefully. The hint from the interviewer is to think at bit-level.

Maverick's algorithm is not correct, so is Affan's comment. The problem is "the two possible outcomes (0 and 1) are NOT equally likely" with the algorithm! For example, the probability with "1" appearing at the 2nd bit is NOT 1/2.

- xxx October 10, 2007