Yahoo Interview Question
Software Development ManagersCountry: United States
Interview Type: Phone Interview
this method has flaws, what if the probability list is[0.1,0.22222,0.67778]? the big N will be only 10, which led to W1:[1,2.2222,6.7778], and the output number would be[1,3,7].
The findN function needs to be modified as below:
int findN( double W[] )
{
double num = minElement( W );
int n = 1;
while( num < 1 )
{
n = n*10;
num = num*10;
}
return n;
}
char randomchar()
{
double temp = 0;
int random = rand() % 100;
for(int i=0; i<n; i++)
{
temp += W[i];
if(random < temp*100)
return A[i];
}
}
As mentioned in question, these weights and characters are given once but the randomChar() is called multiple number of times. You are doing a O(n) for each of these calls. A good approach would be finding cumulative probabilities in one parse initially. And then whenever the randomChar() is called, you do a binary search with complexity O(logn).
public static Character random(
char[] cha, double[] dba) {
if (cha.length == dba.length) {
List<Character> arr = new ArrayList<>();
int[] inta = new int[cha.length];
for (int i=0;i<inta.length;i++) {
inta[i]=(int) Math.round(dba[i]*100);
}
for (int i=0;i<inta.length;i++)
for (int j = 0; j < inta[i]; j++)
arr.add(cha[i]);
return arr.get((int) Math.round(Math.random()*100));
} else
return null;
}
1. Sort the the probability array in decreasing order (sort character array accordingly).
W'=[0.5,0.3,0.2]
A'=['b', 'a', 'c']
2. Iterate over arrays to calculate cumulative probability as suggested.
W'=[0.5,0.8,1.0]
3. Generate random number 'r' between 0 - 1
4. Iterate array to find 'i' until W'[i+1] > r or i ==n
5. Return A'[i]
public class NewTest{
// to search for the given range
static int binSearch(double[] W, int start, int end, double random){
if(start==end)
return start;
int mid = (end - start)/2;
if(random <= W[mid])
return binSearch(W,start,mid,random);
else
return binSearch(W,mid+1,end,random);
}
static char generateRandom(double[] W, char[] A){
double random = Math.random(); //random double between 0 and 1
return A[binSearch(W,0,W.length-1,random)]; //
}
public static void main(String[] args){
double W[] = {0.2, 0.3, 0.5};
char A[] = {'a', 'b', 'c'};
//calculating cumulative probability
for(int i=1; i < W.length; i++){
W[i] = W[i-1]+W[i];
}
//calling multiple times and counting - for test only
int count[] = {0,0,0};
for(int i=0;i<10000;i++)
{
char r = generateRandom(W,A);
count[r-'a']++;
}
for(int i=0; i<count.length; i++)
System.out.println((char)('a'+i)+" - "+count[i]);
}
}
Once you computed cumulative sum:
for(int i=1; i < W.length; i++){
W[i] = W[i-1]+W[i];
}
your algo has become O(n) solution since A.length = W.length.
Always a trade-off space vs. time:
import java.util.Random;
public class MyRandomChar {
// Assumption: Array is not given as part of the randomChar() -
// "There is an array of characters... "
private final char[] A = { 'a', 'b', 'c' };
private final double[] W = { 0.3, 0.5, 0.2 };
private final int[] lookup;
private int totalLookupValues = 0;
private final Random rand = new Random();
public MyRandomChar() {
// Assumption: Preparation can take O(n) one time:
totalLookupValues = 0;
// Assumption: "... randomChar() called 100 times should return approx. ..."
// Used m = 10 as the resolution factor (can be increased) hence space O(m):
for (double w : W)
totalLookupValues += Math.round(w * 10);
lookup = new int[totalLookupValues];
int lookupIndex = 0;
for (int weightsIndex = 0; weightsIndex < W.length; weightsIndex++) {
for (int weightRepeat = 0; weightRepeat < Math.round(W[weightsIndex] * 10); weightRepeat++)
lookup[lookupIndex++] = weightsIndex;
}
}
public char randomChar() {
// O(1):
return A[lookup[rand.nextInt(totalLookupValues - 1)]];
}
}
Here's my modification of the binary search to identify the interval:
public int binarySearchInterval(double[] A, double target, int start, int end) {
if (start > end) return -1;
int mid = (start + end) / 2;
if (target <= A[mid] && (mid == 0 || target > A[mid-1])) return mid;
if (target > A[mid]) return binarySearchInterval(A, target, mid+1, end);
else return binarySearchInterval(A, target, start, mid-1);
}
Here's my binary search:
unsigned getIndexLower(const vector<float> &vec, float target){
if (vec.size() <=1 )return vec.size();
unsigned low =0, high= vec.size()-1;
while(low < high){
unsigned mid = (low+high)/2;
if (vec[mid] <= target && (mid <high && vec[mid+1] > target)){
return mid+1;
}
if (vec[mid] < target) { low = mid+1; }
else { high = mid; }
}
return low;
}
public static void main (String[] args) throws java.lang.Exception
{
char[] A = {'a','b','c'};
double[] W = {0.33,0.5,0.17};
int k=100,x=1000;
/*
k is the precison upto which you want to multiply your probability with
x is the number of times this experiment would run, make this number
too big it would reach the given probabilities
S containsts the count of number of times a, b or c came
*/
int[] S = {0,0,0};
Random randomGenerator = new Random();
double sum=0.0;
for (int idx = 1; idx <= x; ++idx){
int ra = randomGenerator.nextInt(k);
sum=0.0;
for(int i=0;i<A.length;i++)
{
sum+=W[i]*k;
if(ra<sum)
{
S[i]++;
break;
}
}
}
for(int i=0;i<A.length;i++)
System.out.println((double)S[i]/x);
}
public static void main (String[] args) throws java.lang.Exception
{
char[] A = {'a','b','c'};
double[] W = {0.33,0.5,0.17};
int k=100,x=1000;
/*
k is the precison upto which you want to multiply your probability with
x is the number of times this experiment would run, make this number
too big it would reach the given probabilities
S containsts the count of number of times a, b or c came
*/
int[] S = {0,0,0};
Random randomGenerator = new Random();
double sum=0.0;
for (int idx = 1; idx <= x; ++idx){
int ra = randomGenerator.nextInt(k);
sum=0.0;
for(int i=0;i<A.length;i++)
{
sum+=W[i]*k;
if(ra<sum)
{
S[i]++;
break;
}
}
}
for(int i=0;i<A.length;i++)
System.out.println((double)S[i]/x);
}
public static void main (String[] args) throws java.lang.Exception
{
char[] A = {'a','b','c'};
double[] W = {0.33,0.5,0.17};
int k=100,x=1000;
/*
k is the precison upto which you want to multiply your probability with
x is the number of times this experiment would run, make this number
too big it would reach the given probabilities
S containsts the count of number of times a, b or c came
*/
int[] S = {0,0,0};
Random randomGenerator = new Random();
double sum=0.0;
for (int idx = 1; idx <= x; ++idx){
int ra = randomGenerator.nextInt(k);
sum=0.0;
for(int i=0;i<A.length;i++)
{
sum+=W[i]*k;
if(ra<sum)
{
S[i]++;
break;
}
}
}
for(int i=0;i<A.length;i++)
System.out.println((double)S[i]/x);
}
Lets assume the same example
- iprep.bc November 22, 2013A = ['a', 'b', 'c']
W = [0.3, 0.5, 0.2]
By probability, when randomChar( ) is called 10 times, a should appear 3 times, b should appear 5 times, c should appear 2 times.
Here 10 is a good number because it ensures every probability is turned into an integer corr. character is printed that many times. If we had an example like,
A = ['a', 'b', 'c']
W = [0.03, 0.5, 0.47]
we have to take 100 as the baseline since 0.03*100=3.
This method finds this ānā
int findN()
{
n=minElement(W);
while(n>0)
{
n=n*10;
}
return n;
}
// Create a copy of W in W1 and multiply all elements of W with this n.
void initializeWeights()
{
for(int i=0;i<W.length();i++)
{
W1[i]=W[i]*n;
}
}
This W1 is the number of times each character has to be printed if randChar( ) is called ānā times.
If the randChar( ) is run for n times, our code should ensure that every A[i] is print W[i] times *randomly*. Remember, it cant just print every element sequentially, then randChar() has no meaning! ;)
char randChar()
{
while(1)
{
p=rand(W1.length()); // printing a random number from 1 to W.length()
if(W1[p]>0) // this ensures that when a char at A[p] is printed for that many times it is supposed to
{
W1[p]--;
return A[p];
}
}
}
int main()
{
int j,n;
n=findN();
m=100; // number of times we are asked to print it
for(j=1;j<=10;j++)
{
initializeWeights() // assign back all the weights
for(k=1;k<=n;k++)
{
cout<<randChar( );
}
}
// set W back to its original numbers every n times
}
This is the first time I am posting answers in this forum. Please pardon syntactical mistakes!