Google Interview Question
Software Engineer / Developers
Country: United States
Interview Type: In-Person
We can assume that the probs given in the array correspond to integer frequencies, and then sample using their cumulative sums:
// calculate the cumulative frequency (prefix sums); assumes n > 0 and a positive total
vector<int> ps(n);
ps[0] = freq[0];
for (int i = 1; i < n; i++)   // start at 1 so that ps[i-1] is always valid
    ps[i] = ps[i-1] + freq[i];
int x = rand() % ps[n-1]; // x lies in [0, ps[n-1] - 1]
// find the first index whose prefix sum is strictly greater than x
int idx = upper_bound(ps.begin(), ps.end(), x) - ps.begin(); // iterator -> index
return a[idx]; // return the element associated with that index
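The same idea can be packaged as a self-contained function. This is only a minimal sketch, assuming the weights are non-negative integers with a positive sum. The names PickIndexFor and PickIndex are hypothetical and not from the original post, and std::mt19937 with std::uniform_int_distribution stands in for rand() % total to avoid modulo bias.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: maps a value x in [0, total) to the index whose
// cumulative-frequency bucket contains x. Assumes freq is non-empty,
// every entry is non-negative, and the sum of the entries is positive.
std::size_t PickIndexFor(const std::vector<int>& freq, int x) {
    std::vector<int> ps(freq.size());
    std::partial_sum(freq.begin(), freq.end(), ps.begin());  // prefix sums
    // First prefix sum strictly greater than x.
    return static_cast<std::size_t>(
        std::upper_bound(ps.begin(), ps.end(), x) - ps.begin());
}

// Hypothetical wrapper: draws x uniformly from [0, total - 1] and maps it
// to an index, so index i is chosen with probability freq[i] / total.
std::size_t PickIndex(const std::vector<int>& freq, std::mt19937& gen) {
    int total = std::accumulate(freq.begin(), freq.end(), 0);
    std::uniform_int_distribution<int> dist(0, total - 1);
    return PickIndexFor(freq, dist(gen));
}
```

With freq = {1, 2, 3}, x = 0 maps to index 0, x in {1, 2} maps to index 1, and x in {3, 4, 5} maps to index 2, so the indices are chosen with probabilities 1/6, 2/6 and 3/6. Entries with zero frequency are never selected, because their prefix sum equals that of the previous entry.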
This assumes that probs[i] is the probability of getting strings[i], which means probs.size() == strings.size() and sum(probs) == 1.0.
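#include <random>
#include <string>
#include <vector>
std::string GetRandomString(const std::vector<std::string>& strings, const std::vector<double>& probs) {
  // Build the cumulative distribution: density[i] = probs[0] + ... + probs[i].
  std::vector<double> density;
  density.push_back(probs[0]);
  for (std::size_t i = 1; i < probs.size(); ++i)
    density.push_back(probs[i] + density.back());
  // Draw r uniformly from [0, 1); std::mt19937 replaces the undefined random.NextFloat().
  static std::mt19937 gen{std::random_device{}()};
  double r = std::uniform_real_distribution<double>(0.0, 1.0)(gen);
  // Stop at the last index so rounding error in the sums cannot run past the end.
  std::size_t index = 0;
  while (index + 1 < density.size() && r >= density[index])
    ++index;
  return strings[index];
}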
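For comparison, the C++ standard library already provides std::discrete_distribution, which performs weighted index selection directly. The following is a minimal sketch, assuming C++11 or later. The function name GetRandomStringStd and the caller-supplied generator are illustrative choices, not part of the original answer.

```cpp
#include <random>
#include <string>
#include <vector>

// Hypothetical alternative to the hand-written loop: std::discrete_distribution
// returns index i with probability probs[i] / sum(probs), so the weights
// do not even need to sum to 1.0.
// Assumes strings.size() == probs.size(), all probs are non-negative,
// and at least one of them is positive.
std::string GetRandomStringStd(const std::vector<std::string>& strings,
                               const std::vector<double>& probs,
                               std::mt19937& gen) {
    std::discrete_distribution<int> dist(probs.begin(), probs.end());
    return strings[dist(gen)];
}
```

Passing the generator in by reference keeps the function deterministic under a fixed seed, which makes it easier to test. If many strings are drawn from the same distribution, constructing dist once and reusing it would avoid rebuilding the internal table on every call.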
The term "arbitrary probability distribution" over the given string set requires further clarification, because this distribution determines how we simulate the randomness used to pick the next string. For instance, suppose the probability array defines a uniform distribution over the strings. In that setting, the next string's index can be simulated by drawing a uniformly distributed integer in [1...n], where n is the total number of strings, using a random number generator rnd(). However, if the probability distribution defined over the set of strings is not uniform, then we must use an implementation of the random-index generator that suits the given distribution.
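For the uniform case, one possible sketch uses std::uniform_int_distribution. The name UniformIndex is hypothetical, and the result here is 0-based (0 to n - 1) to match C++ container indexing rather than the 1-based [1...n] range mentioned in the text. Adding 1 to the result gives the 1-based form.

```cpp
#include <cstddef>
#include <random>

// Hypothetical helper: returns an index in [0, n - 1], each value with
// probability 1/n. Assumes n > 0.
std::size_t UniformIndex(std::size_t n, std::mt19937& gen) {
    std::uniform_int_distribution<std::size_t> dist(0, n - 1);
    return dist(gen);
}
```

A non-uniform distribution cannot reuse this helper as is. It needs a generator that weights each index, for example one based on cumulative sums of the probabilities.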
- Anonymous September 26, 2013