How to sort data according to probability so that each probability interval has a result?
For example, suppose we have the following data:
{
a: 40,
b: 20,
c: 10,
d: 5,
e: 5,
f: 5,
g: 5,
h: 5,
i: 3,
j: 2
}
Each key is an item to be sorted, and each value is the probability of that item appearing at its position in the array. For example, the probability of a appearing at position 0 is 40%; that is, in the computed array, a has a 40% chance of being displayed first. The remaining items are ordered by the algorithm according to their probabilities.
My current solution (a crude approach that will not scale as the array grows):
1. Divide the range into intervals according to the given probabilities: a gets 0-40, b gets 40-60, c gets 60-70, and so on.
2. Generate a random number in the range 1-100 and check which interval it falls into; the item that owns that interval is the result.
My code is as follows (I am looking for ideas to improve it):
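The two steps above can be driven by the data itself rather than by hard-coded bounds. The following is only a minimal sketch, not the code from this post: the function name pickWeighted is hypothetical, and it assumes the weights are positive integers in a non-empty array.

```php
<?php
// Hypothetical helper (not from the original post): pick one key by
// walking the cumulative intervals built from the weight map.
// Assumes a non-empty array of positive integer weights.
function pickWeighted(array $weights): string
{
    $total = array_sum($weights);   // 100 for the sample data
    $rand = mt_rand(1, $total);     // inclusive on both ends
    $upper = 0;
    foreach ($weights as $key => $weight) {
        $upper += $weight;          // upper bound of this key's interval
        if ($rand <= $upper) {
            return (string) $key;
        }
    }
    throw new LogicException('weights must be positive integers');
}

$weights = ['a' => 40, 'b' => 20, 'c' => 10, 'd' => 5, 'e' => 5,
            'f' => 5, 'g' => 5, 'h' => 5, 'i' => 3, 'j' => 2];
echo pickWeighted($weights), PHP_EOL; // one key, e.g. "a" about 40% of the time
```

Because the intervals are derived from the array, adding or removing entries needs no change to the function.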
public function getRandValue($rate, $max, $min, $arr)
{
    while (count($rate)) {
        $rand = $this->getRand($min, $max);
        // Map the random number to the weight of the interval it falls into.
        if (0 < $rand && $rand <= 40) {
            $num = 40;
        } else if (40 < $rand && $rand <= 60) {
            $num = 20;
        } else if (60 < $rand && $rand <= 70) {
            $num = 10;
        } else if (70 < $rand && $rand <= 75) {
            $num = 5;
        } else if (75 < $rand && $rand <= 80) {
            $num = 5;
        } else if (80 < $rand && $rand <= 85) {
            $num = 5;
        } else if (85 < $rand && $rand <= 90) {
            $num = 5;
        } else if (90 < $rand && $rand <= 95) {
            $num = 5;
        } else if (95 < $rand && $rand <= 98) {
            $num = 3;
        } else if (98 < $rand && $rand <= 100) {
            $num = 2;
        }
        // The weights 40, 20, 10, 3 and 2 may each be added only once;
        // the weight 5 is shared by several items, so it may repeat.
        if (!in_array($num, $arr) && in_array($num, array(40, 20, 10, 3, 2))) {
            $arr[] = $num;
        } elseif (!in_array($num, array(40, 20, 10, 3, 2))) {
            $arr[] = $num;
        }
        // Stop once 10 values have been collected.
        if (count($arr) >= 10) {
            break;
        }
    }
    return $arr;
}
Problems encountered (the in_array checks are there because the values for these intervals may only be produced once):
1. The result does not necessarily contain a value from every interval.
2. The code is not scalable.
Any advice would be greatly appreciated. Thank you all!
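One way both problems might be avoided is a weighted draw without replacement: pick a key in proportion to its weight, remove it, and repeat on what remains. The sketch below is an assumption about what is wanted, not a confirmed solution; weightedOrder is a hypothetical name, and it assumes positive integer weights. It guarantees that every key appears exactly once and that a is first 40% of the time, but the probabilities for later positions depend on what has already been drawn.

```php
<?php
// Hypothetical helper: order every key exactly once by repeatedly
// drawing from the remaining weights (sampling without replacement).
// Assumes positive integer weights.
function weightedOrder(array $weights): array
{
    $order = [];
    while ($weights) {
        $rand = mt_rand(1, array_sum($weights)); // range shrinks each round
        $upper = 0;
        foreach ($weights as $key => $weight) {
            $upper += $weight;                   // cumulative upper bound
            if ($rand <= $upper) {
                $order[] = $key;
                unset($weights[$key]);           // never draw this key again
                break;
            }
        }
    }
    return $order;
}

$weights = ['a' => 40, 'b' => 20, 'c' => 10, 'd' => 5, 'e' => 5,
            'f' => 5, 'g' => 5, 'h' => 5, 'i' => 3, 'j' => 2];
echo implode(',', weightedOrder($weights)), PHP_EOL; // all ten keys, random order
```

Nothing in the function depends on the number of entries or on the weights summing to 100, so the array can grow without code changes.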
黄舟 2017-05-24 11:35:58
I think there is a problem with this question. Such input does not even guarantee that a distribution that satisfies the conditions exists.
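Take {a: 60, b: 40} as an example: the space of all permutations is {ab, ba}. Then according to your definition it should be:
The probability of a appearing at position 0 is 60%, so P(ab) = 0.6
and
the probability of b appearing at position 1 is 40%, so P(ab) = 0.4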