c++ - 基数排序，有一句话看不懂？

Question

我复制了完整的c++办基数排序的算法实现如下： {代码...} {代码...}

天蓬老师 · Answer

I’ll use the code on the wiki to explain:

void radixsort(int data[], int n) //基数排序
{
    int d = maxbit(data, n);
    int *tmp = new int[n];
    int *count = new int[10]; //计数器
    int i, j, k;
    int radix = 1;
    
    for(i = 1; i <= d; i++) //进行d次排序
    {
        // 分段1
        for (j = 0; j < 10; j++) {
            count[j] = 0;
        }
        
        // 分段2    
        for (j = 0; j < n; j++) {
            k = (data[j] / radix) % 10;
            count[k]++;
        }
        
        // 分段3
        for (j = 1; j < 10; j++) {
            count[j] = count[j - 1] + count[j];
        }
        
        // 分段4
        for (j = n - 1; j >= 0; j--) {
            k = (data[j] / radix) % 10;
            tmp[count[k] - 1] = data[j];
            count[k]--;
        }
        
        // 分段5
        for (j = 0; j < n; j++) {
            data[j] = tmp[j];
        }
        
        radix = radix * 10;
    }
    delete []tmp;
    delete []count;
}

I rewrote the code here and marked the sections with comments to facilitate discussion.

Let me give you an example to illustrate. Suppose we want to sort data as follows:

int data[10] = {72, 99, 6, 25, 17, 22, 35, 64, 98, 12};

We sort based on decimal, so we need to prepare 10 buckets numbered 0~9.

If you have a basic understanding of radix sort, you will know that we will perform d rounds in sequence, and each round uses the k-th bit to allocate buckets. In this example, the largest number is a two-digit number, so:

d == 2  // 進行兩個回合

We’re only talking about what happens in one turn here. Suppose we now proceed to the first round, which is to use single digits to assign sorting.

Section 1

Nothing special here, we will clear all values from 0~9 of count to 0.

Section 2

k = (data[j] / radix) % 10; is to calculate the k-th digit of the integer data[j], which in this round represents the single digit of data[j].

So what this paragraph means is to put each data data[j] into the corresponding bucket according to its single digit number. Although we did not actually generate buckets to store the data, the corresponding count reflects the bucket. There are several data items inside.

Bucket No.	data	count
0	{ }	0
1	{ }	0
2	{72, 22, 12}	3
3	{}	0
4	{64}	1
5	{25, 35}	2
6	{6}	1
7	{17}	1
8	{98}	1
9	{99}	1

Section 3

Here is the key. The questioner’s problem lies here, but to understand this, you must understand the next paragraph. Let’s first observe what he does. Here he adds up the count corresponding to each bucket. stand up.

count[j] = count[j - 1] + count[j];

Bucket No.	data	original count	final count
0	{ }	0	`0`
1	{ }	0 (+`0`)	`0`
2	{72, 22, 12}	3 (+`0`)	`3`
3	{}	0 (+`3`)	`3`
4	{64}	1 (+`3`)	`4`
5	{25, 35}	2 (+`4`)	`6`
6	{6}	1 (+`6`)	`7`
7	{17}	1 (+`7`)	`8`
8	{98}	1 (+`8`)	`9`
9	{99}	1 (+`9`)	`10`

Section 4

This paragraph is the second key. Here we place it in order from the last item of data to tmp, and complete the task of sorting according to the k-th position in this round.

Then what does the allocation below mean:

tmp[count[k] - 1] = data[j];
count[k]--;

In fact, we can look at the meaning of the action in segment 3 like this:

Bucket No.	data	original count	final count	corresponding idx of tmp
0	{ }	0	`0`	X
1	{ }	0 (+`0`)	`0`	X
2	{72, 22, 12}	3 (+`0`)	`3`	0, 1, 2
3	{}	0 (+`3`)	`3`	X
4	{64}	1 (+`3`)	`4`	3
5	{25, 35}	2 (+`4`)	`6`	4, 5
6	{6}	1 (+`6`)	`7`	6
7	{17}	1 (+`7`)	`8`	7
8	{98}	1 (+`8`)	`9`	8
9	{99}	1 (+`9`)	`10`	9

You will find:

original count represents the amount of data in the bucket, and also represents how many tmp index
final count represents the maximum index allocated to the data in the bucket plus 1

Taking bucket number 5 as an example, the two numbers including 25 and 35 occupy the first 6 (tmp) positions of final count, so count back two idx from 6, 4 and 5 are the positions of 25 and 35 in tmp respectively.

That’s why we start from the last numbers in this paragraph (first 35 and then 25), 35 is placed in count[k]-1(5), then count[k]--, so that 25 is placed in count[k]-1(4) .

Section 5

Nothing special, copy the data in tmp back to data to complete the sorting of this round

Conclusion

I hope the above explanation can help you understand a little bit

Questions I answered: Python-QA