
Efficiency - Is float arithmetic slower than double in C++?

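 Floating-point arithmetic is always carried out in double precision; even an expression that involves only float values is converted to double first and then computed.
  So double is a little faster than float.
  The C++ standard requires float to represent at least 6 significant decimal digits and to cover a range of at least 1E-37 to 1E+37. float is usually 32 bits.
  The C++ standard gives double the same minimum range as float, 1E-37 to 1E+37, but requires it to represent at least 10 significant decimal digits. double is usually 64 bits.

VC converts float to double when compiling, so from now on just use double for floating-point arithmetic; that way you get both better precision and better speed.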
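
The minimum digit counts and ranges quoted above correspond to limits that `<float.h>` exposes, so they can be checked on a given compiler rather than taken on trust. A minimal sketch; the function names are made up for illustration and are not part of any library:

```c
#include <float.h>
#include <stdio.h>

/* Decimal digits guaranteed to survive a round trip through each type. */
int float_decimal_digits(void)  { return FLT_DIG; }
int double_decimal_digits(void) { return DBL_DIG; }

/* Nonzero if the type covers at least 1E-37 .. 1E+37. */
int float_range_ok(void)  { return FLT_MIN <= 1e-37f && FLT_MAX >= 1e37f; }
int double_range_ok(void) { return DBL_MIN <= 1e-37 && DBL_MAX >= 1e37; }

/* Print the limits of the current implementation. */
void print_float_limits(void)
{
    printf("float : %d digits, %e .. %e, %zu bytes\n",
           FLT_DIG, FLT_MIN, FLT_MAX, sizeof(float));
    printf("double: %d digits, %e .. %e, %zu bytes\n",
           DBL_DIG, DBL_MIN, DBL_MAX, sizeof(double));
}
```

Calling `print_float_limits()` from `main` shows the actual values. On mainstream IEEE 754 platforms `FLT_DIG` is 6 and `DBL_DIG` is 15, with float at 4 bytes and double at 8.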


This is an answer I found online. But when no compiler or language is specified, many posts online simply say that float is faster. So now I'm confused.
巴扎黑, 2766 days ago

Replies (2)

  • 伊谢尔伦, 2017-04-17 13:08:16

    With VC++ there are generally two cases.

    The first case is a 32-bit build that uses the x87 instruction set (the usual result when SSE2 code generation is not enabled). The x87 unit has a small stack of eight registers, each holding an 80-bit extended-precision value. Whether you use float, double, or another type, a value is widened to that 80-bit format when it is pushed onto the stack and converted back to your type only when it is stored. So float and double should run at almost the same speed.
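
Whether a given build evaluates float expressions in a wider format can be queried from C99's `FLT_EVAL_METHOD` macro and the `float_t` / `double_t` typedefs. A minimal sketch; the helper names are hypothetical:

```c
#include <float.h>   /* FLT_EVAL_METHOD */
#include <math.h>    /* float_t, double_t */
#include <stddef.h>

/* 0: each type is evaluated in its own precision
   1: float and double are both evaluated as double
   2: everything is evaluated as long double
   negative: indeterminate or implementation-defined */
int eval_method(void) { return FLT_EVAL_METHOD; }

/* Size of the type actually used to evaluate float expressions. */
size_t float_eval_size(void)  { return sizeof(float_t); }

/* Size of the type actually used to evaluate double expressions. */
size_t double_eval_size(void) { return sizeof(double_t); }
```

As far as I know, GCC targeting x87 typically reports 2 and SSE2 builds report 0, but that is worth checking on your own toolchain rather than assuming.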

    The second case is a 64-bit build, a build with SSE or AVX code generation turned on, or code that uses these instruction sets directly through intrinsics. These instructions operate on float and double natively and do not widen everything to one common format the way x87 does. A double takes twice the space of a float, so a register of a given size holds only half as many doubles as floats. Once the code is vectorized, float can therefore be much faster than double.
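
The register arithmetic behind that claim is easy to spell out. A minimal sketch, assuming the usual 32-bit float and 64-bit double; the function names are made up, and `add_floats` is only an example of the kind of loop a compiler may vectorize, not a guarantee that it will:

```c
#include <limits.h>
#include <stddef.h>

/* How many values of each type fit in a SIMD register of the given width. */
size_t lanes_float(size_t register_bits)
{
    return register_bits / (sizeof(float) * CHAR_BIT);
}

size_t lanes_double(size_t register_bits)
{
    return register_bits / (sizeof(double) * CHAR_BIT);
}

/* Element-wise add over independent elements: a candidate for
   auto-vectorization, processing lanes_float(width) values per step. */
void add_floats(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

For a 128-bit SSE register this gives 4 floats versus 2 doubles, and for a 256-bit AVX register 8 versus 4.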

    Of course, float precision is often simply not enough, and when you write intrinsics by hand your own skill matters far more than the choice between float and double. So it depends on your needs; requirements come first.

  • 大家讲道理, 2017-04-17 13:08:16

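    double is faster.
    I wrote a test program:

    ```c
    #include <stdio.h>

    int main(void)
    {
        /* float f1 = 0.0; */   /* swap with the next line to test float */
        double f1 = 0.0;
        int i, j;
        for (i = 0; i < 100000; i++) {
            for (j = 0; j < 10000; j++)
                f1 += 1.1;      /* note: 1.1 is a double literal */
            f1 -= 11000;
        }
        printf("%f\n", f1);
        return 0;
    }
    ```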
    float:
    root@i5a:~/test# time ./a.out
    -1412.595703

    real 0m3.063s
    user 0m3.065s
    sys 0m0.000s

    double:
    time ./a.out
    0.000204

    real 0m0.843s
    user 0m0.840s
    sys 0m0.004s

    double is nearly 4 times faster.
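
The two runs also print very different values (-1412.595703 versus 0.000204). That gap reflects accumulated rounding error, not speed. A minimal sketch of a single pass of the inner loop for each type; the function names are hypothetical:

```c
/* One pass of the benchmark's inner loop with a float accumulator.
   Each addition is rounded back to float, so the error builds up. */
float one_pass_float(void)
{
    float f1 = 0.0f;
    for (int j = 0; j < 10000; j++)
        f1 += 1.1;
    f1 -= 11000;
    return f1;
}

/* The same pass with a double accumulator. */
double one_pass_double(void)
{
    double f1 = 0.0;
    for (int j = 0; j < 10000; j++)
        f1 += 1.1;
    f1 -= 11000;
    return f1;
}
```

Ideally both would return 0. The double version stays very close to that, while the float version is visibly off after only 10000 additions, which is the precision cost the first answer warns about.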
    Now look at the assembly from gcc -c -S, loop body only.
    double:

    .L2:
            movl    $10000, %eax
            .p2align 4,,10
            .p2align 3
    .L5:
            subl    $1, %eax
            addsd   %xmm1, %xmm0
            jne     .L5
            subl    $1, %edx
            subsd   %xmm2, %xmm0
            jne     .L2
            

    Now the float version:

    .L2:
            movl    $10000, %eax
            .p2align 4,,10
            .p2align 3
    .L5:
            unpcklps        %xmm0, %xmm0
            subl    $1, %eax
            cvtps2pd        %xmm0, %xmm0
            addsd   %xmm1, %xmm0
            unpcklpd        %xmm0, %xmm0
            cvtpd2ps        %xmm0, %xmm0
            jne     .L5
            subl    $1, %edx
            subss   %xmm2, %xmm0
            jne     .L2
            unpcklps        %xmm0, %xmm0
            movl    $.LC3, %edi
            movl    $1, %eax
            cvtps2pd        %xmm0, %xmm0
            jmp     printf
            
            

    Both were compiled with -O2.
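
One thing the answer does not mention: the float loop contains `cvtps2pd` and `cvtpd2ps` around an `addsd`, which is what you would expect when a float is added to a double constant. In C, `1.1` is a double literal, so `f1 += 1.1` on a float converts to double, adds, and converts back. A minimal sketch that shows the type difference, using hypothetical function names:

```c
#include <stddef.h>

/* Size of the type of `f + 1.1`: the literal is a double, so the float
   operand is converted and the addition is done in double. */
size_t mixed_add_size(void)
{
    float f = 0.0f;
    return sizeof(f + 1.1);
}

/* Size of the type of `f + 1.1f`: both operands are float, so the
   expression itself has type float. */
size_t float_add_size(void)
{
    float f = 0.0f;
    return sizeof(f + 1.1f);
}
```

If that reading is right, the benchmark is timing conversions as much as float arithmetic. Writing `f1 += 1.1f` would be the fairer float test, though the timings would have to be measured again before drawing any conclusion.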

    Now compile it as 32-bit.
    double:
    .L8:
            fxch    %st(1)
    .L2:
            movl    $10000, %eax
            .p2align 4,,7
            .p2align 3
    .L5:
            subl    $1, %eax
            fadd    %st, %st(1)
            jne     .L5
            fxch    %st(1)
            subl    $1, %edx
            fsubs   .LC2
            jne     .L8
    

    float:
    .L9:
            fxch    %st(1)
    .L2:
            movl    $10000, %eax
            jmp     .L5
            .p2align 4,,7
            .p2align 3
    .L8:
            fxch    %st(1)
    .L5:
            fadd    %st, %st(1)
            fxch    %st(1)
            subl    $1, %eax
            fstps   12(%esp)
            flds    12(%esp)
            jne     .L8
            subl    $1, %edx
            fsubs   .LC2
            jne     .L9

    The timings are similar to the 64-bit build: 0.85 seconds for double and 2.78 seconds for float, so float is slightly faster here than in the 64-bit build.
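
For anyone who wants to repeat the comparison without rebuilding between runs, here is a minimal sketch that times both types in one process with `clock()`. The function names are made up, and the float loop uses a float literal on the assumption that you want to measure float arithmetic without the conversion to double:

```c
#include <time.h>

/* Time `iterations` float additions. The sum is written to *result so
   the compiler cannot discard the loop. Returns CPU seconds. */
double time_float_adds(long iterations, float *result)
{
    float acc = 0.0f;
    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        acc += 1.1f;            /* float literal: no double conversion */
    clock_t stop = clock();
    *result = acc;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}

/* The same measurement with a double accumulator. */
double time_double_adds(long iterations, double *result)
{
    double acc = 0.0;
    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        acc += 1.1;
    clock_t stop = clock();
    *result = acc;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Call both with the same iteration count and compare the returned times. The numbers depend on compiler, flags, and CPU, so treat a single run as a hint only.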
