"Floating-point arithmetic is carried out in double precision; even an expression containing only floats is first converted to double before being computed. So the double type is a bit faster than float.
The C++ standard requires that float be accurate to at least 6 significant decimal digits, with a range of at least 1.0E-37 to 1.0E+37. float is usually 32 bits.
The C++ standard specifies the same minimum range for double, 1.0E-37 to 1.0E+37, but requires double to be accurate to at least 10 significant decimal digits. double is usually 64 bits.
When compiling with VC, float gets converted to double, so it is recommended to just use double for floating-point arithmetic: you both keep the precision and gain speed."
That is an answer I found online. But many posts on the web, without specifying the compiler or language, simply claim that float is faster. So now I'm confused.
伊谢尔伦 2017-04-17 13:08:16
For VC++, there are generally two situations.
The first case is when you compile a 32-bit program: it uses the x87 instruction set. x87 has a small internal register stack, and each element of that stack is an 80-bit extended-precision floating-point value. Regardless of whether you use float, double, or some other type, pushing a value onto the stack normalizes it into that same 80-bit format, and it is only converted back after all the calculations are done. So the speeds should be almost the same.
The second case is when you compile a 64-bit program, enable MMX/SSE/AVX instruction-set optimizations, or use intrinsics to call these instruction sets directly. These instructions support float and double natively and do not convert them to a common x87 format. A double is twice the size of a float, so a register of a given width holds only half as many doubles as floats. After vectorization, float can therefore be much faster than double.
Of course, in many cases float's precision is simply not enough, and when you write intrinsics yourself, your own skill affects performance by orders of magnitude more than the float/double choice does. So it depends on your needs; demand dominates.
大家讲道理 2017-04-17 13:08:16
double is the faster one.
I wrote a test program:

```c
#include <stdio.h>

int main(void)
{
    //float f1 = 0.0;
    double f1 = 0.0;
    int i, j;
    for (i = 0; i < 100000; i++) {
        for (j = 0; j < 10000; j++)
            f1 += 1.1;
        f1 -= 11000;
    }
    printf("%f\n", f1);
    return 0;
}
```
float:
root@i5a:~/test# time ./a.out
-1412.595703
real 0m3.063s
user 0m3.065s
sys 0m0.000s
double:
root@i5a:~/test# time ./a.out
0.000204
real 0m0.843s
user 0m0.840s
sys 0m0.004s
The difference is nearly a factor of 4.
Let's look at the assembly from gcc -S, just the loop body.
double:

```
.L2:
    movl    $10000, %eax
    .p2align 4,,10
    .p2align 3
.L5:
    subl    $1, %eax
    addsd   %xmm1, %xmm0
    jne     .L5
    subl    $1, %edx
    subsd   %xmm2, %xmm0
    jne     .L2
```
Now let's look at float:

```
.L2:
    movl    $10000, %eax
    .p2align 4,,10
    .p2align 3
.L5:
    unpcklps %xmm0, %xmm0
    subl    $1, %eax
    cvtps2pd %xmm0, %xmm0
    addsd   %xmm1, %xmm0
    unpcklpd %xmm0, %xmm0
    cvtpd2ps %xmm0, %xmm0
    jne     .L5
    subl    $1, %edx
    subss   %xmm2, %xmm0
    jne     .L2
    unpcklps %xmm0, %xmm0
    movl    $.LC3, %edi
    movl    $1, %eax
    cvtps2pd %xmm0, %xmm0
    jmp     printf
```
-O2 optimization has been turned on.
Let’s take a look at compiling to 32-bit.
double:

```
.L8:
    fxch    %st(1)
.L2:
    movl    $10000, %eax
    .p2align 4,,7
    .p2align 3
.L5:
    subl    $1, %eax
    fadd    %st, %st(1)
    jne     .L5
    fxch    %st(1)
    subl    $1, %edx
    fsubs   .LC2
    jne     .L8
```
float:

```
.L9:
    fxch    %st(1)
.L2:
    movl    $10000, %eax
    jmp     .L5
    .p2align 4,,7
    .p2align 3
.L8:
    fxch    %st(1)
.L5:
    fadd    %st, %st(1)
    fxch    %st(1)
    subl    $1, %eax
    fstps   12(%esp)
    flds    12(%esp)
    jne     .L8
    subl    $1, %edx
    fsubs   .LC2
    jne     .L9
```
The test results are similar to 64-bit: 0.85 seconds for double and 2.78 seconds for float, making 32-bit float slightly faster than the 64-bit float run.