
GNU GCC does not round floating-point divisions to the nearest value

source link: https://lemire.me/blog/2020/06/26/gcc-not-nearest/

I know that floating-point arithmetic is a bit crazy on modern computers. For example, floating-point addition is not associative:

0.1+(0.2+0.3) == 0.599999999999999978
(0.1+0.2)+0.3 == 0.600000000000000089

But, at least, this is fairly consistent in my experience. You should simply not assume that fancy properties like associativity hold in the real world.
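
Here is a minimal C sketch that should reproduce the two sums above; the grouping determines which intermediate sum gets rounded, which is why the last digits differ:

#include <stdio.h>

int main(void) {
  double a = 0.1, b = 0.2, c = 0.3;
  // The grouping decides which intermediate sum gets rounded,
  // so the two results differ in their last bits.
  printf("0.1+(0.2+0.3) == %18.18f\n", a + (b + c));
  printf("(0.1+0.2)+0.3 == %18.18f\n", (a + b) + c);
  return 0;
}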

Today I stumbled on a fun puzzle. Consider the following:

double x = 50178230318.0;
double y = 100000000000.0;
double ratio = x/y;

If God did exist, the variable ratio would be 0.50178230318 and the story would end there. Unfortunately, there is no floating-point number that is exactly 0.50178230318. Instead it falls between the floating-point number 0.501782303179999944 and the floating-point number 0.501782303180000055.

It is important to be a bit more precise. The 64-bit floating-point standard represents numbers as a 53-bit mantissa multiplied by a power of two. So let us spell it out the way the computer sees it:

0.501782303179999944 == 4519653187245114  * 2 ** -53
0.501782303180000055 == 4519653187245115  * 2 ** -53

We have to pick the mantissa 4519653187245114 or the mantissa 4519653187245115. There is no way to represent exactly anything that falls in-between using 64-bit floating-point numbers. So where does 0.50178230318 fall exactly? We have approximately…

0.50178230318 = 4519653187245114.50011795456 * 2 ** -53

So the number is best approximated by the larger of the two values (0.501782303180000055).
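
As a sanity check, the two candidates can be rebuilt directly from their mantissas; the ldexp scaling by a power of two is exact, so the following small C program (link with -lm if your libc requires it) should print the two neighbours and confirm that no double lies between them:

#include <math.h>
#include <stdio.h>

int main(void) {
  // The two consecutive doubles that bracket 0.50178230318,
  // built from their 53-bit mantissas.
  double below = ldexp(4519653187245114.0, -53);
  double above = ldexp(4519653187245115.0, -53);
  printf("below = %18.18f\n", below);  // 0.501782303179999944
  printf("above = %18.18f\n", above);  // 0.501782303180000055
  // nextafter shows there is no double strictly in between.
  printf("adjacent: %d\n", nextafter(below, 1.0) == above);
  return 0;
}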

Python gets it right:

>>> "%18.18f" % (50178230318.0/100000000000.0)
'0.501782303180000055'

JavaScript gets it right:

> 0.50178230318 == 0.501782303180000055
true
> 0.50178230318 == 0.501782303179999944
false

So the story would end there, right?

Let us spin up the GNU GCC 7 compiler for x86 systems and run the following C/C++ program:

#include <stdio.h>
int main() {
  double x = 50178230318.0;
  double y = 100000000000.0;
  double ratio = x/y;
  printf("x/y  = %18.18f\n", ratio);
}

Can you predict the result?

$ g++ -o round round.cpp
$ ./round
x/y  = 0.501782303179999944

So GNU GCC actually picks the smaller and farther value, instead of the nearest value. This remains true even if you specifically ask the compiler to round to nearest (fesetround(FE_TONEAREST)).
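
For reference, a minimal sketch of such an experiment (not the author's exact test script) might look like this; per the result above, the printed value is still 0.501782303179999944 with GCC's default x87 code generation on 32-bit x86:

#include <fenv.h>
#include <stdio.h>

int main(void) {
  // Explicitly request round-to-nearest (it is already the default mode).
  fesetround(FE_TONEAREST);
  volatile double x = 50178230318.0;   // volatile: prevent constant folding
  volatile double y = 100000000000.0;
  double ratio = x / y;
  printf("x/y  = %18.18f\n", ratio);
  return 0;
}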

You may doubt me, so I have created a docker-based test.

You might think that this is a bug that should be reported, right?

There are dozens, if not hundreds, of similar reports to the GNU GCC team. They are being flagged as invalid.

Let me recap: the GNU GCC compiler may round the result of a division between two floating-point numbers to a value that is not the nearest. And it is not considered a bug.

The explanation is that the compiler first rounds to nearest using 80-bit precision and then rounds that result again to the 64-bit double. This is what fancy numerical folks call FLT_EVAL_METHOD = 2.
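
You can mimic this double rounding explicitly. On x86 targets where GCC's long double is the 80-bit extended type, the extended-precision quotient lands exactly on the halfway point 4519653187245114.5 * 2 ** -53, and the second rounding (ties-to-even) then picks the even, smaller mantissa. A sketch, assuming such a target and the default x87 precision control:

#include <stdio.h>

int main(void) {
  volatile double x = 50178230318.0;    // volatile: keep the optimizer from
  volatile double y = 100000000000.0;   // folding the division at compile time
  // First rounding: divide in 80-bit extended precision
  // (assumes long double is the x87 extended type).
  long double extended = (long double)x / (long double)y;
  // Second rounding: convert the extended result back to a 64-bit double.
  double twice_rounded = (double)extended;
  printf("rounded twice = %18.18f\n", twice_rounded);  // 0.501782303179999944
  return 0;
}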

However, the value of FLT_EVAL_METHOD remains at 2 even if you add optimization flags such as -O2, and yet the result will change. The explanation is that the optimizer computes the division at compile time and does so while ignoring the FLT_EVAL_METHOD value.
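
You can check which evaluation method your build targets with the standard FLT_EVAL_METHOD macro from <float.h>:

#include <float.h>
#include <stdio.h>

int main(void) {
  // FLT_EVAL_METHOD (C99):
  //  0 -> evaluate in each type's own precision
  //  1 -> evaluate float and double in double precision
  //  2 -> evaluate float and double in long double precision (80-bit x87 on x86)
  // -1 -> indeterminate
  printf("FLT_EVAL_METHOD = %d\n", (int)FLT_EVAL_METHOD);
  return 0;
}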

You can also try to pass GNU GCC the flags -msse -mfpmath=sse, as experts recommend, but as my script demonstrates, it does not solve the issue (and then you get FLT_EVAL_METHOD = -1). You also need to add an appropriate target (e.g., -msse -mfpmath=sse -march=pentium4).

If you are confused as to why all of this could be possible without any of it being a bug, welcome to the club. You can examine the assembly on godbolt.

