FLECKmarks: Measuring Floating Point Performance using a
Full IEEE Compliant Arithmetic
BenchmarK, with David Gay
Errata
The "subnormal" timings for x86 processors are incorrect; those
processors do not perform operations on subnormals at full speed.
However, the set of values that are subnormal on the x86 differs from
that on other processors because of that architecture's unusual
floating-point register design. On the x86, when a 64-bit double value
is loaded into a register, it has the 15-bit exponent of the 80-bit
double extended format instead of the 11-bit exponent of the double
format.
Confusingly, this occurs even if the processor's precision control is
set to round to double precision. The test programs in this
project used computations that would be subnormal in a pure double
format but not in double with extended exponent range (non-zero
subnormals in double with extended exponent range would round to zero
in pure double). Operations on subnormals in double precision with
extended exponent range should take about 100 cycles on the Pentium
Pro and subsequent Intel x86 chips.
Thanks to David Scott of Intel for pointing out this error.
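
The distinction drawn in the erratum can be illustrated with a small
sketch. It is not part of the original test programs; the helper
is_subnormal and the two exponent constants are names introduced here
for illustration. The sketch assumes the usual minimum normal
exponents: 2**-1022 for the double format and 2**-16382 for the x87
double extended format. It only classifies values by magnitude and
does not measure timing or touch the x87 registers.

```python
import math
import sys

# Smallest normal magnitude is 2**MIN_EXP in each format (assumed constants).
DOUBLE_MIN_EXP = -1022      # IEEE 754 double, 11-bit exponent
EXTENDED_MIN_EXP = -16382   # x87 double extended, 15-bit exponent


def is_subnormal(x, min_exp):
    """Return True if nonzero x is smaller in magnitude than 2**min_exp."""
    if x == 0.0:
        return False
    _, e = math.frexp(x)    # x = m * 2**e with 0.5 <= |m| < 1
    return (e - 1) < min_exp


# 2**-1024: below the smallest normal double, so subnormal in pure double.
x = sys.float_info.min / 4

print(is_subnormal(x, DOUBLE_MIN_EXP))    # subnormal in pure double
print(is_subnormal(x, EXTENDED_MIN_EXP))  # normal with the 15-bit exponent

# A value that would be subnormal with the extended exponent range lies
# far below the smallest double subnormal (2**-1074), so it becomes zero
# when represented as a pure double.
tiny = math.ldexp(1.0, EXTENDED_MIN_EXP - 1)
print(tiny == 0.0)
```

A value such as 2**-1024 is therefore handled as an ordinary normal
number once it sits in an x87 register, which is consistent with the
erratum's explanation of why the original test programs did not
exercise the subnormal path on the x86.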