#
352835 |
|
28-Sep-2019 |
dim |
MFC r352710:
Do not left-shift a negative number (inducing undefined behavior in C/C++) in exp(3), expf(3), expm1(3) and expm1f(3) during intermediate computations that compute the IEEE-754 bit pattern for |2**k| for integer |k|.
The implementations of exp(3), expf(3), expm1(3) and expm1f(3) need to compute IEEE-754 bit patterns for 2**k in certain places. (k is an integer and 2**k is exactly representable in IEEE-754.)
Currently they do things like 0x3FF0'0000+(k<<20), which is to say they take the bit pattern representing 1 and then add directly to the exponent field to get the desired power of two. This is fine when k is non-negative.
But when k<0 (and certain classes of input trigger this), this left-shifts a negative number -- an operation with undefined behavior in C and C++.
The desired semantics can be achieved by instead adding the possibly-negative k to the IEEE-754 exponent bias to get the desired exponent field, _then_ shifting that into its proper overall position.
(Note that in case of s_expm1.c and s_expm1f.c, there are SET_HIGH_WORD and SET_FLOAT_WORD uses further down in each of these files that perform shift operations involving k, but by these points k's range has been restricted to 2 < k <= 56, and the shift operations under those circumstances can't do anything that would be UB.)
Submitted by: Jeff Walden, https://github.com/jswalden Obtained from: https://github.com/freebsd/freebsd/pull/411 Obtained from: https://github.com/freebsd/freebsd/pull/412
|
#
256281 |
|
10-Oct-2013 |
gjb |
Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
251024 |
|
27-May-2013 |
das |
Fix some regressions caused by the switch from gcc to clang. The fixes are workarounds for various symptoms of the problem described in clang bugs 3929, 8100, 8241, 10409, and 12958.
The regression tests did their job: they failed, someone brought it up on the mailing lists, and then the issue got ignored for 6 months. Oops. There may still be some regressions for functions we don't have test coverage for yet.
|
#
241887 |
|
22-Oct-2012 |
imp |
Revert r241756
|
#
241756 |
|
19-Oct-2012 |
imp |
Document the method used to compute expf. Taken from exp, with changes to reflect differences in computation between the two.
|
#
226596 |
|
21-Oct-2011 |
das |
Use STRICT_ASSIGN() to ensure that the compiler doesn't screw things up by storing x in a wider type than it's supposed to.
Submitted by: bde
|
#
218508 |
|
10-Feb-2011 |
das |
Fix a bogus threshold that was copied from the double precision version. This commit should have no effect on correctness; it merely changes the threshold at which a simpler approximation can be used.
Reviewed by: bde
|
#
176451 |
|
22-Feb-2008 |
das |
s/rcsid/__FBSDID/
|
#
176074 |
|
07-Feb-2008 |
bde |
Use a better method of scaling by 2**k. Instead of adding to the exponent bits of the reduced result, construct 2**k (hopefully in parallel with the construction of the reduced result) and multiply by it. This tends to be much faster if the construction of 2**k is actually in parallel, and might be faster even with no parallelism since adjustment of the exponent requires a read-modify-wrtite at an unfortunate time for pipelines.
In some cases involving exp2* on amd64 (A64), this change saves about 40 cycles or 30%. I think it is inherently only about 12 cycles faster in these cases and the rest of the speedup is from partly-accidentally avoiding compiler pessimizations (the construction of 2**k is now manually scheduled for good results, and -O2 doesn't always mess this up). In most cases on amd64 (A64) and i386 (A64) the speedup is about 20 cycles. The worst case that I found is expf on ia64 where this change is a pessimization of about 10 cycles or 5%. The manual scheduling for plain exp[f] is harder and not as tuned.
This change ld128/s_exp2l.c has not been tested.
|
#
176032 |
|
06-Feb-2008 |
bde |
As for the float trig functions and logf, use a minimax polynomial that is specialized for float precision. The new polynomial has degree 5 instead of 11, and a maximum error of 2**-27.74 ulps instead of 2**-30.64. This doesn't affect the final error significantly; the maximum error was and is about 0.9101 ulps on amd64 -01 and the number of cases with an error of > 0.5 ulps is actually reduced by epsilon despite the larger error in the polynomial.
This is about 15% faster on amd64 (A64), i386 (A64) and ia64. The asm version is still used instead of this on i386 since it is faster and more accurate.
|
#
175468 |
|
18-Jan-2008 |
das |
Use volatile hacks to make sure these functions generate an underflow exception when they're supposed to. Previously, gcc -O2 was optimizing away the statement that generated it.
|
#
152947 |
|
30-Nov-2005 |
bde |
Fixed the hi+lo approximation to log(2). The normal 17+24 bit decomposition that was used doesn't work normally here, since we want to be able to multiply `hi' by the exponent of x _exactly_, and the exponent of x has more than 7 significant bits for most denormal x's, so the multiplication was not always exact despite a cloned comment claiming that it was. (The comment is correct in the double precision case -- with the normal 33+53 bit decomposition the exponent can have 20 significant bits and the extra bit for denormals is only the 11th.)
Fixing this had little or no effect for denormals (I think because more precision is inherently lost for denormals than is lost by roundoff errors in the multiplication).
The fix is to reduce the precision of the decomposition to 16+24 bits. Due to 2 bugs in the old deomposition and numerical accidents, reducing the precision actually increased the precision of hi+lo. The old hi+lo had about 39 bits instead of at least 41 like it should have had. There were off-by-1-bit errors in each of hi and lo, apparently due to mistranslation from the double precision hi and lo. The correct 16 bit hi happens to give about 19 bits of precision, so the correct hi+lo gives about 43 bits instead of at least 40. The end result is that expf() is now perfectly rounded (to nearest) except in 52561 cases instead of except in 67027 cases, and the maximum error is 0.5013 ulps instead of 0.5023 ulps.
|
#
142369 |
|
24-Feb-2005 |
das |
Revert rev 1.8, which causes small (e.g. 2 ulp) errors for some inputs. The trouble with replacing two floats with a double is that the latter has 6 extra bits of precision, which actually hurts accuracy in many cases. All of the constants are optimal when float arithmetic is used, and would need to be recomputed to do this right.
Noticed by: bde (ucbtest)
|
#
142181 |
|
21-Feb-2005 |
das |
Use double arithmetic instead of simulating it with two floats. This results in a performance gain on the order of 10% for amd64 (sledge), ia64 (pluto1), i386+SSE (Pentium 4), and sparc64 (panther), and a negligible improvement for i386 without SSE. (The i386 port still uses the hardware instruction, though.)
|
#
97407 |
|
28-May-2002 |
alfred |
Assume __STDC__, remove non-__STDC__ code.
Submitted by: keramida
|
#
50476 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
22993 |
|
22-Feb-1997 |
peter |
Revert $FreeBSD$ to $Id$
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
17141 |
|
12-Jul-1996 |
jkh |
General -Wall warning cleanup, part I. Submitted-By: Kent Vander Velden <graphix@iastate.edu>
|
#
8870 |
|
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
#
2117 |
|
19-Aug-1994 |
jkh |
This commit was generated by cvs2svn to compensate for changes in r2116, which included commits to RCS files with non-trunk default branches.
|
#
2116 |
|
19-Aug-1994 |
jkh |
J.T. Conklin's latest version of the Sun math library.
-- Begin comments from J.T. Conklin: The most significant improvement is the addition of "float" versions of the math functions that take float arguments, return floats, and do all operations in floating point. This doesn't help (performance) much on the i386, but they are still nice to have.
The float versions were orginally done by Cygnus' Ian Taylor when fdlibm was integrated into the libm we support for embedded systems. I gave Ian a copy of my libm as a starting point since I had already fixed a lot of bugs & problems in Sun's original code. After he was done, I cleaned it up a bit and integrated the changes back into my libm. -- End comments
Reviewed by: jkh Submitted by: jtc
|