History log of /freebsd-9.3-release/lib/msun/src/math_private.h
Revision	Date	Author	Comments
# 267654	19-Jun-2014	gjb	Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle. Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
# 239529	21-Aug-2012	dim	MFC r239192: Change a few extern inline functions in libm to static inline, since they need to refer to static constants, which C99 does not allow for extern inline functions. While here, change a comment in e_rem_pio2f.c to mention the correct number of bits. Reviewed by: bde MFC r239195: Add __always_inline to __ieee754_rem_pio2() and __ieee754_rem_pio2f(), since some older versions of gcc refuse to inline these otherwise. Requested by: bde
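The static-versus-extern inline distinction in r239192 can be illustrated with a minimal sketch. The names `half_pi` and `subtract_half_pi` are hypothetical and do not come from libm. In C99, an inline definition of a function with external linkage may not refer to an identifier with internal linkage, so a helper that reads a file-local constant is declared `static inline` instead:

```c
/* Hypothetical file-local constant (internal linkage); not taken from libm. */
static const double half_pi = 1.5707963267948966;

/*
 * Declared static inline because the body refers to a static constant.
 * C99 does not permit such a reference from an inline definition of a
 * function with external linkage.
 */
static inline double
subtract_half_pi(double x)
{
	return (x - half_pi);
}
```

A caller in the same translation unit would simply write `subtract_half_pi(x)`; each file that includes the definition gets its own private copy.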
# 229839	09-Jan-2012	das	MFC various fma{,f,l} improvements: r226245 - refactoring r226371 - fix double-rounding bug r226373 - new math_private.h macros r226601 - fix nit in r226371
# 225736	22-Sep-2011	kensmith	Copy head to stable/9 as part of 9.0-RELEASE release cycle. Approved by: re (implicit)
# 216211	05-Dec-2010	das	Add log2() and log2f().
# 194000	11-Jun-2009	ed	Use the documented machine constraint for SSE registers. The amd64-specific bits of msun use an undocumented constraint, which is less likely to be supported by other compilers (such as Clang). Change the code to use a more common machine constraint. Obtained from: /projects/clangbsd/
# 193368	03-Jun-2009	ed	Use ISO C99 style inline semantics in msun. Because we use ISO C99 nowadays, we can just get rid of enforcing GNU89-style inlining.
# 189803	14-Mar-2009	das	Eliminate __real__ and __imag__ gccisms.
# 186461	23-Dec-2008	marcel	Add support for the FPA floating-point format on ARM. The FPA floating-point format is identical to the VFP format, but is always stored in big-endian. Introduce _IEEE_WORD_ORDER to describe the byte-order of the FP representation. Obtained from: Juniper Networks, Inc
# 176552	25-Feb-2008	bde	Change __ieee754_rem_pio2f() to return double instead of float so that this function and its callers cosf(), sinf() and tanf() don't waste time converting values from doubles to floats and back for |x| > 9pi/4. All these functions were optimized a few years ago to mostly use doubles internally and across the __kernel() interfaces but not across the __ieee754_rem_pio2f() interface. This saves about 40 cycles in cosf(), sinf() and tanf() for |x| > 9pi/4 on amd64 (A64), and about 20 cycles on i386 (A64) (except for cosf() and sinf() in the upper range). 40 cycles is about 35% for |x| < 9pi/4 <= 2**19pi/2 and about 5% for |x| > 2**19pi/2. The saving is much larger on amd64 than on i386 since the conversions are not easy to optimize except on i386 where some of them are automatic and others are optimized invalidly. amd64 is still about 10% slower in cosf() and tanf() in the lower range due to conversion overhead. This also gives a tiny speedup for |x| <= 9pi/4 on amd64 (by simplifying the code). It also avoids compiler bugs and/or additional slowness in the conversions on (not yet supported) machines where double_t != double.
# 176462	22-Feb-2008	bde	Add an irint() function in inline asm for amd64 and i386. irint() is the same as lrint() except it returns int instead of long. Though the extern lrint() is fairly fast on these arches, it still takes about 12 cycles longer than the inline version, and 12 cycles is a lot in applications where [li]rint() is used to avoid slow conversions that are only a couple of times slower. This is only for internal use. The libm versions of rint() should also be inline, but that would take more header engineering. Implementing irint() instead of lrint() also avoids a conflict with the extern declaration of the latter.
# 176356	17-Feb-2008	das	Add more pi for long doubles. Also, avoid storing multiple copies of the pi/2 array, as it is unlikely to vary, except in Indiana.
# 175503	19-Jan-2008	bde	Do an ordinary assignment in STRICT_ASSIGN() except for floats until there is a problem with non-floats (when i386 defaults to extra precision). This essentially restores yesterday's behaviour for doubles on i386 (since generic rint() isn't used and everywhere else assumed working assignment), but for arches that use the generic rint() it finishes restoring some of 1995's behaviour (don't waste time doing unnecessary store/load).
# 175403	17-Jan-2008	bde	Add a macro STRICT_ASSIGN() to help avoid the compiler bug that assignments and casts don't clip extra precision, if any. The implementation is to assign to a temporary volatile variable and read the result back to assign to the original lvalue. lib/msun currently has 2 different hard-coded hacks to avoid the problem in just a few places and needs it in a few more places. One variant uses volatile for the original lvalue. This works but is slower than necessary. Another temporarily casts the lvalue to volatile. This broke with gcc-4.2.1 or earlier (gcc now stores to the lvalue but doesn't load from it).
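The volatile-temporary technique that r175403 describes can be sketched as follows. This is an illustration written from the commit message, not the macro from math_private.h; the names `STRICT_ASSIGN_SKETCH` and `sum_as_float` are hypothetical.

```c
/*
 * Sketch of the idea in the commit message: store the value in a volatile
 * temporary of the target type, then read it back into the real lvalue.
 * The volatile store forces the value to be rounded to `type`, discarding
 * any extra precision held in wider registers.
 * Hypothetical name; not the macro from math_private.h.
 */
#define STRICT_ASSIGN_SKETCH(type, lval, rval) do {	\
	volatile type strict_tmp_ = (rval);		\
	(lval) = strict_tmp_;				\
} while (0)

/* Hypothetical helper: add two floats and clip the sum to float precision. */
static float
sum_as_float(float a, float b)
{
	float r;

	STRICT_ASSIGN_SKETCH(float, r, a + b);
	return (r);
}
```

On targets that already evaluate float expressions in float precision, the macro behaves like a plain assignment, at the cost of one extra store and load.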
# 174759	18-Dec-2007	das	Since nan() is supposed to work the same as strtod("nan(...)", NULL), my original implementation made both use the same code. Unfortunately, this meant libm depended on a vendor header at compile time and previously-unexposed vendor bits in libc at runtime. Hence, I just wrote my own version of the relevant vendor routine. As it turns out, mine has a factor of 8 fewer lines of code, and is a bit more readable anyway. The strtod() and *scanf() routines still use vendor code. Reviewed by: bde
# 152869	28-Nov-2005	bde	Use only double precision for "kernel" cosf and sinf (except for returning float). The functions are renamed from __kernel_{cos,sin}f() to __kernel_{cos,sin}df() so that misuses of them will cause link errors and not crashes. This version is an almost-routine translation with no special optimizations for accuracy or efficiency. The not-quite-routine part is that in __kernel_cosf(), regenerating the minimax polynomial with double precision coefficients gives a coefficient for the x**2 term that is not quite -0.5, so the literal 0.5 in the code and the related `hz' variable need to be modified; also, the special code for reducing the error in 1.0-x**2*0.5 is no longer needed, so it is convenient to adjust all the logic for the x**2 term a little. Note that without extra precision, it would be very bad to use a coefficient of other than -0.5 for the x**2 term -- the old version depends on multiplication by -0.5 being infinitely precise so as not to need even more special code for reducing the error in 1-x**2*0.5. This gives an unimportant increase in accuracy, from ~0.8 to ~0.501 ulps. Almost all of the error is from the final rounding step, since the choice of the minimax polynomials so that their contribution to the error is a bit less than 0.5 ulps just happens to give contributions that are significantly less (~.001 ulps). On Athlons, for uniformly distributed args in [-2pi, 2pi], this gives overall speed increases in the 10-20% range, despite giving a speed decrease of typically 19% (from 31 cycles up to 37) for sinf() on args in [-pi/4, pi/4].
# 152713	23-Nov-2005	bde	Use only double precision for "kernel" tanf (except for returning float). This is a minor interface change. The function is renamed from __kernel_tanf() to __kernel_tandf() so that misuses of it will cause link errors and not crashes. This version is a routine translation with no special optimizations for accuracy or efficiency. It gives an unimportant increase in accuracy, from ~0.9 ulps to 0.5285 ulps. Almost all of the error is from the minimax polynomial (~0.03 ulps) and the final rounding step (< 0.5 ulps). It gives strange differences in efficiency in the -5 to +10% range, with -O1 fairly consistently becoming faster and -O2 slower on AXP and A64 with gcc-3.3 and gcc-3.4.
# 151865	29-Oct-2005	bde	Implement inline functions to give the complex result x+Iy from float or double args x and y. x+Iy cannot be used directly yet due to compiler bugs. Submitted by: Steve Kargl <sgk@troutmask.apl.washington.edu>
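One way to build the complex value x+Iy without writing the expression `x + I*y` is to fill in the two parts through a union. The sketch below assumes that approach. It relies on the C99 guarantee that a complex type has the same layout as an array of two elements of the corresponding real type. The names `double_complex_sketch` and `pack_complex` are hypothetical, and the code as committed in r151865 may have been written differently.

```c
#include <complex.h>

/*
 * Hypothetical union for illustration: C99 specifies that a double complex
 * has the same representation as double[2], with the real part first and
 * the imaginary part second.
 */
typedef union {
	double complex	z;
	double		parts[2];
} double_complex_sketch;

/* Return x+Iy by setting each part directly; no complex arithmetic is done. */
static inline double complex
pack_complex(double x, double y)
{
	double_complex_sketch u;

	u.parts[0] = x;		/* real part */
	u.parts[1] = y;		/* imaginary part */
	return (u.z);
}
```

Because no multiplication or addition takes place, the parts are stored exactly as given, which is the property an internal helper of this kind is meant to provide.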
# 141302	04-Feb-2005	das	Fix a small scripting snafu in the previous revision.
# 141280	04-Feb-2005	das	Remove wrappers and other cruft intended to support SVID, mistakes in C90, and other arcana. Most of these features were never fully supported or enabled by default. Ok: bde, stefanf
# 117912	23-Jul-2003	peter	Only provide one copy of the math functions. If we provide a MD function, do not also provide a __generic_XXX version as well. This is how we used to runtime select the generic vs i387 versions on the i386 platform. This saves a pile of #defines in the src/math_private.h file to undo the __generic_XXX renames in some of the *.c files.
# 117909	23-Jul-2003	peter	Now that we do not need to do runtime detection for the broken default fp emulator, stop doing the runtime selection of hardware or emulated floating point operations on i386. Note that I have not suppressed the duplicate compiles yet. While here, fix the alpha. It has provided specific copysign/copysignf functions since the beginning of time, but they have never been used.
# 114331	30-Apr-2003	peter	AMD64 support (another IEEEFP platform)
# 97045	21-May-2002	benno	Spread the word of PowerPC.
# 92917	21-Mar-2002	obrien	Remove __P() usage.
# 88801	02-Jan-2002	jake	Add ifdef sparc64.
# 87805	13-Dec-2001	phantom	Fix style bugs (mostly remove 'extern' from function prototypes) Inspired by: conversation with bde
# 84662	08-Oct-2001	dfr	Port to ia64. Actually, just do like the alpha.
# 67166	15-Oct-2000	brian	Fix #include order Spotted by: imura
# 50476	27-Aug-1999	peter	$Id$ -> $FreeBSD$
# 35925	10-May-1998	jb	There is no alpha asm code like on i386, so all the functions that the i386 builds with a __generic prefix need to have that stripped.
# 22993	22-Feb-1997	peter	Revert $FreeBSD$ to $Id$
# 21673	14-Jan-1997	jkh	Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
# 8870	30-May-1995	rgrimes	Remove trailing whitespace.
# 2117	19-Aug-1994	jkh	This commit was generated by cvs2svn to compensate for changes in r2116, which included commits to RCS files with non-trunk default branches.
# 2116	19-Aug-1994	jkh	J.T. Conklin's latest version of the Sun math library. -- Begin comments from J.T. Conklin: The most significant improvement is the addition of "float" versions of the math functions that take float arguments, return floats, and do all operations in floating point. This doesn't help (performance) much on the i386, but they are still nice to have. The float versions were originally done by Cygnus' Ian Taylor when fdlibm was integrated into the libm we support for embedded systems. I gave Ian a copy of my libm as a starting point since I had already fixed a lot of bugs & problems in Sun's original code. After he was done, I cleaned it up a bit and integrated the changes back into my libm. -- End comments Reviewed by: jkh Submitted by: jtc