#
267654 |
|
19-Jun-2014 |
gjb |
Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.
Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation
|
#
239529 |
|
21-Aug-2012 |
dim |
MFC r239192:
Change a few extern inline functions in libm to static inline, since they need to refer to static constants, which C99 does not allow for extern inline functions.
While here, change a comment in e_rem_pio2f.c to mention the correct number of bits.
Reviewed by: bde
MFC r239195:
Add __always_inline to __ieee754_rem_pio2() and __ieee754_rem_pio2f(), since some older versions of gcc refuse to inline these otherwise.
Requested by: bde
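The constraint behind r239192 can be sketched as follows. A C99 inline definition with external linkage may not refer to an identifier with internal linkage, so a helper that reads a file-scope static constant has to be static inline itself. The names half_pi and reduce_once are hypothetical, and the attribute is written in GCC/Clang syntax as an assumption, rather than through the __always_inline macro the log mentions.

```c
/* Hypothetical file-scope constant with internal linkage. */
static const double half_pi = 1.57079632679489661923;

/*
 * Because the function is static inline, it may refer to half_pi.
 * A C99 inline definition with external linkage could not (C99 6.7.4).
 * The attribute (GCC/Clang syntax, an assumption of this sketch) asks the
 * compiler to inline the call even where its heuristics would decline.
 */
static inline __attribute__((__always_inline__)) double
reduce_once(double x)
{
	return (x - half_pi);
}
```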
|
#
229839 |
|
09-Jan-2012 |
das |
MFC various fma{,f,l} improvements:
r226245 - refactoring
r226371 - fix double-rounding bug
r226373 - new math_private.h macros
r226601 - fix nit in r226371
|
#
225736 |
|
22-Sep-2011 |
kensmith |
Copy head to stable/9 as part of 9.0-RELEASE release cycle.
Approved by: re (implicit)
|
#
216211 |
|
05-Dec-2010 |
das |
Add log2() and log2f().
|
#
194000 |
|
11-Jun-2009 |
ed |
Use the documented machine constraint for SSE registers.
The amd64-specific bits of msun use an undocumented constraint, which is less likely to be supported by other compilers (such as Clang). Change the code to use a more common machine constraint.
Obtained from: /projects/clangbsd/
|
#
193368 |
|
03-Jun-2009 |
ed |
Use ISO C99 style inline semantics in msun.
Because we use ISO C99 nowadays, we can just get rid of enforcing GNU89-style inlining.
|
#
189803 |
|
14-Mar-2009 |
das |
Eliminate __real__ and __imag__ gccisms.
|
#
186461 |
|
23-Dec-2008 |
marcel |
Add support for the FPA floating-point format on ARM. The FPA floating-point format is identical to the VFP format, but is always stored in big-endian. Introduce _IEEE_WORD_ORDER to describe the byte-order of the FP representation.
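A minimal sketch of why a separate word-order setting matters when picking a double apart into 32-bit words. SKETCH_MSW_FIRST and high_word are hypothetical names, not the identifiers used in the tree. The sketch simply follows the byte order reported by GCC/Clang, whereas an FPA target as described above would select the big-endian word order regardless of byte order.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for a word-order setting: 1 if the most
 * significant 32-bit word of a double comes first in memory.
 * Assumption: derived here from the compiler's byte-order macros.
 */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define SKETCH_MSW_FIRST 1
#else
#define SKETCH_MSW_FIRST 0
#endif

/* Return the high 32 bits (sign, exponent, top of the mantissa) of x. */
static uint32_t
high_word(double x)
{
	uint32_t w[2];

	memcpy(w, &x, sizeof(w));
	return (w[SKETCH_MSW_FIRST ? 0 : 1]);
}
```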
Obtained from: Juniper Networks, Inc
|
#
176552 |
|
25-Feb-2008 |
bde |
Change __ieee754_rem_pio2f() to return double instead of float so that this function and its callers cosf(), sinf() and tanf() don't waste time converting values from doubles to floats and back for |x| > 9pi/4. All these functions were optimized a few years ago to mostly use doubles internally and across the __kernel*() interfaces but not across the __ieee754_rem_pio2f() interface.
This saves about 40 cycles in cosf(), sinf() and tanf() for |x| > 9pi/4 on amd64 (A64), and about 20 cycles on i386 (A64) (except for cosf() and sinf() in the upper range). 40 cycles is about 35% for 9pi/4 < |x| <= 2**19pi/2 and about 5% for |x| > 2**19pi/2. The saving is much larger on amd64 than on i386 since the conversions are not easy to optimize except on i386 where some of them are automatic and others are optimized invalidly. amd64 is still about 10% slower in cosf() and tanf() in the lower range due to conversion overhead.
This also gives a tiny speedup for |x| <= 9pi/4 on amd64 (by simplifying the code). It also avoids compiler bugs and/or additional slowness in the conversions on (not yet supported) machines where double_t != double.
|
#
176462 |
|
22-Feb-2008 |
bde |
Add an irint() function in inline asm for amd64 and i386. irint() is the same as lrint() except it returns int instead of long. Though the extern lrint() is fairly fast on these arches, it still takes about 12 cycles longer than the inline version, and 12 cycles is a lot in applications where [li]rint() is used to avoid slow conversions that are only a couple of times slower.
This is only for internal use. The libm versions of *rint*() should also be inline, but that would take more header engineering. Implementing irint() instead of lrint() also avoids a conflict with the extern declaration of the latter.
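A rough sketch of such an int-returning rint() follows. The name irint_sketch is hypothetical. The amd64 path uses one cvtsd2si instruction with "x", the documented GCC constraint for an SSE register. The fallback branch is an assumption of this sketch, not libm code, and is only meant for values that fit in an int under the default round-to-nearest mode.

```c
/*
 * Convert x to int using the current rounding mode.
 * On amd64 this is a single instruction, so there is no call overhead.
 */
static inline int
irint_sketch(double x)
{
#if defined(__x86_64__) && defined(__GNUC__)
	int n;

	/* AT&T syntax: source SSE register, then destination register. */
	__asm__("cvtsd2si %1,%0" : "=r" (n) : "x" (x));
	return (n);
#else
	/*
	 * Assumed fallback: adding and subtracting 1.5 * 2**52 rounds x to
	 * an integer; volatile forces the sum to be stored as a double.
	 */
	volatile double t = x + 6755399441055744.0;

	return ((int)(t - 6755399441055744.0));
#endif
}
```

Like rint(), this rounds halfway cases to even by default, which differs from a plain (int) cast that truncates toward zero.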
|
#
176356 |
|
17-Feb-2008 |
das |
Add more pi for long doubles. Also, avoid storing multiple copies of the pi/2 array, as it is unlikely to vary, except in Indiana.
|
#
175503 |
|
19-Jan-2008 |
bde |
Do an ordinary assignment in STRICT_ASSIGN() except for floats until there is a problem with non-floats (when i386 defaults to extra precision). This essentially restores yesterday's behaviour for doubles on i386 (since generic rint() isn't used and everywhere else assumed working assignment), but for arches that use the generic rint() it finishes restoring some of 1995's behaviour (don't waste time doing unnecessary store/load).
|
#
175403 |
|
17-Jan-2008 |
bde |
Add a macro STRICT_ASSIGN() to help avoid the compiler bug that assignments and casts don't clip extra precision, if any. The implementation is to assign to a temporary volatile variable and read the result back to assign to the original lvalue.
lib/msun currently has 2 different hard-coded hacks to avoid the problem in just a few places and needs it in a few more places. One variant uses volatile for the original lvalue. This works but is slower than necessary. Another temporarily casts the lvalue to volatile. This broke with gcc-4.2.1 or earlier (gcc now stores to the lvalue but doesn't load from it).
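The volatile-temporary approach described above can be sketched as below. STRICT_ASSIGN_SKETCH, strict_tmp_ and add_as_float are hypothetical names for illustration. The macro in the tree may differ in detail.

```c
/*
 * Route the value through a volatile temporary of the destination type.
 * The compiler must store it, which discards any extra precision, and
 * then load it back before assigning to the real lvalue.
 */
#define STRICT_ASSIGN_SKETCH(type, lval, rval) do {	\
	volatile type strict_tmp_ = (rval);		\
	(lval) = strict_tmp_;				\
} while (0)

/* Hypothetical helper: add two floats, rounded to float precision. */
static float
add_as_float(float a, float b)
{
	float r;

	STRICT_ASSIGN_SKETCH(float, r, a + b);
	return (r);
}
```

Only the temporary is volatile, so the destination variable itself can stay in a register, which is the cost the first of the two older hacks paid.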
|
#
174759 |
|
18-Dec-2007 |
das |
Since nan() is supposed to work the same as strtod("nan(...)", NULL), my original implementation made both use the same code. Unfortunately, this meant libm depended on a vendor header at compile time and previously-unexposed vendor bits in libc at runtime.
Hence, I just wrote my own version of the relevant vendor routine. As it turns out, mine has a factor of 8 fewer lines of code, and is a bit more readable anyway. The strtod() and *scanf() routines still use vendor code.
Reviewed by: bde
|
#
152869 |
|
28-Nov-2005 |
bde |
Use only double precision for "kernel" cosf and sinf (except for returning float). The functions are renamed from __kernel_{cos,sin}f() to __kernel_{cos,sin}df() so that misuses of them will cause link errors and not crashes.
This version is an almost-routine translation with no special optimizations for accuracy or efficiency. The not-quite-routine part is that in __kernel_cosf(), regenerating the minimax polynomial with double precision coefficients gives a coefficient for the x**2 term that is not quite -0.5, so the literal 0.5 in the code and the related `hz' variable need to be modified; also, the special code for reducing the error in 1.0-x**2*0.5 is no longer needed, so it is convenient to adjust all the logic for the x**2 term a little. Note that without extra precision, it would be very bad to use a coefficient of other than -0.5 for the x**2 term -- the old version depends on multiplication by -0.5 being infinitely precise so as not to need even more special code for reducing the error in 1-x**2*0.5.
This gives an unimportant increase in accuracy, from ~0.8 to ~0.501 ulps. Almost all of the error is from the final rounding step, since the choice of the minimax polynomials so that their contribution to the error is a bit less than 0.5 ulps just happens to give contributions that are significantly less (~.001 ulps).
On Athlons, for uniformly distributed args in [-2pi, 2pi], this gives overall speed increases in the 10-20% range, despite giving a speed decrease of typically 19% (from 31 cycles up to 37) for sinf() on args in [-pi/4, pi/4].
|
#
152713 |
|
23-Nov-2005 |
bde |
Use only double precision for "kernel" tanf (except for returning float). This is a minor interface change. The function is renamed from __kernel_tanf() to __kernel_tandf() so that misuses of it will cause link errors and not crashes.
This version is a routine translation with no special optimizations for accuracy or efficiency. It gives an unimportant increase in accuracy, from ~0.9 ulps to 0.5285 ulps. Almost all of the error is from the minimax polynomial (~0.03 ulps) and the final rounding step (< 0.5 ulps). It gives strange differences in efficiency in the -5 to +10% range, with -O1 fairly consistently becoming faster and -O2 slower on AXP and A64 with gcc-3.3 and gcc-3.4.
|
#
151865 |
|
29-Oct-2005 |
bde |
Implement inline functions to give the complex result x+I*y from float or double args x and y. x+I*y cannot be used directly yet due to compiler bugs.
Submitted by: Steve Kargl <sgk@troutmask.apl.washington.edu>
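One portable way to build such a result without evaluating the expression x+I*y is to overlay the complex value on its two parts. This is a sketch under that assumption. dcomplex_sketch and cpack_sketch are hypothetical names, and the committed helpers may be written differently.

```c
#include <complex.h>

/*
 * C99 lays out a double complex like an array of two doubles:
 * the real part first, then the imaginary part.
 */
typedef union {
	double complex z;
	double part[2];	/* part[0] is real, part[1] is imaginary */
} dcomplex_sketch;

/* Hypothetical helper: assemble a complex value from its parts. */
static inline double complex
cpack_sketch(double x, double y)
{
	dcomplex_sketch u;

	u.part[0] = x;
	u.part[1] = y;
	return (u.z);
}
```

Storing the parts directly avoids the multiplication by I, so signed zeros, infinities and NaNs in y reach the result unchanged.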
|
#
141302 |
|
04-Feb-2005 |
das |
Fix a small scripting snafu in the previous revision.
|
#
141280 |
|
04-Feb-2005 |
das |
Remove wrappers and other cruft intended to support SVID, mistakes in C90, and other arcana. Most of these features were never fully supported or enabled by default.
Ok: bde, stefanf
|
#
117912 |
|
23-Jul-2003 |
peter |
Only provide one copy of the math functions. If we provide an MD function, do not also provide a __generic_XXX version. This is how we used to runtime select the generic vs i387 versions on the i386 platform.
This saves a pile of #defines in the src/math_private.h file to undo the __generic_XXX renames in some of the *.c files.
|
#
117909 |
|
23-Jul-2003 |
peter |
Now that we do not need to do runtime detection for the broken default fp emulator, stop doing the runtime selection of hardware or emulated floating point operations on i386. Note that I have not suppressed the duplicate compiles yet.
While here, fix the alpha. It has provided specific copysign/copysignf functions since the beginning of time, but they have never been used.
|
#
114331 |
|
30-Apr-2003 |
peter |
AMD64 support (another IEEEFP platform)
|
#
97045 |
|
21-May-2002 |
benno |
Spread the word of PowerPC.
|
#
92917 |
|
21-Mar-2002 |
obrien |
Remove __P() usage.
|
#
88801 |
|
02-Jan-2002 |
jake |
Add ifdef sparc64.
|
#
87805 |
|
13-Dec-2001 |
phantom |
Fix style bugs (mostly remove 'extern' from function prototypes)
Inspired by: conversation with bde
|
#
84662 |
|
08-Oct-2001 |
dfr |
Port to ia64. Actually, just do like the alpha.
|
#
67166 |
|
15-Oct-2000 |
brian |
Fix #include order
Spotted by: imura
|
#
50476 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
35925 |
|
10-May-1998 |
jb |
There is no alpha asm code like on i386, so all the functions that the i386 builds with a __generic prefix need to have that stripped.
|
#
22993 |
|
22-Feb-1997 |
peter |
Revert $FreeBSD$ to $Id$
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
8870 |
|
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
#
2117 |
|
19-Aug-1994 |
jkh |
This commit was generated by cvs2svn to compensate for changes in r2116, which included commits to RCS files with non-trunk default branches.
|
#
2116 |
|
19-Aug-1994 |
jkh |
J.T. Conklin's latest version of the Sun math library.
-- Begin comments from J.T. Conklin: The most significant improvement is the addition of "float" versions of the math functions that take float arguments, return floats, and do all operations in floating point. This doesn't help (performance) much on the i386, but they are still nice to have.
The float versions were originally done by Cygnus' Ian Taylor when fdlibm was integrated into the libm we support for embedded systems. I gave Ian a copy of my libm as a starting point since I had already fixed a lot of bugs & problems in Sun's original code. After he was done, I cleaned it up a bit and integrated the changes back into my libm. -- End comments
Reviewed by: jkh
Submitted by: jtc
|