History log of /freebsd-9.3-release/lib/msun/src/math_private.h
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 267654 19-Jun-2014 gjb

Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 239529 21-Aug-2012 dim

MFC r239192:

Change a few extern inline functions in libm to static inline, since
they need to refer to static constants, which C99 does not allow for
extern inline functions.

While here, change a comment in e_rem_pio2f.c to mention the correct
number of bits.

Reviewed by: bde

MFC r239195:

Add __always_inline to __ieee754_rem_pio2() and __ieee754_rem_pio2f(),
since some older versions of gcc refuse to inline these otherwise.

Requested by: bde


# 229839 09-Jan-2012 das

MFC various fma{,f,l} improvements:

r226245 - refactoring
r226371 - fix double-rounding bug
r226373 - new math_private.h macros
r226601 - fix nit in r226371


# 225736 22-Sep-2011 kensmith

Copy head to stable/9 as part of 9.0-RELEASE release cycle.

Approved by: re (implicit)


# 216211 05-Dec-2010 das

Add log2() and log2f().


# 194000 11-Jun-2009 ed

Use the documented machine constraint for SSE registers.

The amd64-specific bits of msun use an undocumented constraint, which is
less likely to be supported by other compilers (such as Clang). Change
the code to use a more common machine constraint.

Obtained from: /projects/clangbsd/


# 193368 03-Jun-2009 ed

Use ISO C99 style inline semantics in msun.

Because we use ISO C99 nowadays, we can just get rid of enforcing
GNU89-style inlining.


# 189803 14-Mar-2009 das

Eliminate __real__ and __imag__ gccisms.


# 186461 23-Dec-2008 marcel

Add support for the FPA floating-point format on ARM. The
FPA floating-point format is identical to the VFP format,
but is always stored in big-endian.
Introduce _IEEE_WORD_ORDER to describe the byte-order of
the FP representation.

Obtained from: Juniper Networks, Inc


# 176552 25-Feb-2008 bde

Change __ieee754_rem_pio2f() to return double instead of float so that
this function and its callers cosf(), sinf() and tanf() don't waste time
converting values from doubles to floats and back for |x| > 9pi/4.
All these functions were optimized a few years ago to mostly use doubles
internally and across the __kernel*() interfaces but not across the
__ieee754_rem_pio2f() interface.

This saves about 40 cycles in cosf(), sinf() and tanf() for |x| > 9pi/4
on amd64 (A64), and about 20 cycles on i386 (A64) (except for cosf()
and sinf() in the upper range). 40 cycles is about 35% for |x| < 9pi/4
<= 2**19pi/2 and about 5% for |x| > 2**19pi/2. The saving is much
larger on amd64 than on i386 since the conversions are not easy to
optimize except on i386 where some of them are automatic and others
are optimized invalidly. amd64 is still about 10% slower in cosf()
and tanf() in the lower range due to conversion overhead.

This also gives a tiny speedup for |x| <= 9pi/4 on amd64 (by simplifying
the code). It also avoids compiler bugs and/or additional slowness
in the conversions on (not yet supported) machines where double_t !=
double.


# 176462 22-Feb-2008 bde

Add an irint() function in inline asm for amd64 and i386. irint() is
the same as lrint() except it returns int instead of long. Though the
extern lrint() is fairly fast on these arches, it still takes about
12 cycles longer than the inline version, and 12 cycles is a lot in
applications where [li]rint() is used to avoid slow conversions that
are only a couple of times slower.

This is only for internal use. The libm versions of *rint*() should
also be inline, but that would take would take more header engineering.
Implementing irint() instead of lrint() also avoids a conflict with
the extern declaration of the latter.


# 176356 17-Feb-2008 das

Add more pi for long doubles. Also, avoid storing multiple copies
of the pi/2 array, as it is unlikely to vary, except in Indiana.


# 175503 19-Jan-2008 bde

Do an ordinary assignment in STRICT_ASSIGN() except for floats until
there is a problem with non-floats (when i386 defaults to extra
precision). This essentially restores yesterday's behaviour for doubles
on i386 (since generic rint() isn't used and everywhere else assumed
working assignment), but for arches that use the generic rint() it
finishes restoring some of 1995's behaviour (don't waste time doing
unnecessary store/load).


# 175403 17-Jan-2008 bde

Add a macro STRICT_ASSIGN() to help avoid the compiler bug that
assignments and casts don't clip extra precision, if any. The
implementation is to assign to a temporary volatile variable and read
the result back to assign to the original lvalue.

lib/msun currently 2 different hard-coded hacks to avoid the problem
in just a few places and needs it in a few more places. One variant
uses volatile for the original lvalue. This works but is slower than
necessary. Another temporarily casts the lvalue to volatile. This
broke with gcc-4.2.1 or earlier (gcc now stores to the lvalue but
doesn't load from it).


# 174759 18-Dec-2007 das

Since nan() is supposed to work the same as strtod("nan(...)", NULL),
my original implementation made both use the same code. Unfortunately,
this meant libm depended on a vendor header at compile time and previously-
unexposed vendor bits in libc at runtime.

Hence, I just wrote my own version of the relevant vendor routine. As it
turns out, mine has a factor of 8 fewer of lines of code, and is a bit more
readable anyway. The strtod() and *scanf() routines still use vendor code.

Reviewed by: bde


# 152869 28-Nov-2005 bde

Use only double precision for "kernel" cosf and sinf (except for
returning float). The functions are renamed from __kernel_{cos,sin}f()
to __kernel_{cos,sin}df() so that misuses of them will cause link errors
and not crashes.

This version is an almost-routine translation with no special optimizations
for accuracy or efficiency. The not-quite-routine part is that in
__kernel_cosf(), regenerating the minimax polynomial with double
precision coefficients gives a coefficient for the x**2 term that is
not quite -0.5, so the literal 0.5 in the code and the related `hz'
variable need to be modified; also, the special code for reducing the
error in 1.0-x**2*0.5 is no longer needed, so it is convenient to
adjust all the logic for the x**2 term a little. Note that without
extra precision, it would be very bad to use a coefficient of other
than -0.5 for the x**2 term -- the old version depends on multiplication
by -0.5 being infinitely precise so as not to need even more special
code for reducing the error in 1-x**2*0.5.

This gives an unimportant increase in accuracy, from ~0.8 to ~0.501
ulps. Almost all of the error is from the final rounding step, since
the choice of the minimax polynomials so that their contribution to the
error is a bit less than 0.5 ulps just happens to give contributions that
are significantly less (~.001 ulps).

An Athlons, for uniformly distributed args in [-2pi, 2pi], this gives
overall speed increases in the 10-20% range, despite giving a speed
decrease of typically 19% (from 31 cycles up to 37) for sinf() on args
in [-pi/4, pi/4].


# 152713 23-Nov-2005 bde

Use only double precision for "kernel" tanf (except for returning float).
This is a minor interface change. The function is renamed from
__kernel_tanf() to __kernel_tandf() so that misues of it will cause
link errors and not crashes.

This version is a routine translation with no special optimizations
for accuracy or efficiency. It gives an unimportant increase in
accuracy, from ~0.9 ulps to 0.5285 ulps. Almost all of the error is
from the minimax polynomial (~0.03 ulps and the final rounding step
(< 0.5 ulps). It gives strange differences in efficiency in the -5
to +10% range, with -O1 fairly consistently becoming faster and -O2
slower on AXP and A64 with gcc-3.3 and gcc-3.4.


# 151865 29-Oct-2005 bde

Implement inline functions to give the complex result x+I*y from float
or double args x and y. x+I*y cannot be used directly yet due to compiler
bugs.

Submitted by: Steve Kargl <sgk@troutmask.apl.washington.edu>


# 141302 04-Feb-2005 das

Fix a small scripting snafu in the previous revision.


# 141280 04-Feb-2005 das

Remove wrappers and other cruft intended to support SVID, mistakes in
C90, and other arcana. Most of these features were never fully
supported or enabled by default.

Ok: bde, stefanf


# 117912 23-Jul-2003 peter

Only provide one copy of the math functions. If we provide a MD function,
do not also provide a __generic_XXX version as well. This is how we
used to runtime select the generic vs i387 versions on the i386 platform.

This saves a pile of #defines in the src/math_private.h file to undo the
__generic_XXX renames in some of the *.c files.


# 117909 23-Jul-2003 peter

Now that we do not need to do runtime detection for the broken default
fp emulator, stop doing the runtime selection of hardware or emulated
floating point operations on i386. Note that I have not suppressed the
duplicate compiles yet.

While here, fix the alpha. It has provided specific copysign/copysignf
functions since the beginning of time, but they have never been used.


# 114331 30-Apr-2003 peter

AMD64 support (another IEEEFP platform)


# 97045 21-May-2002 benno

Spread the word of PowerPC.


# 92917 21-Mar-2002 obrien

Remove __P() usage.


# 88801 02-Jan-2002 jake

Add ifdef sparc64.


# 87805 13-Dec-2001 phantom

Fix style bugs (mostly remove 'extern' from function prototypes)

Inspired by: conversation with bde


# 84662 08-Oct-2001 dfr

Port to ia64. Actually, just do like the alpha.


# 67166 15-Oct-2000 brian

Fix #include order

Spotted by: imura


# 50476 27-Aug-1999 peter

$Id$ -> $FreeBSD$


# 35925 10-May-1998 jb

There is no alpha asm code like on i386, so all the functions that
the i386 builds with a __generic prefix need to have that stripped.


# 22993 22-Feb-1997 peter

Revert $FreeBSD$ to $Id$


# 21673 14-Jan-1997 jkh

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 8870 30-May-1995 rgrimes

Remove trailing whitespace.


# 2117 19-Aug-1994 jkh

This commit was generated by cvs2svn to compensate for changes in r2116,
which included commits to RCS files with non-trunk default branches.


# 2116 19-Aug-1994 jkh

J.T. Conklin's latest version of the Sun math library.

-- Begin comments from J.T. Conklin:
The most significant improvement is the addition of "float" versions
of the math functions that take float arguments, return floats, and do
all operations in floating point. This doesn't help (performance)
much on the i386, but they are still nice to have.

The float versions were orginally done by Cygnus' Ian Taylor when
fdlibm was integrated into the libm we support for embedded systems.
I gave Ian a copy of my libm as a starting point since I had already
fixed a lot of bugs & problems in Sun's original code. After he was
done, I cleaned it up a bit and integrated the changes back into my
libm.
-- End comments

Reviewed by: jkh
Submitted by: jtc