TestFloat Release 2a General Documentation

John R. Hauser
1998 December 16


-------------------------------------------------------------------------------
Introduction

TestFloat is a program for testing that a floating-point implementation
conforms to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
All standard operations supported by the system can be tested, except for
conversions to and from decimal.  Any of the following machine formats can
be tested:  single precision, double precision, extended double precision,
and/or quadruple precision.

TestFloat actually comes in two variants:  one is a program for testing
a machine's floating-point, and the other is a program for testing
the SoftFloat software implementation of floating-point.  (Information
about SoftFloat can be found at the SoftFloat Web page, `http://
HTTP.CS.Berkeley.EDU/~jhauser/arithmetic/SoftFloat.html'.)  The version that
tests SoftFloat is expected to be of interest only to people compiling the
SoftFloat sources.  However, because the two versions share much in common,
they are discussed together in all the TestFloat documentation.

This document explains how to use the TestFloat programs.  It does not
attempt to define or explain the IEC/IEEE Standard for floating-point.
Details about the standard are available elsewhere.

The first release of TestFloat (Release 1) was called _FloatTest_.  The old
name has been obsolete for some time.


-------------------------------------------------------------------------------
Limitations

TestFloat's output is not always easily interpreted.  Detailed knowledge
of the IEC/IEEE Standard and its vagaries is needed to use TestFloat
responsibly.

TestFloat performs relatively simple tests designed to check the fundamental
soundness of the floating-point under test.  TestFloat may also at times
manage to find rarer and more subtle bugs, but it will probably only find
such bugs by accident.  Software that purposefully seeks out various kinds
of subtle floating-point bugs can be found through links posted on the
TestFloat Web page (`http://HTTP.CS.Berkeley.EDU/~jhauser/arithmetic/
TestFloat.html').


-------------------------------------------------------------------------------
Contents

    Introduction
    Limitations
    Contents
    Legal Notice
    What TestFloat Does
    Executing TestFloat
    Functions Tested by TestFloat
        Conversion Functions
        Standard Arithmetic Functions
        Remainder and Round-to-Integer Functions
        Comparison Functions
    Interpreting TestFloat Output
    Variations Allowed by the IEC/IEEE Standard
        Underflow
        NaNs
        Conversions to Integer
    TestFloat Options
        -help
        -list
        -level <num>
        -errors <num>
        -errorstop
        -forever
        -checkNaNs
        -precision32, -precision64, -precision80
        -nearesteven, -tozero, -down, -up
        -tininessbefore, -tininessafter
    Function Sets
    Contact Information



-------------------------------------------------------------------------------
Legal Notice

TestFloat was written by John R. Hauser.

THIS SOFTWARE IS DISTRIBUTED AS IS, FOR FREE.  Although reasonable effort
has been made to avoid it, THIS SOFTWARE MAY CONTAIN FAULTS THAT WILL AT
TIMES RESULT IN INCORRECT BEHAVIOR.  USE OF THIS SOFTWARE IS RESTRICTED TO
PERSONS AND ORGANIZATIONS WHO CAN AND WILL TAKE FULL RESPONSIBILITY FOR ANY
AND ALL LOSSES, COSTS, OR OTHER PROBLEMS ARISING FROM ITS USE.


-------------------------------------------------------------------------------
What TestFloat Does

TestFloat tests a system's floating-point by comparing its behavior with
that of TestFloat's own internal floating-point implemented in software.
For each operation tested, TestFloat generates a large number of test cases,
made up of simple pattern tests intermixed with weighted random inputs.
The cases generated should be adequate for testing carry chain propagations,
plus the rounding of adds, subtracts, multiplies, and simple operations like
conversions.  TestFloat makes a point of checking all boundary cases of the
arithmetic, including underflows, overflows, invalid operations, subnormal
inputs, zeros (positive and negative), infinities, and NaNs.  For the
interesting operations like adds and multiplies, literally millions of test
cases can be checked.

TestFloat is not remarkably good at testing difficult rounding cases for
divisions and square roots.  It also makes no attempt to find bugs specific
to SRT divisions and the like (such as the infamous Pentium divide bug).
Software that tests for such failures can be found through links on the
TestFloat Web page, `http://HTTP.CS.Berkeley.EDU/~jhauser/arithmetic/
TestFloat.html'.

NOTE!
It is the responsibility of the user to verify that the discrepancies
TestFloat finds actually represent faults in the system being tested.
Advice to help with this task is provided later in this document.
Furthermore, even if TestFloat finds no fault with a floating-point
implementation, that in no way guarantees that the implementation is bug-
free.

For each operation, TestFloat can test all four rounding modes required
by the IEC/IEEE Standard.  TestFloat verifies not only that the numeric
results of an operation are correct, but also that the proper floating-point
exception flags are raised.  All five exception flags are tested, including
the inexact flag.  TestFloat does not attempt to verify that the floating-
point exception flags are actually implemented as sticky flags.

For machines that implement extended double precision with rounding
precision control (such as Intel's 80x86), TestFloat can test the add,
subtract, multiply, divide, and square root functions at all the standard
rounding precisions.  The rounding precision can be set equivalent to single
precision, to double precision, or to the full extended double precision.
Rounding precision control can only be applied to the extended double-
precision format and only for the five standard arithmetic operations:  add,
subtract, multiply, divide, and square root.  Other functions can be tested
only at full precision.

As a rule, TestFloat is not particular about the bit patterns of NaNs that
appear as function results.  Any NaN is considered as good a result as
another.  This laxness can be overridden so that TestFloat checks for
particular bit patterns within NaN results.  See the sections _Variations_
_Allowed_by_the_IEC/IEEE_Standard_ and _TestFloat_Options_ for details.

Not all IEC/IEEE Standard functions are supported by all machines.
TestFloat can only test functions that exist on the machine.  But even if
a function is supported by the machine, TestFloat may still not be able
to test the function if it is not accessible through standard ISO C (the
programming language in which TestFloat is written) and if the person who
compiled TestFloat did not provide an alternate means for TestFloat to
invoke the machine function.

TestFloat compares a machine's floating-point against the SoftFloat software
implementation of floating-point, also written by me.  SoftFloat is built
into the TestFloat executable and does not need to be supplied by the user.
If SoftFloat is wanted for some other reason (to compile a new version
of TestFloat, for instance), it can be found separately at the Web page
`http://HTTP.CS.Berkeley.EDU/~jhauser/arithmetic/SoftFloat.html'.

For testing SoftFloat itself, the TestFloat package includes a program that
compares SoftFloat's floating-point against _another_ software floating-
point implementation.  The second software floating-point is simpler and
slower than SoftFloat, and is completely independent of SoftFloat.  Although
the second software floating-point cannot be guaranteed to be bug-free, the
chance that it would mimic any of SoftFloat's bugs is remote.  Consequently,
an error in one or the other floating-point version should appear as an
unexpected discrepancy between the two implementations.  Note that testing
SoftFloat should only be necessary when compiling a new TestFloat executable
or when compiling SoftFloat for some other reason.


-------------------------------------------------------------------------------
Executing TestFloat

TestFloat is intended to be executed from a command line interpreter.  The
`testfloat' program is invoked as follows:

    testfloat [<option>...] <function>

Here square brackets ([]) indicate optional items, while angled brackets
(<>) denote parameters to be filled in.

The `<function>' argument is a name like `float32_add' or `float64_to_int32'.
The complete list of function names is given in the next section,
_Functions_Tested_by_TestFloat_.  It is also possible to test all machine
functions in a single invocation.  The various options to TestFloat are
detailed in the section _TestFloat_Options_ later in this document.  If
`testfloat' is executed without any arguments, a summary of TestFloat usage
is written.

TestFloat will ordinarily test a function for all four rounding modes, one
after the other.  If the rounding mode is not supposed to have any effect
on the results--for instance, some operations do not require rounding--only
the nearest/even rounding mode is checked.  For extended double-precision
operations affected by rounding precision control, TestFloat also tests all
three rounding precision modes, one after the other.  Testing can be limited
to a single rounding mode and/or rounding precision with appropriate options
(see _TestFloat_Options_).

As it executes, TestFloat writes status information to the standard error
output, which should be the screen by default.  In order for this status to
be displayed properly, the standard error stream should not be redirected
to a file.  The discrepancies TestFloat finds are written to the standard
output stream, which is easily redirected to a file if desired.  Ordinarily,
the errors TestFloat reports and the ongoing status information appear
intermixed on the same screen.
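
As an illustration only, the separation of the two streams can be exploited
from a small script.  The following Python sketch is not part of TestFloat;
the helper name `run_and_log' and the log file name are made up for the
example, and it assumes that a `testfloat' executable can be found on the
search path.  Standard output is saved to a file while standard error is
left attached to the screen.

```python
import subprocess


def run_and_log(argv, log_path):
    """Run a command, saving its standard output to log_path.

    Standard error is deliberately not redirected, so any status
    information the command writes there still reaches the screen.
    """
    with open(log_path, "w") as log:
        completed = subprocess.run(argv, stdout=log, stderr=None, check=False)
    return completed.returncode


# Hypothetical usage, assuming `testfloat' is installed and on the PATH:
#   run_and_log(["testfloat", "float32_add"], "float32_add.log")
```

The helper takes the whole argument list from the caller, so it makes no
assumption about which options a given TestFloat build accepts.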

The version of TestFloat for testing SoftFloat is called `testsoftfloat'.
It is invoked the same as `testfloat',

    testsoftfloat [<option>...] <function>

and operates similarly.


-------------------------------------------------------------------------------
Functions Tested by TestFloat

TestFloat tests all operations required by the IEC/IEEE Standard except for
conversions to and from decimal.  The operations are

-- Conversions among the supported floating-point formats, and also between
   integers (32-bit and 64-bit) and any of the floating-point formats.

-- The usual add, subtract, multiply, divide, and square root operations
   for all supported floating-point formats.

-- For each format, the floating-point remainder operation defined by the
   IEC/IEEE Standard.

-- For each floating-point format, a ``round to integer'' operation that
   rounds to the nearest integer value in the same format.  (The floating-
   point formats can hold integer values, of course.)

-- Comparisons between two values in the same floating-point format.

Detailed information about these functions is given below.  In the function
names used by TestFloat, single precision is called `float32', double
precision is `float64', extended double precision is `floatx80', and
quadruple precision is `float128'.  TestFloat uses the same names for
functions as SoftFloat.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Conversion Functions

All conversions among the floating-point formats and all conversions between
a floating-point format and 32-bit and 64-bit signed integers can be tested.
The conversion functions are:

   int32_to_float32      int64_to_float32
   int32_to_float64      int64_to_float64
   int32_to_floatx80     int64_to_floatx80
   int32_to_float128     int64_to_float128

   float32_to_int32      float32_to_int64
   float64_to_int32      float64_to_int64
   floatx80_to_int32     floatx80_to_int64
   float128_to_int32     float128_to_int64

   float32_to_float64    float32_to_floatx80   float32_to_float128
   float64_to_float32    float64_to_floatx80   float64_to_float128
   floatx80_to_float32   floatx80_to_float64   floatx80_to_float128
   float128_to_float32   float128_to_float64   float128_to_floatx80

These conversions all round according to the current rounding mode as
necessary.  Conversions from a smaller to a larger floating-point format are
always exact and so require no rounding.  Conversions from 32-bit integers
to double precision or to any larger floating-point format are also exact,
and likewise for conversions from 64-bit integers to extended double and
quadruple precisions.

ISO/ANSI C requires that conversions to integers be rounded toward zero.
Such conversions can be tested with the following functions that ignore any
rounding mode:

   float32_to_int32_round_to_zero    float32_to_int64_round_to_zero
   float64_to_int32_round_to_zero    float64_to_int64_round_to_zero
   floatx80_to_int32_round_to_zero   floatx80_to_int64_round_to_zero
   float128_to_int32_round_to_zero   float128_to_int64_round_to_zero

TestFloat assumes that conversions from floating-point to integer should
raise the invalid exception if the source value cannot be rounded to a
representable integer of the desired size (32 or 64 bits).  If such a
conversion overflows, TestFloat expects the largest integer with the same
sign as the operand to be returned.  If the floating-point operand is a NaN,
TestFloat allows either the largest positive or largest negative integer to
be returned.
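
The expectations just described can be modeled in a few lines.  The Python
sketch below is an illustration, not TestFloat's own code:  the function
name is invented, a Python float stands in for the operand, only the
round-toward-zero conversion to 32 bits is covered, and the inexact flag is
not modeled.  It returns the set of integer results that would be accepted
together with whether the invalid exception is expected.

```python
import math

INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def expected_to_int32_round_to_zero(x):
    """Return (acceptable_results, invalid_expected) for operand x."""
    if math.isnan(x):
        # For a NaN operand either extreme integer is accepted.
        return {INT32_MAX, INT32_MIN}, True
    if x >= 2**31:
        # Positive overflow (including +infinity): largest positive integer.
        return {INT32_MAX}, True
    if x <= -(2**31) - 1:
        # Negative overflow (including -infinity): largest negative integer.
        return {INT32_MIN}, True
    return {math.trunc(x)}, False
```

The overflow tests are made on the unrounded operand, so a value such as
2147483647.5, which truncates to a representable integer, is not treated as
an overflow.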

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Standard Arithmetic Functions

The following standard arithmetic functions can be tested:

   float32_add    float32_sub    float32_mul    float32_div    float32_sqrt
   float64_add    float64_sub    float64_mul    float64_div    float64_sqrt
   floatx80_add   floatx80_sub   floatx80_mul   floatx80_div   floatx80_sqrt
   float128_add   float128_sub   float128_mul   float128_div   float128_sqrt

The extended double-precision (`floatx80') functions can be rounded to
reduced precision under rounding precision control.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Remainder and Round-to-Integer Functions

For each format, TestFloat can test the IEC/IEEE Standard remainder and
round-to-integer functions.  The remainder functions are:

   float32_rem
   float64_rem
   floatx80_rem
   float128_rem

The round-to-integer functions are:

   float32_round_to_int
   float64_round_to_int
   floatx80_round_to_int
   float128_round_to_int

The remainder functions are always exact and so do not require rounding.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Comparison Functions

The following floating-point comparison functions can be tested:

   float32_eq    float32_le    float32_lt
   float64_eq    float64_le    float64_lt
   floatx80_eq   floatx80_le   floatx80_lt
   float128_eq   float128_le   float128_lt

The abbreviation `eq' stands for ``equal'' (=); `le' stands for ``less than
or equal'' (<=); and `lt' stands for ``less than'' (<).

The IEC/IEEE Standard specifies that the less-than-or-equal and less-than
functions raise the invalid exception if either input is any kind of NaN.
The equal functions, for their part, are defined not to raise the invalid
exception on quiet NaNs.  For completeness, the following additional
functions can be tested if supported:

   float32_eq_signaling    float32_le_quiet    float32_lt_quiet
   float64_eq_signaling    float64_le_quiet    float64_lt_quiet
   floatx80_eq_signaling   floatx80_le_quiet   floatx80_lt_quiet
   float128_eq_signaling   float128_le_quiet   float128_lt_quiet

The `signaling' equal functions are identical to the standard functions
except that the invalid exception should be raised for any NaN input.
Likewise, the `quiet' comparison functions should be identical to their
counterparts except that the invalid exception is not raised for quiet NaNs.

Obviously, no comparison functions ever require rounding.  Any rounding mode
is ignored.
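
To make the six comparison variants concrete, the following Python sketch
models their results and invalid-flag behavior for single precision.  It is
an illustration under stated assumptions, not TestFloat code:  operands are
raw 32-bit patterns, the helper names are invented, and a NaN is taken to
be signaling when the most significant fraction bit is 0, a convention that
is common but not universal.

```python
import struct

# Variants that raise invalid for any NaN; the others raise it only for
# signaling NaNs.
SIGNALING_VARIANTS = {"le", "lt", "eq_signaling"}


def is_nan(bits):
    return (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0


def is_signaling_nan(bits):
    # Assumption: the top fraction bit distinguishes quiet (1) from
    # signaling (0) NaNs.
    return is_nan(bits) and (bits & 0x00400000) == 0


def to_float(bits):
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def compare(name, a, b):
    """Return (result, invalid_raised) for a comparison such as 'lt_quiet'."""
    any_nan = is_nan(a) or is_nan(b)
    any_signaling = is_signaling_nan(a) or is_signaling_nan(b)
    invalid = any_signaling or (any_nan and name in SIGNALING_VARIANTS)
    if any_nan:
        return False, invalid
    x, y = to_float(a), to_float(b)
    base = name.split("_")[0]
    result = {"eq": x == y, "le": x <= y, "lt": x < y}[base]
    return result, invalid
```

For instance, under these assumptions `compare("lt", 0x3F800000,
0x7FC00000)' yields a false result with the invalid flag raised, while the
same operands under `lt_quiet' yield a false result without it.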

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -


-------------------------------------------------------------------------------
Interpreting TestFloat Output

The ``errors'' reported by TestFloat may or may not really represent errors
in the system being tested.  For each test case tried, TestFloat performs
the same floating-point operation for the two implementations being compared
and reports any unexpected difference in the results.  The two results could
differ for several reasons:

-- The IEC/IEEE Standard allows for some variation in how conforming
   floating-point behaves.  Two implementations can occasionally give
   different results without either being incorrect.

-- The trusted floating-point emulation could be faulty.  This could be
   because there is a bug in the way the emulation is coded, or because a
   mistake was made when the code was compiled for the current system.

-- TestFloat may not work properly, reporting discrepancies that do not
   exist.

-- Lastly, the floating-point being tested could actually be faulty.

It is the responsibility of the user to determine the causes for the
discrepancies TestFloat reports.  Making this determination can require
detailed knowledge about the IEC/IEEE Standard.  Assuming TestFloat is
working properly, any differences found will be due to either the first or
last of these reasons.  Variations in the IEC/IEEE Standard that could lead
to false error reports are discussed in the section _Variations_Allowed_by_
_the_IEC/IEEE_Standard_.

For each error (or apparent error) TestFloat reports, a line of text
is written to the standard output.  If a line would be longer than 79
characters, it is divided.  The first part of each error line begins in the
leftmost column, and any subsequent ``continuation'' lines are indented with
a tab.

Each error reported by `testfloat' is of the form:

    <inputs>  soft: <output-from-emulation>  syst: <output-from-system>

The `<inputs>' are the inputs to the operation.  Each output is shown as a
pair:  the result value first, followed by the exception flags.  The `soft'
label stands for ``software'' (or ``SoftFloat''), while `syst' stands for
``system,'' the machine's floating-point.

For example, two typical error lines could be

    800.7FFF00  87F.000100  soft: 001.000000 ....x  syst: 001.000000 ...ux
    081.000004  000.1FFFFF  soft: 001.000000 ....x  syst: 001.000000 ...ux

In the first line, the inputs are `800.7FFF00' and `87F.000100'.  The
internal emulation result is `001.000000' with flags `....x', and the
system result is the same but with flags `...ux'.  All the items composed of
hexadecimal digits and a single period represent floating-point values (here
single precision).  These cases were reported as errors because the flag
results differ.
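
When many error lines have been collected in a file, it can help to split
them mechanically.  The Python sketch below is a hypothetical helper, not
something supplied with TestFloat.  It assumes a complete `testfloat' error
line, with any tab-indented continuation lines already joined onto the
first line, and it relies only on the `soft:' and `syst:' labels described
above.

```python
def parse_error_line(line):
    """Split one testfloat error line into inputs and (value, flags) pairs."""
    fields = line.split()
    soft_at = fields.index("soft:")
    syst_at = fields.index("syst:")
    return {
        "inputs": fields[:soft_at],
        "soft": tuple(fields[soft_at + 1:syst_at]),
        "syst": tuple(fields[syst_at + 1:]),
    }


example = "800.7FFF00  87F.000100  soft: 001.000000 ....x  syst: 001.000000 ...ux"
parsed = parse_error_line(example)
# The values agree here, so only the flags explain why the case was reported.
flags_differ = parsed["soft"][1] != parsed["syst"][1]
```

Lines from `testsoftfloat' use different labels and would need the label
names changed accordingly.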

In addition to the exception flags, there are seven data types that may
be represented.  Four are floating-point types:  single precision, double
precision, extended double precision, and quadruple precision.  The
remaining three types are 32-bit and 64-bit two's-complement integers and
Boolean values (the results of comparison operations).  Boolean values are
represented as a single character, either a `0' or a `1'.  32-bit integers
are written as 8 hexadecimal digits in two's-complement form.  Thus,
`FFFFFFFF' is -1, and `7FFFFFFF' is the largest positive 32-bit integer.
64-bit integers are the same except with 16 hexadecimal digits.
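
A reader who wants the signed value of such an integer can recover it as in
the following Python sketch.  The function name is invented for the
example; the width is inferred from the number of hexadecimal digits, so
the same helper serves for both the 32-bit and 64-bit forms.

```python
def int_from_hex(digits):
    """Interpret a string of hexadecimal digits as a two's-complement integer."""
    width = 4 * len(digits)
    value = int(digits, 16)
    if value >> (width - 1):
        # The top bit is set, so the value is negative.
        value -= 1 << width
    return value
```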

Floating-point values are written in a correspondingly primitive form.
Double-precision values are represented by 16 hexadecimal digits that give
the raw bits of the floating-point encoding.  A period separates the 3rd and
4th hexadecimal digits to mark the division between the exponent bits and
fraction bits.  Some notable double-precision values include:

    000.0000000000000    +0
    3FF.0000000000000     1
    400.0000000000000     2
    7FF.0000000000000    +infinity

    800.0000000000000    -0
    BFF.0000000000000    -1
    C00.0000000000000    -2
    FFF.0000000000000    -infinity

    3FE.FFFFFFFFFFFFF    largest representable number preceding +1

The following categories are easily distinguished (assuming the `x's are not
all 0):

    000.xxxxxxxxxxxxx    positive subnormal (denormalized) numbers
    7FF.xxxxxxxxxxxxx    positive NaNs
    800.xxxxxxxxxxxxx    negative subnormal numbers
    FFF.xxxxxxxxxxxxx    negative NaNs
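
Because the 16 digits are simply the raw encoding, a double-precision value
in this notation can be turned back into a number by removing the period
and reinterpreting the bits.  The Python sketch below shows one way to do
so; the function name is invented for the example.

```python
import struct


def float64_from_display(text):
    """Convert a value such as '3FF.0000000000000' to a Python float."""
    raw = int(text.replace(".", ""), 16)
    return struct.unpack(">d", raw.to_bytes(8, "big"))[0]
```

Python floats are double precision on the usual platforms, so no accuracy
is lost in the conversion.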

Quadruple-precision values are written the same except with 4 hexadecimal
digits for the sign and exponent and 28 for the fraction.  Notable values
include:

    0000.0000000000000000000000000000    +0
    3FFF.0000000000000000000000000000     1
    4000.0000000000000000000000000000     2
    7FFF.0000000000000000000000000000    +infinity

    8000.0000000000000000000000000000    -0
    BFFF.0000000000000000000000000000    -1
    C000.0000000000000000000000000000    -2
    FFFF.0000000000000000000000000000    -infinity

    3FFE.FFFFFFFFFFFFFFFFFFFFFFFFFFFF    largest representable number
                                             preceding +1

Extended double-precision values are a little unusual in that the leading
significand bit is not hidden as with other formats.  When correctly
encoded, the leading significand bit of an extended double-precision value
will be 0 if the value is zero or subnormal, and will be 1 otherwise.
Hence, the same values listed above appear in extended double-precision as
follows (note the leading `8' digit in the significands):

    0000.0000000000000000    +0
    3FFF.8000000000000000     1
    4000.8000000000000000     2
    7FFF.8000000000000000    +infinity

    8000.0000000000000000    -0
    BFFF.8000000000000000    -1
    C000.8000000000000000    -2
    FFFF.8000000000000000    -infinity

    3FFE.FFFFFFFFFFFFFFFF    largest representable number preceding +1
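
Since most languages have no native type for this format, the value has to
be computed from the fields.  The Python sketch below is an illustration
with an invented function name.  It assumes the usual exponent bias of
16383 for this format and treats the 16 significand digits as a 64-bit
integer whose top bit is the explicit leading bit.  Finite values are
returned as exact fractions (so the sign of a zero is lost), and infinities
and NaNs are returned as Python floats.

```python
from fractions import Fraction


def floatx80_from_display(text):
    """Evaluate a value such as '3FFF.8000000000000000'."""
    head, significand_hex = text.split(".")
    sign_and_exponent = int(head, 16)
    significand = int(significand_hex, 16)
    sign = -1 if sign_and_exponent & 0x8000 else 1
    exponent = sign_and_exponent & 0x7FFF
    if exponent == 0x7FFF:
        # All bits below the explicit leading bit zero: infinity, else NaN.
        if significand & 0x7FFFFFFFFFFFFFFF:
            return float("nan")
        return sign * float("inf")
    # Zero and subnormal values use the same scale as exponent field 1.
    effective = max(exponent, 1)
    return sign * Fraction(significand) * Fraction(2) ** (effective - 16383 - 63)
```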

The representation of single-precision values is unusual for a different
reason.  Because the subfields of standard single-precision do not fall
on neat 4-bit boundaries, single-precision outputs are slightly perturbed.
These are written as 9 hexadecimal digits, with a period separating the 3rd
and 4th hexadecimal digits.  Broken out into bits, the 9 hexadecimal digits
cover the single-precision subfields as follows:

    x000 .... ....  .  .... .... .... .... .... ....    sign       (1 bit)
    .... xxxx xxxx  .  .... .... .... .... .... ....    exponent   (8 bits)
    .... .... ....  .  0xxx xxxx xxxx xxxx xxxx xxxx    fraction  (23 bits)

As shown in this schematic, the first hexadecimal digit contains only
the sign, and will be either `0' or `8'.  The next two digits give the
biased exponent as an 8-bit integer.  This is followed by a period and
6 hexadecimal digits of fraction.  The most significant hexadecimal digit
of the fraction can be at most a `7'.

Notable single-precision values include:

    000.000000    +0
    07F.000000     1
    080.000000     2
    0FF.000000    +infinity

    800.000000    -0
    87F.000000    -1
    880.000000    -2
    8FF.000000    -infinity

    07E.7FFFFF    largest representable number preceding +1

Again, certain categories are easily distinguished (assuming the `x's are
not all 0):

    000.xxxxxx    positive subnormal (denormalized) numbers
    0FF.xxxxxx    positive NaNs
    800.xxxxxx    negative subnormal numbers
    8FF.xxxxxx    negative NaNs
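
Unlike the other formats, the single-precision notation is not the raw
encoding, so the three subfields must be repacked before the value can be
reinterpreted.  The Python sketch below follows the schematic above; the
function names are invented for the example.

```python
import struct


def float32_bits_from_display(text):
    """Repack a value such as '07F.000000' into its 32-bit encoding."""
    head, fraction_hex = text.split(".")
    head_value = int(head, 16)       # sign bit, three zero bits, 8 exponent bits
    sign = head_value >> 11
    exponent = head_value & 0xFF
    fraction = int(fraction_hex, 16)  # 23 fraction bits in 6 digits
    return (sign << 31) | (exponent << 23) | fraction


def float32_from_display(text):
    """Convert the displayed form to a Python float."""
    packed = struct.pack(">I", float32_bits_from_display(text))
    return struct.unpack(">f", packed)[0]
```

With this repacking, `07F.000000' becomes the familiar encoding 3F800000
for the value 1.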

Lastly, exception flag values are represented by five characters, one
character per flag.  Each flag is written as either a letter or a period
(`.') according to whether the flag was set or not by the operation.  A
period indicates the flag was not set.  The letter used to indicate a set
flag depends on the flag:

    v    invalid flag
    z    division-by-zero flag
    o    overflow flag
    u    underflow flag
    x    inexact flag

For example, the notation `...ux' indicates that the underflow and inexact
exception flags were set and that the other three flags (invalid, division-
by-zero, and overflow) were not set.  The exception flags are always shown
following the value returned as the result of the operation.
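
The flag notation is easily expanded into names, as in the following Python
sketch.  The table simply restates the letters listed above; the function
name is invented for the example.

```python
FLAG_NAMES = {
    "v": "invalid",
    "z": "division-by-zero",
    "o": "overflow",
    "u": "underflow",
    "x": "inexact",
}


def decode_flags(text):
    """Return the names of the flags that are set in a string like '...ux'."""
    return [FLAG_NAMES[character] for character in text if character != "."]
```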
206917Smarius
206917SmariusThe output from `testsoftfloat' is of the same form, except that the results
206917Smariusare labeled `true' and `soft':
206917Smarius
206917Smarius    <inputs>  true: <simple-software-result>  soft: <SoftFloat-result>
206917Smarius
206917SmariusThe ``true'' result is from the simpler, slower software floating-point,
206917Smariuswhich, although not necessarily correct, is more likely to be right than
206917Smariusthe SoftFloat (`soft') result.
206917Smarius
206917Smarius
206917Smarius-------------------------------------------------------------------------------
Variations Allowed by the IEC/IEEE Standard

The IEC/IEEE Standard admits some variation among conforming
implementations.  Because TestFloat expects the two implementations being
compared to deliver bit-for-bit identical results under most circumstances,
this leeway in the standard can result in false errors being reported if
the two implementations do not make the same choices everywhere the standard
provides an option.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Underflow

The standard specifies that the underflow exception flag is to be raised
when two conditions are met simultaneously:  (1) _tininess_ and (2) _loss_
_of_accuracy_.  A result is tiny when its magnitude is nonzero yet smaller
than any normalized floating-point number.  The standard allows tininess to
be determined either before or after a result is rounded to the destination
precision.  If tininess is detected before rounding, some borderline cases
will be flagged as underflows even though the result after rounding actually
lies within the normal floating-point range.  By detecting tininess after
rounding, a system can avoid some unnecessary signaling of underflow.
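The borderline case can be made concrete with exact rational arithmetic.
The following Python sketch models single precision (24 significand bits,
smallest normalized magnitude 2**-126) and rounding to nearest/even.  It is
only an illustration of the two definitions of tininess; the helper names
are hypothetical and the code is unrelated to TestFloat's own sources.

```python
from fractions import Fraction

PRECISION = 24                       # significand bits in single precision
MIN_NORMAL = Fraction(1, 2**126)     # smallest normalized magnitude


def floor_log2(x):
    """Return the integer e with 2**e <= x < 2**(e + 1), for x > 0."""
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    return e


def round_unbounded(x):
    """Round x to PRECISION bits (nearest/even), ignoring exponent limits."""
    if x == 0:
        return x
    quantum = Fraction(2) ** (floor_log2(abs(x)) - PRECISION + 1)
    return round(x / quantum) * quantum   # round() on a Fraction ties to even


def tiny_before_rounding(x):
    return 0 < abs(x) < MIN_NORMAL


def tiny_after_rounding(x):
    return 0 < abs(round_unbounded(x)) < MIN_NORMAL


# An exact result just below the smallest normal number.
borderline = MIN_NORMAL * (1 - Fraction(1, 2**25))
print(tiny_before_rounding(borderline))   # True
print(tiny_after_rounding(borderline))    # False: it rounds up to 2**-126
```

A system that detects tininess before rounding would consider this value
tiny, whereas a system that detects tininess after rounding would not.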

Loss of accuracy occurs when the subnormal format is not sufficient
to represent an underflowed result accurately.  The standard allows
loss of accuracy to be detected either as an _inexact_result_ or as a
_denormalization_loss_.  If loss of accuracy is detected as an inexact
result, the underflow flag is raised whenever an underflowed quantity
cannot be exactly represented in the subnormal format (that is, whenever the
inexact flag is also raised).  A denormalization loss, on the other hand,
occurs only when the subnormal format is not able to represent the result
that would have been returned if the destination format had infinite range.
Some underflowed results are inexact but do not suffer a denormalization
loss.  By detecting loss of accuracy as a denormalization loss, a system can
once again avoid some unnecessary signaling of underflow.
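The difference between the two definitions can be seen on two tiny values.
The sketch below again models single precision with exact rationals and
nearest/even rounding; `is_inexact' and `has_denormalization_loss' are
hypothetical names chosen for this illustration, and both functions assume
their argument is a nonzero, already tiny exact result.

```python
from fractions import Fraction

PRECISION = 24        # significand bits in single precision
MIN_EXPONENT = -126   # exponent of the smallest normalized number


def floor_log2(x):
    """Return the integer e with 2**e <= x < 2**(e + 1), for x > 0."""
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    return e


def round_nearest_even(x, min_exponent=None):
    """Round nonzero x to PRECISION bits; if min_exponent is given, the
    exponent is clamped there, which models the subnormal format."""
    exponent = floor_log2(abs(x))
    if min_exponent is not None:
        exponent = max(exponent, min_exponent)
    quantum = Fraction(2) ** (exponent - PRECISION + 1)
    return round(x / quantum) * quantum


def is_inexact(x):
    """The delivered (subnormal) result differs from the exact value."""
    return round_nearest_even(x, MIN_EXPONENT) != x


def has_denormalization_loss(x):
    """The delivered result differs from the infinite-range result."""
    return round_nearest_even(x, MIN_EXPONENT) != round_nearest_even(x)


a = Fraction(1, 2**127) * (1 + Fraction(1, 2**30))
b = Fraction(1, 2**127) * (1 + Fraction(1, 2**23))
print(is_inexact(a), has_denormalization_loss(a))   # True False
print(is_inexact(b), has_denormalization_loss(b))   # True True
```

Value `a' is inexact in any case, so the subnormal format loses nothing
further; value `b' would be exact with infinite range and is inexact only
because of the subnormal format.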

The `-tininessbefore' and `-tininessafter' options can be used to control
whether TestFloat expects tininess on underflow to be detected before or
after rounding.  (See _TestFloat_Options_ below.)  One or the other is
selected as the default when TestFloat is compiled, but these command
options allow the default to be overridden.

Most (possibly all) systems detect loss of accuracy as an inexact result.
The current version of TestFloat can only test for this case.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
NaNs

The IEC/IEEE Standard gives the floating-point formats a large number of
NaN encodings and specifies that NaNs are to be returned as results under
certain conditions.  However, the standard allows an implementation almost
complete freedom over _which_ NaN to return in each situation.

By default, TestFloat does not check the bit patterns of NaN results.  When
the result of an operation should be a NaN, any NaN is considered as good
as another.  This laxness can be overridden with the `-checkNaNs' option.
(See _TestFloat_Options_ below.)  In order for this option to be sensible,
TestFloat must have been compiled so that its internal floating-point
implementation (SoftFloat) generates the proper NaN results for the system
being tested.
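The comparison rule described above might be sketched as follows for
single-precision results held as 32-bit integers.  This is an illustration
of the rule only, not TestFloat's actual code; `is_nan32', `results_agree',
and the `check_nans' parameter are hypothetical names.

```python
def is_nan32(bits):
    """True if the 32-bit pattern encodes a NaN (exponent all ones and a
    nonzero fraction)."""
    return (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0


def results_agree(expected, actual, check_nans=False):
    """Compare two single-precision results bit for bit, except that by
    default any NaN is accepted where a NaN is expected."""
    if not check_nans and is_nan32(expected) and is_nan32(actual):
        return True
    return expected == actual


print(results_agree(0x7FC00000, 0xFFC00001))                   # True
print(results_agree(0x7FC00000, 0xFFC00001, check_nans=True))  # False
```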

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Conversions to Integer

Conversion of a floating-point value to an integer format will fail if the
source value is a NaN or if it is too large.  The IEC/IEEE Standard does not
specify what value should be returned as the integer result in these cases.
Moreover, according to the standard, the invalid exception can be raised or
an unspecified alternative mechanism may be used to signal such cases.

TestFloat assumes that conversions to integer will raise the invalid
exception if the source value cannot be rounded to a representable integer.
When the conversion overflows, TestFloat expects the largest integer with
the same sign as the operand to be returned.  If the floating-point operand
is a NaN, TestFloat allows either the largest positive or largest negative
integer to be returned.  The current version of TestFloat provides no means
to alter these conventions.
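These conventions could be summarized, for a conversion to a 32-bit signed
integer, by the Python sketch below.  Several points are assumptions made
for the illustration:  the rounding mode is nearest/even, infinities are
treated as overflow, and ``largest negative integer'' is read as the most
negative value (-2**31).  The function name `expected_int32' is
hypothetical.

```python
import math

INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def expected_int32(value):
    """Return (acceptable results, whether the invalid flag is expected)."""
    if math.isnan(value):
        return {INT32_MAX, INT32_MIN}, True     # either extreme is allowed
    if math.isinf(value):
        return {INT32_MAX if value > 0 else INT32_MIN}, True
    rounded = round(value)                      # Python rounds ties to even
    if rounded > INT32_MAX:
        return {INT32_MAX}, True
    if rounded < INT32_MIN:
        return {INT32_MIN}, True
    return {rounded}, False


print(expected_int32(2.5))     # ({2}, False)
print(expected_int32(-3e9))    # ({-2147483648}, True)
```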

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -


-------------------------------------------------------------------------------
TestFloat Options

The `testfloat' (and `testsoftfloat') program accepts several command
options.  If mutually contradictory options are given, the last one has
priority.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-help

The `-help' option causes a summary of program usage to be written, after
which the program exits.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-list

The `-list' option causes a list of testable functions to be written,
after which the program exits.  Some machines do not implement all of the
functions TestFloat can test, and functions that are inaccessible from the
C language may not be testable.

The `testsoftfloat' program does not have this option.  All SoftFloat
functions can be tested by `testsoftfloat'.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-level <num>

The `-level' option sets the level of testing.  The argument to `-level' can
be either 1 or 2.  The default is level 1.  Level 2 performs many more tests
than level 1.  Testing at level 2 can take as much as a day (even longer for
`testsoftfloat'), but can reveal bugs not found by level 1.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-errors <num>

The `-errors' option instructs TestFloat to report no more than the
specified number of errors for any combination of function, rounding mode,
etc.  The argument to `-errors' must be a nonnegative decimal number.  Once
the specified number of error reports has been generated, TestFloat ends the
current test and begins the next one, if any.  The default is `-errors 20'.

Counterintuitively, `-errors 0' causes TestFloat to report every error it
finds.
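The behavior of the limit, including the special meaning of zero, can be
modeled in a few lines of Python.  This is a sketch of the documented
behavior only, not of how TestFloat is implemented; the function name
`collect_error_reports' is hypothetical.

```python
def collect_error_reports(mismatches, max_errors=20):
    """Keep at most max_errors reports for one test; 0 means no limit."""
    reports = []
    for mismatch in mismatches:
        reports.append(mismatch)
        if max_errors != 0 and len(reports) >= max_errors:
            break          # end the current test; the next one may begin
    return reports


print(len(collect_error_reports(range(50))))      # 20
print(len(collect_error_reports(range(50), 0)))   # 50
```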

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-errorstop

The `-errorstop' option causes the program to exit after the first function
for which any errors are reported.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-forever

The `-forever' option causes a single operation to be repeatedly tested.
Only one rounding mode and/or rounding precision can be tested in a single
invocation.  If not specified, the rounding mode defaults to nearest/even.
For extended double-precision operations, the rounding precision defaults
to full extended double precision.  The testing level is set to 2 by this
option.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-checkNaNs

The `-checkNaNs' option causes TestFloat to verify the bitwise correctness
of NaN results.  In order for this option to be sensible, TestFloat must
have been compiled so that its internal floating-point implementation
(SoftFloat) generates the proper NaN results for the system being tested.

This option is not available to `testsoftfloat'.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-precision32, -precision64, -precision80

For extended double-precision functions affected by rounding precision
control, the `-precision32' option restricts testing to only the cases
in which rounding precision is equivalent to single precision.  The other
rounding precisions are not tested.  Likewise, the `-precision64' and
`-precision80' options fix the rounding precision to the equivalent of
double precision or extended double precision, respectively.  These options
are ignored for functions not affected by rounding precision control.

These options are not available if extended double precision is not
supported by the machine or if extended double-precision functions cannot
be tested.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-nearesteven, -tozero, -down, -up

The `-nearesteven' option restricts testing to only the cases in which the
rounding mode is nearest/even.  The other rounding modes are not tested.
Likewise, `-tozero' forces rounding to zero; `-down' forces rounding down;
and `-up' forces rounding up.  These options are ignored for functions that
are exact and thus do not round.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-tininessbefore, -tininessafter

The `-tininessbefore' option indicates that the system detects tininess
on underflow before rounding.  The `-tininessafter' option indicates that
tininess is detected after rounding.  TestFloat alters its expectations
accordingly.  These options override the default selected when TestFloat was
compiled.  Choosing the wrong one of these two options should cause error
reports for some (not all) functions.

For `testsoftfloat', these options operate more like the rounding precision
and rounding mode options, in that they restrict the tests performed by
`testsoftfloat'.  By default, `testsoftfloat' tests both cases for any
function for which there is a difference.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -


-------------------------------------------------------------------------------
Function Sets

Just as TestFloat can test an operation for all four rounding modes in
sequence, multiple operations can be tested with a single invocation of
TestFloat.  Three sets are recognized:  `-all1', `-all2', and `-all'.  The
set `-all1' comprises all one-operand functions; `-all2' is all two-operand
functions; and `-all' is all functions.  A function set can be used in place
of a function name in the TestFloat command line, such as

    testfloat [<option>...] -all


-------------------------------------------------------------------------------
Contact Information

At the time of this writing, the most up-to-date information about
TestFloat and the latest release can be found at the Web page
`http://HTTP.CS.Berkeley.EDU/~jhauser/arithmetic/TestFloat.html'.