Long double format
==================

  Each long double is made up of two IEEE doubles.  The value of the
long double is the sum of the values of the two parts (except for
-0.0).  The most significant part is required to be the value of the
long double rounded to the nearest double, as specified by IEEE.  For
Inf values, the least significant part is required to be one of +0.0
or -0.0.  No other requirements are made; so, for example, 1.0 may be
represented as (1.0, +0.0) or (1.0, -0.0), and the low part of a NaN
is a don't-care.
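
The requirements above can be sketched in C as a check on a pair of
doubles.  This is only an illustration: the type 'dd_pair' and the
function 'dd_high_part_ok' are hypothetical names, not part of any
library, and the check assumes the default round-to-nearest mode.

```c
#include <math.h>

/* Hypothetical illustration type: the two IEEE doubles of one long double. */
typedef struct {
    double hi;  /* most significant part */
    double lo;  /* least significant part */
} dd_pair;

/* Returns 1 if (hi, lo) satisfies the requirements stated above, else 0. */
static int dd_high_part_ok(dd_pair x)
{
    volatile double sum;

    if (isnan(x.hi))
        return 1;               /* the low part of a NaN is a don't-care */
    if (isinf(x.hi))
        return x.lo == 0.0;     /* true for both +0.0 and -0.0 */

    /* Assumes round-to-nearest: hi + lo rounds to the nearest double.
       The volatile store discards any extra intermediate precision. */
    sum = x.hi + x.lo;
    return sum == x.hi;
}
```

With this sketch, (1.0, +0.0), (1.0, -0.0) and (1.0, 2^-60) all pass,
while (1.0, 1.0) does not, because 2.0 rounded to a double is not 1.0.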

Classification
--------------

A long double can represent any value of the form
  s * 2^e * sum(k=0...105: f_k * 2^(-k))
where 's' is +1 or -1, 'e' is between -968 and 1022 inclusive, f_0 is
1, and f_k for k>0 is 0 or 1.  These are the 'normal' long doubles.

A long double can also represent any value of the form
  s * 2^-968 * sum(k=0...105: f_k * 2^(-k))
where 's' is +1 or -1, f_0 is 0, and f_k for k>0 is 0 or 1.  These are
the 'subnormal' long doubles.

There are four long doubles that represent zero, two that represent
+0.0 and two that represent -0.0.  The sign of the high part is the
sign of the long double, and the sign of the low part is ignored.

Likewise, there are four long doubles that represent infinities, two
for +Inf and two for -Inf.

Each NaN, quiet or signalling, that can be represented as a 'double'
can be represented as a 'long double'.  In fact, there are 2^64
equivalent representations for each one.

There are certain other valid long doubles where both parts are
nonzero but the low part represents a value that has a bit set below
2^(e-105).  These, together with the subnormal long doubles, make up
the denormal long doubles.

Many possible long double bit patterns are not valid long doubles.
These do not represent any value.

Limits
------

The maximum representable long double is 2^1024-2^918.  The smallest
*normal* positive long double is 2^-968.  The smallest denormalised
positive long double is 2^-1074 (this is the same as for 'double').

Conversions
-----------

A double can be converted to a long double by adding a zero low part.

A long double can be converted to a double by removing the low part.
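
Both conversions can be sketched in a few lines of C.  The names
'dd_pair', 'dd_from_double' and 'dd_to_double' are hypothetical and
used only for illustration.  Dropping the low part gives a correctly
rounded result because the high part is already required to be the
long double rounded to the nearest double.

```c
/* Hypothetical illustration type: the two IEEE doubles of one long double. */
typedef struct {
    double hi;  /* most significant part */
    double lo;  /* least significant part */
} dd_pair;

/* double -> long double: add a zero low part. */
static dd_pair dd_from_double(double d)
{
    dd_pair r = { d, 0.0 };
    return r;
}

/* long double -> double: remove the low part. */
static double dd_to_double(dd_pair x)
{
    return x.hi;
}
```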

Comparisons
-----------

Two long doubles can be compared by comparing the high parts, and if
those compare equal, comparing the low parts.
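
A minimal sketch of this comparison in C follows.  The names
'dd_pair' and 'dd_compare', and the choice of 2 as the return value
for an unordered (NaN) comparison, are assumptions made for the
example only.

```c
/* Hypothetical illustration type: the two IEEE doubles of one long double. */
typedef struct {
    double hi;  /* most significant part */
    double lo;  /* least significant part */
} dd_pair;

/* Returns -1, 0 or +1 for a < b, a == b, a > b, and 2 if either
   operand is a NaN (the low part of a NaN is never examined). */
static int dd_compare(dd_pair a, dd_pair b)
{
    if (a.hi != a.hi || b.hi != b.hi)
        return 2;               /* x != x is true only for NaN */
    if (a.hi < b.hi)
        return -1;
    if (a.hi > b.hi)
        return 1;

    /* High parts compare equal: the low parts decide. */
    if (a.lo < b.lo)
        return -1;
    if (a.lo > b.lo)
        return 1;
    return 0;
}
```

Because +0.0 and -0.0 compare equal as doubles, the two
representations of 1.0, (1.0, +0.0) and (1.0, -0.0), compare equal
here as well.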

Arithmetic
----------

The unary negate operation negates both the low and high parts.

An absolute or absolute-negate operation must be done by comparing
against zero and negating if necessary.
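
The following C sketch illustrates these operations; the names
'dd_pair', 'dd_neg', 'dd_abs' and 'dd_nabs' are hypothetical.  It also
shows why the sign bits cannot simply be cleared: the value
-(1 - 2^-60) is held as (-1.0, +2^-60), and clearing both sign bits
would give (1.0, +2^-60), which is 1 + 2^-60 rather than 1 - 2^-60.

```c
/* Hypothetical illustration type: the two IEEE doubles of one long double. */
typedef struct {
    double hi;  /* most significant part */
    double lo;  /* least significant part */
} dd_pair;

/* Unary negate: negate both parts. */
static dd_pair dd_neg(dd_pair x)
{
    dd_pair r = { -x.hi, -x.lo };
    return r;
}

/* Absolute value: compare against zero and negate if necessary.
   Note that this sketch leaves a -0.0 input unchanged, since
   -0.0 < 0.0 is false. */
static dd_pair dd_abs(dd_pair x)
{
    return (x.hi < 0.0) ? dd_neg(x) : x;
}

/* Absolute-negate: the result is never positive. */
static dd_pair dd_nabs(dd_pair x)
{
    return (x.hi > 0.0) ? dd_neg(x) : x;
}
```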

Addition and subtraction are performed using library routines.  They
are not at present performed perfectly accurately: the result produced
will be within 1ulp of the range generated by adding 1ulp to or
subtracting 1ulp from the input values, where a 'ulp' is 2^(e-106)
given the exponent 'e'.  In the presence of cancellation, this may be
arbitrarily inaccurate.  Subtraction is done by negation and addition.
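
For illustration only, one well-known way to add two such pairs is
sketched below.  It is not claimed to be the library routine, nor to
meet the accuracy bound stated above; the names 'dd_pair', 'two_sum',
'dd_add_sketch' and 'dd_sub_sketch' are hypothetical.  The sketch
assumes strict IEEE double evaluation (no extended intermediate
precision, no fast-math options) and ignores Inf and NaN inputs.

```c
/* Hypothetical illustration type: the two IEEE doubles of one long double. */
typedef struct {
    double hi;  /* most significant part */
    double lo;  /* least significant part */
} dd_pair;

/* Error-free sum of two doubles: on return hi + lo equals a + b
   exactly (barring overflow), with hi the rounded sum. */
static dd_pair two_sum(double a, double b)
{
    dd_pair r;
    double bb;

    r.hi = a + b;
    bb = r.hi - a;
    r.lo = (a - (r.hi - bb)) + (b - bb);
    return r;
}

/* Sketch of addition: add the high parts exactly, fold in the low
   parts with ordinary rounded additions, then renormalise. */
static dd_pair dd_add_sketch(dd_pair a, dd_pair b)
{
    dd_pair s = two_sum(a.hi, b.hi);
    double e = s.lo + (a.lo + b.lo);    /* rounded: a source of error */
    dd_pair r;

    r.hi = s.hi + e;
    r.lo = e - (r.hi - s.hi);
    return r;
}

/* Subtraction is done by negation and addition. */
static dd_pair dd_sub_sketch(dd_pair a, dd_pair b)
{
    dd_pair nb = { -b.hi, -b.lo };
    return dd_add_sketch(a, nb);
}
```

For example, adding (1.0, 0.0) and (2^-60, 0.0) with this sketch gives
(1.0, 2^-60), a sum that a single double cannot hold.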

Multiplication is also performed using a library routine.  Its result
will be within 2ulp of the correct result.

Division is also performed using a library routine.  Its result will
be within 3ulp of the correct result.