<!-- internals.xml revision 1.1.1.1 -->
<sect1 id="appendix.porting.internals" xreflabel="Porting Internals">
<?dbhtml filename="internals.html"?>

<sect1info>
  <keywordset>
    <keyword>
      ISO C++
    </keyword>
    <keyword>
      internals
    </keyword>
  </keywordset>
</sect1info>

<title>Porting to New Hardware or Operating Systems</title>

<para>This document explains how to port libstdc++ (the GNU C++ library) to
a new target.
</para>

   <para>To make libstdc++ work with a new target, you must edit some
configuration files and provide some new header files.  Unless this is
done, libstdc++ will use generic settings which may not be correct for
your target; even if they are correct, they will likely be inefficient.
   </para>

   <para>Before you get started, make sure that you have a working C library on
your target.  The C library need not comply precisely with any
particular standard, but it should generally conform to the requirements
imposed by the ANSI/ISO C standard.
   </para>

   <para>In addition, you should try to verify that the C++ compiler generally
works.  It is difficult to test the C++ compiler without a working
library, but you should at least try some minimal test cases.
   </para>

   <para>(Note that what we think of as a "target," the library refers to as
a "host."  The comment at the top of <code>configure.ac</code> explains why.)
   </para>


<sect2 id="internals.os">
<title>Operating System</title>

<para>If you are porting to a new operating system (as opposed to a new chip
using an existing operating system), you will need to create a new
directory in the <code>config/os</code> hierarchy.  For example, the IRIX
configuration files are all in <code>config/os/irix</code>.  There is no set
way to organize the OS configuration directory.  For example,
<code>config/os/solaris/solaris-2.6</code> and
<code>config/os/solaris/solaris-2.7</code> are used as configuration
directories for these two versions of Solaris.  On the other hand, both
Solaris 2.7 and Solaris 2.8 use the <code>config/os/solaris/solaris-2.7</code>
directory.  The important point is that there must be a
directory under <code>config/os</code> to store the files for your operating
system.
</para>

   <para>You might have to change the <code>configure.host</code> file to ensure that
your new directory is activated.  Look for the switch statement (a shell
<code>case</code> statement) that sets <code>os_include_dir</code>, and add a
pattern to handle your operating system if the default will not suffice.
The switch statement switches on only the OS portion of the standard
target triplet; e.g., the <code>solaris2.8</code>
in <code>sparc-sun-solaris2.8</code>.  If the new directory is named after the
OS portion of the triplet (the default), then nothing needs to be changed.
   </para>

   <para>The first file to create in this directory should be called
<code>os_defines.h</code>.  This file contains basic macro definitions
that are required to allow the C++ library to work with your C library.
   </para>

   <para>Several libstdc++ source files unconditionally define the macro
<code>_POSIX_SOURCE</code>.  On many systems, defining this macro causes
large portions of the C library header files to be eliminated
at preprocessing time.  Therefore, you may have to <code>#undef</code> this
macro, or define other macros (like <code>_LARGEFILE_SOURCE</code> or
<code>__EXTENSIONS__</code>).  You won't know which macros to define or
undefine at this point; you'll have to try compiling the library and
see what goes wrong.  If you see errors about calls to functions
that have not been declared, look in your C library headers to see if
the functions are declared there, and then figure out which macros you
need to define.  You will need to add them to the
<code>CPLUSPLUS_CPP_SPEC</code> macro in the GCC configuration file for your
target.  It will not work to simply define these macros in
<code>os_defines.h</code>.
   </para>

   <para>At this time, there are a few libstdc++-specific macros which may be
defined:
   </para>

   <para><code>_GLIBCXX_USE_C99_CHECK</code> may be defined to 1 to check C99
function declarations (which are not covered by specialization below)
found in system headers against versions found in the library headers
derived from the standard.
   </para>

   <para><code>_GLIBCXX_USE_C99_DYNAMIC</code> may be defined to an expression that
yields 0 if and only if the system headers are exposing proper support
for C99 functions (which are not covered by specialization below).  If
defined, it must be 0 while bootstrapping the compiler/rebuilding the
library.
   </para>

   <para><code>_GLIBCXX_USE_C99_LONG_LONG_CHECK</code> may be defined to 1 to check
the set of C99 long long function declarations found in system headers
against versions found in the library headers derived from the
standard.
   </para>

   <para><code>_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC</code> may be defined to an
expression that yields 0 if and only if the system headers are
exposing proper support for the set of C99 long long functions.  If
defined, it must be 0 while bootstrapping the compiler/rebuilding the
library.
   </para>

   <para><code>_GLIBCXX_USE_C99_FP_MACROS_DYNAMIC</code> may be defined to an
expression that yields 0 if and only if the system headers
are exposing proper support for the related set of macros.  If defined,
it must be 0 while bootstrapping the compiler/rebuilding the library.
   </para>

   <para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_CHECK</code> may be defined
to 1 to check the related set of function declarations found in system
headers against versions found in the library headers derived from
the standard.
   </para>

   <para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_DYNAMIC</code> may be defined
to an expression that yields 0 if and only if the system headers
are exposing proper support for the related set of functions.  If defined,
it must be 0 while bootstrapping the compiler/rebuilding the library.
   </para>

   <para>Finally, you should bracket the entire file in an include guard, like
this:
   </para>

<programlisting>
#ifndef _GLIBCXX_OS_DEFINES
#define _GLIBCXX_OS_DEFINES
...
#endif
</programlisting>

   <para>We recommend copying an existing <code>os_defines.h</code> to use as a
starting point.
   </para>
</sect2>


<sect2 id="internals.cpu">
<title>CPU</title>

<para>If you are porting to a new chip (as opposed to a new operating system
running on an existing chip), you will need to create a new directory in the
<code>config/cpu</code> hierarchy.  Much like the <link linkend="internals.os">Operating system</link> setup,
there are no strict rules on how to organize the CPU configuration
directory, but careful naming choices will allow the configury to find your
setup files without explicit help.
</para>

   <para>We recommend that for a target triplet <code>&lt;CPU&gt;-&lt;vendor&gt;-&lt;OS&gt;</code>, you
name your configuration directory <code>config/cpu/&lt;CPU&gt;</code>.  If you do this,
the configury will find the directory by itself.  Otherwise you will need to
edit the <code>configure.host</code> file and, in the switch statement that sets
<code>cpu_include_dir</code>, add a pattern to handle your chip.
   </para>

   <para>Note that some chip families share a single configuration directory.
For example, <code>alpha</code>, <code>alphaev5</code>, and <code>alphaev6</code> all use the
<code>config/cpu/alpha</code> directory, and there is an entry in the
<code>configure.host</code> switch statement to handle this.
   </para>

   <para>The <code>cpu_include_dir</code> setting also determines the default locations
of the files controlling <link linkend="internals.thread_safety">Thread safety</link> and <link linkend="internals.numeric_limits">Numeric limits</link>; see those
sections if the defaults are not appropriate for your chip.
   </para>

</sect2>


<sect2 id="internals.char_types">
<title>Character Types</title>

<para>The library requires that you provide three header files to implement
character classification, analogous to that provided by the C library's
<code>&lt;ctype.h&gt;</code> header.  You can model these on the files provided in
<code>config/os/generic</code>.  However, these files will almost
certainly need some modification.
</para>

   <para>The first file to write is <code>ctype_base.h</code>.  This file provides
some very basic information about character classification.  The libstdc++
library assumes that your C library implements <code>&lt;ctype.h&gt;</code> by using
a table (indexed by character code) containing integers, where each of
these integers is a bit-mask indicating whether the character is
upper-case, lower-case, alphabetic, etc.  The <code>ctype_base.h</code>
file gives the type of the integer, and the values of the various bit
masks.  You will have to peer at your own <code>&lt;ctype.h&gt;</code> to figure out
how to define the values required by this file.
   </para>
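
   <para>The table-driven scheme described above can be sketched in a few lines.
This is only an illustration, not code from any C library: the
<code>toy_</code> names and the mask values are hypothetical, and the sketch
assumes an ASCII-compatible character set in which letters and digits are
contiguous.
   </para>

<programlisting><![CDATA[
```cpp
// Hypothetical bit-mask values; a real port takes these from <ctype.h>.
typedef unsigned short toy_mask;
const toy_mask toy_upper = 1 << 0;
const toy_mask toy_lower = 1 << 1;
const toy_mask toy_alpha = 1 << 2;
const toy_mask toy_digit = 1 << 3;

// One entry per unsigned char value; static storage starts zero-filled.
static toy_mask toy_table[256];

// Fill in the entries for ASCII letters and digits only (an assumption
// of this sketch; a real C library covers every character class).
inline void toy_init_table()
{
  for (int c = 'A'; c <= 'Z'; ++c)
    toy_table[c] = toy_upper | toy_alpha;
  for (int c = 'a'; c <= 'z'; ++c)
    toy_table[c] = toy_lower | toy_alpha;
  for (int c = '0'; c <= '9'; ++c)
    toy_table[c] = toy_digit;
}

// Classification is one table lookup and one bitwise AND.
inline bool toy_is(toy_mask m, char c)
{
  return (toy_table[static_cast<unsigned char>(c)] & m) != 0;
}
```
]]></programlisting>

   <para>After <code>toy_init_table()</code> has run, <code>toy_is(toy_upper, 'Q')</code>
is true and <code>toy_is(toy_alpha, '7')</code> is false.  The cast to
<code>unsigned char</code> keeps the index non-negative on targets where plain
<code>char</code> is signed.
   </para>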

   <para>The <code>ctype_base.h</code> header file does not need include guards.
It should contain a single <code>struct</code> definition called
<code>ctype_base</code>.  This <code>struct</code> should contain two type
declarations, and one enumeration declaration, like this example, taken
from the IRIX configuration:
   </para>

<programlisting>
  struct ctype_base
  {
    typedef unsigned int  mask;
    typedef int*          __to_type;

    enum
    {
      space = _ISspace,
      print = _ISprint,
      cntrl = _IScntrl,
      upper = _ISupper,
      lower = _ISlower,
      alpha = _ISalpha,
      digit = _ISdigit,
      punct = _ISpunct,
      xdigit = _ISxdigit,
      alnum = _ISalnum,
      graph = _ISgraph
    };
  };
</programlisting>

<para>The <code>mask</code> type is the type of the elements in the table.  If your
C library uses a table to map lower-case characters to upper-case characters,
and vice versa, you should define <code>__to_type</code> to be the type of the
elements in that table.  If you don't mind taking a minor performance
penalty, or if your library doesn't implement <code>toupper</code> and
<code>tolower</code> in this way, you can pick any pointer-to-integer type,
but you must still define the type.
</para>

   <para>The enumeration should give definitions for all the values in the above
example, using the values from your native <code>&lt;ctype.h&gt;</code>.  They can
be given symbolically (as above), or numerically, if you prefer.  You do
not have to include <code>&lt;ctype.h&gt;</code> in this header; it will always be
included before <code>ctype_base.h</code> is included.
   </para>

   <para>The next file to write is <code>ctype_noninline.h</code>, which also does
not require include guards.  This file defines a few member functions
that will be included in <code>include/bits/locale_facets.h</code>.  The first
function that must be written is the <code>ctype&lt;char&gt;::ctype</code>
constructor.  Here is the IRIX example:
   </para>

<programlisting>
ctype&lt;char&gt;::ctype(const mask* __table = 0, bool __del = false,
                   size_t __refs = 0)
  : _Ctype_nois&lt;char&gt;(__refs), _M_del(__table != 0 &amp;&amp; __del),
    _M_toupper(NULL),
    _M_tolower(NULL),
    _M_ctable(NULL),
    _M_table(!__table
             ? (const mask*) (__libc_attr._ctype_tbl-&gt;_class + 1)
             : __table)
{ }
</programlisting>

<para>There are two parts of this that you might choose to alter.  The first,
and most important, is the line involving <code>__libc_attr</code>.  That is
IRIX system-dependent code that gets the base of the table mapping
character codes to attributes.  You need to substitute code that obtains
the address of this table on your system.  Second, if you want to use your
operating system's tables to map upper-case letters to lower-case, and
vice versa, you should initialize <code>_M_toupper</code> and
<code>_M_tolower</code> with those tables, in similar fashion.
</para>

   <para>Now, you have to write two functions to convert from upper-case to
lower-case, and vice versa.  Here are the IRIX versions:
   </para>

<programlisting>
     char
     ctype&lt;char&gt;::do_toupper(char __c) const
     { return _toupper(__c); }

     char
     ctype&lt;char&gt;::do_tolower(char __c) const
     { return _tolower(__c); }
</programlisting>

<para>Your C library most likely provides equivalents to IRIX's
<code>_toupper</code> and <code>_tolower</code>.  If you initialized
<code>_M_toupper</code> and <code>_M_tolower</code> above, then you could use
those tables instead.
</para>
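
   <para>As a rough sketch of the table-based alternative, a conversion through a
table of <code>int</code> (matching the <code>int*</code> used for
<code>__to_type</code> in the IRIX example) might look like the following.
The <code>toy_</code> names and the table itself are hypothetical; a real port
would use the table exported by its own C library, and the sketch assumes an
ASCII-compatible character set.
   </para>

<programlisting><![CDATA[
```cpp
// Hypothetical 256-entry conversion table, indexed by unsigned char value.
static int toy_toupper_table[256];

// Build the table: identity everywhere except the ASCII lower-case letters.
inline void toy_init_toupper()
{
  for (int c = 0; c < 256; ++c)
    toy_toupper_table[c] = c;
  for (int c = 'a'; c <= 'z'; ++c)
    toy_toupper_table[c] = c - 'a' + 'A';
}

// Convert one character by indexing the table, as a table-driven
// do_toupper could do with _M_toupper.
inline char toy_do_toupper(const int* table, char c)
{
  return static_cast<char>(table[static_cast<unsigned char>(c)]);
}
```
]]></programlisting>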

   <para>Finally, you have to provide two utility functions that convert strings
of characters.  The versions provided here will always work, but you
could use specialized routines for greater performance if you have
machinery to do that on your system:
   </para>

<programlisting>
     const char*
     ctype&lt;char&gt;::do_toupper(char* __low, const char* __high) const
     {
       while (__low &lt; __high)
         {
           *__low = do_toupper(*__low);
           ++__low;
         }
       return __high;
     }

     const char*
     ctype&lt;char&gt;::do_tolower(char* __low, const char* __high) const
     {
       while (__low &lt; __high)
         {
           *__low = do_tolower(*__low);
           ++__low;
         }
       return __high;
     }
</programlisting>

   <para>You must also provide the <code>ctype_inline.h</code> file, which
contains a few more functions.  On most systems, you can just copy
<code>config/os/generic/ctype_inline.h</code> and use it on your system.
   </para>

   <para>In detail, the functions provided test characters for particular
properties; they are analogous to the functions like <code>isalpha</code> and
<code>islower</code> provided by the C library.
   </para>

   <para>The first function is implemented like this on IRIX:
   </para>

<programlisting>
     bool
     ctype&lt;char&gt;::
     is(mask __m, char __c) const throw()
     { return (_M_table)[(unsigned char)(__c)] &amp; __m; }
</programlisting>

<para>The <code>_M_table</code> is the table passed in above, in the constructor.
This is the table that contains the bitmasks for each character.  The
implementation here should work on all systems.
</para>

   <para>The next function is:
   </para>

<programlisting>
     const char*
     ctype&lt;char&gt;::
     is(const char* __low, const char* __high, mask* __vec) const throw()
     {
       while (__low &lt; __high)
         *__vec++ = (_M_table)[(unsigned char)(*__low++)];
       return __high;
     }
</programlisting>

<para>This function is similar; it copies the masks for all the characters
from <code>__low</code> up to, but not including, <code>__high</code> into the
vector given by <code>__vec</code>.
</para>

   <para>The last two functions again are entirely generic:
   </para>

<programlisting>
     const char*
     ctype&lt;char&gt;::
     scan_is(mask __m, const char* __low, const char* __high) const throw()
     {
       while (__low &lt; __high &amp;&amp; !this-&gt;is(__m, *__low))
         ++__low;
       return __low;
     }

     const char*
     ctype&lt;char&gt;::
     scan_not(mask __m, const char* __low, const char* __high) const throw()
     {
       while (__low &lt; __high &amp;&amp; this-&gt;is(__m, *__low))
         ++__low;
       return __low;
     }
</programlisting>
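
   <para>Once the port builds, one way to check these members is to call them
through the standard <code>std::ctype&lt;char&gt;</code> facet of the classic "C"
locale.  The two helper functions below are hypothetical names introduced
only for this sketch; the facet calls themselves are the standard ones.
   </para>

<programlisting><![CDATA[
```cpp
#include <cstddef>
#include <cstring>
#include <locale>

// Offset of the first character of text that matches mask m,
// or the length of text if no character matches.
inline std::size_t first_with(std::ctype_base::mask m, const char* text)
{
  const std::ctype<char>& ct =
    std::use_facet<std::ctype<char> >(std::locale::classic());
  const char* end = text + std::strlen(text);
  return static_cast<std::size_t>(ct.scan_is(m, text, end) - text);
}

// Offset of the first character of text that does not match mask m,
// or the length of text if every character matches.
inline std::size_t first_without(std::ctype_base::mask m, const char* text)
{
  const std::ctype<char>& ct =
    std::use_facet<std::ctype<char> >(std::locale::classic());
  const char* end = text + std::strlen(text);
  return static_cast<std::size_t>(ct.scan_not(m, text, end) - text);
}
```
]]></programlisting>

   <para>For the string <code>"abc123"</code>, both
<code>first_with(std::ctype_base::digit, ...)</code> and
<code>first_without(std::ctype_base::alpha, ...)</code> should return 3 if the
classification table has been wired up correctly.
   </para>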

</sect2>


<sect2 id="internals.thread_safety">
<title>Thread Safety</title>

<para>The C++ library string functionality requires a couple of atomic
operations to provide thread-safety.  If you don't take any special
action, the library will use stub versions of these functions that are
not thread-safe.  They will work fine, unless your applications are
multi-threaded.
</para>

   <para>If you want to provide custom, safe, versions of these functions, there
are two distinct approaches.  One is to provide a version for your CPU,
using assembly language constructs.  The other is to use the
thread-safety primitives in your operating system.  In either case, you
make a file called <code>atomicity.h</code>, and the variable
<code>ATOMICITYH</code> must point to this file.
   </para>

   <para>If you are using the assembly-language approach, put this code in
<code>config/cpu/&lt;chip&gt;/atomicity.h</code>, where chip is the name of
your processor (see <link linkend="internals.cpu">CPU</link>).  No additional changes are necessary to
locate the file in this case; <code>ATOMICITYH</code> will be set by default.
   </para>

   <para>If you are using the operating system thread-safety primitives approach,
you can also put this code in the same CPU directory, in which case no more
work is needed to locate the file.  For examples of this approach,
see the <code>atomicity.h</code> file for IRIX or IA64.
   </para>

   <para>Alternatively, if the primitives are more closely related to the OS
than they are to the CPU, you can put the <code>atomicity.h</code> file in
the <link linkend="internals.os">Operating system</link> directory instead.  In this case, you must
edit <code>configure.host</code>, and in the switch statement that handles
operating systems, override the <code>ATOMICITYH</code> variable to point to
the appropriate <code>os_include_dir</code>.  For examples of this approach,
see the <code>atomicity.h</code> file for AIX.
   </para>

   <para>With those bits out of the way, you have to actually write
<code>atomicity.h</code> itself.  This file should be wrapped in an
include guard named <code>_GLIBCXX_ATOMICITY_H</code>.  It should define one
type, and two functions.
   </para>

   <para>The type is <code>_Atomic_word</code>.  Here is the version used on IRIX:
   </para>

<programlisting>
typedef long _Atomic_word;
</programlisting>

<para>This type must be a signed integral type supporting atomic operations.
If you're using the OS approach, use the same type used by your system's
primitives.  Otherwise, use the type for which your CPU provides atomic
primitives.
</para>

   <para>Then, you must provide two functions.  The bodies of these functions
must be equivalent to those provided here, but using atomic operations:
   </para>

<programlisting>
     static inline _Atomic_word
     __attribute__ ((__unused__))
     __exchange_and_add (_Atomic_word* __mem, int __val)
     {
       _Atomic_word __result = *__mem;
       *__mem += __val;
       return __result;
     }

     static inline void
     __attribute__ ((__unused__))
     __atomic_add (_Atomic_word* __mem, int __val)
     {
       *__mem += __val;
     }
</programlisting>
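
   <para>For comparison, here is a minimal sketch of atomic equivalents.  It
assumes a compiler that provides the GCC-style <code>__atomic_fetch_add</code>
builtin for your target, which may well not be the case on a new port; the
<code>toy_</code> names are hypothetical stand-ins for the real type and
functions, and the choice of memory ordering is an assumption of the sketch.
   </para>

<programlisting><![CDATA[
```cpp
// Hypothetical stand-in for _Atomic_word.
typedef int toy_atomic_word;

// Atomically add val to *mem and return the value *mem held beforehand.
// Assumes the compiler supports the __atomic_fetch_add builtin here.
static inline toy_atomic_word
toy_exchange_and_add(toy_atomic_word* mem, int val)
{
  return __atomic_fetch_add(mem, val, __ATOMIC_ACQ_REL);
}

// Atomically add val to *mem, discarding the previous value.
static inline void
toy_atomic_add(toy_atomic_word* mem, int val)
{
  __atomic_fetch_add(mem, val, __ATOMIC_ACQ_REL);
}
```
]]></programlisting>

   <para>If no such builtin is available, the same two operations have to be
written with inline assembly or with the operating system's primitives, as
described above.
   </para>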

</sect2>


<sect2 id="internals.numeric_limits">
<title>Numeric Limits</title>

<para>The C++ library requires information about the fundamental data types,
such as the minimum and maximum representable values of each type.
You can define each of these values individually, but it is usually
easiest just to indicate how many bits are used in each of the data
types and let the library do the rest.  For information about the
macros to define, see the top of <code>include/bits/std_limits.h</code>.
</para>
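
   <para>To see why a bit count is enough, the sketch below derives the extreme
values of a signed integer type from its width.  It is not the library's
implementation: the function names are hypothetical, and it assumes a two's
complement representation with no padding bits and a width of at least two
bits.
   </para>

<programlisting><![CDATA[
```cpp
#include <climits>
#include <limits>

// Largest value of a signed type T that is `bits` bits wide,
// i.e. 2^(bits-1) - 1, computed without overflowing T.
template <typename T>
T max_from_bits(int bits)
{
  // 2^(bits-2) fits in T; (2^(bits-2) - 1) * 2 + 1 == 2^(bits-1) - 1.
  T half = static_cast<T>(static_cast<T>(1) << (bits - 2));
  return static_cast<T>((half - 1) * 2 + 1);
}

// Smallest value of the same type under two's complement: -max - 1.
template <typename T>
T min_from_bits(int bits)
{
  return static_cast<T>(-max_from_bits<T>(bits) - 1);
}
```
]]></programlisting>

   <para>On a typical target,
<code>max_from_bits&lt;int&gt;(sizeof(int) * CHAR_BIT)</code> agrees with
<code>std::numeric_limits&lt;int&gt;::max()</code>.
   </para>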
498
499   <para>If you need to define any macros, you can do so in <code>os_defines.h</code>.
500However, if all operating systems for your CPU are likely to use the
501same values, you can provide a CPU-specific file instead so that you
502do not have to provide the same definitions for each operating system.
503To take that approach, create a new file called <code>cpu_limits.h</code> in
504your CPU configuration directory (see <link linkend="internals.cpu">CPU</link>).
505   </para>
506
507</sect2>
508

<sect2 id="internals.libtool">
<title>Libtool</title>

<para>The C++ library is compiled, archived and linked with libtool.
Explaining the full workings of libtool is beyond the scope of this
document, but there are a few particular bits that are necessary for
porting.
</para>

   <para>Some parts of the libstdc++ library are compiled with the libtool
<code>--tags CXX</code> option (the C++ definitions for libtool).  Therefore,
<code>ltcf-cxx.sh</code> in the top-level directory needs to have the correct
logic to compile and archive objects, equivalent to that in the C version
of libtool, <code>ltcf-c.sh</code>.  Some libtool targets have definitions for
C but not for C++, or C++ definitions which have not been kept up to date.
   </para>

   <para>The C++ run-time library contains initialization code that needs to be
run as the library is loaded.  Often, that requires linking in special
object files when the C++ library is built as a shared library, or
taking other system-specific actions.
   </para>

   <para>The libstdc++ library is linked with the C version of libtool, even
though it is a C++ library.  Therefore, the C version of libtool needs to
ensure that the run-time library initializers are run.  The usual way to
do this is to build the library using <code>gcc -shared</code>.
   </para>

   <para>If you need to change how the library is linked, look at
<code>ltcf-c.sh</code> in the top-level directory.  Find the switch statement
that sets <code>archive_cmds</code>.  Here, adjust the setting for your
operating system.
   </para>


</sect2>

</sect1>