<section xmlns="http://docbook.org/ns/docbook" version="5.0"
	 xml:id="appendix.porting.internals" xreflabel="Porting Internals">
<?dbhtml filename="internals.html"?>

<info><title>Porting to New Hardware or Operating Systems</title>
  <keywordset>
    <keyword>ISO C++</keyword>
    <keyword>internals</keyword>
  </keywordset>
</info>

<para>This document explains how to port libstdc++ (the GNU C++ library) to
a new target.
</para>

   <para>In order to make the GNU C++ library (libstdc++) work with a new
target, you must edit some configuration files and provide some new
header files.  Unless this is done, libstdc++ will use generic
settings which may not be correct for your target; even if they are
correct, they will likely be inefficient.
   </para>

   <para>Before you get started, make sure that you have a working C library on
your target.  The C library need not precisely comply with any
particular standard, but should generally conform to the requirements
imposed by the ANSI/ISO standard.
   </para>

   <para>In addition, you should try to verify that the C++ compiler generally
works.  It is difficult to test the C++ compiler without a working
library, but you should at least try some minimal test cases.
   </para>

   <para>(Note that what we think of as a "target," the library refers to as
a "host."  The comment at the top of <code>configure.ac</code> explains why.)
   </para>

<section xml:id="internals.os"><info><title>Operating System</title></info>


<para>If you are porting to a new operating system (as opposed to a new chip
using an existing operating system), you will need to create a new
directory in the <code>config/os</code> hierarchy.  For example, the IRIX
configuration files are all in <code>config/os/irix</code>.  There is no set
way to organize the OS configuration directory.  For example,
<code>config/os/solaris/solaris-2.6</code> and
<code>config/os/solaris/solaris-2.7</code> are used as configuration
directories for these two versions of Solaris.  On the other hand, both
Solaris 2.7 and Solaris 2.8 use the <code>config/os/solaris/solaris-2.7</code>
directory.  The important information is that there needs to be a
directory under <code>config/os</code> to store the files for your operating
system.
</para>

   <para>You might have to change the <code>configure.host</code> file to ensure that
your new directory is activated.  Look for the switch statement that sets
<code>os_include_dir</code>, and add a pattern to handle your operating system
if the default will not suffice.  The switch statement switches on only
the OS portion of the standard target triplet; e.g., the <code>solaris2.8</code>
in <code>sparc-sun-solaris2.8</code>.  If the new directory is named after the
OS portion of the triplet (the default), then nothing needs to be changed.
   </para>

   <para>The first file to create in this directory should be called
<code>os_defines.h</code>.  This file contains basic macro definitions
that are required to allow the C++ library to work with your C library.
   </para>

   <para>Several libstdc++ source files unconditionally define the macro
<code>_POSIX_SOURCE</code>.  On many systems, defining this macro causes
large portions of the C library header files to be eliminated
at preprocessing time.  Therefore, you may have to <code>#undef</code> this
macro, or define other macros (like <code>_LARGEFILE_SOURCE</code> or
<code>__EXTENSIONS__</code>).  You won't know which macros to define or
undefine at this point; you'll have to try compiling the library and
see what goes wrong.  If you see errors about calling functions
that have not been declared, look in your C library headers to see if
the functions are declared there, and then figure out which macros you
need to define.  You will need to add them to the
<code>CPLUSPLUS_CPP_SPEC</code> macro in the GCC configuration file for your
target.  It will not work to simply define these macros in
<code>os_defines.h</code>.
   </para>

   <para>At this time, there are a few libstdc++-specific macros which may be
defined:
   </para>

   <para><code>_GLIBCXX_USE_C99_CHECK</code> may be defined to 1 to check C99
function declarations (which are not covered by specialization below)
found in system headers against versions found in the library headers
derived from the standard.
   </para>

   <para><code>_GLIBCXX_USE_C99_DYNAMIC</code> may be defined to an expression that
yields 0 if and only if the system headers are exposing proper support
for C99 functions (which are not covered by specialization below).  If
defined, it must be 0 while bootstrapping the compiler/rebuilding the
library.
   </para>

   <para><code>_GLIBCXX_USE_C99_LONG_LONG_CHECK</code> may be defined to 1 to check
the set of C99 long long function declarations found in system headers
against versions found in the library headers derived from the
standard.
   </para>

   <para><code>_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC</code> may be defined to an
expression that yields 0 if and only if the system headers are
exposing proper support for the set of C99 long long functions.  If
defined, it must be 0 while bootstrapping the compiler/rebuilding the
library.
   </para>

   <para><code>_GLIBCXX_USE_C99_FP_MACROS_DYNAMIC</code> may be defined to an
expression that yields 0 if and only if the system headers
are exposing proper support for the related set of macros.  If defined,
it must be 0 while bootstrapping the compiler/rebuilding the library.
   </para>

   <para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_CHECK</code> may be defined
to 1 to check the related set of function declarations found in system
headers against versions found in the library headers derived from
the standard.
   </para>

   <para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_DYNAMIC</code> may be defined
to an expression that yields 0 if and only if the system headers
are exposing proper support for the related set of functions.  If defined,
it must be 0 while bootstrapping the compiler/rebuilding the library.
   </para>

   <para><code>_GLIBCXX_NO_OBSOLETE_ISINF_ISNAN_DYNAMIC</code> may be defined
to an expression that yields 0 if and only if the system headers
are exposing non-standard <code>isinf(double)</code> and
<code>isnan(double)</code> functions in the global namespace.  Those functions
should be detected automatically by the <code>configure</code> script when
libstdc++ is built, but if their presence depends on compilation flags or
other macros, the static configuration can be overridden.
   </para>

   <para>Finally, you should bracket the entire file in an include-guard, like
this:
   </para>

<programlisting>
#ifndef _GLIBCXX_OS_DEFINES
#define _GLIBCXX_OS_DEFINES
...
#endif
</programlisting>

   <para>We recommend copying an existing <code>os_defines.h</code> to use as a
starting point.
   </para>
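
   <para>As an illustration only, a minimal <code>os_defines.h</code> for a
hypothetical operating system might combine the include guard with a couple of
the macros described above.  The directory name <code>myos</code> and the macro
<code>MYOS_C99_VISIBLE</code> are invented for this sketch; substitute whatever
feature-test macro your own C library headers actually provide.
   </para>

<programlisting><![CDATA[
// Sketch of config/os/myos/os_defines.h ("myos" is a hypothetical OS).
#ifndef _GLIBCXX_OS_DEFINES
#define _GLIBCXX_OS_DEFINES 1

// Ask the library to check C99 declarations found in the system headers.
#define _GLIBCXX_USE_C99_CHECK 1

// Must yield 0 if and only if the system headers expose C99 functions.
// MYOS_C99_VISIBLE is an assumed, made-up feature-test macro.
#define _GLIBCXX_USE_C99_DYNAMIC (!(MYOS_C99_VISIBLE >= 1))

#endif
]]></programlisting>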
</section>


<section xml:id="internals.cpu"><info><title>CPU</title></info>


<para>If you are porting to a new chip (as opposed to a new operating system
running on an existing chip), you will need to create a new directory in the
<code>config/cpu</code> hierarchy.  Much like the <link linkend="internals.os">Operating system</link> setup,
there are no strict rules on how to organize the CPU configuration
directory, but careful naming choices will allow the configury to find your
setup files without explicit help.
</para>

   <para>We recommend that for a target triplet <code>&lt;CPU&gt;-&lt;vendor&gt;-&lt;OS&gt;</code>, you
name your configuration directory <code>config/cpu/&lt;CPU&gt;</code>.  If you do this,
the configury will find the directory by itself.  Otherwise you will need to
edit the <code>configure.host</code> file and, in the switch statement that sets
<code>cpu_include_dir</code>, add a pattern to handle your chip.
   </para>

   <para>Note that some chip families share a single configuration directory.
For example, <code>alpha</code>, <code>alphaev5</code>, and <code>alphaev6</code> all use the
<code>config/cpu/alpha</code> directory, and there is an entry in the
<code>configure.host</code> switch statement to handle this.
   </para>

   <para>The <code>cpu_include_dir</code> setting also determines the default locations
of the files that control <link linkend="internals.thread_safety">Thread safety</link> and
<link linkend="internals.numeric_limits">Numeric limits</link>; provide those files there if the
generic defaults are not appropriate for your chip.
   </para>

</section>

<section xml:id="internals.char_types"><info><title>Character Types</title></info>


<para>The library requires that you provide three header files to implement
character classification, analogous to that provided by the C library's
<code>&lt;ctype.h&gt;</code> header.  You can model these on the files provided in
<code>config/os/generic</code>.  However, these files will almost
certainly need some modification.
</para>

   <para>The first file to write is <code>ctype_base.h</code>.  This file provides
some very basic information about character classification.  The libstdc++
library assumes that your C library implements <code>&lt;ctype.h&gt;</code> by using
a table (indexed by character code) containing integers, where each of
these integers is a bit-mask indicating whether the character is
upper-case, lower-case, alphabetic, etc.  The <code>ctype_base.h</code>
file gives the type of the integer, and the values of the various bit
masks.  You will have to peer at your own <code>&lt;ctype.h&gt;</code> to figure out
how to define the values required by this file.
   </para>

   <para>The <code>ctype_base.h</code> header file does not need include guards.
It should contain a single <code>struct</code> definition called
<code>ctype_base</code>.  This <code>struct</code> should contain two type
declarations and one enumeration declaration, like this example, taken
from the IRIX configuration:
   </para>

<programlisting>
  struct ctype_base
     {
       typedef unsigned int 	mask;
       typedef int* 		__to_type;

       enum
       {
         space = _ISspace,
         print = _ISprint,
         cntrl = _IScntrl,
         upper = _ISupper,
         lower = _ISlower,
         alpha = _ISalpha,
         digit = _ISdigit,
         punct = _ISpunct,
         xdigit = _ISxdigit,
         alnum = _ISalnum,
         graph = _ISgraph
       };
     };
</programlisting>

<para>The <code>mask</code> type is the type of the elements in the table.  If your
C library uses a table to map lower-case characters to upper-case characters,
and vice versa, you should define <code>__to_type</code> to be the type of the
elements in that table.  If you don't mind taking a minor performance
penalty, or if your library doesn't implement <code>toupper</code> and
<code>tolower</code> in this way, you can pick any pointer-to-integer type,
but you must still define the type.
</para>

   <para>The enumeration should give definitions for all the values in the above
example, using the values from your native <code>&lt;ctype.h&gt;</code>.  They can
be given symbolically (as above), or numerically, if you prefer.  You do
not have to include <code>&lt;ctype.h&gt;</code> in this header; it will always be
included before <code>ctype_base.h</code> is included.
   </para>
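
   <para>To make the table-and-mask model concrete, here is a small, self-contained
sketch.  It is not libstdc++ code: <code>sketch_ctype_base</code>,
<code>sketch_table</code> and <code>sketch_is</code> are hypothetical names, the bit
values are arbitrary, and the table is filled from the standard
<code>&lt;cctype&gt;</code> functions rather than taken from a C library's internals.
   </para>

<programlisting><![CDATA[
#include <cassert>
#include <cctype>

// Hypothetical stand-in for ctype_base: a mask type plus one bit per class.
struct sketch_ctype_base
{
  typedef unsigned int mask;

  enum
  {
    upper = 1 << 0,
    lower = 1 << 1,
    alpha = 1 << 2,
    digit = 1 << 3,
    space = 1 << 4
  };
};

// Build (once) a 256-entry table indexed by character code.  Each entry
// is the OR of the class bits that apply to that character.
inline const sketch_ctype_base::mask*
sketch_table()
{
  static sketch_ctype_base::mask table[256];
  static bool initialized = false;
  if (!initialized)
    {
      for (int c = 0; c < 256; ++c)
        {
          sketch_ctype_base::mask m = 0;
          if (std::isupper(c)) m |= sketch_ctype_base::upper;
          if (std::islower(c)) m |= sketch_ctype_base::lower;
          if (std::isalpha(c)) m |= sketch_ctype_base::alpha;
          if (std::isdigit(c)) m |= sketch_ctype_base::digit;
          if (std::isspace(c)) m |= sketch_ctype_base::space;
          table[c] = m;
        }
      initialized = true;
    }
  return table;
}

// Classify one character by masking its table entry.  The cast to
// unsigned char keeps negative char values from indexing out of bounds.
inline bool
sketch_is(sketch_ctype_base::mask m, char c)
{ return (sketch_table()[static_cast<unsigned char>(c)] & m) != 0; }
]]></programlisting>

   <para>With this layout, a test such as
<code>sketch_is(sketch_ctype_base::upper, 'A')</code> costs one table lookup and
one bitwise AND, which is why the library asks for the mask values instead of
calling the C classification functions each time.
   </para>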

   <para>The next file to write is <code>ctype_configure_char.cc</code>.
The first function that must be written is the <code>ctype&lt;char&gt;::ctype</code> constructor.  Here is the IRIX example:
   </para>

<programlisting>
ctype&lt;char&gt;::ctype(const mask* __table /* = 0 */, bool __del /* = false */,
	   size_t __refs /* = 0 */)
       : _Ctype_nois&lt;char&gt;(__refs), _M_del(__table != 0 &amp;&amp; __del),
	 _M_toupper(NULL),
	 _M_tolower(NULL),
	 _M_ctable(NULL),
	 _M_table(!__table
		  ? (const mask*) (__libc_attr._ctype_tbl-&gt;_class + 1)
		  : __table)
       { }
</programlisting>

<para>There are two parts of this that you might choose to alter.  The first,
and most important, is the line involving <code>__libc_attr</code>.  That is
IRIX system-dependent code that gets the base of the table mapping
character codes to attributes.  You need to substitute code that obtains
the address of this table on your system.  The second is optional: if you
want to use your operating system's tables to map upper-case letters to
lower-case, and vice versa, you should initialize <code>_M_toupper</code> and
<code>_M_tolower</code> with those tables, in similar fashion.
</para>

   <para>Next, you have to write two functions that convert a single character
from lower-case to upper-case, and vice versa.  Here are the IRIX versions:
   </para>

<programlisting>
     char
     ctype&lt;char&gt;::do_toupper(char __c) const
     { return _toupper(__c); }

     char
     ctype&lt;char&gt;::do_tolower(char __c) const
     { return _tolower(__c); }
</programlisting>

<para>Your C library most likely provides equivalents to IRIX's <code>_toupper</code> and
<code>_tolower</code>.  If you initialized <code>_M_toupper</code> and
<code>_M_tolower</code> above, then you could use those tables instead.
</para>
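
   <para>The table-based alternative can be sketched as follows.  This is an
assumption-laden illustration, not the library's implementation: the names
<code>sketch_toupper_table</code> and <code>sketch_do_toupper</code> are hypothetical,
and the table is built from <code>std::toupper</code> here, whereas a real port
would point <code>_M_toupper</code> at a table supplied by the operating system.
   </para>

<programlisting><![CDATA[
#include <cassert>
#include <cctype>

// Hypothetical 256-entry conversion table with int elements, matching
// the "int*" shape of __to_type in the ctype_base example.
inline const int*
sketch_toupper_table()
{
  static int table[256];
  static bool initialized = false;
  if (!initialized)
    {
      for (int c = 0; c < 256; ++c)
        table[c] = std::toupper(c);
      initialized = true;
    }
  return table;
}

// Convert one character with a single table lookup.  The cast to
// unsigned char keeps negative char values from indexing out of bounds.
inline char
sketch_do_toupper(char c)
{
  const int* table = sketch_toupper_table();
  return static_cast<char>(table[static_cast<unsigned char>(c)]);
}
]]></programlisting>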

   <para>Finally, you have to provide two utility functions that convert strings
of characters.  The versions provided here will always work, but you
could use specialized routines for greater performance if you have
machinery to do that on your system:
   </para>

<programlisting>
     const char*
     ctype&lt;char&gt;::do_toupper(char* __low, const char* __high) const
     {
       while (__low &lt; __high)
         {
           *__low = do_toupper(*__low);
           ++__low;
         }
       return __high;
     }

     const char*
     ctype&lt;char&gt;::do_tolower(char* __low, const char* __high) const
     {
       while (__low &lt; __high)
         {
           *__low = do_tolower(*__low);
           ++__low;
         }
       return __high;
     }
</programlisting>

   <para>You must also provide the <code>ctype_inline.h</code> file, which
contains a few more functions.  On most systems, you can just copy
<code>config/os/generic/ctype_inline.h</code> and use it on your system.
   </para>

   <para>In detail, the functions provided test characters for particular
properties; they are analogous to the functions like <code>isalpha</code> and
<code>islower</code> provided by the C library.
   </para>

   <para>The first function is implemented like this on IRIX:
   </para>

<programlisting>
     bool
     ctype&lt;char&gt;::
     is(mask __m, char __c) const throw()
     { return (_M_table)[(unsigned char)(__c)] &amp; __m; }
</programlisting>

<para>The <code>_M_table</code> is the table passed in above, in the constructor.
This is the table that contains the bitmasks for each character.  The
implementation here should work on all systems.
</para>

   <para>The next function is:
   </para>

<programlisting>
     const char*
     ctype&lt;char&gt;::
     is(const char* __low, const char* __high, mask* __vec) const throw()
     {
       while (__low &lt; __high)
         *__vec++ = (_M_table)[(unsigned char)(*__low++)];
       return __high;
     }
</programlisting>

<para>This function is similar; it copies the masks for all the characters
from <code>__low</code> up to, but not including, <code>__high</code> into the
vector given by <code>__vec</code>.
</para>

   <para>The last two functions again are entirely generic:
   </para>

<programlisting>
     const char*
     ctype&lt;char&gt;::
     scan_is(mask __m, const char* __low, const char* __high) const throw()
     {
       while (__low &lt; __high &amp;&amp; !this-&gt;is(__m, *__low))
         ++__low;
       return __low;
     }

     const char*
     ctype&lt;char&gt;::
     scan_not(mask __m, const char* __low, const char* __high) const throw()
     {
       while (__low &lt; __high &amp;&amp; this-&gt;is(__m, *__low))
         ++__low;
       return __low;
     }
</programlisting>
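
   <para>For orientation, this is how user code typically reaches these members
through the standard <code>std::ctype&lt;char&gt;</code> facet.  The helper names
<code>first_digit_offset</code> and <code>leading_space_length</code> are made up for
this example; only the facet interface itself comes from the C++ standard.
   </para>

<programlisting><![CDATA[
#include <cassert>
#include <cstddef>
#include <cstring>
#include <locale>

// Offset of the first digit in text, or the length of text if there is
// none.  scan_is stops at the first character that matches the mask.
inline std::size_t
first_digit_offset(const char* text)
{
  const std::ctype<char>& ct =
    std::use_facet<std::ctype<char> >(std::locale::classic());
  const char* end = text + std::strlen(text);
  return static_cast<std::size_t>(
    ct.scan_is(std::ctype_base::digit, text, end) - text);
}

// Length of the leading run of whitespace in text.  scan_not stops at
// the first character that does not match the mask.
inline std::size_t
leading_space_length(const char* text)
{
  const std::ctype<char>& ct =
    std::use_facet<std::ctype<char> >(std::locale::classic());
  const char* end = text + std::strlen(text);
  return static_cast<std::size_t>(
    ct.scan_not(std::ctype_base::space, text, end) - text);
}
]]></programlisting>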

</section>


<section xml:id="internals.thread_safety"><info><title>Thread Safety</title></info>


<para>The C++ library string functionality requires a couple of atomic
operations to provide thread-safety.  If you don't take any special
action, the library will use stub versions of these functions that are
not thread-safe.  They will work fine, unless your applications are
multi-threaded.
</para>

   <para>If you want to provide custom, thread-safe versions of these functions,
there are two distinct approaches.  One is to provide a version for your CPU,
using assembly language constructs.  The other is to use the
thread-safety primitives in your operating system.  In either case, you
make a file called <code>atomicity.h</code>, and the variable
<code>ATOMICITYH</code> must point to this file.
   </para>

   <para>If you are using the assembly-language approach, put this code in
<code>config/cpu/&lt;chip&gt;/atomicity.h</code>, where <code>&lt;chip&gt;</code> is the name of
your processor (see <link linkend="internals.cpu">CPU</link>).  No additional changes are necessary to
locate the file in this case; <code>ATOMICITYH</code> will be set by default.
   </para>

   <para>If you are using the operating system thread-safety primitives approach,
you can also put this code in the same CPU directory, in which case no more
work is needed to locate the file.  For examples of this approach,
see the <code>atomicity.h</code> file for IRIX or IA64.
   </para>

   <para>Alternatively, if the primitives are more closely related to the OS
than they are to the CPU, you can put the <code>atomicity.h</code> file in
the <link linkend="internals.os">Operating system</link> directory instead.  In this case, you must
edit <code>configure.host</code>, and in the switch statement that handles
operating systems, override the <code>ATOMICITYH</code> variable to point to
the appropriate <code>os_include_dir</code>.  For an example of this approach,
see the <code>atomicity.h</code> file for AIX.
   </para>

   <para>With those bits out of the way, you have to actually write
<code>atomicity.h</code> itself.  This file should be wrapped in an
include guard named <code>_GLIBCXX_ATOMICITY_H</code>.  It should define one
type and two functions.
   </para>

   <para>The type is <code>_Atomic_word</code>.  Here is the version used on IRIX:
   </para>

<programlisting>
typedef long _Atomic_word;
</programlisting>

<para>This type must be a signed integral type supporting atomic operations.
If you're using the OS approach, use the same type used by your system's
primitives.  Otherwise, use the type for which your CPU provides atomic
primitives.
</para>

   <para>Then, you must provide two functions.  The bodies of these functions
must be equivalent to those provided here, but using atomic operations:
   </para>

<programlisting>
     static inline _Atomic_word
     __attribute__ ((__unused__))
     __exchange_and_add (_Atomic_word* __mem, int __val)
     {
       _Atomic_word __result = *__mem;
       *__mem += __val;
       return __result;
     }

     static inline void
     __attribute__ ((__unused__))
     __atomic_add (_Atomic_word* __mem, int __val)
     {
       *__mem += __val;
     }
</programlisting>
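
   <para>One possible shape for atomic versions is sketched below.  It assumes a
compiler that provides the GCC <code>__atomic_fetch_add</code> built-in and a
target on which that built-in is supported for the chosen word type; neither
assumption comes from this document.  The names are deliberately different from
the real ones (<code>sketch_atomic_word</code>, <code>sketch_exchange_and_add</code>,
<code>sketch_atomic_add</code>) so that the example can be compiled on its own.
   </para>

<programlisting><![CDATA[
#include <cassert>

// Hypothetical stand-in for _Atomic_word: a signed integral type.
typedef int sketch_atomic_word;

// Atomically add val to *mem and return the value *mem held before the
// addition, mirroring the contract of __exchange_and_add.
static inline sketch_atomic_word
sketch_exchange_and_add(sketch_atomic_word* mem, int val)
{ return __atomic_fetch_add(mem, val, __ATOMIC_ACQ_REL); }

// Atomically add val to *mem, discarding the previous value, mirroring
// the contract of __atomic_add.
static inline void
sketch_atomic_add(sketch_atomic_word* mem, int val)
{ __atomic_fetch_add(mem, val, __ATOMIC_ACQ_REL); }
]]></programlisting>

   <para>If your compiler or CPU lacks such a built-in, the same two contracts
have to be met with inline assembly or with operating system primitives, as
described above.
   </para>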

</section>


<section xml:id="internals.numeric_limits"><info><title>Numeric Limits</title></info>


<para>The C++ library requires information about the fundamental data types,
such as the minimum and maximum representable values of each type.
You can define each of these values individually, but it is usually
easiest just to indicate how many bits are used in each of the data
types and let the library do the rest.  For information about the
macros to define, see the top of <code>include/bits/std_limits.h</code>.
</para>

   <para>If you need to define any macros, you can do so in <code>os_defines.h</code>.
However, if all operating systems for your CPU are likely to use the
same values, you can provide a CPU-specific file instead so that you
do not have to provide the same definitions for each operating system.
To take that approach, create a new file called <code>cpu_limits.h</code> in
your CPU configuration directory (see <link linkend="internals.cpu">CPU</link>).
   </para>
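
   <para>To show why a bit count is enough, the following sketch derives the
limits of a signed integer type from its width.  It assumes a two's-complement
representation and a width between 2 and 64 bits, and it is not the library's
own code; <code>sketch_signed_max</code> and <code>sketch_signed_min</code> are
hypothetical helper names.
   </para>

<programlisting><![CDATA[
#include <cassert>
#include <climits>
#include <limits>

// Largest value of a two's-complement signed integer that is `bits`
// wide, i.e. 2^(bits-1) - 1.  Written as (2^(bits-2) - 1) * 2 + 1 so
// that no intermediate result overflows, even when bits is 64.
inline long long
sketch_signed_max(int bits)
{ return ((1LL << (bits - 2)) - 1) * 2 + 1; }

// Smallest value of the same type, i.e. -2^(bits-1).
inline long long
sketch_signed_min(int bits)
{ return -sketch_signed_max(bits) - 1; }
]]></programlisting>

   <para>On a two's-complement host, the results agree with what
<code>std::numeric_limits</code> reports for a type of the same width.
   </para>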

</section>


<section xml:id="internals.libtool"><info><title>Libtool</title></info>


<para>The C++ library is compiled, archived and linked with libtool.
Explaining the full workings of libtool is beyond the scope of this
document, but there are a few particular bits that are necessary for
porting.
</para>

   <para>Some parts of the libstdc++ library are compiled with the libtool
<code>--tags CXX</code> option (the C++ definitions for libtool).  Therefore,
<code>ltcf-cxx.sh</code> in the top-level directory needs to have the correct
logic to compile and archive objects equivalent to the C version of libtool,
<code>ltcf-c.sh</code>.  Some libtool targets have definitions for C but not
for C++, or C++ definitions which have not been kept up to date.
   </para>

   <para>The C++ run-time library contains initialization code that needs to be
run as the library is loaded.  Often, that requires linking in special
object files when the C++ library is built as a shared library, or
taking other system-specific actions.
   </para>

   <para>The libstdc++ library is linked with the C version of libtool, even
though it is a C++ library.  Therefore, the C version of libtool needs to
ensure that the run-time library initializers are run.  The usual way to
do this is to build the library using <code>gcc -shared</code>.
   </para>

   <para>If you need to change how the library is linked, look at
<code>ltcf-c.sh</code> in the top-level directory.  Find the switch statement
that sets <code>archive_cmds</code>.  Here, adjust the setting for your
operating system.
   </para>


</section>

</section>