internals.xml revision 1.1.1.1.2.1
<section xmlns="http://docbook.org/ns/docbook" version="5.0"
         xml:id="appendix.porting.internals" xreflabel="Porting Internals">
<?dbhtml filename="internals.html"?>

<info><title>Porting to New Hardware or Operating Systems</title>
  <keywordset>
    <keyword>ISO C++</keyword>
    <keyword>internals</keyword>
  </keywordset>
</info>


<para>This document explains how to port libstdc++ (the GNU C++ library) to
a new target.
</para>

   <para>To make the GNU C++ library (libstdc++) work with a new
target, you must edit some configuration files and provide some new
header files.  Unless this is done, libstdc++ will use generic
settings which may not be correct for your target; even if they are
correct, they will likely be inefficient.
   </para>

   <para>Before you get started, make sure that you have a working C library on
your target.  The C library need not precisely comply with any
particular standard, but should generally conform to the requirements
imposed by the ANSI/ISO C standard.
   </para>

   <para>In addition, you should try to verify that the C++ compiler generally
works.  It is difficult to test the C++ compiler without a working
library, but you should at least try some minimal test cases.
   </para>

   <para>(Note that what we think of as a "target," the library refers to as
a "host."  The comment at the top of <code>configure.ac</code> explains why.)
   </para>


<section xml:id="internals.os"><info><title>Operating System</title></info>


<para>If you are porting to a new operating system (as opposed to a new chip
using an existing operating system), you will need to create a new
directory in the <code>config/os</code> hierarchy.  For example, the IRIX
configuration files are all in <code>config/os/irix</code>.  There is no set
way to organize the OS configuration directory.
For example,
<code>config/os/solaris/solaris-2.6</code> and
<code>config/os/solaris/solaris-2.7</code> are used as configuration
directories for these two versions of Solaris.  On the other hand, both
Solaris 2.7 and Solaris 2.8 use the <code>config/os/solaris/solaris-2.7</code>
directory.  What matters is that there is a
directory under <code>config/os</code> to store the files for your operating
system.
</para>

   <para>You might have to change the <code>configure.host</code> file to ensure that
your new directory is activated.  Look for the switch statement that sets
<code>os_include_dir</code>, and add a pattern to handle your operating system
if the default will not suffice.  The switch statement switches on only
the OS portion of the standard target triplet; e.g., the <code>solaris2.8</code>
in <code>sparc-sun-solaris2.8</code>.  If the new directory is named after the
OS portion of the triplet (the default), then nothing needs to be changed.
   </para>

   <para>The first file to create in this directory should be called
<code>os_defines.h</code>.  This file contains basic macro definitions
that are required to allow the C++ library to work with your C library.
   </para>

   <para>Several libstdc++ source files unconditionally define the macro
<code>_POSIX_SOURCE</code>.  On many systems, defining this macro causes
large portions of the C library header files to be eliminated
at preprocessing time.  Therefore, you may have to <code>#undef</code> this
macro, or define other macros (like <code>_LARGEFILE_SOURCE</code> or
<code>__EXTENSIONS__</code>).  You won't know what macros to define or
undefine at this point; you'll have to try compiling the library and
see what goes wrong.  If you see errors about calling functions
that have not been declared, look in your C library headers to see if
the functions are declared there, and then figure out what macros you
need to define.
You will need to add them to the
<code>CPLUSPLUS_CPP_SPEC</code> macro in the GCC configuration file for your
target.  Simply defining these macros in
<code>os_defines.h</code> will not work.
   </para>

   <para>At this time, there are a few libstdc++-specific macros which may be
defined:
   </para>

   <para><code>_GLIBCXX_USE_C99_CHECK</code> may be defined to 1 to check C99
function declarations (which are not covered by specialization below)
found in system headers against versions found in the library headers
derived from the standard.
   </para>

   <para><code>_GLIBCXX_USE_C99_DYNAMIC</code> may be defined to an expression that
yields 0 if and only if the system headers are exposing proper support
for C99 functions (which are not covered by specialization below).  If
defined, it must be 0 while bootstrapping the compiler/rebuilding the
library.
   </para>

   <para><code>_GLIBCXX_USE_C99_LONG_LONG_CHECK</code> may be defined to 1 to check
the set of C99 long long function declarations found in system headers
against versions found in the library headers derived from the
standard.

   </para>
   <para><code>_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC</code> may be defined to an
expression that yields 0 if and only if the system headers are
exposing proper support for the set of C99 long long functions.  If
defined, it must be 0 while bootstrapping the compiler/rebuilding the
library.
   </para>
   <para><code>_GLIBCXX_USE_C99_FP_MACROS_DYNAMIC</code> may be defined to an
expression that yields 0 if and only if the system headers
are exposing proper support for the related set of macros.  If defined,
it must be 0 while bootstrapping the compiler/rebuilding the library.
   </para>
   <para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_CHECK</code> may be defined
to 1 to check the related set of function declarations found in system
headers against versions found in the library headers derived from
the standard.
   </para>
   <para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_DYNAMIC</code> may be defined
to an expression that yields 0 if and only if the system headers
are exposing proper support for the related set of functions.  If defined,
it must be 0 while bootstrapping the compiler/rebuilding the library.
   </para>
   <para>Finally, you should bracket the entire file in an include-guard, like
this:
   </para>

<programlisting>

#ifndef _GLIBCXX_OS_DEFINES
#define _GLIBCXX_OS_DEFINES
...
#endif
</programlisting>

   <para>We recommend copying an existing <code>os_defines.h</code> to use as a
starting point.
   </para>
</section>


<section xml:id="internals.cpu"><info><title>CPU</title></info>


<para>If you are porting to a new chip (as opposed to a new operating system
running on an existing chip), you will need to create a new directory in the
<code>config/cpu</code> hierarchy.  Much like the <link linkend="internals.os">Operating system</link> setup,
there are no strict rules on how to organize the CPU configuration
directory, but careful naming choices will allow the configury to find your
setup files without explicit help.
</para>

   <para>We recommend that for a target triplet <code><CPU>-<vendor>-<OS></code>, you
name your configuration directory <code>config/cpu/<CPU></code>.  If you do this,
the configury will find the directory by itself.  Otherwise you will need to
edit the <code>configure.host</code> file and, in the switch statement that sets
<code>cpu_include_dir</code>, add a pattern to handle your chip.
   </para>

   <para>Note that some chip families share a single configuration directory; for
example, <code>alpha</code>, <code>alphaev5</code>, and <code>alphaev6</code> all use the
<code>config/cpu/alpha</code> directory, and there is an entry in the
<code>configure.host</code> switch statement to handle this.
   </para>

   <para>The <code>cpu_include_dir</code> setting also gives the default locations for the files controlling
<link linkend="internals.thread_safety">Thread safety</link> and <link linkend="internals.numeric_limits">Numeric limits</link>, which you can supply there if the generic defaults are not
appropriate for your chip.
   </para>

</section>


<section xml:id="internals.char_types"><info><title>Character Types</title></info>


<para>The library requires that you provide three files to implement
character classification, analogous to that provided by the C library's
<code><ctype.h></code> header.  You can model these on the files provided in
<code>config/os/generic</code>.  However, these files will almost
certainly need some modification.
</para>

   <para>The first file to write is <code>ctype_base.h</code>.  This file provides
some very basic information about character classification.  The libstdc++
library assumes that your C library implements <code><ctype.h></code> by using
a table (indexed by character code) containing integers, where each of
these integers is a bit-mask indicating whether the character is
upper-case, lower-case, alphabetic, etc.  The <code>ctype_base.h</code>
file gives the type of the integer, and the values of the various bit
masks.  You will have to peer at your own <code><ctype.h></code> to figure out
how to define the values required by this file.
   </para>

   <para>The <code>ctype_base.h</code> header file does not need include guards.
It should contain a single <code>struct</code> definition called
<code>ctype_base</code>.
This <code>struct</code> should contain two type
declarations and one enumeration declaration, as in this example taken
from the IRIX configuration:
   </para>

<programlisting>
  struct ctype_base
  {
    typedef unsigned int        mask;
    typedef int*                __to_type;

    enum
    {
      space = _ISspace,
      print = _ISprint,
      cntrl = _IScntrl,
      upper = _ISupper,
      lower = _ISlower,
      alpha = _ISalpha,
      digit = _ISdigit,
      punct = _ISpunct,
      xdigit = _ISxdigit,
      alnum = _ISalnum,
      graph = _ISgraph
    };
  };
</programlisting>

<para>The <code>mask</code> type is the type of the elements in the table.  If your
C library uses a table to map lower-case characters to upper-case characters,
and vice versa, you should define <code>__to_type</code> to be the type of the
elements in that table.  If you don't mind taking a minor performance
penalty, or if your library doesn't implement <code>toupper</code> and
<code>tolower</code> in this way, you can pick any pointer-to-integer type,
but you must still define the type.
</para>

   <para>The enumeration should give definitions for all the values in the above
example, using the values from your native <code><ctype.h></code>.  They can
be given symbolically (as above), or numerically, if you prefer.  You do
not have to include <code><ctype.h></code> in this header; it will always be
included before <code>ctype_base.h</code> is included.
   </para>

   <para>The next file to write is <code>ctype_configure_char.cc</code>.
The first function that must be written is the <code>ctype<char>::ctype</code> constructor.  Here is the IRIX example:
   </para>

<programlisting>
ctype<char>::ctype(const mask* __table = 0, bool __del = false,
           size_t __refs = 0)
  : _Ctype_nois<char>(__refs), _M_del(__table != 0 && __del),
    _M_toupper(NULL),
    _M_tolower(NULL),
    _M_ctable(NULL),
    _M_table(!__table
             ?
               (const mask*) (__libc_attr._ctype_tbl->_class + 1)
             : __table)
  { }
</programlisting>

<para>There are two parts of this that you might choose to alter.  The first,
and most important, is the line involving <code>__libc_attr</code>.  That is
IRIX system-dependent code that gets the base of the table mapping
character codes to attributes.  You need to substitute code that obtains
the address of this table on your system.  Second, if you want to use your
operating system's tables to map upper-case letters to lower-case, and
vice versa, you should initialize <code>_M_toupper</code> and
<code>_M_tolower</code> with those tables, in similar fashion.
</para>

   <para>Now, you have to write two functions to convert from upper-case to
lower-case, and vice versa.  Here are the IRIX versions:
   </para>

<programlisting>
  char
  ctype<char>::do_toupper(char __c) const
  { return _toupper(__c); }

  char
  ctype<char>::do_tolower(char __c) const
  { return _tolower(__c); }
</programlisting>

<para>Your C library should provide equivalents to IRIX's <code>_toupper</code> and
<code>_tolower</code>.  If you initialized <code>_M_toupper</code> and
<code>_M_tolower</code> above, then you could use those tables instead.
</para>

   <para>Finally, you have to provide two utility functions that convert strings
of characters.
The versions provided here will always work, but you
could use specialized routines for greater performance if you have
machinery to do that on your system:
   </para>

<programlisting>
  const char*
  ctype<char>::do_toupper(char* __low, const char* __high) const
  {
    while (__low < __high)
      {
        *__low = do_toupper(*__low);
        ++__low;
      }
    return __high;
  }

  const char*
  ctype<char>::do_tolower(char* __low, const char* __high) const
  {
    while (__low < __high)
      {
        *__low = do_tolower(*__low);
        ++__low;
      }
    return __high;
  }
</programlisting>

   <para>You must also provide the <code>ctype_inline.h</code> file, which
contains a few more functions.  On most systems, you can just copy
<code>config/os/generic/ctype_inline.h</code> and use it on your system.
   </para>

   <para>In detail, these functions test characters for particular
properties; they are analogous to functions such as <code>isalpha</code> and
<code>islower</code> provided by the C library.
   </para>

   <para>The first function is implemented like this on IRIX:
   </para>

<programlisting>
  bool
  ctype<char>::
  is(mask __m, char __c) const throw()
  { return (_M_table)[(unsigned char)(__c)] & __m; }
</programlisting>

<para>The <code>_M_table</code> is the table passed in above, in the constructor.
This is the table that contains the bitmasks for each character.  The
implementation here should work on all systems.
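As a rough, self-contained illustration of the same table lookup outside the
library, the sketch below builds a small classification table and tests bits
in it.  The names <code>mask_t</code>, <code>kUpper</code>, <code>kLower</code>,
<code>kDigit</code>, <code>table</code> and <code>build_table</code> are
hypothetical and are not part of libstdc++, and the table covers only ASCII
letters and digits:

```cpp
// Hypothetical stand-in for the table-driven lookup described above.
// None of these names (mask_t, kUpper, kLower, kDigit, table,
// build_table, is) are part of libstdc++.
typedef unsigned int mask_t;

enum { kUpper = 1 << 0, kLower = 1 << 1, kDigit = 1 << 2 };

// One mask per possible unsigned char value, zero-initialized.
static mask_t table[256];

// Fill in the bits for ASCII letters and digits only.
static void build_table()
{
  for (int c = 'A'; c <= 'Z'; ++c) table[c] |= kUpper;
  for (int c = 'a'; c <= 'z'; ++c) table[c] |= kLower;
  for (int c = '0'; c <= '9'; ++c) table[c] |= kDigit;
}

// Same shape as the IRIX is(): index by unsigned char, then test bits.
static bool is(mask_t m, char c)
{ return (table[(unsigned char)(c)] & m) != 0; }
```

Once <code>build_table()</code> has run, <code>is(kUpper, 'Q')</code> holds and
<code>is(kUpper, 'q')</code> does not.  The cast to <code>unsigned char</code>
keeps characters with the high bit set from producing a negative index.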
</para>

   <para>The next function is:
   </para>

<programlisting>
  const char*
  ctype<char>::
  is(const char* __low, const char* __high, mask* __vec) const throw()
  {
    while (__low < __high)
      *__vec++ = (_M_table)[(unsigned char)(*__low++)];
    return __high;
  }
</programlisting>

<para>This function is similar; it copies the masks for all the characters
from <code>__low</code> up to, but not including, <code>__high</code> into the vector given by
<code>__vec</code>.
</para>

   <para>The last two functions again are entirely generic:
   </para>

<programlisting>
  const char*
  ctype<char>::
  scan_is(mask __m, const char* __low, const char* __high) const throw()
  {
    while (__low < __high && !this->is(__m, *__low))
      ++__low;
    return __low;
  }

  const char*
  ctype<char>::
  scan_not(mask __m, const char* __low, const char* __high) const throw()
  {
    while (__low < __high && this->is(__m, *__low))
      ++__low;
    return __low;
  }
</programlisting>

</section>


<section xml:id="internals.thread_safety"><info><title>Thread Safety</title></info>


<para>The C++ library string functionality requires a couple of atomic
operations to provide thread-safety.  If you don't take any special
action, the library will use stub versions of these functions that are
not thread-safe.  They will work fine, unless your applications are
multi-threaded.
</para>

   <para>If you want to provide custom, safe versions of these functions, there
are two distinct approaches.  One is to provide a version for your CPU,
using assembly language constructs.  The other is to use the
thread-safety primitives in your operating system.  In either case, you
make a file called <code>atomicity.h</code>, and the variable
<code>ATOMICITYH</code> must point to this file.
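As a rough idea of what the operations in such a file might look like, here
is a sketch that takes a third route not described above: it assumes a
compiler that provides the GCC-style <code>__atomic_fetch_add</code> builtin
instead of hand-written assembly or OS primitives.  The names
<code>atomic_word_sketch</code>, <code>exchange_and_add_sketch</code> and
<code>atomic_add_sketch</code> are hypothetical:

```cpp
// Hypothetical sketch only; assumes the GCC/Clang __atomic builtins.
// The names below are illustrative and are not the library's own.
typedef int atomic_word_sketch;

// Atomically add val to *mem and return the value *mem held before.
static inline atomic_word_sketch
exchange_and_add_sketch(volatile atomic_word_sketch* mem, int val)
{ return __atomic_fetch_add(mem, val, __ATOMIC_ACQ_REL); }

// Atomically add val to *mem, discarding the previous value.
static inline void
atomic_add_sketch(volatile atomic_word_sketch* mem, int val)
{ __atomic_fetch_add(mem, val, __ATOMIC_ACQ_REL); }
```

On a target whose compiler lacks these builtins, the bodies would have to be
written with the CPU's own atomic instructions or the operating system's
primitives, as described above.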
   </para>

   <para>If you are using the assembly-language approach, put this code in
<code>config/cpu/<chip>/atomicity.h</code>, where <code><chip></code> is the name of
your processor (see <link linkend="internals.cpu">CPU</link>).  No additional changes are necessary to
locate the file in this case; <code>ATOMICITYH</code> will be set by default.
   </para>

   <para>If you are using the operating system thread-safety primitives approach,
you can also put this code in the same CPU directory, in which case no more
work is needed to locate the file.  For examples of this approach,
see the <code>atomicity.h</code> file for IRIX or IA64.
   </para>

   <para>Alternatively, if the primitives are more closely related to the OS
than they are to the CPU, you can put the <code>atomicity.h</code> file in
the <link linkend="internals.os">Operating system</link> directory instead.  In this case, you must
edit <code>configure.host</code>, and in the switch statement that handles
operating systems, override the <code>ATOMICITYH</code> variable to point to
the appropriate <code>os_include_dir</code>.  For examples of this approach,
see the <code>atomicity.h</code> file for AIX.
   </para>

   <para>With those bits out of the way, you have to actually write
<code>atomicity.h</code> itself.  This file should be wrapped in an
include guard named <code>_GLIBCXX_ATOMICITY_H</code>.  It should define one
type and two functions.
   </para>

   <para>The type is <code>_Atomic_word</code>.  Here is the version used on IRIX:
   </para>

<programlisting>
typedef long _Atomic_word;
</programlisting>

<para>This type must be a signed integral type supporting atomic operations.
If you're using the OS approach, use the same type used by your system's
primitives.  Otherwise, use the type for which your CPU provides atomic
primitives.
</para>

   <para>Then, you must provide two functions.
The bodies of these functions
must be equivalent to those provided here, but using atomic operations:
   </para>

<programlisting>
  static inline _Atomic_word
  __attribute__ ((__unused__))
  __exchange_and_add (_Atomic_word* __mem, int __val)
  {
    _Atomic_word __result = *__mem;
    *__mem += __val;
    return __result;
  }

  static inline void
  __attribute__ ((__unused__))
  __atomic_add (_Atomic_word* __mem, int __val)
  {
    *__mem += __val;
  }
</programlisting>

</section>


<section xml:id="internals.numeric_limits"><info><title>Numeric Limits</title></info>


<para>The C++ library requires information about the fundamental data types,
such as the minimum and maximum representable values of each type.
You can define each of these values individually, but it is usually
easiest just to indicate how many bits are used in each of the data
types and let the library do the rest.  For information about the
macros to define, see the top of <code>include/bits/std_limits.h</code>.
</para>

   <para>If you need to define any macros, you can do so in <code>os_defines.h</code>.
However, if all operating systems for your CPU are likely to use the
same values, you can provide a CPU-specific file instead so that you
do not have to provide the same definitions for each operating system.
To take that approach, create a new file called <code>cpu_limits.h</code> in
your CPU configuration directory (see <link linkend="internals.cpu">CPU</link>).
   </para>

</section>


<section xml:id="internals.libtool"><info><title>Libtool</title></info>


<para>The C++ library is compiled, archived and linked with libtool.
Explaining the full workings of libtool is beyond the scope of this
document, but there are a few particular bits that are necessary for
porting.
</para>

   <para>Some parts of the libstdc++ library are compiled with the libtool
<code>--tags CXX</code> option (the C++ definitions for libtool).  Therefore,
<code>ltcf-cxx.sh</code> in the top-level directory needs to have the correct
logic to compile and archive objects equivalent to the C version of libtool,
<code>ltcf-c.sh</code>.  Some libtool targets have definitions for C but not
for C++, or C++ definitions which have not been kept up to date.
   </para>

   <para>The C++ run-time library contains initialization code that needs to be
run as the library is loaded.  Often, that requires linking in special
object files when the C++ library is built as a shared library, or
taking other system-specific actions.
   </para>

   <para>The libstdc++ library is linked with the C version of libtool, even
though it is a C++ library.  Therefore, the C version of libtool needs to
ensure that the run-time library initializers are run.  The usual way to
do this is to build the library using <code>gcc -shared</code>.
   </para>

   <para>If you need to change how the library is linked, look at
<code>ltcf-c.sh</code> in the top-level directory.  Find the switch statement
that sets <code>archive_cmds</code>.  Here, adjust the setting for your
operating system.
   </para>


</section>

</section>