internals.xml revision 1.1.1.1
<sect1 id="appendix.porting.internals" xreflabel="Porting Internals">
<?dbhtml filename="internals.html"?>

<sect1info>
  <keywordset>
    <keyword>
      ISO C++
    </keyword>
    <keyword>
      internals
    </keyword>
  </keywordset>
</sect1info>

<title>Porting to New Hardware or Operating Systems</title>

<para>
</para>


<para>This document explains how to port libstdc++ (the GNU C++ library) to
a new target.
</para>

   <para>In order to make the GNU C++ library (libstdc++) work with a new
target, you must edit some configuration files and provide some new
header files. Unless this is done, libstdc++ will use generic
settings which may not be correct for your target; even if they are
correct, they will likely be inefficient.
   </para>

   <para>Before you get started, make sure that you have a working C library on
your target. The C library need not precisely comply with any
particular standard, but should generally conform to the requirements
imposed by the ANSI/ISO standard.
   </para>

   <para>In addition, you should try to verify that the C++ compiler generally
works. It is difficult to test the C++ compiler without a working
library, but you should at least try some minimal test cases.
   </para>

   <para>(Note that what we think of as a "target," the library refers to as
a "host." The comment at the top of <code>configure.ac</code> explains why.)
   </para>


<sect2 id="internals.os">
<title>Operating System</title>

<para>If you are porting to a new operating system (as opposed to a new chip
using an existing operating system), you will need to create a new
directory in the <code>config/os</code> hierarchy. For example, the IRIX
configuration files are all in <code>config/os/irix</code>. There is no set
way to organize the OS configuration directory.
For example,
<code>config/os/solaris/solaris-2.6</code> and
<code>config/os/solaris/solaris-2.7</code> are used as configuration
directories for these two versions of Solaris. On the other hand, both
Solaris 2.7 and Solaris 2.8 use the <code>config/os/solaris/solaris-2.7</code>
directory. The important point is that there needs to be a
directory under <code>config/os</code> to store the files for your operating
system.
</para>

   <para>You might have to change the <code>configure.host</code> file to ensure that
your new directory is activated. Look for the switch statement that sets
<code>os_include_dir</code>, and add a pattern to handle your operating system
if the default will not suffice. The switch statement switches on only
the OS portion of the standard target triplet; e.g., the <code>solaris2.8</code>
in <code>sparc-sun-solaris2.8</code>. If the new directory is named after the
OS portion of the triplet (the default), then nothing needs to be changed.
   </para>

   <para>The first file to create in this directory should be called
<code>os_defines.h</code>. This file contains basic macro definitions
that are required to allow the C++ library to work with your C library.
   </para>

   <para>Several libstdc++ source files unconditionally define the macro
<code>_POSIX_SOURCE</code>. On many systems, defining this macro causes
large portions of the C library header files to be eliminated
at preprocessing time. Therefore, you may have to <code>#undef</code> this
macro, or define other macros (like <code>_LARGEFILE_SOURCE</code> or
<code>__EXTENSIONS__</code>). You won't know what macros to define or
undefine at this point; you'll have to try compiling the library and
see what goes wrong. If you see errors about calling functions
that have not been declared, look in your C library headers to see if
the functions are declared there, and then figure out what macros you
need to define.
You will need to add them to the
<code>CPLUSPLUS_CPP_SPEC</code> macro in the GCC configuration file for your
target. It will not work to simply define these macros in
<code>os_defines.h</code>.
   </para>

   <para>At this time, there are a few libstdc++-specific macros which may be
defined:
   </para>

   <para><code>_GLIBCXX_USE_C99_CHECK</code> may be defined to 1 to check C99
function declarations (which are not covered by specialization below)
found in system headers against versions found in the library headers
derived from the standard.
   </para>

   <para><code>_GLIBCXX_USE_C99_DYNAMIC</code> may be defined to an expression that
yields 0 if and only if the system headers are exposing proper support
for C99 functions (which are not covered by specialization below). If
defined, it must be 0 while bootstrapping the compiler/rebuilding the
library.
   </para>

   <para><code>_GLIBCXX_USE_C99_LONG_LONG_CHECK</code> may be defined to 1 to check
the set of C99 long long function declarations found in system headers
against versions found in the library headers derived from the
standard.

   </para>
   <para><code>_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC</code> may be defined to an
expression that yields 0 if and only if the system headers are
exposing proper support for the set of C99 long long functions. If
defined, it must be 0 while bootstrapping the compiler/rebuilding the
library.
   </para>
   <para><code>_GLIBCXX_USE_C99_FP_MACROS_DYNAMIC</code> may be defined to an
expression that yields 0 if and only if the system headers
are exposing proper support for the related set of macros. If defined,
it must be 0 while bootstrapping the compiler/rebuilding the library.
   </para>
   <para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_CHECK</code> may be defined
to 1 to check the related set of function declarations found in system
headers against versions found in the library headers derived from
the standard.
   </para>
   <para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_DYNAMIC</code> may be defined
to an expression that yields 0 if and only if the system headers
are exposing proper support for the related set of functions. If defined,
it must be 0 while bootstrapping the compiler/rebuilding the library.
   </para>
   <para>Finally, you should bracket the entire file in an include-guard, like
this:
   </para>

<programlisting>

#ifndef _GLIBCXX_OS_DEFINES
#define _GLIBCXX_OS_DEFINES
...
#endif
</programlisting>

   <para>We recommend copying an existing <code>os_defines.h</code> to use as a
starting point.
   </para>
</sect2>


<sect2 id="internals.cpu">
<title>CPU</title>

<para>If you are porting to a new chip (as opposed to a new operating system
running on an existing chip), you will need to create a new directory in the
<code>config/cpu</code> hierarchy. Much like the <link linkend="internals.os">Operating system</link> setup,
there are no strict rules on how to organize the CPU configuration
directory, but careful naming choices will allow the configury to find your
setup files without explicit help.
</para>

   <para>We recommend that for a target triplet <code><CPU>-<vendor>-<OS></code>, you
name your configuration directory <code>config/cpu/<CPU></code>. If you do this,
the configury will find the directory by itself. Otherwise you will need to
edit the <code>configure.host</code> file and, in the switch statement that sets
<code>cpu_include_dir</code>, add a pattern to handle your chip.
   </para>

   <para>Note that some chip families share a single configuration directory; for
example, <code>alpha</code>, <code>alphaev5</code>, and <code>alphaev6</code> all use the
<code>config/cpu/alpha</code> directory, and there is an entry in the
<code>configure.host</code> switch statement to handle this.
   </para>

   <para>The <code>cpu_include_dir</code> sets default locations for the files controlling
<link linkend="internals.thread_safety">Thread safety</link> and <link linkend="internals.numeric_limits">Numeric limits</link>, if the defaults are not
appropriate for your chip.
   </para>

</sect2>


<sect2 id="internals.char_types">
<title>Character Types</title>

<para>The library requires that you provide three header files to implement
character classification, analogous to that provided by the C library's
<code><ctype.h></code> header. You can model these on the files provided in
<code>config/os/generic</code>. However, these files will almost
certainly need some modification.
</para>

   <para>The first file to write is <code>ctype_base.h</code>. This file provides
some very basic information about character classification. The libstdc++
library assumes that your C library implements <code><ctype.h></code> by using
a table (indexed by character code) containing integers, where each of
these integers is a bit-mask indicating whether the character is
upper-case, lower-case, alphabetic, etc. The <code>ctype_base.h</code>
file gives the type of the integer, and the values of the various bit
masks. You will have to peer at your own <code><ctype.h></code> to figure out
how to define the values required by this file.
   </para>

   <para>The <code>ctype_base.h</code> header file does not need include guards.
It should contain a single <code>struct</code> definition called
<code>ctype_base</code>.
This <code>struct</code> should contain two type
declarations, and one enumeration declaration, like this example, taken
from the IRIX configuration:
   </para>

<programlisting>
  struct ctype_base
  {
    typedef unsigned int 	mask;
    typedef int* 		__to_type;

    enum
    {
      space = _ISspace,
      print = _ISprint,
      cntrl = _IScntrl,
      upper = _ISupper,
      lower = _ISlower,
      alpha = _ISalpha,
      digit = _ISdigit,
      punct = _ISpunct,
      xdigit = _ISxdigit,
      alnum = _ISalnum,
      graph = _ISgraph
    };
  };
</programlisting>

<para>The <code>mask</code> type is the type of the elements in the table. If your
C library uses a table to map lower-case characters to upper-case characters,
and vice versa, you should define <code>__to_type</code> to be the type of the
elements in that table. If you don't mind taking a minor performance
penalty, or if your library doesn't implement <code>toupper</code> and
<code>tolower</code> in this way, you can pick any pointer-to-integer type,
but you must still define the type.
</para>

   <para>The enumeration should give definitions for all the values in the above
example, using the values from your native <code><ctype.h></code>. They can
be given symbolically (as above), or numerically, if you prefer. You do
not have to include <code><ctype.h></code> in this header; it will always be
included before <code>ctype_base.h</code> is included.
   </para>

   <para>The next file to write is <code>ctype_noninline.h</code>, which also does
not require include guards. This file defines a few member functions
that will be included in <code>include/bits/locale_facets.h</code>. The first
function that must be written is the <code>ctype<char>::ctype</code>
constructor.
Here is the IRIX example:
   </para>

<programlisting>
ctype<char>::ctype(const mask* __table = 0, bool __del = false,
                   size_t __refs = 0)
  : _Ctype_nois<char>(__refs), _M_del(__table != 0 && __del),
    _M_toupper(NULL),
    _M_tolower(NULL),
    _M_ctable(NULL),
    _M_table(!__table
             ? (const mask*) (__libc_attr._ctype_tbl->_class + 1)
             : __table)
  { }
</programlisting>

<para>There are two parts of this that you might choose to alter. The first,
and most important, is the line involving <code>__libc_attr</code>. That is
IRIX system-dependent code that gets the base of the table mapping
character codes to attributes. You need to substitute code that obtains
the address of this table on your system. If you want to use your
operating system's tables to map upper-case letters to lower-case, and
vice versa, you should initialize <code>_M_toupper</code> and
<code>_M_tolower</code> with those tables, in similar fashion.
</para>

   <para>Now, you have to write two functions to convert from upper-case to
lower-case, and vice versa. Here are the IRIX versions:
   </para>

<programlisting>
  char
  ctype<char>::do_toupper(char __c) const
  { return _toupper(__c); }

  char
  ctype<char>::do_tolower(char __c) const
  { return _tolower(__c); }
</programlisting>

<para>Your C library provides equivalents to IRIX's <code>_toupper</code> and
<code>_tolower</code>. If you initialized <code>_M_toupper</code> and
<code>_M_tolower</code> above, then you could use those tables instead.
</para>

   <para>Finally, you have to provide two utility functions that convert strings
of characters.
The versions provided here will always work, but you
could use specialized routines for greater performance if you have
machinery to do that on your system:
   </para>

<programlisting>
  const char*
  ctype<char>::do_toupper(char* __low, const char* __high) const
  {
    while (__low < __high)
      {
        *__low = do_toupper(*__low);
        ++__low;
      }
    return __high;
  }

  const char*
  ctype<char>::do_tolower(char* __low, const char* __high) const
  {
    while (__low < __high)
      {
        *__low = do_tolower(*__low);
        ++__low;
      }
    return __high;
  }
</programlisting>

   <para>You must also provide the <code>ctype_inline.h</code> file, which
contains a few more functions. On most systems, you can just copy
<code>config/os/generic/ctype_inline.h</code> and use it on your system.
   </para>

   <para>In detail, the functions provided test characters for particular
properties; they are analogous to the functions like <code>isalpha</code> and
<code>islower</code> provided by the C library.
   </para>

   <para>The first function is implemented like this on IRIX:
   </para>

<programlisting>
  bool
  ctype<char>::
  is(mask __m, char __c) const throw()
  { return (_M_table)[(unsigned char)(__c)] & __m; }
</programlisting>

<para>The <code>_M_table</code> is the table passed in above, in the constructor.
This is the table that contains the bitmasks for each character. The
implementation here should work on all systems.
</para>

   <para>The next function is:
   </para>

<programlisting>
  const char*
  ctype<char>::
  is(const char* __low, const char* __high, mask* __vec) const throw()
  {
    while (__low < __high)
      *__vec++ = (_M_table)[(unsigned char)(*__low++)];
    return __high;
  }
</programlisting>

<para>This function is similar; it copies the masks for all the characters
from <code>__low</code> up until <code>__high</code> into the vector given by
<code>__vec</code>.
</para>

   <para>The last two functions again are entirely generic:
   </para>

<programlisting>
  const char*
  ctype<char>::
  scan_is(mask __m, const char* __low, const char* __high) const throw()
  {
    while (__low < __high && !this->is(__m, *__low))
      ++__low;
    return __low;
  }

  const char*
  ctype<char>::
  scan_not(mask __m, const char* __low, const char* __high) const throw()
  {
    while (__low < __high && this->is(__m, *__low))
      ++__low;
    return __low;
  }
</programlisting>

</sect2>


<sect2 id="internals.thread_safety">
<title>Thread Safety</title>

<para>The C++ library string functionality requires a couple of atomic
operations to provide thread-safety. If you don't take any special
action, the library will use stub versions of these functions that are
not thread-safe. They will work fine, unless your applications are
multi-threaded.
</para>

   <para>If you want to provide custom, safe versions of these functions, there
are two distinct approaches. One is to provide a version for your CPU,
using assembly language constructs. The other is to use the
thread-safety primitives in your operating system. In either case, you
make a file called <code>atomicity.h</code>, and the variable
<code>ATOMICITYH</code> must point to this file.
   </para>

   <para>If you are using the assembly-language approach, put this code in
<code>config/cpu/<chip>/atomicity.h</code>, where chip is the name of
your processor (see <link linkend="internals.cpu">CPU</link>). No additional changes are necessary to
locate the file in this case; <code>ATOMICITYH</code> will be set by default.
   </para>

   <para>If you are using the operating system thread-safety primitives approach,
you can also put this code in the same CPU directory, in which case no more
work is needed to locate the file. For examples of this approach,
see the <code>atomicity.h</code> file for IRIX or IA64.
   </para>

   <para>Alternatively, if the primitives are more closely related to the OS
than they are to the CPU, you can put the <code>atomicity.h</code> file in
the <link linkend="internals.os">Operating system</link> directory instead. In this case, you must
edit <code>configure.host</code>, and in the switch statement that handles
operating systems, override the <code>ATOMICITYH</code> variable to point to
the appropriate <code>os_include_dir</code>. For examples of this approach,
see the <code>atomicity.h</code> file for AIX.
   </para>

   <para>With those bits out of the way, you have to actually write
<code>atomicity.h</code> itself. This file should be wrapped in an
include guard named <code>_GLIBCXX_ATOMICITY_H</code>. It should define one
type, and two functions.
   </para>

   <para>The type is <code>_Atomic_word</code>. Here is the version used on IRIX:
   </para>

<programlisting>
typedef long _Atomic_word;
</programlisting>

<para>This type must be a signed integral type supporting atomic operations.
If you're using the OS approach, use the same type used by your system's
primitives. Otherwise, use the type for which your CPU provides atomic
primitives.
</para>

   <para>Then, you must provide two functions.
The bodies of these functions
must be equivalent to those provided here, but using atomic operations:
   </para>

<programlisting>
  static inline _Atomic_word
  __attribute__ ((__unused__))
  __exchange_and_add (_Atomic_word* __mem, int __val)
  {
    _Atomic_word __result = *__mem;
    *__mem += __val;
    return __result;
  }

  static inline void
  __attribute__ ((__unused__))
  __atomic_add (_Atomic_word* __mem, int __val)
  {
    *__mem += __val;
  }
</programlisting>

</sect2>


<sect2 id="internals.numeric_limits">
<title>Numeric Limits</title>

<para>The C++ library requires information about the fundamental data types,
such as the minimum and maximum representable values of each type.
You can define each of these values individually, but it is usually
easiest just to indicate how many bits are used in each of the data
types and let the library do the rest. For information about the
macros to define, see the top of <code>include/bits/std_limits.h</code>.
</para>

   <para>If you need to define any macros, you can do so in <code>os_defines.h</code>.
However, if all operating systems for your CPU are likely to use the
same values, you can provide a CPU-specific file instead so that you
do not have to provide the same definitions for each operating system.
To take that approach, create a new file called <code>cpu_limits.h</code> in
your CPU configuration directory (see <link linkend="internals.cpu">CPU</link>).
   </para>

</sect2>


<sect2 id="internals.libtool">
<title>Libtool</title>

<para>The C++ library is compiled, archived and linked with libtool.
Explaining the full workings of libtool is beyond the scope of this
document, but there are a few particular bits that are necessary for
porting.
</para>

   <para>Some parts of the libstdc++ library are compiled with the libtool
<code>--tags CXX</code> option (the C++ definitions for libtool). Therefore,
<code>ltcf-cxx.sh</code> in the top-level directory needs to have the correct
logic to compile and archive objects equivalent to the C version of libtool,
<code>ltcf-c.sh</code>. Some libtool targets have definitions for C but not
for C++, or C++ definitions which have not been kept up to date.
   </para>

   <para>The C++ run-time library contains initialization code that needs to be
run as the library is loaded. Often, that requires linking in special
object files when the C++ library is built as a shared library, or
taking other system-specific actions.
   </para>

   <para>The libstdc++ library is linked with the C version of libtool, even
though it is a C++ library. Therefore, the C version of libtool needs to
ensure that the run-time library initializers are run. The usual way to
do this is to build the library using <code>gcc -shared</code>.
   </para>

   <para>If you need to change how the library is linked, look at
<code>ltcf-c.sh</code> in the top-level directory. Find the switch statement
that sets <code>archive_cmds</code>. Here, adjust the setting for your
operating system.
   </para>


</sect2>

</sect1>