Notes on GCC's Native Language Support

GCC's Native Language Support (NLS) is relatively new and
experimental, so NLS is currently disabled by default.

The main reason NLS is buggy is that GCC does not set the locale
categories correctly. Currently, GCC sets only LC_MESSAGES, and only
if the system supports it; otherwise it sets no locale category at
all. To work correctly, GCC would also have to set the character set
used by the terminal, either by setting LC_CTYPE together with
LC_MESSAGES, or by setting LC_ALL if LC_MESSAGES is not supported.

This would change the behaviour of GCC in quite a few places, because
a number of standard C functions and macros change their behaviour
depending on the locale. The necessary changes have been made in the
development version, but they are beyond the scope of a maintenance
release such as this. It is therefore recommended that you leave NLS
disabled.

If you still want to enable the feature, use configure's --enable-nls
option. Eventually, NLS will be enabled by default, and you'll need
--disable-nls to disable it. You must enable NLS in order to make a
GCC distribution.

By and large, only diagnostic messages have been internationalized.
Some work remains in other areas; for example, GCC does not yet allow
non-ASCII letters in identifiers.

Not all of GCC's diagnostic messages have been internationalized.
Programs like `enquire' and `genattr' are not internationalized, as
their users are GCC maintainers who typically need to be able to read
English anyway; internationalizing them would thus entail needless
work for the human translators. And no one has yet gotten around to
internationalizing the messages in the C++ compiler, or in the
specialized MIPS-specific programs mips-tdump and mips-tfile.

The GCC library should not contain any messages that need
internationalization, because it operates below the
internationalization library.
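
As an illustration of the locale setup described earlier in these
notes, here is a minimal sketch of how a program could select both the
message catalog and the terminal's character set. It is not GCC's
actual code: the helper name init_message_locale and its locale_name
parameter are hypothetical, and only the standard setlocale function
and the LC_* category macros are assumed.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper, not GCC's actual code.  Select the locale
   LOCALE_NAME ("" means "take it from the environment") for both
   message translation and character classification.  Return the
   name of the resulting message locale, or NULL on failure.  */
const char *
init_message_locale (const char *locale_name)
{
#ifdef LC_MESSAGES
  /* LC_CTYPE tells the C library which character set the terminal
     uses; LC_MESSAGES selects the message catalog.  */
  if (setlocale (LC_CTYPE, locale_name) == NULL)
    return NULL;
  return setlocale (LC_MESSAGES, locale_name);
#else
  /* No LC_MESSAGES on this system: fall back to LC_ALL.  */
  return setlocale (LC_ALL, locale_name);
#endif
}
```

A program would typically call init_message_locale ("") once at
startup, so that the user's environment picks the locale; a null
return means the requested locale is not supported.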

Currently, the only language translation supplied is en_UK (British English).

Unlike some other GNU programs, the GCC sources contain few instances
of explicit translation calls like _("string"). Instead, the
diagnostic printing routines automatically translate their arguments.
For example, GCC source code should not contain calls like `error
(_("unterminated comment"))'; it should contain calls like `error
("unterminated comment")' instead, as it is the `error' function's
responsibility to translate the message before the user sees it.

By convention, any function parameter in the GCC sources whose name
ends in `msgid' is expected to be a message requiring translation.
For example, the `error' function's first parameter is named `msgid'.
GCC's exgettext script uses this convention to determine which
function parameter strings need to be translated. The exgettext
script also assumes that any occurrence of `%eMSGID}' on a source
line, where MSGID does not contain `%' or `}', corresponds to a
message MSGID that requires translation; this is needed to identify
diagnostics in GCC spec strings.

If you enable NLS and modify source files, you'll need to use a
special version of the GNU gettext package to propagate the
modifications to the translation tables. Apply the following patch
(use `patch -p0') to GNU gettext 0.10.35, which you can retrieve from:

ftp://alpha.gnu.org/gnu/gettext-0.10.35.tar.gz

This patch has been submitted to the GNU gettext maintainer, so
eventually we shouldn't need this special gettext version.

This patch is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.

This patch is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this patch; see the file COPYING. If not, write to
the Free Software Foundation, 59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA.

1998-07-26  Paul Eggert  <eggert@twinsun.com>

	* po/Makefile.in.in (maintainer-clean): Remove cat-id-tbl.c and
	stamp-cat-id.

1998-07-24  Paul Eggert  <eggert@twinsun.com>

	* po/Makefile.in.in (cat-id-tbl.o): Depend on
	$(top_srcdir)/intl/libgettext.h, not ../intl/libgettext.h.

1998-07-20  Paul Eggert  <eggert@twinsun.com>

	* po/Makefile.in.in (.po.pox, all-yes, $(srcdir)/cat-id-tbl.c,
	$(srcdir)/stamp-cat-id, update-po): Prepend `$(srcdir)/' to
	files built in the source directory; this is needed for
	VPATH-based make in Solaris 2.6.

1998-07-17  Paul Eggert  <eggert@twinsun.com>

	Add support for user-specified argument numbers for keywords.
	Extract all strings from a keyword arg, not just the first one.
	Handle parenthesized commas inside keyword args correctly.
	Warn about nested keywords.

	* doc/gettext.texi: Document --keyword=id:argnum.

	* src/xgettext.c (scan_c_file):
	Warn about nested keywords, e.g. _(_("xxx")).
	Warn also about not-yet-implemented but allowed nesting, e.g.
	dcgettext(..._("xxx")..., "yyy").
	Get all strings in a keyword arg, not just the first one.
	Handle parenthesized commas inside keyword args correctly.

	* src/xget-lex.h (enum xgettext_token_type_ty):
	Replace xgettext_token_type_keyword1 and
	xgettext_token_type_keyword2 with just plain
	xgettext_token_type_keyword; it now has argnum value.
	Add xgettext_token_type_rp.
	(struct xgettext_token_ty): Add argnum member.
	line_number and file_name are now also set for
	xgettext_token_type_keyword.
	(xgettext_lex_keyword): Arg is const char *.

	* src/xget-lex.c: Include "hash.h".
	(enum token_type_ty): Add token_type_rp.
	(keywords): Now a hash table.
	(phase5_get): Return token_type_rp for ')'.
	(xgettext_lex, xgettext_lex_keyword): Add support for keyword argnums.
	(xgettext_lex): Return xgettext_token_type_rp for ')'.
	Report keyword argnum, line number, and file name back to caller.

1998-07-09  Paul Eggert  <eggert@twinsun.com>

	* intl/Makefile.in (uninstall):
	Do nothing unless $(PACKAGE) is gettext.

===================================================================
RCS file: doc/gettext.texi,v
retrieving revision 0.10.35.0
retrieving revision 0.10.35.1
diff -pu -r0.10.35.0 -r0.10.35.1
--- doc/gettext.texi	1998/05/01 05:53:32	0.10.35.0
+++ doc/gettext.texi	1998/07/18 00:25:15	0.10.35.1
@@ -1854,13 +1854,19 @@ List of directories searched for input f
 Join messages with existing file.
 
 @item -k @var{word}
-@itemx --keyword[=@var{word}]
-Additonal keyword to be looked for (without @var{word} means not to
+@itemx --keyword[=@var{keywordspec}]
+Additonal keyword to be looked for (without @var{keywordspec} means not to
 use default keywords).
 
-The default keywords, which are always looked for if not explicitly
-disabled, are @code{gettext}, @code{dgettext}, @code{dcgettext} and
-@code{gettext_noop}.
+If @var{keywordspec} is a C identifier @var{id}, @code{xgettext} looks
+for strings in the first argument of each call to the function or macro
+@var{id}. If @var{keywordspec} is of the form
+@samp{@var{id}:@var{argnum}}, @code{xgettext} looks for strings in the
+@var{argnum}th argument of the call.
+
+The default keyword specifications, which are always looked for if not
+explicitly disabled, are @code{gettext}, @code{dgettext:2},
+@code{dcgettext:2} and @code{gettext_noop}.
 
 @item -m [@var{string}]
 @itemx --msgstr-prefix[=@var{string}]
===================================================================
RCS file: intl/Makefile.in,v
retrieving revision 0.10.35.0
retrieving revision 0.10.35.1
diff -pu -r0.10.35.0 -r0.10.35.1
--- intl/Makefile.in	1998/04/27 21:53:18	0.10.35.0
+++ intl/Makefile.in	1998/07/09 21:39:18	0.10.35.1
@@ -143,10 +143,14 @@ install-data: all
 installcheck:
 
 uninstall:
-	dists="$(DISTFILES.common)"; \
-	for file in $$dists; do \
-	  rm -f $(gettextsrcdir)/$$file; \
-	done
+	if test "$(PACKAGE)" = "gettext"; then \
+	  dists="$(DISTFILES.common)"; \
+	  for file in $$dists; do \
+	    rm -f $(gettextsrcdir)/$$file; \
+	  done; \
+	else \
+	  : ; \
+	fi
 
 info dvi:
 
===================================================================
RCS file: src/xget-lex.c,v
retrieving revision 0.10.35.0
retrieving revision 0.10.35.1
diff -pu -r0.10.35.0 -r0.10.35.1
--- src/xget-lex.c	1998/07/09 22:49:48	0.10.35.0
+++ src/xget-lex.c	1998/07/18 00:25:15	0.10.35.1
@@ -33,6 +33,7 @@
 #include "error.h"
 #include "system.h"
 #include "libgettext.h"
+#include "hash.h"
 #include "str-list.h"
 #include "xget-lex.h"
 
@@ -83,6 +84,7 @@ enum token_type_ty
   token_type_eoln,
   token_type_hash,
   token_type_lp,
+  token_type_rp,
   token_type_comma,
   token_type_name,
   token_type_number,
@@ -109,7 +111,7 @@ static FILE *fp;
 static int trigraphs;
 static int cplusplus_comments;
 static string_list_ty *comment;
-static string_list_ty *keywords;
+static hash_table keywords;
 static int default_keywords = 1;
 
 /* These are for tracking whether comments count as immediately before
@@ -941,6 +943,10 @@ phase5_get (tp)
       tp->type = token_type_lp;
       return;
 
+    case ')':
+      tp->type = token_type_rp;
+      return;
+
     case ',':
       tp->type = token_type_comma;
       return;
@@ -1179,6 +1185,7 @@ xgettext_lex (tp)
   while (1)
     {
       token_ty token;
+      void *keyword_value;
 
       phase8_get (&token);
       switch (token.type)
@@ -1213,17 +1220,20 @@ xgettext_lex (tp)
 	  if (default_keywords)
 	    {
 	      xgettext_lex_keyword ("gettext");
-	      xgettext_lex_keyword ("dgettext");
-	      xgettext_lex_keyword ("dcgettext");
+	      xgettext_lex_keyword ("dgettext:2");
+	      xgettext_lex_keyword ("dcgettext:2");
 	      xgettext_lex_keyword ("gettext_noop");
 	      default_keywords = 0;
 	    }
 
-	  if (string_list_member (keywords, token.string))
-	    {
-	      tp->type = (strcmp (token.string, "dgettext") == 0
-			  || strcmp (token.string, "dcgettext") == 0)
-		? xgettext_token_type_keyword2 : xgettext_token_type_keyword1;
+	  if (find_entry (&keywords, token.string, strlen (token.string),
+			  &keyword_value)
+	      == 0)
+	    {
+	      tp->type = xgettext_token_type_keyword;
+	      tp->argnum = (int) keyword_value;
+	      tp->line_number = token.line_number;
+	      tp->file_name = logical_file_name;
 	    }
 	  else
 	    tp->type = xgettext_token_type_symbol;
@@ -1236,6 +1246,12 @@ xgettext_lex (tp)
 	  tp->type = xgettext_token_type_lp;
 	  return;
 
+	case token_type_rp:
+	  last_non_comment_line = newline_count;
+
+	  tp->type = xgettext_token_type_rp;
+	  return;
+
 	case token_type_comma:
 	  last_non_comment_line = newline_count;
 
@@ -1263,16 +1279,32 @@ xgettext_lex (tp)
 
 void
 xgettext_lex_keyword (name)
-     char *name;
+     const char *name;
 {
   if (name == NULL)
     default_keywords = 0;
   else
     {
-      if (keywords == NULL)
-	keywords = string_list_alloc ();
+      int argnum;
+      size_t len;
+      const char *sp;
+
+      if (keywords.table == NULL)
+	init_hash (&keywords, 100);
+
+      sp = strchr (name, ':');
+      if (sp)
+	{
+	  len = sp - name;
+	  argnum = atoi (sp + 1);
+	}
+      else
+	{
+	  len = strlen (name);
+	  argnum = 1;
+	}
 
-      string_list_append_unique (keywords, name);
+      insert_entry (&keywords, name, len, (void *) argnum);
     }
 }
 
===================================================================
RCS file: src/xget-lex.h,v
retrieving revision 0.10.35.0
retrieving revision 0.10.35.1
diff -pu -r0.10.35.0 -r0.10.35.1
--- src/xget-lex.h	1998/07/09 22:49:48	0.10.35.0
+++ src/xget-lex.h	1998/07/18 00:25:15	0.10.35.1
@@ -23,9 +23,9 @@ Foundation, Inc., 59 Temple Place - Suit
 enum xgettext_token_type_ty
 {
   xgettext_token_type_eof,
-  xgettext_token_type_keyword1,
-  xgettext_token_type_keyword2,
+  xgettext_token_type_keyword,
   xgettext_token_type_lp,
+  xgettext_token_type_rp,
   xgettext_token_type_comma,
   xgettext_token_type_string_literal,
   xgettext_token_type_symbol
@@ -37,8 +37,14 @@ struct xgettext_token_ty
 {
   xgettext_token_type_ty type;
 
-  /* These 3 are only set for xgettext_token_type_string_literal. */
+  /* This 1 is set only for xgettext_token_type_keyword. */
+  int argnum;
+
+  /* This 1 is set only for xgettext_token_type_string_literal. */
   char *string;
+
+  /* These 2 are set only for xgettext_token_type_keyword and
+     xgettext_token_type_string_literal. */
   int line_number;
   char *file_name;
 };
@@ -50,7 +56,7 @@ void xgettext_lex PARAMS ((xgettext_toke
 const char *xgettext_lex_comment PARAMS ((size_t __n));
 void xgettext_lex_comment_reset PARAMS ((void));
 /* void xgettext_lex_filepos PARAMS ((char **, int *)); FIXME needed? */
-void xgettext_lex_keyword PARAMS ((char *__name));
+void xgettext_lex_keyword PARAMS ((const char *__name));
 void xgettext_lex_cplusplus PARAMS ((void));
 void xgettext_lex_trigraphs PARAMS ((void));
 
===================================================================
RCS file: src/xgettext.c,v
retrieving revision 0.10.35.0
retrieving revision 0.10.35.1
diff -pu -r0.10.35.0 -r0.10.35.1
--- src/xgettext.c	1998/07/09 22:49:48	0.10.35.0
+++ src/xgettext.c	1998/07/18 00:25:15	0.10.35.1
@@ -835,6 +835,8 @@ scan_c_file(filename, mlp, is_cpp_file)
      int is_cpp_file;
 {
   int state;
+  int commas_to_skip;	/* defined only when in states 1 and 2 */
+  int paren_nesting;	/* defined only when in state 2 */
 
   /* Inform scanner whether we have C++ files or not. */
   if (is_cpp_file)
@@ -854,63 +856,79 @@ scan_c_file(filename, mlp, is_cpp_file)
     {
       xgettext_token_ty token;
 
-      /* A simple state machine is used to do the recognising:
+      /* A state machine is used to do the recognising:
 	 State 0 = waiting for something to happen
-	 State 1 = seen one of our keywords with string in first parameter
-	 State 2 = was in state 1 and now saw a left paren
-	 State 3 = seen one of our keywords with string in second parameter
-	 State 4 = was in state 3 and now saw a left paren
-	 State 5 = waiting for comma after being in state 4
-	 State 6 = saw comma after being in state 5 */
+	 State 1 = seen one of our keywords
+	 State 2 = waiting for part of an argument */
       xgettext_lex (&token);
       switch (token.type)
 	{
-	case xgettext_token_type_keyword1:
+	case xgettext_token_type_keyword:
+	  if (!extract_all && state == 2)
+	    {
+	      if (commas_to_skip == 0)
+		{
+		  error (0, 0,
+			 _("%s:%d: warning: keyword nested in keyword arg"),
+			 token.file_name, token.line_number);
+		  continue;
+		}
+
+	      /* Here we should nest properly, but this would require a
+		 potentially unbounded stack. We haven't run across an
+		 example that needs this functionality yet. For now,
+		 we punt and forget the outer keyword. */
+	      error (0, 0,
+		     _("%s:%d: warning: keyword between outer keyword and its arg"),
+		     token.file_name, token.line_number);
+	    }
+	  commas_to_skip = token.argnum - 1;
 	  state = 1;
 	  continue;
 
-	case xgettext_token_type_keyword2:
-	  state = 3;
-	  continue;
-
 	case xgettext_token_type_lp:
 	  switch (state)
 	    {
 	    case 1:
+	      paren_nesting = 0;
 	      state = 2;
 	      break;
-	    case 3:
-	      state = 4;
+	    case 2:
+	      paren_nesting++;
 	      break;
-	    default:
-	      state = 0;
 	    }
 	  continue;
 
+	case xgettext_token_type_rp:
+	  if (state == 2 && paren_nesting != 0)
+	    paren_nesting--;
+	  else
+	    state = 0;
+	  continue;
+
 	case xgettext_token_type_comma:
-	  state = state == 5 ? 6 : 0;
+	  if (state == 2 && commas_to_skip != 0)
+	    commas_to_skip -= paren_nesting == 0;
+	  else
+	    state = 0;
 	  continue;
 
 	case xgettext_token_type_string_literal:
-	  if (extract_all || state == 2 || state == 6)
-	    {
-	      remember_a_message (mlp, &token);
-	      state = 0;
-	    }
+	  if (extract_all || (state == 2 && commas_to_skip == 0))
+	    remember_a_message (mlp, &token);
 	  else
 	    {
 	      free (token.string);
-	      state = (state == 4 || state == 5) ? 5 : 0;
+	      state = state == 2 ? 2 : 0;
 	    }
 	  continue;
 
 	case xgettext_token_type_symbol:
-	  state = (state == 4 || state == 5) ? 5 : 0;
+	  state = state == 2 ? 2 : 0;
 	  continue;
 
 	default:
-	  state = 0;
-	  continue;
+	  abort ();
 
 	case xgettext_token_type_eof:
 	  break;
===================================================================
RCS file: po/Makefile.in.in,v
retrieving revision 0.10.35.0
retrieving revision 0.10.35.5
diff -u -r0.10.35.0 -r0.10.35.5
--- po/Makefile.in.in	1998/07/20 20:20:38	0.10.35.0
+++ po/Makefile.in.in	1998/07/26 09:07:52	0.10.35.5
@@ -62,7 +62,7 @@
 	$(COMPILE) $<
 
 .po.pox:
-	$(MAKE) $(PACKAGE).pot
+	$(MAKE) $(srcdir)/$(PACKAGE).pot
 	$(MSGMERGE) $< $(srcdir)/$(PACKAGE).pot -o $*.pox
 
 .po.mo:
@@ -79,7 +79,7 @@
 
 all: all-@USE_NLS@
 
-all-yes: cat-id-tbl.c $(CATALOGS)
+all-yes: $(srcdir)/cat-id-tbl.c $(CATALOGS)
 all-no:
 
 $(srcdir)/$(PACKAGE).pot: $(POTFILES)
@@ -90,8 +90,8 @@
 	  || ( rm -f $(srcdir)/$(PACKAGE).pot \
 	       && mv $(PACKAGE).po $(srcdir)/$(PACKAGE).pot )
 
-$(srcdir)/cat-id-tbl.c: stamp-cat-id; @:
-$(srcdir)/stamp-cat-id: $(PACKAGE).pot
+$(srcdir)/cat-id-tbl.c: $(srcdir)/stamp-cat-id; @:
+$(srcdir)/stamp-cat-id: $(srcdir)/$(PACKAGE).pot
 	rm -f cat-id-tbl.tmp
 	sed -f ../intl/po2tbl.sed $(srcdir)/$(PACKAGE).pot \
 		| sed -e "s/@PACKAGE NAME@/$(PACKAGE)/" > cat-id-tbl.tmp
@@ -180,7 +180,8 @@
 
 check: all
 
-cat-id-tbl.o: ../intl/libgettext.h
+cat-id-tbl.o: $(srcdir)/cat-id-tbl.c $(top_srcdir)/intl/libgettext.h
+	$(COMPILE) $(srcdir)/cat-id-tbl.c
 
 dvi info tags TAGS ID:
 
@@ -196,7 +197,7 @@
 maintainer-clean: distclean
 	@echo "This command is intended for maintainers to use;"
 	@echo "it deletes files that may require special tools to rebuild."
-	rm -f $(GMOFILES)
+	rm -f $(GMOFILES) cat-id-tbl.c stamp-cat-id
 
 distdir = ../$(PACKAGE)-$(VERSION)/$(subdir)
 dist distdir: update-po $(DISTFILES)
@@ -207,7 +208,7 @@
 	done
 
 update-po: Makefile
-	$(MAKE) $(PACKAGE).pot
+	$(MAKE) $(srcdir)/$(PACKAGE).pot
 	PATH=`pwd`/../src:$$PATH; \
 	cd $(srcdir); \
 	catalogs='$(CATALOGS)'; \
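
The `id:argnum' keyword syntax that the patch above adds to xgettext
can be tried in isolation. The following is a minimal sketch that
mirrors the parsing logic in the patched xgettext_lex_keyword; the
struct keyword_spec type and the parse_keyword_spec function are
hypothetical names introduced for illustration and are not part of
gettext.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration type; not part of gettext.  */
struct keyword_spec
{
  size_t len;	/* length of the identifier part of the spec */
  int argnum;	/* 1-based number of the argument to extract */
};

/* Split SPEC, which is either `id' or `id:argnum', the same way the
   patched xgettext_lex_keyword does: everything before the first
   colon is the identifier, and a missing argnum defaults to 1.  */
struct keyword_spec
parse_keyword_spec (const char *spec)
{
  struct keyword_spec ks;
  const char *sp = strchr (spec, ':');

  if (sp)
    {
      ks.len = (size_t) (sp - spec);
      ks.argnum = atoi (sp + 1);
    }
  else
    {
      ks.len = strlen (spec);
      ks.argnum = 1;
    }
  return ks;
}
```

With this sketch, parse_keyword_spec ("dgettext:2") yields an
identifier length of 8 and an argument number of 2, while a plain
"gettext" defaults to argument 1.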