<HTML>
<HEAD>
<!-- This HTML file has been created by texi2html 1.52b
     from gettext.texi on 29 December 2011 -->

<META HTTP-EQUIV="content-type" CONTENT="text/html; charset=UTF-8">
<TITLE>GNU gettext utilities - 10 Producing Binary MO Files</TITLE>
</HEAD>
<BODY>
Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_9.html">previous</A>, <A HREF="gettext_11.html">next</A>, <A HREF="gettext_25.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
<P><HR><P>


<H1><A NAME="SEC156" HREF="gettext_toc.html#TOC156">10 Producing Binary MO Files</A></H1>



<H2><A NAME="SEC157" HREF="gettext_toc.html#TOC157">10.1 Invoking the <CODE>msgfmt</CODE> Program</A></H2>

<P>
<A NAME="IDX914"></A>
<A NAME="IDX915"></A>

<PRE>
msgfmt [<VAR>option</VAR>] <VAR>filename</VAR>.po ...
</PRE>

<P>
<A NAME="IDX916"></A>
The <CODE>msgfmt</CODE> program generates a binary message catalog from a textual
translation description.

</P>


<H3><A NAME="SEC158" HREF="gettext_toc.html#TOC158">10.1.1 Input file location</A></H3>

<DL COMPACT>

<DT><SAMP>‘<VAR>filename</VAR>.po ...’</SAMP>
<DD>
<DT><SAMP>‘-D <VAR>directory</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--directory=<VAR>directory</VAR>’</SAMP>
<DD>
<A NAME="IDX917"></A>
<A NAME="IDX918"></A>
Add <VAR>directory</VAR> to the list of directories. Source files are
searched relative to this list of directories. The resulting binary
file will be written relative to the current directory, though.

</DL>

<P>
If an input file is <SAMP>‘-’</SAMP>, standard input is read.

</P>


<H3><A NAME="SEC159" HREF="gettext_toc.html#TOC159">10.1.2 Operation mode</A></H3>

<DL COMPACT>

<DT><SAMP>‘-j’</SAMP>
<DD>
<DT><SAMP>‘--java’</SAMP>
<DD>
<A NAME="IDX919"></A>
<A NAME="IDX920"></A>
<A NAME="IDX921"></A>
Java mode: generate a Java <CODE>ResourceBundle</CODE> class.

<DT><SAMP>‘--java2’</SAMP>
<DD>
<A NAME="IDX922"></A>
Like --java, and assume Java2 (JDK 1.2 or higher).

<DT><SAMP>‘--csharp’</SAMP>
<DD>
<A NAME="IDX923"></A>
<A NAME="IDX924"></A>
C# mode: generate a .NET .dll file containing a subclass of
<CODE>GettextResourceSet</CODE>.

<DT><SAMP>‘--csharp-resources’</SAMP>
<DD>
<A NAME="IDX925"></A>
<A NAME="IDX926"></A>
C# resources mode: generate a .NET <TT>‘.resources’</TT> file.

<DT><SAMP>‘--tcl’</SAMP>
<DD>
<A NAME="IDX927"></A>
<A NAME="IDX928"></A>
Tcl mode: generate a tcl/msgcat <TT>‘.msg’</TT> file.

<DT><SAMP>‘--qt’</SAMP>
<DD>
<A NAME="IDX929"></A>
<A NAME="IDX930"></A>
Qt mode: generate a Qt <TT>‘.qm’</TT> file.

</DL>



<H3><A NAME="SEC160" HREF="gettext_toc.html#TOC160">10.1.3 Output file location</A></H3>

<DL COMPACT>

<DT><SAMP>‘-o <VAR>file</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--output-file=<VAR>file</VAR>’</SAMP>
<DD>
<A NAME="IDX931"></A>
<A NAME="IDX932"></A>
Write output to the specified file.

<DT><SAMP>‘--strict’</SAMP>
<DD>
<A NAME="IDX933"></A>
Direct the program to work strictly following the Uniforum/Sun
implementation. Currently this only affects the naming of the output
file. If this option is not given, the name of the output file is the
same as the domain name. If the strict Uniforum mode is enabled, the
suffix <TT>‘.mo’</TT> is added to the file name if it is not already
present.

We find this behaviour of Sun's implementation rather silly and so by
default this mode is <EM>not</EM> selected.

</DL>

<P>
If the output <VAR>file</VAR> is <SAMP>‘-’</SAMP>, output is written to standard output.

</P>


<H3><A NAME="SEC161" HREF="gettext_toc.html#TOC161">10.1.4 Output file location in Java mode</A></H3>

<DL COMPACT>

<DT><SAMP>‘-r <VAR>resource</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--resource=<VAR>resource</VAR>’</SAMP>
<DD>
<A NAME="IDX934"></A>
<A NAME="IDX935"></A>
Specify the resource name.

<DT><SAMP>‘-l <VAR>locale</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--locale=<VAR>locale</VAR>’</SAMP>
<DD>
<A NAME="IDX936"></A>
<A NAME="IDX937"></A>
Specify the locale name, either a language specification of the form <VAR>ll</VAR>
or a combined language and country specification of the form <VAR>ll_CC</VAR>.

<DT><SAMP>‘-d <VAR>directory</VAR>’</SAMP>
<DD>
<A NAME="IDX938"></A>
Specify the base directory of the classes directory hierarchy.

</DL>

<P>
The class name is determined by appending the locale name to the resource name,
separated with an underscore. The <SAMP>‘-d’</SAMP> option is mandatory. The class
is written under the specified directory.

</P>


<H3><A NAME="SEC162" HREF="gettext_toc.html#TOC162">10.1.5 Output file location in C# mode</A></H3>

<DL COMPACT>

<DT><SAMP>‘-r <VAR>resource</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--resource=<VAR>resource</VAR>’</SAMP>
<DD>
<A NAME="IDX939"></A>
<A NAME="IDX940"></A>
Specify the resource name.

<DT><SAMP>‘-l <VAR>locale</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--locale=<VAR>locale</VAR>’</SAMP>
<DD>
<A NAME="IDX941"></A>
<A NAME="IDX942"></A>
Specify the locale name, either a language specification of the form <VAR>ll</VAR>
or a combined language and country specification of the form <VAR>ll_CC</VAR>.

<DT><SAMP>‘-d <VAR>directory</VAR>’</SAMP>
<DD>
<A NAME="IDX943"></A>
Specify the base directory for locale dependent <TT>‘.dll’</TT> files.

</DL>

<P>
The <SAMP>‘-l’</SAMP> and <SAMP>‘-d’</SAMP> options are mandatory.
The <TT>‘.dll’</TT> file is
written in a subdirectory of the specified directory whose name depends on the
locale.

</P>


<H3><A NAME="SEC163" HREF="gettext_toc.html#TOC163">10.1.6 Output file location in Tcl mode</A></H3>

<DL COMPACT>

<DT><SAMP>‘-l <VAR>locale</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--locale=<VAR>locale</VAR>’</SAMP>
<DD>
<A NAME="IDX944"></A>
<A NAME="IDX945"></A>
Specify the locale name, either a language specification of the form <VAR>ll</VAR>
or a combined language and country specification of the form <VAR>ll_CC</VAR>.

<DT><SAMP>‘-d <VAR>directory</VAR>’</SAMP>
<DD>
<A NAME="IDX946"></A>
Specify the base directory of <TT>‘.msg’</TT> message catalogs.

</DL>

<P>
The <SAMP>‘-l’</SAMP> and <SAMP>‘-d’</SAMP> options are mandatory. The <TT>‘.msg’</TT> file is
written in the specified directory.

</P>


<H3><A NAME="SEC164" HREF="gettext_toc.html#TOC164">10.1.7 Input file syntax</A></H3>

<DL COMPACT>

<DT><SAMP>‘-P’</SAMP>
<DD>
<DT><SAMP>‘--properties-input’</SAMP>
<DD>
<A NAME="IDX947"></A>
<A NAME="IDX948"></A>
Assume the input files are Java ResourceBundles in Java <CODE>.properties</CODE>
syntax, not in PO file syntax.

<DT><SAMP>‘--stringtable-input’</SAMP>
<DD>
<A NAME="IDX949"></A>
Assume the input files are NeXTstep/GNUstep localized resource files in
<CODE>.strings</CODE> syntax, not in PO file syntax.

</DL>



<H3><A NAME="SEC165" HREF="gettext_toc.html#TOC165">10.1.8 Input file interpretation</A></H3>

<DL COMPACT>

<DT><SAMP>‘-c’</SAMP>
<DD>
<DT><SAMP>‘--check’</SAMP>
<DD>
<A NAME="IDX950"></A>
<A NAME="IDX951"></A>
Perform all the checks implied by <CODE>--check-format</CODE>, <CODE>--check-header</CODE>,
<CODE>--check-domain</CODE>.

<DT><SAMP>‘--check-format’</SAMP>
<DD>
<A NAME="IDX952"></A>
<A NAME="IDX953"></A>
Check language dependent format strings.

If the string represents a format string used in a
<CODE>printf</CODE>-like function, the original and the translated string
should have the same number of <SAMP>‘%’</SAMP> format specifiers, with matching
types. If the flag <CODE>c-format</CODE> or <CODE>possible-c-format</CODE> appears
in the special comment <KBD>#,</KBD> for this entry, a check is performed. For
example, the check will diagnose using <SAMP>‘%.*s’</SAMP> against <SAMP>‘%s’</SAMP>, or <SAMP>‘%d’</SAMP>
against <SAMP>‘%s’</SAMP>, or <SAMP>‘%d’</SAMP> against <SAMP>‘%x’</SAMP>. It can even handle
positional parameters.

Normally the <CODE>xgettext</CODE> program automatically decides whether a
string is a format string or not. This algorithm is not perfect,
though. It might regard a string as a format string even though it is not
used in a <CODE>printf</CODE>-like function, and so <CODE>msgfmt</CODE> might report
errors where there are none.

To solve this problem, the programmer can dictate the decision to the
<CODE>xgettext</CODE> program (see section <A HREF="gettext_15.html#SEC248">15.3.1 C Format Strings</A>). The translator should not
consider removing the flag from the <KBD>#,</KBD> line. This "fix" would be
reversed again as soon as <CODE>msgmerge</CODE> is called the next time.

<DT><SAMP>‘--check-header’</SAMP>
<DD>
<A NAME="IDX954"></A>
Verify presence and contents of the header entry. See section <A HREF="gettext_6.html#SEC44">6.2 Filling in the Header Entry</A>,
for a description of the various fields in the header entry.

<DT><SAMP>‘--check-domain’</SAMP>
<DD>
<A NAME="IDX955"></A>
Check for conflicts between domain directives and the <CODE>--output-file</CODE>
option.

<DT><SAMP>‘-C’</SAMP>
<DD>
<DT><SAMP>‘--check-compatibility’</SAMP>
<DD>
<A NAME="IDX956"></A>
<A NAME="IDX957"></A>
<A NAME="IDX958"></A>
Check that GNU msgfmt behaves like X/Open msgfmt. This will give an error
when attempting to use the GNU extensions.

<DT><SAMP>‘--check-accelerators[=<VAR>char</VAR>]’</SAMP>
<DD>
<A NAME="IDX959"></A>
<A NAME="IDX960"></A>
<A NAME="IDX961"></A>
<A NAME="IDX962"></A>
Check presence of keyboard accelerators for menu items. This is based on
the convention used in some GUIs that a keyboard accelerator in a menu
item string is designated by an immediately preceding <SAMP>‘&’</SAMP> character.
Sometimes a keyboard accelerator is also called "keyboard mnemonic".
This check verifies that if the untranslated string has exactly one
<SAMP>‘&’</SAMP> character, the translated string has exactly one <SAMP>‘&’</SAMP> as well.
If this option is given with a <VAR>char</VAR> argument, this <VAR>char</VAR> should
be a non-alphanumeric character and is used as keyboard accelerator mark
instead of <SAMP>‘&’</SAMP>.

<DT><SAMP>‘-f’</SAMP>
<DD>
<DT><SAMP>‘--use-fuzzy’</SAMP>
<DD>
<A NAME="IDX963"></A>
<A NAME="IDX964"></A>
<A NAME="IDX965"></A>
Use fuzzy entries in output. Note that using this option is usually wrong,
because fuzzy messages are exactly those which have not been validated by
a human translator.

</DL>



<H3><A NAME="SEC166" HREF="gettext_toc.html#TOC166">10.1.9 Output details</A></H3>

<DL COMPACT>

<DT><SAMP>‘-a <VAR>number</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--alignment=<VAR>number</VAR>’</SAMP>
<DD>
<A NAME="IDX966"></A>
<A NAME="IDX967"></A>
Align strings to <VAR>number</VAR> bytes (default: 1).

<DT><SAMP>‘--no-hash’</SAMP>
<DD>
<A NAME="IDX968"></A>
Don't include a hash table in the binary file. Lookup will be more expensive
at run time (binary search instead of hash table lookup).

</DL>



<H3><A NAME="SEC167" HREF="gettext_toc.html#TOC167">10.1.10 Informative output</A></H3>

<DL COMPACT>

<DT><SAMP>‘-h’</SAMP>
<DD>
<DT><SAMP>‘--help’</SAMP>
<DD>
<A NAME="IDX969"></A>
<A NAME="IDX970"></A>
Display this help and exit.

<DT><SAMP>‘-V’</SAMP>
<DD>
<DT><SAMP>‘--version’</SAMP>
<DD>
<A NAME="IDX971"></A>
<A NAME="IDX972"></A>
Output version information and exit.

<DT><SAMP>‘--statistics’</SAMP>
<DD>
<A NAME="IDX973"></A>
Print statistics about translations.

<DT><SAMP>‘-v’</SAMP>
<DD>
<DT><SAMP>‘--verbose’</SAMP>
<DD>
<A NAME="IDX974"></A>
<A NAME="IDX975"></A>
Increase verbosity level.

</DL>



<H2><A NAME="SEC168" HREF="gettext_toc.html#TOC168">10.2 Invoking the <CODE>msgunfmt</CODE> Program</A></H2>

<P>
<A NAME="IDX976"></A>
<A NAME="IDX977"></A>

<PRE>
msgunfmt [<VAR>option</VAR>] [<VAR>file</VAR>]...
</PRE>

<P>
<A NAME="IDX978"></A>
The <CODE>msgunfmt</CODE> program converts a binary message catalog to a
Uniforum style .po file.

</P>


<H3><A NAME="SEC169" HREF="gettext_toc.html#TOC169">10.2.1 Operation mode</A></H3>

<DL COMPACT>

<DT><SAMP>‘-j’</SAMP>
<DD>
<DT><SAMP>‘--java’</SAMP>
<DD>
<A NAME="IDX979"></A>
<A NAME="IDX980"></A>
<A NAME="IDX981"></A>
Java mode: input is a Java <CODE>ResourceBundle</CODE> class.

<DT><SAMP>‘--csharp’</SAMP>
<DD>
<A NAME="IDX982"></A>
<A NAME="IDX983"></A>
C# mode: input is a .NET .dll file containing a subclass of
<CODE>GettextResourceSet</CODE>.

<DT><SAMP>‘--csharp-resources’</SAMP>
<DD>
<A NAME="IDX984"></A>
<A NAME="IDX985"></A>
C# resources mode: input is a .NET <TT>‘.resources’</TT> file.

<DT><SAMP>‘--tcl’</SAMP>
<DD>
<A NAME="IDX986"></A>
<A NAME="IDX987"></A>
Tcl mode: input is a tcl/msgcat <TT>‘.msg’</TT> file.

</DL>



<H3><A NAME="SEC170" HREF="gettext_toc.html#TOC170">10.2.2 Input file location</A></H3>

<DL COMPACT>

<DT><SAMP>‘<VAR>file</VAR> ...’</SAMP>
<DD>
Input .mo files.

</DL>

<P>
If no input <VAR>file</VAR> is given or if it is <SAMP>‘-’</SAMP>, standard input is read.

</P>


<H3><A NAME="SEC171" HREF="gettext_toc.html#TOC171">10.2.3 Input file location in Java mode</A></H3>

<DL COMPACT>

<DT><SAMP>‘-r <VAR>resource</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--resource=<VAR>resource</VAR>’</SAMP>
<DD>
<A NAME="IDX988"></A>
<A NAME="IDX989"></A>
Specify the resource name.

<DT><SAMP>‘-l <VAR>locale</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--locale=<VAR>locale</VAR>’</SAMP>
<DD>
<A NAME="IDX990"></A>
<A NAME="IDX991"></A>
Specify the locale name, either a language specification of the form <VAR>ll</VAR>
or a combined language and country specification of the form <VAR>ll_CC</VAR>.

</DL>

<P>
The class name is determined by appending the locale name to the resource name,
separated with an underscore. The class is located using the <CODE>CLASSPATH</CODE>.

</P>


<H3><A NAME="SEC172" HREF="gettext_toc.html#TOC172">10.2.4 Input file location in C# mode</A></H3>

<DL COMPACT>

<DT><SAMP>‘-r <VAR>resource</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--resource=<VAR>resource</VAR>’</SAMP>
<DD>
<A NAME="IDX992"></A>
<A NAME="IDX993"></A>
Specify the resource name.

<DT><SAMP>‘-l <VAR>locale</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--locale=<VAR>locale</VAR>’</SAMP>
<DD>
<A NAME="IDX994"></A>
<A NAME="IDX995"></A>
Specify the locale name, either a language specification of the form <VAR>ll</VAR>
or a combined language and country specification of the form <VAR>ll_CC</VAR>.

<DT><SAMP>‘-d <VAR>directory</VAR>’</SAMP>
<DD>
<A NAME="IDX996"></A>
Specify the base directory for locale dependent <TT>‘.dll’</TT> files.

</DL>

<P>
The <SAMP>‘-l’</SAMP> and <SAMP>‘-d’</SAMP> options are mandatory. The <TT>‘.dll’</TT> file is
located in a subdirectory of the specified directory whose name depends on the
locale.

</P>


<H3><A NAME="SEC173" HREF="gettext_toc.html#TOC173">10.2.5 Input file location in Tcl mode</A></H3>

<DL COMPACT>

<DT><SAMP>‘-l <VAR>locale</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--locale=<VAR>locale</VAR>’</SAMP>
<DD>
<A NAME="IDX997"></A>
<A NAME="IDX998"></A>
Specify the locale name, either a language specification of the form <VAR>ll</VAR>
or a combined language and country specification of the form <VAR>ll_CC</VAR>.

<DT><SAMP>‘-d <VAR>directory</VAR>’</SAMP>
<DD>
<A NAME="IDX999"></A>
Specify the base directory of <TT>‘.msg’</TT> message catalogs.

</DL>

<P>
The <SAMP>‘-l’</SAMP> and <SAMP>‘-d’</SAMP> options are mandatory. The <TT>‘.msg’</TT> file is
located in the specified directory.

</P>


<H3><A NAME="SEC174" HREF="gettext_toc.html#TOC174">10.2.6 Output file location</A></H3>

<DL COMPACT>

<DT><SAMP>‘-o <VAR>file</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--output-file=<VAR>file</VAR>’</SAMP>
<DD>
<A NAME="IDX1000"></A>
<A NAME="IDX1001"></A>
Write output to the specified file.

</DL>

<P>
The results are written to standard output if no output file is specified
or if it is <SAMP>‘-’</SAMP>.

</P>


<H3><A NAME="SEC175" HREF="gettext_toc.html#TOC175">10.2.7 Output details</A></H3>

<DL COMPACT>

<DT><SAMP>‘--force-po’</SAMP>
<DD>
<A NAME="IDX1002"></A>
Always write an output file even if it contains no messages.

<DT><SAMP>‘-i’</SAMP>
<DD>
<DT><SAMP>‘--indent’</SAMP>
<DD>
<A NAME="IDX1003"></A>
<A NAME="IDX1004"></A>
Write the .po file using indented style.

<DT><SAMP>‘--strict’</SAMP>
<DD>
<A NAME="IDX1005"></A>
Write out a strict Uniforum conforming PO file. Note that this
Uniforum format should be avoided because it doesn't support the
GNU extensions.

<DT><SAMP>‘-p’</SAMP>
<DD>
<DT><SAMP>‘--properties-output’</SAMP>
<DD>
<A NAME="IDX1006"></A>
<A NAME="IDX1007"></A>
Write out a Java ResourceBundle in Java <CODE>.properties</CODE> syntax. Note
that this file format doesn't support plural forms and silently drops
obsolete messages.

<DT><SAMP>‘--stringtable-output’</SAMP>
<DD>
<A NAME="IDX1008"></A>
Write out a NeXTstep/GNUstep localized resource file in <CODE>.strings</CODE> syntax.
Note that this file format doesn't support plural forms.

<DT><SAMP>‘-w <VAR>number</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--width=<VAR>number</VAR>’</SAMP>
<DD>
<A NAME="IDX1009"></A>
<A NAME="IDX1010"></A>
Set the output page width. Long strings in the output files will be
split across multiple lines in order to ensure that each line's width
(= number of screen columns) is less than or equal to the given <VAR>number</VAR>.

<DT><SAMP>‘--no-wrap’</SAMP>
<DD>
<A NAME="IDX1011"></A>
Do not break long message lines. Message lines whose width exceeds the
output page width will not be split into several lines. Only file reference
lines which are wider than the output page width will be split.

<DT><SAMP>‘-s’</SAMP>
<DD>
<DT><SAMP>‘--sort-output’</SAMP>
<DD>
<A NAME="IDX1012"></A>
<A NAME="IDX1013"></A>
<A NAME="IDX1014"></A>
Generate sorted output. Note that using this option makes it much harder
for the translator to understand each message's context.

</DL>



<H3><A NAME="SEC176" HREF="gettext_toc.html#TOC176">10.2.8 Informative output</A></H3>

<DL COMPACT>

<DT><SAMP>‘-h’</SAMP>
<DD>
<DT><SAMP>‘--help’</SAMP>
<DD>
<A NAME="IDX1015"></A>
<A NAME="IDX1016"></A>
Display this help and exit.

<DT><SAMP>‘-V’</SAMP>
<DD>
<DT><SAMP>‘--version’</SAMP>
<DD>
<A NAME="IDX1017"></A>
<A NAME="IDX1018"></A>
Output version information and exit.

<DT><SAMP>‘-v’</SAMP>
<DD>
<DT><SAMP>‘--verbose’</SAMP>
<DD>
<A NAME="IDX1019"></A>
<A NAME="IDX1020"></A>
Increase verbosity level.

</DL>



<H2><A NAME="SEC177" HREF="gettext_toc.html#TOC177">10.3 The Format of GNU MO Files</A></H2>
<P>
<A NAME="IDX1021"></A>
<A NAME="IDX1022"></A>

</P>
<P>
The format of the generated MO files is best described by a picture,
which appears below.

</P>
<P>
<A NAME="IDX1023"></A>
The first two words serve to identify the file. The magic
number will always signal GNU MO files. The number is stored in the
byte order of the generating machine, so the magic number really is
two numbers: <CODE>0x950412de</CODE> and <CODE>0xde120495</CODE>. The second
word describes the current revision of the file format. For now the
revision is 0. This might change in future versions, and ensures
that the readers of MO files can distinguish new formats from old
ones, so that both can be handled correctly. The version is kept
separate from the magic number, instead of using different magic
numbers for different formats, mainly because <TT>‘/etc/magic’</TT> is
not updated often.
It might be better to have the magic number separated from the
internal format version identification.

</P>
<P>
A number of pointers to later tables in the file follow, allowing
for the extension of the prefix part of MO files without having to
recompile programs reading them. This might become useful for later
inserting a few flag bits, an indication of the charset used, new
tables, or other things.

</P>
<P>
Then, at offset <VAR>O</VAR> and offset <VAR>T</VAR> in the picture, two tables
of string descriptors can be found. In both tables, each string
descriptor uses two 32-bit integers, one for the string length,
another for the offset of the string in the MO file, counting in bytes
from the start of the file. The first table contains descriptors
for the original strings, and is sorted so the original strings
are in increasing lexicographical order. The second table contains
descriptors for the translated strings, and is parallel to the first
table: to find the corresponding translation one has to access the
array slot in the second array with the same index.

</P>
<P>
Having the original strings sorted enables the use of simple binary
search, for when the MO file does not contain a hashing table, or
for when it is not practical to use the hashing table provided in
the MO file. This also has another advantage, as the empty string
in a GNU <CODE>gettext</CODE> PO file is usually <EM>translated</EM> into
some system information attached to that particular MO file, and the
empty string necessarily becomes the first in both the original and
translated tables, making the system information very easy to find.

</P>
<P>
<A NAME="IDX1024"></A>
The size <VAR>S</VAR> of the hash table can be zero. In this case, the
hash table itself is not contained in the MO file.
Some people might
prefer this because a precomputed hashing table takes disk space, and
does not gain <EM>that</EM> much speed. The hash table contains indices
to the sorted array of strings in the MO file. Conflict resolution is
done by double hashing. The precise hashing algorithm used is fairly
dependent on GNU <CODE>gettext</CODE> code, and is not documented here.

</P>
<P>
As for the strings themselves, they follow the hash table, and each
is terminated with a <KBD>NUL</KBD>, and this <KBD>NUL</KBD> is not counted in
the length which appears in the string descriptor. The <CODE>msgfmt</CODE>
program has an option selecting the alignment for MO file strings.
With this option, each string is separately aligned so it starts at
an offset which is a multiple of the alignment value. On some RISC
machines, a correct alignment will speed things up.

</P>
<P>
<A NAME="IDX1025"></A>
Contexts are stored by storing the concatenation of the context, an
<KBD>EOT</KBD> byte, and the original string, instead of the original string.

</P>
<P>
<A NAME="IDX1026"></A>
Plural forms are stored by letting the plural of the original string
follow the singular of the original string, separated by a
<KBD>NUL</KBD> byte. The length which appears in the string descriptor
includes both. However, only the singular of the original string
takes part in the hash table lookup. The plural variants of the
translation are all stored consecutively, separated by a
<KBD>NUL</KBD> byte. Here also, the length in the string descriptor
includes all of them.

</P>
<P>
Nothing prevents a MO file from having embedded <KBD>NUL</KBD>s in strings.
However, the program interface currently used already presumes
that strings are <KBD>NUL</KBD> terminated, so embedded <KBD>NUL</KBD>s are
somewhat useless.
But the MO file format is general enough that other
interfaces would be possible later if, for example, we ever wanted to
implement wide characters right in MO files, where <KBD>NUL</KBD> bytes may
accidentally appear. (No, we don't want to have wide characters in MO
files. They would make the file unnecessarily large, and the
<SAMP>‘wchar_t’</SAMP> type being platform dependent, MO files would be
platform dependent as well.)

</P>
<P>
This particular issue has been strongly debated in the GNU
<CODE>gettext</CODE> development forum, and it is to be expected that the MO file
format will evolve or change over time. It is even possible that many
formats may later be supported concurrently. But surely, we have to
start somewhere, and the MO file format described here is a good start.
Nothing is cast in concrete, and the format may later evolve fairly
easily, so we should feel comfortable with the current approach.

</P>

<PRE>
        byte
             +------------------------------------------+
          0  | magic number = 0x950412de                |
             |                                          |
          4  | file format revision = 0                 |
             |                                          |
          8  | number of strings                        |  == N
             |                                          |
         12  | offset of table with original strings    |  == O
             |                                          |
         16  | offset of table with translation strings |  == T
             |                                          |
         20  | size of hashing table                    |  == S
             |                                          |
         24  | offset of hashing table                  |  == H
             |                                          |
             .                                          .
             .    (possibly more entries later)         .
             .                                          .
             |                                          |
          O  | length & offset 0th string  ----------------.
      O + 8  | length & offset 1st string  ------------------.
              ...                                    ...   | |
O + ((N-1)*8)| length & offset (N-1)th string           |  | |
             |                                          |  | |
          T  | length & offset 0th translation  ---------------.
      T + 8  | length & offset 1st translation  -----------------.
              ...                                    ...   | | | |
T + ((N-1)*8)| length & offset (N-1)th translation      |  | | | |
             |                                          |  | | | |
          H  | start hash table                         |  | | | |
              ...                                    ...   | | | |
  H + S * 4  | end hash table                           |  | | | |
             |                                          |  | | | |
             | NUL terminated 0th string  <----------------' | | |
             |                                          |    | | |
             | NUL terminated 1st string  <------------------' | |
             |                                          |      | |
              ...                                    ...       | |
             |                                          |      | |
             | NUL terminated 0th translation  <---------------' |
             |                                          |        |
             | NUL terminated 1st translation  <-----------------'
             |                                          |
              ...                                    ...
             |                                          |
             +------------------------------------------+
</PRE>

<P><HR><P>
Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_9.html">previous</A>, <A HREF="gettext_11.html">next</A>, <A HREF="gettext_25.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
</BODY>
</HTML>