README revision 244601

XZ Utils
========

    0. Overview
    1. Documentation
       1.1. Overall documentation
       1.2. Documentation for command-line tools
       1.3. Documentation for liblzma
    2. Version numbering
    3. Reporting bugs
    4. Translating the xz tool
    5. Other implementations of the .xz format
    6. Contact information


0. Overview
-----------

    XZ Utils provide a general-purpose data-compression library plus
    command-line tools. The native file format is the .xz format, but
    also the legacy .lzma format is supported. The .xz format supports
    multiple compression algorithms, which are called "filters" in the
    context of XZ Utils. The primary filter is currently LZMA2. With
    typical files, XZ Utils create about 30 % smaller files than gzip.

    To ease adapting support for the .xz format into existing applications
    and scripts, the API of liblzma is somewhat similar to the API of the
    popular zlib library. For the same reason, the command-line tool xz
    has a command-line syntax similar to that of gzip.

    When aiming for the highest compression ratio, the LZMA2 encoder uses
    a lot of CPU time and may use, depending on the settings, even
    hundreds of megabytes of RAM. However, in fast modes, the LZMA2 encoder
    competes with bzip2 in compression speed, RAM usage, and compression
    ratio.

    LZMA2 is reasonably fast to decompress. It is a little slower than
    gzip, but a lot faster than bzip2.
    Being fast to decompress means that the .xz format is especially
    nice when the same file will be decompressed very many times
    (usually on different computers), which is the case e.g. when
    distributing software packages. In such situations, it's not too
    bad if the compression takes some time, since that needs to be done
    only once to benefit many people.

    With some file types, combining (or "chaining") LZMA2 with an
    additional filter can improve the compression ratio. A filter chain may
    contain up to four filters, although usually only one or two are used.
    For example, putting a BCJ (Branch/Call/Jump) filter before LZMA2
    in the filter chain can improve the compression ratio of executable
    files.

    Since the .xz format allows adding new filter IDs, it is possible that
    some day there will be a filter that is, for example, much faster to
    compress than LZMA2 (but probably with worse compression ratio).
    Similarly, it is possible that some day there will be a filter that
    compresses better than LZMA2.

    XZ Utils doesn't support multithreaded compression or decompression
    yet. It has been planned though and taken into account when designing
    the .xz file format.


1. Documentation
----------------

1.1. Overall documentation

    README              This file

    INSTALL.generic     Generic install instructions for those not familiar
                        with packages using GNU Autotools
    INSTALL             Installation instructions specific to XZ Utils
    PACKAGERS           Information for packagers of XZ Utils

    COPYING             XZ Utils copyright and license information
    COPYING.GPLv2       GNU General Public License version 2
    COPYING.GPLv3       GNU General Public License version 3
    COPYING.LGPLv2.1    GNU Lesser General Public License version 2.1

    AUTHORS             The main authors of XZ Utils
    THANKS              Incomplete list of people who have helped make
                        this software
    NEWS                User-visible changes between XZ Utils releases
    ChangeLog           Detailed list of changes (commit log)
    TODO                Known bugs and some sort of to-do list

    Note that only some of the above files are included in binary
    packages.


1.2. Documentation for command-line tools

    The command-line tools are documented as man pages. In source code
    releases (and possibly also in some binary packages), the man pages
    are also provided in plain text (ASCII only) and PDF formats in the
    directory "doc/man" to make the man pages more accessible to those
    whose operating system doesn't provide an easy way to view man pages.


1.3. Documentation for liblzma

    The liblzma API headers include short docs about each function
    and data type as Doxygen tags. These docs should be quite OK as
    a quick reference.

    I have planned to write a bunch of very well documented example
    programs, which (due to comments) should work as a tutorial to
    various features of liblzma. No such example programs have been
    written yet.

    For now, if you have never used liblzma, libbzip2, or zlib, I
    recommend learning the *basics* of the zlib API. Once you know that,
    it should be easier to learn liblzma.

        http://zlib.net/manual.html
        http://zlib.net/zlib_how.html


2. Version numbering
--------------------

    The version number format of XZ Utils is X.Y.ZS:

      - X is the major version. When this is incremented, the library
        API and ABI break.

      - Y is the minor version. It is incremented when new features
        are added without breaking the existing API or ABI. An even Y
        indicates a stable release and an odd Y indicates unstable
        (alpha or beta version).

      - Z is the revision. This has a different meaning for stable and
        unstable releases:

          * Stable: Z is incremented when bugs get fixed without adding
            any new features. This is intended to be convenient for
            downstream distributors that want bug fixes but don't want
            any new features to minimize the risk of introducing new bugs.

          * Unstable: Z is just a counter. API or ABI of features added
            in earlier unstable releases having the same X.Y may break.

      - S indicates stability of the release. It is missing from the
        stable releases, where Y is an even number.
        When Y is odd, S is either "alpha" or "beta" to make it very
        clear that such versions are not stable releases. The same X.Y.Z
        combination is not used for more than one stability level, i.e.
        after X.Y.Zalpha, the next version can be X.Y.(Z+1)beta but not
        X.Y.Zbeta.


3. Reporting bugs
-----------------

    Naturally it is easiest for me if you already know what causes the
    unexpected behavior. Even better if you have a patch to propose.
    However, quite often the reason for unexpected behavior is unknown,
    so here are a few things to do before sending a bug report:

      1. Try to create a small example that reproduces the issue.

      2. Compile XZ Utils with debugging code using configure switches
         --enable-debug and, if possible, --disable-shared. If you are
         using GCC, use CFLAGS='-O0 -ggdb3'. Don't strip the resulting
         binaries.

      3. Turn on core dumps. The exact command depends on your shell;
         for example in GNU bash it is done with "ulimit -c unlimited",
         and in tcsh with "limit coredumpsize unlimited".

      4. Try to reproduce the suspected bug. If you get an "assertion
         failed" message, be sure to include the complete message in
         your bug report. If the application leaves a coredump, get a
         backtrace using gdb:
           $ gdb /path/to/app-binary   # Load the app to the debugger.
           (gdb) core core   # Open the coredump.
           (gdb) bt   # Print the backtrace. Copy & paste to bug report.
           (gdb) quit   # Quit gdb.

    Report your bug via email or IRC (see Contact information below).
    Don't send core dump files or any executables.
    If you have a small example file(s) (total size less than 256 KiB),
    please include it/them as an attachment. If you have bigger test
    files, put them online somewhere and include a URL to the file(s)
    in the bug report.

    Always include the exact version number of XZ Utils in the bug report.
    If you are using a snapshot from the git repository, use "git describe"
    to get the exact snapshot version. If you are using XZ Utils shipped
    in an operating system distribution, mention the distribution name,
    distribution version, and exact xz package version; if you cannot
    repeat the bug with the code compiled from unpatched source code,
    you probably need to report a bug to your distribution's bug tracking
    system.


4. Translating the xz tool
--------------------------

    The messages from the xz tool have been translated into a few
    languages. Before starting to translate into a new language, ask
    the author whether someone else has already started working on it.

    Test your translation. Testing includes comparing the translated
    output to the original English version by running the same commands
    in both your target locale and with LC_ALL=C. Ask someone to
    proof-read and test the translation.

    Testing can be done e.g.
    by installing xz into a temporary directory:

        ./configure --disable-shared --prefix=/tmp/xz-test
        # <Edit the .po file in the po directory.>
        make -C po update-po
        make install
        bash debug/translations.bash | less
        bash debug/translations.bash | less -S  # For --list outputs

    Repeat the above as needed (no need to re-run configure though).

    Note especially the following:

      - The output of --help and --long-help must look nice on
        an 80-column terminal. It's OK to add extra lines if needed.

      - In contrast, don't add extra lines to error messages and such.
        They are often preceded by e.g. a filename on the same line,
        so you have no way to predict where to put a \n. Let the terminal
        do the wrapping even if it looks ugly. Adding new lines will be
        even uglier in the generic case even if it looks nice in a few
        limited examples.

      - Be careful with column alignment in tables and table-like output
        (--list, --list --verbose --verbose, --info-memory, --help, and
        --long-help):

          * All descriptions of options in --help should start in the
            same column (but it doesn't need to be the same column as
            in the English messages; just be consistent if you change it).
            Check that both --help and --long-help look OK, since they
            share several strings.

          * --list --verbose and --info-memory print lines that have
            the format "Description:   %s". If you need a longer
            description, you can put extra space between the colon
            and %s.
            Then you may need to add extra space to other strings too
            so that the result as a whole looks good (all values start
            at the same column).

          * The columns of the actual tables in --list --verbose --verbose
            should be aligned properly. Abbreviate if necessary. It might
            be good to keep at least 2 or 3 spaces between column headings
            and avoid spaces in the headings so that the columns stand out
            better, but this is a matter of opinion. Do what you think
            looks best.

      - Be careful to put a period at the end of a sentence when the
        original version has it, and don't put it when the original
        doesn't have it. Similarly, be careful with \n characters
        at the beginning and end of the strings.

      - Read the TRANSLATORS comments that have been extracted from the
        source code and included in xz.pot. If they suggest testing the
        translation with some type of command, do it. If testing needs
        input files, use e.g. tests/files/good-*.xz.

      - When updating the translation, read the fuzzy (modified) strings
        carefully, and don't mark them as updated before you actually
        have updated them. Reading through the unchanged messages can be
        good too; sometimes you may find a better wording for them.

      - If you find language problems in the original English strings,
        feel free to suggest improvements. Ask if something is unclear.

      - The translated messages should be understandable (sometimes this
        may be a problem with the original English messages too). Don't
        make a direct word-by-word translation from English especially if
        the result doesn't sound good in your language.
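
    The 80-column requirement for --help and --long-help mentioned in
    the list above can be checked roughly with a small script. The
    following is only a sketch in Python and is not part of XZ Utils;
    the function names display_width and overlong_lines are made up
    for this example, and the width calculation is an approximation
    (wide East Asian characters count as two columns, combining marks
    as zero). Whether a line of exactly 80 columns is acceptable is
    a judgment call; here only lines wider than the limit are reported.

```python
import unicodedata


def display_width(text):
    """Approximate the number of terminal columns `text` occupies."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on top of the previous character.
            continue
        # Wide and fullwidth characters take two columns on most terminals.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def overlong_lines(output, limit=80):
    """Return (line_number, width) pairs for lines wider than `limit`."""
    result = []
    for number, line in enumerate(output.splitlines(), start=1):
        # Tabs are expanded to 8-column tab stops before measuring.
        width = display_width(line.expandtabs(8))
        if width > limit:
            result.append((number, width))
    return result
```

    To use it, capture the output of "xz --help" and "xz --long-help"
    in the target locale and pass the text to overlong_lines(). An
    empty list means no line exceeded the limit. It does not replace
    looking at the output on a real terminal.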

    In short, take your time and pay attention to the details. Making
    a good translation is not a quick and trivial thing to do. The
    translated xz should look as polished as the English version.


5. Other implementations of the .xz format
------------------------------------------

    7-Zip and the p7zip port of 7-Zip support the .xz format starting
    from version 9.00alpha.

        http://7-zip.org/
        http://p7zip.sourceforge.net/

    XZ Embedded is a limited implementation written for use in the Linux
    kernel, but it is also suitable for other embedded use.

        http://tukaani.org/xz/embedded.html


6. Contact information
----------------------

    If you have questions, bug reports, patches etc. related to XZ Utils,
    contact Lasse Collin <lasse.collin@tukaani.org> (in Finnish or English).
    I'm sometimes slow at replying. If you haven't got a reply within two
    weeks, assume that your email has got lost and resend it or use IRC.

    You can also find me on #tukaani on Freenode; my nick is Larhzu.
    The channel tends to be pretty quiet, so just ask your question and
    someone may wake up.