\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename grep.info
@settitle grep, print lines matching a pattern
@c %**end of header

@c This file has the new style title page commands.
@c Run `makeinfo' rather than `texinfo-format-buffer'.

@c smallbook

@c tex
@c \overfullrule=0pt
@c end tex

@include version.texi

@c Combine indices.
@syncodeindex ky cp
@syncodeindex pg cp
@syncodeindex tp cp

@defcodeindex op
@syncodeindex op fn

@ifinfo
@direntry
* grep: (grep).                   print lines matching a pattern.
@end direntry
This file documents @sc{grep}, a pattern matching engine.


Published by the Free Software Foundation,
59 Temple Place - Suite 330
Boston, MA 02111-1307, USA

Copyright (C) 1998 Free Software Foundation, Inc.

Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.

@ignore
Permission is granted to process this file through TeX and print the
results, provided the printed document carries copying permission
notice identical to this one except for the removal of this paragraph
(this paragraph not being relevant to the printed manual).

@end ignore
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.

Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that this permission notice may be stated in a translation approved
by the Foundation.
@end ifinfo

@setchapternewpage off

@titlepage
@title grep, searching for a pattern
@subtitle version @value{VERSION}, @value{UPDATED}
@author Alain Magloire et al.

@page
@vskip 0pt plus 1filll
Copyright @copyright{} 1998 Free Software Foundation, Inc.

@sp 2
Published by the Free Software Foundation, @*
59 Temple Place - Suite 330, @*
Boston, MA 02111-1307, USA

Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.

Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.

Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that this permission notice may be stated in a translation approved
by the Foundation.

@end titlepage
@page


@node Top, Introduction, (dir), (dir)
@comment node-name, next, previous, up

@ifinfo
This document was produced for version @value{VERSION} of @sc{GNU} @sc{grep}.
@end ifinfo

@menu
* Introduction::                Introduction.
* Invoking::                    Invoking @sc{grep}; description of options.
* Diagnostics::                 Exit status returned by @sc{grep}.
* Grep Programs::               @sc{grep} programs.
* Regular Expressions::         Regular Expressions.
* Reporting Bugs::              Reporting Bugs.
* Concept Index::               A menu with all the topics in this manual.
* Index::                       A menu with all @sc{grep} commands
                                 and command-line options.
@end menu


@node Introduction, Invoking, Top, Top
@comment node-name, next, previous, up
@chapter Introduction

@cindex Searching for a pattern.
@sc{grep} searches the input files for lines containing a match to a given
pattern list. When it finds a match in a line, it copies the line to standard
output (by default), or does whatever other sort of output you have requested
with options.
@sc{grep} expects to do the matching on text.
Since newline is also a separator for the list of patterns, there
is no way to match newline characters in a text.

@node Invoking, Diagnostics, Introduction, Top
@comment node-name, next, previous, up
@chapter Invoking @sc{grep}

@sc{grep} comes with a rich set of options from POSIX.2 and GNU extensions.

@table @samp

@item -c
@itemx --count
@opindex -c
@opindex --count
@cindex counting lines
Suppress normal output; instead print a count of matching
lines for each input file. With the @samp{-v}, @samp{--revert-match} option,
count non-matching lines.

@item -e @var{pattern}
@itemx --regexp=@var{pattern}
@opindex -e
@opindex --regexp=@var{pattern}
@cindex pattern list
Use @var{pattern} as the pattern; useful to protect patterns
beginning with a @samp{-}.

@item -f @var{file}
@itemx --file=@var{file}
@opindex -f
@opindex --file
@cindex pattern from file
Obtain patterns from @var{file}, one per line. The empty
file contains zero patterns, and therefore matches nothing.

@item -i
@itemx --ignore-case
@opindex -i
@opindex --ignore-case
@cindex case insensitive search
Ignore case distinctions in both the pattern and the input files.

@item -l
@itemx --files-with-matches
@opindex -l
@opindex --files-with-matches
@cindex names of matching files
Suppress normal output; instead print the name of each input
file from which output would normally have been printed.
The scanning of every file will stop on the first match.

@item -n
@itemx --line-number
@opindex -n
@opindex --line-number
@cindex line numbering
Prefix each line of output with the line number within its input file.

@item -q
@itemx --quiet
@itemx --silent
@opindex -q
@opindex --quiet
@opindex --silent
@cindex quiet, silent
Quiet; suppress normal output.
The scanning of every file will stop on
the first match. Also see the @samp{-s} or @samp{--no-messages} option.

@item -s
@itemx --no-messages
@opindex -s
@opindex --no-messages
@cindex suppress error messages
Suppress error messages about nonexistent or unreadable files.
Portability note: unlike GNU @sc{grep}, BSD @sc{grep} does not comply
with POSIX.2, because BSD @sc{grep} lacks a @samp{-q} option and its
@samp{-s} option behaves like GNU @sc{grep}'s @samp{-q} option. Shell
scripts intended to be portable to BSD @sc{grep} should avoid both
@samp{-q} and @samp{-s} and should redirect
output to @file{/dev/null} instead.

@item -v
@itemx --revert-match
@opindex -v
@opindex --revert-match
@cindex revert matching
@cindex print non-matching lines
Invert the sense of matching, to select non-matching lines.

@item -x
@itemx --line-regexp
@opindex -x
@opindex --line-regexp
@cindex match the whole line
Select only those matches that exactly match the whole line.

@end table

@section GNU Extensions

@table @samp

@item -A @var{num}
@itemx --after-context=@var{num}
@opindex -A
@opindex --after-context
@cindex after context
@cindex context lines, after match
Print @var{num} lines of trailing context after matching lines.

@item -B @var{num}
@itemx --before-context=@var{num}
@opindex -B
@opindex --before-context
@cindex before context
@cindex context lines, before match
Print @var{num} lines of leading context before matching lines.

@item -C
@itemx --context@var{[=num]}
@opindex -C
@opindex --context
@cindex context
Print @var{num} lines (default 2) of output context.


@item -NUM
@opindex -NUM
Same as @samp{--context=@var{num}} lines of leading and trailing
context. However, grep will never print any given line more than once.


@item -V
@itemx --version
@opindex -V
@opindex --version
@cindex Version, printing
Print the version number of @sc{grep} to the standard output stream.
This version number should be included in all bug reports.

@item --help
@opindex --help
@cindex Usage summary, printing
Print a usage message briefly summarizing these command-line options
and the bug-reporting address, then exit.

@item -b
@itemx --byte-offset
@opindex -b
@opindex --byte-offset
@cindex byte offset
Print the byte offset within the input file before each line of output.
When @sc{grep} runs on MS-DOS or MS-Windows, the printed byte offsets
depend on whether the @samp{-u} (@samp{--unix-byte-offsets}) option is
used; see below.

@item -d @var{action}
@itemx --directories=@var{action}
@opindex -d
@opindex --directories
@cindex directory search
If an input file is a directory, use @var{action} to process it.
By default, @var{action} is @samp{read}, which means that directories are
read just as if they were ordinary files (some operating systems
and filesystems disallow this, and will cause @sc{grep} to print error
messages for every directory). If @var{action} is @samp{skip},
directories are silently skipped. If @var{action} is @samp{recurse},
@sc{grep} reads all files under each directory, recursively; this is
equivalent to the @samp{-r} option.

@item -h
@itemx --no-filename
@opindex -h
@opindex --no-filename
@cindex no filename prefix
Suppress the prefixing of filenames on output when multiple files are searched.

@item -L
@itemx --files-without-match
@opindex -L
@opindex --files-without-match
@cindex files which don't match
Suppress normal output; instead print the name of each input
file from which no output would normally have been printed.
The scanning of every file will stop on the first match.

@item -a
@itemx --text
@opindex -a
@opindex --text
@cindex suppress binary data
@cindex binary files
Do not suppress output lines that contain binary data.
Normally, if the first few bytes of a file indicate
that the file contains binary data, grep outputs only a
message saying that the file matches the pattern. This
option causes grep to act as if the file is a text
file, even if it would otherwise be treated as binary.
@emph{Warning:} the result might be binary garbage
printed to the terminal, which can have nasty
side-effects if the terminal driver interprets some of
it as commands.

@item -w
@itemx --word-regexp
@opindex -w
@opindex --word-regexp
@cindex matching whole words
Select only those lines containing matches that form
whole words. The test is that the matching substring
must either be at the beginning of the line, or preceded
by a non-word constituent character. Similarly,
it must be either at the end of the line or followed by
a non-word constituent character. Word-constituent
characters are letters, digits, and the underscore.

@item -r
@itemx --recursive
@opindex -r
@opindex --recursive
@cindex recursive search
@cindex searching directory trees
For each directory mentioned in the command line, read and process all
files in that directory, recursively. This is the same as the @samp{-d
recurse} option.

@item -y
@opindex -y
@cindex case insensitive search, obsolete option
Obsolete synonym for @samp{-i}.

@item -U
@itemx --binary
@opindex -U
@opindex --binary
@cindex DOS/Windows binary files
@cindex binary files, DOS/Windows
Treat the file(s) as binary. By default, under MS-DOS
and MS-Windows, @sc{grep} guesses the file type by looking
at the contents of the first 32KB read from the file.
If @sc{grep} decides the file is a text file, it strips the
CR characters from the original file contents (to make
regular expressions with @code{^} and @code{$} work correctly).
Specifying @samp{-U} overrules this guesswork, causing all
files to be read and passed to the matching mechanism
verbatim; if the file is a text file with CR/LF pairs
at the end of each line, this will cause some regular
expressions to fail. This option is only supported on
MS-DOS and MS-Windows.

@item -u
@itemx --unix-byte-offsets
@opindex -u
@opindex --unix-byte-offsets
@cindex DOS byte offsets
@cindex byte offsets, on DOS/Windows
Report Unix-style byte offsets. This switch causes
@sc{grep} to report byte offsets as if the file were a Unix-style
text file, i.e., the byte offsets ignore the CR characters that were
stripped off. This will produce results identical to running @sc{grep} on
a Unix machine. This option has no effect unless the @samp{-b}
option is also used; it is only supported on MS-DOS and
MS-Windows.

@end table

Several additional options control which variant of the @sc{grep}
matching engine is used. @xref{Grep Programs}.

@sc{grep} uses the environment variable @var{LANG} to
provide internationalization support, if compiled with this feature.

@node Diagnostics, Grep Programs, Invoking, Top
@comment node-name, next, previous, up
@chapter Diagnostics
Normally, exit status is 0 if matches were found, and 1 if no matches
were found (the @samp{-v} option inverts the sense of the exit status).
Exit status is 2 if there were syntax errors in the pattern,
inaccessible input files, or other system errors.
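
The exit statuses described above can be observed from a shell, as in
the following minimal sketch. It assumes a POSIX-like shell; the file
names @file{sample.txt} and @file{missing.txt} are hypothetical, and the
wording of the error message in the last case varies between systems.

@example
# sample.txt is a hypothetical two-line input file.
printf 'apple\nbanana\n' > sample.txt

grep apple sample.txt;  echo $?    # prints the matching line, then 0
grep cherry sample.txt; echo $?    # no match: prints 1
grep apple missing.txt; echo $?    # unreadable file: error message, then 2
@end example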

@node Grep Programs, Regular Expressions, Diagnostics, Top
@comment node-name, next, previous, up
@chapter @sc{grep} programs

@sc{grep} searches the named input files (or standard input if no
files are named, or the file name @file{-} is given) for lines containing
a match to the given pattern. By default, @sc{grep} prints the matching lines.
There are three major variants of @sc{grep}, controlled by the following options.

@table @samp

@item -G
@itemx --basic-regexp
@opindex -G
@opindex --basic-regexp
@cindex matching basic regular expressions
Interpret pattern as a basic regular expression. This is the default.

@item -E
@itemx --extended-regexp
@opindex -E
@opindex --extended-regexp
@cindex matching extended regular expressions
Interpret pattern as an extended regular expression.


@item -F
@itemx --fixed-strings
@opindex -F
@opindex --fixed-strings
@cindex matching fixed strings
Interpret pattern as a list of fixed strings, separated
by newlines, any of which is to be matched.

@end table

In addition, two variant programs @sc{egrep} and @sc{fgrep} are available.
@sc{egrep} is similar (but not identical) to @samp{grep -E}, and
is compatible with the historical Unix @sc{egrep}. @sc{fgrep} is the
same as @samp{grep -F}.

@node Regular Expressions, Reporting Bugs, Grep Programs, Top
@comment node-name, next, previous, up
@chapter Regular Expressions
@cindex regular expressions

A @dfn{regular expression} is a pattern that describes a set of strings.
Regular expressions are constructed analogously to arithmetic expressions,
by using various operators to combine smaller expressions.
@sc{grep} understands two different versions of regular expression
syntax: ``basic'' and ``extended''. In GNU @sc{grep}, there is no
difference in available functionality using either syntax.
In other implementations, basic regular expressions are less powerful.
The following description applies to extended regular expressions;
differences for basic regular expressions are summarized afterwards.

The fundamental building blocks are the regular expressions that match
a single character. Most characters, including all letters and digits,
are regular expressions that match themselves. Any metacharacter
with special meaning may be quoted by preceding it with a backslash.
A list of characters enclosed by @samp{[} and @samp{]} matches any
single character in that list; if the first character of the list is the
caret @samp{^}, then it
matches any character @strong{not} in the list. For example, the regular
expression @samp{[0123456789]} matches any single digit.
A range of @sc{ascii} characters may be specified by giving the first
and last characters, separated by a hyphen. Finally, certain named
classes of characters are predefined. Their names are self-explanatory,
and they are:

@cindex classes of characters
@cindex character classes
@table @samp

@item [:alnum:]
@opindex alnum
@cindex alphanumeric characters
Any of @samp{[:digit:]} or @samp{[:alpha:]}.

@item [:alpha:]
@opindex alpha
@cindex alphabetic characters
Any locale-specific letter or one of the @sc{ascii} letters:@*
@code{a b c d e f g h i j k l m n o p q r s t u v w x y z},@*
@code{A B C D E F G H I J K L M N O P Q R S T U V W X Y Z}.

@item [:cntrl:]
@opindex cntrl
@cindex control characters
Any control character, such as @code{BEL}, @code{BS}, @code{CR},
@code{FF}, @code{HT}, @code{NL}, or @code{VT}.

@item [:digit:]
@opindex digit
@cindex digit characters
@cindex numeric characters
Any one of @code{0 1 2 3 4 5 6 7 8 9}.

@item [:graph:]
@opindex graph
@cindex graphic characters
Any character that is either @samp{[:alnum:]} or @samp{[:punct:]}.

@item [:lower:]
@opindex lower
@cindex lower-case alphabetic characters
Any one of @code{a b c d e f g h i j k l m n o p q r s t u v w x y z}.

@item [:print:]
@opindex print
@cindex printable characters
Any character in the @samp{[:graph:]} class, and the space character.

@item [:punct:]
@opindex punct
@cindex punctuation characters
Any one of @code{!@: " # % & ' ( ) ; < = > ?@: [ \ ] * + , - .@: / : ^ _ @{ | @}}.


@item [:space:]
@opindex space
@cindex space characters
@cindex whitespace characters
Any one of @code{CR FF HT NL VT SPACE}.

@item [:upper:]
@opindex upper
@cindex upper-case alphabetic characters
Any one of @code{A B C D E F G H I J K L M N O P Q R S T U V W X Y Z}.

@item [:xdigit:]
@opindex xdigit
@cindex xdigit class
@cindex hexadecimal digits
Any one of @code{a b c d e f A B C D E F 0 1 2 3 4 5 6 7 8 9}.

@end table
For example, @samp{[[:alnum:]]} means @samp{[0-9A-Za-z]}, except the latter
form is dependent upon the @sc{ascii} character encoding, whereas the
former is portable. (Note that the brackets in these class names are
part of the symbolic names, and must be included in addition to
the brackets delimiting the bracket list.) Most metacharacters lose
their special meaning inside lists. To include a literal @samp{]}, place it
first in the list. Similarly, to include a literal @samp{^}, place it anywhere
but first. Finally, to include a literal @samp{-}, place it last.

The period @samp{.} matches any single character. The symbol @samp{\w}
is a synonym for @samp{[[:alnum:]]} and @samp{\W} is a synonym for
@samp{[^[:alnum:]]}.

The caret @samp{^} and the dollar sign @samp{$} are metacharacters that
respectively match the empty string at the beginning and end
of a line.
The symbols @samp{\<} and @samp{\>} respectively match the
empty string at the beginning and end of a word. The symbol
@samp{\b} matches the empty string at the edge of a word, and @samp{\B}
matches the empty string provided it's not at the edge of a word.

A regular expression may be followed by one of several
repetition operators:


@table @samp

@item ?
@opindex ?
@cindex question mark
@cindex match sub-expression at most once
The preceding item is optional and will be matched at most once.

@item *
@opindex *
@cindex asterisk
@cindex match sub-expression zero or more times
The preceding item will be matched zero or more times.

@item +
@opindex +
@cindex plus sign
The preceding item will be matched one or more times.

@item @{@var{n}@}
@opindex @{n@}
@cindex braces, one argument
@cindex match sub-expression n times
The preceding item is matched exactly @var{n} times.

@item @{@var{n},@}
@opindex @{n,@}
@cindex braces, second argument omitted
@cindex match sub-expression n or more times
The preceding item is matched @var{n} or more times.

@item @{,@var{m}@}
@opindex @{,m@}
@cindex braces, first argument omitted
@cindex match sub-expression at most m times
The preceding item is optional and is matched at most @var{m} times.

@item @{@var{n},@var{m}@}
@opindex @{n,m@}
@cindex braces, two arguments
The preceding item is matched at least @var{n} times, but not more than
@var{m} times.

@end table

Two regular expressions may be concatenated; the resulting regular
expression matches any string formed by concatenating two substrings
that respectively match the concatenated subexpressions.

Two regular expressions may be joined by the infix operator @samp{|}; the
resulting regular expression matches any string matching either
subexpression.
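
As an illustration, the operators described above might be combined as
in the following sketch. It uses the @samp{-E} option so that the
operators need no backslashes, and assumes a POSIX-like shell; the
input words are made up for the example.

@example
# Alternation: prints the lines cat and dog.
printf 'cat\ndog\nbird\n' | grep -E 'cat|dog'

# Concatenation with repetition: prints the lines abc and abbc.
printf 'ac\nabc\nabbc\n' | grep -E 'ab+c'

# An optional item, matched against whole lines: prints ac and abc.
printf 'ac\nabc\nabbc\n' | grep -E -x 'ab?c'
@end example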

Repetition takes precedence over concatenation, which in turn
takes precedence over alternation. A whole subexpression may be
enclosed in parentheses to override these precedence rules.

The backreference @samp{\@var{n}}, where @var{n} is a single digit, matches the
substring previously matched by the @var{n}th parenthesized subexpression
of the regular expression.

@cindex basic regular expressions
In basic regular expressions the metacharacters @samp{?}, @samp{+},
@samp{@{}, @samp{|}, @samp{(}, and @samp{)} lose their special meaning;
instead use the backslashed versions @samp{\?}, @samp{\+}, @samp{\@{},
@samp{\|}, @samp{\(}, and @samp{\)}.

In @sc{egrep} the metacharacter @samp{@{} loses its special meaning;
instead use @samp{\@{}. This is not true for @samp{grep -E}.


@node Reporting Bugs, Concept Index, Regular Expressions, Top
@comment node-name, next, previous, up
@chapter Reporting bugs

@cindex Bugs, reporting
Email bug reports to @email{bug-gnu-utils@@gnu.org}.
Be sure to include the word ``grep'' somewhere in the ``Subject:'' field.

Large repetition counts in the @samp{@{m,n@}} construct may cause
@sc{grep} to use lots of memory. In addition, certain other
obscure regular expressions require exponential time and
space, and may cause grep to run out of memory.
Backreferences are very slow, and may require exponential time.

@page
@node Concept Index, Index, Reporting Bugs, Top
@comment node-name, next, previous, up
@unnumbered Concept Index

This is a general index of all issues discussed in this manual, with the
exception of the @sc{grep} commands and command-line options.

@printindex cp

@page
@node Index, , Concept Index, Top
@unnumbered Index

This is an alphabetical list of all @sc{grep} commands and command-line
options.

@printindex fn

@contents
@bye