grep.texi revision 53564
1\input texinfo  @c -*-texinfo-*-
2@c %**start of header
3@setfilename grep.info
4@settitle grep, print lines matching a pattern
5@c %**end of header
6
7@c This file has the new style title page commands.
8@c Run `makeinfo' rather than `texinfo-format-buffer'.
9
10@c smallbook
11
12@c tex
13@c \overfullrule=0pt
14@c end tex
15
16@include version.texi
17
18@c Combine indices.
19@syncodeindex ky cp
20@syncodeindex pg cp
21@syncodeindex tp cp
22
23@defcodeindex op
24@syncodeindex op fn
25
26@ifinfo
27@direntry
28* grep: (grep).                   print lines matching a pattern.
29@end direntry
30This file documents @sc{grep}, a pattern matching engine.
31
32
33Published by the Free Software Foundation,
3459 Temple Place - Suite 330
35Boston, MA 02111-1307, USA
36
37Copyright (C) 1998 Free Software Foundation, Inc.
38
39Permission is granted to make and distribute verbatim copies of
40this manual provided the copyright notice and this permission notice
41are preserved on all copies.
42
43@ignore
44Permission is granted to process this file through TeX and print the
45results, provided the printed document carries copying permission
46notice identical to this one except for the removal of this paragraph
47(this paragraph not being relevant to the printed manual).
48
49@end ignore
50Permission is granted to copy and distribute modified versions of this
51manual under the conditions for verbatim copying, provided that the entire
52resulting derived work is distributed under the terms of a permission
53notice identical to this one.
54
55Permission is granted to copy and distribute translations of this manual
56into another language, under the above conditions for modified versions,
57except that this permission notice may be stated in a translation approved
58by the Foundation.
59@end ifinfo
60
61@setchapternewpage off
62
63@titlepage
64@title grep, searching for a pattern
65@subtitle version @value{VERSION}, @value{UPDATED}
66@author Alain Magloire et al.
67
68@page
69@vskip 0pt plus 1filll
70Copyright @copyright{} 1998 Free Software Foundation, Inc.
71
72@sp 2
73Published by the Free Software Foundation, @*
7459 Temple Place - Suite 330, @*
75Boston, MA 02111-1307, USA
76
77Permission is granted to make and distribute verbatim copies of
78this manual provided the copyright notice and this permission notice
79are preserved on all copies.
80
81Permission is granted to copy and distribute modified versions of this
82manual under the conditions for verbatim copying, provided that the entire
83resulting derived work is distributed under the terms of a permission
84notice identical to this one.
85
86Permission is granted to copy and distribute translations of this manual
87into another language, under the above conditions for modified versions,
88except that this permission notice may be stated in a translation approved
89by the Foundation.
90
91@end titlepage
92@page
93
94
95@node Top, Introduction, (dir), (dir)
96@comment  node-name,  next,  previous,  up
97
98@ifinfo
99This document was produced for version @value{VERSION} of @sc{GNU} @sc{grep}.
100@end ifinfo
101
102@menu
103* Introduction::                Introduction.
104* Invoking::                    Invoking @sc{grep}; description of options.
105* Diagnostics::                 Exit status returned by @sc{grep}.
106* Grep Programs::               @sc{grep} programs.
107* Regular Expressions::         Regular Expressions.
108* Reporting Bugs::              Reporting Bugs.
109* Concept Index::               A menu with all the topics in this manual.
110* Index::                       A menu with all @sc{grep} commands
111                                 and command-line options.
112@end menu
113
114
115@node Introduction,  Invoking, Top, Top
116@comment  node-name,  next,  previous,  up
117@chapter Introduction
118
119@cindex Searching for a pattern.
120@sc{grep} searches the input files for lines containing a match to a given
121pattern list.  When it finds a match in a line, it copies the line to standard
122output (by default), or does whatever other sort of output you have requested
123with options.  @sc{grep} expects to do the matching on text.
124Since newline is also a separator for the list of patterns, there
125is no way to match newline characters in a text.
126
127@node Invoking, Diagnostics, Introduction, Top
128@comment  node-name,  next,  previous,  up
129@chapter Invoking @sc{grep}
130
131@sc{grep} comes with a rich set of options from POSIX.2 and GNU extensions.
132
133@table @samp
134
135@item -c
136@itemx --count
137@opindex -c
@opindex --count
139@cindex counting lines
140Suppress normal output; instead print a count of matching
141lines for each input file.  With the @samp{-v}, @samp{--revert-match} option,
142count non-matching lines.
143
144@item -e @var{pattern}
145@itemx --regexp=@var{pattern}
146@opindex -e
147@opindex --regexp=@var{pattern}
148@cindex pattern list
149Use @var{pattern} as the pattern; useful to protect  patterns
150beginning with a @samp{-}.
151
152@item  -f @var{file}
153@itemx --file=@var{file}
154@opindex  -f
155@opindex  --file 
156@cindex pattern from file
157Obtain patterns from @var{file}, one  per  line.   The  empty
158file contains zero patterns, and therefore matches nothing.
159
160@item -i
161@itemx --ignore-case
162@opindex -i
163@opindex --ignore-case
164@cindex case insensitive search
165Ignore case distinctions in both the  pattern  and  the input files.
166
167@item -l
168@itemx --files-with-matches
169@opindex -l
170@opindex --files-with-matches
171@cindex names of matching files
172Suppress normal output; instead print the name of  each input
173file  from which output would normally have been printed.
174The scanning of every file will stop on the first match.
175
176@item -n
177@itemx --line-number
178@opindex -n
179@opindex --line-number
180@cindex line numbering
181Prefix each line of output with the line number  within its input file.
182
183@item -q
184@itemx --quiet
185@itemx --silent
186@opindex -q
187@opindex --quiet
188@opindex --silent
189@cindex quiet, silent
190Quiet; suppress normal output.  The scanning of every file will  stop on
191the first match.  Also see the @samp{-s} or @samp{--no-messages} option.
192
193@item -s
194@itemx --no-messages
195@opindex -s
196@opindex --no-messages
197@cindex suppress error messages
198Suppress error messages about nonexistent or unreadable files.
199Portability  note:  unlike  GNU @sc{grep}, BSD @sc{grep} does not comply
200with POSIX.2, because BSD @sc{grep} lacks a @samp{-q}  option and its
201@samp{-s} option behaves like GNU @sc{grep}'s @samp{-q} option.  Shell
202scripts intended to be portable to BSD @sc{grep} should avoid both
203@samp{-q} and @samp{-s} and should redirect
204output to @file{/dev/null} instead.
205
206@item -v
207@itemx --revert-match
208@opindex -v
209@opindex --revert-match
210@cindex revert matching
211@cindex print non-matching lines
212Invert the sense of matching,  to  select  non-matching lines.
213
214@item -x
215@itemx --line-regexp
216@opindex -x
217@opindex --line-regexp
218@cindex match the whole line
219Select only those matches that exactly match the  whole line.
220
221@end table
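
As a hedged illustration of a few of the options above, the following
sketch builds a small sample file and searches it.  The file name
@file{fruit.txt} and its contents are assumptions made for this example
only; the comments note the result each command should give.

@example
```shell
# Build a small sample file in a scratch directory (names are illustrative).
tmp=$(mktemp -d)
printf 'apple\nBanana\ncherry\nbanana split\n' > "$tmp/fruit.txt"

# -c prints a count of matching lines: only "banana split" matches.
count_exact=$(grep -c 'banana' "$tmp/fruit.txt")

# -i ignores case, so "Banana" is counted as well.
count_nocase=$(grep -c -i 'banana' "$tmp/fruit.txt")

# -v inverts the match, so -c counts the non-matching lines.
count_inverted=$(grep -c -v 'banana' "$tmp/fruit.txt")

# -n prefixes each matching line with its line number.
numbered=$(grep -n 'cherry' "$tmp/fruit.txt")

echo "$count_exact $count_nocase $count_inverted $numbered"
rm -r "$tmp"
```
@end example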
222
223@section GNU Extensions
224
225@table @samp
226
227@item -A @var{num}
228@itemx --after-context=@var{num}
229@opindex -A
230@opindex --after-context
231@cindex after context
232@cindex context lines, after match
233Print @var{num} lines of trailing context after matching lines.
234
235@item -B @var{num}
236@itemx --before-context=@var{num}
237@opindex -B
238@opindex --before-context
239@cindex before context
240@cindex context lines, before match
241Print @var{num} lines of leading context before matching lines.
242
243@item -C
244@itemx --context@var{[=num]}
245@opindex -C
246@opindex --context
247@cindex context
248Print @var{num} lines (default 2) of output context.
249
250
251@item -NUM
252@opindex -NUM
253Same as @samp{--context=@var{num}} lines  of  leading  and  trailing
254context.  However, grep will never print any given line more than once.
255
256
257@item -V
258@itemx --version
259@opindex -V
260@opindex --version
261@cindex Version, printing
262Print the version number of @sc{grep} to the standard output stream.
263This  version  number  should  be  included  in all bug reports.
264
265@item --help
266@opindex --help
267@cindex Usage summary, printing
268Print a usage message briefly summarizing these command-line options
269and the bug-reporting address, then exit.
270
271@item -b
272@itemx --byte-offset
273@opindex -b
274@opindex --byte-offset
275@cindex byte offset
276Print the byte offset within the input file before each line of output.
277When @sc{grep} runs on MS-DOS or MS-Windows, the printed byte offsets
278depend on whether the @samp{-u} (@samp{--unix-byte-offsets}) option is
279used; see below.
280
281@item -d @var{action}
282@itemx --directories=@var{action}
283@opindex -d 
284@opindex --directories
285@cindex directory search
286If an input file is a directory, use @var{action} to  process it.
287By  default, @var{action} is @samp{read}, which means that directories are
288read just  as  if  they  were  ordinary files (some operating systems
289and filesystems disallow this, and will cause @sc{grep} to print error
290messages for every directory).  If @var{action} is @samp{skip},
291directories are silently skipped.  If @var{action} is @samp{recurse},
292@sc{grep} reads all files under each directory, recursively; this is
293equivalent to the @samp{-r} option.
294
295@item -h
296@itemx --no-filename
297@opindex -h
298@opindex --no-filename
299@cindex no filename prefix
300Suppress the prefixing of filenames on output when multiple files are searched.
301
302@item -L
303@itemx --files-without-match
304@opindex -L
305@opindex --files-without-match
306@cindex files which don't match
307Suppress normal output; instead print the name of  each input
308file  from  which  no output would normally have been printed.
309The  scanning of every file will  stop  on  the  first match.
310
311@item -a
312@itemx --text
313@opindex -a
314@opindex --text
315@cindex suppress binary data
316@cindex binary files
317Do not suppress output lines that contain binary  data.
318Normally,  if  the  first  few bytes of a file indicate
319that the file contains binary data, grep outputs only a
320message saying that the file matches the pattern.  This
321option causes grep to act as if  the  file  is  a  text
322file, even if it would otherwise be treated as binary.
323@emph{Warning:}  the  result  might  be  binary garbage
324printed  to  the  terminal,  which  can  have  nasty
325side-effects if the terminal driver interprets some of
326it as commands.
327
328@item -w
329@itemx --word-regexp
330@opindex -w
331@opindex --word-regexp
332@cindex matching whole words
333Select only those lines containing  matches  that  form
334whole  words.   The test is that the matching substring
335must either be at the beginning of the  line,  or  preceded 
336by a non-word constituent character.  Similarly,
337it must be either at the end of the line or followed by
338a  non-word  constituent  character.   Word-constituent
339characters are letters, digits, and the underscore.
340
341@item -r
342@itemx --recursive
343@opindex -r
344@opindex --recursive
345@cindex recursive search
346@cindex searching directory trees
347For each directory mentioned in the command line, read and process all
348files in that directory, recursively.  This is the same as the @samp{-d
349recurse} option.
350
351@item -y
352@opindex -y
353@cindex case insensitive search, obsolete option
354Obsolete synonym for @samp{-i}.
355
356@item -U
357@itemx --binary
358@opindex -U
359@opindex --binary
360@cindex DOS/Windows binary files
361@cindex binary files, DOS/Windows
362Treat the file(s) as binary.  By default, under  MS-DOS
363and  MS-Windows, @sc{grep} guesses the file type by looking
364at the contents of the first 32KB read from  the  file.
365If @sc{grep} decides the file is a text file, it strips the
366CR characters from the original file contents (to  make
367regular  expressions  with  @code{^} and @code{$} work correctly).
368Specifying @samp{-U} overrules this guesswork, causing all
369files  to  be read and passed to the matching mechanism
370verbatim; if the file is a text file with  CR/LF  pairs
371at  the  end of each line, this will cause some regular
372expressions to fail.  This option is only supported  on
373MS-DOS and MS-Windows.
374
375@item -u
376@itemx --unix-byte-offsets
377@opindex -u
378@opindex --unix-byte-offsets
379@cindex DOS byte offsets
380@cindex byte offsets, on DOS/Windows
381Report Unix-style byte  offsets.   This  switch  causes
@sc{grep} to report byte offsets as if the file were a Unix-style
text file, i.e., the byte offsets ignore the CR characters that were
stripped off.  This will produce results identical to running @sc{grep} on
385a Unix machine.  This option has no  effect  unless  @samp{-b}
386option is also used; it is only supported on MS-DOS and
387MS-Windows.
388
389@end table
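
The context and word-matching options can be tried out with a sketch
like the one below.  The file names @file{nums.txt} and
@file{words.txt} are hypothetical, chosen only for this illustration.

@example
```shell
# Scratch files used only for this illustration.
tmp=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$tmp/nums.txt"
printf 'cat\nconcatenate\n' > "$tmp/words.txt"

# -A 1 prints one line of trailing context after each match.
after=$(grep -A 1 'three' "$tmp/nums.txt")

# -B 1 prints one line of leading context before each match.
before=$(grep -B 1 'three' "$tmp/nums.txt")

# -w selects only matches that form whole words, so the "cat"
# inside "concatenate" does not count.
words=$(grep -w 'cat' "$tmp/words.txt")

printf '%s\n---\n%s\n---\n%s\n' "$after" "$before" "$words"
rm -r "$tmp"
```
@end example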
390
391Several additional options control which variant of the @sc{grep}
392matching engine is used.  @xref{Grep Programs}.
393
@sc{grep} uses the environment variable @code{LANG} to
395provide internationalization support, if compiled with this feature.
396
397@node Diagnostics, Grep Programs, Invoking, Top
398@comment  node-name,  next,  previous,  up
399@chapter Diagnostics
400Normally, exit status is 0 if matches were found, and 1 if no matches
401were found (the @samp{-v} option inverts the sense of the exit status).
402Exit status is 2  if  there  were  syntax errors  in  the  pattern,
403inaccessible input files, or other system errors.
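
A script can branch on these exit statuses.  The following sketch
captures all three cases; the file names are assumptions for the
example, and @file{no-such-file} is deliberately left uncreated.

@example
```shell
# Scratch file used only for this illustration.
tmp=$(mktemp -d)
printf 'hello world\n' > "$tmp/greeting.txt"

# A match was found: exit status 0.
found=0
grep -q 'hello' "$tmp/greeting.txt" || found=$?

# No match was found: exit status 1.
missing=0
grep -q 'goodbye' "$tmp/greeting.txt" || missing=$?

# The input file is inaccessible: exit status 2.
failed=0
grep -q 'hello' "$tmp/no-such-file" 2>/dev/null || failed=$?

echo "$found $missing $failed"
rm -r "$tmp"
```
@end example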
404
405@node Grep Programs, Regular Expressions, Diagnostics, Top
406@comment  node-name,  next,  previous,  up
407@chapter @sc{grep} programs
408
409@sc{grep} searches the named input files (or standard input if no
410files are named, or the file name @file{-} is given) for lines containing
411a match to the given pattern.  By default, @sc{grep} prints the matching lines.
412There are three major variants of @sc{grep}, controlled by the following options.
413
414@table @samp
415
416@item -G
417@itemx --basic-regexp
418@opindex -G
419@opindex --basic-regexp
420@cindex matching basic regular expressions
421Interpret pattern as a basic  regular  expression.  This is the default.
422
423@item -E
@itemx --extended-regexp
425@opindex -E
426@opindex --extended-regexp
427@cindex matching extended regular expressions
428Interpret pattern as  an  extended  regular  expression.
429
430
431@item -F
432@itemx --fixed-strings
433@opindex -F
434@opindex --fixed-strings
435@cindex matching fixed strings
436Interpret pattern as a list of fixed strings, separated
437by newlines, any of which is to be matched.
438
439@end table
440
441In addition, two variant programs @sc{egrep} and @sc{fgrep} are available.
442@sc{egrep} is similar (but not identical) to @samp{grep -E}, and
443is compatible with the historical Unix @sc{egrep}.  @sc{fgrep} is  the
444same as @samp{grep -F}.
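
To see how the three variants differ, the sketch below runs the same
pattern under each of them.  The sample file @file{sample.txt} and its
contents are assumptions made for this illustration.

@example
```shell
# Scratch file used only for this illustration.
tmp=$(mktemp -d)
printf 'a+b\naab\nab\n' > "$tmp/sample.txt"

# -G (the default): in a basic regular expression "+" is an
# ordinary character, so only the literal line "a+b" matches.
basic=$(grep -G 'a+b' "$tmp/sample.txt")

# -E: "+" is a repetition operator, so "aab" and "ab" match.
extended=$(grep -E 'a+b' "$tmp/sample.txt")

# -F: the pattern is a fixed string, so only "a+b" matches.
fixed=$(grep -F 'a+b' "$tmp/sample.txt")

printf '%s\n---\n%s\n---\n%s\n' "$basic" "$extended" "$fixed"
rm -r "$tmp"
```
@end example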
445
446@node Regular Expressions, Reporting Bugs, Grep Programs, Top
447@comment  node-name,  next,  previous,  up
448@chapter Regular Expressions
449@cindex regular expressions
450
451A @dfn{regular expression} is a pattern that describes  a  set  of strings.
452Regular expressions are constructed analogously to arithmetic expressions,
453by using various operators  to  combine smaller expressions.
454@sc{grep} understands two different versions of  regular  expression
455syntax:   ``basic''  and  ``extended''.  In GNU @sc{grep}, there is no
456difference  in  available  functionality  using either  syntax.
457In  other  implementations,  basic regular expressions are less powerful.
458The  following  description applies  to  extended  regular  expressions;
459differences for basic regular expressions are summarized afterwards.
460
461The fundamental building blocks are the regular  expressions that  match
462a single character.  Most characters, including all letters and digits,
463are regular expressions  that  match themselves.  Any  metacharacter
464with special meaning may be quoted by preceding it with a backslash.
465A list of characters enclosed by @samp{[} and @samp{]} matches any
466single character in that list; if the first character of the list is the
467caret @samp{^}, then it 
468matches any character @strong{not} in the list.  For example, the regular
469expression @samp{[0123456789]} matches any single digit.
470A range of @sc{ascii} characters  may be  specified  by  giving  the  first
471and  last characters, separated by a hyphen.  Finally, certain  named
classes of characters are predefined.  Their names are self-explanatory,
and they are:
474
475@cindex classes of characters
476@cindex character classes
477@table @samp
478
479@item [:alnum:]
480@opindex alnum 
481@cindex  alphanumeric characters
Any character in the @samp{[:digit:]} or @samp{[:alpha:]} class.
483
484@item [:alpha:]
485@opindex alpha
486@cindex alphabetic characters
Any locale-specific letter, or one of the @sc{ascii} letters:@*
488@code{a b c d e f g h i j k l m n o p q r s t u v w x y z},@*
489@code{A B C D E F G H I J K L M N O P Q R S T U V W X Y Z}.
490
491@item [:cntrl:]
492@opindex cntrl
493@cindex control characters
494Any of @code{BEL}, @code{BS}, @code{CR}, @code{FF}, @code{HT},
495@code{NL}, or @code{VT}.
496
497@item [:digit:]
498@opindex digit
499@cindex digit characters
500@cindex numeric characters
501Any one of @code{0 1 2 3 4 5 6 7 8 9}.
502
503@item [:graph:]
504@opindex graph
505@cindex graphic characters
Any character in the @samp{[:alnum:]} or @samp{[:punct:]} class.
507
508@item [:lower:]
509@opindex lower
510@cindex lower-case alphabetic characters
511Any one of @code{a b c d e f g h i j k l m n o p q r s t u v w x y z}.
512
513@item [:print:]
514@opindex print
515@cindex printable characters
Any character in the @samp{[:graph:]} class, and the space character.
518
519@item [:punct:]
520@opindex punct
521@cindex punctuation characters
Any one of @code{!@: " # $ % & ' ( ) * + , - .@: / : ; < = > ?@: @@ [ \ ] ^ _ ` @{ | @} ~}.
523
524
525@item [:space:]
526@opindex space
527@cindex space characters
528@cindex whitespace characters
529Any one of @code{CR FF HT NL VT SPACE}.
530
531@item [:upper:]
532@opindex upper
533@cindex upper-case alphabetic characters
534Any one of @code{A B C D E F G H I J K L M N O P Q R S T U V W X Y Z}.
535
536@item [:xdigit:]
537@opindex xdigit
538@cindex xdigit class
539@cindex hexadecimal digits
540Any one of @code{a b c d e f A B C D E F 0 1 2 3 4 5 6 7 8 9}.
541
542@end table
543For example, @samp{[[:alnum:]]} means @samp{[0-9A-Za-z]}, except the latter
544form is dependent upon the @sc{ascii} character  encoding,  whereas  the
545former  is portable.  (Note that the brackets in these class names are
546part of the symbolic names, and must  be  included in  addition  to
547the brackets delimiting the bracket list).  Most metacharacters lose
548their special meaning inside lists.  To include a literal @samp{]}, place it
549first in the list.  Similarly, to include a literal @samp{^}, place it anywhere
550but  first.  Finally, to include a literal @samp{-}, place it last.
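
The bracket-list rules above can be sketched as follows.  The sample
file @file{mixed.txt} and its contents are assumptions made for this
illustration.

@example
```shell
# Scratch file used only for this illustration.
tmp=$(mktemp -d)
printf 'abc\nabc123\n42\na-b\n' > "$tmp/mixed.txt"

# A named class inside a bracket list: lines containing a digit.
has_digit=$(grep '[[:digit:]]' "$tmp/mixed.txt")

# A leading caret negates the list; with -x the whole line must
# consist of non-digit characters.
no_digit=$(grep -x '[^[:digit:]]*' "$tmp/mixed.txt")

# A hyphen placed last in the list is taken literally.
hyphen=$(grep '[+-]' "$tmp/mixed.txt")

printf '%s\n---\n%s\n---\n%s\n' "$has_digit" "$no_digit" "$hyphen"
rm -r "$tmp"
```
@end example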
551
552The period @samp{.} matches any single character.  The symbol @samp{\w}
553is a synonym for @samp{[[:alnum:]]} and @samp{\W} is a synonym for
@samp{[^[:alnum:]]}.
555
556The caret @samp{^} and the dollar sign @samp{$} are metacharacters that
557respectively match the empty string at the beginning and end
558of a line.  The symbols @samp{\<} and  @samp{\>} respectively match the
559empty string at the beginning and end of a word.  The symbol
560@samp{\b} matches the empty string at the edge of a  word,  and  @samp{\B}
561matches  the empty string provided it's not at the edge of a word.
562
563A regular expression may  be  followed  by  one  of  several
564repetition operators:
565
566
567@table @samp
568
569@item ?
570@opindex ?
571@cindex question mark
572@cindex match sub-expression at most once
573The preceding item is optional and will be matched at most once.
574
575@item *
576@opindex *
577@cindex asterisk
578@cindex match sub-expression zero or more times
579The preceding item will be matched zero or more times.
580
581@item +
582@opindex +
583@cindex plus sign 
584The preceding item will be matched one or more times.
585
586@item @{@var{n}@}
587@opindex @{n@}
588@cindex braces, one argument
589@cindex match sub-expression n times
590The preceding item is matched exactly @var{n} times.
591
592@item @{@var{n},@}
593@opindex @{n,@}
594@cindex braces, second argument omitted
595@cindex match sub-expression n or more times
The preceding item is matched @var{n} or more times.
597
598@item @{,@var{m}@}
599@opindex @{,m@}
600@cindex braces, first argument omitted
601@cindex match sub-expression at most m times
602The preceding item is optional and is matched at most @var{m} times.
603
604@item @{@var{n},@var{m}@}
605@opindex @{n,m@}
606@cindex braces, two arguments
607The preceding item is matched at least @var{n} times, but not more than
608@var{m} times.
609
610@end table
611
612Two regular expressions may be concatenated;  the  resulting regular
613expression matches any string formed by concatenating two substrings
614that respectively match the  concatenated subexpressions.
615
616Two regular expressions may be joined by the infix  operator @samp{|}; the
617resulting  regular  expression  matches  any string matching either
618subexpression.
619
620Repetition takes precedence  over  concatenation,  which  in turn
621takes precedence over alternation.  A whole subexpression may be
622enclosed in parentheses to override  these  precedence rules.
623
624The backreference @samp{\@var{n}}, where @var{n} is a single digit, matches the
625substring previously matched by the @var{n}th parenthesized subexpression
626of the regular expression.
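
A few of the operators described above can be tried with
@samp{grep -E}, as in this sketch.  The sample file @file{words.txt}
and its contents are assumptions made for this illustration.

@example
```shell
# Scratch file used only for this illustration.
tmp=$(mktemp -d)
printf 'color\ncolour\ncolouur\nabab\ncat\ndog\n' > "$tmp/words.txt"

# "?" makes the preceding item optional: color, colour.
optional=$(grep -E 'colou?r' "$tmp/words.txt")

# "+" requires one or more of the preceding item: colour, colouur.
plus=$(grep -E 'colou+r' "$tmp/words.txt")

# "|" joins two alternatives: cat, dog.
either=$(grep -E 'cat|dog' "$tmp/words.txt")

# "\1" matches whatever the first parenthesized group matched: abab.
backref=$(grep -E '(ab)\1' "$tmp/words.txt")

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' \
  "$optional" "$plus" "$either" "$backref"
rm -r "$tmp"
```
@end example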
627
628@cindex basic regular expressions
629In basic regular expressions the metacharacters @samp{?}, @samp{+},
630@samp{@{}, @samp{|}, @samp{(}, and @samp{)} lose their special meaning;
631instead use the backslashed versions @samp{\?}, @samp{\+}, @samp{\@{},
632@samp{\|}, @samp{\(}, and @samp{\)}.
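
The difference can be seen by running the same alternation in both
syntaxes, as in the sketch below.  The sample file @file{pets.txt} and
its contents are assumptions made for this illustration, and the
backslashed form @samp{\|} is used as this manual describes it for GNU
@sc{grep}.

@example
```shell
# Scratch file used only for this illustration.
tmp=$(mktemp -d)
printf 'cat\ndog\ncat|dog\nbird\n' > "$tmp/pets.txt"

# Basic syntax: an unescaped "|" is an ordinary character,
# so only the line containing the literal text matches.
literal_bar=$(grep 'cat|dog' "$tmp/pets.txt")

# Basic syntax with the backslashed operator: alternation.
basic_alt=$(grep 'cat\|dog' "$tmp/pets.txt")

# Extended syntax: the unescaped "|" is the alternation operator.
extended_alt=$(grep -E 'cat|dog' "$tmp/pets.txt")

printf '%s\n---\n%s\n---\n%s\n' "$literal_bar" "$basic_alt" "$extended_alt"
rm -r "$tmp"
```
@end example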
633
634In @sc{egrep} the metacharacter  @samp{@{}  loses  its  special  meaning;
instead use @samp{\@{}.  This is not true for @samp{grep -E}.
636
637
638@node Reporting Bugs, Concept Index, Regular Expressions, Top
639@comment  node-name,  next,  previous,  up
640@chapter Reporting bugs
641
642@cindex Bugs, reporting
643Email bug reports to @email{bug-gnu-utils@@gnu.org}.
644Be sure to include the word ``grep'' somewhere in the ``Subject:'' field.
645
Large repetition counts in the @samp{@{@var{n},@var{m}@}} construct may
cause @sc{grep} to use lots of memory.  In addition, certain other
obscure regular expressions require exponential time and
space, and may cause @sc{grep} to run out of memory.
Backreferences are very slow, and may require exponential time.
651
652@page
653@node Concept Index , Index, Reporting Bugs, Top
654@comment node-name,  next,  previous,  up
655@unnumbered Concept Index
656
657This is a general index of all issues discussed in this manual, with the
658exception of the @sc{grep} commands and command-line options.
659
660@printindex cp
661
662@page
663@node Index, , Concept Index, Top
664@unnumbered Index
665
666This is an alphabetical list of all @sc{grep} commands and command-line
667options.
668
669@printindex fn
670
671@contents
672@bye
673