1<html>
2<head>
3<title>pcregrep specification</title>
4</head>
5<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
6<h1>pcregrep man page</h1>
7<p>
8Return to the <a href="index.html">PCRE index page</a>.
9</p>
10<p>
11This page is part of the PCRE HTML documentation. It was generated automatically
12from the original man page. If there is any nonsense in it, please consult the
13man page, in case the conversion went wrong.
14<br>
15<ul>
16<li><a name="TOC1" href="#SEC1">SYNOPSIS</a>
17<li><a name="TOC2" href="#SEC2">DESCRIPTION</a>
18<li><a name="TOC3" href="#SEC3">SUPPORT FOR COMPRESSED FILES</a>
19<li><a name="TOC4" href="#SEC4">BINARY FILES</a>
20<li><a name="TOC5" href="#SEC5">OPTIONS</a>
21<li><a name="TOC6" href="#SEC6">ENVIRONMENT VARIABLES</a>
22<li><a name="TOC7" href="#SEC7">NEWLINES</a>
23<li><a name="TOC8" href="#SEC8">OPTIONS COMPATIBILITY</a>
24<li><a name="TOC9" href="#SEC9">OPTIONS WITH DATA</a>
25<li><a name="TOC10" href="#SEC10">MATCHING ERRORS</a>
26<li><a name="TOC11" href="#SEC11">DIAGNOSTICS</a>
27<li><a name="TOC12" href="#SEC12">SEE ALSO</a>
28<li><a name="TOC13" href="#SEC13">AUTHOR</a>
29<li><a name="TOC14" href="#SEC14">REVISION</a>
30</ul>
31<br><a name="SEC1" href="#TOC1">SYNOPSIS</a><br>
32<P>
33<b>pcregrep [options] [long options] [pattern] [path1 path2 ...]</b>
34</P>
35<br><a name="SEC2" href="#TOC1">DESCRIPTION</a><br>
36<P>
37<b>pcregrep</b> searches files for character patterns, in the same way as other
38grep commands do, but it uses the PCRE regular expression library to support
39patterns that are compatible with the regular expressions of Perl 5. See
40<a href="pcrepattern.html"><b>pcrepattern</b>(3)</a>
41for a full description of syntax and semantics of the regular expressions
42that PCRE supports.
43</P>
44<P>
45Patterns, whether supplied on the command line or in a separate file, are given
46without delimiters. For example:
47<pre>
48  pcregrep Thursday /etc/motd
49</pre>
50If you attempt to use delimiters (for example, by surrounding a pattern with
51slashes, as is common in Perl scripts), they are interpreted as part of the
52pattern. Quotes can of course be used to delimit patterns on the command line
53because they are interpreted by the shell, and indeed they are required if a
54pattern contains white space or shell metacharacters.
55</P>
56<P>
57The first argument that follows any option settings is treated as the single
58pattern to be matched when neither <b>-e</b> nor <b>-f</b> is present.
59Conversely, when one or both of these options are used to specify patterns, all
60arguments are treated as path names. At least one of <b>-e</b>, <b>-f</b>, or an
61argument pattern must be provided.
62</P>
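<P>
The rule above can be illustrated with a short sketch. The scratch directory,
file names, and file contents are invented for the example. The first command
takes its pattern from the first argument; the second supplies its patterns
with <b>-e</b>, so every remaining argument is treated as a path.
</P>

```shell
# Create two small sample files (hypothetical names) in a scratch directory.
dir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$dir/one.txt"
printf 'beta\ngamma\n' > "$dir/two.txt"

# No -e or -f: "beta" is the pattern, the remaining arguments are paths.
by_argument=$(cd "$dir" && pcregrep beta one.txt two.txt)

# With -e: both patterns come from options, so all arguments are paths.
by_option=$(cd "$dir" && pcregrep -e alpha -e gamma one.txt two.txt)

printf '%s\n' "$by_argument" "$by_option"
rm -r "$dir"
```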
63<P>
64If no files are specified, <b>pcregrep</b> reads the standard input. The
65standard input can also be referenced by a name consisting of a single hyphen.
66For example:
67<pre>
68  pcregrep some-pattern /file1 - /file3
69</pre>
70By default, each line that matches a pattern is copied to the standard
71output, and if there is more than one file, the file name is output at the
72start of each line, followed by a colon. However, there are options that can
73change how <b>pcregrep</b> behaves. In particular, the <b>-M</b> option makes it
74possible to search for patterns that span line boundaries. What defines a line
75boundary is controlled by the <b>-N</b> (<b>--newline</b>) option.
76</P>
77<P>
78The amount of memory used for buffering files that are being scanned is
79controlled by a parameter that can be set by the <b>--buffer-size</b> option.
The default value for this parameter is specified when <b>pcregrep</b> is built;
unless the build overrides it, the default is 20K. A block of memory three times this size is
82used (to allow for buffering "before" and "after" lines). An error occurs if a
83line overflows the buffer.
84</P>
85<P>
86Patterns are limited to 8K or BUFSIZ bytes, whichever is the greater. BUFSIZ is
87defined in <b>&#60;stdio.h&#62;</b>. When there is more than one pattern (specified by
88the use of <b>-e</b> and/or <b>-f</b>), each pattern is applied to each line in
89the order in which they are defined, except that all the <b>-e</b> patterns are
90tried before the <b>-f</b> patterns.
91</P>
92<P>
93By default, as soon as one pattern matches (or fails to match when <b>-v</b> is
94used), no further patterns are considered. However, if <b>--colour</b> (or
95<b>--color</b>) is used to colour the matching substrings, or if
96<b>--only-matching</b>, <b>--file-offsets</b>, or <b>--line-offsets</b> is used to
97output only the part of the line that matched (either shown literally, or as an
98offset), scanning resumes immediately following the match, so that further
99matches on the same line can be found. If there are multiple patterns, they are
100all tried on the remainder of the line, but patterns that follow the one that
101matched are not tried on the earlier part of the line.
102</P>
103<P>
104This is the same behaviour as GNU grep, but it does mean that the order in
105which multiple patterns are specified can affect the output when one of the
106above options is used.
107</P>
108<P>
109Patterns that can match an empty string are accepted, but empty string
110matches are never recognized. An example is the pattern "(super)?(man)?", in
111which all components are optional. This pattern finds all occurrences of both
112"super" and "man"; the output differs from matching with "super|man" when only
113the matching substrings are being shown.
114</P>
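<P>
A minimal sketch of that difference, using <b>-o</b> so that only the matching
substrings are shown. The input line is invented for the example.
</P>

```shell
# All components optional: the longest non-empty match is reported as one item.
optional=$(printf 'superman\n' | pcregrep -o '(super)?(man)?')

# Plain alternation: each alternative is reported as a separate match.
alternation=$(printf 'superman\n' | pcregrep -o 'super|man')

printf 'optional groups:\n%s\n' "$optional"
printf 'alternation:\n%s\n' "$alternation"
```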
115<P>
116If the <b>LC_ALL</b> or <b>LC_CTYPE</b> environment variable is set,
117<b>pcregrep</b> uses the value to set a locale when calling the PCRE library.
118The <b>--locale</b> option can be used to override this.
119</P>
120<br><a name="SEC3" href="#TOC1">SUPPORT FOR COMPRESSED FILES</a><br>
121<P>
122It is possible to compile <b>pcregrep</b> so that it uses <b>libz</b> or
123<b>libbz2</b> to read files whose names end in <b>.gz</b> or <b>.bz2</b>,
124respectively. You can find out whether your binary has support for one or both
125of these file types by running it with the <b>--help</b> option. If the
126appropriate support is not present, files are treated as plain text. The
127standard input is always so treated.
128</P>
129<br><a name="SEC4" href="#TOC1">BINARY FILES</a><br>
130<P>
131By default, a file that contains a binary zero byte within the first 1024 bytes
132is identified as a binary file, and is processed specially. (GNU grep also
133identifies binary files in this manner.) See the <b>--binary-files</b> option
134for a means of changing the way binary files are handled.
135</P>
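<P>
The zero-byte test can be approximated in a few lines of shell. This is only a
sketch of the heuristic described above, not the code that <b>pcregrep</b>
itself runs: <i>looks_binary</i> is a hypothetical helper name, and the
<b>head -c</b> option is assumed to be available on the system.
</P>

```shell
# Succeed if the first 1024 bytes of the file contain a zero byte.
# looks_binary is a hypothetical helper, not part of pcregrep.
looks_binary() {
  total=$(( $(head -c 1024 "$1" | wc -c) ))
  without_nul=$(( $(head -c 1024 "$1" | tr -d '\000' | wc -c) ))
  [ "$total" -ne "$without_nul" ]
}

sample=$(mktemp)
printf 'abc\000def\n' > "$sample"
if looks_binary "$sample"; then verdict=binary; else verdict=text; fi
echo "$verdict"
rm "$sample"
```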
136<br><a name="SEC5" href="#TOC1">OPTIONS</a><br>
137<P>
138The order in which some of the options appear can affect the output. For
139example, both the <b>-h</b> and <b>-l</b> options affect the printing of file
140names. Whichever comes later in the command line will be the one that takes
141effect. Numerical values for options may be followed by K or M, to signify
142multiplication by 1024 or 1024*1024 respectively.
143</P>
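<P>
The suffix arithmetic is simple enough to sketch in shell. The function name
<i>to_bytes</i> is made up for this illustration; it only mimics the
multiplication described above and is not part of <b>pcregrep</b>.
</P>

```shell
# Convert a number with an optional K or M suffix to a plain byte count.
# to_bytes is a hypothetical helper used only for illustration.
to_bytes() {
  case $1 in
    *K) echo $(( ${1%K} * 1024 )) ;;
    *M) echo $(( ${1%M} * 1024 * 1024 )) ;;
    *)  echo "$1" ;;
  esac
}

to_bytes 20K   # prints 20480
to_bytes 2M    # prints 2097152
```

<P>
So an option such as --buffer-size=20K asks for the same value as
--buffer-size=20480.
</P>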
144<P>
145<b>--</b>
146This terminates the list of options. It is useful if the next item on the
147command line starts with a hyphen but is not an option. This allows for the
148processing of patterns and filenames that start with hyphens.
149</P>
150<P>
151<b>-A</b> <i>number</i>, <b>--after-context=</b><i>number</i>
152Output <i>number</i> lines of context after each matching line. If filenames
153and/or line numbers are being output, a hyphen separator is used instead of a
154colon for the context lines. A line containing "--" is output between each
155group of lines, unless they are in fact contiguous in the input file. The value
156of <i>number</i> is expected to be relatively small. However, <b>pcregrep</b>
157guarantees to have up to 8K of following text available for context output.
158</P>
159<P>
160<b>-a</b>, <b>--text</b>
161Treat binary files as text. This is equivalent to
162<b>--binary-files</b>=<i>text</i>.
163</P>
164<P>
165<b>-B</b> <i>number</i>, <b>--before-context=</b><i>number</i>
166Output <i>number</i> lines of context before each matching line. If filenames
167and/or line numbers are being output, a hyphen separator is used instead of a
168colon for the context lines. A line containing "--" is output between each
169group of lines, unless they are in fact contiguous in the input file. The value
170of <i>number</i> is expected to be relatively small. However, <b>pcregrep</b>
171guarantees to have up to 8K of preceding text available for context output.
172</P>
173<P>
174<b>--binary-files=</b><i>word</i>
175Specify how binary files are to be processed. If the word is "binary" (the
176default), pattern matching is performed on binary files, but the only output is
177"Binary file &#60;name&#62; matches" when a match succeeds. If the word is "text",
178which is equivalent to the <b>-a</b> or <b>--text</b> option, binary files are
179processed in the same way as any other file. In this case, when a match
180succeeds, the output may be binary garbage, which can have nasty effects if
181sent to a terminal. If the word is "without-match", which is equivalent to the
182<b>-I</b> option, binary files are not processed at all; they are assumed not to
183be of interest.
184</P>
185<P>
186<b>--buffer-size=</b><i>number</i>
187Set the parameter that controls how much memory is used for buffering files
188that are being scanned.
189</P>
190<P>
191<b>-C</b> <i>number</i>, <b>--context=</b><i>number</i>
192Output <i>number</i> lines of context both before and after each matching line.
193This is equivalent to setting both <b>-A</b> and <b>-B</b> to the same value.
194</P>
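<P>
As a small sketch of the context options (the sample file is invented for the
example), the following shows the colon that marks a matching line and the
hyphen that marks context lines when line numbers are being output.
</P>

```shell
sample=$(mktemp)
printf 'one\ntwo\nthree\nfour\n' > "$sample"

# -n adds line numbers; -C1 is the same as giving both -A1 and -B1.
context=$(pcregrep -n -C1 three "$sample")
printf '%s\n' "$context"
rm "$sample"
```

<P>
The matching line is shown as "3:three", with "2-two" before it and "4-four"
after it.
</P>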
195<P>
196<b>-c</b>, <b>--count</b>
197Do not output individual lines from the files that are being scanned; instead
198output the number of lines that would otherwise have been shown. If no lines
are selected, the number zero is output. If several files are being
200scanned, a count is output for each of them. However, if the
201<b>--files-with-matches</b> option is also used, only those files whose counts
202are greater than zero are listed. When <b>-c</b> is used, the <b>-A</b>,
203<b>-B</b>, and <b>-C</b> options are ignored.
204</P>
205<P>
206<b>--colour</b>, <b>--color</b>
207If this option is given without any data, it is equivalent to "--colour=auto".
208If data is required, it must be given in the same shell item, separated by an
209equals sign.
210</P>
211<P>
212<b>--colour=</b><i>value</i>, <b>--color=</b><i>value</i>
213This option specifies under what circumstances the parts of a line that matched
214a pattern should be coloured in the output. By default, the output is not
215coloured. The value (which is optional, see above) may be "never", "always", or
"auto". In the last case, colouring happens only if the standard output is
217connected to a terminal. More resources are used when colouring is enabled,
218because <b>pcregrep</b> has to search for all possible matches in a line, not
219just one, in order to colour them all.
220<br>
221<br>
222The colour that is used can be specified by setting the environment variable
223PCREGREP_COLOUR or PCREGREP_COLOR. The value of this variable should be a
224string of two numbers, separated by a semicolon. They are copied directly into
225the control string for setting colour on a terminal, so it is your
226responsibility to ensure that they make sense. If neither of the environment
227variables is set, the default is "1;31", which gives red.
228</P>
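<P>
A sketch of overriding the colour for a single command. The value "1;32" is
just an example (bold green on terminals that honour ANSI colour codes), and
the input line is invented.
</P>

```shell
# Set the variable for this one command only, and force colouring even
# though the output is captured rather than sent to a terminal.
coloured=$(printf 'hello world\n' |
  PCREGREP_COLOUR='1;32' pcregrep --colour=always world)
printf '%s\n' "$coloured"
```

<P>
Because the two numbers are copied into the terminal control string as given,
the captured output contains the escape sequence built from "1;32" immediately
before the matched text.
</P>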
229<P>
230<b>-D</b> <i>action</i>, <b>--devices=</b><i>action</i>
231If an input path is not a regular file or a directory, "action" specifies how
232it is to be processed. Valid values are "read" (the default) or "skip"
233(silently skip the path).
234</P>
235<P>
236<b>-d</b> <i>action</i>, <b>--directories=</b><i>action</i>
237If an input path is a directory, "action" specifies how it is to be processed.
238Valid values are "read" (the default), "recurse" (equivalent to the <b>-r</b>
239option), or "skip" (silently skip the path). In the default case, directories
240are read as if they were ordinary files. In some operating systems the effect
241of reading a directory like this is an immediate end-of-file.
242</P>
243<P>
244<b>-e</b> <i>pattern</i>, <b>--regex=</b><i>pattern</i>, <b>--regexp=</b><i>pattern</i>
245Specify a pattern to be matched. This option can be used multiple times in
246order to specify several patterns. It can also be used as a way of specifying a
247single pattern that starts with a hyphen. When <b>-e</b> is used, no argument
248pattern is taken from the command line; all arguments are treated as file
249names. There is an overall maximum of 100 patterns. They are applied to each
250line in the order in which they are defined until one matches (or fails to
251match if <b>-v</b> is used). If <b>-f</b> is used with <b>-e</b>, the command line
252patterns are matched first, followed by the patterns from the file, independent
253of the order in which these options are specified. Note that multiple use of
254<b>-e</b> is not the same as a single pattern with alternatives. For example,
255X|Y finds the first character in a line that is X or Y, whereas if the two
256patterns are given separately, <b>pcregrep</b> finds X if it is present, even if
257it follows Y in the line. It finds Y only if there is no X in the line. This
258really matters only if you are using <b>-o</b> to show the part(s) of the line
259that matched.
260</P>
261<P>
262<b>--exclude</b>=<i>pattern</i>
263When <b>pcregrep</b> is searching the files in a directory as a consequence of
264the <b>-r</b> (recursive search) option, any regular files whose names match the
265pattern are excluded. Subdirectories are not excluded by this option; they are
266searched recursively, subject to the <b>--exclude-dir</b> and
<b>--include-dir</b> options. The pattern is a PCRE regular expression, and is
268matched against the final component of the file name (not the entire path). If
269a file name matches both <b>--include</b> and <b>--exclude</b>, it is excluded.
270There is no short form for this option.
271</P>
272<P>
273<b>--exclude-dir</b>=<i>pattern</i>
274When <b>pcregrep</b> is searching the contents of a directory as a consequence
275of the <b>-r</b> (recursive search) option, any subdirectories whose names match
the pattern are excluded. (Note that the <b>--exclude</b> option does not affect
277subdirectories.) The pattern is a PCRE regular expression, and is matched
278against the final component of the name (not the entire path). If a
279subdirectory name matches both <b>--include-dir</b> and <b>--exclude-dir</b>, it
280is excluded. There is no short form for this option.
281</P>
282<P>
283<b>-F</b>, <b>--fixed-strings</b>
284Interpret each pattern as a list of fixed strings, separated by newlines,
285instead of as a regular expression. The <b>-w</b> (match as a word) and <b>-x</b>
286(match whole line) options can be used with <b>-F</b>. They apply to each of the
287fixed strings. A line is selected if any of the fixed strings are found in it
288(subject to <b>-w</b> or <b>-x</b>, if present).
289</P>
290<P>
291<b>-f</b> <i>filename</i>, <b>--file=</b><i>filename</i>
292Read a number of patterns from the file, one per line, and match them against
293each line of input. A data line is output if any of the patterns match it. The
294filename can be given as "-" to refer to the standard input. When <b>-f</b> is
295used, patterns specified on the command line using <b>-e</b> may also be
296present; they are tested before the file's patterns. However, no other pattern
297is taken from the command line; all arguments are treated as the names of paths
298to be searched. There is an overall maximum of 100 patterns. Trailing white
299space is removed from each line, and blank lines are ignored. An empty file
300contains no patterns and therefore matches nothing. See also the comments about
301multiple patterns versus a single pattern with alternatives in the description
302of <b>-e</b> above.
303</P>
304<P>
305<b>--file-list</b>=<i>filename</i>
306Read a list of files to be searched from the given file, one per line. Trailing
307white space is removed from each line, and blank lines are ignored. These files
308are searched before any others that may be listed on the command line. The
309filename can be given as "-" to refer to the standard input. If <b>--file</b>
310and <b>--file-list</b> are both specified as "-", patterns are read first. This
311is useful only when the standard input is a terminal, from which further lines
312(the list of files) can be read after an end-of-file indication.
313</P>
314<P>
315<b>--file-offsets</b>
316Instead of showing lines or parts of lines that match, show each match as an
317offset from the start of the file and a length, separated by a comma. In this
318mode, no context is shown. That is, the <b>-A</b>, <b>-B</b>, and <b>-C</b>
319options are ignored. If there is more than one match in a line, each of them is
320shown separately. This option is mutually exclusive with <b>--line-offsets</b>
321and <b>--only-matching</b>.
322</P>
323<P>
324<b>-H</b>, <b>--with-filename</b>
325Force the inclusion of the filename at the start of output lines when searching
326a single file. By default, the filename is not shown in this case. For matching
327lines, the filename is followed by a colon; for context lines, a hyphen
328separator is used. If a line number is also being output, it follows the file
329name.
330</P>
331<P>
332<b>-h</b>, <b>--no-filename</b>
Suppress the output of filenames when searching multiple files. By default,
334filenames are shown when multiple files are searched. For matching lines, the
335filename is followed by a colon; for context lines, a hyphen separator is used.
336If a line number is also being output, it follows the file name.
337</P>
338<P>
339<b>--help</b>
340Output a help message, giving brief details of the command options and file
341type support, and then exit.
342</P>
343<P>
344<b>-I</b>
345Treat binary files as never matching. This is equivalent to
346<b>--binary-files</b>=<i>without-match</i>.
347</P>
348<P>
349<b>-i</b>, <b>--ignore-case</b>
350Ignore upper/lower case distinctions during comparisons.
351</P>
352<P>
353<b>--include</b>=<i>pattern</i>
354When <b>pcregrep</b> is searching the files in a directory as a consequence of
355the <b>-r</b> (recursive search) option, only those regular files whose names
356match the pattern are included. Subdirectories are always included and searched
recursively, subject to the <b>--include-dir</b> and <b>--exclude-dir</b>
358options. The pattern is a PCRE regular expression, and is matched against the
359final component of the file name (not the entire path). If a file name matches
360both <b>--include</b> and <b>--exclude</b>, it is excluded. There is no short
361form for this option.
362</P>
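<P>
Since the argument is a regular expression rather than a shell glob, a file
suffix is written with an escaped dot and an end anchor. The following sketch
uses an invented directory tree and file names.
</P>

```shell
tree=$(mktemp -d)
printf 'needle\n' > "$tree/notes.txt"
printf 'needle\n' > "$tree/notes.log"

# Only regular files whose final name component ends in ".txt" are searched.
found=$(pcregrep -r --include='\.txt$' needle "$tree")
printf '%s\n' "$found"
rm -r "$tree"
```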
363<P>
364<b>--include-dir</b>=<i>pattern</i>
365When <b>pcregrep</b> is searching the contents of a directory as a consequence
366of the <b>-r</b> (recursive search) option, only those subdirectories whose
367names match the pattern are included. (Note that the <b>--include</b> option
368does not affect subdirectories.) The pattern is a PCRE regular expression, and
369is matched against the final component of the name (not the entire path). If a
370subdirectory name matches both <b>--include-dir</b> and <b>--exclude-dir</b>, it
371is excluded. There is no short form for this option.
372</P>
373<P>
374<b>-L</b>, <b>--files-without-match</b>
375Instead of outputting lines from the files, just output the names of the files
376that do not contain any lines that would have been output. Each file name is
377output once, on a separate line.
378</P>
379<P>
380<b>-l</b>, <b>--files-with-matches</b>
381Instead of outputting lines from the files, just output the names of the files
382containing lines that would have been output. Each file name is output
383once, on a separate line. Searching normally stops as soon as a matching line
384is found in a file. However, if the <b>-c</b> (count) option is also used,
385matching continues in order to obtain the correct count, and those files that
386have at least one match are listed along with their counts. Using this option
387with <b>-c</b> is a way of suppressing the listing of files with no matches.
388</P>
389<P>
390<b>--label</b>=<i>name</i>
391This option supplies a name to be used for the standard input when file names
392are being output. If not supplied, "(standard input)" is used. There is no
393short form for this option.
394</P>
395<P>
396<b>--line-buffered</b>
397When this option is given, input is read and processed line by line, and the
398output is flushed after each write. By default, input is read in large chunks,
399unless <b>pcregrep</b> can determine that it is reading from a terminal (which
is currently possible only in Unix environments). Output to a terminal is
401normally automatically flushed by the operating system. This option can be
402useful when the input or output is attached to a pipe and you do not want
403<b>pcregrep</b> to buffer up large amounts of data. However, its use will affect
404performance, and the <b>-M</b> (multiline) option ceases to work.
405</P>
406<P>
407<b>--line-offsets</b>
408Instead of showing lines or parts of lines that match, show each match as a
409line number, the offset from the start of the line, and a length. The line
410number is terminated by a colon (as usual; see the <b>-n</b> option), and the
411offset and length are separated by a comma. In this mode, no context is shown.
412That is, the <b>-A</b>, <b>-B</b>, and <b>-C</b> options are ignored. If there is
413more than one match in a line, each of them is shown separately. This option is
414mutually exclusive with <b>--file-offsets</b> and <b>--only-matching</b>.
415</P>
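<P>
The two offset formats, <b>--file-offsets</b> and <b>--line-offsets</b>, can
be compared on the same input. This is a sketch with an invented two-line
file; offsets are counted from zero.
</P>

```shell
sample=$(mktemp)
printf 'abc\nxbc\n' > "$sample"

# Offset from the start of the file, then the match length.
file_offsets=$(pcregrep --file-offsets bc "$sample")

# Line number, then offset within that line, then the match length.
line_offsets=$(pcregrep --line-offsets bc "$sample")

printf 'file offsets:\n%s\n' "$file_offsets"
printf 'line offsets:\n%s\n' "$line_offsets"
rm "$sample"
```

<P>
The second "bc" starts five bytes into the file but only one byte into its
line, so the two modes report "5,2" and "2:1,2" respectively for it.
</P>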
416<P>
417<b>--locale</b>=<i>locale-name</i>
418This option specifies a locale to be used for pattern matching. It overrides
419the value in the <b>LC_ALL</b> or <b>LC_CTYPE</b> environment variables. If no
420locale is specified, the PCRE library's default (usually the "C" locale) is
421used. There is no short form for this option.
422</P>
423<P>
424<b>--match-limit</b>=<i>number</i>
425Processing some regular expression patterns can require a very large amount of
426memory, leading in some cases to a program crash if not enough is available.
427Other patterns may take a very long time to search for all possible matching
428strings. The <b>pcre_exec()</b> function that is called by <b>pcregrep</b> to do
429the matching has two parameters that can limit the resources that it uses.
430<br>
431<br>
432The <b>--match-limit</b> option provides a means of limiting resource usage
433when processing patterns that are not going to match, but which have a very
434large number of possibilities in their search trees. The classic example is a
435pattern that uses nested unlimited repeats. Internally, PCRE uses a function
436called <b>match()</b> which it calls repeatedly (sometimes recursively). The
437limit set by <b>--match-limit</b> is imposed on the number of times this
438function is called during a match, which has the effect of limiting the amount
439of backtracking that can take place.
440<br>
441<br>
442The <b>--recursion-limit</b> option is similar to <b>--match-limit</b>, but
443instead of limiting the total number of times that <b>match()</b> is called, it
444limits the depth of recursive calls, which in turn limits the amount of memory
445that can be used. The recursion depth is a smaller number than the total number
446of calls, because not all calls to <b>match()</b> are recursive. This limit is
447of use only if it is set smaller than <b>--match-limit</b>.
448<br>
449<br>
There are no short forms for these options. The default settings are specified
when the PCRE library is compiled; unless overridden at build time, the default is 10 million.
452</P>
453<P>
454<b>-M</b>, <b>--multiline</b>
455Allow patterns to match more than one line. When this option is given, patterns
456may usefully contain literal newline characters and internal occurrences of ^
457and $ characters. The output for a successful match may consist of more than
458one line, the last of which is the one in which the match ended. If the matched
459string ends with a newline sequence the output ends at the end of that line.
460<br>
461<br>
462When this option is set, the PCRE library is called in "multiline" mode.
463There is a limit to the number of lines that can be matched, imposed by the way
464that <b>pcregrep</b> buffers the input file as it scans it. However,
465<b>pcregrep</b> ensures that at least 8K characters or the rest of the document
466(whichever is the shorter) are available for forward matching, and similarly
467the previous 8K characters (or all the previous characters, if fewer than 8K)
468are guaranteed to be available for lookbehind assertions. This option does not
work when input is read line by line (see <b>--line-buffered</b>).
470</P>
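<P>
A minimal sketch of a pattern that spans a line boundary. The input is
invented, and the library's default newline convention is assumed to be LF.
The single quotes pass \n through to PCRE, which interprets it as a newline.
</P>

```shell
# Without -M this pattern could never match, because each line is
# examined on its own.
multi=$(printf 'begin\nend\nother\n' | pcregrep -M 'begin\nend')
printf '%s\n' "$multi"
```

<P>
Both lines that take part in the match are output; the unrelated third line is
not.
</P>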
471<P>
472<b>-N</b> <i>newline-type</i>, <b>--newline</b>=<i>newline-type</i>
473The PCRE library supports five different conventions for indicating
474the ends of lines. They are the single-character sequences CR (carriage return)
475and LF (linefeed), the two-character sequence CRLF, an "anycrlf" convention,
476which recognizes any of the preceding three types, and an "any" convention, in
477which any Unicode line ending sequence is assumed to end a line. The Unicode
478sequences are the three just mentioned, plus VT (vertical tab, U+000B), FF
479(form feed, U+000C), NEL (next line, U+0085), LS (line separator, U+2028), and
480PS (paragraph separator, U+2029).
481<br>
482<br>
483When the PCRE library is built, a default line-ending sequence is specified.
484This is normally the standard sequence for the operating system. Unless
485otherwise specified by this option, <b>pcregrep</b> uses the library's default.
486The possible values for this option are CR, LF, CRLF, ANYCRLF, or ANY. This
487makes it possible to use <b>pcregrep</b> on files that have come from other
488environments without having to modify their line endings. If the data that is
489being scanned does not agree with the convention set by this option,
490<b>pcregrep</b> may behave in strange ways.
491</P>
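<P>
For instance, data with CRLF line endings can be scanned without converting it
first. The following sketch feeds two invented CRLF-terminated lines to
<b>pcregrep</b> and counts the lines that consist of exactly "two".
</P>

```shell
# With -N CRLF, $ matches immediately before the CR LF pair.
crlf_count=$(printf 'one\r\ntwo\r\n' | pcregrep -c -N CRLF '^two$')
echo "$crlf_count"
```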
492<P>
493<b>-n</b>, <b>--line-number</b>
494Precede each output line by its line number in the file, followed by a colon
495for matching lines or a hyphen for context lines. If the filename is also being
496output, it precedes the line number. This option is forced if
497<b>--line-offsets</b> is used.
498</P>
499<P>
500<b>--no-jit</b>
501If the PCRE library is built with support for just-in-time compiling (which
502speeds up matching), <b>pcregrep</b> automatically makes use of this, unless it
503was explicitly disabled at build time. This option can be used to disable the
504use of JIT at run time. It is provided for testing and working round problems.
505It should never be needed in normal use.
506</P>
507<P>
508<b>-o</b>, <b>--only-matching</b>
509Show only the part of the line that matched a pattern instead of the whole
510line. In this mode, no context is shown. That is, the <b>-A</b>, <b>-B</b>, and
511<b>-C</b> options are ignored. If there is more than one match in a line, each
512of them is shown separately. If <b>-o</b> is combined with <b>-v</b> (invert the
513sense of the match to find non-matching lines), no output is generated, but the
514return code is set appropriately. If the matched portion of the line is empty,
nothing is output unless the file name or line number is being printed, in
516which case they are shown on an otherwise empty line. This option is mutually
517exclusive with <b>--file-offsets</b> and <b>--line-offsets</b>.
518</P>
519<P>
520<b>-o</b><i>number</i>, <b>--only-matching</b>=<i>number</i>
521Show only the part of the line that matched the capturing parentheses of the
522given number. Up to 32 capturing parentheses are supported. Because these
523options can be given without an argument (see above), if an argument is
524present, it must be given in the same shell item, for example, -o3 or
525--only-matching=2. The comments given for the non-argument case above also
526apply to this case. If the specified capturing parentheses do not exist in the
pattern, or were not set in the match, nothing is output unless the file name
or line number is being printed.
529</P>
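<P>
A short sketch of extracting one capturing group. The input line and the
pattern are invented for the example; note that the number is attached
directly to <b>-o</b>.
</P>

```shell
# Print only what the first set of capturing parentheses matched.
value=$(printf 'key=value\n' | pcregrep -o1 'key=(\w+)')
echo "$value"
```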
530<P>
531<b>-q</b>, <b>--quiet</b>
532Work quietly, that is, display nothing except error messages. The exit
533status indicates whether or not any matches were found.
534</P>
535<P>
536<b>-r</b>, <b>--recursive</b>
537If any given path is a directory, recursively scan the files it contains,
538taking note of any <b>--include</b> and <b>--exclude</b> settings. By default, a
539directory is read as a normal file; in some operating systems this gives an
540immediate end-of-file. This option is a shorthand for setting the <b>-d</b>
541option to "recurse".
542</P>
543<P>
544<b>--recursion-limit</b>=<i>number</i>
545See <b>--match-limit</b> above.
546</P>
547<P>
548<b>-s</b>, <b>--no-messages</b>
549Suppress error messages about non-existent or unreadable files. Such files are
550quietly skipped. However, the return code is still 2, even if matches were
551found in other files.
552</P>
553<P>
554<b>-u</b>, <b>--utf-8</b>
555Operate in UTF-8 mode. This option is available only if PCRE has been compiled
556with UTF-8 support. Both patterns and subject lines must be valid strings of
557UTF-8 characters.
558</P>
559<P>
560<b>-V</b>, <b>--version</b>
561Write the version numbers of <b>pcregrep</b> and the PCRE library that is being
562used to the standard error stream.
563</P>
564<P>
565<b>-v</b>, <b>--invert-match</b>
566Invert the sense of the match, so that lines which do <i>not</i> match any of
567the patterns are the ones that are found.
568</P>
569<P>
570<b>-w</b>, <b>--word-regex</b>, <b>--word-regexp</b>
571Force the patterns to match only whole words. This is equivalent to having \b
572at the start and end of the pattern.
573</P>
574<P>
575<b>-x</b>, <b>--line-regex</b>, <b>--line-regexp</b>
576Force the patterns to be anchored (each must start matching at the beginning of
577a line) and in addition, require them to match entire lines. This is
578equivalent to having ^ and $ characters at the start and end of each
579alternative branch in every pattern.
580</P>
581<br><a name="SEC6" href="#TOC1">ENVIRONMENT VARIABLES</a><br>
582<P>
583The environment variables <b>LC_ALL</b> and <b>LC_CTYPE</b> are examined, in that
584order, for a locale. The first one that is set is used. This can be overridden
585by the <b>--locale</b> option. If no locale is set, the PCRE library's default
586(usually the "C" locale) is used.
587</P>
588<br><a name="SEC7" href="#TOC1">NEWLINES</a><br>
589<P>
590The <b>-N</b> (<b>--newline</b>) option allows <b>pcregrep</b> to scan files with
591different newline conventions from the default. However, the setting of this
592option does not affect the way in which <b>pcregrep</b> writes information to
593the standard error and output streams. It uses the string "\n" in C
594<b>printf()</b> calls to indicate newlines, relying on the C I/O library to
595convert this to an appropriate sequence if the output is sent to a file.
596</P>
597<br><a name="SEC8" href="#TOC1">OPTIONS COMPATIBILITY</a><br>
598<P>
599Many of the short and long forms of <b>pcregrep</b>'s options are the same
600as in the GNU <b>grep</b> program. Any long option of the form
601<b>--xxx-regexp</b> (GNU terminology) is also available as <b>--xxx-regex</b>
602(PCRE terminology). However, the <b>--file-list</b>, <b>--file-offsets</b>,
603<b>--include-dir</b>, <b>--line-offsets</b>, <b>--locale</b>, <b>--match-limit</b>,
604<b>-M</b>, <b>--multiline</b>, <b>-N</b>, <b>--newline</b>,
605<b>--recursion-limit</b>, <b>-u</b>, and <b>--utf-8</b> options are specific to
606<b>pcregrep</b>, as is the use of the <b>--only-matching</b> option with a
607capturing parentheses number.
608</P>
609<P>
610Although most of the common options work the same way, a few are different in
611<b>pcregrep</b>. For example, the <b>--include</b> option's argument is a glob
612for GNU <b>grep</b>, but a regular expression for <b>pcregrep</b>. If both the
613<b>-c</b> and <b>-l</b> options are given, GNU grep lists only file names,
614without counts, but <b>pcregrep</b> gives the counts.
615</P>
616<br><a name="SEC9" href="#TOC1">OPTIONS WITH DATA</a><br>
617<P>
618There are four different ways in which an option with data can be specified.
619If a short form option is used, the data may follow immediately, or (with one
620exception) in the next command line item. For example:
621<pre>
622  -f/some/file
623  -f /some/file
624</pre>
625The exception is the <b>-o</b> option, which may appear with or without data.
626Because of this, if data is present, it must follow immediately in the same
627item, for example -o3.
628</P>
629<P>
630If a long form option is used, the data may appear in the same command line
631item, separated by an equals character, or (with two exceptions) it may appear
632in the next command line item. For example:
633<pre>
634  --file=/some/file
635  --file /some/file
636</pre>
637Note, however, that if you want to supply a file name beginning with ~ as data
638in a shell command, and have the shell expand ~ to a home directory, you must
639separate the file name from the option, because the shell does not treat ~
640specially unless it is at the start of an item.
641</P>
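<P>
The effect of the shell's tilde handling can be seen without running
<b>pcregrep</b> at all. In this sketch <b>echo</b> stands in for the command,
and /home/example is a made-up home directory that is set only inside each
command substitution.
</P>

```shell
# The tilde is in the middle of the item, so the shell leaves it alone.
joined=$(HOME=/home/example; echo --file=~/patterns)

# The tilde starts its own item, so the shell expands it.
separate=$(HOME=/home/example; echo --file ~/patterns)

printf '%s\n' "$joined" "$separate"
```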
642<P>
643The exceptions to the above are the <b>--colour</b> (or <b>--color</b>) and
644<b>--only-matching</b> options, for which the data is optional. If one of these
645options does have data, it must be given in the first form, using an equals
646character. Otherwise <b>pcregrep</b> will assume that it has no data.
647</P>
648<br><a name="SEC10" href="#TOC1">MATCHING ERRORS</a><br>
649<P>
650It is possible to supply a regular expression that takes a very long time to
651fail to match certain lines. Such patterns normally involve nested indefinite
652repeats, for example: (a+)*\d when matched against a line of a's with no final
653digit. The PCRE matching function has a resource limit that causes it to abort
654in these circumstances. If this happens, <b>pcregrep</b> outputs an error
655message and the line that caused the problem to the standard error stream. If
656there are more than 20 such errors, <b>pcregrep</b> gives up.
657</P>
658<P>
659The <b>--match-limit</b> option of <b>pcregrep</b> can be used to set the overall
660resource limit; there is a second option called <b>--recursion-limit</b> that
661sets a limit on the amount of memory (usually stack) that is used (see the
662discussion of these options above).
663</P>
664<br><a name="SEC11" href="#TOC1">DIAGNOSTICS</a><br>
665<P>
666Exit status is 0 if any matches were found, 1 if no matches were found, and 2
667for syntax errors, overlong lines, non-existent or inaccessible files (even if
668matches were found in other files) or too many matching errors. Using the
669<b>-s</b> option to suppress error messages about inaccessible files does not
670affect the return code.
671</P>
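<P>
A script can distinguish the three outcomes by inspecting the exit status. The
following is a sketch: <i>check</i> is a hypothetical helper name and the
sample file is invented. It uses <b>-q</b> and <b>-s</b> so that nothing is
written by <b>pcregrep</b> itself.
</P>

```shell
# Map the exit status of a quiet search to a word.
# check is a hypothetical helper used only for illustration.
check() {
  rc=0
  pcregrep -q -s "$1" "$2" || rc=$?
  case $rc in
    0) echo matched ;;
    1) echo no-match ;;
    *) echo error ;;
  esac
}

sample=$(mktemp)
printf 'hello\n' > "$sample"

found=$(check hello "$sample")
missing=$(check goodbye "$sample")
failed=$(check hello "$sample.absent")   # this file does not exist

printf '%s\n' "$found" "$missing" "$failed"
rm "$sample"
```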
672<br><a name="SEC12" href="#TOC1">SEE ALSO</a><br>
673<P>
674<b>pcrepattern</b>(3), <b>pcretest</b>(1).
675</P>
676<br><a name="SEC13" href="#TOC1">AUTHOR</a><br>
677<P>
678Philip Hazel
679<br>
680University Computing Service
681<br>
682Cambridge CB2 3QH, England.
683<br>
684</P>
685<br><a name="SEC14" href="#TOC1">REVISION</a><br>
686<P>
687Last updated: 04 March 2012
688<br>
689Copyright &copy; 1997-2012 University of Cambridge.
690<br>
691<p>
692Return to the <a href="index.html">PCRE index page</a>.
693</p>
694