<html>
<head>
<title>pcregrep specification</title>
</head>
<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
<h1>pcregrep man page</h1>
<p>
Return to the <a href="index.html">PCRE index page</a>.
</p>
<p>
This page is part of the PCRE HTML documentation. It was generated automatically
from the original man page. If there is any nonsense in it, please consult the
man page, in case the conversion went wrong.
<br>
<ul>
<li><a name="TOC1" href="#SEC1">SYNOPSIS</a>
<li><a name="TOC2" href="#SEC2">DESCRIPTION</a>
<li><a name="TOC3" href="#SEC3">SUPPORT FOR COMPRESSED FILES</a>
<li><a name="TOC4" href="#SEC4">BINARY FILES</a>
<li><a name="TOC5" href="#SEC5">OPTIONS</a>
<li><a name="TOC6" href="#SEC6">ENVIRONMENT VARIABLES</a>
<li><a name="TOC7" href="#SEC7">NEWLINES</a>
<li><a name="TOC8" href="#SEC8">OPTIONS COMPATIBILITY</a>
<li><a name="TOC9" href="#SEC9">OPTIONS WITH DATA</a>
<li><a name="TOC10" href="#SEC10">MATCHING ERRORS</a>
<li><a name="TOC11" href="#SEC11">DIAGNOSTICS</a>
<li><a name="TOC12" href="#SEC12">SEE ALSO</a>
<li><a name="TOC13" href="#SEC13">AUTHOR</a>
<li><a name="TOC14" href="#SEC14">REVISION</a>
</ul>
<br><a name="SEC1" href="#TOC1">SYNOPSIS</a><br>
<P>
<b>pcregrep [options] [long options] [pattern] [path1 path2 ...]</b>
</P>
<br><a name="SEC2" href="#TOC1">DESCRIPTION</a><br>
<P>
<b>pcregrep</b> searches files for character patterns, in the same way as other
grep commands do, but it uses the PCRE regular expression library to support
patterns that are compatible with the regular expressions of Perl 5. See
<a href="pcrepattern.html"><b>pcrepattern</b>(3)</a>
for a full description of syntax and semantics of the regular expressions
that PCRE supports.
</P>
<P>
Patterns, whether supplied on the command line or in a separate file, are given
without delimiters.
For example:
<pre>
  pcregrep Thursday /etc/motd
</pre>
If you attempt to use delimiters (for example, by surrounding a pattern with
slashes, as is common in Perl scripts), they are interpreted as part of the
pattern. Quotes can of course be used to delimit patterns on the command line
because they are interpreted by the shell, and indeed they are required if a
pattern contains white space or shell metacharacters.
</P>
<P>
The first argument that follows any option settings is treated as the single
pattern to be matched when neither <b>-e</b> nor <b>-f</b> is present.
Conversely, when one or both of these options are used to specify patterns, all
arguments are treated as path names. At least one of <b>-e</b>, <b>-f</b>, or an
argument pattern must be provided.
</P>
<P>
If no files are specified, <b>pcregrep</b> reads the standard input. The
standard input can also be referenced by a name consisting of a single hyphen.
For example:
<pre>
  pcregrep some-pattern /file1 - /file3
</pre>
By default, each line that matches a pattern is copied to the standard
output, and if there is more than one file, the file name is output at the
start of each line, followed by a colon. However, there are options that can
change how <b>pcregrep</b> behaves. In particular, the <b>-M</b> option makes it
possible to search for patterns that span line boundaries. What defines a line
boundary is controlled by the <b>-N</b> (<b>--newline</b>) option.
</P>
<P>
The amount of memory used for buffering files that are being scanned is
controlled by a parameter that can be set by the <b>--buffer-size</b> option.
The default value for this parameter is specified when <b>pcregrep</b> is built;
if no value is given at build time, it is 20K. A block of memory three times
this size is used (to allow for buffering "before" and "after" lines). An error
occurs if a line overflows the buffer.
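<br>
<br>
As a rough illustration of the arithmetic above, the following Python sketch
computes the size of the working block for a given <b>--buffer-size</b> value.
The function names are hypothetical, and the K and M suffixes follow the rule
stated in the OPTIONS section; this is not <b>pcregrep</b>'s own code.

```python
def parse_size(text):
    """Convert an option value such as "20K" or "1M" to a byte count."""
    multipliers = {"K": 1024, "M": 1024 * 1024}
    suffix = text[-1:]
    if suffix in multipliers:
        return int(text[:-1]) * multipliers[suffix]
    return int(text)


def working_block_bytes(buffer_size):
    """Three times the buffer size is allocated, as described above."""
    return 3 * parse_size(buffer_size)


print(working_block_bytes("20K"))  # 61440
```

With the 20K default, the working block is therefore 60K (61440 bytes).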
</P>
<P>
Patterns are limited to 8K or BUFSIZ bytes, whichever is the greater. BUFSIZ is
defined in <b>&lt;stdio.h&gt;</b>. When there is more than one pattern (specified by
the use of <b>-e</b> and/or <b>-f</b>), each pattern is applied to each line in
the order in which they are defined, except that all the <b>-e</b> patterns are
tried before the <b>-f</b> patterns.
</P>
<P>
By default, as soon as one pattern matches (or fails to match when <b>-v</b> is
used), no further patterns are considered. However, if <b>--colour</b> (or
<b>--color</b>) is used to colour the matching substrings, or if
<b>--only-matching</b>, <b>--file-offsets</b>, or <b>--line-offsets</b> is used to
output only the part of the line that matched (either shown literally, or as an
offset), scanning resumes immediately following the match, so that further
matches on the same line can be found. If there are multiple patterns, they are
all tried on the remainder of the line, but patterns that follow the one that
matched are not tried on the earlier part of the line.
</P>
<P>
This is the same behaviour as GNU grep, but it does mean that the order in
which multiple patterns are specified can affect the output when one of the
above options is used.
</P>
<P>
Patterns that can match an empty string are accepted, but empty string
matches are never recognized. An example is the pattern "(super)?(man)?", in
which all components are optional. This pattern finds all occurrences of both
"super" and "man"; the output differs from matching with "super|man" when only
the matching substrings are being shown.
</P>
<P>
If the <b>LC_ALL</b> or <b>LC_CTYPE</b> environment variable is set,
<b>pcregrep</b> uses the value to set a locale when calling the PCRE library.
The <b>--locale</b> option can be used to override this.
</P>
<br><a name="SEC3" href="#TOC1">SUPPORT FOR COMPRESSED FILES</a><br>
<P>
It is possible to compile <b>pcregrep</b> so that it uses <b>libz</b> or
<b>libbz2</b> to read files whose names end in <b>.gz</b> or <b>.bz2</b>,
respectively. You can find out whether your binary has support for one or both
of these file types by running it with the <b>--help</b> option. If the
appropriate support is not present, files are treated as plain text. The
standard input is always so treated.
</P>
<br><a name="SEC4" href="#TOC1">BINARY FILES</a><br>
<P>
By default, a file that contains a binary zero byte within the first 1024 bytes
is identified as a binary file, and is processed specially. (GNU grep also
identifies binary files in this manner.) See the <b>--binary-files</b> option
for a means of changing the way binary files are handled.
</P>
<br><a name="SEC5" href="#TOC1">OPTIONS</a><br>
<P>
The order in which some of the options appear can affect the output. For
example, both the <b>-h</b> and <b>-l</b> options affect the printing of file
names. Whichever comes later in the command line will be the one that takes
effect. Numerical values for options may be followed by K or M, to signify
multiplication by 1024 or 1024*1024 respectively.
</P>
<P>
<b>--</b>
This terminates the list of options. It is useful if the next item on the
command line starts with a hyphen but is not an option. This allows for the
processing of patterns and filenames that start with hyphens.
</P>
<P>
<b>-A</b> <i>number</i>, <b>--after-context=</b><i>number</i>
Output <i>number</i> lines of context after each matching line. If filenames
and/or line numbers are being output, a hyphen separator is used instead of a
colon for the context lines. A line containing "--" is output between each
group of lines, unless they are in fact contiguous in the input file.
The value
of <i>number</i> is expected to be relatively small. However, <b>pcregrep</b>
guarantees to have up to 8K of following text available for context output.
</P>
<P>
<b>-a</b>, <b>--text</b>
Treat binary files as text. This is equivalent to
<b>--binary-files</b>=<i>text</i>.
</P>
<P>
<b>-B</b> <i>number</i>, <b>--before-context=</b><i>number</i>
Output <i>number</i> lines of context before each matching line. If filenames
and/or line numbers are being output, a hyphen separator is used instead of a
colon for the context lines. A line containing "--" is output between each
group of lines, unless they are in fact contiguous in the input file. The value
of <i>number</i> is expected to be relatively small. However, <b>pcregrep</b>
guarantees to have up to 8K of preceding text available for context output.
</P>
<P>
<b>--binary-files=</b><i>word</i>
Specify how binary files are to be processed. If the word is "binary" (the
default), pattern matching is performed on binary files, but the only output is
"Binary file &lt;name&gt; matches" when a match succeeds. If the word is "text",
which is equivalent to the <b>-a</b> or <b>--text</b> option, binary files are
processed in the same way as any other file. In this case, when a match
succeeds, the output may be binary garbage, which can have nasty effects if
sent to a terminal. If the word is "without-match", which is equivalent to the
<b>-I</b> option, binary files are not processed at all; they are assumed not to
be of interest.
</P>
<P>
<b>--buffer-size=</b><i>number</i>
Set the parameter that controls how much memory is used for buffering files
that are being scanned.
</P>
<P>
<b>-C</b> <i>number</i>, <b>--context=</b><i>number</i>
Output <i>number</i> lines of context both before and after each matching line.
This is equivalent to setting both <b>-A</b> and <b>-B</b> to the same value.
</P>
<P>
<b>-c</b>, <b>--count</b>
Do not output individual lines from the files that are being scanned; instead
output the number of lines that would otherwise have been shown. If no lines
are selected, the number zero is output. If several files are being
scanned, a count is output for each of them. However, if the
<b>--files-with-matches</b> option is also used, only those files whose counts
are greater than zero are listed. When <b>-c</b> is used, the <b>-A</b>,
<b>-B</b>, and <b>-C</b> options are ignored.
</P>
<P>
<b>--colour</b>, <b>--color</b>
If this option is given without any data, it is equivalent to "--colour=auto".
If data is required, it must be given in the same shell item, separated by an
equals sign.
</P>
<P>
<b>--colour=</b><i>value</i>, <b>--color=</b><i>value</i>
This option specifies under what circumstances the parts of a line that matched
a pattern should be coloured in the output. By default, the output is not
coloured. The value (which is optional, see above) may be "never", "always", or
"auto". In the last case, colouring happens only if the standard output is
connected to a terminal. More resources are used when colouring is enabled,
because <b>pcregrep</b> has to search for all possible matches in a line, not
just one, in order to colour them all.
<br>
<br>
The colour that is used can be specified by setting the environment variable
PCREGREP_COLOUR or PCREGREP_COLOR. The value of this variable should be a
string of two numbers, separated by a semicolon. They are copied directly into
the control string for setting colour on a terminal, so it is your
responsibility to ensure that they make sense. If neither of the environment
variables is set, the default is "1;31", which gives red.
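<br>
<br>
To make the role of the two numbers concrete, here is a minimal Python sketch
of how such a value could be placed into a terminal control string. It assumes
the usual ANSI form (ESC, "[", the numbers, "m") with a reset afterwards, and
it assumes that PCREGREP_COLOUR is consulted before PCREGREP_COLOR and that an
empty value counts as unset; none of these details is stated above, and the
function names are hypothetical.

```python
import os


def colour_string(environ=os.environ):
    # Assumed lookup order: PCREGREP_COLOUR first, then PCREGREP_COLOR,
    # then the documented default "1;31" (red).
    return (environ.get("PCREGREP_COLOUR")
            or environ.get("PCREGREP_COLOR")
            or "1;31")


def highlight(text, environ=os.environ):
    # ESC [ <numbers> m switches the colour on; ESC [ 0 m resets it.
    # This escape layout is an assumption, not taken from pcregrep.
    return "\x1b[" + colour_string(environ) + "m" + text + "\x1b[0m"


print(repr(highlight("match", {})))
```

Because the value is copied in unchecked, a setting such as "4;32" would be
passed through exactly as given.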
</P>
<P>
<b>-D</b> <i>action</i>, <b>--devices=</b><i>action</i>
If an input path is not a regular file or a directory, "action" specifies how
it is to be processed. Valid values are "read" (the default) or "skip"
(silently skip the path).
</P>
<P>
<b>-d</b> <i>action</i>, <b>--directories=</b><i>action</i>
If an input path is a directory, "action" specifies how it is to be processed.
Valid values are "read" (the default), "recurse" (equivalent to the <b>-r</b>
option), or "skip" (silently skip the path). In the default case, directories
are read as if they were ordinary files. In some operating systems the effect
of reading a directory like this is an immediate end-of-file.
</P>
<P>
<b>-e</b> <i>pattern</i>, <b>--regex=</b><i>pattern</i>, <b>--regexp=</b><i>pattern</i>
Specify a pattern to be matched. This option can be used multiple times in
order to specify several patterns. It can also be used as a way of specifying a
single pattern that starts with a hyphen. When <b>-e</b> is used, no argument
pattern is taken from the command line; all arguments are treated as file
names. There is an overall maximum of 100 patterns. They are applied to each
line in the order in which they are defined until one matches (or fails to
match if <b>-v</b> is used). If <b>-f</b> is used with <b>-e</b>, the command line
patterns are matched first, followed by the patterns from the file, independent
of the order in which these options are specified. Note that multiple use of
<b>-e</b> is not the same as a single pattern with alternatives. For example,
X|Y finds the first character in a line that is X or Y, whereas if the two
patterns are given separately, <b>pcregrep</b> finds X if it is present, even if
it follows Y in the line. It finds Y only if there is no X in the line. This
really matters only if you are using <b>-o</b> to show the part(s) of the line
that matched.
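<br>
<br>
The difference between X|Y and two separate patterns can be sketched in
Python. This is only a model of the ordering rule described above, using
Python's <b>re</b> module rather than PCRE; the function name
<b>first_match</b> is hypothetical.

```python
import re


def first_match(patterns, line):
    """Try each pattern in order; return the text matched by the first
    pattern that succeeds, or None if none of them match."""
    for pattern in patterns:
        m = re.search(pattern, line)
        if m:
            return m.group(0)
    return None


line = "a Y then an X"
print(first_match(["X|Y"], line))     # Y  (leftmost of either letter)
print(first_match(["X", "Y"], line))  # X  (first pattern wins)
```

With the single pattern the leftmost occurrence wins; with separate patterns
the earlier pattern wins, wherever it matches in the line.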
</P>
<P>
<b>--exclude</b>=<i>pattern</i>
When <b>pcregrep</b> is searching the files in a directory as a consequence of
the <b>-r</b> (recursive search) option, any regular files whose names match the
pattern are excluded. Subdirectories are not excluded by this option; they are
searched recursively, subject to the <b>--exclude-dir</b> and
<b>--include-dir</b> options. The pattern is a PCRE regular expression, and is
matched against the final component of the file name (not the entire path). If
a file name matches both <b>--include</b> and <b>--exclude</b>, it is excluded.
There is no short form for this option.
</P>
<P>
<b>--exclude-dir</b>=<i>pattern</i>
When <b>pcregrep</b> is searching the contents of a directory as a consequence
of the <b>-r</b> (recursive search) option, any subdirectories whose names match
the pattern are excluded. (Note that the <b>--exclude</b> option does not affect
subdirectories.) The pattern is a PCRE regular expression, and is matched
against the final component of the name (not the entire path). If a
subdirectory name matches both <b>--include-dir</b> and <b>--exclude-dir</b>, it
is excluded. There is no short form for this option.
</P>
<P>
<b>-F</b>, <b>--fixed-strings</b>
Interpret each pattern as a list of fixed strings, separated by newlines,
instead of as a regular expression. The <b>-w</b> (match as a word) and <b>-x</b>
(match whole line) options can be used with <b>-F</b>. They apply to each of the
fixed strings. A line is selected if any of the fixed strings are found in it
(subject to <b>-w</b> or <b>-x</b>, if present).
</P>
<P>
<b>-f</b> <i>filename</i>, <b>--file=</b><i>filename</i>
Read a number of patterns from the file, one per line, and match them against
each line of input. A data line is output if any of the patterns match it. The
filename can be given as "-" to refer to the standard input.
When <b>-f</b> is
used, patterns specified on the command line using <b>-e</b> may also be
present; they are tested before the file's patterns. However, no other pattern
is taken from the command line; all arguments are treated as the names of paths
to be searched. There is an overall maximum of 100 patterns. Trailing white
space is removed from each line, and blank lines are ignored. An empty file
contains no patterns and therefore matches nothing. See also the comments about
multiple patterns versus a single pattern with alternatives in the description
of <b>-e</b> above.
</P>
<P>
<b>--file-list</b>=<i>filename</i>
Read a list of files to be searched from the given file, one per line. Trailing
white space is removed from each line, and blank lines are ignored. These files
are searched before any others that may be listed on the command line. The
filename can be given as "-" to refer to the standard input. If <b>--file</b>
and <b>--file-list</b> are both specified as "-", patterns are read first. This
is useful only when the standard input is a terminal, from which further lines
(the list of files) can be read after an end-of-file indication.
</P>
<P>
<b>--file-offsets</b>
Instead of showing lines or parts of lines that match, show each match as an
offset from the start of the file and a length, separated by a comma. In this
mode, no context is shown. That is, the <b>-A</b>, <b>-B</b>, and <b>-C</b>
options are ignored. If there is more than one match in a line, each of them is
shown separately. This option is mutually exclusive with <b>--line-offsets</b>
and <b>--only-matching</b>.
</P>
<P>
<b>-H</b>, <b>--with-filename</b>
Force the inclusion of the filename at the start of output lines when searching
a single file. By default, the filename is not shown in this case.
For matching
lines, the filename is followed by a colon; for context lines, a hyphen
separator is used. If a line number is also being output, it follows the file
name.
</P>
<P>
<b>-h</b>, <b>--no-filename</b>
Suppress the output of filenames when searching multiple files. By default,
filenames are shown when multiple files are searched. For matching lines, the
filename is followed by a colon; for context lines, a hyphen separator is used.
If a line number is also being output, it follows the file name.
</P>
<P>
<b>--help</b>
Output a help message, giving brief details of the command options and file
type support, and then exit.
</P>
<P>
<b>-I</b>
Treat binary files as never matching. This is equivalent to
<b>--binary-files</b>=<i>without-match</i>.
</P>
<P>
<b>-i</b>, <b>--ignore-case</b>
Ignore upper/lower case distinctions during comparisons.
</P>
<P>
<b>--include</b>=<i>pattern</i>
When <b>pcregrep</b> is searching the files in a directory as a consequence of
the <b>-r</b> (recursive search) option, only those regular files whose names
match the pattern are included. Subdirectories are always included and searched
recursively, subject to the <b>--include-dir</b> and <b>--exclude-dir</b>
options. The pattern is a PCRE regular expression, and is matched against the
final component of the file name (not the entire path). If a file name matches
both <b>--include</b> and <b>--exclude</b>, it is excluded. There is no short
form for this option.
</P>
<P>
<b>--include-dir</b>=<i>pattern</i>
When <b>pcregrep</b> is searching the contents of a directory as a consequence
of the <b>-r</b> (recursive search) option, only those subdirectories whose
names match the pattern are included. (Note that the <b>--include</b> option
does not affect subdirectories.)
The pattern is a PCRE regular expression, and
is matched against the final component of the name (not the entire path). If a
subdirectory name matches both <b>--include-dir</b> and <b>--exclude-dir</b>, it
is excluded. There is no short form for this option.
</P>
<P>
<b>-L</b>, <b>--files-without-match</b>
Instead of outputting lines from the files, just output the names of the files
that do not contain any lines that would have been output. Each file name is
output once, on a separate line.
</P>
<P>
<b>-l</b>, <b>--files-with-matches</b>
Instead of outputting lines from the files, just output the names of the files
containing lines that would have been output. Each file name is output
once, on a separate line. Searching normally stops as soon as a matching line
is found in a file. However, if the <b>-c</b> (count) option is also used,
matching continues in order to obtain the correct count, and those files that
have at least one match are listed along with their counts. Using this option
with <b>-c</b> is a way of suppressing the listing of files with no matches.
</P>
<P>
<b>--label</b>=<i>name</i>
This option supplies a name to be used for the standard input when file names
are being output. If not supplied, "(standard input)" is used. There is no
short form for this option.
</P>
<P>
<b>--line-buffered</b>
When this option is given, input is read and processed line by line, and the
output is flushed after each write. By default, input is read in large chunks,
unless <b>pcregrep</b> can determine that it is reading from a terminal (which
is currently possible only in Unix environments). Output to a terminal is
normally automatically flushed by the operating system. This option can be
useful when the input or output is attached to a pipe and you do not want
<b>pcregrep</b> to buffer up large amounts of data.
However, its use will affect
performance, and the <b>-M</b> (multiline) option ceases to work.
</P>
<P>
<b>--line-offsets</b>
Instead of showing lines or parts of lines that match, show each match as a
line number, the offset from the start of the line, and a length. The line
number is terminated by a colon (as usual; see the <b>-n</b> option), and the
offset and length are separated by a comma. In this mode, no context is shown.
That is, the <b>-A</b>, <b>-B</b>, and <b>-C</b> options are ignored. If there is
more than one match in a line, each of them is shown separately. This option is
mutually exclusive with <b>--file-offsets</b> and <b>--only-matching</b>.
</P>
<P>
<b>--locale</b>=<i>locale-name</i>
This option specifies a locale to be used for pattern matching. It overrides
the value in the <b>LC_ALL</b> or <b>LC_CTYPE</b> environment variables. If no
locale is specified, the PCRE library's default (usually the "C" locale) is
used. There is no short form for this option.
</P>
<P>
<b>--match-limit</b>=<i>number</i>
Processing some regular expression patterns can require a very large amount of
memory, leading in some cases to a program crash if not enough is available.
Other patterns may take a very long time to search for all possible matching
strings. The <b>pcre_exec()</b> function that is called by <b>pcregrep</b> to do
the matching has two parameters that can limit the resources that it uses.
<br>
<br>
The <b>--match-limit</b> option provides a means of limiting resource usage
when processing patterns that are not going to match, but which have a very
large number of possibilities in their search trees. The classic example is a
pattern that uses nested unlimited repeats. Internally, PCRE uses a function
called <b>match()</b> which it calls repeatedly (sometimes recursively).
The
limit set by <b>--match-limit</b> is imposed on the number of times this
function is called during a match, which has the effect of limiting the amount
of backtracking that can take place.
<br>
<br>
The <b>--recursion-limit</b> option is similar to <b>--match-limit</b>, but
instead of limiting the total number of times that <b>match()</b> is called, it
limits the depth of recursive calls, which in turn limits the amount of memory
that can be used. The recursion depth is a smaller number than the total number
of calls, because not all calls to <b>match()</b> are recursive. This limit is
of use only if it is set smaller than <b>--match-limit</b>.
<br>
<br>
There are no short forms for these options. The default settings are specified
when the PCRE library is compiled; if no values are given at build time, both
limits are 10 million.
</P>
<P>
<b>-M</b>, <b>--multiline</b>
Allow patterns to match more than one line. When this option is given, patterns
may usefully contain literal newline characters and internal occurrences of ^
and $ characters. The output for a successful match may consist of more than
one line, the last of which is the one in which the match ended. If the matched
string ends with a newline sequence the output ends at the end of that line.
<br>
<br>
When this option is set, the PCRE library is called in "multiline" mode.
There is a limit to the number of lines that can be matched, imposed by the way
that <b>pcregrep</b> buffers the input file as it scans it. However,
<b>pcregrep</b> ensures that at least 8K characters or the rest of the document
(whichever is the shorter) are available for forward matching, and similarly
the previous 8K characters (or all the previous characters, if fewer than 8K)
are guaranteed to be available for lookbehind assertions. This option does not
work when input is read line by line (see <b>--line-buffered</b>).
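<br>
<br>
The effect of multiline matching on the output can be modelled in a few lines
of Python. This sketch uses Python's <b>re</b> module with its MULTILINE flag
as a stand-in for PCRE, searches a whole string instead of a sliding buffer,
and uses the hypothetical name <b>matched_lines</b>; it illustrates the idea
only.

```python
import re


def matched_lines(pattern, text):
    """Return the complete lines spanned by the first match, or None."""
    m = re.search(pattern, text, re.MULTILINE)
    if m is None:
        return None
    # Extend backwards to the start of the line where the match began.
    start = text.rfind("\n", 0, m.start()) + 1
    # Extend forwards to the end of the line where the match ended.
    end = text.find("\n", m.end())
    if end == -1:
        end = len(text)
    return text[start:end]


text = "alpha\nbegin\nend\nomega\n"
print(matched_lines(r"^begin\nend$", text))
```

Here the pattern contains a literal newline and internal ^ and $ anchors, and
the two lines "begin" and "end" are reported together.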
</P>
<P>
<b>-N</b> <i>newline-type</i>, <b>--newline</b>=<i>newline-type</i>
The PCRE library supports five different conventions for indicating
the ends of lines. They are the single-character sequences CR (carriage return)
and LF (linefeed), the two-character sequence CRLF, an "anycrlf" convention,
which recognizes any of the preceding three types, and an "any" convention, in
which any Unicode line ending sequence is assumed to end a line. The Unicode
sequences are the three just mentioned, plus VT (vertical tab, U+000B), FF
(form feed, U+000C), NEL (next line, U+0085), LS (line separator, U+2028), and
PS (paragraph separator, U+2029).
<br>
<br>
When the PCRE library is built, a default line-ending sequence is specified.
This is normally the standard sequence for the operating system. Unless
otherwise specified by this option, <b>pcregrep</b> uses the library's default.
The possible values for this option are CR, LF, CRLF, ANYCRLF, or ANY. This
makes it possible to use <b>pcregrep</b> on files that have come from other
environments without having to modify their line endings. If the data that is
being scanned does not agree with the convention set by this option,
<b>pcregrep</b> may behave in strange ways.
</P>
<P>
<b>-n</b>, <b>--line-number</b>
Precede each output line by its line number in the file, followed by a colon
for matching lines or a hyphen for context lines. If the filename is also being
output, it precedes the line number. This option is forced if
<b>--line-offsets</b> is used.
</P>
<P>
<b>--no-jit</b>
If the PCRE library is built with support for just-in-time compiling (which
speeds up matching), <b>pcregrep</b> automatically makes use of this, unless it
was explicitly disabled at build time. This option can be used to disable the
use of JIT at run time. It is provided for testing and working round problems.
It should never be needed in normal use.
</P>
<P>
<b>-o</b>, <b>--only-matching</b>
Show only the part of the line that matched a pattern instead of the whole
line. In this mode, no context is shown. That is, the <b>-A</b>, <b>-B</b>, and
<b>-C</b> options are ignored. If there is more than one match in a line, each
of them is shown separately. If <b>-o</b> is combined with <b>-v</b> (invert the
sense of the match to find non-matching lines), no output is generated, but the
return code is set appropriately. If the matched portion of the line is empty,
nothing is output unless the file name or line number is being printed, in
which case they are shown on an otherwise empty line. This option is mutually
exclusive with <b>--file-offsets</b> and <b>--line-offsets</b>.
</P>
<P>
<b>-o</b><i>number</i>, <b>--only-matching</b>=<i>number</i>
Show only the part of the line that matched the capturing parentheses of the
given number. Up to 32 capturing parentheses are supported. Because these
options can be given without an argument (see above), if an argument is
present, it must be given in the same shell item, for example, -o3 or
--only-matching=2. The comments given for the non-argument case above also
apply to this case. If the specified capturing parentheses do not exist in the
pattern, or were not set in the match, nothing is output unless the file name
or line number is being printed.
</P>
<P>
<b>-q</b>, <b>--quiet</b>
Work quietly, that is, display nothing except error messages. The exit
status indicates whether or not any matches were found.
</P>
<P>
<b>-r</b>, <b>--recursive</b>
If any given path is a directory, recursively scan the files it contains,
taking note of any <b>--include</b> and <b>--exclude</b> settings. By default, a
directory is read as a normal file; in some operating systems this gives an
immediate end-of-file.
This option is a shorthand for setting the <b>-d</b>
option to "recurse".
</P>
<P>
<b>--recursion-limit</b>=<i>number</i>
See <b>--match-limit</b> above.
</P>
<P>
<b>-s</b>, <b>--no-messages</b>
Suppress error messages about non-existent or unreadable files. Such files are
quietly skipped. However, the return code is still 2, even if matches were
found in other files.
</P>
<P>
<b>-u</b>, <b>--utf-8</b>
Operate in UTF-8 mode. This option is available only if PCRE has been compiled
with UTF-8 support. Both patterns and subject lines must be valid strings of
UTF-8 characters.
</P>
<P>
<b>-V</b>, <b>--version</b>
Write the version numbers of <b>pcregrep</b> and the PCRE library that is being
used to the standard error stream.
</P>
<P>
<b>-v</b>, <b>--invert-match</b>
Invert the sense of the match, so that lines which do <i>not</i> match any of
the patterns are the ones that are found.
</P>
<P>
<b>-w</b>, <b>--word-regex</b>, <b>--word-regexp</b>
Force the patterns to match only whole words. This is equivalent to having \b
at the start and end of the pattern.
</P>
<P>
<b>-x</b>, <b>--line-regex</b>, <b>--line-regexp</b>
Force the patterns to be anchored (each must start matching at the beginning of
a line) and in addition, require them to match entire lines. This is
equivalent to having ^ and $ characters at the start and end of each
alternative branch in every pattern.
</P>
<br><a name="SEC6" href="#TOC1">ENVIRONMENT VARIABLES</a><br>
<P>
The environment variables <b>LC_ALL</b> and <b>LC_CTYPE</b> are examined, in that
order, for a locale. The first one that is set is used. This can be overridden
by the <b>--locale</b> option. If no locale is set, the PCRE library's default
(usually the "C" locale) is used.
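<br>
<br>
The precedence just described can be written out as a small Python function.
The name <b>choose_locale</b> and its arguments are hypothetical; the sketch
only restates the rule above and does not call any locale API.

```python
def choose_locale(environ, locale_option=None):
    """Return the locale name to use, or None for the library default."""
    # An explicit --locale value overrides the environment.
    if locale_option is not None:
        return locale_option
    # Otherwise LC_ALL is examined before LC_CTYPE.
    for name in ("LC_ALL", "LC_CTYPE"):
        if name in environ:
            return environ[name]
    # Nothing set: the PCRE library's default (usually "C") applies.
    return None


print(choose_locale({"LC_ALL": "fr_FR", "LC_CTYPE": "de_DE"}))
```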
</P>
<br><a name="SEC7" href="#TOC1">NEWLINES</a><br>
<P>
The <b>-N</b> (<b>--newline</b>) option allows <b>pcregrep</b> to scan files with
different newline conventions from the default. However, the setting of this
option does not affect the way in which <b>pcregrep</b> writes information to
the standard error and output streams. It uses the string "\n" in C
<b>printf()</b> calls to indicate newlines, relying on the C I/O library to
convert this to an appropriate sequence if the output is sent to a file.
</P>
<br><a name="SEC8" href="#TOC1">OPTIONS COMPATIBILITY</a><br>
<P>
Many of the short and long forms of <b>pcregrep</b>'s options are the same
as in the GNU <b>grep</b> program. Any long option of the form
<b>--xxx-regexp</b> (GNU terminology) is also available as <b>--xxx-regex</b>
(PCRE terminology). However, the <b>--file-list</b>, <b>--file-offsets</b>,
<b>--include-dir</b>, <b>--line-offsets</b>, <b>--locale</b>, <b>--match-limit</b>,
<b>-M</b>, <b>--multiline</b>, <b>-N</b>, <b>--newline</b>,
<b>--recursion-limit</b>, <b>-u</b>, and <b>--utf-8</b> options are specific to
<b>pcregrep</b>, as is the use of the <b>--only-matching</b> option with a
capturing parentheses number.
</P>
<P>
Although most of the common options work the same way, a few are different in
<b>pcregrep</b>. For example, the <b>--include</b> option's argument is a glob
for GNU <b>grep</b>, but a regular expression for <b>pcregrep</b>. If both the
<b>-c</b> and <b>-l</b> options are given, GNU grep lists only file names,
without counts, but <b>pcregrep</b> gives the counts.
</P>
<br><a name="SEC9" href="#TOC1">OPTIONS WITH DATA</a><br>
<P>
There are four different ways in which an option with data can be specified.
If a short form option is used, the data may follow immediately, or (with one
exception) in the next command line item.
For example:
<pre>
  -f/some/file
  -f /some/file
</pre>
The exception is the <b>-o</b> option, which may appear with or without data.
Because of this, if data is present, it must follow immediately in the same
item, for example -o3.
</P>
<P>
If a long form option is used, the data may appear in the same command line
item, separated by an equals character, or (with two exceptions) it may appear
in the next command line item. For example:
<pre>
  --file=/some/file
  --file /some/file
</pre>
Note, however, that if you want to supply a file name beginning with ~ as data
in a shell command, and have the shell expand ~ to a home directory, you must
separate the file name from the option, because the shell does not treat ~
specially unless it is at the start of an item.
</P>
<P>
The exceptions to the above are the <b>--colour</b> (or <b>--color</b>) and
<b>--only-matching</b> options, for which the data is optional. If one of these
options does have data, it must be given in the first form, using an equals
character. Otherwise <b>pcregrep</b> will assume that it has no data.
</P>
<br><a name="SEC10" href="#TOC1">MATCHING ERRORS</a><br>
<P>
It is possible to supply a regular expression that takes a very long time to
fail to match certain lines. Such patterns normally involve nested indefinite
repeats, for example: (a+)*\d when matched against a line of a's with no final
digit. The PCRE matching function has a resource limit that causes it to abort
in these circumstances. If this happens, <b>pcregrep</b> outputs an error
message and the line that caused the problem to the standard error stream. If
there are more than 20 such errors, <b>pcregrep</b> gives up.
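<br>
<br>
A back-of-envelope calculation shows why such patterns run into the limit.
The Python sketch below counts the ways in which (a+)* can divide a run of n
a's into non-empty groups, which is 2 to the power n-1. This is offered only
as an intuition for the growth rate: it is not the exact number of internal
<b>match()</b> calls, and the function names are hypothetical. The figure of
10 million is the default limit mentioned under <b>--match-limit</b>.

```python
def ways_to_split(n):
    """Number of ways to cut a run of n characters into non-empty groups."""
    return 2 ** (n - 1) if n > 0 else 1


def shortest_run_exceeding(limit):
    """Smallest run length whose number of splits is greater than limit."""
    n = 1
    while ways_to_split(n) <= limit:
        n += 1
    return n


print(shortest_run_exceeding(10_000_000))  # 25
```

On this rough model, a line of only 25 a's already offers more than 10 million
ways of grouping the letters before the pattern can be declared a failure.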
</P>
<P>
The <b>--match-limit</b> option of <b>pcregrep</b> can be used to set the overall
resource limit; there is a second option called <b>--recursion-limit</b> that
sets a limit on the amount of memory (usually stack) that is used (see the
discussion of these options above).
</P>
<br><a name="SEC11" href="#TOC1">DIAGNOSTICS</a><br>
<P>
Exit status is 0 if any matches were found, 1 if no matches were found, and 2
for syntax errors, overlong lines, non-existent or inaccessible files (even if
matches were found in other files) or too many matching errors. Using the
<b>-s</b> option to suppress error messages about inaccessible files does not
affect the return code.
</P>
<br><a name="SEC12" href="#TOC1">SEE ALSO</a><br>
<P>
<b>pcrepattern</b>(3), <b>pcretest</b>(1).
</P>
<br><a name="SEC13" href="#TOC1">AUTHOR</a><br>
<P>
Philip Hazel
<br>
University Computing Service
<br>
Cambridge CB2 3QH, England.
<br>
</P>
<br><a name="SEC14" href="#TOC1">REVISION</a><br>
<P>
Last updated: 04 March 2012
<br>
Copyright © 1997-2012 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE index page</a>.
</p>