<h1>TRE API reference manual</h1>

<h2>The <tt>regcomp()</tt> functions</h2>
<a name="regcomp"></a>

<div class="code">
<code>
#include &lt;tre/regex.h&gt;
<br>
<br>
<font class="type">int</font>
<font class="func">regcomp</font>(<font
class="type">regex_t</font> *<font class="arg">preg</font>,
<font class="qual">const</font> <font class="type">char</font>
*<font class="arg">regex</font>, <font class="type">int</font>
<font class="arg">cflags</font>);
<br>
<font class="type">int</font> <font
class="func">regncomp</font>(<font class="type">regex_t</font>
*<font class="arg">preg</font>, <font class="qual">const</font>
<font class="type">char</font> *<font class="arg">regex</font>,
<font class="type">size_t</font> <font class="arg">len</font>,
<font class="type">int</font> <font class="arg">cflags</font>);
<br>
<font class="type">int</font> <font
class="func">regwcomp</font>(<font class="type">regex_t</font>
*<font class="arg">preg</font>, <font class="qual">const</font>
<font class="type">wchar_t</font> *<font
class="arg">regex</font>, <font class="type">int</font> <font
class="arg">cflags</font>);
<br>
<font class="type">int</font> <font
class="func">regwncomp</font>(<font class="type">regex_t</font>
*<font class="arg">preg</font>, <font class="qual">const</font>
<font class="type">wchar_t</font> *<font
class="arg">regex</font>, <font class="type">size_t</font>
<font class="arg">len</font>, <font class="type">int</font>
<font class="arg">cflags</font>);
<br>
<font class="type">void</font> <font
class="func">regfree</font>(<font class="type">regex_t</font>
*<font class="arg">preg</font>);
<br>
</code>
</div>

<p>
The <tt><font class="func">regcomp</font>()</tt> function compiles
the regex string pointed to by <tt><font
class="arg">regex</font></tt> to an internal representation and
stores the result in the pattern buffer structure pointed to by
<tt><font
class="arg">preg</font></tt>. The <tt><font
class="func">regncomp</font>()</tt> function is like <tt><font
class="func">regcomp</font>()</tt>, but <tt><font
class="arg">regex</font></tt> is not terminated with a null
byte. Instead, the <tt><font class="arg">len</font></tt> argument
gives the length of the string, and the string may contain
null bytes. The <tt><font class="func">regwcomp</font>()</tt> and
<tt><font class="func">regwncomp</font>()</tt> functions work like
<tt><font class="func">regcomp</font>()</tt> and <tt><font
class="func">regncomp</font>()</tt>, respectively, but take a
wide-character (<tt><font class="type">wchar_t</font></tt>) string
instead of a byte string.
</p>

<p>
The <tt><font class="arg">cflags</font></tt> argument is the
bitwise inclusive OR of zero or more of the following flags (defined
in the header <tt>&lt;tre/regex.h&gt;</tt>):
</p>

<blockquote>
<dl>
<dt><tt>REG_EXTENDED</tt></dt>
<dd>Use POSIX Extended Regular Expression (ERE) compatible syntax when
compiling <tt><font class="arg">regex</font></tt>. The default
syntax is the POSIX Basic Regular Expression (BRE) syntax, but it is
considered obsolete.</dd>

<dt><tt>REG_ICASE</tt></dt>
<dd>Ignore case. Subsequent searches with the <a
href="#regexec"><tt>regexec</tt></a> family of functions using this
pattern buffer will be case insensitive.</dd>

<dt><tt>REG_NOSUB</tt></dt>
<dd>Do not report submatches. Subsequent searches with the <a
href="#regexec"><tt>regexec</tt></a> family of functions will only
report whether a match was found or not and will not fill the submatch
array.</dd>

<dt><tt>REG_NEWLINE</tt></dt>
<dd>Normally the newline character is treated as an ordinary
character.
When this flag is used, the newline character
(<tt>'\n'</tt>, ASCII code 10) is treated specially as follows:
<ol>
<li>The match-any-character operator (dot <tt>"."</tt> outside a
bracket expression) does not match a newline.</li>
<li>A non-matching list (<tt>[^...]</tt>) not containing a newline
does not match a newline.</li>
<li>The match-beginning-of-line operator <tt>^</tt> matches the empty
string immediately after a newline as well as the empty string at the
beginning of the string (but see the <code>REG_NOTBOL</code>
<code>regexec()</code> flag below).</li>
<li>The match-end-of-line operator <tt>$</tt> matches the empty
string immediately before a newline as well as the empty string at the
end of the string (but see the <code>REG_NOTEOL</code>
<code>regexec()</code> flag below).</li>
</ol>
</dd>

<dt><tt>REG_LITERAL</tt></dt>
<dd>Interpret the entire <tt><font class="arg">regex</font></tt>
argument as a literal string, that is, all characters will be
considered ordinary. This is a nonstandard extension, compatible with
but not specified by POSIX.</dd>

<dt><tt>REG_NOSPEC</tt></dt>
<dd>Same as <tt>REG_LITERAL</tt>. This flag is provided for
compatibility with BSD.</dd>

<dt><tt>REG_RIGHT_ASSOC</tt></dt>
<dd>By default, concatenation is left associative in TRE, as per
the grammar given in the <a
href="http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap09.html">base
specifications on regular expressions</a> of Std 1003.1-2001 (POSIX).
This flag flips associativity of concatenation to right associative.
Associativity can have an effect on how a match is divided into
submatches, but does not change what is matched by the entire regexp.
</dd>

<dt><tt>REG_UNGREEDY</tt></dt>
<dd>By default, repetition operators are greedy in TRE as per Std 1003.1-2001 (POSIX) and
can be forced to be non-greedy by appending a <tt>?</tt> character.
This flag reverses this behavior
by making the operators non-greedy by default and greedy when a <tt>?</tt> is specified.</dd>
</dl>
</blockquote>

<p>
After a successful call to <tt><font class="func">regcomp</font></tt>, the
<tt><font class="arg">preg</font></tt> pattern buffer can be used to
search for matches in strings (see below). Once the pattern buffer is no
longer needed, it should be passed to <tt><font
class="func">regfree</font></tt> to free the memory allocated for it.
</p>


<p>
The <tt><font class="type">regex_t</font></tt> structure has the
following field that the application can read:
</p>
<blockquote>
<dl>
<dt><tt><font class="type">size_t</font> <font
class="arg">re_nsub</font></tt></dt>
<dd>Number of parenthesized subexpressions in <tt><font
class="arg">regex</font></tt>.
</dd>
</dl>
</blockquote>

<p>
The <tt><font class="func">regcomp</font></tt> function returns
zero if the compilation was successful, or one of the following error
codes if there was an error:
</p>
<blockquote>
<dl>
<dt><tt>REG_BADPAT</tt></dt>
<dd>Invalid regexp. TRE returns this only if a multibyte character
set is used in the current locale, and <tt><font
class="arg">regex</font></tt> contained an invalid multibyte
sequence.</dd>
<dt><tt>REG_ECOLLATE</tt></dt>
<dd>Invalid collating element referenced.
TRE returns this whenever
equivalence classes or multicharacter collating elements are used in
bracket expressions (they are not supported yet).</dd>
<dt><tt>REG_ECTYPE</tt></dt>
<dd>Unknown character class name in <tt>[[:<i>name</i>:]]</tt>.</dd>
<dt><tt>REG_EESCAPE</tt></dt>
<dd>The last character of <tt><font class="arg">regex</font></tt>
was a backslash (<tt>\</tt>).</dd>
<dt><tt>REG_ESUBREG</tt></dt>
<dd>Invalid back reference; number in <tt>\<i>digit</i></tt>
invalid.</dd>
<dt><tt>REG_EBRACK</tt></dt>
<dd><tt>[]</tt> imbalance.</dd>
<dt><tt>REG_EPAREN</tt></dt>
<dd><tt>\(\)</tt> or <tt>()</tt> imbalance.</dd>
<dt><tt>REG_EBRACE</tt></dt>
<dd><tt>\{\}</tt> or <tt>{}</tt> imbalance.</dd>
<dt><tt>REG_BADBR</tt></dt>
<dd><tt>{}</tt> content invalid: not a number, more than two numbers,
first larger than second, or number too large.</dd>
<dt><tt>REG_ERANGE</tt></dt>
<dd>Invalid character range, e.g. ending point is earlier in the
collating order than the starting point.</dd>
<dt><tt>REG_ESPACE</tt></dt>
<dd>Out of memory, or an internal limit exceeded.</dd>
<dt><tt>REG_BADRPT</tt></dt>
<dd>Invalid use of repetition operators: two or more repetition operators have
been chained in an undefined way.</dd>
</dl>
</blockquote>


<h2>The <tt>regexec()</tt> functions</h2>
<a name="regexec"></a>

<div class="code">
<code>
#include &lt;tre/regex.h&gt;
<br>
<br>
<font class="type">int</font> <font
class="func">regexec</font>(<font class="qual">const</font>
<font class="type">regex_t</font> *<font
class="arg">preg</font>, <font class="qual">const</font> <font
class="type">char</font> *<font class="arg">string</font>,
<font class="type">size_t</font> <font
class="arg">nmatch</font>,
<br>
<font class="type">regmatch_t</font> <font
class="arg">pmatch</font>[], <font class="type">int</font>
<font class="arg">eflags</font>);
<br>
<font
class="type">int</font> <font
class="func">regnexec</font>(<font class="qual">const</font>
<font class="type">regex_t</font> *<font
class="arg">preg</font>, <font class="qual">const</font> <font
class="type">char</font> *<font class="arg">string</font>,
<font class="type">size_t</font> <font class="arg">len</font>,
<br>
<font class="type">size_t</font> <font
class="arg">nmatch</font>, <font class="type">regmatch_t</font>
<font class="arg">pmatch</font>[], <font
class="type">int</font> <font class="arg">eflags</font>);
<br>
<font class="type">int</font> <font
class="func">regwexec</font>(<font class="qual">const</font>
<font class="type">regex_t</font> *<font
class="arg">preg</font>, <font class="qual">const</font> <font
class="type">wchar_t</font> *<font class="arg">string</font>,
<font class="type">size_t</font> <font
class="arg">nmatch</font>,
<br>
<font class="type">regmatch_t</font> <font
class="arg">pmatch</font>[], <font class="type">int</font>
<font class="arg">eflags</font>);
<br>
<font class="type">int</font> <font
class="func">regwnexec</font>(<font class="qual">const</font>
<font class="type">regex_t</font> *<font
class="arg">preg</font>, <font class="qual">const</font> <font
class="type">wchar_t</font> *<font class="arg">string</font>,
<font class="type">size_t</font> <font class="arg">len</font>,
<br>

<font class="type">size_t</font> <font
class="arg">nmatch</font>, <font class="type">regmatch_t</font>
<font class="arg">pmatch</font>[], <font
class="type">int</font> <font class="arg">eflags</font>);
</code>
</div>

<p>
The <tt><font class="func">regexec</font>()</tt> function matches
the null-terminated string against the compiled regexp <tt><font
class="arg">preg</font></tt>, initialized by a previous call to
any one of the <a href="#regcomp"><tt>regcomp</tt></a> functions.
The
<tt><font class="func">regnexec</font>()</tt> function is like
<tt><font class="func">regexec</font>()</tt>, but <tt><font
class="arg">string</font></tt> is not terminated with a null byte.
Instead, the <tt><font class="arg">len</font></tt> argument
gives the length of the string, and the string may contain null
bytes. The <tt><font class="func">regwexec</font>()</tt> and
<tt><font class="func">regwnexec</font>()</tt> functions work like
<tt><font class="func">regexec</font>()</tt> and <tt><font
class="func">regnexec</font>()</tt>, respectively, but take a
wide-character (<tt><font class="type">wchar_t</font></tt>) string
instead of a byte string. The <tt><font
class="arg">eflags</font></tt> argument is a bitwise OR of zero or
more of the following flags:
</p>
<blockquote>
<dl>
<dt><code>REG_NOTBOL</code></dt>
<dd>
<p>
When this flag is used, the match-beginning-of-line operator
<tt>^</tt> does not match the empty string at the beginning of
<tt><font class="arg">string</font></tt>. If
<code>REG_NEWLINE</code> was used when compiling
<tt><font class="arg">preg</font></tt>, the empty string
immediately after a newline character will still be matched.
</p>
</dd>

<dt><code>REG_NOTEOL</code></dt>
<dd>
<p>
When this flag is used, the match-end-of-line operator
<tt>$</tt> does not match the empty string at the end of
<tt><font class="arg">string</font></tt>. If
<code>REG_NEWLINE</code> was used when compiling
<tt><font class="arg">preg</font></tt>, the empty string
immediately before a newline character will still be matched.
</p>
</dd>

</dl>

<p>
These flags are useful when different portions of a string are passed
to <code>regexec</code> and the beginning or end of the partial string
should not be interpreted as the beginning or end of a line.
</p>

</blockquote>

<p>
If <code>REG_NOSUB</code> was used when compiling <tt><font
class="arg">preg</font></tt>, <tt><font
class="arg">nmatch</font></tt> is zero, or <tt><font
class="arg">pmatch</font></tt> is <code>NULL</code>, then the
<tt><font class="arg">pmatch</font></tt> argument is ignored.
Otherwise, the submatches corresponding to the parenthesized
subexpressions are stored in the elements of <tt><font
class="arg">pmatch</font></tt>, which must be dimensioned to have
at least <tt><font class="arg">nmatch</font></tt> elements.
</p>

<p>
The <tt><font class="type">regmatch_t</font></tt> structure contains
at least the following fields:
</p>
<blockquote>
<dl>
<dt><tt><font class="type">regoff_t</font> <font
class="arg">rm_so</font></tt></dt>
<dd>Offset from start of <tt><font class="arg">string</font></tt> to start of
substring.</dd>
<dt><tt><font class="type">regoff_t</font> <font
class="arg">rm_eo</font></tt></dt>
<dd>Offset from start of <tt><font class="arg">string</font></tt> to the first
character after the substring.</dd>
</dl>
</blockquote>

<p>
The length of a submatch can be computed by subtracting <code>rm_so</code> from
<code>rm_eo</code>. If a parenthesized subexpression did not participate in a
match, the <code>rm_so</code> and <code>rm_eo</code> fields for the
corresponding <code>pmatch</code> element are set to <code>-1</code>. Note
that when a multibyte character set is in effect, the submatch offsets are
given as byte offsets, not character offsets.
</p>

<p>
The <code>regexec()</code> functions return zero if a match was found,
otherwise they return <code>REG_NOMATCH</code> to indicate no match,
or <code>REG_ESPACE</code> to indicate that enough temporary memory
could not be allocated to complete the matching operation.
</p>



<h3>reguexec()</h3>

<div class="code">
<code>
#include &lt;tre/regex.h&gt;
<br>
<br>
<font class="qual">typedef struct</font> {
<br>
&nbsp;&nbsp;<font class="type">int</font> (*get_next_char)(<font
class="type">tre_char_t</font> *<font class="arg">c</font>, <font
class="type">unsigned int</font> *<font class="arg">pos_add</font>,
<font class="type">void</font> *<font class="arg">context</font>);
<br>
&nbsp;&nbsp;<font class="type">void</font> (*rewind)(<font
class="type">size_t</font> <font class="arg">pos</font>, <font
class="type">void</font> *<font class="arg">context</font>);
<br>
&nbsp;&nbsp;<font class="type">int</font> (*compare)(<font
class="type">size_t</font> <font class="arg">pos1</font>, <font
class="type">size_t</font> <font class="arg">pos2</font>, <font
class="type">size_t</font> <font class="arg">len</font>, <font
class="type">void</font> *<font class="arg">context</font>);
<br>
&nbsp;&nbsp;<font class="type">void</font> *<font
class="arg">context</font>;
<br>
} <font class="type">tre_str_source</font>;
<br>
<br>
<font class="type">int</font> <font
class="func">reguexec</font>(<font class="qual">const</font>
<font class="type">regex_t</font> *<font
class="arg">preg</font>, <font class="qual">const</font> <font
class="type">tre_str_source</font> *<font class="arg">string</font>,
<font class="type">size_t</font> <font class="arg">nmatch</font>,
<br>
<font class="type">regmatch_t</font> <font
class="arg">pmatch</font>[], <font class="type">int</font>
<font class="arg">eflags</font>);
</code>
</div>

<p>
The <tt><font class="func">reguexec</font>()</tt> function works just
like the other <tt>regexec()</tt> functions, except that the input
string is read from user-specified callback functions instead of a
character array. This makes it possible, for example, to match
regexps over arbitrary user-specified data structures.
</p>

<p>
The <tt><font class="type">tre_str_source</font></tt> structure
contains the following fields:
</p>
<blockquote>
<dl>
<dt><tt>get_next_char</tt></dt>
<dd>This function must retrieve the next available character. If a
character is not available, the space pointed to by
<tt><font class="arg">c</font></tt> must be set to zero and the function
must return a nonzero value. If a character is available, it must be stored
to the space pointed to by
<tt><font class="arg">c</font></tt>, the integer pointed to by
<tt><font class="arg">pos_add</font></tt> must be set to the
number of units advanced in the input (the value must be
<tt>&gt;=1</tt>), and zero must be returned.</dd>

<dt><tt>rewind</tt></dt>
<dd>This function must rewind the input stream to the position
specified by <tt><font class="arg">pos</font></tt>. Unless the regexp
uses back references, <tt>rewind</tt> is not needed and can be set to
<tt>NULL</tt>.</dd>

<dt><tt>compare</tt></dt>
<dd>This function compares two substrings in the input stream
starting at the positions specified by <tt><font
class="arg">pos1</font></tt> and <tt><font
class="arg">pos2</font></tt> of length <tt><font
class="arg">len</font></tt>. If the substrings are equal,
<tt>compare</tt> must return zero, otherwise a nonzero value must be
returned. Unless the regexp uses back references, <tt>compare</tt> is
not needed and can be set to <tt>NULL</tt>.</dd>

<dt><tt>context</tt></dt>
<dd>This is a context variable, passed as the last argument to
all of the above functions for keeping track of the internal state of
the user's code.</dd>

</dl>
</blockquote>

<p>
The position in the input stream is measured in <tt><font
class="type">size_t</font></tt> units.
The current position is the
sum of the increments obtained from <tt><font
class="arg">pos_add</font></tt> (plus the position of the last
<tt>rewind</tt>, if any). The starting position is zero. Submatch
positions stored in the <tt><font class="arg">pmatch</font>[]</tt>
array are given using positions computed in this way.
</p>

<p>
For an example of how to use <tt>reguexec()</tt>, see the
<tt>tests/test-str-source.c</tt> file in the TRE source code
distribution.
</p>

<h2>The approximate matching functions</h2>
<a name="regaexec"></a>

<div class="code">
<code>
#include &lt;tre/regex.h&gt;
<br>
<br>
<font class="qual">typedef struct</font> {<br>
&nbsp;&nbsp;<font class="type">int</font>
<font class="arg">cost_ins</font>;<br>
&nbsp;&nbsp;<font class="type">int</font>
<font class="arg">cost_del</font>;<br>
&nbsp;&nbsp;<font class="type">int</font>
<font class="arg">cost_subst</font>;<br>
&nbsp;&nbsp;<font class="type">int</font>
<font class="arg">max_cost</font>;<br><br>
&nbsp;&nbsp;<font class="type">int</font>
<font class="arg">max_ins</font>;<br>
&nbsp;&nbsp;<font class="type">int</font>
<font class="arg">max_del</font>;<br>
&nbsp;&nbsp;<font class="type">int</font>
<font class="arg">max_subst</font>;<br>
&nbsp;&nbsp;<font class="type">int</font>
<font class="arg">max_err</font>;<br>
} <font class="type">regaparams_t</font>;<br>
<br>
<font class="qual">typedef struct</font> {<br>
&nbsp;&nbsp;<font class="type">size_t</font>
<font class="arg">nmatch</font>;<br>
&nbsp;&nbsp;<font class="type">regmatch_t</font>
*<font class="arg">pmatch</font>;<br>
&nbsp;&nbsp;<font class="type">int</font>
<font class="arg">cost</font>;<br>
&nbsp;&nbsp;<font class="type">int</font>
<font class="arg">num_ins</font>;<br>
&nbsp;&nbsp;<font class="type">int</font>
<font class="arg">num_del</font>;<br>
&nbsp;&nbsp;<font class="type">int</font>
<font class="arg">num_subst</font>;<br>
} <font class="type">regamatch_t</font>;<br>
<br>
<font class="type">int</font> <font
class="func">regaexec</font>(<font class="qual">const</font>
<font class="type">regex_t</font> *<font
class="arg">preg</font>, <font class="qual">const</font> <font
class="type">char</font> *<font class="arg">string</font>,<br>

<font class="type">regamatch_t</font>
*<font class="arg">match</font>,
<font class="type">regaparams_t</font>
<font class="arg">params</font>,
<font class="type">int</font>
<font class="arg">eflags</font>);
<br>
<font class="type">int</font> <font
class="func">reganexec</font>(<font class="qual">const</font>
<font class="type">regex_t</font> *<font
class="arg">preg</font>, <font class="qual">const</font> <font
class="type">char</font> *<font class="arg">string</font>,
<font class="type">size_t</font> <font class="arg">len</font>,<br>

<font class="type">regamatch_t</font>
*<font class="arg">match</font>,
<font class="type">regaparams_t</font>
<font class="arg">params</font>,
<font class="type">int</font> <font class="arg">eflags</font>);
<br>
<font class="type">int</font> <font
class="func">regawexec</font>(<font class="qual">const</font>
<font class="type">regex_t</font> *<font
class="arg">preg</font>, <font class="qual">const</font> <font
class="type">wchar_t</font> *<font class="arg">string</font>,<br>

<font class="type">regamatch_t</font>
*<font class="arg">match</font>,
<font class="type">regaparams_t</font>
<font class="arg">params</font>,
<font class="type">int</font>
<font class="arg">eflags</font>);
<br>
<font class="type">int</font>
<font class="func">regawnexec</font>(
<font class="qual">const</font>
<font class="type">regex_t</font>
*<font class="arg">preg</font>,
<font class="qual">const</font>
<font class="type">wchar_t</font>
*<font class="arg">string</font>,
<font class="type">size_t</font>
<font class="arg">len</font>,<br>

<font class="type">regamatch_t</font>
*<font
class="arg">match</font>,
<font class="type">regaparams_t</font>
<font class="arg">params</font>,
<font class="type">int</font>
<font class="arg">eflags</font>);
<br>
</code>
</div>

<p>
The <tt><font class="func">regaexec</font>()</tt> function searches for
the best match in <tt><font class="arg">string</font></tt>
against the compiled regexp <tt><font
class="arg">preg</font></tt>, initialized by a previous call to
any one of the <a href="#regcomp"><tt>regcomp</tt></a> functions.
</p>

<p>
The <tt><font class="func">reganexec</font>()</tt> function is like
<tt><font class="func">regaexec</font>()</tt>, but <tt><font
class="arg">string</font></tt> is not terminated by a null byte.
Instead, the <tt><font class="arg">len</font></tt> argument
gives the length of the string, and the string may contain null
bytes. The <tt><font class="func">regawexec</font>()</tt> and
<tt><font class="func">regawnexec</font>()</tt> functions work like
<tt><font class="func">regaexec</font>()</tt> and <tt><font
class="func">reganexec</font>()</tt>, respectively, but take a
wide-character (<tt><font class="type">wchar_t</font></tt>) string instead
of a byte string.
</p>

<p>
The <tt><font class="arg">eflags</font></tt> argument has the same
meaning as for the <tt>regexec()</tt> functions.
</p>

<p>
The <tt><font class="arg">params</font></tt> struct controls the
approximate matching parameters:
</p>
<blockquote>
<dl>
  <dt><tt><font class="type">int</font></tt>
  <tt><font class="arg">cost_ins</font></tt></dt>
  <dd>The default cost of an inserted character, that is, an extra
  character in <tt><font class="arg">string</font></tt>.</dd>

  <dt><tt><font class="type">int</font></tt>
  <tt><font class="arg">cost_del</font></tt></dt>
  <dd>The default cost of a deleted character, that is, a character
  missing from <tt><font class="arg">string</font></tt>.</dd>

  <dt><tt><font class="type">int</font></tt>
  <tt><font class="arg">cost_subst</font></tt></dt>
  <dd>The default cost of a substituted character.</dd>

  <dt><tt><font class="type">int</font></tt>
  <tt><font class="arg">max_cost</font></tt></dt>
  <dd>The maximum allowed cost of a match. If this is set to zero,
  an exact match is searched for, and the results are equivalent to
  those returned by the <tt>regexec()</tt> functions.</dd>

  <dt><tt><font class="type">int</font></tt>
  <tt><font class="arg">max_ins</font></tt></dt>
  <dd>Maximum allowed number of inserted characters.</dd>

  <dt><tt><font class="type">int</font></tt>
  <tt><font class="arg">max_del</font></tt></dt>
  <dd>Maximum allowed number of deleted characters.</dd>

  <dt><tt><font class="type">int</font></tt>
  <tt><font class="arg">max_subst</font></tt></dt>
  <dd>Maximum allowed number of substituted characters.</dd>

  <dt><tt><font class="type">int</font></tt>
  <tt><font class="arg">max_err</font></tt></dt>
  <dd>Maximum allowed number of errors (inserts + deletes +
  substitutes).</dd>
</dl>
</blockquote>

<p>
The <tt><font class="arg">match</font></tt> argument points to a
<tt><font class="type">regamatch_t</font></tt> structure.
The
<tt><font class="arg">nmatch</font></tt> and <tt><font
class="arg">pmatch</font></tt> fields must be filled in by the caller. If
<code>REG_NOSUB</code> was used when compiling the regexp, or
<code>match-&gt;nmatch</code> is zero, or
<code>match-&gt;pmatch</code> is <code>NULL</code>, the
<code>match-&gt;pmatch</code> field is ignored. Otherwise, the
submatches corresponding to the parenthesized subexpressions are
stored in the elements of <code>match-&gt;pmatch</code>, which must be
dimensioned to have at least <code>match-&gt;nmatch</code> elements.
The <code>match-&gt;cost</code> field is set to the cost of the match
found, and the <code>match-&gt;num_ins</code>,
<code>match-&gt;num_del</code>, and <code>match-&gt;num_subst</code>
fields are set to the number of inserts, deletes, and substitutes in
the match, respectively.
</p>

<p>
The <tt>regaexec()</tt> functions return zero if a match with cost
not exceeding <code>params.max_cost</code> was found, otherwise
they return <code>REG_NOMATCH</code> to indicate no match, or
<code>REG_ESPACE</code> to indicate that enough temporary memory could
not be allocated to complete the matching operation.
</p>

<h2>Miscellaneous</h2>

<div class="code">
<code>
#include &lt;tre/regex.h&gt;
<br>
<br>
<font class="type">int</font> <font
class="func">tre_have_backrefs</font>(<font class="qual">const</font>
<font class="type">regex_t</font> *<font class="arg">preg</font>);
<br>
<font class="type">int</font> <font
class="func">tre_have_approx</font>(<font class="qual">const</font>
<font class="type">regex_t</font> *<font class="arg">preg</font>);
<br>
</code>
</div>

<p>
The <tt><font class="func">tre_have_backrefs</font>()</tt> and
<tt><font class="func">tre_have_approx</font>()</tt> functions return
1 if the compiled pattern has back references or uses approximate
matching, respectively, and 0 if not.
</p>


<h2>Checking build time options</h2>

<a name="tre_config"></a>
<div class="code">
<code>
#include &lt;tre/regex.h&gt;
<br>
<br>
<font class="type">char</font> *<font
class="func">tre_version</font>(<font class="type">void</font>);
<br>
<font class="type">int</font> <font
class="func">tre_config</font>(<font class="type">int</font> <font
class="arg">query</font>, <font class="type">void</font> *<font
class="arg">result</font>);
<br>
</code>
</div>

<p>
The <tt><font class="func">tre_config</font>()</tt> function can be
used to find out which optional features have been
compiled into the TRE library and to retrieve other parameters that
may change between releases.
</p>

<p>
The <tt><font class="arg">query</font></tt> argument is an integer
that tells what information is requested. The <tt><font
class="arg">result</font></tt> argument is a pointer to a variable
where the information is returned. The return value of a call to
<tt><font class="func">tre_config</font>()</tt> is zero if <tt><font
class="arg">query</font></tt> was recognized, <tt>REG_NOMATCH</tt> otherwise.
</p>

<p>
The following values are recognized for <tt><font
class="arg">query</font></tt>:
</p>

<blockquote>
<dl>
<dt><tt>TRE_CONFIG_APPROX</tt></dt>
<dd>The result is an integer that is set to one if approximate
matching support is available, zero if not.</dd>
<dt><tt>TRE_CONFIG_WCHAR</tt></dt>
<dd>The result is an integer that is set to one if wide character
support is available, zero if not.</dd>
<dt><tt>TRE_CONFIG_MULTIBYTE</tt></dt>
<dd>The result is an integer that is set to one if multibyte character
set support is available, zero if not.</dd>
<dt><tt>TRE_CONFIG_SYSTEM_ABI</tt></dt>
<dd>The result is an integer that is set to one if TRE has been
compiled to be compatible with the system regex ABI, zero if not.</dd>
<dt><tt>TRE_CONFIG_VERSION</tt></dt>
<dd>The result is a pointer to a static character string that gives
the version of the TRE library.</dd>
</dl>
</blockquote>


<p>
The <tt><font class="func">tre_version</font>()</tt> function returns
a short human readable character string which shows the software name,
version, and license.
</p>

<h2>Preprocessor definitions</h2>

<p>The header <tt>&lt;tre/regex.h&gt;</tt> defines certain
C preprocessor symbols.</p>

<h3>Version information</h3>

<p>The following definitions may be useful for checking whether a new
enough version is being used. Note that it is recommended to use the
<tt>pkg-config</tt> tool for version and other checks in Autoconf
scripts.</p>

<blockquote>
<dl>
<dt><tt>TRE_VERSION</tt></dt>
<dd>The version string.
</dd>

<dt><tt>TRE_VERSION_1</tt></dt>
<dd>The major version number (first part of version string).</dd>

<dt><tt>TRE_VERSION_2</tt></dt>
<dd>The minor version number (second part of version string).</dd>

<dt><tt>TRE_VERSION_3</tt></dt>
<dd>The micro version number (third part of version string).</dd>

</dl>
</blockquote>

<h3>Features</h3>

<p>The following definitions may be useful for checking whether all
necessary features are enabled. Use these only if compile time
checking suffices (linking statically with TRE). When linking
dynamically, <a href="#tre_config"><tt>tre_config()</tt></a> should be used
instead.</p>

<blockquote>
<dl>
<dt><tt>TRE_APPROX</tt></dt>
<dd>This is defined if approximate matching support is enabled. The
prototypes for approximate matching functions are defined only if
<tt>TRE_APPROX</tt> is defined.</dd>

<dt><tt>TRE_WCHAR</tt></dt>
<dd>This is defined if wide character support is enabled. The
prototypes for wide character matching functions are defined only if
<tt>TRE_WCHAR</tt> is defined.</dd>

<dt><tt>TRE_MULTIBYTE</tt></dt>
<dd>This is defined if multibyte character set support is enabled.
If this is not defined, any locale settings are ignored, and the default
locale is used when parsing regexps and matching strings.</dd>

</dl>
</blockquote>
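
<p>
As a closing illustration of the compile, match, and free cycle
documented in this manual, the following is a minimal sketch and not
code taken from the TRE distribution. It assumes the system's POSIX
<tt>&lt;regex.h&gt;</tt> header so that it builds without TRE
installed; with TRE the include would be
<tt>&lt;tre/regex.h&gt;</tt>. The helper <tt>find_group()</tt>, the
<tt>MAX_GROUPS</tt> limit, and the <tt>GROUP_OUT_OF_RANGE</tt> code
are hypothetical names introduced only for this example.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <regex.h>   /* ASSUMPTION: POSIX header; with TRE use <tre/regex.h> */

#define MAX_GROUPS 10              /* hypothetical limit for this sketch */
#define GROUP_OUT_OF_RANGE (-100)  /* hypothetical code, not a TRE or POSIX value */

/*
 * Hypothetical helper: compile `pattern` as an ERE, search `text`, and
 * store the byte offsets of submatch `group` (0 = whole match) in
 * *start and *end.  Returns 0 on a match, REG_NOMATCH if nothing
 * matched, a regcomp() error code if compilation failed, or
 * GROUP_OUT_OF_RANGE if `group` does not exist in the pattern.
 */
int find_group(const char *pattern, const char *text, size_t group,
               long *start, long *end)
{
    regex_t preg;
    regmatch_t pmatch[MAX_GROUPS];
    int rc;

    assert(pattern != NULL && text != NULL);
    assert(start != NULL && end != NULL);

    rc = regcomp(&preg, pattern, REG_EXTENDED);
    if (rc != 0)
        return rc;               /* compilation failed; preg is unusable */

    /* re_nsub counts parenthesized subexpressions; index 0 is the whole match. */
    if (group >= MAX_GROUPS || group > preg.re_nsub) {
        regfree(&preg);
        return GROUP_OUT_OF_RANGE;
    }

    rc = regexec(&preg, text, MAX_GROUPS, pmatch, 0);
    if (rc == 0) {
        /* rm_so and rm_eo are -1 if the group did not participate in the match. */
        *start = (long)pmatch[group].rm_so;
        *end = (long)pmatch[group].rm_eo;
    }

    regfree(&preg);              /* release the pattern buffer once it is no longer needed */
    return rc;
}
```

<p>
For instance, <tt>find_group("([a-z]+)@([a-z]+)", "mail bob@example now", 2, &amp;s, &amp;e)</tt>
returns zero and sets <tt>s</tt> to 9 and <tt>e</tt> to 16, so the
length of the submatch is <tt>e - s</tt>, here 7.
</p>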