History log of /seL4-test-master/projects/musllibc/src/regex/regexec.c
Revision Date Author Comments
# aee6abb2 05-Oct-2016 Rich Felker <dalias@aerifal.cx>

fix regexec with haystack strings longer than INT_MAX

we inherited from TRE regexec code that's utterly wrong with respect
to the integer types it's using. while it doesn't appear that
compilers are producing unsafe output, signed integer overflows seem
to happen, and regexec fails to find matches past offset INT_MAX.

this patch fixes the type of all variables/fields used to store
offsets in the string from int to regoff_t. after the changes, basic
testing showed that regexec can now find matches past 2GB (INT_MAX)
and past 4GB on x86_64, and code generation is unchanged on i386.
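
To illustrate why the offset type matters, here is a minimal sketch, not taken from the patch. The helper names `offset_fits_int` and `offset_fits_regoff` are hypothetical, and the second one assumes `regoff_t` is a signed two's-complement type without padding bits. The width of `regoff_t` is implementation-specific, so the second helper may give the same answer as the first on some C libraries.

```c
#include <limits.h>
#include <regex.h>
#include <stdint.h>

/* Hypothetical helper: can an int-typed offset field hold position `pos`? */
static int offset_fits_int(uintmax_t pos)
{
    return pos <= (uintmax_t)INT_MAX;
}

/* Hypothetical helper: can a regoff_t-typed offset field hold `pos`?
 * Assumes regoff_t is signed, two's complement, with no padding bits. */
static int offset_fits_regoff(uintmax_t pos)
{
    uintmax_t max = (UINTMAX_C(1) << (sizeof(regoff_t) * CHAR_BIT - 1)) - 1;
    return pos <= max;
}
```

A match starting one byte past INT_MAX cannot be recorded in an `int` field, which is the situation the commit message describes.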


# c3edc06d 06-Oct-2016 Rich Felker <dalias@aerifal.cx>

fix missing integer overflow checks in regexec buffer size computations

most of the possible overflows were already ruled out in practice by
regcomp having already succeeded performing larger allocations.
however at least the num_states*num_tags multiplication can clearly
overflow in practice. for safety, check them all, and use the proper
type, size_t, rather than int.

also improve comments, use calloc in place of malloc+memset, and
remove bogus casts.
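
The kind of check described above can be sketched as follows. This is an illustrative example under stated assumptions, not the code from the commit: `checked_mul` and `alloc_tag_table` are hypothetical names, and only the `num_states*num_tags` product comes from the message.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Store a*b in *out and return 0, or return -1 if the product
 * would not fit in size_t. */
static int checked_mul(size_t a, size_t b, size_t *out)
{
    if (b != 0 && a > SIZE_MAX / b)
        return -1;
    *out = a * b;
    return 0;
}

/* Hypothetical allocator for a zeroed num_states x num_tags table.
 * The element count is checked explicitly; calloc is relied on to
 * reject an overflowing count*elem_size instead of wrapping. */
static void *alloc_tag_table(size_t num_states, size_t num_tags,
                             size_t elem_size)
{
    size_t count;
    if (checked_mul(num_states, num_tags, &count) != 0)
        return NULL;
    return calloc(count, elem_size);
}
```

Using `calloc` here replaces a separate `malloc` plus `memset`, matching the cleanup the message mentions.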


# 546f6b32 05-Sep-2014 Szabolcs Nagy <nsz@port70.net>

fix memory leak in regexec when input contains illegal sequence


# 72ed3d47 17-Jul-2014 Rich Felker <dalias@aerifal.cx>

fix crash in regexec for nonzero nmatch argument with REG_NOSUB

per POSIX, the nmatch and pmatch arguments are ignored when the regex
was compiled with REG_NOSUB.
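
A small sketch of the calling pattern in question, using only the standard POSIX regex functions. The wrapper name `matches_nosub` is hypothetical. It passes a nonzero nmatch and a real pmatch array to a regex compiled with REG_NOSUB, which a conforming regexec should ignore.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical wrapper: returns 1 on match, 0 on no match,
 * -1 on a compile or execution error. */
static int matches_nosub(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t pm[2];   /* passed but expected to be ignored */
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    /* nmatch is 2 here even though the regex was compiled with
     * REG_NOSUB; the contents of pm must not be relied upon. */
    rc = regexec(&re, text, 2, pm, 0);
    regfree(&re);
    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```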


# ae4b0b96 31-Jan-2013 Rich Felker <dalias@aerifal.cx>

revert regex "cleanup" that seems unjustified and may break backtracking

it's not clear to me at the moment whether the code that was removed
(and which is now being re-added) is needed, but it's far from being a
no-op, and i don't want to risk breaking regex in this release.


# dd959163 13-Jan-2013 Szabolcs Nagy <nsz@port70.net>

regex: remove an unused local variable from regexec

pos_start local variable is not used in tre_tnfa_run_backtrack


# 400c5e5c 06-Sep-2012 Rich Felker <dalias@aerifal.cx>

use restrict everywhere it's required by c99 and/or posix 2008

to deal with the fact that the public headers may be used with pre-c99
compilers, __restrict is used in place of restrict, and defined
appropriately for any supported compiler. we also avoid the form
[restrict] since older versions of gcc rejected it due to a bug in the
original c99 standard, and instead use the form *restrict.


# b9dd43db 14-Apr-2012 Rich Felker <dalias@aerifal.cx>

fix signedness error handling invalid multibyte sequences in regexec

the "< 0" test was always false due to use of an unsigned type. this
resulted in infinite loops on 32-bit machines (adding -1U to a pointer
is the same as adding -1) and crashes on 64-bit machines (offsetting
the string pointer by 4gb-1b when an illegal sequence was hit).
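
The class of bug described above can be reproduced in a few lines. This is a simplified sketch, not the regexec code: `decode_len` is a hypothetical stand-in for a multibyte decoder that returns -1 on an illegal sequence, and the two checker functions differ only in the type used to hold its result.

```c
#include <stddef.h>

/* Hypothetical decoder: returns the number of bytes consumed,
 * or -1 for an illegal sequence (here: any byte >= 0x80). */
static int decode_len(unsigned char c)
{
    return c < 0x80 ? 1 : -1;
}

/* Buggy: the result is stored in an unsigned type, so -1 becomes
 * UINT_MAX and the "< 0" test can never be true. */
static int detects_error_unsigned(unsigned char c)
{
    unsigned int len = decode_len(c);
    return len < 0;
}

/* Fixed: a signed type preserves the -1 error value. */
static int detects_error_signed(unsigned char c)
{
    int len = decode_len(c);
    return len < 0;
}
```

With the unsigned variant the error goes undetected and the converted value is later added to the string pointer, which is how the loops and crashes in the message arise.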


# ad47d45e 20-Mar-2012 Rich Felker <dalias@aerifal.cx>

upgrade to latest upstream TRE regex code (0.8.0)

the main practical results of this change are
1. the regex code is no longer subject to LGPL; it's now 2-clause BSD
2. most (all?) popular nonstandard regex extensions are supported

I hesitate to call this a "sync" since both the old and new code are
heavily modified. in one sense, the old code was "more severely"
modified, in that it was actively hostile to non-strictly-conforming
expressions. on the other hand, the new code has eliminated the
useless translation of the entire regex string to wchar_t prior to
compiling, and now only converts multibyte character literals as
needed.

in the future i may use this modified TRE as a basis for writing the
long-planned new regex engine that will avoid multibyte-to-wide
character conversion entirely by compiling multibyte bracket
expressions specific to UTF-8.


# 74f75541 07-Apr-2011 Rich Felker <dalias@aerifal.cx>

fix bug in TRE found by clang (typo && instead of &)
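
The message does not show the affected expression, so the following is only a generic sketch of this kind of typo. The flag names and both functions are hypothetical. A logical `&&` against a nonzero constant is true for any nonzero left operand, whereas the intended bitwise `&` tests a single bit.

```c
/* Hypothetical flag bits, for illustration only. */
enum { FLAG_A = 1, FLAG_B = 2 };

/* Typo: logical AND. True whenever flags is nonzero,
 * regardless of whether FLAG_B is set. */
static int has_flag_b_buggy(int flags)
{
    return flags && FLAG_B;
}

/* Intended: bitwise AND isolates the FLAG_B bit. */
static int has_flag_b(int flags)
{
    return (flags & FLAG_B) != 0;
}
```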


# 0b44a031 11-Feb-2011 Rich Felker <dalias@aerifal.cx>

initial check-in, version 0.5.0