354628 |
11-Nov-2019 |
kevans |
MFC bsdgrep(1) fixes: r320414, r328559, r332805-r332806, r332809, r332832, r332850-r332852, r332856, r332858, r332876, r333351, r334803, r334806-r334809, r334821, r334837, r334889, r335188, r351769, r352691
r320414: Expect :mmap_eof_not_eol to fail
It relies on a jemalloc feature (opt.redzone) no longer available after r319971.
r328559: Remove t_grep:mmap_eof_not_eol test
The test was marked as an expected failure in r320414 after r319971's import of a newer jemalloc removed an essential feature (opt.redzone) for reproducing the behavior it was testing. Since then, no way has been found or demonstrated to reliably test the behavior, so remove the test.
r332805: bsdgrep: Split match processing out of procfile
procfile is getting kind of hairy, and it's not going to get better as we correct some more bits that assume we process one line at a time.
r332806: bsdgrep: Clean up procmatches a little bit
r332809: bsdgrep: Add some TODOs for future work on operating on chunks
r332832: bsdgrep: Break procmatches down a little bit more
Split the matching and non-matching cases out into their own functions to reduce future complexity. As the name implies, procmatches will eventually process more than one match itself in the future.
r332850: bsdgrep: Some light cleanup
There's no point checking for a bunch of file modes if we're not a practicing believer of DIR_SKIP or DEV_SKIP.
This also reduces some style violations that were particularly ugly looking when browsing through.
r332851: bsdgrep: More trivial cleanup/style cleanup
We can avoid branching for these easily reduced patterns
r332852: bsdgrep: if chain => switch
This makes some of this a little easier to follow (in my opinion).
r332856: bsdgrep: Fix --include/--exclude ordering issues
Prior to r332851: * --exclude always win out over --include * --exclude-dir always wins out over --include-dir
r332851 broke that behavior, resulting in: * First of --exclude, --include wins * First of --exclude-dir, --include-dir wins
As it turns out, both behaviors are wrong by modern grep standards- the latest rule wins. e.g.:
`grep --exclude foo --include foo 'thing' foo` foo is included
`grep --include foo --exclude foo 'thing' foo` foo is excluded
As tested with GNU grep 3.1.
This commit makes bsdgrep follow this behavior.
r332858: bsdgrep: Use grep_strdup instead of grep_malloc+strcpy
r332876: bsdgrep: Fix build failure WITHOUT_LZMA (incorrect bracket placement)
r333351: bsdgrep: Allow "-" to be passed to -f to mean "standard input"
A version of this patch was originally sent to me by se@, matching behavior from newer versions of GNU grep.
While there have been some differences of opinion on whether stdin should be closed or not after depleting it in process of -f, I've opted to leave stdin open and just let the later matching stuff fail and result in a no-match. I'm not married to the current behavior- it was generally chosen since we are adopting this in particular from GNU grep, and I would like to stay consistent without a strong argument to the contrary. The current behavior isn't technically wrong, it's just fairly unfriendly to the developer-user of grep that may not realize their usage is trivially invalid.
r334803: netbsd-tests: grep(1): Add test for -c flag
Someone might be inclined to accidentally break this. someone might have written said test because they broke it locally.
r334806: bsdgrep(1): Do some less dirty things with return types
Neither procfile nor grep_tree return anything meaningful to their callers. None of the callers actually care about how many lines were matched in all of the files they processed; it's all about "did anything match?"
This is generally just a light refactoring to remind me of what actually matters as I'm rewriting these bits to care less about 'stuff'.
r334807: bsdgrep(1): whoops, garbage collect the now write-only variable
r334808: bsdgrep(1): Don't initialize fts_flags twice
Admittedly, this is a clang-scan complaint... but it wasn't wrong. fts_flags is initialized by all cases in the switch(), which should be fairly obvious. Annotate this anyways.
r334809: netbsd-tests: bsdgrep(1): Add a test for -m, too
r334821: bsdgrep(1): Slooowly peel away the chunky onion
(or peel off the band-aid, whatever floats your boat)
This addresses two separate issues:
1.) Nothing within bsdgrep actually knew whether it cared about line numbers or not.
2.) The file layer knew nothing about the context in which it was being called.
#1 is only important when we're *not* processing line-by-line. #2 is debatably a good idea; the parsing context is only handy because that's where we store current offset information and, as of this commit, whether or not it needs to be line-aware.
r334837: bsdgrep(1): Evict character sequence that moved in
r334889: bsdgrep(1): Some more int -> bool conversions and name changes
Again motivated by upcoming work to rewrite a bunch of this- single-letter variable names and slightly misleading variable names ("lastmatches" to indicate that the last matched) are not helpful.
r335188: bsdgrep(1): Remove redundant initialization; unconditionally assigned later
r351769: bsdgrep(1): add some basic tests for some GNU Extension support
These will be expanded later as I come up with good test cases; for now, these seem to be enough to trigger bugs in base gnugrep and expose missing features in bsdgrep.
r352691: bsdgrep(1): various fixes of empty pattern/exit code/-c behavior
When an empty pattern is encountered in the pattern list, I had previously broken bsdgrep to count that as a "match all" and ignore any other patterns in the list. This commit rectifies that mistake, among others:
- The -v flag semantics were not quite right; lines matched should have been counted differently based on whether the -v flag was set or not. procline now definitively returns whether it's matched or not, and interpreting that result has been kicked up a level. - Empty patterns with the -x flag was broken similarly to empty patterns with the -w flag. The former is a whole-line match and should be more strict, only matching blank lines. No -x and no -w will will match the empty string at the beginning of each line. - The exit code with -L was broken, w.r.t. modern grep. Modern grap will exit(0) if any file that didn't match was output, so our interpretation was simply backwards. The new interpretation makes sense to me.
Tests updated and added to try and catch some of this.
This misbehavior was found by autoconf while fixing ports found in PR 229925 expecting either a more sane or a more GNU-like sed. |
322625 |
17-Aug-2017 |
kevans |
bsdgrep: bump version number to 2.6.0 and update copyright information
MFC r319132: bsdgrep: bump version number and add Kyle Evans copyright
The following changes have been made over the last couple of months:
Features:
- With bsdgrep -r, the working directory is implied if no directory is specified - bsdgrep will now behave as bsdgrep -r does when it's named rgrep - bsdgrep now understands -z/--null-data to use \0 as EOL - GNU regex compatibility is now indicated with a "GNU compatible" in the version string
Fixes:
- --mmap no longer hangs when coming across an EOF without an accompanying EOL - -o/--color matching generally improved, now produces earliest / longest matches - Context output now more closely aligns with GNU grep - Zero-length matches no longer exhibit broken behavior - Every output line now honors -b/-H/-n flags
Tests have been added for previous regressions as well as other previously untested behaviors.
Various other fixes have been commited, and refactoring for further / later improvements has taken place.
(The original submission changed the version string to 2.5.2, but I decided to use 2.6.0 to reflect the addition of new features.)
MFC r320754: Update copyright e-mail address to @FreeBSD.org address
Approved by: emaste (mentor, blanket MFC) |
322587 |
16-Aug-2017 |
kevans |
bsdgrep: fix -w flag matching with an empty pattern
MFC r317703: bsdgrep: fix -w flag matching with an empty pattern
-w flag matching with an empty pattern was generally 'broken', allowing matches to occur on any line whether or not it actually matches -w criteria.
This fix required a good amount of refactoring to address. procline() is altered to *only* process the line and return whether it was a match or not, necessary to be able to short-circuit the whole function in case of this matchall flag. -m flag handling is moved out as well because it suffers from the same fate as context handling if we bypass any actual pattern matching.
The matching context (matches, mostly) didn't previously exist outside of procline(), so we go ahead and create context object for file processing bits to pass around. grep_printline() was created due to this, for the scenarios where the matches don't actually matter and we just want to print a line or two, a la flushing the context queue and no -o or --color specified.
Damage from this broken behavior would have been mitigated by the fact that it is unlikely users would invoke grep -w with an empty pattern.
This was identified while checking PR 105221 for problems it this may cause in BSD grep, but PR 105221 is *not* a report of this behavior.
MFC r317741: bsdgrep: correct uninitialized variable introduced in r317703
MFC r317842: bsdgrep: don't ouptut matches with -c, -l, -L
Refactoring done in r317703 broke -c, -l, and -L flags implying suppression of match printing. Fortunately this is just a matter of not doing any printing of the resulting matches and context printing was not broken in this refactoring.
Add some regression tests since this area may still see further refactoring, include different context flags as well even though they were not broken in this case.
PR: 219077 Approved by: emaste (mentor, blanket MFC) |
322564 |
16-Aug-2017 |
kevans |
bsdgrep: Use implied working directory for -r if no directories are passed
MFC r317050: bsdgrep: for -r, use the working directory if none specified
This is more sensible than the previous behaviour of grepping stdin, and matches newer GNU grep behaviour.
MFC r317300 (ngie): Only expect :grep_r_implied to pass with bsdgrep(1)
The test fails with gnu grep from base and ports.
MFC r319002 (ngie): :rgrep : use atf-check to check the exit code/save the output of grep -r instead of calling grep -r without it, and saving the output to a file
This ensures that any errors thrown via grep -r are caught, not lost, and uses existing atf-sh idioms for saving files.
PR: 216307 Approved by: emaste (mentor, blanket MFC) Relnotes: yes |
322555 |
16-Aug-2017 |
kevans |
bsdgrep: Fix matching behavior and add regression tests
MFC r316477: bsdgrep: fix matching behaviour
- Set REG_NOTBOL if we've already matched beginning of line and we're examining later parts
- For each pattern we examine, apply it to the remaining bits of the line rather than (potentially) smaller subsets
- Check for REG_NOSUB after we've looked at all patterns initially matching the line
- Keep track of the last match we made to later determine if we're simply not matching any longer or if we need to proceed another byte because we hit a zero-length match
- Match the earliest and longest bit of each line before moving the beginning of what we match to further in the line, past the end of the longest match; this generally matches how gnugrep(1) seems to behave, and seems like pretty good behavior to me
- Finally, bail out of printing any matches if we were set to print all (empty pattern) but -o (output matches) was set
MFC r316489: bsdgrep: Initialize vars to avoid a false positive GCC warning
MFC r316491: bsdgrep: revert color changes from r316477
r316477 changed the color output to match exactly the in-tree GNU grep, but introduces unnecessary escape sequences.
MFC r316536: bsdgrep: create additional tests for coverage on recent fixes
Create additional tests to cover regressions that were discovered by PRs linked to reviews D10098, D10102, and D10104.
It is worth noting that neither bsdgrep(1) nor gnugrep(1) in the base system currently pass all of these tests, and gnugrep(1) not quite being up to snuff was also noted in at least one of the PRs.
MFC r317052: bsdgrep: fix zero-length matches without the -o flag
r316477 broke zero-length matches when not using the -o flag, by skipping over them entirely.
Add a regression test so that it doesn't break again in the future.
PR: 175314, 180990, 181263, 195763, 197531, 197555, 202022, 209116 Approved by: emaste (mentor, blanket MFC) Relnotes: yes |
299094 |
04-May-2016 |
ngie |
Merge ^/user/ngie/release-pkg-fix-tests to unbreak how test files are installed after r298107
Summary of changes:
- Replace all instances of FILES/TESTS with ${PACKAGE}FILES. This ensures that namespacing is kept with FILES appropriately, and that this shouldn't need to be repeated if the namespace changes -- only the definition of PACKAGE needs to be changed - Allow PACKAGE to be overridden by callers instead of forcing it to always be `tests`. In the event we get to the point where things can be split up enough in the base system, it would make more sense to group the tests with the blocks they're a part of, e.g. byacc with byacc-tests, etc - Remove PACKAGE definitions where possible, i.e. where FILES wasn't used previously. - Remove unnecessary TESTSPACKAGE definitions; this has been elided into bsd.tests.mk - Remove unnecessary BINDIRs used previously with ${PACKAGE}FILES; ${PACKAGE}FILESDIR is now automatically defined in bsd.test.mk. - Fix installation of files under data/ subdirectories in lib/libc/tests/hash and lib/libc/tests/net/getaddrinfo - Remove unnecessary .include <bsd.own.mk>s (some opportunistic cleanup)
Document the proposed changes in share/examples/tests/tests/... via examples so it's clear that ${PACKAGES}FILES is the suggested way forward in terms of replacing FILES. share/mk/bsd.README didn't seem like the appropriate method of communicating that info.
MFC after: never probably X-MFC with: r298107 PR: 209114 Relnotes: yes Tested with: buildworld, installworld, checkworld; buildworld, packageworld Sponsored by: EMC / Isilon Storage Division
|