t: add lint-style.pl and convert grep to test_grep#2135
Open
mmontalbo wants to merge 6 commits into
Open
Conversation
3258ee2 to
cfa4313
Compare
test_grep is a wrapper around grep for test assertions that prints the file contents on failure for easier debugging. It also accepts '!' as its first argument for negation, which preserves the diagnostic output that '! test_grep' would suppress. Despite being widely used (and the preferred replacement for bare grep in assertions), test_grep has no entry in t/README alongside the other documented helpers like test_cmp and test_line_count. Add one. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
379dcce to
1f8f6e1
Compare
Move the Lexer, ShellParser, and ScriptParser packages from chainlint.pl into t/chainlint/lib-perl.pl so they can be reused by other tools. ScriptParser's check_test() is a no-op in the shared module; callers subclass ScriptParser and override it. chainlint.pl defines TestParser (&&-chain detection) and ChainlintParser (a ScriptParser subclass whose check_test runs TestParser and formats the results). These subclasses use 'our @isa' rather than 'use base' because they are compiled before do() loads the shared module at runtime, so the base classes do not yet exist at compile time. The shared module itself can use 'use base' because it is compiled during the do() call, after which ShellParser is available. A subsequent commit introduces lint-style.pl which needs the same shell parser to properly tokenize test scripts. Sharing the parser avoids reimplementing heredoc handling, $(...) nesting, pipe tracking, quoting, and test body extraction. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
When a $() command substitution spans multiple lines inside a double-quoted string, scan_dqstring's post-loop newline counter re-counts newlines that were already counted during the recursive parse of the $() body. This happens because scan_dollar includes "\n" tokens from the recursive parse in its synthetic join text, and scan_dqstring then counts those newlines again. The result is that the Lexer's lineno advances too far, causing all subsequent tokens to be reported at line numbers higher than their actual position in the body text. The drift equals the number of newlines inside $() substitutions within double-quoted strings. Fix this by filtering "\n" tokens out of scan_dollar's join for $(). The newlines in the synthetic text are meaningless for the token representation (it already flattens the $() to a single token), and removing them prevents the double-count. This does not affect chainlint's output because chainlint annotates the original body text using byte offsets, not token line numbers. It does matter for tools like lint-style.pl (introduced in a subsequent commit) that use token line numbers to locate and fix specific lines in the original file. Add check-chainlint-lineno.pl to verify that the Lexer reports correct line numbers after multi-line $() in double-quoted strings. The test is appended to the check-chainlint Makefile target since it validates the shared parser, not a standalone tool. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Add a mechanical lint checker for test scripts, similar in spirit to check-non-portable-shell.pl but focused on test conventions rather than portability. The tool defines LintParser, a subclass of ScriptParser (from the shared chainlint/lib-perl.pl module). ScriptParser's parse_cmd() finds test_expect_success blocks and calls check_test() for each body; LintParser overrides check_test() to run lint rules on the parsed commands. A "# lint-ok" comment suppresses all checks for intentional style violations. The first rule detects '! test_grep' and replaces it with 'test_grep !'. Shell-level negation suppresses the diagnostic output that test_grep prints on failure; the built-in negation preserves it. A fallback line scan also checks helper functions outside test bodies. All four existing violations are converted mechanically via --fix. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Three grep assertions were missing their file arguments, causing them to read from empty stdin instead of the intended file: - t2402: '! grep ...' should read from 'out', matching the grep on the preceding line. - t7507: the closing quote is in the wrong place, making the entire 'diff --git actual' a single pattern with no file argument instead of pattern 'diff --git' and file 'actual'. - t7700: '! grep ...' should read from 'packlist', matching the redirect on the preceding line. Without file arguments these greps always succeed (empty stdin matches nothing), so the assertions were not actually checking anything. All three tests pass with the corrected file arguments, confirming the intended behavior is sound. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Extend lint-style.pl with a rule that detects bare 'grep' used as a test assertion and converts it to test_grep. test_grep prints the file contents on failure, making test debugging significantly easier. The rule reuses the same parser-based architecture as the negation rule added earlier. parse_commands() is extended to track pipeline and redirection context so that grep used as a data filter (pipes, redirects, -c/-l/-L/-r flags, command substitution, control flow conditions, for-loop item lists) is distinguished from grep used as an assertion (checking exit status against a file argument). is_grep_assertion() classifies each grep command: convertible assertion (pattern and file present), filter (not an assertion), or missing file argument (assertion-level grep reading from empty stdin, flagged as a likely bug but not auto-fixed). The --fix mode handles: - Replacing 'grep' with 'test_grep' - Moving negation from '! grep' to 'test_grep !' - Stripping the -q flag (test_grep inherently checks match status) Since check_bare_grep handles '! grep' -> 'test_grep !' directly, the fallback raw-line scan for '! test_grep' outside test bodies (introduced in the previous negation-only commit) is no longer needed and is removed. Three files require '# lint-ok' annotations for intentional grep usage that cannot be mechanically converted: t3901 (piped stdin via case block), t6437 (glob argument breaks test_grep's test -f check), and t7527 ($? capture on the next line). The test-lint-style scope is extended to include sourced test fragments in subdirectories (t5411/*.sh and similar) via a new TSOURCED variable. Run '--fix' to convert all ~2800 grep assertions across ~340 files in the test suite. test-lib-functions.sh and lib-rebase.sh are excluded from linting since they implement test infrastructure rather than test assertions. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
1f8f6e1 to
5d5366a
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The test suite has a test_grep wrapper that prints file contents on
assertion failure, making debugging easier. Many tests still use
bare 'grep' for assertions, which silently swallows context on
failure.
This series adds a lint tool (lint-style.pl) to mechanically detect
and convert these, then applies it across the test suite.
The tool reuses the Lexer and ShellParser from chainlint's shared
parser module rather than reimplementing shell parsing. This
gives us proper handling of heredocs, $(...), pipes, quoting,
case patterns, and control flow structures for free. The only
new parsing logic is the grep assertion classifier, which works
on the already-parsed token stream.
The approach:
Positive matching: is_grep_assertion() examines parsed tokens
and classifies whether a grep looks like an assertion (PATTERN +
FILE) vs a filter (piped, redirected, inside $(), with -c/-l/-L).
Only assertions are converted.
Missing file detection: greps at assertion level with no file
argument (reading from empty stdin) are flagged as likely bugs.
Three pre-existing instances were found and fixed in patch 5.
Auto-fix: --fix applies the conversions mechanically. The -q
flag is dropped (test_grep checks status natively), and
negation is rewritten from '! grep' to 'test_grep !'.
Structure:
1/6 t/README: document test_grep helper
2/6 t: extract chainlint's parser into shared module
3/6 t: fix Lexer line count for $() inside double-quoted strings
4/6 t: add lint-style.pl with test_grep negation rule
5/6 t: fix grep assertions missing file arguments
6/6 t: lint and convert grep assertions to test_grep
Patch 2 extracts chainlint.pl's Lexer, ShellParser, and
ScriptParser into a shared module (t/chainlint/lib-perl.pl).
ScriptParser's check_test() becomes a no-op in the module;
chainlint.pl defines ChainlintParser (a ScriptParser subclass
that runs TestParser for &&-chain detection), and lint-style.pl
defines LintParser (a ScriptParser subclass that runs grep
lint rules). chainlint.pl loads the module via do() and
behaves identically.
Patch 3 fixes a pre-existing Lexer bug where multi-line $()
substitutions inside double-quoted strings caused line numbers to
drift. chainlint is unaffected (it uses byte offsets), but
lint-style.pl needs accurate token line numbers to locate and fix
specific lines in the original file.
Patch 4 introduces the lint-style.pl framework (LintParser
subclass, Makefile targets, fixture infrastructure) with a small,
complete rule (! test_grep -> test_grep !) so reviewers can see
the machinery in action before the bulk conversion.
Patch 5 fixes three test bugs where grep assertions were missing
their file arguments, causing them to pass vacuously (all three
pass with the corrected arguments). These were independently
discovered by the missing-file detection rule introduced in
patch 6.
Patch 6 adds the main rule, its fixtures, and the mechanical
conversion of ~2800 assertions across ~340 files, including sourced
test fragments in t/t5411/ and heredoc test bodies in t/t5564/.
To verify the conversion (patch 6 adds both the rule and the
mechanical conversion in the same commit, so apply the full
series and re-run --fix to confirm it produces no further
changes):
git checkout &&
perl t/lint-style.pl --fix t/t*.sh t/test-lib*.sh t/lib-.sh
t/-tests.sh t/perf/.sh t/t5411/.sh
As an independent completeness check:
grep -rn '^\s*!\sgrep\s' t/t*.sh t/lib-.sh t/-tests.sh
t/test-lib*.sh t/perf/*.sh |
grep -v test_grep | grep -v lint-ok | grep -v '#.*grep'
Known limitations:
Two greps in t/t3901 inside case branches that inherit piped
stdin from two lines above are suppressed with '# lint-ok'.
One grep in t/t6437 uses glob expansion (grep -q content *)
which breaks test_grep's file check; suppressed with '# lint-ok'.
One grep in t/t7527 captures $? for later use rather than
asserting inline; suppressed with '# lint-ok'.