t: add lint-style.pl and convert grep to test_grep by mmontalbo · Pull Request #2135 · gitgitgadget/git

mmontalbo · 2026-06-02T21:21:30Z

The test suite has a test_grep wrapper that prints file contents on
assertion failure, making debugging easier. Many tests still use
bare 'grep' for assertions, which silently swallows context on
failure.

This series adds a lint tool (lint-style.pl) to mechanically detect
and convert these, then applies it across the test suite.

The tool reuses the Lexer and ShellParser from chainlint's shared
parser module rather than reimplementing shell parsing. This
gives us proper handling of heredocs, $(...), pipes, quoting,
case patterns, and control flow structures for free. The only
new parsing logic is the grep assertion classifier, which works
on the already-parsed token stream.

The approach:

Positive matching: is_grep_assertion() examines parsed tokens
and classifies whether a grep looks like an assertion (PATTERN +
FILE) vs a filter (piped, redirected, inside $(), with -c/-l/-L).
Only assertions are converted.
Missing file detection: greps at assertion level with no file
argument (reading from empty stdin) are flagged as likely bugs.
Three pre-existing instances were found and fixed in patch 5.
Auto-fix: --fix applies the conversions mechanically. The -q
flag is dropped (test_grep checks status natively), and
negation is rewritten from '! grep' to 'test_grep !'.

Structure:

1/6 t/README: document test_grep helper
2/6 t: extract chainlint's parser into shared module
3/6 t: fix Lexer line count for $() inside double-quoted strings
4/6 t: add lint-style.pl with test_grep negation rule
5/6 t: fix grep assertions missing file arguments
6/6 t: lint and convert grep assertions to test_grep

Patch 2 extracts chainlint.pl's Lexer, ShellParser, and
ScriptParser into a shared module (t/chainlint/lib-perl.pl).
ScriptParser's check_test() becomes a no-op in the module;
chainlint.pl defines ChainlintParser (a ScriptParser subclass
that runs TestParser for &&-chain detection), and lint-style.pl
defines LintParser (a ScriptParser subclass that runs grep
lint rules). chainlint.pl loads the module via do() and
behaves identically.

Patch 3 fixes a pre-existing Lexer bug where multi-line $()
substitutions inside double-quoted strings caused line numbers to
drift. chainlint is unaffected (it uses byte offsets), but
lint-style.pl needs accurate token line numbers to locate and fix
specific lines in the original file.

Patch 4 introduces the lint-style.pl framework (LintParser
subclass, Makefile targets, fixture infrastructure) with a small,
complete rule (! test_grep -> test_grep !) so reviewers can see
the machinery in action before the bulk conversion.

Patch 5 fixes three test bugs where grep assertions were missing
their file arguments, causing them to pass vacuously (all three
pass with the corrected arguments). These were independently
discovered by the missing-file detection rule introduced in
patch 6.

Patch 6 adds the main rule, its fixtures, and the mechanical
conversion of ~2800 assertions across ~340 files, including sourced
test fragments in t/t5411/ and heredoc test bodies in t/t5564/.

To verify the conversion (patch 6 adds both the rule and the
mechanical conversion in the same commit, so apply the full
series and re-run --fix to confirm it produces no further
changes):

git checkout &&
perl t/lint-style.pl --fix t/t*.sh t/test-lib*.sh t/lib-.sh
t/-tests.sh t/perf/.sh t/t5411/.sh

As an independent completeness check:

grep -rn '^\s*!\sgrep\s' t/t*.sh t/lib-.sh t/-tests.sh
t/test-lib*.sh t/perf/*.sh |
grep -v test_grep | grep -v lint-ok | grep -v '#.*grep'

Known limitations:

Two greps in t/t3901 inside case branches that inherit piped
stdin from two lines above are suppressed with '# lint-ok'.
One grep in t/t6437 uses glob expansion (grep -q content *)
which breaks test_grep's file check; suppressed with '# lint-ok'.
One grep in t/t7527 captures $? for later use rather than
asserting inline; suppressed with '# lint-ok'.

test_grep is a wrapper around grep for test assertions that prints the file contents on failure for easier debugging. It also accepts '!' as its first argument for negation, which preserves the diagnostic output that '! test_grep' would suppress. Despite being widely used (and the preferred replacement for bare grep in assertions), test_grep has no entry in t/README alongside the other documented helpers like test_cmp and test_line_count. Add one. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>

@isa

Move the Lexer, ShellParser, and ScriptParser packages from chainlint.pl into t/chainlint/lib-perl.pl so they can be reused by other tools. ScriptParser's check_test() is a no-op in the shared module; callers subclass ScriptParser and override it. chainlint.pl defines TestParser (&&-chain detection) and ChainlintParser (a ScriptParser subclass whose check_test runs TestParser and formats the results). These subclasses use 'our @isa' rather than 'use base' because they are compiled before do() loads the shared module at runtime, so the base classes do not yet exist at compile time. The shared module itself can use 'use base' because it is compiled during the do() call, after which ShellParser is available. A subsequent commit introduces lint-style.pl which needs the same shell parser to properly tokenize test scripts. Sharing the parser avoids reimplementing heredoc handling, $(...) nesting, pipe tracking, quoting, and test body extraction. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>

When a $() command substitution spans multiple lines inside a double-quoted string, scan_dqstring's post-loop newline counter re-counts newlines that were already counted during the recursive parse of the $() body. This happens because scan_dollar includes "\n" tokens from the recursive parse in its synthetic join text, and scan_dqstring then counts those newlines again. The result is that the Lexer's lineno advances too far, causing all subsequent tokens to be reported at line numbers higher than their actual position in the body text. The drift equals the number of newlines inside $() substitutions within double-quoted strings. Fix this by filtering "\n" tokens out of scan_dollar's join for $(). The newlines in the synthetic text are meaningless for the token representation (it already flattens the $() to a single token), and removing them prevents the double-count. This does not affect chainlint's output because chainlint annotates the original body text using byte offsets, not token line numbers. It does matter for tools like lint-style.pl (introduced in a subsequent commit) that use token line numbers to locate and fix specific lines in the original file. Add check-chainlint-lineno.pl to verify that the Lexer reports correct line numbers after multi-line $() in double-quoted strings. The test is appended to the check-chainlint Makefile target since it validates the shared parser, not a standalone tool. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>

Add a mechanical lint checker for test scripts, similar in spirit to check-non-portable-shell.pl but focused on test conventions rather than portability. The tool defines LintParser, a subclass of ScriptParser (from the shared chainlint/lib-perl.pl module). ScriptParser's parse_cmd() finds test_expect_success blocks and calls check_test() for each body; LintParser overrides check_test() to run lint rules on the parsed commands. A "# lint-ok" comment suppresses all checks for intentional style violations. The first rule detects '! test_grep' and replaces it with 'test_grep !'. Shell-level negation suppresses the diagnostic output that test_grep prints on failure; the built-in negation preserves it. A fallback line scan also checks helper functions outside test bodies. All four existing violations are converted mechanically via --fix. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>

Three grep assertions were missing their file arguments, causing them to read from empty stdin instead of the intended file: - t2402: '! grep ...' should read from 'out', matching the grep on the preceding line. - t7507: the closing quote is in the wrong place, making the entire 'diff --git actual' a single pattern with no file argument instead of pattern 'diff --git' and file 'actual'. - t7700: '! grep ...' should read from 'packlist', matching the redirect on the preceding line. Without file arguments these greps always succeed (empty stdin matches nothing), so the assertions were not actually checking anything. All three tests pass with the corrected file arguments, confirming the intended behavior is sound. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>

Extend lint-style.pl with a rule that detects bare 'grep' used as a test assertion and converts it to test_grep. test_grep prints the file contents on failure, making test debugging significantly easier. The rule reuses the same parser-based architecture as the negation rule added earlier. parse_commands() is extended to track pipeline and redirection context so that grep used as a data filter (pipes, redirects, -c/-l/-L/-r flags, command substitution, control flow conditions, for-loop item lists) is distinguished from grep used as an assertion (checking exit status against a file argument). is_grep_assertion() classifies each grep command: convertible assertion (pattern and file present), filter (not an assertion), or missing file argument (assertion-level grep reading from empty stdin, flagged as a likely bug but not auto-fixed). The --fix mode handles: - Replacing 'grep' with 'test_grep' - Moving negation from '! grep' to 'test_grep !' - Stripping the -q flag (test_grep inherently checks match status) Since check_bare_grep handles '! grep' -> 'test_grep !' directly, the fallback raw-line scan for '! test_grep' outside test bodies (introduced in the previous negation-only commit) is no longer needed and is removed. Three files require '# lint-ok' annotations for intentional grep usage that cannot be mechanically converted: t3901 (piped stdin via case block), t6437 (glob argument breaks test_grep's test -f check), and t7527 ($? capture on the next line). The test-lint-style scope is extended to include sourced test fragments in subdirectories (t5411/*.sh and similar) via a new TSOURCED variable. Run '--fix' to convert all ~2800 grep assertions across ~340 files in the test suite. test-lib-functions.sh and lib-rebase.sh are excluded from linting since they implement test infrastructure rather than test assertions. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>

mmontalbo force-pushed the mm/test-grep-docs branch 2 times, most recently from 3258ee2 to cfa4313 Compare June 3, 2026 02:33

mmontalbo force-pushed the mm/test-grep-docs branch 26 times, most recently from 379dcce to 1f8f6e1 Compare June 3, 2026 10:58

mmontalbo added 4 commits June 3, 2026 04:10

mmontalbo force-pushed the mm/test-grep-docs branch from 1f8f6e1 to 5d5366a Compare June 3, 2026 11:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

t: add lint-style.pl and convert grep to test_grep#2135

t: add lint-style.pl and convert grep to test_grep#2135
mmontalbo wants to merge 6 commits into
gitgitgadget:masterfrom
mmontalbo:mm/test-grep-docs

mmontalbo commented Jun 2, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mmontalbo commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

mmontalbo commented Jun 2, 2026 •

edited

Loading