Skip to content

t: add lint-style.pl and convert grep to test_grep#2135

Open
mmontalbo wants to merge 6 commits into
gitgitgadget:masterfrom
mmontalbo:mm/test-grep-docs
Open

t: add lint-style.pl and convert grep to test_grep#2135
mmontalbo wants to merge 6 commits into
gitgitgadget:masterfrom
mmontalbo:mm/test-grep-docs

Conversation

@mmontalbo
Copy link
Copy Markdown

@mmontalbo mmontalbo commented Jun 2, 2026

The test suite has a test_grep wrapper that prints file contents on
assertion failure, making debugging easier. Many tests still use
bare 'grep' for assertions, which silently swallows context on
failure.

This series adds a lint tool (lint-style.pl) to mechanically detect
and convert these, then applies it across the test suite.

The tool reuses the Lexer and ShellParser from chainlint's shared
parser module rather than reimplementing shell parsing. This
gives us proper handling of heredocs, $(...), pipes, quoting,
case patterns, and control flow structures for free. The only
new parsing logic is the grep assertion classifier, which works
on the already-parsed token stream.

The approach:

  • Positive matching: is_grep_assertion() examines parsed tokens
    and classifies whether a grep looks like an assertion (PATTERN +
    FILE) vs a filter (piped, redirected, inside $(), with -c/-l/-L).
    Only assertions are converted.

  • Missing file detection: greps at assertion level with no file
    argument (reading from empty stdin) are flagged as likely bugs.
    Three pre-existing instances were found and fixed in patch 5.

  • Auto-fix: --fix applies the conversions mechanically. The -q
    flag is dropped (test_grep checks status natively), and
    negation is rewritten from '! grep' to 'test_grep !'.

Structure:

1/6 t/README: document test_grep helper
2/6 t: extract chainlint's parser into shared module
3/6 t: fix Lexer line count for $() inside double-quoted strings
4/6 t: add lint-style.pl with test_grep negation rule
5/6 t: fix grep assertions missing file arguments
6/6 t: lint and convert grep assertions to test_grep

Patch 2 extracts chainlint.pl's Lexer, ShellParser, and
ScriptParser into a shared module (t/chainlint/lib-perl.pl).
ScriptParser's check_test() becomes a no-op in the module;
chainlint.pl defines ChainlintParser (a ScriptParser subclass
that runs TestParser for &&-chain detection), and lint-style.pl
defines LintParser (a ScriptParser subclass that runs grep
lint rules). chainlint.pl loads the module via do() and
behaves identically.

Patch 3 fixes a pre-existing Lexer bug where multi-line $()
substitutions inside double-quoted strings caused line numbers to
drift. chainlint is unaffected (it uses byte offsets), but
lint-style.pl needs accurate token line numbers to locate and fix
specific lines in the original file.

Patch 4 introduces the lint-style.pl framework (LintParser
subclass, Makefile targets, fixture infrastructure) with a small,
complete rule (! test_grep -> test_grep !) so reviewers can see
the machinery in action before the bulk conversion.

Patch 5 fixes three test bugs where grep assertions were missing
their file arguments, causing them to pass vacuously (all three
pass with the corrected arguments). These were independently
discovered by the missing-file detection rule introduced in
patch 6.

Patch 6 adds the main rule, its fixtures, and the mechanical
conversion of ~2800 assertions across ~340 files, including sourced
test fragments in t/t5411/ and heredoc test bodies in t/t5564/.

To verify the conversion (patch 6 adds both the rule and the
mechanical conversion in the same commit, so apply the full
series and re-run --fix to confirm it produces no further
changes):

git checkout &&
perl t/lint-style.pl --fix t/t*.sh t/test-lib*.sh t/lib-.sh
t/
-tests.sh t/perf/.sh t/t5411/.sh

As an independent completeness check:

grep -rn '^\s*!\sgrep\s' t/t*.sh t/lib-.sh t/-tests.sh
t/test-lib*.sh t/perf/*.sh |
grep -v test_grep | grep -v lint-ok | grep -v '#.*grep'

Known limitations:

  • Two greps in t/t3901 inside case branches that inherit piped
    stdin from two lines above are suppressed with '# lint-ok'.

  • One grep in t/t6437 uses glob expansion (grep -q content *)
    which breaks test_grep's file check; suppressed with '# lint-ok'.

  • One grep in t/t7527 captures $? for later use rather than
    asserting inline; suppressed with '# lint-ok'.

@mmontalbo mmontalbo force-pushed the mm/test-grep-docs branch 2 times, most recently from 3258ee2 to cfa4313 Compare June 3, 2026 02:33
test_grep is a wrapper around grep for test assertions that prints
the file contents on failure for easier debugging.  It also accepts
'!' as its first argument for negation, which preserves the
diagnostic output that '! test_grep' would suppress.

Despite being widely used (and the preferred replacement for bare
grep in assertions), test_grep has no entry in t/README alongside
the other documented helpers like test_cmp and test_line_count.
Add one.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
@mmontalbo mmontalbo force-pushed the mm/test-grep-docs branch 26 times, most recently from 379dcce to 1f8f6e1 Compare June 3, 2026 10:58
Move the Lexer, ShellParser, and ScriptParser packages from
chainlint.pl into t/chainlint/lib-perl.pl so they can be reused
by other tools.  ScriptParser's check_test() is a no-op in the
shared module; callers subclass ScriptParser and override it.

chainlint.pl defines TestParser (&&-chain detection) and
ChainlintParser (a ScriptParser subclass whose check_test runs
TestParser and formats the results).  These subclasses use
'our @isa' rather than 'use base' because they are compiled
before do() loads the shared module at runtime, so the base
classes do not yet exist at compile time.  The shared module
itself can use 'use base' because it is compiled during the
do() call, after which ShellParser is available.

A subsequent commit introduces lint-style.pl which needs the same
shell parser to properly tokenize test scripts.  Sharing the parser
avoids reimplementing heredoc handling, $(...) nesting, pipe
tracking, quoting, and test body extraction.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
mmontalbo added 4 commits June 3, 2026 04:10
When a $() command substitution spans multiple lines inside a
double-quoted string, scan_dqstring's post-loop newline counter
re-counts newlines that were already counted during the recursive
parse of the $() body.  This happens because scan_dollar includes
"\n" tokens from the recursive parse in its synthetic join text,
and scan_dqstring then counts those newlines again.

The result is that the Lexer's lineno advances too far, causing all
subsequent tokens to be reported at line numbers higher than their
actual position in the body text.  The drift equals the number of
newlines inside $() substitutions within double-quoted strings.

Fix this by filtering "\n" tokens out of scan_dollar's join for
$().  The newlines in the synthetic text are meaningless for the
token representation (it already flattens the $() to a single
token), and removing them prevents the double-count.

This does not affect chainlint's output because chainlint annotates
the original body text using byte offsets, not token line numbers.
It does matter for tools like lint-style.pl (introduced in a
subsequent commit) that use token line numbers to locate and fix
specific lines in the original file.

Add check-chainlint-lineno.pl to verify that the Lexer reports
correct line numbers after multi-line $() in double-quoted strings.
The test is appended to the check-chainlint Makefile target since
it validates the shared parser, not a standalone tool.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Add a mechanical lint checker for test scripts, similar in spirit to
check-non-portable-shell.pl but focused on test conventions rather
than portability.

The tool defines LintParser, a subclass of ScriptParser (from the
shared chainlint/lib-perl.pl module).  ScriptParser's parse_cmd()
finds test_expect_success blocks and calls check_test() for each
body; LintParser overrides check_test() to run lint rules on the
parsed commands.  A "# lint-ok" comment suppresses all checks for
intentional style violations.

The first rule detects '! test_grep' and replaces it with
'test_grep !'.  Shell-level negation suppresses the diagnostic
output that test_grep prints on failure; the built-in negation
preserves it.  A fallback line scan also checks helper functions
outside test bodies.  All four existing violations are converted
mechanically via --fix.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Three grep assertions were missing their file arguments, causing
them to read from empty stdin instead of the intended file:

- t2402: '! grep ...' should read from 'out', matching the
  grep on the preceding line.
- t7507: the closing quote is in the wrong place, making the
  entire 'diff --git actual' a single pattern with no file
  argument instead of pattern 'diff --git' and file 'actual'.
- t7700: '! grep ...' should read from 'packlist', matching
  the redirect on the preceding line.

Without file arguments these greps always succeed (empty stdin
matches nothing), so the assertions were not actually checking
anything.  All three tests pass with the corrected file arguments,
confirming the intended behavior is sound.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Extend lint-style.pl with a rule that detects bare 'grep' used as a
test assertion and converts it to test_grep.  test_grep prints the
file contents on failure, making test debugging significantly easier.

The rule reuses the same parser-based architecture as the negation
rule added earlier.  parse_commands() is extended to track pipeline
and redirection context so that grep used as a data filter (pipes,
redirects, -c/-l/-L/-r flags, command substitution, control flow
conditions, for-loop item lists) is distinguished from grep used as
an assertion (checking exit status against a file argument).

is_grep_assertion() classifies each grep command: convertible
assertion (pattern and file present), filter (not an assertion), or
missing file argument (assertion-level grep reading from empty
stdin, flagged as a likely bug but not auto-fixed).

The --fix mode handles:
 - Replacing 'grep' with 'test_grep'
 - Moving negation from '! grep' to 'test_grep !'
 - Stripping the -q flag (test_grep inherently checks match status)

Since check_bare_grep handles '! grep' -> 'test_grep !' directly,
the fallback raw-line scan for '! test_grep' outside test bodies
(introduced in the previous negation-only commit) is no longer
needed and is removed.

Three files require '# lint-ok' annotations for intentional grep
usage that cannot be mechanically converted: t3901 (piped stdin via
case block), t6437 (glob argument breaks test_grep's test -f
check), and t7527 ($? capture on the next line).

The test-lint-style scope is extended to include sourced test
fragments in subdirectories (t5411/*.sh and similar) via a new
TSOURCED variable.

Run '--fix' to convert all ~2800 grep assertions across ~340 files
in the test suite.  test-lib-functions.sh and lib-rebase.sh are
excluded from linting since they implement test infrastructure
rather than test assertions.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
@mmontalbo mmontalbo force-pushed the mm/test-grep-docs branch from 1f8f6e1 to 5d5366a Compare June 3, 2026 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant