feat: add readonly mode with bash safety and spawn child filtering by ofriw · Pull Request #7 · agenticoding/pi-agenticoding

ofriw · 2026-05-27T05:09:22Z

Note: This PR was generated by an AI agent. If you would like to talk with other humans, drop by our Discord!

Readonly Mode: OS-Level Sandboxing with Token-Efficient Guardrails

What changed and why

This PR implements the readonly mode requested in issue #2, but significantly stronger than the original proposal. Instead of relying solely on command-pattern filtering – which models consistently escaped by spawning interpreters (python -c, node -e, etc.) or using indirect write mechanisms – we built a two-layer enforcement with an OS-level sandbox as the primary barrier.

On macOS, commands run through sandbox-exec with a Seatbelt profile that denies all file-write operations except to the OS temp dir and /dev/null. On Linux, bubblewrap (bwrap) mounts the root filesystem read-only with a writable tmpfs overlay at the temp dir. On Windows (no native sandboxing), or when the OS tool is missing, we fall back to a comprehensively rewritten command-pattern classifier that understands shell pipelines, redirects, command substitutions, interpreter -c flags, xargs, sed -i, package manager mutations, and more.

Beyond bash, the PR blocks write, edit, and handoff at the tool-call layer, with a cache-aware strategy: parent sessions keep these tools in the tool list (to avoid context-cache invalidation) and block at call time, while child spawn sessions remove them entirely since they start fresh. We also gated the /handoff command in readonly mode, added TUI indicators, watchdog nudges, and context-hook messages.

Also tuned the context primer for token efficiency: added a Plan-then-execute section, simplified primacy-zone guidance, and rephrased spawn section for directness.

Value and implications

For projects running token-cost-sensitive coding agents, this is a meaningful UX and cost optimization. The model keeps its entire tool list (maximizing cache hit rates) while being unable to mutate the filesystem outside a controlled scratch space. The OS sandbox is explicitly not a security boundary – it is an active guardrail that prevents accidental writes from agentic loops, interpreter injections, or misdirected redirects. The fallback classifier on Windows means the feature works cross-platform even without OS sandbox tooling.

Breaking changes

None. Readonly is off by default, toggled on demand via /readonly, Ctrl+Shift+R, or the --readonly CLI flag. Existing workflows are unaffected.

Attached is an agent optimized description of the changes in this PR - AGENT_REVIEW.md

…acklist and git allowlist

…with TUI indicator

…and spawn child filtering

…ards

…cription

…only when no branch entries exist

…, and TUI warning

…ion, and redirect analysis

…ges, and CLI flag behavior

grzegorznowak · 2026-05-28T06:36:30Z

Summary: GH-2’s readonly-mode safety value is aligned, but concrete bash mutation bypasses and a no-UI toggle crash mean the PR should not merge as-is.

Business / Product Assessment

Verdict: REQUEST CHANGES

Strengths

The PR implements the requested opt-in readonly mode with a CLI flag, /readonly toggle, persisted branch entries, and runtime blocking for write/edit/handoff tools. Sources: work/pr_context.json:5, index.ts:63, index.ts:72, index.ts:127
Readonly behavior is propagated to spawned children by filtering write/edit and installing a readonly bash override when bash is inherited. Sources: spawn/index.ts:269, spawn/index.ts:274, spawn/index.ts:278
User-facing guidance was added through a status indicator and readonly-specific nudges so the mode is visible to users and the agent. Sources: tui.ts:54, index.ts:306, index.ts:318

In Scope Issues

Readonly bash still allows concrete mutation paths through git mixed commands and unparsed command substitution. Sources: readonly-bash.ts:45, readonly-bash.ts:239, readonly-bash.ts:241, readonly-bash.ts:242, readonly-bash.ts:312

High severity · Medium likelihood

Why: Readonly mode is sold as blocking destructive bash during exploratory sessions. git branch new-name, git tag v1, or git status $(git add .) can still mutate repository state while readonly is enabled.

Assumptions / Preconditions: The agent invokes bash while readonly mode is active.

Downgrade Factors: This PR describes readonly as guardrails rather than a security boundary, but these are ordinary git commands an agent may plausibly use.

Code Trail: The classifier splits only on shell separators and quote state, not command substitution. A segment is sent to the git allowlist only when the whole trimmed segment starts with git. Separately, branch and tag mixed policies accept any non-flag argument, which includes ref-creating forms.

Reproduction: In readonly mode, ask bash to run git branch review-temp or git status $(git add .). The classifier path can return safe instead of blocking the mutation.
/readonly is not safe in no-UI contexts and can throw after partially applying state. Sources: index.ts:69, index.ts:72, index.ts:73, index.ts:74, tui.ts:32

Medium severity · Medium likelihood

Why: The toggle mutates state and persists an entry before unconditionally calling ctx.ui.notify. In a headless command context, users can end up with readonly toggled and persisted even though the command failed.

Assumptions / Preconditions: /readonly or the idle shortcut is invoked with ctx.hasUI === false or without ctx.ui.

Downgrade Factors: Most interactive use likely has a UI, but other commands in this extension already support no-UI operation.

Code Trail: toggleReadonly flips state.readonlyEnabled, sets readonlyNudgePending, and appends agenticoding-readonly; updateIndicators no-ops when there is no UI, then ctx.ui.notify is dereferenced without checking ctx.hasUI.

Reproduction: Invoke the registered readonly command with a context shaped like { hasUI: false, getContextUsage: () => null }. State is changed before the notify call throws.

Out of Scope Issues

None.

Technical Assessment

Verdict: REQUEST CHANGES

Input Boundary Shape Risk Assessment

Status: Triggered
Boundary: raw tool-call input and persisted session branch entries -> stricter readonly command and rehydration assumptions.
Evidence / mitigation: event.input.command is cast directly to string before classification, and the classifier immediately expects string-like behavior; tests cover valid { command: string } inputs but not missing or malformed bash input. Branch rehydration also scans raw branch entries, though current issue evidence is strongest at the bash tool boundary. Sources: index.ts:136, index.ts:137, readonly-bash.ts:45, agenticoding.test.ts:3994

Strengths

The readonly classifier is factored behind a reusable isSafeReadonlyCommand API and is reused by both parent tool-call blocking and child bash override enforcement. Sources: readonly-bash.ts:307, index.ts:138, spawn/index.ts:141
The PR adds focused static coverage for tool blocking, child tool filtering, session rehydration, readonly nudges, shortcut gating, and state reset. Sources: agenticoding.test.ts:3922, agenticoding.test.ts:4005, agenticoding.test.ts:4183, agenticoding.test.ts:4604

In Scope Issues

The parent bash tool boundary does not validate command before classification. Sources: index.ts:136, index.ts:137, readonly-bash.ts:45, agenticoding.test.ts:3994

Medium severity · Medium likelihood

Why: A malformed bash tool payload should be blocked cleanly in readonly mode. Instead, missing command can throw, and some non-string values can be treated as empty/safe because the runtime cast does not validate shape.

Assumptions / Preconditions: The framework or model supplies malformed bash tool input while readonly mode is active.

Downgrade Factors: Normal bash calls likely provide { command: string }, and existing tests cover that happy path.

Code Trail: The tool-call handler casts event.input.command to string and passes it to isSafeReadonlyCommand. splitUnquotedShellSegments then reads cmd.length, so the boundary depends on runtime string shape without a guard.

Reproduction: Call the readonly tool_call handler with { toolName: "bash", input: {} } or a non-string command. The handler does not return a controlled readonly block result.

Out of Scope Issues

None.

Reusability

buildChildToolNames remains exported and testable, keeping child tool inheritance and readonly filtering easy to validate independently. Sources: spawn/index.ts:126, agenticoding.test.ts:4005
buildNudge centralizes readonly/topic guidance, with tests for readonly topic, no-topic, and boundary-hint cases. Sources: watchdog.ts:15, agenticoding.test.ts:3354, agenticoding.test.ts:3373

review generated with CURe v. 0.7.3 · multi-stage - stages: 4 · sha 6540936 · model gpt-5.5/high · tok 8m/82k/8m · session agenticoding-pi-agenticoding-pr7-20260528-053241-21cc · 16m32s

… Linux (bwrap)

… temp-dir path checking

…mmand validation

…n sessions

…dren, and watchdog throttling

grzegorznowak

Summary: GH-2 adds valuable readonly-mode guardrails, but the current PR does not yet reliably deliver the promised temp-only write behavior across fallback, Linux sandbox, resumed sessions, and active child sessions.

Business / Product Assessment

Verdict: REQUEST CHANGES

Strengths

Readonly mode has clear user entry points through --readonly, /readonly, and Ctrl+Shift+R, and it gives visible status/notification feedback when toggled. Sources: index.ts:63
The parent tool-call path blocks write, edit, and handoff while readonly is enabled, preserving the advertised high-level restriction for direct tool calls. Sources: index.ts:134
Child sessions created while readonly is already enabled remove write/edit from the child tool list and add a readonly bash custom tool. Sources: spawn/index.ts:301

In Scope Issues

The no-sandbox fallback can still allow non-temp filesystem mutation through shell wrappers and expansion forms, so Windows or hosts without usable sandbox tooling do not meet the PR’s cross-platform readonly promise. Sources: readonly-bash.ts:152

High severity · High likelihood

Why: The PR description promises the fallback classifier protects platforms without OS sandboxing. Several classifier paths still let mutating commands hide behind syntax that is common in shell usage, which means readonly can silently become writable outside temp.

Assumptions / Preconditions: OS sandboxing is unavailable or not used, such as Windows or a Linux/macOS host without the sandbox binary.

Downgrade Factors: A functioning OS sandbox would reduce the immediate write impact for commands that reach the wrapper, but the fallback is explicitly part of the feature contract.

Code Trail: applyReadonlyBashGuard returns plain allow when canUseOsSandbox() is false and classifyBashCommand says OK. The env branch only has special raw extraction for -S, while tokenization consumes --split-string; sudo option parsing omits value-taking host options; xargs checks only getMutationTargets, skipping git/package-manager/interpreter classifier logic; command substitution extraction handles $() and backticks but not process substitution, and mutation targets containing $() can be treated as glob-like paths.

Reproduction: Examples that should be blocked but are not covered by the current code shape include env --split-string "rm -rf /", sudo -h localhost rm /etc/passwd, printf install | xargs npm, and cat <(rm /etc/passwd) in fallback mode.
Linux sandbox wrapping changes bash command semantics by running accepted bash commands under /bin/sh, so valid readonly bash commands can fail or behave differently only on Linux with bwrap. Sources: os-sandbox.ts:181

Medium severity · High likelihood

Why: Users call a bash tool and the wrapper comments promise bash command preservation. On Linux, the sandboxed path changes the interpreter, which breaks expected readonly behavior for safe bash syntax like arrays, [[ ... ]], or process substitution.

Assumptions / Preconditions: The host is Linux and bwrap is detected.

Downgrade Factors: Commands using POSIX-compatible shell syntax will usually behave the same.

Code Trail: applyReadonlyBashGuard wraps otherwise accepted commands when OS sandboxing is available. macOS wraps with /bin/bash, but wrapWithBwrap emits bwrap ... /bin/sh << ..., and the test currently asserts the /bin/sh shape instead of protecting bash semantics.

Reproduction: A readonly-safe command such as arr=(a b); echo "${arr[0]}" can pass classifier inspection and then fail under the Linux bwrap wrapper because /bin/sh does not support bash arrays.
Malformed persisted branch entries can crash rehydration or suppress --readonly, making resumed readonly sessions unreliable. Sources: index.ts:97

Medium severity · Medium likelihood

Why: Readonly state is persisted in branch entries, which are a raw session boundary. If malformed entries are present, a resumed session can lose readonly state or fail before indicators and guards are consistent.

Assumptions / Preconditions: The session branch contains malformed values, or a non-custom entry with customType: "agenticoding-readonly".

Downgrade Factors: If the session manager always guarantees well-formed branch arrays, the crash path is less likely, but the code does not document or enforce that contract.

Code Trail: rehydrateReadonlyState casts each branch element and immediately reads entry.type and entry.customType. The CLI precedence check uses a looser branch.some predicate that treats any object with the custom type as a prior readonly entry, even if the scan ignored it because it was not a valid custom entry.

Reproduction: Resume with --readonly and a branch containing { customType: "agenticoding-readonly" } but no valid type: "custom" readonly entry. The scan leaves readonly false, then the CLI fallback is suppressed because branch.some sees the custom type.

Out of Scope Issues

None.

Technical Assessment

Verdict: REQUEST CHANGES

Input Boundary Shape Risk Assessment

Status: Triggered
Boundary: persisted session branch entries and raw bash command strings -> stricter readonly state and command-classification assumptions
Evidence / mitigation: Bash tool input has a string guard, but persisted branch entries are cast without object-shape validation, and bash classifier parsing still makes stricter assumptions about env/sudo/xargs/substitution shapes. Sources: index.ts:97

Strengths

Parent and child bash enforcement share applyReadonlyBashGuard, reducing policy drift between direct bash calls and child spawn bash execution. Sources: readonly-bash.ts:758
The tests cover many direct readonly classifier regressions, including redirects, git mutations, package managers, env -S, interpreters, and temp/glob handling. Sources: agenticoding.test.ts:4790
The TUI safety rule is respected in production code; the only console.log match is inside a test string fixture, not a logging call. Sources: agenticoding.test.ts:5124

In Scope Issues

OS sandbox command construction interpolates temp paths into shell commands and macOS profile strings without complete escaping, so environment-derived temp paths can corrupt or weaken sandbox wrapping. Sources: os-sandbox.ts:93

Medium severity · Low likelihood

Why: Temp paths are external process/environment input. If they contain shell or profile metacharacters, the wrapper can parse a different command/profile than intended before the sandbox starts.

Assumptions / Preconditions: os.tmpdir() or its unresolved/canonical form contains characters such as a quote, backslash, backtick, or $().

Downgrade Factors: Typical Unix temp dirs are /tmp or similar simple paths.

Code Trail: TEMP_DIR derives from os.tmpdir(). macOS rejects single quotes only for the canonical path, then adds the unresolved os.tmpdir() path to double-quoted Seatbelt profile strings. Linux embeds the canonical temp path inside double-quoted shell arguments in the bwrap command.

Reproduction: Run with an unusual TMPDIR containing shell metacharacters and enable readonly on a sandbox-capable platform. The generated wrapper can be parsed differently before sandbox-exec or bwrap enforces anything.
Sandbox availability is cached from command -v only, so installed-but-unusable sandbox binaries make harmless readonly commands fail instead of predictably falling back to classifier-only behavior. Sources: os-sandbox.ts:55

Low severity · Medium likelihood

Why: Operators can have bwrap or sandbox-exec installed but unusable due to kernel, namespace, entitlement, or host policy. Readonly should fail predictably, not turn ordinary read-only commands into runtime errors.

Assumptions / Preconditions: The sandbox binary exists on PATH but cannot execute the required sandbox profile/wrapper.

Downgrade Factors: Healthy local developer machines with working sandbox tooling are unaffected.

Code Trail: canUseOsSandbox delegates to _hasBwrap or _hasSandboxExec, both of which cache command -v results. applyReadonlyBashGuard then returns a sandboxed command rather than falling back when that cached presence check is true.

Reproduction: Put a nonfunctional bwrap earlier on PATH, enable readonly, and run ls. The guard will wrap the command and execution will fail at the sandbox binary rather than allowing classifier-approved read-only execution.

Out of Scope Issues

None.

Reusability

applyReadonlyBashGuard is a reusable policy choke point for parent and child bash paths once the classifier and wrapper gaps are fixed. Sources: readonly-bash.ts:758
buildChildToolNames cleanly separates inherited built-in tools from child custom tools and excludes parent-only spawn/handoff, which is useful for future child authority shaping. Sources: spawn/index.ts:137

review generated with CURe v. 0.8.0 · multi-stage - stages: 5 · sha 039d200 · model gpt-5.5/high · tok 9m/110k/9m · session agenticoding-pi-agenticoding-pr7-20260603-094630-0d6d · 15m4s

…bstitution, xargs

ofriw added 10 commits May 27, 2026 07:55

readonly-bash: add bash safety classifier with destructive command bl…

0c485e7

…acklist and git allowlist

state+tui: add readonlyEnabled and readonlyNudgePending state fields …

33b29c7

…with TUI indicator

readonly: add toggle, tool_call blocking, nudges, session lifecycle, …

671fe7a

…and spawn child filtering

Remove dead-end '/readonly' instruction from model-facing readonly gu…

e7b07b7

…ards

Update readonly toggle messages and include handoff in block list des…

76439d3

…cription

Block handoff at tool_call layer in readonly mode

3c48eaa

Gate /handoff command in readonly mode and apply --readonly CLI flag …

721c858

…only when no branch entries exist

Add readonly-aware guidance to notebook boundary hint, watchdog nudge…

3686a5c

…, and TUI warning

Rewrite bash classifier with shell-aware pipeline, code editor detect…

8ffc797

…ion, and redirect analysis

Add tests for readonly handoff blocking, bash classifier, context nud…

6540936

…ges, and CLI flag behavior

ofriw added 19 commits May 31, 2026 20:05

Add resolve-real-path utility and IDE config poisoning prevention

469a044

Add OS-level sandboxing for readonly bash on macOS (sandbox-exec) and…

28ddffc

… Linux (bwrap)

Rewrite bash classifier with shell-aware pipeline, git allowlist, and…

0e009bd

… temp-dir path checking

Refine config validator with case-insensitive key matching and MCP co…

27845c5

…mmand validation

Add watchdog throttling and merge readonly nudges into context hook

58fd869

Add readonly bash and config-validated write/edit tools to child spaw…

4502977

…n sessions

Update tests for new bash classifier, config validator, readonly chil…

32de094

…dren, and watchdog throttling

Remove standalone config-validator.ts and all references

e562b63

Simplify bash guard from 3 enforcement layers to 2

fa8b893

Add user-friendly sandbox denial messages

2f5f520

Remove debug logging from OS sandbox availability checks

2db1202

Ignore TypeScript compilation output

d043d17

Block wget without output flags to prevent unintended disk writes

c6d0df9

Fix sed -i argument parsing for macOS empty backup extension

26d3b27

Allow bare git tag as read-only command

709513a

Guard ui.notify behind hasUI check to prevent headless crash

3da9806

Use type-safe tool call event handling and fix sandbox mutation path

0d41de0

Ignore PR review artifacts in .gitignore

64e911b

Remove TUI-corrupting console diagnostics

c1fad71

ofriw added 13 commits June 2, 2026 20:28

Share canonical temp dir outside readonly guard

b3ccbe4

Allow blank readonly bash commands

84acc41

Allow wget stdout output in readonly mode

b898b1a

Add readonly classifier edge-case tests

320e0cf

Fix critical rm/rmdir/unlink/mkdir mutation bypass

a9075c9

Fix flag-value skip for truncate/touch --no-create

9d12a06

Fix sed -e expression values leaking as false targets

fa34723

Fix package mutation false positive on 'build'

00d75e2

Resolve glob patterns and tilde expansion in temp path check

d739362

Add type guard for malformed bash input

7e3f78c

Guard readonly toggle behind hasUI check

b661a13

Optimize and compact context primer

e1fdfa0

Add AGENTS.md project instructions

039d200

ofriw requested a review from grzegorznowak June 3, 2026 09:45

grzegorznowak requested changes Jun 3, 2026

View reviewed changes

ofriw added 6 commits June 3, 2026 15:22

Add null guards for notebook branch rehydration

1f703a9

Add null guards for readonly branch rehydration on session_start

3e00c91

Fix bash classifier bypasses: sudo -h, env --split-string, process su…

1ebfb4f

…bstitution, xargs

Add sandbox functional health probes for bwrap and sandbox-exec

f833a36

Switch bwrap from /bin/sh to /bin/bash

f44c31a

Harden sandbox profile path quoting against injection

ba2645c

ofriw requested a review from grzegorznowak June 3, 2026 12:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add readonly mode with bash safety and spawn child filtering#7

feat: add readonly mode with bash safety and spawn child filtering#7
ofriw wants to merge 48 commits into
mainfrom
feat/readonly

ofriw commented May 27, 2026 •

edited

Loading

Uh oh!

grzegorznowak commented May 28, 2026

Uh oh!

grzegorznowak left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ofriw commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Readonly Mode: OS-Level Sandboxing with Token-Efficient Guardrails

What changed and why

Value and implications

Breaking changes

Uh oh!

grzegorznowak commented May 28, 2026

Business / Product Assessment

Strengths

In Scope Issues

Out of Scope Issues

Technical Assessment

Input Boundary Shape Risk Assessment

Strengths

In Scope Issues

Out of Scope Issues

Reusability

Uh oh!

grzegorznowak left a comment

Choose a reason for hiding this comment

Business / Product Assessment

Strengths

In Scope Issues

Out of Scope Issues

Technical Assessment

Input Boundary Shape Risk Assessment

Strengths

In Scope Issues

Out of Scope Issues

Reusability

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ofriw commented May 27, 2026 •

edited

Loading