feat: add readonly mode with bash safety and spawn child filtering#7
feat: add readonly mode with bash safety and spawn child filtering#7ofriw wants to merge 48 commits into
Conversation
…acklist and git allowlist
…with TUI indicator
…and spawn child filtering
…only when no branch entries exist
…, and TUI warning
…ion, and redirect analysis
…ges, and CLI flag behavior
|
Summary: GH-2’s readonly-mode safety value is aligned, but concrete bash mutation bypasses and a no-UI toggle crash mean the PR should not merge as-is. Business / Product AssessmentVerdict: REQUEST CHANGES Strengths
In Scope Issues
Out of Scope Issues
Technical AssessmentVerdict: REQUEST CHANGES Input Boundary Shape Risk AssessmentStatus: Triggered Strengths
In Scope Issues
Out of Scope Issues
Reusability
review generated with CURe v. 0.7.3 · multi-stage - stages: 4 · sha 6540936 · model gpt-5.5/high · tok 8m/82k/8m · session agenticoding-pi-agenticoding-pr7-20260528-053241-21cc · 16m32s |
… temp-dir path checking
…dren, and watchdog throttling
grzegorznowak
left a comment
There was a problem hiding this comment.
Summary: GH-2 adds valuable readonly-mode guardrails, but the current PR does not yet reliably deliver the promised temp-only write behavior across fallback, Linux sandbox, resumed sessions, and active child sessions.
Business / Product Assessment
Verdict: REQUEST CHANGES
Strengths
- Readonly mode has clear user entry points through
--readonly,/readonly, and Ctrl+Shift+R, and it gives visible status/notification feedback when toggled. Sources:index.ts:63 - The parent tool-call path blocks
write,edit, andhandoffwhile readonly is enabled, preserving the advertised high-level restriction for direct tool calls. Sources:index.ts:134 - Child sessions created while readonly is already enabled remove
write/editfrom the child tool list and add a readonly bash custom tool. Sources:spawn/index.ts:301
In Scope Issues
-
The no-sandbox fallback can still allow non-temp filesystem mutation through shell wrappers and expansion forms, so Windows or hosts without usable sandbox tooling do not meet the PR’s cross-platform readonly promise. Sources:
readonly-bash.ts:152High severity · High likelihood
Why: The PR description promises the fallback classifier protects platforms without OS sandboxing. Several classifier paths still let mutating commands hide behind syntax that is common in shell usage, which means readonly can silently become writable outside temp.
Assumptions / Preconditions: OS sandboxing is unavailable or not used, such as Windows or a Linux/macOS host without the sandbox binary.
Downgrade Factors: A functioning OS sandbox would reduce the immediate write impact for commands that reach the wrapper, but the fallback is explicitly part of the feature contract.
Code Trail:
applyReadonlyBashGuardreturns plainallowwhencanUseOsSandbox()is false andclassifyBashCommandsays OK. The env branch only has special raw extraction for-S, while tokenization consumes--split-string; sudo option parsing omits value-taking host options; xargs checks onlygetMutationTargets, skipping git/package-manager/interpreter classifier logic; command substitution extraction handles$()and backticks but not process substitution, and mutation targets containing$()can be treated as glob-like paths.Reproduction: Examples that should be blocked but are not covered by the current code shape include
env --split-string "rm -rf /",sudo -h localhost rm /etc/passwd,printf install | xargs npm, andcat <(rm /etc/passwd)in fallback mode. -
Linux sandbox wrapping changes bash command semantics by running accepted bash commands under
/bin/sh, so valid readonly bash commands can fail or behave differently only on Linux with bwrap. Sources:os-sandbox.ts:181Medium severity · High likelihood
Why: Users call a bash tool and the wrapper comments promise bash command preservation. On Linux, the sandboxed path changes the interpreter, which breaks expected readonly behavior for safe bash syntax like arrays,
[[ ... ]], or process substitution.Assumptions / Preconditions: The host is Linux and
bwrapis detected.Downgrade Factors: Commands using POSIX-compatible shell syntax will usually behave the same.
Code Trail:
applyReadonlyBashGuardwraps otherwise accepted commands when OS sandboxing is available. macOS wraps with/bin/bash, butwrapWithBwrapemitsbwrap ... /bin/sh << ..., and the test currently asserts the/bin/shshape instead of protecting bash semantics.Reproduction: A readonly-safe command such as
arr=(a b); echo "${arr[0]}"can pass classifier inspection and then fail under the Linux bwrap wrapper because/bin/shdoes not support bash arrays. -
Malformed persisted branch entries can crash rehydration or suppress
--readonly, making resumed readonly sessions unreliable. Sources:index.ts:97Medium severity · Medium likelihood
Why: Readonly state is persisted in branch entries, which are a raw session boundary. If malformed entries are present, a resumed session can lose readonly state or fail before indicators and guards are consistent.
Assumptions / Preconditions: The session branch contains malformed values, or a non-custom entry with
customType: "agenticoding-readonly".Downgrade Factors: If the session manager always guarantees well-formed branch arrays, the crash path is less likely, but the code does not document or enforce that contract.
Code Trail:
rehydrateReadonlyStatecasts each branch element and immediately readsentry.typeandentry.customType. The CLI precedence check uses a looserbranch.somepredicate that treats any object with the custom type as a prior readonly entry, even if the scan ignored it because it was not a valid custom entry.Reproduction: Resume with
--readonlyand a branch containing{ customType: "agenticoding-readonly" }but no validtype: "custom"readonly entry. The scan leaves readonly false, then the CLI fallback is suppressed becausebranch.somesees the custom type.
Out of Scope Issues
- None.
Technical Assessment
Verdict: REQUEST CHANGES
Input Boundary Shape Risk Assessment
Status: Triggered
Boundary: persisted session branch entries and raw bash command strings -> stricter readonly state and command-classification assumptions
Evidence / mitigation: Bash tool input has a string guard, but persisted branch entries are cast without object-shape validation, and bash classifier parsing still makes stricter assumptions about env/sudo/xargs/substitution shapes. Sources: index.ts:97
Strengths
- Parent and child bash enforcement share
applyReadonlyBashGuard, reducing policy drift between direct bash calls and child spawn bash execution. Sources:readonly-bash.ts:758 - The tests cover many direct readonly classifier regressions, including redirects, git mutations, package managers, env
-S, interpreters, and temp/glob handling. Sources:agenticoding.test.ts:4790 - The TUI safety rule is respected in production code; the only
console.logmatch is inside a test string fixture, not a logging call. Sources:agenticoding.test.ts:5124
In Scope Issues
-
OS sandbox command construction interpolates temp paths into shell commands and macOS profile strings without complete escaping, so environment-derived temp paths can corrupt or weaken sandbox wrapping. Sources:
os-sandbox.ts:93Medium severity · Low likelihood
Why: Temp paths are external process/environment input. If they contain shell or profile metacharacters, the wrapper can parse a different command/profile than intended before the sandbox starts.
Assumptions / Preconditions:
os.tmpdir()or its unresolved/canonical form contains characters such as a quote, backslash, backtick, or$().Downgrade Factors: Typical Unix temp dirs are
/tmpor similar simple paths.Code Trail:
TEMP_DIRderives fromos.tmpdir(). macOS rejects single quotes only for the canonical path, then adds the unresolvedos.tmpdir()path to double-quoted Seatbelt profile strings. Linux embeds the canonical temp path inside double-quoted shell arguments in the bwrap command.Reproduction: Run with an unusual
TMPDIRcontaining shell metacharacters and enable readonly on a sandbox-capable platform. The generated wrapper can be parsed differently beforesandbox-execorbwrapenforces anything. -
Sandbox availability is cached from
command -vonly, so installed-but-unusable sandbox binaries make harmless readonly commands fail instead of predictably falling back to classifier-only behavior. Sources:os-sandbox.ts:55Low severity · Medium likelihood
Why: Operators can have
bwraporsandbox-execinstalled but unusable due to kernel, namespace, entitlement, or host policy. Readonly should fail predictably, not turn ordinary read-only commands into runtime errors.Assumptions / Preconditions: The sandbox binary exists on PATH but cannot execute the required sandbox profile/wrapper.
Downgrade Factors: Healthy local developer machines with working sandbox tooling are unaffected.
Code Trail:
canUseOsSandboxdelegates to_hasBwrapor_hasSandboxExec, both of which cachecommand -vresults.applyReadonlyBashGuardthen returns a sandboxed command rather than falling back when that cached presence check is true.Reproduction: Put a nonfunctional
bwrapearlier on PATH, enable readonly, and runls. The guard will wrap the command and execution will fail at the sandbox binary rather than allowing classifier-approved read-only execution.
Out of Scope Issues
- None.
Reusability
applyReadonlyBashGuardis a reusable policy choke point for parent and child bash paths once the classifier and wrapper gaps are fixed. Sources:readonly-bash.ts:758buildChildToolNamescleanly separates inherited built-in tools from child custom tools and excludes parent-onlyspawn/handoff, which is useful for future child authority shaping. Sources:spawn/index.ts:137
review generated with CURe v. 0.8.0 · multi-stage - stages: 5 · sha 039d200 · model gpt-5.5/high · tok 9m/110k/9m · session agenticoding-pi-agenticoding-pr7-20260603-094630-0d6d · 15m4s
…bstitution, xargs
Note: This PR was generated by an AI agent. If you would like to talk with other humans, drop by our Discord!
Readonly Mode: OS-Level Sandboxing with Token-Efficient Guardrails
What changed and why
This PR implements the readonly mode requested in issue #2, but significantly stronger than the original proposal. Instead of relying solely on command-pattern filtering – which models consistently escaped by spawning interpreters (python -c, node -e, etc.) or using indirect write mechanisms – we built a two-layer enforcement with an OS-level sandbox as the primary barrier.
On macOS, commands run through
sandbox-execwith a Seatbelt profile that denies all file-write operations except to the OS temp dir and /dev/null. On Linux, bubblewrap (bwrap) mounts the root filesystem read-only with a writable tmpfs overlay at the temp dir. On Windows (no native sandboxing), or when the OS tool is missing, we fall back to a comprehensively rewritten command-pattern classifier that understands shell pipelines, redirects, command substitutions, interpreter-cflags, xargs,sed -i, package manager mutations, and more.Beyond bash, the PR blocks write, edit, and handoff at the tool-call layer, with a cache-aware strategy: parent sessions keep these tools in the tool list (to avoid context-cache invalidation) and block at call time, while child spawn sessions remove them entirely since they start fresh. We also gated the /handoff command in readonly mode, added TUI indicators, watchdog nudges, and context-hook messages.
Also tuned the context primer for token efficiency: added a Plan-then-execute section, simplified primacy-zone guidance, and rephrased spawn section for directness.
Value and implications
For projects running token-cost-sensitive coding agents, this is a meaningful UX and cost optimization. The model keeps its entire tool list (maximizing cache hit rates) while being unable to mutate the filesystem outside a controlled scratch space. The OS sandbox is explicitly not a security boundary – it is an active guardrail that prevents accidental writes from agentic loops, interpreter injections, or misdirected redirects. The fallback classifier on Windows means the feature works cross-platform even without OS sandbox tooling.
Breaking changes
None. Readonly is off by default, toggled on demand via
/readonly, Ctrl+Shift+R, or the--readonlyCLI flag. Existing workflows are unaffected.Attached is an agent optimized description of the changes in this PR - AGENT_REVIEW.md