Skip to content

fix(workflows): validate run_id in RunState.load before touching the …#2813

Open
Quratulain-bilal wants to merge 1 commit into
github:mainfrom
Quratulain-bilal:fix/workflow-runstate-load-path-traversal
Open

fix(workflows): validate run_id in RunState.load before touching the …#2813
Quratulain-bilal wants to merge 1 commit into
github:mainfrom
Quratulain-bilal:fix/workflow-runstate-load-path-traversal

Conversation

@Quratulain-bilal
Copy link
Copy Markdown
Contributor

What

RunState.load(run_id, project_root) interpolates the caller-supplied run_id directly into a filesystem path before
validating it:

# src/specify_cli/workflows/engine.py:317
@classmethod
def load(cls, run_id: str, project_root: Path) -> RunState:
    """Load a run state from disk."""
    runs_dir = project_root / ".specify" / "workflows" / "runs" / run_id
    state_path = runs_dir / "state.json"
    if not state_path.exists():
        ...
    with open(state_path, encoding="utf-8") as f:
        state_data = json.load(f)
    state = cls(run_id=state_data["run_id"], ...)  # <-- validation only here

RunState.__init__ already validates run_id against r'^[a-zA-Z0-9][a-zA-Z0-9_-]*$', but only on the stored
state_data["run_id"] after the file lookup has already happened. The run_id argument passed to load is reachable
from user input via the specify workflow resume <run_id> CLI argument, so a value like ../escape lets
state_path.exists() probe arbitrary paths and json.load parse a state file from outside .specify/workflows/runs/.

Reproducer

import json, tempfile
from pathlib import Path
from specify_cli.workflows.engine import RunState

with tempfile.TemporaryDirectory() as td:
    proj = Path(td) / "proj"; proj.mkdir(); (proj / ".specify").mkdir()
    runs = proj / ".specify" / "workflows" / "runs"; runs.mkdir(parents=True)

    # Plant a state.json *outside* the runs/ directory.
    escape = proj / ".specify" / "workflows" / "escape"; escape.mkdir()
    (escape / "state.json").write_text(json.dumps({
        "run_id": "pwned",
        "workflow_id": "attacker-owned",
        "status": "created",
    }))

    state = RunState.load("../escape", proj)
    print(state.workflow_id)  # 'attacker-owned' — traversal succeeded

Impact

  • File-existence oracle: state_path.exists() reveals whether arbitrary paths the process can read exist.
  • Arbitrary JSON read: a state.json planted anywhere reachable from runs_dir / <traversal> is parsed and the
    workflow resumes under attacker-chosen workflow_id / step state.
  • Persistence: a subsequent state.save() writes back to the traversed location, persisting the corruption across
    resume cycles.

Reachable from:

  • specify workflow resume <run_id> (CLI argument from __init__.py:4239)

The change

  • Extract the existing r'^[a-zA-Z0-9][a-zA-Z0-9_-]*$' regex into a class-level _RUN_ID_PATTERN and a single
    _validate_run_id classmethod so __init__ and load cannot drift.
  • Call _validate_run_id(run_id) at the top of loadbefore the path is built. A malicious value raises
    ValueError cleanly, which the CLI already handles at __init__.py:4243 to surface a friendly "Error: Invalid run_id ..." message.
  • __init__ refactored to call the same helper, so there is exactly one definition of "valid run_id" in the codebase.

Precedent

src/specify_cli/agents.py:435-438 (_ensure_within_directory) already guards extension-install paths against the same
threat model with os.path.normpath + Path.is_relative_to. This change brings RunState in line with that precedent.

Tests

tests/test_workflows.py::TestRunState:

  • test_load_rejects_path_traversal — parametrized over 9 attack vectors: ../escape, .., ../../etc/passwd,
    foo/bar, foo\bar, .hidden, -flag, foo\x00bar, and "". Each test plants a state.json outside the legitimate
    runs/ directory so a missing guard would surface as a successful load rather than the ambiguous FileNotFoundError.
  • test_init_and_load_share_validation — asserts __init__, load, and the helper all reject the same representative
    malformed ID so future changes can't silently drift.
$ python -m pytest tests/test_workflows.py::TestRunState -v
13 passed in 2.79s
$ python -m ruff check src/specify_cli/workflows/engine.py
All checks passed!

Notes for maintainers

  • Happy to move this to a private security advisory if you'd prefer to coordinate disclosure. Filed as a regular PR
    because the threat model is local (requires either CLI argument control or env-var control on the target machine, neither
    of which is a network-reachable attack surface) — but I'll defer to your preferred process.
  • No behaviour change for any valid run_id. The exact charset accepted before this PR is the exact charset accepted
    after.
  • The same regex also gates SPECKIT_WORKFLOW_RUN_ID (env-var override) via __init__'s call to the new helper, so the
    env-var override path is now consistently validated too.

…filesystem

``RunState.load(run_id, project_root)`` interpolates ``run_id`` directly
into ``project_root / ".specify" / "workflows" / "runs" / run_id`` and
then calls ``state_path.exists()`` and ``json.load`` on the result. The
run_id is reachable from user input via ``specify workflow resume
<run_id>`` (CLI argument) and via ``SPECKIT_WORKFLOW_RUN_ID`` (env var
override on the engine's run path), so a value like ``../escape``
turns ``runs_dir`` into ``.specify/workflows/escape/`` and:

  * ``state_path.exists()`` becomes a file-existence oracle for any
    path the process can read.
  * if a ``state.json`` exists at the traversed location (planted by
    a malicious dependency, a misconfigured shared workspace, or an
    older spec-kit version that happened to write there),
    ``json.load`` parses it and the workflow resumes under the
    attacker-chosen ``workflow_id`` / step state.
  * a subsequent ``state.save()`` then writes back to the traversed
    location, persisting the corruption.

``RunState.__init__`` already validates ``run_id`` against
``r'^[a-zA-Z0-9][a-zA-Z0-9_-]*$'`` — but that check runs on
``state_data["run_id"]`` *after* ``load`` has already done the file
lookup, which is too late to prevent the disclosure.

This change extracts the pattern into a class-level constant
``_RUN_ID_PATTERN`` and a single ``_validate_run_id`` classmethod so
``__init__`` and ``load`` cannot drift, then calls the validator at the
top of ``load`` before any path is built. Mirrors the precedent in
``src/specify_cli/agents.py::_ensure_within_directory`` (used at line
437 of that file) which guards extension-install paths against the
same threat model.

Regression tests parametrize 9 traversal vectors (``../escape``,
``..``, ``../../etc/passwd``, ``foo/bar``, ``foo\bar``, ``.hidden``,
``-flag``, ``foo\x00bar``, empty) and plant a malicious ``state.json``
outside ``runs/`` so a missing guard would surface as a successful
load rather than the ambiguous ``FileNotFoundError``. A second test
asserts ``__init__`` and ``load`` reject the same representative
malformed ID, so future changes to one path can't silently drift from
the other.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens workflow run-state loading against path traversal by validating run_id before it is interpolated into the .specify/workflows/runs/<run_id>/... filesystem path, aligning behavior with the existing RunState.__init__ validation.

Changes:

  • Centralized run_id validation into RunState._RUN_ID_PATTERN + RunState._validate_run_id() and reused it from both __init__ and load.
  • Added early validation in RunState.load() to prevent probing/reading files outside the intended runs directory.
  • Added tests covering path traversal and validation consistency.
Show a summary per file
File Description
src/specify_cli/workflows/engine.py Adds a shared run-id validator and calls it before building the load path to prevent traversal.
tests/test_workflows.py Adds parameterized tests for malicious run_id values and a consistency test for validation behavior.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

Comment thread tests/test_workflows.py
Comment on lines +2476 to +2491
def test_init_and_load_share_validation(self):
"""``__init__`` and ``load`` reject the same set of malformed IDs.

The two entry points must stay in sync — drift would let an ID
slip in via one path that the other would reject, producing
confusing crashes mid-workflow.
"""
from specify_cli.workflows.engine import RunState

# Sample a representative malformed ID — exhaustive coverage is
# in ``test_load_rejects_path_traversal``.
bad = "../escape"
with pytest.raises(ValueError, match="Invalid run_id"):
RunState(run_id=bad)
with pytest.raises(ValueError, match="Invalid run_id"):
RunState._validate_run_id(bad)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants