feat: max-model-multiplier-cap guardrail #4215

Open
Copilot wants to merge 3 commits into main from copilot/max-model-multiplier

Conversation

Contributor

Copilot AI commented Jun 2, 2026

Adds a hard cap on model cost multipliers so unexpected pricing or model-routing changes can't silently blow up spend. Requests whose resolved model multiplier exceeds the configured cap are rejected with 400 model_multiplier_cap_exceeded before reaching upstream.

Guard module (containers/api-proxy/guards/max-model-multiplier-guard.js)

  • Resolves a model's multiplier via exact match → longest-prefix match → default (mirrors existing effective-token-guard logic)
  • Returns a structured block state when the cap is exceeded, or null (no-op when unconfigured or multiplier is within the cap)
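The resolution order described above (exact match, then longest prefix, then default) could look roughly like the following. This is a minimal sketch, not the PR's guard module: `resolveMultiplier`, `capBlockState`, the shape of the returned object, the default multiplier of 1, the `claude-opus: 20` entry, and the assumption that a configured key matches when the model name starts with it are all hypothetical.

```javascript
// Hypothetical sketch: names, defaults, and return shape are illustrative only.

/**
 * Resolve a model's multiplier: exact match, then longest-prefix match,
 * then the default (assumed to be 1 here).
 */
function resolveMultiplier(model, multipliers, defaultMultiplier = 1) {
  if (Object.prototype.hasOwnProperty.call(multipliers, model)) {
    return multipliers[model];
  }
  let bestPrefix = null;
  for (const key of Object.keys(multipliers)) {
    // A key matches when the model name starts with it; keep the longest one.
    if (model.startsWith(key) && (bestPrefix === null || key.length > bestPrefix.length)) {
      bestPrefix = key;
    }
  }
  return bestPrefix === null ? defaultMultiplier : multipliers[bestPrefix];
}

/**
 * Return a block state when the cap is exceeded, or null when no cap is
 * configured, no model is known, or the multiplier is within the cap.
 */
function capBlockState(model, multipliers, cap) {
  if (cap === undefined || cap === null || !model) return null;
  const multiplier = resolveMultiplier(model, multipliers);
  return multiplier > cap ? { model, multiplier, cap } : null;
}

// 'claude-opus': 20 is an invented entry, added only to exercise prefix matching.
const multipliers = { 'claude-opus': 20, 'claude-opus-4.7': 27, 'gpt-4o': 2 };
console.log(capBlockState('gpt-4o', multipliers, 5));
console.log(capBlockState('claude-opus-4.7-preview', multipliers, 5));
```

Returning null for every non-blocking case keeps the caller's check to a single truthiness test.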

Proxy integration (containers/api-proxy/proxy-request.js)

  • Adds extractModelFromBody() to pull the model field from the post-transform request body
  • Guard runs after ET and max-runs checks; sees the fully-resolved model name so alias-based routing to expensive models is also caught
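As an illustration of the model extraction step, here is one way a helper like `extractModelFromBody()` might be written. Only the function name comes from the PR description; the accepted input types (string or Buffer) and the choice to return null for missing, non-string, or non-JSON input are assumptions of this sketch.

```javascript
// Hypothetical sketch: the real helper's signature and edge-case handling may differ.

/**
 * Pull the top-level "model" field out of a JSON request body.
 * Returns null when the body is empty, not JSON, or has no string model.
 */
function extractModelFromBody(body) {
  if (body === undefined || body === null) return null;
  const text = Buffer.isBuffer(body) ? body.toString('utf8') : String(body);
  if (text.length === 0) return null;
  try {
    const parsed = JSON.parse(text);
    const model = parsed && typeof parsed === 'object' ? parsed.model : undefined;
    return typeof model === 'string' && model.length > 0 ? model : null;
  } catch {
    // Non-JSON bodies carry no model the guard can inspect.
    return null;
  }
}

console.log(extractModelFromBody('{"model":"gpt-4o","messages":[]}'));
console.log(extractModelFromBody('not json'));
```

Reading the body after transformation, as the PR describes, means the guard sees the model name that would actually be sent upstream rather than the alias the client supplied.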

Config pipeline

  • apiProxy.maxModelMultiplierCap (number) in AWF config file → validated in log-and-limits.ts → passed through build-config → AWF_MAX_MODEL_MULTIPLIER env var injected into the api-proxy container
  • Non-security-sensitive cap value flows through the AWF config file (stdin config), consistent with maxEffectiveTokens and maxRuns
  • New --max-model-multiplier-cap <number> CLI flag added; kept separate from the existing --max-model-multiplier <model:multiplier,...> per-model map flag to avoid key/type collision

Schema

  • maxModelMultiplierCap: { type: "number", exclusiveMinimum: 0 } added to both docs/awf-config.schema.json and src/awf-config-schema.json
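The schema constraint (`type: "number"`, `exclusiveMinimum: 0`) implies a validation step along the following lines. This is a sketch only: `parseMultiplierCap` is a hypothetical name, and the assumption that the CLI flag arrives as a string while the config file supplies a number is mine, not something the PR states.

```javascript
// Hypothetical sketch of cap validation; not the code in log-and-limits.ts.

/**
 * Accept a number (config file) or numeric string (CLI flag, assumed).
 * Returns undefined when unset; throws when the value is not a finite number > 0.
 */
function parseMultiplierCap(raw) {
  if (raw === undefined || raw === null || raw === '') return undefined;
  const value = typeof raw === 'number' ? raw : Number(raw);
  if (!Number.isFinite(value) || value <= 0) {
    throw new Error(`maxModelMultiplierCap must be a number greater than 0, got: ${raw}`);
  }
  return value;
}

console.log(parseMultiplierCap('5'));
console.log(parseMultiplierCap(undefined));
```

Rejecting 0 mirrors `exclusiveMinimum: 0`: a cap of zero would block every model, which is more likely a typo than an intent.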

Spec & docs

  • docs/awf-config-spec.md §12 "Model Multiplier Cap" — enforcement rules, multiplier resolution algorithm, and error shape
  • docs/api-proxy-sidecar.md — configuration examples, enforcement description, and detection code snippet

Example config

apiProxy:
  maxModelMultiplierCap: 5   # reject requests for models with multiplier > 5 (e.g. claude-opus at 27)
  modelMultipliers:
    claude-opus-4.7: 27
    gpt-4o: 2

Copilot AI linked an issue Jun 2, 2026 that may be closed by this pull request
Add a new guard that rejects API requests whose model's resolved cost
multiplier exceeds the operator-configured cap (AWF_MAX_MODEL_MULTIPLIER).

Changes:
- containers/api-proxy/guards/max-model-multiplier-guard.js: new guard module
- containers/api-proxy/proxy-request.js: extractModelFromBody(), guard integration
- src/types/rate-limit-options.ts: maxModelMultiplier?: number field
- src/config-file.ts: config-file mapping for apiProxy.maxModelMultiplier
- src/services/api-proxy-service.ts: AWF_MAX_MODEL_MULTIPLIER env var passthrough
- src/commands/build-config.ts: BuildConfigInputs.maxModelMultiplier?
- src/commands/validators/log-and-limits.ts: parsing/validation
- src/commands/validators/config-assembly.ts: passthrough to buildConfig
- docs/awf-config.schema.json + src/awf-config-schema.json: schema field
- Tests: guard unit tests, server integration tests, TS config/service tests
Copilot AI changed the title from "[WIP] Add guardrail for model spending cap" to "feat: max-model-multiplier guardrail" Jun 2, 2026
Copilot finished work on behalf of lpcox June 2, 2026 16:50
Copilot AI requested a review from lpcox June 2, 2026 16:50
@lpcox lpcox marked this pull request as ready for review June 2, 2026 17:12
Copilot AI review requested due to automatic review settings June 2, 2026 17:12
Contributor

github-actions Bot commented Jun 2, 2026

Documentation Preview

Documentation build failed for this PR. View logs.

Built from commit 2bea5c0

Contributor

github-actions Bot commented Jun 2, 2026

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

| Metric | Base | PR | Delta |
|---|---|---|---|
| Lines | 96.42% | 96.42% | ➡️ +0.00% |
| Statements | 96.28% | 96.28% | ➡️ +0.00% |
| Functions | 98.27% | 98.27% | ➡️ +0.00% |
| Branches | 90.53% | 90.45% | 📉 -0.08% |

📁 Per-file Coverage Changes (2 files)

| File | Lines (Before → After) | Statements (Before → After) |
|---|---|---|
| src/commands/validators/log-and-limits.ts | 91.5% → 88.5% (-3.02%) | 91.5% → 88.5% (-3.02%) |
| src/config-writer.ts | 89.3% → 90.9% (+1.65%) | 89.3% → 90.9% (+1.65%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

Contributor

Copilot AI left a comment

Pull request overview

This PR adds a “max model multiplier” spend guardrail to the API proxy: if a request’s resolved model multiplier exceeds a configured cap, the proxy rejects it with 400 model_multiplier_cap_exceeded before contacting upstream. It wires the cap from AWF config → validation/build-config → AWF_MAX_MODEL_MULTIPLIER env var, and integrates the new guard into proxy-request with targeted unit/integration tests.

Changes:

  • Introduces maxModelMultiplier configuration (schema + config-file mapping + validator + build-config) and injects it into the api-proxy container as AWF_MAX_MODEL_MULTIPLIER.
  • Implements max-model-multiplier-guard and plugs it into containers/api-proxy/proxy-request.js (including request-body model extraction).
  • Adds test coverage for env-var wiring and for the new guard behavior in both guard-level and proxyRequest-level tests.
Show a summary per file
| File | Description |
|---|---|
| src/types/rate-limit-options.ts | Adds maxModelMultiplier?: number to rate-limit options and documents the new guardrail behavior. |
| src/services/api-proxy-service.ts | Injects AWF_MAX_MODEL_MULTIPLIER into the api-proxy container environment when configured. |
| src/services/api-proxy-service-rate-limit.test.ts | Verifies AWF_MAX_MODEL_MULTIPLIER is set/unset correctly in generated docker-compose env. |
| src/config-file.ts | Adds apiProxy.maxModelMultiplier to config-file type + maps it into CLI/options assembly. |
| src/config-file-validation.test.ts | Adds config-schema validation tests for apiProxy.maxModelMultiplier. |
| src/commands/validators/log-and-limits.ts | Adds validation/parsing for maxModelMultiplier in the options pipeline (but currently conflicts with an existing CLI option name). |
| src/commands/validators/config-assembly.ts | Plumbs maxModelMultiplier from log/limits validation into assembled config. |
| src/commands/build-config.ts | Adds maxModelMultiplier to WrapperConfig assembly. |
| src/awf-config-schema.json | Adds apiProxy.maxModelMultiplier schema field (plus large formatting churn). |
| docs/awf-config.schema.json | Mirrors schema update for docs (plus large formatting churn). |
| containers/api-proxy/server.token-guards.test.js | Adds proxyRequest-level tests for the new max-model-multiplier guard behavior. |
| containers/api-proxy/proxy-request.js | Extracts model from the (post-transform) request body and applies the new guard before upstream request. |
| containers/api-proxy/guards/max-model-multiplier-guard.test.js | Adds unit tests for multiplier resolution, cache behavior, and error payload structure. |
| containers/api-proxy/guards/max-model-multiplier-guard.js | Implements config parsing/caching, multiplier resolution (exact/longest-prefix/default), and error payload builder. |

Copilot's findings

  • Files reviewed: 14/14 changed files
  • Comments generated: 2

Comment on lines +108 to +112

const maxModelMultiplierOption = (options as Record<string, unknown>).maxModelMultiplier as
  | string
  | number
  | undefined;
const maxModelMultiplier =

Comment on lines +82 to +90

/**
 * Returns a block state object when the given model's resolved multiplier
 * exceeds the configured cap (AWF_MAX_MODEL_MULTIPLIER), or null when no cap
 * is configured or the model is below the cap.
 *
 * @param {string|null} model - The model name from the request body (may be null)
 * @returns {{ model: string, multiplier: number, maxModelMultiplier: number,
 *   maxExceeded: boolean } | null}
 */

Collaborator

lpcox commented Jun 2, 2026

@copilot address review feedback and make sure that any new configuration params are thoroughly described in the specs and schema. Non-security-sensitive config params should be passed through stdin config, and security-sensitive ones through environment variables.

Contributor Author

Copilot AI commented Jun 2, 2026

@copilot address review feedback and make sure that any new configuration params are thoroughly described in the specs and schema. non-security sensitive config params should be passed through stdin c...

Addressed in commit "fix: rename cap option to maxModelMultiplierCap; fix guard JSDoc":

Naming conflict (reviewer feedback): Renamed the multiplier cap option from maxModelMultiplier to maxModelMultiplierCap everywhere — AwfFileConfig, RateLimitOptions, both JSON schemas, build-config, config-assembly, log-and-limits (the conflicting double-read is now resolved), and api-proxy-service. Added a dedicated --max-model-multiplier-cap <number> CLI flag so the cap can also be set from the command line without colliding with the existing --max-model-multiplier model:multiplier,... flag.

JSDoc mismatch (reviewer feedback): getModelMultiplierCapBlockState now returns null whenever the multiplier is within the cap (matching the documented contract), so callers no longer need to inspect a maxExceeded field.

Spec & schema coverage: apiProxy.maxModelMultiplierCap is documented in both docs/awf-config-spec.md (§12) and docs/api-proxy-sidecar.md with enforcement rules, error shape, config examples, and the mapping to AWF_MAX_MODEL_MULTIPLIER. The non-security cap value flows through the AWF config file (stdin config) → validation → AWF_MAX_MODEL_MULTIPLIER env var injected into the api-proxy container, consistent with how the other spend limits (maxEffectiveTokens, maxRuns) work.
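To make the config-to-container hand-off concrete, the mapping from `apiProxy.maxModelMultiplierCap` to `AWF_MAX_MODEL_MULTIPLIER` can be sketched as below. Only the config key and the env var name come from this PR; `buildCapEnv` and `readCapFromEnv` are hypothetical helpers, and treating an unparsable env value as "no cap" is an assumption of this sketch rather than documented behavior.

```javascript
// Hypothetical sketch of both ends of the pipeline; helper names are invented.

/** Host side: emit the env var only when a cap is configured. */
function buildCapEnv(config) {
  const env = {};
  const cap = config?.apiProxy?.maxModelMultiplierCap;
  if (cap !== undefined) {
    env.AWF_MAX_MODEL_MULTIPLIER = String(cap);
  }
  return env;
}

/** Container side: read the cap back, or null when unset or unusable (assumed). */
function readCapFromEnv(env) {
  const raw = env.AWF_MAX_MODEL_MULTIPLIER;
  if (raw === undefined || raw === '') return null;
  const cap = Number(raw);
  return Number.isFinite(cap) && cap > 0 ? cap : null;
}

const env = buildCapEnv({ apiProxy: { maxModelMultiplierCap: 5 } });
console.log(env);
console.log(readCapFromEnv(env));
console.log(readCapFromEnv(buildCapEnv({ apiProxy: {} })));
```

Leaving the variable unset when no cap is configured keeps the guard a no-op by default, which matches the "null when unconfigured" contract described for the guard.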

Copilot AI changed the title from "feat: max-model-multiplier guardrail" to "feat: max-model-multiplier-cap guardrail" Jun 2, 2026
Copilot finished work on behalf of lpcox June 2, 2026 17:33
Contributor

github-actions Bot commented Jun 2, 2026

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

| Metric | Base | PR | Delta |
|---|---|---|---|
| Lines | 96.42% | 96.42% | ➡️ +0.00% |
| Statements | 96.28% | 96.28% | ➡️ +0.00% |
| Functions | 98.27% | 98.27% | ➡️ +0.00% |
| Branches | 90.53% | 90.45% | 📉 -0.08% |

📁 Per-file Coverage Changes (2 files)

| File | Lines (Before → After) | Statements (Before → After) |
|---|---|---|
| src/commands/validators/log-and-limits.ts | 91.5% → 88.5% (-3.02%) | 91.5% → 88.5% (-3.02%) |
| src/config-writer.ts | 89.3% → 90.9% (+1.65%) | 89.3% → 90.9% (+1.65%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

Contributor

github-actions Bot commented Jun 2, 2026

Smoke Test: Copilot BYOK (Offline) Mode

| Test | Result |
|---|---|
| GitHub MCP connectivity | ✅ |
| GitHub.com HTTP | ⚠️ (template vars unexpanded) |
| File write/read | ⚠️ (template vars unexpanded) |
| BYOK inference (api-proxy → api.githubcopilot.com) | ✅ |

Running in BYOK offline mode (COPILOT_OFFLINE=true) via api-proxy → api.githubcopilot.com.

PR by @Copilot, assignees: @lpcox, @Copilot. Overall: PARTIAL PASS (MCP + BYOK inference ✅; pre-step template vars not expanded).

🔑 BYOK report filed by Smoke Copilot BYOK

Contributor

github-actions Bot commented Jun 2, 2026

✅ Smoke Test: Claude Engine

Total: PASS

💥 [THE END] — Illustrated by Smoke Claude

Contributor

github-actions Bot commented Jun 2, 2026

🔬 Smoke Test Results

| Test | Result |
|---|---|
| GitHub MCP connectivity | ✅ |
| GitHub.com HTTP connectivity | ⚠️ N/A (pre-step outputs not expanded) |
| File write/read | ⚠️ N/A (pre-step outputs not expanded) |

PR: feat: max-model-multiplier-cap guardrail
Author: @Copilot | Assignees: @lpcox, @Copilot

Overall: PARTIAL PASS — MCP ✅, pre-step data unavailable for HTTP/file tests.

📰 BREAKING: Report filed by Smoke Copilot

Contributor

github-actions Bot commented Jun 2, 2026

refactor(api-proxy): extract duplicate collectLogOutput() into shared test helper ✅
Harden test-coverage-improver prompt injection and trim unused tool surface ✅
GitHub page title ✅
File write/readback ✅
npm ci && npm run build ✅
Overall status: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex

Contributor

github-actions Bot commented Jun 2, 2026

Smoke Test: API Proxy OpenTelemetry Tracing

| Scenario | Result | Notes |
|---|---|---|
| S1: Module Loading | ✅ Pass | otel.js loads successfully; exports: startRequestSpan, setTokenAttributes, endSpan, endSpanError, shutdown, isEnabled + internal helpers |
| S2: Test Suite | ✅ Pass | 33/33 tests pass in otel.test.js |
| S3: Env Var Forwarding | ✅ Pass | api-proxy-service.ts forwards OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS, GITHUB_AW_OTEL_TRACE_ID, GITHUB_AW_OTEL_PARENT_SPAN_ID; OTEL_SERVICE_NAME defaults to awf-api-proxy |
| S4: Token Tracker Integration | ✅ Pass | onUsage callback exists in token-tracker-http.js as the OTEL hook point |
| S5: OTEL Diagnostics | ✅ Pass | No OTLP endpoint configured; spans fall back to /var/log/api-proxy/otel.jsonl per graceful-degradation design |

All 5 scenarios passed. OTEL tracing integration is working correctly.

📡 OTel tracing validated by Smoke OTel Tracing

Contributor

github-actions Bot commented Jun 2, 2026

Chroot Version Comparison

| Runtime | Host Version | Chroot Version | Match? |
|---|---|---|---|
| Python | Python 3.12.13 | Python 3.12.3 | ❌ NO |
| Node.js | v24.16.0 | v22.22.3 | ❌ NO |
| Go | go1.22.12 | go1.22.12 | ✅ YES |

Overall: ❌ Not all tests passed — Python and Node.js versions differ between host and chroot.

Tested by Smoke Chroot

Contributor

github-actions Bot commented Jun 2, 2026

Smoke test results: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

💎 Faceted by Smoke Gemini

Contributor

github-actions Bot commented Jun 2, 2026

🏗️ Build Test Suite Results

Ecosystem Project Build/Install Tests Status
Bun elysia 1/1 passed ✅ PASS
Bun hono 1/1 passed ✅ PASS
C++ fmt N/A ✅ PASS
C++ json N/A ✅ PASS
Deno oak N/A 1/1 passed ✅ PASS
Deno std N/A 1/1 passed ✅ PASS
.NET hello-world N/A ✅ PASS
.NET json-parse N/A ✅ PASS
Go color passed ✅ PASS
Go env passed ✅ PASS
Go uuid passed ✅ PASS
Java gson 1/1 passed ✅ PASS
Java caffeine 1/1 passed ✅ PASS
Node.js clsx passed ✅ PASS
Node.js execa passed ✅ PASS
Node.js p-limit passed ✅ PASS
Rust fd 1/1 passed ✅ PASS
Rust zoxide 1/1 passed ✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for issue #4215 · sonnet46 970.4K

Contributor

github-actions Bot commented Jun 2, 2026

Smoke Test: GitHub Actions Services Connectivity

| Check | Result |
|---|---|
| Redis PING | ❌ No response |
| PostgreSQL pg_isready | ❌ No response |
| PostgreSQL SELECT 1 | ❌ No response |

Overall: FAIL. host.docker.internal is not reachable from this runner environment. Service containers appear unreachable.

🔌 Service connectivity validated by Smoke Services

Development

Successfully merging this pull request may close these issues.

max-model-multiplier

3 participants