Skip to content

Redact error callstacks per-line to avoid wiping the entire stack#319484

Draft
bryanchen-d wants to merge 2 commits into
mainfrom
bryanchen/fix-callstack-full-redaction
Draft

Redact error callstacks per-line to avoid wiping the entire stack#319484
bryanchen-d wants to merge 2 commits into
mainfrom
bryanchen/fix-callstack-full-redaction

Conversation

@bryanchen-d
Copy link
Copy Markdown
Contributor

Fixes #301200

Problem

Error telemetry callstacks flow through cleanData() -> removePropertiesWithPossibleUserInfo() in src/vs/platform/telemetry/common/telemetryUtils.ts. That function replaced the entire property value with a redaction marker whenever any part matched one of its secret/PII heuristics:

for (const secretRegex of userDataRegexes) {
    if (secretRegex.regex.test(property)) {
        return `<REDACTED: ${secretRegex.label}>`; // replaces the whole value
    }
}

Because a callstack is a single multi-line string, a match in one frame collapsed the whole stack into a single <REDACTED: ...> marker — leaving the dashboard bucket (e.g. #301200) with an empty-looking message and no usable stack.

The two broad rules — URL and Generic Secret (/(key|token|sig|secret|signature|password|...)[^a-zA-Z0-9]/i) — over-match on benign stack content such as getStorageKey (, computeSignature(), or file:///https:// paths in frame URLs.

Impact (from error telemetry, ddtelvscode / RawEventsVSCode)

Metric (last 30 days) Events % of all errors Distinct machines
Total UnhandledError (baseline) 2.25B 100% 30.7M
Entire callstack collapsed to a marker 91.7M 4.08% 3.06M (~10%)
Contains <REDACTED: somewhere (healthy partial redaction) 358.7M 15.97% 11.4M

Stable across windows (58-day rate 4.05% ~= 30-day 4.08%). Of the fully-wiped stacks, URL ~= 69.4% and Generic Secret ~= 30.4% account for ~99.8%; all real credential matchers combined are <0.2%.

Fix

When a match is found, redaction is now applied per line, so only the offending frame is removed and the rest of the stack is preserved. Genuine PII protection is unchanged — any line containing a real secret is still fully redacted.

Error: Something failed
<REDACTED: Generic Secret>
    at Foo.run (out/vs/workbench/foo.js:3:40)
    at Bar.baz (out/vs/workbench/bar.js:5:60)

Performance

A fast path runs the existing regexes against the whole string first; if nothing matches (the common, no-PII case) the value is returned untouched with no split/allocation — identical cost to the previous implementation. The per-line split/map/join only happens for the rarer matching values, and cleanData is not a hot path (error telemetry is batched/deduplicated and flushed every 5s).

Test

Added a unit test asserting a stack containing a getStorageKey-style frame keeps all frames and only redacts the offending one (previously the whole stack collapsed to <REDACTED: Generic Secret>).

Notes / follow-up

This change is conservative and does not increase leak risk — the already-healthy ~12%/month partial-redaction path shows in-line redaction works correctly. A separate, higher-risk follow-up could tighten the over-matching URL / Generic Secret regexes themselves to reduce even the per-line redactions.

The telemetry PII pass (removePropertiesWithPossibleUserInfo) replaced the
entire property value with a redaction marker whenever any part matched one of
the secret/PII heuristics. Because a callstack is a single multi-line string, a
match in one frame (commonly the broad 'Generic Secret' and 'URL' rules firing
on benign function names or file:// paths) collapsed the whole stack into a
single '<REDACTED: ...>' marker, making the error undebuggable.

This affects ~4% of all UnhandledError telemetry (~92M events/month across ~3M
machines), ~99.8% of it from the URL and Generic Secret rules.

Now, when a match is found, redaction is applied per line so only the offending
frame is removed and the rest of the stack is preserved. A fast path returns the
value untouched (no split/allocation) when nothing matches, keeping the common
case identical to the previous cost.

Fixes #301200

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 1, 2026 23:42
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates telemetry PII/secret redaction so multi-line strings (notably error callstacks) are redacted per line instead of collapsing the entire value to a single <REDACTED: ...> marker, improving the usefulness of error telemetry while preserving redaction behavior.

Changes:

  • Refactors removePropertiesWithPossibleUserInfo() to keep a fast-path for non-matching strings and redact multi-line values line-by-line.
  • Extracts the PII/secret regex list into a module-level constant and adds a helper for line redaction.
  • Adds a unit test ensuring “Generic Secret” heuristic matches don’t wipe an entire callstack.
Show a summary per file
File Description
src/vs/platform/telemetry/common/telemetryUtils.ts Implements per-line redaction for multi-line values with a fast-path for non-matches; factors out shared regexes/helpers.
src/vs/platform/telemetry/test/browser/telemetryService.test.ts Adds a regression test asserting only the offending frame is redacted while the rest of the stack is preserved.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 4

Comment thread src/vs/platform/telemetry/common/telemetryUtils.ts
Comment thread src/vs/platform/telemetry/common/telemetryUtils.ts Outdated
Comment thread src/vs/platform/telemetry/test/browser/telemetryService.test.ts
Comment thread src/vs/platform/telemetry/common/telemetryUtils.ts Outdated
- Re-append the newline delimiter when matching each line so heuristics that
  rely on a trailing non-alphanumeric boundary (e.g. Generic Secret) don't
  under-redact the last token of a line vs the previous whole-string check.
- Correct JSDoc to state matching lines are replaced with a redaction marker
  (not merely removed) and that a marker string (not a label) is returned.
- Use single-quoted string literals in the new test for consistency.
- Add a regression test for the newline-delimiter equivalence case.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Unhandled Error] <REDACTED: URL>

2 participants