Redact error callstacks per-line to avoid wiping the entire stack#319484
Draft
bryanchen-d wants to merge 2 commits into
Draft
Redact error callstacks per-line to avoid wiping the entire stack#319484bryanchen-d wants to merge 2 commits into
bryanchen-d wants to merge 2 commits into
Conversation
The telemetry PII pass (removePropertiesWithPossibleUserInfo) replaced the entire property value with a redaction marker whenever any part matched one of the secret/PII heuristics. Because a callstack is a single multi-line string, a match in one frame (commonly the broad 'Generic Secret' and 'URL' rules firing on benign function names or file:// paths) collapsed the whole stack into a single '<REDACTED: ...>' marker, making the error undebuggable. This affects ~4% of all UnhandledError telemetry (~92M events/month across ~3M machines), ~99.8% of it from the URL and Generic Secret rules. Now, when a match is found, redaction is applied per line so only the offending frame is removed and the rest of the stack is preserved. A fast path returns the value untouched (no split/allocation) when nothing matches, keeping the common case identical to the previous cost. Fixes #301200 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR updates telemetry PII/secret redaction so multi-line strings (notably error callstacks) are redacted per line instead of collapsing the entire value to a single <REDACTED: ...> marker, improving the usefulness of error telemetry while preserving redaction behavior.
Changes:
- Refactors
removePropertiesWithPossibleUserInfo()to keep a fast-path for non-matching strings and redact multi-line values line-by-line. - Extracts the PII/secret regex list into a module-level constant and adds a helper for line redaction.
- Adds a unit test ensuring “Generic Secret” heuristic matches don’t wipe an entire callstack.
Show a summary per file
| File | Description |
|---|---|
src/vs/platform/telemetry/common/telemetryUtils.ts |
Implements per-line redaction for multi-line values with a fast-path for non-matches; factors out shared regexes/helpers. |
src/vs/platform/telemetry/test/browser/telemetryService.test.ts |
Adds a regression test asserting only the offending frame is redacted while the rest of the stack is preserved. |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 4
- Re-append the newline delimiter when matching each line so heuristics that rely on a trailing non-alphanumeric boundary (e.g. Generic Secret) don't under-redact the last token of a line vs the previous whole-string check. - Correct JSDoc to state matching lines are replaced with a redaction marker (not merely removed) and that a marker string (not a label) is returned. - Use single-quoted string literals in the new test for consistency. - Add a regression test for the newline-delimiter equivalence case. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #301200
Problem
Error telemetry callstacks flow through
cleanData()->removePropertiesWithPossibleUserInfo()insrc/vs/platform/telemetry/common/telemetryUtils.ts. That function replaced the entire property value with a redaction marker whenever any part matched one of its secret/PII heuristics:Because a callstack is a single multi-line string, a match in one frame collapsed the whole stack into a single
<REDACTED: ...>marker — leaving the dashboard bucket (e.g. #301200) with an empty-looking message and no usable stack.The two broad rules —
URLandGeneric Secret(/(key|token|sig|secret|signature|password|...)[^a-zA-Z0-9]/i) — over-match on benign stack content such asgetStorageKey (,computeSignature(), orfile:///https://paths in frame URLs.Impact (from error telemetry,
ddtelvscode/RawEventsVSCode)UnhandledError(baseline)<REDACTED:somewhere (healthy partial redaction)Stable across windows (58-day rate 4.05% ~= 30-day 4.08%). Of the fully-wiped stacks,
URL~= 69.4% andGeneric Secret~= 30.4% account for ~99.8%; all real credential matchers combined are <0.2%.Fix
When a match is found, redaction is now applied per line, so only the offending frame is removed and the rest of the stack is preserved. Genuine PII protection is unchanged — any line containing a real secret is still fully redacted.
Performance
A fast path runs the existing regexes against the whole string first; if nothing matches (the common, no-PII case) the value is returned untouched with no
split/allocation — identical cost to the previous implementation. The per-linesplit/map/joinonly happens for the rarer matching values, andcleanDatais not a hot path (error telemetry is batched/deduplicated and flushed every 5s).Test
Added a unit test asserting a stack containing a
getStorageKey-style frame keeps all frames and only redacts the offending one (previously the whole stack collapsed to<REDACTED: Generic Secret>).Notes / follow-up
This change is conservative and does not increase leak risk — the already-healthy ~12%/month partial-redaction path shows in-line redaction works correctly. A separate, higher-risk follow-up could tighten the over-matching
URL/Generic Secretregexes themselves to reduce even the per-line redactions.