fix(providers/anthropic): forward FilePart.Filename as document title and warn on unsupported media types #38

Open
ethanndickson wants to merge 2 commits into coder_2_33 from anthropic-pdf-filename-context
@ethanndickson ethanndickson commented Jun 3, 2026

Summary

The Anthropic provider currently ignores FilePart.Filename for both PDFs and text documents, and it silently drops any other media type without surfacing a warning. As a result, the model has no handle it can use to refer back to an attachment, and unsupported attachments leave no trace for the caller.

This PR makes three small changes to providers/anthropic:

  1. PDF filename → document title. Forward file.Filename into DocumentBlockParam.Title on the application/pdf branch. The filename is sanitized first (see below).
  2. Text filename → document title. Same forwarding on the text/* branch, which also produces a DocumentBlockParam (via PlainTextSourceParam).
  3. Default branch with CallWarning. Match the other providers in this repo by emitting a fantasy.CallWarning when a FilePart media type is not handled, instead of silently dropping it.

    docBlock.OfDocument.Title = anthropic.String(sanitizeAnthropicDocumentTitle(file.Filename))

    default:
        warnings = append(warnings, fantasy.CallWarning{
            Type:    fantasy.CallWarningTypeOther,
            Message: fmt.Sprintf("file part media type %s not supported", file.MediaType),
        })

Why title and the sanitizer

Anthropic restricts document titles to alphanumerics, whitespace, hyphens, parentheses, and square brackets. A title that contains other runes (the . and _ characters that occur in almost every real filename, for example) is rejected with:

The document file name can only contain alphanumeric characters, whitespace characters, hyphens, parentheses, and square brackets.

The new sanitizeAnthropicDocumentTitle helper replaces disallowed runes with spaces, collapses consecutive whitespace, and trims. Empty or fully disallowed input falls back to "Document" so every attached document has a stable handle the model can refer back to.
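
The three steps described above (replace, collapse, trim, then fall back) could look roughly like the following. This is a minimal sketch and not the code in this PR: the name sanitizeTitleSketch is hypothetical, and it assumes "alphanumerics" means ASCII letters and digits, which the error message does not actually specify.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// sanitizeTitleSketch is a hypothetical stand-in for the PR's
// sanitizeAnthropicDocumentTitle helper; the real implementation may differ.
func sanitizeTitleSketch(name string) string {
	mapped := strings.Map(func(r rune) rune {
		switch {
		// Assumption: only ASCII letters and digits count as alphanumeric.
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9':
			return r
		case r == '-', r == '(', r == ')', r == '[', r == ']':
			return r
		case unicode.IsSpace(r):
			// Normalize every whitespace rune to a plain space.
			return ' '
		default:
			// Disallowed rune: replace with a space rather than dropping it,
			// so "a_b" becomes "a b" instead of "ab".
			return ' '
		}
	}, name)

	// strings.Fields splits on runs of whitespace and discards leading and
	// trailing whitespace, so re-joining both collapses and trims.
	title := strings.Join(strings.Fields(mapped), " ")
	if title == "" {
		// Empty or fully disallowed input falls back to a fixed handle.
		return "Document"
	}
	return title
}

func main() {
	fmt.Println(sanitizeTitleSketch("report_v2.final.pdf")) // report v2 final pdf
	fmt.Println(sanitizeTitleSketch("Q3 (draft) [v2].pdf")) // Q3 (draft) [v2] pdf
	fmt.Println(sanitizeTitleSketch("___"))                 // Document
}
```

Replacing disallowed runes with spaces instead of deleting them keeps word boundaries visible in the title, which is why the extension survives as a separate word ("pdf") rather than being fused onto the stem.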

Why the image branch is untouched

ImageBlockParam does not have a free-form filename or title slot, so there is no equivalent place to forward FilePart.Filename. The two branches touched here (application/pdf, text/*) are the full set of Anthropic content blocks that can carry a document title. The new default warning still covers everything else, including image MIME types that the existing image/* case does not handle (for example, image/heic).
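
To make the branch layout concrete, here is a self-contained sketch of how the media-type dispatch might be organized. The filePart and callWarning types are simplified local stand-ins rather than the real fantasy types, classifyFilePart is a hypothetical function, and the image allow-list is an assumption made only so that image/heic reaches the default branch as described above; the provider's actual image case may be written differently.

```go
package main

import (
	"fmt"
	"strings"
)

// filePart and callWarning are simplified stand-ins for illustration only;
// they are not the types defined by the fantasy package.
type filePart struct {
	Filename  string
	MediaType string
}

type callWarning struct {
	Type    string
	Message string
}

// classifyFilePart (hypothetical) reports which kind of content block a file
// part would become, appending a warning when no branch handles it.
func classifyFilePart(file filePart, warnings []callWarning) (string, []callWarning) {
	switch {
	// Assumption: the image case only accepts a fixed set of image types.
	case file.MediaType == "image/jpeg",
		file.MediaType == "image/png",
		file.MediaType == "image/gif",
		file.MediaType == "image/webp":
		return "image", warnings // no title slot, so the filename is not forwarded
	case file.MediaType == "application/pdf":
		return "document", warnings // title would be set from the sanitized filename
	case strings.HasPrefix(file.MediaType, "text/"):
		return "document", warnings // title would be set from the sanitized filename
	default:
		// Surface the drop to the caller instead of silently ignoring the part.
		warnings = append(warnings, callWarning{
			Type:    "other",
			Message: fmt.Sprintf("file part media type %s not supported", file.MediaType),
		})
		return "", warnings
	}
}

func main() {
	var warnings []callWarning
	for _, mt := range []string{"application/pdf", "text/markdown", "image/png", "image/heic"} {
		var kind string
		kind, warnings = classifyFilePart(filePart{Filename: "example", MediaType: mt}, warnings)
		fmt.Printf("%s -> %q\n", mt, kind)
	}
	for _, w := range warnings {
		fmt.Println("warning:", w.Message)
	}
}
```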

Upstream

The same gaps exist in charmbracelet/fantasy. Tracked in charmbracelet#267.

Closes

CODAGT-540 (follow-up to #37)

@ethanndickson ethanndickson changed the title from "fix(providers/anthropic): include PDF filename in document context" to "fix(providers/anthropic): forward FilePart.Filename to document context and warn on unsupported media types" Jun 3, 2026
Anthropic's DocumentBlockParam exposes a Title field that the model
uses when it refers back to an attached document. Forward FilePart.Filename
into that field so users can ask the model about a document by name.

The title is sanitized first: Anthropic restricts titles to alphanumerics,
whitespace, hyphens, parentheses, and square brackets, and returns
'The document file name can only contain alphanumeric characters,
whitespace characters, hyphens, parentheses, and square brackets.' for
any title containing other runes. Disallowed runes are replaced with
spaces, runs of whitespace are collapsed, and the result is trimmed.
Empty or fully disallowed input falls back to 'Document' so every
attached document has a stable handle, matching the invariant the
OpenAI provider already enforces with its part-N.pdf synthetic name.

The sanitizer is a Go port of the implementation in coder/mux
(src/node/utils/messages/sanitizeAnthropicDocumentFilename.ts); prior
art for sending filename as title also includes vercel/ai's
@ai-sdk/anthropic, which sets document.title from part.filename when
no provider-options title is supplied.
…ed media types

Mirror the PDF document-title handling on the text/* document branch
so text attachments also reach Anthropic with a stable handle the model
can refer back to. The filename runs through the same sanitizer; an
empty or fully disallowed filename falls back to 'Document'.

Also add a default case to the file MediaType switch that emits a
CallWarning when a FilePart's media type is not handled. Previously
the Anthropic provider silently dropped any file with a media type
other than image/*, application/pdf, or text/*, so unsupported
attachments left no trace for the caller. The new behavior matches
the openai, openaicompat, openrouter, and vercel providers, which
already warn on unsupported FilePart media types.
@ethanndickson ethanndickson force-pushed the anthropic-pdf-filename-context branch from 4331a3f to 492b6b0 Compare June 3, 2026 07:07
@ethanndickson ethanndickson changed the title from "fix(providers/anthropic): forward FilePart.Filename to document context and warn on unsupported media types" to "fix(providers/anthropic): forward FilePart.Filename as document title and warn on unsupported media types" Jun 3, 2026