End-to-end guide for files, PDFs, and multimodal context on chat requests.

Typical flow

1. Prepare file content (local base64 encoding)
2. POST /chat/completions with type: "file" + file.file_data in messages
3. (optional) POST /files to upload and store a file
4. (optional) DELETE /files/:id to clean up

Reference: Files, Create chat completion.

Referencing files in messages

When content is an array, add a part with type "file" and nest its fields in a file object:

file field    Description
file_data     Inline base64 data; supports the data:<mime>;base64,... format
filename      File name, used for MIME type inference

{
  "role": "user",
  "content": [
    { "type": "text", "text": "Summarize the attachment" },
    { "type": "file", "file": { "filename": "spec.pdf", "file_data": "data:application/pdf;base64,…" } }
  ]
}
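
The message above can be assembled from local bytes as in the following minimal sketch. The helper names build_file_part and build_user_message are hypothetical, and the fallback to application/octet-stream for unknown extensions is an assumption rather than documented behavior.

```python
import base64
import mimetypes


def build_file_part(filename: str, data: bytes) -> dict:
    """Return a content part of type "file" carrying inline base64 data."""
    # Guess the MIME type from the file name; the generic fallback is an assumption.
    mime, _ = mimetypes.guess_type(filename)
    mime = mime or "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "file",
        "file": {
            "filename": filename,
            "file_data": f"data:{mime};base64,{encoded}",
        },
    }


def build_user_message(prompt: str, filename: str, data: bytes) -> dict:
    """Combine a text part and a file part into one user message."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            build_file_part(filename, data),
        ],
    }
```

In practice the bytes would come from reading the local file in binary mode, and the returned dict goes into the messages array of the chat request.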

Upload limits

  • Default max size ~32 MB per file (deployment-specific).
  • The multipart field name is file; purpose defaults to assistants.
  • Files are scoped to the API key that uploaded them.
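
As a rough illustration of these limits, the sketch below checks the size locally and builds a multipart/form-data body for POST /files using only the standard library. The encode_upload helper is hypothetical, the 32 * 1024 * 1024 byte limit is an assumed reading of "~32 MB" (check your deployment), and sending purpose as a form field is also an assumption.

```python
import uuid

# Assumed default; the real limit is deployment-specific.
MAX_UPLOAD_BYTES = 32 * 1024 * 1024


def encode_upload(filename, data, purpose="assistants", max_bytes=MAX_UPLOAD_BYTES):
    """Return (body, content_type) for a multipart upload with field name "file"."""
    if len(data) > max_bytes:
        raise ValueError(f"{filename} is {len(data)} bytes; limit is {max_bytes}")
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="purpose"\r\n\r\n'
        f"{purpose}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"
```

The returned content type goes into the Content-Type header of the upload request. Rejecting oversized files before sending avoids a wasted round trip.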

PDF preprocess (gateway)

On chat requests:

{
  "pdf_preprocess": {
    "engine": "ocr",
    "max_pages": 20
  }
}

engine      Behavior
native      Forward unchanged
ocr         OCR for upstream
markdown    Extract Markdown
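
One way to attach this option to a chat request is sketched below. The with_pdf_preprocess helper and the model name in the usage note are hypothetical. Treating max_pages as optional and rejecting values below 1 are assumptions, not documented gateway rules.

```python
# Engine names taken from the table above.
ENGINES = {"native", "ocr", "markdown"}


def with_pdf_preprocess(request, engine, max_pages=None):
    """Return a copy of a chat request body with pdf_preprocess set."""
    if engine not in ENGINES:
        raise ValueError(f"unknown engine {engine!r}; expected one of {sorted(ENGINES)}")
    options = {"engine": engine}
    if max_pages is not None:
        # Assumption: a page cap below 1 is never meaningful.
        if max_pages < 1:
            raise ValueError("max_pages must be at least 1")
        options["max_pages"] = max_pages
    # Copy rather than mutate so the caller's dict stays reusable.
    return {**request, "pdf_preprocess": options}
```

For example, with_pdf_preprocess({"model": "example-model", "messages": []}, "ocr", 20) yields a body whose pdf_preprocess matches the JSON shown above.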

pdf_preprocess is also available on Anthropic Messages requests (see Create message).

Images

Use image_url with http/https URLs. When a request needs image input, pick a model whose catalog input_modalities include vision.

Also supported: file + inline base64 for local images.
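
The sketch below shows one possible shape for an image part and a modality check. The nested {"image_url": {"url": ...}} layout follows the common OpenAI-style convention and is an assumption here, as is the shape of a catalog entry. Both helper names are hypothetical.

```python
from urllib.parse import urlparse


def build_image_part(url):
    """Return an image_url content part, accepting only http/https URLs."""
    scheme = urlparse(url).scheme.lower()
    if scheme not in ("http", "https"):
        raise ValueError(f"unsupported URL scheme {scheme!r}; use http or https")
    # Assumed OpenAI-style nesting; confirm against the API reference.
    return {"type": "image_url", "image_url": {"url": url}}


def supports_vision(model_entry):
    """Check whether a catalog entry lists vision among its input_modalities."""
    return "vision" in model_entry.get("input_modalities", [])
```

Checking supports_vision before sending lets a client fail early instead of getting an error from a text-only model.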

Cleanup

  • DELETE stale file_ids on a schedule.
  • Avoid sensitive data in uploads; do not log file bodies.
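
A scheduled cleanup job could look roughly like the sketch below. It assumes each file record has id and created_at (Unix seconds) fields and that the API uses Bearer authentication. None of these are confirmed by this guide, and the helper names are hypothetical. Only the DELETE /files/:id route comes from the flow above.

```python
import time
import urllib.request


def stale_file_ids(files, max_age_seconds, now=None):
    """Return ids of file records older than max_age_seconds."""
    now = time.time() if now is None else now
    # Assumed record shape: {"id": ..., "created_at": <unix seconds>}.
    return [f["id"] for f in files if now - f["created_at"] > max_age_seconds]


def build_delete_request(base_url, api_key, file_id):
    """Build (but do not send) a DELETE /files/:id request."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/files/{file_id}",
        method="DELETE",
        # Assumed auth scheme; never log this header or any file bodies.
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Passing each built request to urllib.request.urlopen would perform the deletion. Keeping request construction separate makes the selection logic easy to test without network access.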

Related