Symptom-based guide for inference issues.

401 Unauthorized

Check         Action
Header shape  OpenAI: Authorization: Bearer <key>; Anthropic: x-api-key or Bearer
Key state     Not disabled/expired in console
Env mix-up    Non-production keys used with production Base URL
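
The header shapes in the table can be sketched as a small helper. This is only an illustration: build_auth_headers and its style argument are hypothetical names, and the x-api-key form is shown for the Anthropic style even though Bearer is also listed as accepted.

```python
from typing import Dict


def build_auth_headers(api_key: str, style: str = "openai") -> Dict[str, str]:
    """Return auth headers for the given API style ("openai" or "anthropic").

    Hypothetical helper: the header names come from the table above,
    everything else is an assumption made for illustration.
    """
    key = api_key.strip()  # stray whitespace/newlines are a common 401 cause
    if not key:
        raise ValueError("API key is empty")
    if style == "openai":
        return {"Authorization": f"Bearer {key}"}
    if style == "anthropic":
        return {"x-api-key": key}
    raise ValueError(f"unknown style: {style!r}")
```

Stripping the key before use guards against a trailing newline copied from a secrets file, which would otherwise produce a malformed header.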

403 Forbidden

Cause                                  Action
insufficient_balance / quota_exceeded  Console balance/quota
Model not allowed                      Tenant entitlement for model id
ip_not_allowed                         Update IP allow list (Network and access)
Upstream keys exhausted                Contact support

400 Bad Request

Cause                           Action
param under messages            Fix tool_call_id, content part types
Unsupported part                Only text, image_url, input_audio, input_file
Body too large                  ~32MB cap on chat/embeddings
Missing max_tokens (Anthropic)  Required on Messages
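
Most of the causes in this table can be caught before the request is sent. The following is a minimal sketch, assuming an OpenAI-style chat body with a messages list; preflight is a hypothetical helper, and reading "~32MB" as 32 * 1024 * 1024 bytes is an assumption, since the exact cap is not stated.

```python
import json
from typing import List

ALLOWED_PART_TYPES = {"text", "image_url", "input_audio", "input_file"}
MAX_BODY_BYTES = 32 * 1024 * 1024  # assumption: "~32MB" read as 32 MiB


def preflight(body: dict, anthropic_messages: bool = False) -> List[str]:
    """Return a list of likely 400 causes found in the request body."""
    problems = []
    if anthropic_messages:
        # The table only states that max_tokens is required on Messages.
        if "max_tokens" not in body:
            problems.append("max_tokens is required on Messages")
    else:
        for i, msg in enumerate(body.get("messages", [])):
            content = msg.get("content")
            if isinstance(content, list):
                for j, part in enumerate(content):
                    ptype = part.get("type") if isinstance(part, dict) else None
                    if ptype not in ALLOWED_PART_TYPES:
                        problems.append(
                            f"messages[{i}].content[{j}]: unsupported part type {ptype!r}"
                        )
            if msg.get("role") == "tool" and not msg.get("tool_call_id"):
                problems.append(f"messages[{i}]: tool message missing tool_call_id")
    size = len(json.dumps(body).encode("utf-8"))
    if size > MAX_BODY_BYTES:
        problems.append(f"body is {size} bytes, above the ~32MB cap")
    return problems
```

An empty list means none of these specific checks fired; the server may still reject the request for other reasons.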

Read error.message and param in the response body; together they usually identify the offending field.
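
A minimal sketch of pulling those two fields out of a failed response. It assumes the error is returned as a JSON object with a top-level "error" key holding message and param; that envelope shape and the describe_error name are assumptions, not something this guide specifies.

```python
import json


def describe_error(status: int, raw_body: str) -> str:
    """Summarise an error response as one line for logs."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        # Proxies and gateways sometimes answer with HTML or plain text.
        return f"HTTP {status}: non-JSON body: {raw_body[:200]!r}"
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        return f"HTTP {status}: no error object in body"
    summary = f"HTTP {status}: {err.get('message', '<no message>')}"
    if err.get("param"):
        summary += f" (param={err['param']})"
    return summary
```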

429 Too Many Requests

Back off and retry; see Rate limits.
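
One common way to back off is exponential delay with full jitter, sketched below. The send callable, its (status, retry_after, body) return shape, and the base/cap values are all hypothetical; whether this API returns a Retry-After hint is an assumption, so the code falls back to computed delays when none is given.

```python
import random
import time
from typing import Callable, Optional, Tuple


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 0.5, cap: float = 30.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None and retry_after >= 0:
        return retry_after  # honour a server-provided hint when present
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)  # full jitter spreads out retries


def call_with_backoff(send: Callable[[], Tuple[int, Optional[float], str]],
                      max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send() until it returns something other than 429 or attempts run out."""
    status, body = 0, ""
    for attempt in range(max_attempts):
        status, retry_after, body = send()
        if status != 429:
            break
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, retry_after))
    return status, body
```

Passing sleep as a parameter keeps the retry loop testable without real waiting.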

Streaming issues

Symptom              Action
Burst after silence  Disable proxy buffering
Mid-stream drop      Check timeouts and client-side cancellation
In-stream error      Treat as an error, not success (see Errors)
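
The last two rows can be handled in the stream reader itself. The sketch below assumes server-sent events with "data:" lines, a "data: [DONE]" terminator, and errors delivered as a JSON object containing an "error" key; all three are assumptions about the wire format, and iter_stream_chunks and StreamError are hypothetical names.

```python
import json
from typing import Iterable, Iterator


class StreamError(RuntimeError):
    """Raised when a stream carries an error or ends early."""


def iter_stream_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed chunks; raise instead of silently treating failure as success."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alives and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # clean end of stream
        chunk = json.loads(data)
        if isinstance(chunk, dict) and "error" in chunk:
            err = chunk["error"]
            message = err.get("message") if isinstance(err, dict) else str(err)
            raise StreamError(f"in-stream error: {message}")
        yield chunk
    # The input ran out without a terminator: likely a mid-stream drop.
    raise StreamError("stream ended without [DONE]")
```

An HTTP 200 on the initial response says nothing about what arrives later in the stream, which is why the reader raises rather than returning a partial result.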

5xx / timeouts

  1. Reproduce with smaller max_tokens / shorter input.
  2. Retry or switch model.
  3. Escalate with x-trace-id, time, and model (see Support information).
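
For step 3, it helps to capture the escalation details at the moment of failure. A minimal sketch, assuming x-trace-id arrives as a response header; support_report and its field names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Mapping, Optional


def support_report(status: int, headers: Mapping[str, str], model: str,
                   when: Optional[datetime] = None) -> dict:
    """Collect the details to include when escalating a 5xx or timeout."""
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    when = when or datetime.now(timezone.utc)
    return {
        "status": status,
        "x-trace-id": lowered.get("x-trace-id", "<missing>"),
        "time_utc": when.astimezone(timezone.utc).isoformat(),
        "model": model,
    }
```

Logging this dictionary alongside the failing request avoids having to reproduce the problem just to recover the trace id.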

Empty model list

  • GET /models works without a key?
  • Filters too strict?
  • Model not enabled for you in the console?

Related

Errors · Production checklist