How failures are returned on the OpenAI-compatible and Anthropic-compatible APIs, and how your application should react.
OpenAI-compatible (/v1/*)
{
  "error": {
    "message": "Human-readable detail",
    "type": "invalid_request_error",
    "param": "messages[0].tool_call_id",
    "code": "insufficient_balance"
  }
}
| Field | Meaning |
|---|---|
| message | Primary explanation |
| type | e.g. invalid_request_error, rate_limit_error |
| param | JSON path when validation failed |
| code | Machine hint when set (insufficient_balance, quota_exceeded, ip_not_allowed, upstream_rate_limited, …) |
Some responses include error.gateway with extra structured hints.
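As an illustration of reading this envelope, here is a minimal Python sketch. The names OpenAIStyleError and parse_openai_error are hypothetical helpers, not part of any SDK, and the sketch assumes only the fields listed in the table above. The shape of error.gateway is not described here, so it is passed through untouched.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class OpenAIStyleError:
    """Hypothetical container for the fields of the /v1/* error envelope."""
    message: str
    type: Optional[str] = None
    param: Optional[str] = None
    code: Optional[str] = None
    gateway: Optional[Any] = None  # shape not documented here; kept as-is


def parse_openai_error(body: str) -> Optional[OpenAIStyleError]:
    """Return the parsed error, or None if the body is not an error envelope."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None
    err = data.get("error") if isinstance(data, dict) else None
    if not isinstance(err, dict):
        return None
    return OpenAIStyleError(
        message=str(err.get("message", "")),
        type=err.get("type"),
        param=err.get("param"),
        code=err.get("code"),
        gateway=err.get("gateway"),
    )
```

Branching on code when it is set, and falling back to type otherwise, keeps application logic independent of the wording in message.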
Anthropic-compatible (/anthropic/v1/*)
{
  "type": "error",
  "error": { "type": "authentication_error", "message": "..." },
  "request_id": "req_..."
}
Record request_id and include it when you open a support ticket.
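A small sketch of extracting these fields in Python follows. AnthropicStyleError and parse_anthropic_error are hypothetical names chosen for this example, and the code assumes only the three top-level keys shown in the sample above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AnthropicStyleError:
    """Hypothetical container for the /anthropic/v1/* error envelope."""
    type: str
    message: str
    request_id: Optional[str]


def parse_anthropic_error(body: str) -> Optional[AnthropicStyleError]:
    """Return the parsed error, or None if the body is not an error envelope."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None
    # The envelope is recognised by its top-level "type": "error" marker.
    if not isinstance(data, dict) or data.get("type") != "error":
        return None
    err = data.get("error")
    if not isinstance(err, dict):
        return None
    return AnthropicStyleError(
        type=str(err.get("type", "")),
        message=str(err.get("message", "")),
        request_id=data.get("request_id"),
    )
```

Logging request_id alongside the error at the point of failure means the value is already on hand if a ticket is needed later.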
HTTP status codes
| Status | Meaning | Client action |
|---|---|---|
| 400 | Bad body or unsupported content | Fix payload; do not retry blindly |
| 401 | Missing/invalid/expired API key | Fix Authorization / x-api-key |
| 403 | Policy: model, balance, IP, keys | Fix config; usually no retry |
| 429 | Rate limited | Exponential backoff with jitter |
| 502/503 | Upstream or transient gateway | Limited retries |
| 504 | Timeout | Increase client timeout; check payload size |
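One way to encode the table above in client code is a simple lookup, sketched here in Python. The Action enum and action_for_status are hypothetical names for this example; statuses not listed in the table map to UNKNOWN rather than to a guessed behaviour.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical client actions, one per row of the status table."""
    FIX_PAYLOAD = "fix payload; do not retry blindly"
    FIX_CREDENTIALS = "fix Authorization / x-api-key"
    FIX_CONFIG = "fix config; usually no retry"
    BACKOFF = "exponential backoff with jitter"
    LIMITED_RETRY = "limited retries"
    CHECK_TIMEOUT = "increase client timeout; check payload size"
    UNKNOWN = "not covered by the table"


_ACTIONS = {
    400: Action.FIX_PAYLOAD,
    401: Action.FIX_CREDENTIALS,
    403: Action.FIX_CONFIG,
    429: Action.BACKOFF,
    502: Action.LIMITED_RETRY,
    503: Action.LIMITED_RETRY,
    504: Action.CHECK_TIMEOUT,
}


def action_for_status(status: int) -> Action:
    """Look up the suggested client action for an HTTP status code."""
    return _ACTIONS.get(status, Action.UNKNOWN)
```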
Retry policy
- Retry: 429, 502, 503, 504 and transient network errors.
- Do not retry without change: 400, 401, most 403.
- Make side-effecting tool calls idempotent so that a retried request cannot apply the same effect twice.
- Cap retries (e.g. 3–5) with backoff from ~1s up to ~60s.
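The policy above could be sketched in Python as follows. This is an illustration under stated assumptions, not a reference client: call_with_retries and backoff_delay are hypothetical names, send is a caller-supplied function returning a (status, body) tuple, and transient network errors are assumed to surface as the built-in ConnectionError or TimeoutError. The jitter scheme, a random delay between half of and the full exponential ceiling, is one common choice among several.

```python
import random
import time
from typing import Callable, Tuple

RETRYABLE_STATUSES = {429, 502, 503, 504}


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before the next try: exponential growth from `base`, capped at `cap`,
    with jitter drawn between half the ceiling and the ceiling."""
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(ceiling / 2, ceiling)


def call_with_retries(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        is_last = attempt == max_attempts - 1
        try:
            status, body = send()
        except (ConnectionError, TimeoutError):
            if is_last:
                raise  # out of attempts: surface the network error
        else:
            # 400, 401, 403 and successes are returned immediately, never retried.
            if status not in RETRYABLE_STATUSES or is_last:
                return status, body
        sleep(backoff_delay(attempt))
    raise RuntimeError("unreachable")
```

Passing sleep as a parameter keeps the function testable without real waiting. With the defaults, the waits before the second, third, and fourth attempts fall roughly in the ranges 0.5–1 s, 1–2 s, and 2–4 s.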
Correlation
Send an x-trace-id header on every request. When opening a ticket, include that ID together with the timestamp, the model, and the request headers with credentials redacted (see Support information and Request correlation).
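A minimal Python sketch of generating a trace ID and assembling ticket details is shown below. The function names are hypothetical, and the use of a random UUID is an assumption, since the expected format of x-trace-id is not specified here. The redaction list covers only the two credential headers named in the status table (Authorization and x-api-key).

```python
import uuid
from datetime import datetime, timezone
from typing import Dict

# Credential headers named in this page; extend as needed for your setup.
SENSITIVE_HEADERS = {"authorization", "x-api-key"}


def new_trace_headers() -> Dict[str, str]:
    """Build an x-trace-id header. A UUID is an assumed format, not a requirement."""
    return {"x-trace-id": uuid.uuid4().hex}


def redact_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Replace credential values so the headers are safe to paste into a ticket."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


def ticket_context(trace_id: str, model: str, headers: Dict[str, str]) -> dict:
    """Collect the details a support ticket should carry."""
    return {
        "trace_id": trace_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "headers": redact_headers(headers),
    }
```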