Streaming reduces time-to-first-token and lets the UI render output incrementally. The OpenAI-compatible and Anthropic-compatible endpoints use different event shapes.

OpenAI-compatible chat

Request

| Item | Value |
| --- | --- |
| Body `stream` | `true` |
| `Accept` | `text/event-stream` |

curl -sSN "https://51kik.com/v1/chat/completions" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{"model":"YOUR_MODEL_ID","stream":true,"messages":[{"role":"user","content":"Hello"}]}'

SSE format

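Each `data:` line carries a JSON object whose `object` field is `"chat.completion.chunk"`. The stream ends with a literal `data: [DONE]` sentinel, which is not JSON: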

data: {"choices":[{"delta":{"content":"Hi"}}],...}
data: [DONE]
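
A minimal sketch of reading this format, assuming the response body is already available as an iterable of decoded text lines. The function name `iter_chunks` and the sample lines are illustrative only and are not part of any SDK.

```python
import json
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed chunk per `data:` line; stop at the [DONE] sentinel."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Skip blank separators, SSE comments, and any non-data fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # the sentinel is not JSON, so check it before json.loads
        yield json.loads(payload)


# Illustrative input; real chunks carry more fields than shown here.
sample = [
    'data: {"object":"chat.completion.chunk","choices":[{"delta":{"content":"Hi"}}]}',
    "",
    'data: {"object":"chat.completion.chunk","choices":[{"delta":{"content":" there"}}]}',
    "",
    "data: [DONE]",
]
text = "".join(
    choice["delta"].get("content") or ""
    for chunk in iter_chunks(sample)
    for choice in chunk["choices"]
)
print(text)  # Hi there
```

Because the generator yields each chunk as soon as its line arrives, a caller can render text before the response finishes.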

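Each `delta` may carry `content`, `tool_calls`, or other fields. For tool calls, merge the partial `tool_calls` fragments following the OpenAI streaming rules: group fragments by `index` and concatenate their `function.arguments` strings in arrival order.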
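
The tool-call merge can be sketched as follows, assuming fragments follow the OpenAI streaming shape (an `index` on every fragment, `id` and `function.name` on the first one, and `function.arguments` arriving as string pieces). The helper name `merge_tool_calls` and the sample deltas are hypothetical.

```python
from typing import Iterable


def merge_tool_calls(deltas: Iterable[dict]) -> list:
    """Merge streamed tool_calls fragments into complete calls, keyed by index."""
    calls = {}
    for delta in deltas:
        for frag in delta.get("tool_calls") or []:
            call = calls.setdefault(
                frag["index"], {"id": None, "name": "", "arguments": ""}
            )
            if frag.get("id"):
                call["id"] = frag["id"]
            fn = frag.get("function") or {}
            if fn.get("name"):
                call["name"] = fn["name"]
            # Argument text arrives in pieces; concatenate in arrival order.
            call["arguments"] += fn.get("arguments") or ""
    return [calls[i] for i in sorted(calls)]


# Illustrative deltas, as they would appear in choices[0].delta.
deltas = [
    {"tool_calls": [{"index": 0, "id": "call_1", "type": "function",
                     "function": {"name": "get_weather", "arguments": ""}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": '{"city":'}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": '"Paris"}'}}]},
    {"content": "plain text deltas are ignored here"},
]
merged = merge_tool_calls(deltas)
print(merged[0]["name"], merged[0]["arguments"])  # get_weather {"city":"Paris"}
```

The arguments string is only valid JSON once the stream has finished, so parse it after the last fragment rather than per chunk.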

Usage chunk

The gateway sets `stream_options.include_usage: true`. When the upstream supports it, a late chunk (typically the last one before `[DONE]`) includes a `usage` object for billing and metrics.
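
One way to pick up the usage data, sketched under the assumption that the `usage` object uses the OpenAI field names (`prompt_tokens`, `completion_tokens`, `total_tokens`). The helper `find_usage` and the sample values are made up for illustration. Since not every upstream sends usage, the caller should be prepared for `None`.

```python
from typing import Iterable, Optional


def find_usage(chunks: Iterable[dict]) -> Optional[dict]:
    """Return the last non-empty `usage` object seen in the stream, if any."""
    usage = None
    for chunk in chunks:
        # Most chunks omit usage or set it to null; keep the latest real one.
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage


# Illustrative chunks; the usage chunk may arrive with an empty choices list.
chunks = [
    {"choices": [{"delta": {"content": "Hi"}}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 9, "completion_tokens": 1, "total_tokens": 10}},
]
usage = find_usage(chunks)
print(usage["total_tokens"] if usage else "no usage reported")  # 10
```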

Mid-stream errors

If the JSON object on a `data:` line contains an `error` field, stop reading and treat the request as failed (the SDK surfaces this as `GatewaySseError`). See Errors.
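
For clients that do not use the SDK, the check might look like the sketch below. `StreamError` is a hypothetical local exception, not the SDK's `GatewaySseError`, and the shape of the error payload in the sample is an assumption.

```python
import json
from typing import Optional


class StreamError(Exception):
    """Hypothetical local exception; stands in for whatever your client raises."""


def parse_data_line(line: str) -> Optional[dict]:
    """Return a chunk dict, None for [DONE] or non-data lines, or raise on error."""
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None
    obj = json.loads(payload)
    if isinstance(obj, dict) and "error" in obj:
        # Stop consuming: anything received so far is a partial response.
        raise StreamError(obj["error"])
    return obj


try:
    # Assumed error shape, for illustration only.
    parse_data_line('data: {"error":{"message":"upstream timeout"}}')
    failed = False
except StreamError as exc:
    failed = True
    print("stream failed:", exc.args[0])
```

Raising, rather than returning the error object, keeps a half-finished response from being treated as a successful completion.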

Anthropic-compatible Messages

Send a POST to https://51kik.com/anthropic/v1/messages with `stream: true` in the body and the header `Accept: text/event-stream`. Events follow the Anthropic streaming format (`message_start`, `content_block_delta`, …). See Create message.
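
A rough sketch of consuming these events, assuming the gateway passes Anthropic events through unchanged and that the body is available as decoded text lines. The function name `collect_text` is hypothetical and the sample events are trimmed for brevity.

```python
import json
from typing import Iterable


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate text_delta pieces until message_stop."""
    parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # `event:` lines repeat the type found in the JSON payload
        event = json.loads(line[len("data:"):].strip())
        kind = event.get("type")
        if kind == "content_block_delta" and event["delta"].get("type") == "text_delta":
            parts.append(event["delta"]["text"])
        elif kind == "error":
            raise RuntimeError(event.get("error"))
        elif kind == "message_stop":
            break
    return "".join(parts)


# Illustrative, trimmed events.
sample = [
    "event: message_start",
    'data: {"type":"message_start","message":{"role":"assistant","content":[]}}',
    "",
    "event: content_block_delta",
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hel"}}',
    "",
    "event: content_block_delta",
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"lo"}}',
    "",
    "event: message_stop",
    'data: {"type":"message_stop"}',
]
result = collect_text(sample)
print(result)  # Hello
```

Unlike the OpenAI-compatible stream, there is no `[DONE]` sentinel here; the `message_stop` event marks the end.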

Proxies / CDN

| Symptom | Fix |
| --- | --- |
| Output arrives in a burst after a long silence | Disable response buffering (e.g. nginx `proxy_buffering off`) |
| Mid-stream disconnect | Raise the read timeout; define a retry policy |
| HTTP/2 + SSE | Verify that middleboxes support SSE |

Client checklist

Parse the stream incrementally; do not buffer the full body before parsing
Handle both the `[DONE]` sentinel and a connection close that arrives without it
Distinguish user cancellation from upstream errors
Disable SSE buffering in production proxies
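
The checklist items on end-of-stream handling could be combined roughly as below. This is a sketch under stated assumptions: the function `consume`, its return labels, and the use of a `threading.Event` as the cancel signal are all hypothetical choices, and cancellation is only checked when a line arrives.

```python
import json
import threading
from typing import Iterable


def consume(lines: Iterable[str], cancelled: threading.Event) -> str:
    """Classify how a stream ended: done, cancelled, error, or truncated."""
    try:
        for raw in lines:
            if cancelled.is_set():
                return "cancelled"  # user action: not an upstream failure
            line = raw.strip()
            if not line.startswith("data:"):
                continue
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":
                return "done"
            if "error" in json.loads(payload):
                return "error"  # upstream failure reported in-band
    except (ConnectionError, TimeoutError):
        return "error"  # transport failure mid-stream
    # The iterator ended without [DONE]: the connection closed early.
    return "truncated"


flag = threading.Event()
print(consume(['data: {"choices":[]}', "data: [DONE]"], flag))  # done
print(consume(['data: {"choices":[]}'], flag))  # truncated
flag.set()
print(consume(['data: {"choices":[]}'], flag))  # cancelled
```

Keeping "cancelled" separate from "error" and "truncated" lets retry logic and error reporting ignore stops that the user asked for.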