Streaming, tools, and request-body notes

client.chat.send forwards chatRequest in OpenAI Chat Completions shape: stream, tools, tool_choice, messages, response_format, and so on. Anything the server schema allows is sent as JSON (SDK types are often Record<string, unknown>-style to stay aligned with the upstream JSON).

Read the chat.send docs first for stream vs non-streaming behavior, streaming errors, and the plugins.pdf merge.


Streaming: what happens when stream: true

  1. Accept is text/event-stream; body is SSE.
  2. The SDK wraps the fetch ReadableStream in EventStream, JSON.parses each data: line, and yields each event as a for await chunk.
  3. data: [DONE] ends the stream cleanly (iterator completes, no value).
  4. If a parsed event has a root-level error, the SDK throws GatewaySseError (distinct from GatewayHttpError, which covers non-2xx HTTP); see Error types.
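The data:-line handling above can be sketched as a tiny parser. This is illustrative only; the SDK's EventStream additionally handles buffering, multi-line events, and error events:

```typescript
// Illustrative SSE data-line handling; not the SDK's actual EventStream.
type SseResult =
  | { kind: "done" }                  // data: [DONE] ends the stream
  | { kind: "event"; event: unknown } // a parsed JSON chunk
  | { kind: "skip" };                 // comments, blank lines, other SSE fields

function parseSseLine(line: string): SseResult {
  if (!line.startsWith("data:")) return { kind: "skip" };
  const payload = line.slice("data:".length).trim();
  if (payload === "[DONE]") return { kind: "done" };
  return { kind: "event", event: JSON.parse(payload) };
}
```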

Minimal consumer (text only)

Same idea as packages/sdk/examples: call extractTextDelta on each chunk to get the printable UTF-8 text.

import { RouterBrain, extractTextDelta } from "@routerbrain/sdk";

const client = new RouterBrain({ /* baseURL, auth, … */ });

const stream = await client.chat.send({
  chatRequest: {
    model: "openai/gpt-4o-mini",
    messages: [{ role: "user", content: "Explain streaming in one sentence." }],
    stream: true,
  },
});

for await (const chunk of stream) {
  const piece = extractTextDelta(chunk);
  if (piece) process.stdout.write(piece);
}

Cancel and timeout

Pass RequestOptions as the second argument: attach an AbortController's signal before calling client.chat.send to abort at the connection layer. Some chunks may already have been consumed by the time the abort fires, so align UI state with the abort.
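On the consumption side, the same signal can gate the loop. A minimal sketch, assuming the SDK's stream is a standard AsyncIterable (the `untilAborted` helper below is illustrative, not an SDK export):

```typescript
// Stop consuming an async stream once an AbortSignal fires.
// Chunks yielded before the abort have already reached the caller.
async function* untilAborted<T>(
  source: AsyncIterable<T>,
  signal: AbortSignal,
): AsyncGenerator<T> {
  for await (const chunk of source) {
    if (signal.aborted) return;
    yield chunk;
  }
}

// Usage sketch: the same controller.signal also goes into RequestOptions
// so the connection itself is torn down, not just the loop:
// const controller = new AbortController();
// const stream = await client.chat.send({ chatRequest }, { signal: controller.signal });
// for await (const chunk of untilAborted(stream, controller.signal)) { /* … */ }
```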

Failures after the stream starts

After HTTP 200, once the stream is open, upstream may still signal failure via SSE error events or an abnormal close; no “primary/standby route switch” happens at that point (see Tool calls and streaming). Clients should catch GatewaySseError, log the payload, and decide whether to retry the whole turn.


extractTextDelta: visible text deltas only

Implementation: packages/sdk/src/chat-delta.ts, summarized:

  • choices[0].delta.content is a string → returned as-is;
  • choices[0].delta.content is an array (multimodal parts) → the elements are walked and any text / content strings are concatenated;
  • otherwise → if choices[0].delta.reasoning is a non-empty string, it is returned; else "".
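The behavior above can be sketched as a pure function. This is a paraphrase of the documented behavior, not the actual chat-delta.ts source:

```typescript
// Paraphrase of the extractTextDelta behavior described above.
function extractTextDeltaSketch(chunk: unknown): string {
  const delta = (chunk as any)?.choices?.[0]?.delta ?? {};
  const content = delta.content;
  if (typeof content === "string") return content;
  if (Array.isArray(content)) {
    // Multimodal parts: concatenate any `text` or `content` strings.
    return content
      .map((part: any) =>
        typeof part?.text === "string" ? part.text
        : typeof part?.content === "string" ? part.content
        : "")
      .join("");
  }
  // Fall back to a reasoning delta when present, else "".
  return typeof delta.reasoning === "string" ? delta.reasoning : "";
}
```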

Intentionally not included:

  • tool_calls, function_call, refusal, etc.;
  • usage and finish_reason, which often arrive on final chunks or as separate events.

So for agents / tools, extractTextDelta alone is not enough: inside the for await loop, treat each chunk as unknown (or add type guards) and merge delta.tool_calls yourself. OpenAI often streams tool-call fragments keyed by index; merge them by index into a complete id / type / function.
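The index-based merge can be sketched as follows. Fragment shapes follow the OpenAI streaming format; the accumulator is illustrative, not an SDK helper:

```typescript
// A streamed tool-call fragment as OpenAI-style chunks deliver it.
interface ToolCallFragment {
  index: number;
  id?: string;
  type?: string;
  function?: { name?: string; arguments?: string };
}

interface MergedToolCall {
  id: string;
  type: string;
  function: { name: string; arguments: string };
}

// Merge fragments by `index`: id / type / name overwrite once known,
// argument strings are concatenated across chunks.
function mergeToolCallFragments(fragments: ToolCallFragment[]): MergedToolCall[] {
  const byIndex = new Map<number, MergedToolCall>();
  for (const frag of fragments) {
    const acc = byIndex.get(frag.index) ?? {
      id: "",
      type: "",
      function: { name: "", arguments: "" },
    };
    if (frag.id) acc.id = frag.id;
    if (frag.type) acc.type = frag.type;
    if (frag.function?.name) acc.function.name = frag.function.name;
    if (frag.function?.arguments) acc.function.arguments += frag.function.arguments;
    byIndex.set(frag.index, acc);
  }
  return [...byIndex.values()];
}
```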

Do not use extractTextDelta on a non-streaming ChatResult; read choices[0].message.* instead.


Tool calling (function calling) and the SDK

Request side

In chatRequest:

  • tools: OpenAI-style function definitions;
  • optional tool_choice ("auto", "none", or a specific function).

Support for parallel_tool_calls, object-shaped tool_choice, etc. depends on the server and model; the SDK does not rewrite these (aside from the global snake_case → wire-JSON mapping and the plugins.pdf → pdf_preprocess merge).
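For example, a chatRequest carrying one OpenAI-style function tool might look like this (the tool name and parameter schema are made up for illustration):

```typescript
// Hypothetical tool definition in OpenAI function-calling shape.
const chatRequest = {
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "What's the weather in Oslo?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather", // hypothetical tool
        description: "Look up current weather for a city",
        parameters: {
          type: "object",
          properties: { city: { type: "string" } },
          required: ["city"],
        },
      },
    },
  ],
  tool_choice: "auto", // or "none", or a specific function object
};
```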

Message side (multi-turn)

Typical loop:

  1. User asks;
  2. Assistant returns tool_calls (content may be null, matching OpenAI; see Tool calls and streaming);
  3. Append role: "tool" messages per tool_call_id, name, content, etc.;
  4. chat.send again until a final assistant text.

Under streaming, tool_calls may arrive across chunks—buffer client-side before running tools or sending the next round.
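Step 3 of the loop, turning completed tool_calls into role: "tool" messages, can be sketched like this (the handler map and message shape are illustrative, not SDK helpers):

```typescript
interface ToolCall {
  id: string;
  function: { name: string; arguments: string };
}

interface ToolMessage {
  role: "tool";
  tool_call_id: string;
  name: string;
  content: string;
}

type ToolHandler = (args: unknown) => Promise<string> | string;

// Run each tool and build the `tool` messages to append
// before the next chat.send round.
async function toolResultMessages(
  toolCalls: ToolCall[],
  handlers: Record<string, ToolHandler>,
): Promise<ToolMessage[]> {
  const out: ToolMessage[] = [];
  for (const call of toolCalls) {
    const handler = handlers[call.function.name];
    const content = handler
      ? String(await handler(JSON.parse(call.function.arguments)))
      : `No handler for ${call.function.name}`;
    out.push({
      role: "tool",
      tool_call_id: call.id,
      name: call.function.name,
      content,
    });
  }
  return out;
}
```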

Node: fromTurns and path attachments

With await client.chat.messages.fromTurns(...), only user turns may use image / pdfPath / audio / video attachments; assistant / system turns are text-only (prompt, etc.). Path attachments on non-user roles throw ContentBuildError before any HTTP request. See chat.messages and packages/sdk/README.md.


With multimodal and PDF

Tools can coexist with PDF / input_file, vision parts, etc.:

  • plugins.pdf → root pdf_preprocess is still merged via stringifyChatRequestOpenAiWire + mergePdfPreprocessIntoBody;
  • with a PDF but no explicit config, fromPaths / fromTurns may add a default plugins.pdf (see resolvePdfPreprocessForChatRequest), which differs slightly from hand-written messages.

Details: chat.send, PDF preprocessing.


See also

Back to docs home