How RouterBrain fits in your inference stack and what each layer owns.
Data flow
┌─────────────┐ HTTPS ┌──────────────────┐ upstream ┌─────────────┐
│ Your backend │ ─────────────► │ RouterBrain │ ─────────────► │ Providers │
│ │ ◄───────────── │ gateway │ ◄───────────── │ (multi-route)│
└─────────────┘ JSON / SSE │ (OpenAI / Anthropic│ └─────────────┘
│ compatible HTTP) │
└──────────────────┘
│
▼
usage, policy, key auth
- Your services talk only to the gateway Base URL and API keys — not to each upstream directly.
- The gateway handles: tenant auth, model routing, upstream failover, streaming proxy, normalized errors, usage recording.
- The console handles: key lifecycle, model entitlements, balance/quota, and related settings (via the console — not inference HTTP APIs).
Choosing a protocol surface
| Situation | Recommendation |
|---|---|
| Existing OpenAI SDK / LangChain / most agents | OpenAI-compatible https://51kik.com/v1 |
| Existing Anthropic SDK / Messages API | https://51kik.com/anthropic/v1 |
| Multiple apps | One surface per integration path; shared catalog id as model |
Avoid mixing both Base URLs for the same logical conversation unless you implement explicit conversion in your app.
Keys and trust boundaries
| Location | Production API key? |
|---|---|
| Server, BFF, worker | Yes |
| Browser or mobile direct | No — proxy via your backend |
Console and admin APIs
Tenant setup, key rotation, and billing are done in the console. For inference integration, use the HTTP APIs in the diagram above (see Capabilities).