Why routing matters

Different AI providers excel at different workloads, and their availability fluctuates. Transend AI’s routing engine continuously scores latency, cost, and health signals so that each request lands on the most reliable target, without you rewriting integrations.

Routing pipeline

Policy evaluation:
Incoming requests are matched against workspace rules (allowed providers, model families, regional compliance tags, and cost ceilings).
Provider scoring:
We ingest heartbeat and error metrics from every upstream provider. Scores weigh:
- rolling p95/p99 latency
- recent error budgets
- credit balances and concurrency limits
- custom weights you define per workspace
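To make the scoring inputs concrete, here is a minimal sketch of how such signals could be combined into one number. It is not the engine's actual formula: the `ProviderStats` fields, the 0.4/0.4/0.2 weights, and the `latency_budget_ms` parameter are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProviderStats:
    """Hypothetical snapshot of the signals listed above."""
    name: str
    p95_latency_ms: float       # rolling p95 latency
    error_rate: float           # share of recent requests that failed, 0..1
    headroom: float             # remaining credit/concurrency headroom, 0..1
    custom_weight: float = 1.0  # workspace-defined multiplier


def score(p: ProviderStats, latency_budget_ms: float = 2000.0) -> float:
    """Return a score where higher is better. The weights are illustrative."""
    latency_term = max(0.0, 1.0 - p.p95_latency_ms / latency_budget_ms)
    health_term = 1.0 - p.error_rate
    base = 0.4 * latency_term + 0.4 * health_term + 0.2 * p.headroom
    return base * p.custom_weight


def rank(providers):
    """Order candidates from best to worst score."""
    return sorted(providers, key=score, reverse=True)
```

In this sketch the custom weight is a plain multiplier, so a workspace can promote a slower provider above a faster one without touching the other terms.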
Execution & observability:
Requests are proxied to the chosen provider. We emit structured logs with routing metadata (provider, model_version, fallback_reason) for each segment.
Fallback loop:
If we detect a hard failure or an SLA breach mid-flight, the request automatically fails over to the next candidate. Every fallback is recorded so you can audit downstream behaviour.
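The fallback loop can be pictured as a walk over the ranked candidates that records each attempt. The sketch below is illustrative only: `ProviderError`, `route_with_fallback`, and the `send` callback are hypothetical names, and the attempt records merely borrow the `provider` and `fallback_reason` field names from the log metadata described above.

```python
class ProviderError(Exception):
    """Hypothetical error standing in for a hard failure or SLA breach."""


def route_with_fallback(payload, candidates, send):
    """Try each candidate in order; return (response, attempts).

    `send(provider, payload)` is an assumed callback that either returns a
    response or raises ProviderError.
    """
    attempts = []  # audit trail, one record per provider tried
    for provider in candidates:
        try:
            response = send(provider, payload)
        except ProviderError as exc:
            # Record why this provider was skipped, then try the next one.
            attempts.append({"provider": provider, "fallback_reason": str(exc)})
            continue
        attempts.append({"provider": provider, "fallback_reason": None})
        return response, attempts
    raise ProviderError(f"all {len(candidates)} candidates failed")
```

Because the attempt list is returned with the response, the caller can log the whole chain in one place.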

Configuring policies

Navigate to Routing → Policies in the dashboard:

| Policy | Description | Example |
| --- | --- | --- |
| Provider allowlist | Restrict traffic to vetted vendors | `anthropic`, `openai`, `gemini` |
| Region pinning | Keep data residency compliant | `us-east`, `eu-central` |
| Cost guardrails | Abort if projected cost exceeds a threshold | `$0.20 per request` |
| Model overrides | Prefer specific model families | `claude-3.5-sonnet` |

Policies can be applied globally or scoped to API keys.
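As a rough mental model, policy evaluation is a filter that runs before scoring. The sketch below covers three of the policies above, using the example values from the table. The `Policy` and `Candidate` types and the `violations` helper are hypothetical and do not mirror the dashboard's actual schema. For simplicity the cost guardrail drops a candidate here, whereas the real policy is described as aborting the request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical in-memory form of a workspace policy."""
    allowed_providers: frozenset
    allowed_regions: frozenset
    max_cost_usd: float


@dataclass(frozen=True)
class Candidate:
    """Hypothetical routing target with a projected per-request cost."""
    provider: str
    region: str
    projected_cost_usd: float


def violations(policy, candidate):
    """Return the names of the rules this candidate breaks (empty if none)."""
    broken = []
    if candidate.provider not in policy.allowed_providers:
        broken.append("provider")
    if candidate.region not in policy.allowed_regions:
        broken.append("region")
    if candidate.projected_cost_usd > policy.max_cost_usd:
        broken.append("cost")
    return broken


def eligible(policy, candidates):
    """Keep only the candidates that break no rule."""
    return [c for c in candidates if not violations(policy, c)]
```

Returning the broken rule names, instead of a bare boolean, makes it easier to explain why a provider was excluded.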

Streaming fallbacks

Streaming responses stay resilient by buffering partial tokens. When a provider fails mid-stream:

1. We flush the buffered completion to the client.
2. The engine replays the in-flight conversation with the backup provider.
3. The response resumes streaming, marked with a fallback annotation so your client can notify users if desired.
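The steps above can be sketched as a generator that remembers what it has already delivered. This is an illustration under assumptions, not the engine's implementation: `StreamError`, `stream_with_fallback`, the `open_stream` callback, and the `("fallback", provider)` event shape are all hypothetical.

```python
class StreamError(Exception):
    """Hypothetical error raised when a provider fails mid-stream."""


def stream_with_fallback(messages, providers, open_stream):
    """Yield ("token", text) and ("fallback", provider) events.

    `open_stream(provider, messages, prefix)` is an assumed callback that
    yields tokens continuing from `prefix`, and may raise StreamError.
    """
    emitted = []  # tokens already delivered to the client
    for index, provider in enumerate(providers):
        if index > 0:
            # Annotate the switch so the client can notify users if desired.
            yield ("fallback", provider)
        try:
            # Replay the conversation plus the partial completion so the
            # backup provider continues where the failed one stopped.
            for token in open_stream(provider, messages, "".join(emitted)):
                emitted.append(token)
                yield ("token", token)
            return
        except StreamError:
            continue
    raise StreamError("all providers failed mid-stream")
```

A client that ignores `fallback` events sees one continuous token stream, while a client that handles them can show a notice at the switch point.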

Visibility

Use the dashboard or the /events API to query routing events. Example:

curl "https://api.transendai.net/v1/events?routed=true" \
  -H "Authorization: Bearer $TRANSEND_API_KEY"

Each event includes route_id, primary_provider, fallback_provider, latency metrics, and cost deltas.
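Once events are fetched, a per-provider fallback rate is a simple aggregation. The sketch below assumes the response has already been decoded into a list of dictionaries, and that `fallback_provider` is null or absent when no fallback occurred. Both are assumptions about the payload shape, so check them against a real response. `fallback_rates` is a hypothetical helper name.

```python
from collections import Counter


def fallback_rates(events):
    """Return {primary_provider: share of its requests that fell back}."""
    total = Counter()
    fell_back = Counter()
    for event in events:
        primary = event["primary_provider"]
        total[primary] += 1
        # Assumption: a missing or null fallback_provider means no fallback.
        if event.get("fallback_provider"):
            fell_back[primary] += 1
    return {provider: fell_back[provider] / total[provider] for provider in total}
```

A value such as `{"openai": 0.5}` would mean that half of the requests first sent to that provider ended up on a backup.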

Best practices

Combine Smart Routing with private connectors to keep proprietary models in rotation.
Set up Slack or PagerDuty alerts for spikes in fallback rates; a rising fallback rate is a leading indicator of provider regressions.
Review routing analytics weekly to adjust weightings based on your workload priorities.