Smart Routing Architecture
Understand how Transend AI evaluates providers, applies policies, and fails over traffic to maximise uptime. · Updated 2025-02-15
Why routing matters
Different AI providers excel at different workloads, and their availability fluctuates. Transend AI’s routing engine continuously scores latency, cost, and health signals so that each request lands on the most reliable target, without requiring you to rewrite integrations.
Routing pipeline
- Policy evaluation: Incoming requests are matched against workspace rules (allowed providers, model families, regional compliance tags, and cost ceilings).
- Provider scoring: We ingest heartbeat and error metrics from every upstream provider. Scores weigh:
  - rolling latency (p95/p99)
  - recent error budgets
  - credit balances and concurrency limits
  - custom weights you define per workspace
- Execution & observability: Requests are proxied to the chosen provider. We emit structured logs with routing metadata (provider, model_version, fallback_reason) for each segment.
- Fallback loop: If we detect hard failures or SLA breaches mid-flight, the payload automatically fails over to the next candidate. All fallbacks are recorded so you can audit downstream behaviour.
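To make the four stages concrete, here is a minimal Python sketch of the same flow: filter by policy, score, execute, and fall back. It is not Transend AI's implementation. The `Provider` and `Policy` classes, the scoring formula, and the `call` callback are all assumptions made for illustration; only the field names `provider` and `fallback_reason` come from the log metadata described above.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    name: str
    region: str
    p95_latency_ms: float       # rolling p95 latency
    error_rate: float           # recent error fraction, 0.0 to 1.0
    est_cost_usd: float         # projected cost for this request
    custom_weight: float = 1.0  # workspace-defined multiplier


@dataclass
class Policy:
    allowed_providers: set
    allowed_regions: set
    cost_ceiling_usd: float


class ProviderError(Exception):
    """Hard failure reported by an upstream call (hypothetical)."""


def eligible(providers, policy):
    """Policy evaluation: keep providers that satisfy the workspace rules."""
    return [
        p for p in providers
        if p.name in policy.allowed_providers
        and p.region in policy.allowed_regions
        and p.est_cost_usd <= policy.cost_ceiling_usd
    ]


def score(p):
    """Provider scoring: higher is better. The formula is illustrative only."""
    health = 1.0 - p.error_rate
    speed = 1.0 / (1.0 + p.p95_latency_ms / 1000.0)
    return p.custom_weight * health * speed


def route(request, providers, policy, call):
    """Execution and fallback: try candidates in score order, log fallbacks."""
    candidates = sorted(eligible(providers, policy), key=score, reverse=True)
    fallbacks = []
    for p in candidates:
        try:
            result = call(p, request)
            return {"provider": p.name, "result": result, "fallbacks": fallbacks}
        except ProviderError as exc:
            fallbacks.append({"provider": p.name, "fallback_reason": str(exc)})
    raise ProviderError(f"all candidates failed: {fallbacks}")


providers = [
    Provider("anthropic", "us-east", 800, 0.01, 0.05),
    Provider("openai", "us-east", 600, 0.02, 0.04),
    Provider("gemini", "eu-central", 500, 0.01, 0.03),
]
policy = Policy({"anthropic", "openai"}, {"us-east"}, 0.20)


def flaky_call(p, request):
    # Simulate one provider failing so the fallback path is exercised.
    if p.name == "openai":
        raise ProviderError("timeout")
    return f"{p.name} handled {request}"


outcome = route("hello", providers, policy, flaky_call)
print(outcome["provider"], outcome["fallbacks"])
```

In this run the policy excludes the eu-central provider, the lower-latency candidate is tried first and fails, and the request lands on the next candidate with the failure recorded for auditing.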
Configuring policies
Navigate to Routing → Policies in the dashboard:
| Policy | Description | Example |
|---|---|---|
| Provider allowlist | Restrict traffic to vetted vendors | anthropic, openai, gemini |
| Region pinning | Keep data residency compliant | us-east, eu-central |
| Cost guardrails | Abort if projected cost exceeds a threshold | $0.20 per request |
| Model overrides | Prefer specific model families | claude-3.5-sonnet |
Policies can be applied globally or scoped to API keys.
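One way to picture global versus key-scoped policies is as a base dictionary with per-key overrides. The sketch below is hypothetical: the setting names, the key ID `key_eu_batch`, and the rule that key-scoped values take precedence over global ones are assumptions, not documented behaviour. The example values are taken from the table above.

```python
# Global policy, using the example values from the table above.
GLOBAL_POLICY = {
    "provider_allowlist": ["anthropic", "openai", "gemini"],
    "region_pinning": ["us-east", "eu-central"],
    "cost_guardrail_usd": 0.20,
    "model_overrides": ["claude-3.5-sonnet"],
}

# Hypothetical per-key overrides; the key ID is made up.
KEY_POLICIES = {
    "key_eu_batch": {
        "region_pinning": ["eu-central"],
        "cost_guardrail_usd": 0.05,
    },
}


def effective_policy(api_key_id):
    """Merge global settings with key-scoped ones.

    Assumption: a key-scoped setting replaces the global value outright.
    """
    merged = dict(GLOBAL_POLICY)
    merged.update(KEY_POLICIES.get(api_key_id, {}))
    return merged


print(effective_policy("key_eu_batch"))
```

A key without overrides simply inherits the global policy, which keeps the common case free of per-key configuration.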
Streaming fallbacks
Streaming responses stay resilient by buffering partial tokens. When a provider fails mid-stream:
- We flush the buffered completion to the client.
- The engine replays the in-flight conversation with the backup provider.
- The response resumes streaming, marked with a fallback annotation so your client can notify users if desired.
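The three steps above can be sketched as a Python generator. This is an illustration under stated assumptions, not the engine's code: `StreamFailure`, the `(kind, value)` event shape, the `flush_every` buffer size, and the provider callables that accept a `prefix` argument are all hypothetical.

```python
class StreamFailure(Exception):
    """Raised when a provider drops the stream mid-response (hypothetical)."""


def stream_with_fallback(prompt, primary, backup, flush_every=2):
    """Yield (kind, value) events, switching providers on mid-stream failure."""
    sent = []     # tokens already delivered to the client
    pending = []  # tokens buffered but not yet delivered
    try:
        for token in primary(prompt, ""):
            pending.append(token)
            if len(pending) >= flush_every:
                yield from (("token", t) for t in pending)
                sent.extend(pending)
                pending.clear()
        # Normal completion: deliver whatever is still buffered.
        yield from (("token", t) for t in pending)
    except StreamFailure as exc:
        # Step 1: flush the buffered partial completion to the client.
        yield from (("token", t) for t in pending)
        sent.extend(pending)
        # Step 3 marker: annotate the stream so the client can notify users.
        yield ("fallback", str(exc))
        # Step 2: replay the conversation plus the partial output on the backup.
        yield from (("token", t) for t in backup(prompt, "".join(sent)))


seen_prefixes = []


def flaky_primary(prompt, prefix):
    yield "The "
    yield "quick "
    yield "brown "
    raise StreamFailure("connection reset")


def steady_backup(prompt, prefix):
    # A real backup would continue from `prefix`; this continuation is canned.
    seen_prefixes.append(prefix)
    yield "fox "
    yield "jumps."


events = list(stream_with_fallback("story", flaky_primary, steady_backup))
text = "".join(value for kind, value in events if kind == "token")
print(events)
print(text)
```

Passing the already-delivered text to the backup as a prefix is one plausible way to avoid repeating tokens the client has seen; the actual replay mechanism is not specified here.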
Visibility
Use the dashboard or the /events API to query routing events. Example:
curl "https://api.transendai.net/v1/events?routed=true" \
-H "Authorization: Bearer $TRANSEND_API_KEY"
Each event includes route_id, primary_provider, fallback_provider, latency metrics, and cost deltas.
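For scripted access, the same query can be issued from Python using only the standard library. This sketch mirrors the curl call above; the helper names are hypothetical, and it assumes the endpoint returns a JSON body, which is not confirmed here.

```python
import json
import urllib.parse
import urllib.request

EVENTS_URL = "https://api.transendai.net/v1/events"


def build_events_request(api_key):
    """Build the same request as the curl example above."""
    query = urllib.parse.urlencode({"routed": "true"})
    return urllib.request.Request(
        f"{EVENTS_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def fetch_routed_events(api_key):
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(build_events_request(api_key), timeout=10) as resp:
        return json.load(resp)


# Building the request does not touch the network, so it is safe to inspect.
req = build_events_request("example-key")
print(req.full_url)
```

Call `fetch_routed_events(os.environ["TRANSEND_API_KEY"])` (after `import os`) to run the query against the live API.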
Best practices
- Combine Smart Routing with private connectors to keep proprietary models in rotation.
- Set up Slack or PagerDuty alerts for spikes in fallback rates; a rising rate is a leading indicator of provider regressions.
- Review routing analytics weekly to adjust weightings based on your workload priorities.
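As a rough illustration of the alerting advice, the sketch below computes a fallback rate from routing events and flags a spike against a baseline. It assumes each event is a dictionary whose `fallback_provider` field is empty or missing when no fallback occurred; the `ratio` and `min_rate` thresholds are arbitrary starting points, not recommended values.

```python
def fallback_rate(events):
    """Fraction of events that fell back to a second provider."""
    if not events:
        return 0.0
    fell_back = sum(1 for e in events if e.get("fallback_provider"))
    return fell_back / len(events)


def is_spike(current_rate, baseline_rate, ratio=2.0, min_rate=0.05):
    """Flag a spike when the rate is non-trivial and well above baseline."""
    return current_rate >= min_rate and current_rate >= ratio * baseline_rate


# Synthetic sample data: 1 fallback in 20 events, then 3 fallbacks in 10.
baseline = [{"route_id": i, "fallback_provider": None} for i in range(19)]
baseline.append({"route_id": 19, "fallback_provider": "openai"})
current = [{"route_id": i, "fallback_provider": None} for i in range(7)]
current += [{"route_id": 7 + i, "fallback_provider": "openai"} for i in range(3)]

alert = is_spike(fallback_rate(current), fallback_rate(baseline))
print(fallback_rate(baseline), fallback_rate(current), alert)
```

The `min_rate` floor keeps a jump from a near-zero baseline from paging anyone; tune both thresholds against your own traffic before wiring this to Slack or PagerDuty.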
Questions? Email [email protected] for an architecture review.