Smart Routing Architecture

Understand how Transend AI evaluates providers, applies policies, and fails over traffic to maximise uptime.

Updated 2025-02-15

Why routing matters

Different AI providers excel at different workloads, and their availability fluctuates. Transend AI’s routing engine continuously scores latency, cost, and health signals so that each request lands on the most reliable target, without requiring you to rewrite integrations.

Routing pipeline

  1. Policy evaluation:
    Incoming requests are matched against workspace rules (allowed providers, model families, regional compliance tags, and cost ceilings).

  2. Provider scoring:
    We ingest heartbeat and error metrics from every upstream provider. Scores weigh:

    • rolling p95/p99 latency
    • recent error-budget consumption
    • credit balances and concurrency limits
    • custom weights you define per workspace

  3. Execution & observability:
    Requests are proxied to the chosen provider. We emit structured logs with routing metadata (provider, model_version, fallback_reason) for each segment.

  4. Fallback loop:
    If we detect hard failures or SLA breaches mid-flight, the payload automatically fails over to the next candidate. All fallbacks are recorded so you can audit downstream behaviour.
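
The scoring and fallback steps above can be sketched roughly as follows. This is a minimal illustration, not Transend AI's actual implementation: the `ProviderStats` fields, the scoring formula, and the `send` callback are all assumptions chosen to mirror the signals listed in step 2 and the audit trail described in step 4.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProviderStats:
    # All fields are hypothetical stand-ins for the signals listed in step 2.
    name: str
    p95_latency_ms: float
    error_rate: float        # fraction of recent requests that failed (0.0 to 1.0)
    has_capacity: bool       # credit balance and concurrency headroom available
    weight: float = 1.0      # custom per-workspace weight


def score(stats: ProviderStats) -> float:
    """Higher is better. The formula is an assumption, not the real one."""
    if not stats.has_capacity:
        return 0.0
    latency_term = 1.0 / (1.0 + stats.p95_latency_ms / 1000.0)
    health_term = 1.0 - stats.error_rate
    return stats.weight * latency_term * health_term


def rank(candidates: list[ProviderStats]) -> list[ProviderStats]:
    """Drop unusable providers and order the rest best-first."""
    usable = [c for c in candidates if score(c) > 0.0]
    return sorted(usable, key=score, reverse=True)


def route(candidates: list[ProviderStats], send: Callable[[str], str]):
    """Try providers best-first; return the response and an audit trail."""
    trail = []
    for provider in rank(candidates):
        try:
            response = send(provider.name)
        except Exception as exc:  # treated as a hard failure in this sketch
            trail.append({"provider": provider.name, "fallback_reason": str(exc)})
            continue
        return response, trail
    raise RuntimeError(f"all providers failed: {trail}")
```

A multiplicative score is used here so that a provider with no capacity is excluded outright rather than merely penalised; a weighted sum would be an equally plausible choice.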

Configuring policies

Navigate to Routing → Policies in the dashboard:

Policy              Description                                    Example
Provider allowlist  Restrict traffic to vetted vendors             anthropic, openai, gemini
Region pinning      Keep data residency compliant                  us-east, eu-central
Cost guardrails     Abort if projected cost exceeds a threshold    $0.20 per request
Model overrides     Prefer specific model families                 claude-3.5-sonnet

Policies can be applied globally or scoped to API keys.
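
To make the four policy types concrete, here is a small sketch of how they might filter and order routing candidates. The `Policy` and `Candidate` classes, their field names, and the `PolicyViolation` exception are hypothetical; the real policy schema is whatever the dashboard exposes.

```python
from dataclasses import dataclass, field
from typing import Optional


class PolicyViolation(Exception):
    """Raised when no candidate satisfies the policy (hypothetical name)."""


@dataclass(frozen=True)
class Candidate:
    provider: str
    model: str
    region: str
    projected_cost_usd: float


@dataclass
class Policy:
    # An empty set means "no restriction" in this sketch.
    allowed_providers: set = field(default_factory=set)
    allowed_regions: set = field(default_factory=set)
    max_cost_usd: Optional[float] = None
    preferred_models: tuple = ()


def apply_policy(policy: Policy, candidates: list) -> list:
    """Filter candidates by allowlist, region, and cost, then prefer overrides."""
    kept = []
    for c in candidates:
        if policy.allowed_providers and c.provider not in policy.allowed_providers:
            continue
        if policy.allowed_regions and c.region not in policy.allowed_regions:
            continue
        if policy.max_cost_usd is not None and c.projected_cost_usd > policy.max_cost_usd:
            continue
        kept.append(c)
    if not kept:
        raise PolicyViolation("no candidate satisfies the policy")
    # Stable sort: preferred models first, original order otherwise preserved.
    kept.sort(key=lambda c: c.model not in policy.preferred_models)
    return kept
```

For example, a policy with `allowed_regions={"eu-central"}` and `max_cost_usd=0.20` would drop a `us-east` candidate and any candidate projected above $0.20, and would raise if nothing is left.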

Streaming fallbacks

Streaming responses stay resilient by buffering partial tokens. When a provider fails mid-stream:

  1. We flush the buffered completion to the client.
  2. The engine replays the in-flight conversation with the backup provider.
  3. The response resumes streaming, marked with a fallback annotation so your client can notify users if desired.
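
The three steps above can be sketched with Python generators. This is an illustration under stated assumptions, not the engine's real code: the event dictionaries, the `stream_fn(messages, partial)` signature, and the use of `ConnectionError` to signal a mid-stream failure are all hypothetical. Tokens are forwarded to the client as they arrive, and the `sent` list stands in for the buffer that is replayed to the backup provider.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical signature: (conversation messages, text already sent) -> tokens
StreamFn = Callable[[list, str], Iterable[str]]


def stream_with_fallback(messages: list, providers: list) -> Iterator[dict]:
    """Yield token events, switching provider if a stream breaks.

    `providers` is an ordered list of (name, stream_fn) pairs.
    """
    sent = []  # tokens already delivered to the client
    for index, (name, stream_fn) in enumerate(providers):
        try:
            if index > 0:
                # Annotation so the client can tell users a fallback happened.
                yield {"type": "fallback", "provider": name}
            # Replay the conversation plus the partial completion so far.
            for token in stream_fn(messages, "".join(sent)):
                sent.append(token)
                yield {"type": "token", "text": token}
            return
        except ConnectionError:
            continue  # try the next candidate
    raise RuntimeError("all providers failed mid-stream")
```

Passing the partial completion to the backup lets it continue the answer rather than restart it; whether a given provider can resume cleanly from a partial completion depends on that provider.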

Visibility

Use the dashboard or the /events API to query routing events. Example:

curl "https://api.transendai.net/v1/events?routed=true" \
  -H "Authorization: Bearer $TRANSEND_API_KEY"

Each event includes route_id, primary_provider, fallback_provider, latency metrics, and cost deltas.
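
The same query can be issued from Python with the standard library. The endpoint, query parameter, and header come from the curl example above; the response shape (a JSON list of event objects carrying the fields just listed) is an assumption, as are the helper names.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.transendai.net/v1/events"


def build_events_request(api_key: str, **filters: str) -> urllib.request.Request:
    """Build the authenticated GET request for the /events endpoint."""
    query = urllib.parse.urlencode(filters)
    return urllib.request.Request(
        f"{BASE_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def fetch_events(api_key: str) -> list:
    """Fetch routed events. Assumes the body is a JSON list of events."""
    request = build_events_request(api_key, routed="true")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fallback_pairs(events: list) -> list:
    """Return (primary, fallback) provider pairs for events that fell over."""
    return [
        (e["primary_provider"], e["fallback_provider"])
        for e in events
        if e.get("fallback_provider")
    ]
```

Calling `fallback_pairs(fetch_events(os.environ["TRANSEND_API_KEY"]))` (with `import os`) would then list which providers were replaced and by which backups.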

Best practices

  • Combine Smart Routing with private connectors to keep proprietary models in rotation.
  • Set up Slack or PagerDuty alerts for spikes in fallback rates, which are a leading indicator of provider regressions.
  • Review routing analytics weekly to adjust weightings based on your workload priorities.
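
As a starting point for the fallback-rate alert, the check below compares the current fallback rate with a baseline. It is only a sketch: the event shape reuses the `fallback_provider` field from the events API, while the thresholds (`min_rate`, `ratio`) are arbitrary assumptions you would tune, and delivery to Slack or PagerDuty is left out.

```python
def fallback_rate(events: list) -> float:
    """Fraction of events that were served by a fallback provider."""
    if not events:
        return 0.0
    fallbacks = sum(1 for e in events if e.get("fallback_provider"))
    return fallbacks / len(events)


def should_alert(current: float, baseline: float,
                 min_rate: float = 0.05, ratio: float = 2.0) -> bool:
    """Alert when the rate is non-trivial and at least `ratio` times baseline.

    Both thresholds are illustrative defaults, not recommended values.
    """
    return current >= min_rate and current >= baseline * ratio
```

The `min_rate` floor avoids paging on noise when the baseline is near zero, where even a single fallback would otherwise look like a large relative spike.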

Questions? Email [email protected] for an architecture review.