Observability & Cost Analytics

Instrument spend and latency across every provider with Transend AI’s dashboards, webhooks, and warehouse exports. · Updated 2025-02-28

Metrics at a glance

The Observability tab visualises usage across providers, models, and teams. Dashboards update in near real time, so you can answer questions such as:

  • Which provider handled the most tokens this hour?
  • Are latency or error rates spiking for a specific model?
  • Which teams are trending above their credit budgets?

Dashboard widgets

  • Provider Mix: stacked area chart splitting tokens by provider.
  • Latency Heatmap: p50/p95 latency per point of presence (POP).
  • Cost Timeline: credits consumed vs. alert thresholds.
  • Fallback Events: counts and reasons over time.

Every widget supports CSV export and scheduled email digests.
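
If you want to reproduce the Latency Heatmap figures from exported data, the following is a minimal sketch. It assumes you already have (POP, latency in milliseconds) pairs and uses the nearest-rank percentile method; the method the dashboard uses is not documented here, so results may differ slightly. The POP names `fra` and `iad` are hypothetical.

```python
from collections import defaultdict


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list; pct is an integer 1..100."""
    rank = (pct * len(sorted_values) + 99) // 100  # integer ceil(pct * n / 100)
    return sorted_values[max(rank, 1) - 1]


def latency_by_pop(samples):
    """Group (pop, latency_ms) pairs and return {pop: {"p50": ..., "p95": ...}}."""
    grouped = defaultdict(list)
    for pop, latency_ms in samples:
        grouped[pop].append(latency_ms)
    return {
        pop: {
            "p50": percentile(sorted(values), 50),
            "p95": percentile(sorted(values), 95),
        }
        for pop, values in grouped.items()
    }


# Hypothetical samples; real data would come from an export.
samples = [
    ("fra", 100), ("fra", 120), ("fra", 130), ("fra", 200), ("fra", 900),
    ("iad", 80), ("iad", 95),
]
summary = latency_by_pop(samples)
```

A single slow request moves p95 far more than p50, which is why the heatmap shows both.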

Webhooks

Set up webhooks under Observability → Alerts. We support Slack, PagerDuty, email, and generic HTTP endpoints. A cost threshold alert delivers a payload like this:

{
  "type": "cost.threshold.breached",
  "workspaceId": "wrk_123",
  "threshold": 0.75,
  "amount": 820.45,
  "period": "2025-02-28",
  "topProviders": ["anthropic", "openai"]
}

All events are signed with an HMAC secret so you can verify authenticity.
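
A receiver for a generic HTTP endpoint might check the signature along these lines. This is a sketch under assumptions: the hash algorithm (SHA-256), the hex encoding, and signing over the raw request body are not stated on this page, and neither is the name of the header that carries the signature. Confirm those details in your Alerts settings before relying on it.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, body: bytes, received_hex: str) -> bool:
    """Return True if received_hex matches the HMAC of the raw request body.

    Assumes HMAC-SHA256 with a hex-encoded digest (not confirmed by the docs).
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters matched.
    return hmac.compare_digest(expected.encode(), received_hex.encode())


# Hypothetical secret and body, signed locally to show a round trip.
secret = b"example-webhook-secret"
body = b'{"type": "cost.threshold.breached", "workspaceId": "wrk_123"}'
signature = hmac.new(secret, body, hashlib.sha256).hexdigest()

is_valid = verify_signature(secret, body, signature)
is_tampered_valid = verify_signature(secret, body + b" ", signature)
```

Verify against the raw bytes you received, before any JSON parsing, since re-serialising the payload can change whitespace and break the comparison.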

Warehouse sync

Enterprise plans can push raw request logs to Snowflake, BigQuery, or S3. Enable a destination, provide credentials, and select:

  • Granularity: per-request or aggregated (5-minute buckets).
  • Fields: prompt hashes, provider, latency, cost, user ID (hashed).
  • Retention: 30, 60, or 90 days.

Note: exporting prompt content is opt-in. Unless you enable it, we ship metadata only, which helps you meet compliance requirements.
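
To illustrate how the two granularities relate, here is a rough sketch that rolls per-request rows up into 5-minute buckets. The row shape (`ts`, `provider`, `cost`) and the choice to key buckets by provider are assumptions for illustration, not the actual export schema.

```python
from collections import defaultdict
from datetime import datetime, timezone

BUCKET_SECONDS = 300  # 5-minute buckets


def bucket_start(ts: datetime) -> datetime:
    """Floor a timezone-aware timestamp to the start of its 5-minute bucket (UTC)."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % BUCKET_SECONDS, tz=timezone.utc)


def aggregate(rows):
    """Sum request counts and cost per (bucket start, provider)."""
    totals = defaultdict(lambda: {"requests": 0, "cost": 0.0})
    for row in rows:
        key = (bucket_start(row["ts"]), row["provider"])
        totals[key]["requests"] += 1
        totals[key]["cost"] += row["cost"]
    return dict(totals)


# Hypothetical per-request rows.
rows = [
    {"ts": datetime(2025, 2, 28, 12, 1, 10, tzinfo=timezone.utc), "provider": "anthropic", "cost": 0.25},
    {"ts": datetime(2025, 2, 28, 12, 4, 59, tzinfo=timezone.utc), "provider": "anthropic", "cost": 0.5},
    {"ts": datetime(2025, 2, 28, 12, 5, 0, tzinfo=timezone.utc), "provider": "anthropic", "cost": 0.125},
]
buckets = aggregate(rows)
```

The first two rows land in the 12:00 bucket and the third starts the 12:05 bucket. Per-request exports keep this detail; aggregated exports trade it for smaller tables.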

Querying the /usage API

Fetch usage programmatically:

curl "https://api.transendai.net/v1/usage?group_by=provider&period=7d" \
  -H "Authorization: Bearer $TRANSEND_API_KEY"
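
The same request can be built from Python with the standard library. This sketch mirrors the curl call above and only constructs the request; the helper name `build_usage_request` and the fallback key are illustrative.

```python
import os
import urllib.parse
import urllib.request

BASE_URL = "https://api.transendai.net/v1/usage"


def build_usage_request(api_key, group_by="provider", period="7d"):
    """Build a GET request equivalent to the curl example."""
    query = urllib.parse.urlencode({"group_by": group_by, "period": period})
    return urllib.request.Request(
        f"{BASE_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


# "placeholder-key" is only a stand-in so the sketch runs without the variable set.
request = build_usage_request(os.environ.get("TRANSEND_API_KEY", "placeholder-key"))

# To send it and read the body:
#   with urllib.request.urlopen(request, timeout=10) as response:
#       payload = response.read()
```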

Response snippet:

{
  "period": "7d",
  "usage": [
    { "provider": "anthropic", "tokens": 18450320, "cost": 421.88 },
    { "provider": "openai", "tokens": 15220110, "cost": 389.55 }
  ]
}
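
Once you have a response, a common next step is comparing providers by cost share and unit cost. The sketch below works on the sample response above; the function name `summarise` is illustrative, and the unit of `cost` is whatever the API returns (this page does not specify it).

```python
import json

SAMPLE_RESPONSE = """
{
  "period": "7d",
  "usage": [
    { "provider": "anthropic", "tokens": 18450320, "cost": 421.88 },
    { "provider": "openai", "tokens": 15220110, "cost": 389.55 }
  ]
}
"""


def summarise(response):
    """Return each provider's share of total cost and its cost per million tokens."""
    total_cost = sum(item["cost"] for item in response["usage"])
    return [
        {
            "provider": item["provider"],
            "cost_share": item["cost"] / total_cost,
            "cost_per_million_tokens": item["cost"] / item["tokens"] * 1_000_000,
        }
        for item in response["usage"]
    ]


summary = summarise(json.loads(SAMPLE_RESPONSE))
for row in summary:
    print(
        f'{row["provider"]}: {row["cost_share"]:.1%} of cost, '
        f'{row["cost_per_million_tokens"]:.2f} per million tokens'
    )
```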

Alerting tips

  • Configure credit depletion alerts at 50%, 75%, and 90% to avoid interruptions.
  • Pair latency anomaly alerts with routing policies so we shift traffic automatically when latency degrades.
  • Use webhook retries (max 5) to make sure downstream systems stay in sync.
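
If you mirror the staged thresholds from the first tip in your own tooling, a check like the following fires each threshold once, at the moment usage crosses it. This is a sketch that assumes thresholds are fractions of the budget consumed (matching the `"threshold": 0.75` style in the webhook payload) and that the budget is greater than zero; the function name is illustrative.

```python
ALERT_THRESHOLDS = (0.50, 0.75, 0.90)


def newly_crossed(previous_used, current_used, budget, thresholds=ALERT_THRESHOLDS):
    """Return the thresholds crossed between two usage readings.

    A threshold counts as crossed when the previous reading was below it
    and the current reading is at or above it. Assumes budget > 0.
    """
    prev_ratio = previous_used / budget
    curr_ratio = current_used / budget
    return [t for t in thresholds if prev_ratio < t <= curr_ratio]


# A jump from 40% to 80% of a 1000-credit budget crosses two thresholds at once.
crossed = newly_crossed(400, 800, 1000)
```

Comparing against the previous reading, rather than only the current one, keeps an alert from repeating on every poll while usage sits above a threshold.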

Need deeper visibility? Contact the Transend AI team to discuss custom dashboards or Looker/Mode integrations.