Per-key budgets and cost monitoring
Production cost-control patterns: per-app keys with model_allowlist + budget_cap_usd, plus Slack alerts when spend crosses 80% and 100% of the monthly budget.
Outline
- One key per service: why proliferating keys is cheaper than one shared key
- Setting `budget_cap_usd` + `model_allowlist` via the dashboard or `nex keys create`
- What happens when a key crosses its budget (HTTP 402 + auto-suspend)
- Reading `/v1/usage` for cost rollups; per-key, per-model, per-day
- Slack webhook recipe: cron a 5-min job that polls /v1/usage and posts to your #ai-spend channel
- Per-provider spend dashboard via `/admin/provider/spend` (admin-only, for ops)
- Forecasting: linear-fit the last 14 days of spend and surface "trending to overshoot budget on day N"
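The alert thresholds and the forecasting item above could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: the function names `alert_level` and `forecast_overshoot_day` are hypothetical, and the functions take plain numbers because the response shape of /v1/usage is not specified here. It also assumes the line is fitted to daily spend (rather than cumulative spend) and that "day N" counts days from today.

```python
from typing import Optional, Sequence


def alert_level(spend_usd: float, budget_cap_usd: float) -> Optional[str]:
    """Map month-to-date spend to an alert tier: 80% warns, 100% is exceeded."""
    if budget_cap_usd <= 0:
        raise ValueError("budget_cap_usd must be positive")
    ratio = spend_usd / budget_cap_usd
    if ratio >= 1.0:
        return "exceeded"
    if ratio >= 0.8:
        return "warning"
    return None


def forecast_overshoot_day(
    daily_spend_usd: Sequence[float],
    budget_cap_usd: float,
    month_to_date_usd: float,
    horizon_days: int = 31,
) -> Optional[int]:
    """Return N where projected spend first reaches the cap N days from now.

    Returns 0 if the cap is already reached and None if it is not reached
    within horizon_days. Assumption: the line is fitted to daily spend, and
    projected daily values are accumulated onto month-to-date spend.
    """
    n = len(daily_spend_usd)
    if n < 2:
        raise ValueError("need at least two days of spend to fit a line")
    if month_to_date_usd >= budget_cap_usd:
        return 0

    # Ordinary least squares for y = intercept + slope * x, with x = 0..n-1.
    mean_x = (n - 1) / 2
    mean_y = sum(daily_spend_usd) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_spend_usd))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    total = month_to_date_usd
    for day in range(1, horizon_days + 1):
        # Clamp at zero so a falling trend never projects negative spend.
        total += max(0.0, intercept + slope * (n - 1 + day))
        if total >= budget_cap_usd:
            return day
    return None
```

In the cron recipe, the job would feed these functions per-key numbers read from /v1/usage (passing the last 14 daily totals to the forecast) and post to the Slack webhook only when the alert tier changes, so the channel is not spammed every five minutes.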
Ship-date guess
Sprint 8 (paired with the upcoming team-budgets feature).