// ultra-niche builds
by Josh · May 5, 2026 · 5 min read

Sentry Alerts for AI Agent Cost Anomalies (30-Minute Setup)

If you're running AI agents in production, you need a Sentry-style alert for runaway costs. Here's how to wire it in 30 minutes — no new infra, just a custom event channel.

If you're running AI agents in production, the worst Friday afternoon is the one where you find out at 5 PM that an agent has been looping since 2 AM. The bill is $4,000 you weren't planning on. The agent is still running.

This 30-minute setup makes that Friday afternoon impossible.

The pattern

Sentry's "Issues" feature can ingest custom events. Most teams use it for errors. It works just as well for cost anomalies.

Wire every AI API call to log to Sentry with cost metadata. Set alerts on:

- Per-function cost rate (alerts if any function exceeds $X/hour)
- Per-agent-instance cost (alerts if a single agent run exceeds $X)
- Daily total spend (alerts if daily spend exceeds $X)

When an alert fires, Sentry pages whoever is on-call. The runaway agent gets a human eye within minutes instead of hours.

The build

Step 1: Set up a Sentry project for AI events. (5 min)

You probably already have Sentry. Create a new project specifically for AI cost events (or use a tag on your existing project — either works).

Step 2: Wrap your AI client. (10 min)

If you're using the Anthropic SDK:

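```typescript
import * as Sentry from "@sentry/nextjs";
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

// USD per million tokens. Check these against current pricing before relying on them.
const PRICING: Record<string, { input: number; output: number }> = {
  "claude-opus-4-7": { input: 15, output: 75 },
  "claude-sonnet-4-6": { input: 3, output: 15 },
  "claude-haiku-4-5": { input: 0.25, output: 1.25 },
};

// The API's own `metadata` field is documented for `user_id`, so agent and
// function names travel in a separate argument instead of the request body.
type CallMeta = { agentName?: string; functionName?: string };

export async function callClaude(
  params: Anthropic.Messages.MessageCreateParamsNonStreaming,
  meta: CallMeta = {},
) {
  const start = Date.now();
  const result = await client.messages.create(params);
  const elapsed = Date.now() - start;

  // Unknown models fall back to Sonnet pricing, so update PRICING when you switch models.
  const pricing = PRICING[params.model] ?? PRICING["claude-sonnet-4-6"];
  const costUsd =
    (result.usage.input_tokens / 1_000_000) * pricing.input +
    (result.usage.output_tokens / 1_000_000) * pricing.output;

  Sentry.captureMessage("ai.call", {
    // Calls above $0.50 are flagged as warnings so they stand out.
    level: costUsd > 0.5 ? "warning" : "info",
    tags: {
      model: params.model,
      agent: meta.agentName ?? "unknown",
      function: meta.functionName ?? "unknown",
    },
    extra: {
      cost_usd: costUsd,
      input_tokens: result.usage.input_tokens,
      output_tokens: result.usage.output_tokens,
      elapsed_ms: elapsed,
    },
  });

  return result;
}
```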

Every Anthropic call now lands in Sentry with cost metadata.
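To make the cost arithmetic concrete, here is a minimal sketch of the same formula as a standalone function. `estimateCostUsd` and `Rate` are hypothetical names, and the $3 / $15 rate is only the Sonnet entry from the pricing table used as an example.

```typescript
// Hypothetical helper: the wrapper's cost formula pulled out as a pure function.
type Rate = { input: number; output: number }; // USD per million tokens

function estimateCostUsd(rate: Rate, inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1_000_000) * rate.input + (outputTokens / 1_000_000) * rate.output;
}

// Example: 200,000 input tokens and 4,000 output tokens at $3 / $15 per million.
// 0.2 * 3 + 0.004 * 15 = 0.60 + 0.06, so roughly $0.66.
const exampleCost = estimateCostUsd({ input: 3, output: 15 }, 200_000, 4_000);
console.log(exampleCost.toFixed(2)); // prints "0.66"
```

A call with a very large context can cross the $0.50 warning level on input tokens alone, which is why the ballooning-prompt case shows up quickly.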

Step 3: Set up the alerts. (10 min)

Three alerts in Sentry's UI:

Alert 1: Per-call cost. Trigger when any ai.call event has cost_usd > $2. Most calls cost cents. A $2 call is an outlier worth seeing.

Alert 2: Per-agent rate. Trigger when an agent tag generates more than 100 events in 5 minutes. That's loop territory.

Alert 3: Daily spend. A Sentry alert based on aggregate cost_usd over 24 hours. Threshold: 2x your expected daily spend.
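If it helps to see the logic behind Alerts 2 and 3, here is a minimal in-process sketch of both rules. Sentry evaluates its own alert rules on its side; `RateWindow` and `exceedsDailySpend` are hypothetical names used only to illustrate the thresholds, and the $50 expected spend is a made-up example.

```typescript
// Hypothetical in-process versions of the rate and daily-spend rules.
class RateWindow {
  private timestamps: number[] = [];
  private readonly maxEvents: number;
  private readonly windowMs: number;

  constructor(maxEvents: number, windowMs: number) {
    this.maxEvents = maxEvents;
    this.windowMs = windowMs;
  }

  // Records one event and returns true once the window holds more than maxEvents.
  record(nowMs: number): boolean {
    this.timestamps.push(nowMs);
    const cutoff = nowMs - this.windowMs;
    // Keep only events that are still inside the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    return this.timestamps.length > this.maxEvents;
  }
}

// Daily spend: fire when the 24-hour total passes a multiple of the expected spend.
function exceedsDailySpend(totalUsd: number, expectedDailyUsd: number, multiplier = 2): boolean {
  return totalUsd > expectedDailyUsd * multiplier;
}

// A looping agent making one call per second: the 101st event trips the rule.
const loopDetector = new RateWindow(100, 5 * 60 * 1000);
let tripped = false;
for (let i = 0; i <= 100; i++) {
  tripped = loopDetector.record(i * 1000);
}
console.log(tripped); // prints true
console.log(exceedsDailySpend(120, 50)); // prints true: 120 > 2 * 50
```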

Step 4: Route alerts to PagerDuty or Slack. (5 min)

Connect Sentry to your existing on-call routing. Anyone with eyes on the system gets paged when an alert fires.

What it caught at one team

In the first 3 months after deploying this:

- One agent in a loop (Sentry alert fired 4 minutes after the loop started, total cost about $12 before the on-call killed it)
- One prompt that ballooned context (5 calls over $10 each before the engineer noticed and fixed the prompt that was including too much chat history)
- One bug in a fan-out that was multiplying the call count by 3 (caught from the rate alert)
- One "we changed the model and didn't update the cost calc, so the bill was suddenly higher than expected" (caught the same day)

Before this, none of these would have been caught until the monthly bill arrived.

What broke

Initial alert thresholds were wrong. First version of the per-call alert was set at $0.50. We got 200 alerts in the first hour. Tuned up to $2 over a week.

Aggregate spend alert wasn't visible enough. Sentry's alerts default to in-app. We added a Slack channel for cost alerts specifically so finance and engineering both see them.

Metadata wasn't being passed consistently. Some agent code didn't pass the agent name. We made the wrapper require an agent name and threw if missing. Easier to fix at the wrapper boundary than to chase missing metadata later.
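The "throw if missing" check is only a few lines. A minimal sketch, assuming the agent name arrives in a small metadata object; `CallMeta` and `requireAgentName` are hypothetical names, not part of either SDK.

```typescript
// Hypothetical guard for the wrapper boundary: fail fast when the agent name is absent.
type CallMeta = { agentName?: string; functionName?: string };

function requireAgentName(meta: CallMeta): string {
  const name = meta.agentName?.trim();
  if (!name) {
    throw new Error("agentName is required so the call can be attributed to an agent");
  }
  return name;
}

console.log(requireAgentName({ agentName: "billing-agent" })); // prints "billing-agent"
```

Calling this at the top of the wrapper, before the API request goes out, means an unattributed call fails in development instead of showing up as `unknown` in production.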

What this isn't

This isn't a replacement for hard cost ceilings. You should ALSO have:

- Per-function budget caps that hard-abort
- Recursion limits
- Time limits per task
- Rate limits per agent

Sentry alerts are the last line of defense. The hard ceilings are the first line.
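For the first kind of ceiling, a hard-abort budget cap can be very small. A minimal sketch: `BudgetGuard` and its members are hypothetical names, and the $5 cap is just an example value.

```typescript
// Hypothetical per-function budget cap that hard-aborts once spend passes the limit.
class BudgetGuard {
  private spentUsd = 0;
  private readonly limitUsd: number;

  constructor(limitUsd: number) {
    this.limitUsd = limitUsd;
  }

  // Adds one call's cost; throws as soon as the running total exceeds the cap.
  charge(costUsd: number): void {
    this.spentUsd += costUsd;
    if (this.spentUsd > this.limitUsd) {
      throw new Error(
        `budget cap of $${this.limitUsd} exceeded: spent $${this.spentUsd.toFixed(2)}`,
      );
    }
  }

  get spent(): number {
    return this.spentUsd;
  }
}

const guard = new BudgetGuard(5);
guard.charge(2);
guard.charge(2);
let aborted = false;
try {
  guard.charge(2); // running total is now 6, above the cap of 5
} catch (err) {
  aborted = true;
}
console.log(aborted); // prints true
```

The guard stops the work itself, while the Sentry alert only tells a human about it. That difference is why the ceilings come first.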

What to extend

Once basic alerts are in place, add:

Per-customer cost tracking. If you're charging customers for AI usage, you need per-customer cost data, not just per-function.

Quality vs cost correlation. Tag each call with a quality outcome (success / failure / retry). Sentry can chart cost-per-success across models.

Forecasting. Daily aggregate gets fed into a weekly forecast. When the forecast diverges from budget, alert finance.
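The first two extensions come down to grouping the same call records in different ways. A minimal sketch, assuming each call is recorded with a customer ID and an outcome; `CallRecord`, `costByCustomer`, and `costPerSuccess` are hypothetical names, and the customer and model labels are placeholders.

```typescript
// Hypothetical record shape: one entry per AI call, with customer and outcome attached.
type Outcome = "success" | "failure" | "retry";
type CallRecord = { customerId: string; model: string; outcome: Outcome; costUsd: number };

// Total spend per customer.
function costByCustomer(records: CallRecord[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const r of records) {
    totals.set(r.customerId, (totals.get(r.customerId) ?? 0) + r.costUsd);
  }
  return totals;
}

// Total spend per model divided by its successful calls.
// Models with no successes map to Infinity.
function costPerSuccess(records: CallRecord[]): Map<string, number> {
  const cost = new Map<string, number>();
  const successes = new Map<string, number>();
  for (const r of records) {
    cost.set(r.model, (cost.get(r.model) ?? 0) + r.costUsd);
    if (r.outcome === "success") {
      successes.set(r.model, (successes.get(r.model) ?? 0) + 1);
    }
  }
  const result = new Map<string, number>();
  cost.forEach((total, model) => {
    const n = successes.get(model) ?? 0;
    result.set(model, n > 0 ? total / n : Infinity);
  });
  return result;
}

const sample: CallRecord[] = [
  { customerId: "acme", model: "model-a", outcome: "success", costUsd: 2 },
  { customerId: "acme", model: "model-a", outcome: "failure", costUsd: 1 },
  { customerId: "globex", model: "model-b", outcome: "success", costUsd: 1 },
  { customerId: "globex", model: "model-b", outcome: "success", costUsd: 1 },
];
console.log(costByCustomer(sample).get("acme")); // prints 3
console.log(costPerSuccess(sample).get("model-a")); // prints 3
console.log(costPerSuccess(sample).get("model-b")); // prints 1
```

Failed and retried calls still count toward a model's cost, so a cheap model that fails often can end up more expensive per success than a pricier one.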

But ship the basic three alerts first. Most cost runaways come from edge cases that the basic alerts catch.

Total time

30 minutes for the basic setup. Another 60-90 minutes to tune thresholds over the first week.

Return on investment: prevents at least one major cost incident per year, which usually pays for the entire observability budget on its own.
