Execution trees, not flat logs
14 spans · 2 failed tools · 842ms avg latency
Production AI agents fail silently — infinite loops, runaway costs, broken tool calls. ProbeMetric catches every failure in real time, before your users do.
import { wrapAnthropic } from 'probemetric';
import Anthropic from '@anthropic-ai/sdk';
// drop-in wrap — all calls traced automatically
const client = wrapAnthropic(new Anthropic(), {
  apiKey: process.env.PROBEMETRIC_API_KEY,
});
Nested execution trees with timing, tokens, and tool payloads, rendered the moment your agent runs.
One wrapper. Full visibility. No infra to manage.
$ npm install probemetric
TypeScript / Node.js SDK. Supports OpenAI, Anthropic, and Gemini.
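import { wrapAnthropic } from 'probemetric';
import Anthropic from '@anthropic-ai/sdk';
// read the ProbeMetric key from the environment, as in the example above
const client = wrapAnthropic(new Anthropic(), {
  apiKey: process.env.PROBEMETRIC_API_KEY,
});
Drop-in wrapper: zero code changes to your agent logic.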
const trace = await probemetric.getTrace(traceId);
console.log(trace.spans, trace.cost);
Nested execution trees with cost, latency, and token counts.
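To make the idea of a nested execution tree concrete, here is a minimal sketch of how such a tree could be walked to produce a summary line like "3 spans · 1 failed tool". The `SpanNode` shape, `summarize`, and `renderTree` are assumptions made for illustration; the actual structure of `trace.spans` is not documented on this page and may differ.

```typescript
// Hypothetical span shape for illustration; the real trace.spans structure may differ.
interface SpanNode {
  name: string;
  durationMs: number;
  tokens: number;
  costUsd: number;
  failed?: boolean;
  children: SpanNode[];
}

interface TraceSummary {
  spans: number;
  failed: number;
  tokens: number;
  costUsd: number;
}

// Walk the tree iteratively and add up counts, tokens, and cost.
function summarize(root: SpanNode): TraceSummary {
  const summary: TraceSummary = { spans: 0, failed: 0, tokens: 0, costUsd: 0 };
  const stack: SpanNode[] = [root];
  while (stack.length > 0) {
    const span = stack.pop() as SpanNode;
    summary.spans += 1;
    if (span.failed) summary.failed += 1;
    summary.tokens += span.tokens;
    summary.costUsd += span.costUsd;
    for (const child of span.children) stack.push(child);
  }
  return summary;
}

// Render the tree as indented lines, one per span, parents before children.
function renderTree(span: SpanNode, depth: number = 0): string[] {
  const indent = new Array(depth + 1).join('  ');
  const mark = span.failed ? ' [failed]' : '';
  let lines = [`${indent}${span.name} (${span.durationMs}ms)${mark}`];
  for (const child of span.children) {
    lines = lines.concat(renderTree(child, depth + 1));
  }
  return lines;
}

// Made-up example data, not real ProbeMetric output.
const exampleTrace: SpanNode = {
  name: 'planner_agent',
  durationMs: 1200,
  tokens: 500,
  costUsd: 0.01,
  children: [
    { name: 'vector_search', durationMs: 500, tokens: 100, costUsd: 0.005, failed: true, children: [] },
    { name: 'email_drafter', durationMs: 300, tokens: 400, costUsd: 0.015, children: [] },
  ],
};

console.log(renderTree(exampleTrace).join('\n'));
console.log(summarize(exampleTrace));
```

A flat log would list the same three spans with no parent and child relationship, which is the information the tree view keeps.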
Built for teams debugging production AI — not toy demos.
14 spans · 2 failed tools · 842ms avg latency
Daily spend: $124.50 · 1.2M tokens/day
5 active rules · Slack · PagerDuty · Webhooks
Traditional APM tools were built for microservices. ProbeMetric was built for autonomous reasoning.
Start for free, scale with your production volume.
For solo developers and early experiments.
For small teams shipping to production.
For growing teams with production workloads.
For organizations with serious scale.
A trace is one complete agent execution — from the initial prompt through all tool calls and sub-steps. Batched operations count as individual traces.
Absolutely. Upgrade or downgrade at any time. We'll prorate the difference and apply it to your next billing cycle.
We'll notify you at 80% and 100% usage. After that, traces are buffered for 24 hours so you never lose observability during spikes.
Latency waterfalls, token spend, error rates, and loop detection — all in a single dashboard built for production AI workloads.
| Agent | Tokens | Duration | Cost | Status |
|---|---|---|---|---|
| research_agent | 18.2k | 1.2s | $0.024 | ✓ ok |
| code_reviewer | 31.0k | 3.1s | $0.041 | ✓ ok |
| vector_search | 6.1k | 500ms | $0.008 | ✗ err |
| email_drafter | 8.4k | 0.8s | $0.011 | ✓ ok |
| planner_agent | 142k | 12.4s | $0.187 | ⟳ loop |
Your users' data security is paramount. ProbeMetric's SDK supports field-level, regex-based client-side redaction. API keys, credentials, and sensitive PII never leave your infrastructure — maintaining compliance with SOC 2, GDPR, and HIPAA.
Redaction is processed client-side. Sensitive data never reaches our servers.
Choose ephemeral mode — traces evaluated in-flight, then discarded.
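The field-level, regex-based client-side redaction described above can be sketched roughly as follows. This is an illustration of the general technique, not ProbeMetric's implementation: `RedactionRule`, `redact`, and the email pattern are hypothetical names and values chosen for the example, and the SDK's real configuration options are not shown on this page.

```typescript
// Hypothetical rule type for illustration; not the ProbeMetric SDK's API.
// A rule either names a field to mask entirely or gives a pattern to mask inside strings.
type RedactionRule = { field?: string; pattern?: RegExp };

const REDACTED = '[REDACTED]';

function redact(value: unknown, rules: RedactionRule[], key?: string): unknown {
  // Field-level rule: mask the whole value when its key matches.
  if (key !== undefined && rules.some((r) => r.field === key)) {
    return REDACTED;
  }
  // Regex rules: mask only the matching substrings of a string value.
  if (typeof value === 'string') {
    return rules.reduce(
      (text, r) => (r.pattern ? text.replace(r.pattern, REDACTED) : text),
      value,
    );
  }
  // Recurse into arrays and plain objects so nested payloads are covered.
  if (Array.isArray(value)) {
    return value.map((item) => redact(item, rules));
  }
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>;
    const copy: Record<string, unknown> = {};
    for (const k of Object.keys(source)) {
      copy[k] = redact(source[k], rules, k);
    }
    return copy;
  }
  return value;
}

const rules: RedactionRule[] = [
  { field: 'apiKey' },
  { pattern: /[\w.+-]+@[\w-]+\.[\w.]+/g }, // rough email matcher, for illustration only
];

// Made-up payload; redaction runs before anything would be sent over the network.
const payload = { apiKey: 'sk-123', note: 'contact jane@example.com', steps: 3 };
const safePayload = redact(payload, rules);
console.log(safePayload);
```

Because the function returns a redacted copy and leaves the input untouched, the agent keeps working with the original values while only the masked copy would be exported.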
Detect runaway planners and auto-kill agents that exceed cost or step budgets.
Span-level visibility into vector queries, rerankers, and prompt assembly.
Inspect every tool input, output, and error payload across nested chains.
Export failing traces directly into your regression and eval pipelines.
Framework-agnostic. Drop into raw SDK calls or your favorite agent library.
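As an illustration of the cost and step budgets mentioned in the loop-detection feature above, the sketch below shows one simple way a guard could stop a runaway agent. All names here (`Budget`, `createBudgetGuard`, `runWithBudget`) are hypothetical, and the step costs are simulated numbers; how ProbeMetric actually detects loops and stops agents is not described on this page.

```typescript
// Hypothetical names for illustration; this is not the ProbeMetric API.
interface Budget {
  maxSteps: number;
  maxCostUsd: number;
}

type Verdict = 'ok' | 'step_budget_exceeded' | 'cost_budget_exceeded';

// Returns a function that is called once per agent step with that step's cost.
function createBudgetGuard(budget: Budget): (stepCostUsd: number) => Verdict {
  let steps = 0;
  let costUsd = 0;
  return (stepCostUsd: number): Verdict => {
    steps += 1;
    costUsd += stepCostUsd;
    if (steps > budget.maxSteps) return 'step_budget_exceeded';
    if (costUsd > budget.maxCostUsd) return 'cost_budget_exceeded';
    return 'ok';
  };
}

// Simulated agent loop: each entry in stepCosts stands in for one model or tool call.
function runWithBudget(
  stepCosts: number[],
  budget: Budget,
): { stepsCompleted: number; stoppedBy: Verdict } {
  const record = createBudgetGuard(budget);
  let stepsCompleted = 0;
  for (const cost of stepCosts) {
    const verdict = record(cost);
    if (verdict !== 'ok') {
      // Stop the agent as soon as either budget is exceeded.
      return { stepsCompleted, stoppedBy: verdict };
    }
    stepsCompleted += 1;
  }
  return { stepsCompleted, stoppedBy: 'ok' };
}

// A planner that keeps repeating a $0.02 step against a $0.05 cost budget.
console.log(runWithBudget([0.02, 0.02, 0.02, 0.02], { maxSteps: 10, maxCostUsd: 0.05 }));
```

Tracking steps and cost separately matters because a loop of cheap calls trips the step budget long before the cost budget, while a few very large prompts do the opposite.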
ingested across production workloads
in monthly LLM spend after week one
p99 SDK latency on every wrapped call
LangSmith is great if you're all-in on LangChain. ProbeMetric is framework-agnostic — it works with raw OpenAI calls, LangChain, CrewAI, AutoGPT, or any custom agent architecture. You get the same depth of tracing without locking into one ecosystem.
Yes. Our wrappers cover Assistants, Responses, and Chat Completions, including streaming, tool calls, and parallel runs.
OpenAI, Anthropic, Gemini, Mistral, plus LangChain, LlamaIndex, CrewAI, AutoGPT, and Vercel AI SDK out of the box. Anything OTel-compatible just works.
Per event (span), with a generous free tier. No per-seat lock-in. Volume tiers scale linearly so you always know what next month looks like.
Yes — set thresholds per agent, per model, or globally. Route alerts to Slack, PagerDuty, or any webhook.
Less than 10ms p99 SDK overhead. Spans are batched and shipped asynchronously off the request path.
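For readers curious what "batched and shipped asynchronously off the request path" can look like in practice, here is a minimal sketch of the pattern. It is an assumption-laden illustration, not the SDK's internals: `SpanEvent`, `Sender`, and `createSpanBatcher` are hypothetical names, and a real client would likely add a flush timer, retries, and a size cap.

```typescript
// Hypothetical span shape and sender signature, for illustration only.
interface SpanEvent {
  name: string;
  durationMs: number;
}

type Sender = (batch: SpanEvent[]) => Promise<void>;

function createSpanBatcher(send: Sender, maxBatchSize: number) {
  let buffer: SpanEvent[] = [];

  // Hand the current buffer to the sender and start a fresh one.
  async function flush(): Promise<void> {
    if (buffer.length === 0) return;
    const batch = buffer;
    buffer = [];
    await send(batch);
  }

  // enqueue is synchronous and never awaits the sender,
  // so the wrapped call is not held up while a batch ships.
  function enqueue(span: SpanEvent): void {
    buffer.push(span);
    if (buffer.length >= maxBatchSize) {
      void flush().catch(() => {
        // This sketch simply swallows send errors; a real client would retry or drop.
      });
    }
  }

  return { enqueue, flush };
}

// Usage with a stand-in sender that only logs; a real sender would make a network request.
const batcher = createSpanBatcher(async (batch) => {
  console.log(`shipping ${batch.length} spans`);
}, 2);

batcher.enqueue({ name: 'llm_call', durationMs: 842 });
batcher.enqueue({ name: 'tool_call', durationMs: 120 }); // reaches the batch size, triggers a flush
batcher.enqueue({ name: 'llm_call', durationMs: 610 });
void batcher.flush(); // ship whatever is left, for example on shutdown
```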
Field-level client-side redaction, zero-data-retention options, and compliance with SOC 2 Type II, GDPR, and HIPAA.
Yes — on-prem and VPC deployments are supported for enterprise plans.
5-minute setup. No infra to manage. No credit card required.