Auto-instruments OpenAI · Anthropic · Gemini · Bedrock

Something went wrong.
The trace shows you exactly what.

Peekr traces every LLM call — cost, latency, tokens, hallucination score, and the exact inputs and outputs at every step. Two lines of Python. No proxy. No wrappers. No architecture change.

Start debugging free → · See live demo

10k spans/month free · no credit card · MIT license

peekr · trace diagnosis · live

# Peekr detected a pattern automatically:

⚠ ROOT CAUSE Sequential execution

15 "entity extraction" spans ran one-after-another

Sequential (now): ████████████████ 279s

Parallel (after fix): ████ ~37s

Fix (low effort):

from concurrent.futures import ThreadPoolExecutor

# process and items stand in for your own function and inputs
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(process, items))

trace · 95629e41 · extremis · 7.5× speedup available

Before: 279s → After: 37s

What Peekr shows you

Four complaints. Four traces. Four fixes.

The complaint

“My agent gave the wrong answer.”

trace output

agent.run  2100ms
  └─ tool.fetch_user  12ms
       out: null          ← returned null
  └─ openai.chat      2088ms
       in: "User profile: null…"  ← LLM got garbage

Peekr's fix:

The trace shows exactly what the LLM received. Malformed tool output caught in seconds, not hours.
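Once the trace has pointed at the tool, one way to stop this class of bug at the source is to validate tool output before it reaches the prompt. The sketch below is illustrative only: `fetch_user`, `build_prompt`, and `ToolOutputError` are hypothetical names, not part of Peekr.

```python
from typing import Optional


class ToolOutputError(ValueError):
    """Raised when a tool returns nothing usable for the prompt."""


def fetch_user(user_id: str) -> Optional[dict]:
    # Hypothetical tool: returns None when the lookup fails.
    users = {"u1": {"name": "Ada", "plan": "pro"}}
    return users.get(user_id)


def build_prompt(user_id: str) -> str:
    profile = fetch_user(user_id)
    if profile is None:
        # Fail loudly instead of sending "User profile: null" to the LLM.
        raise ToolOutputError(f"fetch_user returned no profile for {user_id!r}")
    return f"User profile: {profile}"
```

With a guard like this, the failure shows up as an error on the tool step rather than as a confident wrong answer from the model.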

The complaint

“My agent is hallucinating.”

trace output

eval_scores: { Hallucination: 0.00 }

  ✗ contradicted  "founded in 1923"
  ✗ contradicted  "designed by Frank Lloyd Wright"
  ~ unsupported   "featured at the World's Fair"

Peekr's fix:

Every claim gets a verdict: supported, contradicted, or unsupported. Find the exact sentence that was wrong.
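If you want to act on those verdicts in your own code, grouping claims by verdict is a natural first step. The sketch below assumes the verdicts are available as `(verdict, claim)` pairs; that shape is an assumption made for illustration, not Peekr's documented output format.

```python
from collections import defaultdict

# Illustrative data mirroring the trace above; the tuple shape is an assumption.
claims = [
    ("contradicted", "founded in 1923"),
    ("contradicted", "designed by Frank Lloyd Wright"),
    ("unsupported", "featured at the World's Fair"),
]


def group_by_verdict(pairs):
    """Collect claim texts under their verdict label, preserving order."""
    grouped = defaultdict(list)
    for verdict, text in pairs:
        grouped[verdict].append(text)
    return dict(grouped)


grouped = group_by_verdict(claims)
```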

The complaint

“My API bill is too high.”

trace output

Trace 1:  18,432 tokens  · $0.018
Trace 2:  21,104 tokens  · $0.021
Trace 3:  24,891 tokens  · $0.025  ← growing

Cost by operation: chat_summary  67% of spend

Peekr's fix:

A rising cost per query usually means unbounded context. Peekr shows the slope before the bill arrives.
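A common remedy for unbounded context is a rolling window over the conversation history. This is a minimal sketch, assuming chat messages are dicts with `role` and `content` keys; `trim_history` and `max_turns` are hypothetical names, not Peekr APIs.

```python
def trim_history(messages, max_turns=6):
    """Keep all system messages plus the most recent max_turns other messages."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    # Guard max_turns == 0: rest[-0:] would return the whole list.
    recent = rest[-max_turns:] if max_turns > 0 else []
    return system + recent
```

Calling `trim_history(history)` before each request caps the prompt at a fixed number of turns, so token count stops growing with conversation length. A summary of the dropped turns can be added if older context matters.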

The complaint

“My agent is too slow.”

trace output

agent.run  4300ms
  └─ tool.search_web  3800ms  ← 88% of time
  └─ tool.rerank         18ms
  └─ openai.chat        490ms  ← not your problem

Peekr's fix:

Most teams swap models first. The trace shows it's the tool — not the LLM. 88% of time in one call.
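When one tool call dominates the trace, caching repeated calls is often the cheapest fix. The sketch below uses `functools.lru_cache` from the standard library; `search_web` is a hypothetical stand-in for a slow tool, and the counter exists only to show that the second call never reaches it.

```python
from functools import lru_cache

call_count = 0  # counts real (uncached) executions of the tool


@lru_cache(maxsize=256)
def search_web(query: str) -> tuple:
    """Hypothetical slow tool; a real one would make a network request here."""
    global call_count
    call_count += 1
    return (f"result for {query}",)


search_web("peekr tracing")  # executes the tool
search_web("peekr tracing")  # served from the cache
```

This only suits tools whose results can safely be reused. For data that goes stale, a cache with a time-to-live is the safer choice.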

Auto-detection

Peekr reads the traces. You fix the bugs.

Six performance-pattern detectors run automatically across every trace. Their findings surface in your Insights tab with the offending trace ID and a code fix.

⚡

Sequential execution

7.5× speedup

Same span runs N times one-after-another. Parallelising saves 242s.

Read the case study →

📊

Observer overhead

Zero latency

Eval blocking user responses. Move to background thread.

Read the case study →

🔁

Redundant embeddings

Latency cut

Same text embedded 2× in one trace. Cache the vector.

📈

Context growth

Cost reduced

Token count growing 60%+ across calls. Trim with rolling window.

🔧

Tool bottleneck

Major speedup

Single non-LLM call taking 88% of trace time. Cache it.

🔄

Retry storm

Reliability

Same span fails 3× before succeeding. Add exponential back-off.
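As an illustration of the retry-storm fix, here is a minimal exponential back-off helper with jitter. It is a sketch under stated assumptions: `retry_with_backoff` and its parameters are hypothetical names, not something Peekr generates verbatim, and the `sleep` argument is injectable only so the helper can be tested without waiting.

```python
import random
import time


def retry_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep):
    """Call fn(), retrying on any exception with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (0.5s, 1s, 2s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads retries out so clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

In practice you would catch only the transient error types your client raises (timeouts, rate limits) rather than every `Exception`.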

Setup

Two lines. Everything else is automatic.

Peekr patches at the class level — every OpenAI() and Anthropic() instance is captured automatically. No wrappers. No proxy. No framework-specific config.
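To see why class-level patching catches every instance, consider this generic sketch. It is not Peekr's implementation: `FakeClient`, `instrument`, and `spans` are hypothetical names used to illustrate the technique on a stand-in class.

```python
import functools
import time


class FakeClient:
    """Stand-in for an SDK client class; not a real library."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


spans = []  # collected trace records


def instrument(cls, method_name):
    """Replace cls.method_name with a wrapper that records a timing span."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def traced(self, *args, **kwargs):
        start = time.perf_counter()
        try:
            return original(self, *args, **kwargs)
        finally:
            spans.append({
                "name": f"{cls.__name__}.{method_name}",
                "ms": (time.perf_counter() - start) * 1000,
            })

    setattr(cls, method_name, traced)


before = FakeClient()              # created before patching
instrument(FakeClient, "complete")
after = FakeClient()               # created after patching
before.complete("a")
after.complete("b")
```

Because the method is looked up on the class at call time, both the instance created before patching and the one created after are traced, with no change to the calling code.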

✓ No export latency on the request path: spans export on a background thread

✓ Works alongside JSONL / SQLite for local dev

✓ Self-hostable — MIT license, run anywhere
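The background-export idea in the first point can be sketched with a queue and a worker thread. This is a generic illustration, not Peekr's exporter: `BackgroundExporter` and its `send` callback are hypothetical names.

```python
import queue
import threading


class BackgroundExporter:
    """Hand spans to a worker thread so the caller only pays for an enqueue."""

    def __init__(self, send):
        self._send = send                      # callable that ships one span
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def export(self, span):
        self._queue.put(span)                  # returns immediately

    def _run(self):
        while True:
            span = self._queue.get()
            if span is None:                   # sentinel: stop the worker
                return
            try:
                self._send(span)
            except Exception:
                pass  # a failed export must never crash the application

    def shutdown(self):
        self._queue.put(None)
        self._thread.join()                    # wait for queued spans to flush


sent = []
exporter = BackgroundExporter(sent.append)
exporter.export({"name": "openai.chat", "ms": 490})
exporter.shutdown()
```

The request path pays only for `queue.put`. The network call happens on the worker thread, and `shutdown` flushes whatever is still queued.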

Start free — 10k spans/month →

agent.py

import peekr

peekr.instrument(
    exporter=peekr.HTTPExporter(
        endpoint="https://peekr.starkspherelabs.com",
        api_key="pk_live_…",
    ),
    evaluators=[peekr.eval.Hallucination(detailed=True)],
    guardrails=[peekr.guard.PIIRedact()],
)

# Your existing code unchanged
from openai import OpenAI
client = OpenAI()  # ← automatically traced

Find bugs in your AI before your users do.

Free up to 10k spans per month. No credit card. Two lines of Python. Traces appear within 5 seconds of your first LLM call.

Start debugging free → · See live demo

Also need compliance? See compliance guardrails →
