compliance · June 2, 2026 · 4 min read

How a Healthcare AI Team Passed Their HIPAA Review in 3 Weeks

A clinical documentation startup needed to prove HIPAA compliance before signing an enterprise hospital contract. Here's how they built the evidence layer their auditor needed — without changing their agent code.

Tags: HIPAA, healthcare AI, LLM compliance, AI agent observability, HIPAA AI compliance

Note: This is a composite case study based on the HIPAA compliance patterns we see from healthcare AI teams. Company details are illustrative.


A team of 12 engineers at a clinical documentation startup had a problem most AI teams eventually hit: the product worked, the accuracy was good, the doctors liked it — but the procurement team at a 400-bed hospital system wouldn't sign until they could answer one question from the hospital's compliance officer.

"Can you prove that no patient data is being logged in your AI vendor's observability system?"

They couldn't answer it. Not because they were doing anything wrong — they weren't — but because their observability setup was a mix of stdout logs and a third-party proxy that routed all LLM calls through an external server. They didn't know exactly what was being logged, and they couldn't prove it to an auditor.

The contract was $180k ARR. They had 3 weeks before the compliance officer moved on.

What the HIPAA audit actually asks

HIPAA doesn't prohibit using AI in clinical workflows. What its Privacy and Security Rules require is that you demonstrate:

  1. PHI isn't being stored in systems outside your BAA coverage
  2. Access to PHI is role-gated and logged
  3. You can produce an audit trail of who accessed what

For LLM-based systems, this translates to: every time your agent processes a query that might contain PHI (patient names, MRNs, diagnoses, medications), you need to prove that:

  • The PHI was not logged to any system outside your covered environment
  • The LLM vendor received only what was necessary (minimum necessary standard)
  • You have a record of the call — timestamp, what was sent, what came back, whether any PHI was present

Most AI teams can't answer any of these questions cleanly. They have logs, but they don't know what's in them.
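As a rough sketch of what a per-call record like the one described above could look like: the field names, the regex patterns, and the choice to store SHA-256 digests instead of raw text are all assumptions made for illustration, not something the team or any particular library prescribes.

```python
import hashlib
import json
import re
from dataclasses import asdict, dataclass
from datetime import datetime, timezone

# Illustrative patterns only; a real deployment would need a vetted PHI detector.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
MRN_RE = re.compile(r"\bMRN[:\s#]*\d{6,10}\b", re.IGNORECASE)


@dataclass(frozen=True)
class LLMCallRecord:
    """Hypothetical audit record for one LLM call."""
    timestamp: str         # UTC, ISO 8601
    model: str
    request_sha256: str    # digest of what was sent, not the text itself
    response_sha256: str   # digest of what came back
    phi_detected: bool     # whether either side matched a PHI pattern


def _digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def contains_phi(text: str) -> bool:
    return bool(SSN_RE.search(text) or MRN_RE.search(text))


def record_call(model: str, request: str, response: str) -> LLMCallRecord:
    return LLMCallRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        model=model,
        request_sha256=_digest(request),
        response_sha256=_digest(response),
        phi_detected=contains_phi(request) or contains_phi(response),
    )


if __name__ == "__main__":
    rec = record_call(
        "example-model",
        "Summarize the note for MRN: 12345678",
        "The patient reports mild chest pain.",
    )
    print(json.dumps(asdict(rec), indent=2))
```

Storing digests lets you show that a specific payload was sent without keeping the payload in the log. Digests of very short strings can be guessed by brute force, so treat this as a starting point, not a de-identification method.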

The proxy problem

The team's previous setup routed every OpenAI call through a third-party observability proxy. The proxy gave them a dashboard — request/response times, error rates, token counts. Useful for debugging.

But the proxy also logged the full request and response bodies to the proxy vendor's servers. From a HIPAA standpoint, this was a problem: the proxy vendor was processing PHI but wasn't covered anywhere in the hospital's BAA chain.

Switching to in-process instrumentation — where spans are captured inside the application, not by an external proxy — solved this. No data leaves the stack to reach an observability vendor's servers. Spans are batched in memory and flushed to the team's own Supabase instance.
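To make the in-process idea concrete, here is a minimal sketch of a span buffer and a tracing decorator. It is not Peekr's implementation: `SpanBuffer`, `traced`, and the `sink` callback are hypothetical names, and the sink stands in for whatever storage you control. This simple version is also not thread-safe.

```python
import functools
import time
from typing import Any, Callable, Dict, List

Span = Dict[str, Any]


class SpanBuffer:
    """Collects spans in memory and flushes them in batches to a sink you control."""

    def __init__(self, sink: Callable[[List[Span]], None], batch_size: int = 50) -> None:
        self._sink = sink
        self._batch_size = batch_size
        self._pending: List[Span] = []

    def add(self, span: Span) -> None:
        self._pending.append(span)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            batch, self._pending = self._pending, []
            self._sink(batch)


def traced(buffer: SpanBuffer, name: str) -> Callable:
    """Decorator that records timing and status, never arguments or return values."""

    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.time()
            status = "ok"
            try:
                return fn(*args, **kwargs)
            except Exception:
                status = "error"
                raise
            finally:
                # Runs on success and on failure, so every call yields one span.
                end = time.time()
                buffer.add({
                    "name": name,
                    "start": start,
                    "end": end,
                    "duration_ms": (end - start) * 1000.0,
                    "status": status,
                })

        return wrapper

    return decorator
```

Because the wrapper never reads the function's arguments or return value, prompt and response text cannot end up in a span by accident. Capturing them would have to be a separate, explicit opt-in.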

What the compliance officer actually wanted to see

After the switch, the team could produce:

1. A data map. What Peekr captures: span name, start/end time, duration, model used, token count, status. What Peekr does NOT capture by default: the full prompt text or response text (these are opt-in via capture_io=True, and the team had it off).

2. PII redaction evidence. With PIIRedact() enabled, any PHI that leaked into a span (email, SSN, MRN pattern) was automatically replaced with [EMAIL] or [PHI] before storage. The audit log showed the redaction event, not the original value.

3. A required disclosure record. Every response that dealt with patient information automatically included "This is not a diagnosis or medical advice. Consult a licensed healthcare provider." — enforced by the HIPAA compliance pack, logged on the span.

4. A violation audit trail. The handful of times the agent generated an output that matched a pattern the HIPAA compliance pack prohibits (for example, a response that implied a diagnosis), the response was blocked and the violation was logged on the span with the regulatory citation. The compliance officer could see that the system caught and blocked each violation before it reached the user.
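The redaction, disclosure, and blocking behaviors in items 2 to 4 can be approximated with a small post-processing step. The sketch below is an illustration under stated assumptions, not the compliance pack's actual logic: the regexes, the `apply_guardrails` function, the event strings, and the "implied diagnosis" rule are all hypothetical, and the regulatory citation is omitted because the source does not name it.

```python
import re
from dataclasses import dataclass, field
from typing import List

DISCLOSURE = (
    "This is not a diagnosis or medical advice. "
    "Consult a licensed healthcare provider."
)

# Illustrative patterns; placeholders follow the [EMAIL] / [PHI] style described above.
REDACTIONS = [
    ("email", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL]"),
    ("ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[PHI]"),
    ("mrn", re.compile(r"\bMRN[:\s#]*\d{6,10}\b", re.IGNORECASE), "[PHI]"),
]

# Hypothetical rule: phrasing that implies a diagnosis.
DIAGNOSIS_RE = re.compile(r"\byou (?:have|are suffering from)\b", re.IGNORECASE)


@dataclass
class GuardResult:
    text: str
    blocked: bool = False
    events: List[str] = field(default_factory=list)


def apply_guardrails(response: str) -> GuardResult:
    # 1. Block outputs that match the prohibited pattern before anything else.
    if DIAGNOSIS_RE.search(response):
        return GuardResult(text="", blocked=True, events=["violation:implied_diagnosis"])

    # 2. Redact, logging the category and count only, never the matched value.
    events: List[str] = []
    text = response
    for category, pattern, placeholder in REDACTIONS:
        text, count = pattern.subn(placeholder, text)
        if count:
            events.append(f"redacted:{category}:{count}")

    # 3. Append the required disclosure if it is missing.
    if DISCLOSURE not in text:
        text = f"{text}\n\n{DISCLOSURE}"
        events.append("disclosure:appended")

    return GuardResult(text=text, events=events)
```

The event list records that a redaction or a block happened without ever holding the original value, which is the property an auditor wants to see in the log.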

The setup

One import and one call in their entrypoint:

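import peekr

peekr.instrument(
    # Where captured spans are sent.
    exporter=peekr.HTTPExporter(
        endpoint="https://peekr.starkspherelabs.com",
        api_key="pk_live_…",
    ),
    # Enables the HIPAA compliance pack (required disclosures, violation logging).
    compliance=["HIPAA"],
    # Redacts the listed categories from spans before they are stored.
    guardrails=[
        peekr.guard.PIIRedact(categories=("ssn", "credit_card", "email", "phone")),
    ],
)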

Three things the compliance officer got from this:

  • A data map they could hand to the hospital's legal team
  • The PIIRedact audit log showing what was removed and when
  • The HIPAA violation log showing every blocked response with the regulatory citation

The hospital signed in week 3.

What this isn't

This setup doesn't make you HIPAA compliant by itself. You still need:

  • A Business Associate Agreement with your LLM provider (OpenAI and Anthropic, among others, offer BAAs for eligible customers)
  • A BAA with any data storage vendors in your chain
  • Access controls and workforce training
  • An incident response procedure

What Peekr handles is the evidence side of the technical safeguards: the audit logging, the PHI detection, the required disclosures, and the violation records. These are the things your compliance officer needs to see during a review.

If you're in the same position

If you're shipping healthcare AI and a hospital procurement team is asking about HIPAA, the conversation usually stalls at two questions:

  1. "What exactly does your observability vendor log?"
  2. "Can you show me a record of any PHI that was blocked?"

If you can't answer both with evidence, the deal doesn't close.

Enable the HIPAA compliance pack in Peekr Cloud, run it for a week, and you'll have the audit evidence your compliance officer needs. Start with the free tier — compliance packs are on Pro at $99/mo.
