Compliance Guide

HIPAA-compliant AI chatbots for patient support: a checklist

What actually makes an AI chatbot HIPAA-compliant for patient support? A buyer's checklist on BAAs, PHI handling, de-identification, and what to ask vendors.

By the Open Team
Updated June 17, 2026 | 9 min read

A vendor telling you their chatbot is "HIPAA-compliant" tells you almost nothing. HIPAA does not certify software. There is no government stamp, no compliance badge a product earns and displays. Compliance is something a covered entity and its vendors achieve together, through contracts and controls, and it can be undone by a single misconfigured integration. So when you are buying an AI chatbot for patient support, the useful question is not "is it compliant." It is "what does this vendor actually do with protected health information, and can they prove it."

This is a checklist for that conversation. It is written for the person who has to sign off on a patient-facing chatbot and then defend that decision if something goes wrong.

First, get the definitions straight

A chatbot that touches patient data is almost always a business associate under HIPAA. The HHS definition is broad: a business associate is any person or entity that creates, receives, maintains, or transmits protected health information on behalf of a covered entity. A booking assistant that reads a patient's name and reason for visit qualifies. So does a chatbot that just stores the transcript.

Protected health information, or PHI, is also broader than most buyers assume. It is any individually identifiable health information that a covered entity or business associate holds or transmits in any form, including a phone number tied to a clinic visit or an IP address logged against a session in a logged-in patient portal. The bar for "identifiable" is low.

The practical consequence: if a chatbot vendor will see PHI, you need a signed business associate agreement before any patient data flows. No BAA, no patient-facing deployment. That is the first checklist item, and it is non-negotiable. The same checklist applies when the channel is the phone rather than chat: voice AI for healthcare carries PHI from the first sentence too.

The business associate agreement is the contract that matters

The BAA is where compliance actually lives. HHS spells out the required elements at 45 CFR 164.504(e), and a serious vendor will hand you one without friction. Read it for these things specifically:

  • Permitted uses. The BAA must say exactly what the vendor may do with PHI. Watch for language that lets them use your patients' data to train shared models. That is a use, and it needs to be named and bounded.
  • Subcontractor flow-down. Most AI chatbots call a large language model they do not own. If your vendor sends PHI to a model provider, that provider is a subcontractor and needs its own BAA with the vendor, with the same obligations flowing down. Ask who the subprocessors are and whether each one has signed.
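  • Breach notification timelines. The agreement should commit the vendor to telling you about a breach fast enough that you can meet your own notification deadlines. HIPAA itself gives a business associate up to 60 days after discovery to notify the covered entity (45 CFR 164.410), so negotiate a shorter, specific window.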
  • Return or destruction. The BAA should require the vendor to return or destroy all PHI when the contract ends, where that is feasible.

If a vendor says they do not sign BAAs because their tool "does not really store PHI," you have your answer. A business associate is defined by creating, receiving, maintaining, or transmitting PHI, not only by storing it.

Map where PHI goes, end to end

The single most useful exercise before buying is to draw the data path. A patient types a question into a widget on your site. Where does that text go next? Through the vendor's servers. To a language model. Maybe to a logging system, an analytics tool, a transcript store. Each hop is a place PHI can leak.

Two specifics worth pinning down:

Model training. Ask, in writing, whether patient conversations are used to train or fine-tune any model, including the underlying LLM. The answer you want is no, or "only inside your isolated instance, never pooled." Some vendors let you disable specific AI providers or run on a model that contractually excludes your data from training. Open.cx, for instance, redacts PII and PHI before anything reaches the model, so the LLM never sees the sensitive fields in the first place.

Logging and analytics. This is where the HHS online tracking guidance becomes relevant. In June 2024, a federal court in AHA v. Becerra vacated the part of OCR's bulletin that treated an IP address plus a visit to a public health page as automatically creating PHI. But the ruling left the core rule standing: tracking technologies on authenticated pages, like a logged-in patient portal, still need a BAA or patient authorization. A chatbot embedded in a portal is squarely inside that rule. If your vendor pipes chat events into a marketing analytics tool with no BAA, that is the exposure.
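The data path exercise can be kept as a small inventory that is checked every time an integration changes. The sketch below is illustrative only: the `Hop` type, the hop names, and the `uncovered_hops` helper are hypothetical, and it models BAA coverage alone, not patient authorization or any other lawful basis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One system a patient message passes through (all names are hypothetical)."""
    name: str
    sees_phi: bool    # does PHI reach this system?
    baa_signed: bool  # is a BAA (or flow-down BAA) in place for it?


def uncovered_hops(path: list[Hop]) -> list[str]:
    """Return the names of hops that receive PHI without a BAA in place."""
    return [hop.name for hop in path if hop.sees_phi and not hop.baa_signed]


# Hypothetical data path for a chat widget embedded in a patient portal.
data_path = [
    Hop("vendor_api", sees_phi=True, baa_signed=True),
    Hop("llm_provider", sees_phi=True, baa_signed=True),
    Hop("transcript_store", sees_phi=True, baa_signed=True),
    Hop("marketing_analytics", sees_phi=True, baa_signed=False),
]

print(uncovered_hops(data_path))  # ['marketing_analytics']
```

An empty list is the condition to hold before go-live. Any name that appears is a hop to cover with a BAA or to cut from the path.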

Decide how much PHI the bot needs at all

The HIPAA minimum necessary standard is a design principle, not just a legal phrase. A chatbot should ask for and retain the least PHI required to do its job. A lot of patient-support work does not need PHI at all: explaining how to prepare for an MRI, giving clinic hours, walking through a billing policy. Those answers are the same for everyone.

This is why two technical capabilities are worth more than any compliance badge:

  • Redaction. Can the bot strip identifiers out of a transcript before it lands in storage or reaches the model? PII and PHI redaction means the sensitive parts of a conversation never persist where they do not need to. Open.cx offers redaction of sensitive data as a built-in control, which keeps identifiers out of logs the rest of the stack would otherwise keep.
  • De-identification, done right. HHS defines two valid methods at 45 CFR 164.514(b): Safe Harbor, which means removing all 18 listed identifiers and having no actual knowledge that what remains could identify the patient, and expert determination, where a person with appropriate statistical expertise determines and documents that the re-identification risk is very small. If a vendor claims their analytics use "de-identified" data, ask which method. "We stripped the names" is not Safe Harbor.

A bot that scopes itself to the minimum necessary is easier to defend, cheaper to secure, and less catastrophic if it fails.
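To make the redaction idea concrete, here is a minimal sketch of pattern-based redaction applied before a message is stored or sent to a model. It is an assumption-laden illustration, not how Open.cx or any other vendor implements redaction: the patterns and the `redact` function are hypothetical, they cover only a few US-style formats, and regex matching alone does not amount to Safe Harbor de-identification.

```python
import re

# Illustrative patterns only. Real PHI detection needs far broader coverage
# (names, addresses, record numbers, free-text context) than regexes provide.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}


def redact(text: str) -> str:
    """Replace each matched identifier with a bracketed placeholder label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


message = "My number is 555-867-5309 and my email is pat@example.com. DOB 04/12/1988."
print(redact(message))
# My number is [PHONE] and my email is [EMAIL]. DOB [DATE].
```

Note what the sketch misses: a sentence like "I am Jane Doe" passes through unchanged, which is one reason to ask a vendor exactly how their redaction detects identifiers.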

Scope the bot to the minimum necessary

Applying the HIPAA minimum necessary standard (45 CFR 164.502(b), 164.514(d)) to chatbot design.

Bot can handle (no / low PHI)
  • Clinic hours, location, directions
  • How to prepare for a procedure (e.g. MRI prep)
  • General billing policy
  • Generic, non-identifiable FAQs
Route to a human
  • Anything clinical (symptoms, dose, result meaning)
  • Low-confidence or out-of-scope questions
  • Requests needing PHI beyond the minimum necessary
  • Identity-sensitive account changes
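
The split above can be expressed as a default-deny routing rule. This is a sketch under stated assumptions: the topic labels are hypothetical, and an upstream intent classifier is assumed to supply both the topic and the `needs_patient_record` flag.

```python
from enum import Enum


class Route(Enum):
    BOT = "bot"
    HUMAN = "human"


# Hypothetical topic labels for the "bot can handle" column.
BOT_TOPICS = {"clinic_hours", "location", "procedure_prep", "billing_policy", "general_faq"}


def route(topic: str, needs_patient_record: bool) -> Route:
    """Default-deny: only allowlisted topics that need no patient record stay with the bot."""
    if topic in BOT_TOPICS and not needs_patient_record:
        return Route.BOT
    return Route.HUMAN


print(route("clinic_hours", needs_patient_record=False))    # Route.BOT
print(route("symptoms", needs_patient_record=False))        # Route.HUMAN
print(route("billing_policy", needs_patient_record=True))   # Route.HUMAN
```

An allowlist is used rather than a blocklist so that a topic nobody anticipated goes to a person by default.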

The capability that compliance teams underrate: not guessing

Generative AI in a clinical context has a failure mode that is worse than being unhelpful. It can be confidently wrong. A 2025 study of medical hallucinations in foundation models surveyed clinicians and found that 91.8% had encountered a medical hallucination and 84.7% believed those hallucinations were capable of causing patient harm. A chatbot that invents a drug interaction or fabricates a policy is a patient-safety problem, not just a CX problem.

So the buyer's question is: what does the bot do when it does not know? The safe behavior is to hand off to a human rather than improvise. Open.cx's Agent 5 model is built to be conservative this way, escalating to a person when its confidence is low instead of generating a plausible-sounding answer. For patient support, a known handoff beats a smooth guess every time. That conservative posture is the throughline of safe generative AI patterns in healthcare more broadly.
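
One simple way to picture "hand off rather than improvise" is a confidence gate in front of every reply. The sketch below is hypothetical and does not describe how Agent 5 or any specific product works: the `Draft` type, the confidence score, and the 0.85 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

HANDOFF_MESSAGE = "I'm not certain about that, so I'm connecting you with a member of our team."


@dataclass(frozen=True)
class Draft:
    text: str
    confidence: float  # assumed score in [0.0, 1.0] from the model or a separate verifier


def respond(draft: Draft, threshold: float = 0.85) -> tuple[str, bool]:
    """Return (message, escalated). Below the threshold the draft is never sent."""
    if draft.confidence < threshold:
        return HANDOFF_MESSAGE, True
    return draft.text, False


print(respond(Draft("We open at 8am on weekdays.", confidence=0.97)))
# ('We open at 8am on weekdays.', False)
print(respond(Draft("Those two medications are safe together.", confidence=0.41))[1])
# True
```

When evaluating a vendor, the question is where this gate sits, who sets the threshold, and whether escalations are logged so you can review them.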

Clinicians are already seeing AI medical hallucinations

Global clinician survey (n=70), Kim et al. 2025, “Medical Hallucinations in Foundation Models.”

  • 91.8%: clinicians who had encountered a medical hallucination
  • 84.7%: clinicians who believed those hallucinations could cause patient harm

A checklist you can hand to a vendor

Print this. Make them answer in writing.

  1. Will you sign a HIPAA business associate agreement? Send it.
  2. Which subprocessors and model providers touch PHI, and does each have a BAA?
  3. Are patient conversations ever used to train or fine-tune any model?
  4. Can we disable specific model providers if our security team requires it?
  5. Can the bot redact PHI before storage and before the model sees it?
  6. Where is PHI stored, in what region, and for how long?
  7. What happens to a transcript when we end the contract?
  8. How does the bot behave when it is not confident? Does it escalate?
  9. Is the bot embedded in any authenticated page, and is tracking on that page covered by the BAA?
  10. Can you produce a recent independent security audit?

The vendors worth your time will answer all ten without flinching. The answers, in writing, are also your audit trail. If OCR ever asks how you vetted the tool, this is the document.

Compliance is the floor, not the goal

It is easy to spend the whole evaluation on HIPAA and forget that the patient on the other end just wants a refill or a clinic address at 9pm. The compliance checklist exists so you can say yes to a tool that helps that patient without exposing their data. The same scrutiny extends in the other direction too, to AI patient outreach, where the practice initiates the contact and the TCPA joins HIPAA in the calculus. Get the BAA, map the PHI, scope to the minimum, and pick a bot that hands off when it is unsure. Do that, and the technology gets to do the easy, useful thing it is good at, while the hard, sensitive parts stay where a human can see them.

Frequently Asked Questions