The Healthcare Compliance Landscape
Healthcare AI agent deployments operate under a layered compliance regime unlike any other sector's. HIPAA is the baseline federal framework, but it is not the only constraint. The FDA's Software as a Medical Device (SaMD) framework applies to AI tools that meet the definition of a medical device, which includes software intended to diagnose, prevent, treat, or monitor disease. If your agent provides clinical decision support whose basis a clinician cannot independently review, it may fall outside the FDA's non-device CDS criteria and qualify as SaMD, requiring 510(k) clearance or De Novo classification. State laws add further complexity: California's CMIA, New York's health data law, and Texas's Medical Records Privacy Act all impose requirements beyond HIPAA, with different breach notification timelines and patient rights. Beyond regulations, payer contracts and health system governance policies often add constraints of their own: many health systems require BAAs with all vendors in a data processing chain, regardless of whether those vendors technically meet the business associate definition. Any agency proposing a healthcare AI deployment that doesn't explicitly address this full regulatory stack should be treated as unprepared for the environment.
What Counts as PHI and Why It Constrains Architecture
Protected Health Information (PHI) under HIPAA is broader than most technology teams assume. The 18 HIPAA identifiers include not just obvious items (name, SSN, date of birth) but also geographic subdivisions smaller than a state, all dates related to an individual except year, phone and fax numbers, email addresses, IP addresses, device identifiers, and any other unique identifying number. Critically, health information combined with any identifier becomes PHI — a list of diagnoses is not PHI, but a list of diagnoses with patient names is. This breadth has direct architectural implications. An AI agent that processes clinical notes — even for a non-clinical purpose like scheduling optimization — is handling PHI if those notes contain any of the 18 identifiers. A vector database storing embeddings of patient records is a PHI store and must be treated as such. An LLM provider receiving PHI for inference is a business associate. The de-identification path (removing all 18 identifiers, either via the Safe Harbor method or Expert Determination) can sometimes allow data to be used without BAA requirements — but de-identification of clinical text is a non-trivial NLP problem and should not be assumed to be complete without validation by a qualified privacy officer.
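As a rough illustration of why de-identifying free text is hard, the sketch below flags a handful of the Safe Harbor identifiers with regular expressions. The patterns, category names, and sample note are hypothetical; a pattern-based screen like this is a smoke test only, not a substitute for a validated de-identification pipeline reviewed by a qualified privacy officer.

```python
import re

# Illustrative patterns for a few of the 18 Safe Harbor identifiers.
# A real de-identification pipeline needs far broader coverage (names,
# geographic subdivisions, all date elements, MRNs) plus expert validation.
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),  # any date element but year
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def flag_identifiers(text: str) -> dict[str, list[str]]:
    """Return identifier matches found in free text, keyed by category."""
    hits = {name: pat.findall(text) for name, pat in PHI_PATTERNS.items()}
    return {name: found for name, found in hits.items() if found}

note = "Pt. DOB 03/14/1962, contact 555-867-5309, jdoe@example.com"
print(flag_identifiers(note))
# → {'phone': ['555-867-5309'], 'email': ['jdoe@example.com'], 'date': ['03/14/1962']}
```

Any non-empty result means the text is PHI when paired with health information and must stay inside the BAA-covered data path.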
BAA Requirements for AI Vendors
A Business Associate Agreement (BAA) is a contract required under HIPAA whenever a covered entity or business associate shares PHI with a vendor that processes, stores, or transmits it on their behalf. For AI agent deployments, this means a BAA is required with every vendor in the data path: the LLM provider, the vector database provider, the orchestration platform, the cloud infrastructure provider, and any API integration that receives PHI. BAA coverage for frontier models typically comes through the cloud platforms: Microsoft offers BAAs covering Azure OpenAI Service (GPT-4-class models), Anthropic's Claude is HIPAA-eligible through AWS Bedrock, and Google offers BAAs for Vertex AI. Importantly, consumer-tier API access without a signed BAA (OpenAI's API at api.openai.com, Anthropic's standard Claude API) cannot be used with PHI under HIPAA. The BAA specifies the permitted uses of PHI, breach notification obligations (notification to the covered entity without unreasonable delay, and no later than 60 days after discovery), and return or destruction of PHI at contract termination. Check the /compliance-checklist for a structured evaluation of whether an agency's proposed vendor stack has BAA coverage for your specific use case.
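The vendor-by-vendor check described above can be expressed as a simple data structure. The vendor names and fields below are hypothetical; actual BAA status must be confirmed in writing with each vendor, including subprocessors.

```python
from dataclasses import dataclass

# Hypothetical vendor records for a BAA coverage review. Real status must
# come from signed agreements, not from this sketch.
@dataclass
class Vendor:
    name: str
    role: str            # e.g. "llm", "vector-db", "cloud", "integration"
    touches_phi: bool
    baa_signed: bool

def baa_gaps(stack: list[Vendor]) -> list[str]:
    """Names of vendors that receive PHI without a signed BAA."""
    return [v.name for v in stack if v.touches_phi and not v.baa_signed]

stack = [
    Vendor("Azure OpenAI", "llm", touches_phi=True, baa_signed=True),
    Vendor("vector-db-host", "vector-db", touches_phi=True, baa_signed=False),
    Vendor("analytics-tool", "integration", touches_phi=False, baa_signed=False),
]
print(baa_gaps(stack))  # a non-empty list here blocks deployment
```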
What's Deployable Now vs Not Yet
The line between deployable and not-yet-deployable healthcare AI use cases is drawn at clinical decision-making authority. Currently deployable with manageable compliance risk: prior authorization drafting (the agent drafts the clinical justification for a human clinician to review and submit — the human makes the clinical judgment), clinical documentation assistance (scribing, note summarization, ICD code suggestion for human review), administrative automation (appointment scheduling, insurance eligibility verification, patient communication for non-clinical questions), and revenue cycle management (claim scrubbing, denial management, payment posting). Not yet deployable without FDA oversight: autonomous clinical decision support where the agent's output directly influences a treatment decision without mandatory physician review, diagnostic imaging interpretation without human radiologist sign-off, and any workflow where the agent's output can trigger a medication order without human confirmation. The not-yet-deployable category is defined not by technical capability but by regulatory pathway maturity and liability exposure. Several agencies on /readiness specialize in navigating this boundary and can structure your use case to stay in the deployable zone.
Data Residency and Encryption Requirements
Healthcare AI deployments have specific data residency and encryption requirements that affect infrastructure design. Data residency: most health systems require that PHI remain within US borders. Azure OpenAI and AWS Bedrock both support US-only data residency configurations; the standard OpenAI API does not guarantee data residency. If you're building for a health system with data sovereignty requirements, verify your entire vendor stack's data residency commitments in writing, including subprocessors. Encryption requirements: HIPAA's Security Rule treats encryption of PHI at rest and in transit as an "addressable" safeguard and names no specific standard, but NIST guidelines (which OCR references in audits) point to AES-256 at rest and TLS 1.2+ in transit, and unencrypted PHI is very difficult to defend after a breach. For AI agent pipelines, this means the vector database must encrypt embeddings at rest, the orchestration layer must use encrypted channels for all API calls, and the LLM provider's data handling agreement must specify encryption standards. Encryption key management is often overlooked: for health systems with stringent control requirements, customer-managed keys (CMEK in GCP, CMK in AWS, BYOK in Azure) let the organization control encryption keys independently of the cloud provider. This adds infrastructure complexity but is increasingly required by health system security teams.
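For the in-transit requirement, one concrete control at the orchestration layer is pinning a TLS 1.2+ minimum on every outbound call. The sketch below uses Python's standard library; the endpoint wrapper is illustrative (at-rest encryption and key management live in storage and KMS configuration, not in client code).

```python
import ssl
import urllib.request

def strict_tls_context() -> ssl.SSLContext:
    """TLS context that verifies certificates and refuses anything below 1.2."""
    ctx = ssl.create_default_context()            # cert verification on by default
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # reject TLS 1.0/1.1 handshakes
    return ctx

def call_llm_endpoint(url: str, payload: bytes) -> bytes:
    """Illustrative wrapper: every PHI-bearing request goes out over strict TLS."""
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, context=strict_tls_context()) as resp:
        return resp.read()
```

Centralizing the context in one factory function makes the TLS floor auditable in a single place instead of per call site.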
Audit Trail Obligations for AI-Assisted Decisions
HIPAA's Audit Controls technical safeguard (45 CFR §164.312(b)) requires covered entities to implement hardware, software, and procedural mechanisms that record and examine activity in information systems containing electronic PHI. For AI agent systems, this translates to: every LLM call that processes PHI must be logged with the input, output, user identity, timestamp, and purpose. Every action the agent takes (document access, system update, message sent) must be traceable to a specific user session and use purpose. For clinical decision support specifically, audit trails must capture what information the agent presented, what the clinician's decision was, and the temporal relationship between agent output and clinical action. This level of audit logging is technically straightforward (most LLM orchestration frameworks support callback-based logging) but has storage and cost implications: a health system running 10,000 agent interactions per day at 2,000 tokens per interaction generates significant log volume that must be retained (HIPAA requires a minimum six-year retention for required documentation). Structure your audit architecture before building, not as an afterthought. Agencies with healthcare experience will typically have an audit logging module ready to integrate; those without one are likely underestimating compliance build cost by 30-50%.
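A minimal audit record covering the fields above, plus the back-of-envelope retention math for the volumes mentioned, might look like the sketch below. The schema is illustrative, and the ~4 bytes per token of logged text is an assumed average for English clinical text.

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(user_id: str, purpose: str, prompt: str, output: str) -> str:
    """One JSON audit entry per PHI-touching LLM call: who, what, when, why."""
    return json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user_id,
        "purpose": purpose,                  # documented use purpose
        "prompt": prompt,                    # full input, kept for examination
        "output": output,
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
    })

# Back-of-envelope retention volume for the figures in the text:
# 10,000 interactions/day x 2,000 tokens x ~4 bytes/token of logged text.
daily_bytes = 10_000 * 2_000 * 4
six_year_gb = daily_bytes * 365 * 6 / 1e9
print(f"{daily_bytes/1e6:.0f} MB/day, ~{six_year_gb:.0f} GB over 6 years")
# → 80 MB/day, ~175 GB over 6 years
```

Even before token inflation or metadata overhead, the six-year retention floor puts log storage into cold-tier territory, which is why the architecture should be planned up front.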
Evaluating a Healthcare AI Agency's Compliance Posture
When evaluating an agency for a healthcare AI engagement, compliance posture is as important as technical capability — and harder to assess from a pitch deck. The questions that reveal genuine compliance expertise: Can they produce a data flow diagram showing every system that touches PHI and the BAA status of each? Have they completed a HIPAA Security Risk Assessment for a prior healthcare client? Do they have a written incident response plan for PHI breaches, and can they demonstrate it's been tested? Are they familiar with OCR audit protocols and can they speak to the controls they implement to address each safeguard category? Do they have a dedicated privacy officer or healthcare compliance advisor, or are they relying on general legal counsel? The /interview-questions resource includes a full set of compliance-specific questions to use in agency evaluations. The /scorecard tool allows you to rate agencies on compliance dimensions alongside technical and commercial criteria. Health systems making significant AI investments should also use the /readiness assessment to evaluate their own internal compliance maturity before engaging external agencies — the most common compliance failures happen at the client's security and governance layer, not the agency's.
Agentic Use Cases With Minimal Regulatory Risk
The safest entry points for healthcare AI agents are use cases that never touch PHI at all, or that touch only de-identified or aggregate data. Revenue cycle analytics on de-identified claims data, staff scheduling optimization using role/shift data without patient identifiers, facility operations (environmental monitoring, equipment maintenance scheduling), and medical education tools (case studies using de-identified or synthetic patient scenarios) all carry minimal regulatory risk. Within PHI-touching use cases, the safest tier is administrative automation where the agent's output is reviewed by a human before any action is taken: prior auth drafting reviewed by a clinical coordinator, appointment reminders approved by patient services staff, claim edits reviewed by a billing specialist. The risk profile of these use cases is similar to using a word processor or spreadsheet to assist the same tasks — the AI assists a human decision, it doesn't replace it. Organizations that start in this tier, build their compliance infrastructure and BAA stack properly, and demonstrate measurable ROI create the organizational foundation and regulatory track record needed to expand into higher-complexity use cases over time. Use the /search filters for healthcare-specific agencies with HIPAA compliance experience to identify partners who have already navigated this landscape.