SOC 2 is now table stakes for enterprise AI sales. The 2025–2026 audit cycle introduces AI-specific sub-criteria under CC6 and CC7: auditors check whether LLM inference endpoints are authenticated, whether model outputs are logged, and whether you have incident response procedures for prompt injection attacks. custodia scan . generates a SOC 2-mapped security report that documents code-level evidence for each criterion — formatted for auditor submission.
SOC 2 was designed for traditional SaaS — database access, API authentication, encryption. When your product processes user requests through an LLM, you introduce a new attack surface that the original trust criteria weren't written to cover.
AICPA (the body that governs SOC 2) has updated audit guidance to explicitly include AI system controls. Auditors with AI-company experience now ask: How do you prevent prompt injection? What happens to user data in the inference payload? Who can modify the system prompt? Is there a kill switch?
Without code-level answers to these questions, you fail the AI-specific sub-criteria even if your traditional SOC 2 controls are solid.
Focus: CC6, CC7, and Availability — the criteria where AI companies most often have gaps.
CC6.1: Logical access security software, infrastructure, and architectures are implemented. AI-specific auditor focus: is LLM inference gated by authentication? Are training data buckets access-controlled with least privilege? Is there role-based access to model configs?
// ❌ CC6.1 — LLM inference endpoint unprotected
// No auth middleware → anyone can call your model
export async function POST(req: Request) {
  const { prompt } = await req.json();
  const result = await openai.chat.completions.create({
    messages: [{ role: 'user', content: prompt }],
    model: 'gpt-4o',
  });
  return Response.json({ result });
}

// ✅ CC6.1 — Auth gating on inference endpoint
import { auth } from '@clerk/nextjs/server';

export async function POST(req: Request) {
  const { userId, orgId } = await auth();
  if (!userId) return new Response('Unauthorized', { status: 401 });

  // Log for audit trail (CC6.2)
  await auditLog.write({
    event: 'llm.inference',
    userId, orgId,
    timestamp: Date.now(),
  });

  const { prompt } = await req.json();
  const result = await openai.chat.completions.create({
    messages: [{ role: 'user', content: prompt }],
    model: 'gpt-4o',
  });
  return Response.json({ result });
}

CC6.2: New internal and external users are registered and authorized. AI-specific auditor focus: all user provisioning goes through a defined process, and access is removed on offboarding. For AI systems: who can add new models, fine-tune, or update system prompts?
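CC6.2 for AI systems comes down to who can change what the model is told. A minimal sketch of role-gated system prompt updates with an audit trail (the role names, the in-memory store, and the helper functions are illustrative assumptions, not a prescribed API):

```typescript
// CC6.2 sketch: only designated roles may modify the system prompt,
// and every change is recorded for the audit trail.
type Role = 'admin' | 'prompt_editor' | 'viewer';

interface PromptChange {
  userId: string;
  role: Role;
  before: string;
  after: string;
  timestamp: number;
}

// Stand-in for a durable, append-only audit log (assumption)
const promptAudit: PromptChange[] = [];
let systemPrompt = 'You are a helpful assistant.';

export function updateSystemPrompt(userId: string, role: Role, next: string): boolean {
  // Least privilege: viewers cannot touch the prompt
  if (role !== 'admin' && role !== 'prompt_editor') return false;
  promptAudit.push({ userId, role, before: systemPrompt, after: next, timestamp: Date.now() });
  systemPrompt = next;
  return true;
}

export function getSystemPrompt(): string { return systemPrompt; }
export function auditEntryCount(): number { return promptAudit.length; }
```

The denied-change path matters as much as the happy path: an auditor will ask what happens when an unauthorized user tries.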
CC6.8: Measures against malware and unauthorized software. AI-specific auditor focus: your LLM inference pipeline must not execute arbitrary code from model outputs. Prompt injection leading to code execution is a CC6.8 finding.
CC7.1: Vulnerabilities are identified and the risk is managed. AI-specific auditor focus: auditors will ask when you last ran a security scan, whether you have a process for finding and remediating OWASP vulnerabilities, and whether you can show scan history.
CC7.2: Security events are identified and responded to. AI-specific auditor focus: you need monitoring on LLM inference: anomalous prompt lengths, high-frequency requests from single IPs, and unusual output patterns indicating prompt injection. Rate limiting is evidence for this criterion.
// ❌ CC7.2 — No monitoring on inference
// Model DoS and injection attempts go undetected
export async function POST(req: Request) {
  const { prompt } = await req.json();
  // No rate limiting, no logging, no anomaly detection
  const result = await model.infer(prompt);
  return Response.json(result);
}

// ✅ CC7.2 — Monitoring + rate limiting
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(20, '1 m'),
});

export async function POST(req: Request) {
  const ip = req.headers.get('x-forwarded-for') ?? 'unknown';
  const { success } = await ratelimit.limit(ip);
  if (!success) {
    securityLog.warn({ event: 'rate_limit.exceeded', ip });
    return new Response('Too Many Requests', { status: 429 });
  }

  const { prompt } = await req.json();
  securityLog.info({ event: 'llm.inference', ip, promptLen: prompt.length });
  return Response.json(await model.infer(prompt));
}

CC7.4: Incidents are identified, managed, and documented. AI-specific auditor focus: an incident response plan is required documentary evidence. For AI, your plan must cover prompt injection attacks, model output failures, and data leakage via inference.
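A CC7.4 plan needs a trigger: something that turns a suspicious inference into a tracked, documented incident. A heuristic sketch (the regex patterns and the incident record shape are illustrative assumptions; real detectors combine many signals):

```typescript
// CC7.4 sketch: flag suspected prompt injection and open a documented incident.
// Patterns are illustrative, not a complete detector.
const INJECTION_PATTERNS = [
  /ignore (all )?previous instructions/i,
  /reveal (your )?system prompt/i,
];

interface Incident {
  id: string;
  kind: 'prompt_injection_suspected';
  prompt: string;
  openedAt: number;
}

// Stand-in for a ticketing / incident-management system (assumption)
const incidents: Incident[] = [];

export function screenPrompt(prompt: string): Incident | null {
  if (!INJECTION_PATTERNS.some((re) => re.test(prompt))) return null;
  const incident: Incident = {
    id: `inc-${incidents.length + 1}`,
    kind: 'prompt_injection_suspected',
    prompt,
    openedAt: Date.now(),
  };
  // Documented evidence: when it was opened, what triggered it, how it was classified
  incidents.push(incident);
  return incident;
}
```

Even a crude screen like this gives an auditor what they need to see: detection wired to a record, not a policy document sitting in a drawer.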
A1.1: Capacity planning processes are in place. AI-specific auditor focus: for LLM-dependent systems, you must demonstrate max_tokens limits, fallback behavior when the upstream model API is unavailable, and documented SLA targets.
// ❌ A1.1 — No fallback, no token cap
// Provider outage = your app returns 500 to all users
const result = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: prompt }],
  // No max_tokens, no timeout, no fallback
});

// ✅ A1.1 — Timeout + fallback + token cap
const result = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: prompt }],
  max_tokens: 1000,
}, { timeout: 10_000 }) // per-request timeout via openai-node request options
  .catch(async (err) => {
    // Fallback to cheaper model on failure
    logger.error({ event: 'llm.primary_failed', err });
    return openai.chat.completions.create({
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: prompt }],
      max_tokens: 500,
    });
  });

A1.2: Environmental protections and monitoring procedures. AI-specific auditor focus: availability monitoring, meaning uptime checks on inference endpoints, alerting on error rate spikes, and runbooks for LLM provider outages.
Auditors collect evidence through an Evidence Request List (ERL). For AI companies, expect these to appear in your ERL:
Common AI-specific ERL items map to CC6.1, CC6.2, CC6.7, CC7.1, CC7.2, CC7.4, CC9.2, and A1.2.

If you sell software to enterprise customers, you will be asked for SOC 2 Type II. Enterprise buyers increasingly require it before signing contracts. For AI companies specifically, SOC 2 is now often required alongside EU AI Act and NIST AI RMF documentation in vendor security questionnaires.
CC6 (Logical and Physical Access Controls) covers how your system restricts access to data and functionality. For AI systems, auditors specifically check: is LLM inference gated by authentication? Are training data buckets access-controlled with least privilege? Is there role-based access to model configs? Are system prompt changes logged?
SOC 2 Type I (point-in-time assessment) typically takes 1-2 months. SOC 2 Type II (audit over observation period) requires a minimum 6-month observation period plus 1-2 months for audit fieldwork. The observation period starts as soon as controls are in place — implement controls now to start your clock.
Authentication on all data endpoints (CC6), audit logging of access events (CC6, CC7), encryption of data in transit and at rest (CC6), monitoring and alerting on anomalies (CC7), incident response procedures (CC7), and availability metrics. For AI systems: logging of model inference requests, access controls on training pipelines, and data retention policies for model inputs/outputs.
Yes. Custodia maps security findings to SOC 2 trust criteria. The PDF report from custodia scan includes a SOC 2 mapping section showing which CC criteria have gaps in your code, suitable for sharing with auditors and for internal compliance tracking.
Custodia maps every finding to SOC 2, NIST AI RMF, and EU AI Act criteria. One scan. Auditor-ready PDF output.