Most teams validate user input obsessively. But when an LLM generates output — SQL queries, JSON configs, executable code — they pass it straight through. The model is trusted implicitly. This is the equivalent of executing user input without sanitization.
Your "natural language to SQL" feature generates a query for "show me last month's revenue." The LLM returns a valid-looking query that includes DROP TABLE users. Your app executes it because the SQL is syntactically correct.
Your API uses an LLM to generate configuration JSON. The model hallucinates an extra field: "admin": true. Downstream code checks for that field and grants elevated privileges.
Your AI coding assistant generates a helper function. Buried in the implementation is a fetch() call to an external URL that exfiltrates environment variables. It looks like a logging utility.
Output has invalid syntax. Unclosed JSON braces, malformed SQL, broken code. Your parser throws an error — or worse, silently corrupts data.
LLM returns { "users": [{ "name": "Alice" } — missing closing brackets. JSON.parse() throws. If you catch and retry, you burn tokens. If you don't catch, your app crashes.
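The minimum defense is to never call JSON.parse() on model output directly. A sketch, using a hypothetical safeJsonParse helper, that converts the throw into a value your caller can act on (retry, fall back, or alert):

```typescript
// Sketch: wrap JSON.parse so truncated LLM output fails
// gracefully instead of crashing the request handler.
function safeJsonParse<T>(
  raw: string
): { ok: true; value: T } | { ok: false; error: string } {
  try {
    return { ok: true, value: JSON.parse(raw) as T };
  } catch (err) {
    // Truncated output like `{ "users": [{ "name": "Alice" }` lands here.
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Caller decides whether to retry (and spend tokens) or fall back.
const result = safeJsonParse<{ users: { name: string }[] }>(
  '{ "users": [{ "name": "Alice" }'
);
console.log(result.ok); // false: the braces were never closed
```

Returning a tagged result instead of throwing keeps the retry policy in one place instead of scattered across catch blocks.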
Output is syntactically valid but means the wrong thing. The SQL query runs but returns wrong data. The JSON parses but contains fabricated values.
Asked for "active users this month," LLM generates SELECT * FROM users (no date filter, no active check). Query succeeds. Dashboard shows 10x the real number. Nobody notices for weeks.
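Syntax checks won't catch this. One cheap semantic guardrail is to assert that the generated SQL actually contains the constraints the request implies. A sketch, where the intent-to-fragment mapping and column names are illustrative assumptions, not a real API:

```typescript
// Map request intents to SQL fragments the generated query must contain.
// The fragments and column names below are assumed for illustration.
const REQUIRED_FRAGMENTS: Record<string, RegExp[]> = {
  "active users this month": [
    /\bWHERE\b/i,             // must filter at all
    /active/i,                // must check the active flag
    /created_at|last_seen/i,  // must touch a date column (assumed names)
  ],
};

function missingConstraints(request: string, sql: string): string[] {
  const required = REQUIRED_FRAGMENTS[request] ?? [];
  return required.filter((re) => !re.test(sql)).map((re) => re.source);
}

// `SELECT * FROM users` fails all three checks, so it gets rejected
// before it can inflate the dashboard.
console.log(missingConstraints("active users this month", "SELECT * FROM users"));
```

This is a heuristic, not a proof of correctness, but it turns a weeks-long silent data bug into an immediate rejection.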
Attacker influences the LLM to produce output that bypasses your downstream validation. The output is designed to exploit whatever system consumes it.
Attacker's prompt injection causes the LLM to output JSON with "role": "admin" or SQL with UNION SELECT password FROM credentials. The output is syntactically perfect — it's the intent that's malicious.
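One defense is to treat privilege-bearing fields as forbidden in model output entirely: reject any output that sets them, and derive roles from the authenticated session instead. A sketch, with an illustrative field list:

```typescript
// Fields the model must never control. Field names are illustrative;
// list whatever your downstream code treats as privilege-bearing.
const PRIVILEGED_FIELDS = ["role", "admin", "permissions", "scopes"];

function rejectPrivilegedFields(
  output: Record<string, unknown>
): Record<string, unknown> {
  for (const field of PRIVILEGED_FIELDS) {
    if (field in output) {
      throw new Error(`LLM output attempted to set privileged field: ${field}`);
    }
  }
  return output;
}
```

Rejecting is safer than silently stripping: a stripped field hides the injection attempt, while a thrown error gives you a signal worth alerting on.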
Create explicit schemas (JSON Schema, SQL grammar subset, AST patterns) for every type of LLM output your app accepts.
```typescript
import { z } from "zod";

const OutputSchema = z.object({
  query: z.string().max(500),
  parameters: z.array(z.string()),
  tables: z.array(z.enum(["users", "orders", "products"])),
});
```

Never use LLM output as a raw string. Always parse it through a strict validator before any downstream use.
```typescript
const parsed = OutputSchema.safeParse(llmOutput);
if (!parsed.success) {
  logger.warn("LLM output rejected", parsed.error);
  return fallbackResponse();
}
```

If your LLM generates SQL, use an allowlist of permitted operations. Block DROP, DELETE, UPDATE, ALTER, and any DDL.
```typescript
const BLOCKED = /\b(DROP|DELETE|ALTER|TRUNCATE|INSERT|UPDATE|GRANT|EXEC)\b/i;
if (BLOCKED.test(generatedSQL)) {
  throw new Error("Destructive SQL blocked");
}
```

If your LLM generates code, parse the AST and check for dangerous patterns: network calls, file system access, eval(), dynamic imports.
```typescript
// Check for dangerous patterns in generated code
const DANGEROUS = [
  /\bfetch\s*\(/,    // Network calls
  /\beval\s*\(/,     // Code execution
  /\brequire\s*\(/,  // Dynamic imports
  /process\.env/,    // Env access
];

if (DANGEROUS.some((pattern) => pattern.test(generatedCode))) {
  throw new Error("Dangerous pattern in generated code");
}
```
If you must execute LLM-generated code or SQL, run it in a sandboxed environment with no network access, read-only data, and strict timeouts.
```typescript
// Execute in isolated context
const result = await sandbox.execute(validatedCode, {
  timeout: 5000,
  network: false,
  filesystem: "read-only",
  maxMemory: "128MB",
});
```

Custodia detects LLM output used without validation: unvalidated SQL generation, raw JSON parsing, unsandboxed code execution, and more.
Scan Your Code Free →