Notes
A deterministic SQL guard for NL2SQL
Natural-language-to-SQL is a great demo and a terrifying production feature. The moment a model writes queries against your warehouse, two bad things become possible: a query that mutates or destroys data, and a query that's technically read-only but scans everything or quietly returns the wrong number. You can't prompt your way out of either — “only write safe SELECTs” is a suggestion, not a guarantee.
The guard in the analytics agent takes a firm position:
validate the parse tree, not the string. Regex blocklists are a
trap — /*x*/DROP, casing, whitespace, and nested subqueries all slip
past them. So every candidate query is parsed with SQLGlot into an AST, and the
guard reasons about structure, where formatting tricks don't exist.
def guard_sql(sql, scope, dialect) -> GuardResult:
statements = parse(sql, dialect) # SQLGlot AST
if len(statements) != 1:
return reject("syntax_guard", "one statement only")
expr = statements[0]
if blocked_command(sql, dialect): # DROP / INSERT / UPDATE / ...
return reject("operation_guard", "command not allowed")
if not isinstance(expr, exp.Select): # root node must be SELECT
return reject("operation_guard", "SELECT only")
if blocked_function(expr): # file / system functions
return reject("function_guard", "function not allowed")
for table in expr.find_all(exp.Table): # every table in the allowlist
if table.name not in scope.allowed_tables:
return reject("scope_guard", f"table {table.name} not allowed")
if fanout_risk(expr): # join would double-count amounts
return reject("fanout_guard", "fan-out risk")
expr = apply_auto_limit(expr) # AST rewrite, not string append
return GuardResult(allowed=True, sql=expr.sql(dialect)) Layers, each closing a class of failure
- One statement, SELECT only. The root of the tree must be a
Selectnode; multiple statements andUNIONare rejected. Because it's checked on node type,SELECTdressed up with comments or trailing; DROP …doesn't parse to something allowed. - Blocked commands & functions. DDL/DML and file/system functions are refused outright — the things a read-only analytics path should never touch.
- Table scope. Every
Tablenode in the tree — including ones buried in subqueries and CTEs — must be in an allowlist. The model can't reach tables it wasn't given. - Fan-out risk. This one is about correctness, not safety: a join that double-counts an order amount produces a confident, wrong total. The guard detects that structural shape and rejects it. Wrong numbers are as dangerous as unsafe ones.
- Auto-LIMIT. A bound is injected by rewriting the AST (not by
string-appending
LIMIT), unless the query is a scalar aggregate that returns a single row.
Fail closed, then repair
The guard returns a structured GuardResult with a category —
scope_guard, fanout_guard, and so on. That category feeds
a bounded repair loop: a rejected query goes back to the model
with the specific reason, it gets a small number of attempts to fix it, and if it
still won't pass, the request fails closed rather than guessing. Nothing reaches
the warehouse until the tree is provably safe.
Two things make this practical. A Qdrant-backed semantic layer feeds the model only the relevant tables and columns instead of the whole schema — which cut prompt context by ~73% and, as a side effect, shrank what the model could even attempt to reference. And a result-equivalence smoke suite runs the same questions across DuckDB and ClickHouse, so a guard or prompt change that breaks an answer is caught before it ships.
The model proposes SQL; the guard decides whether it runs. Safe by construction, not by instruction.