Flagship case study

One ERP AI layer

Three production AI systems for a single enterprise ERP client — Operations Copilot, NL2SQL Data Agent, and Knowledge Assistant — built on one shared safety and evaluation practice. Independently deployed; wired together only where a real dependency exists.

~73% prompt-context cut · deterministic SQL guard · single-use approved writes · trace-grounded answers

The problem

An ERP holds the data a business actually runs on — inventory, orders, suppliers, money. Bolting a chatbot onto that is easy; building one that can act on it without corrupting state or inventing numbers is the hard part. The bar for shipping was concrete: every write passes through human approval, every analytical query is safe by construction, and every answer carries grounding you can audit.

Architecture

Operations Copilot is the centerpiece. It routes to role-scoped domain specialists and calls governed tools over MCP — including the optional NL2SQL Data Agent service. Knowledge Assistant is a sibling system in the same ERP AI layer, not orchestrated by the copilot.

Operator Console — React + ECharts

↓

Operations Copilot (FastAPI + DeepAgents): auth · sessions · streaming · approvals · proactive monitors · trace grounding

↓

Router → role-scoped specialists

sales-analyst · order-manager · purchasing · inventory · customer-insights · data-warehouse-analyst*

MCP clients

Spring Boot MCP (Java 21 · Spring AI): MySQL business data · 10 read + 4 write tools · approval execution (payload hash · actor/session/tool binding · 15-min TTL · one-time use)

NL2SQL MCP*: SQLGlot guard · Qdrant semantic layer · ClickHouse / DuckDB

supporting: MongoDB (sessions · traces · audits · alerts) · Docker sandbox (isolated Python analysis) · eval harnesses (routing · tool-choice · grounding · live-smoke)

Sibling system · not orchestrated by the copilot

Knowledge Assistant RAG: LangGraph + Milvus hybrid retrieval · citation grounding + strict-evidence refusal · RBAC

* optional / read-only paths.

How it works

Operations Copilot

A FastAPI service running a DeepAgents harness. A router classifies each request to a role-scoped specialist that receives only the tools it needs — selected from a static catalog and tags, not prompt guesswork. Answers are labeled authoritative, derived, or unverified from captured tool traces.
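Catalog-driven tool selection of this kind can be sketched in a few lines. This is a minimal illustration, not the project's code: the tool names, the tags, and the role-to-tag mapping are all hypothetical, and only the specialist names come from the architecture above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    tags: frozenset[str]
    writes: bool = False


# Hypothetical catalog entries; the real tool names are not given in the source.
CATALOG = (
    Tool("get_order", frozenset({"orders"})),
    Tool("list_low_stock", frozenset({"inventory"})),
    Tool("sales_summary", frozenset({"sales"})),
    Tool("propose_purchase_order", frozenset({"purchasing"}), writes=True),
)

# Hypothetical role-to-tag mapping, keyed by specialist names from the diagram.
ROLE_TAGS = {
    "sales-analyst": frozenset({"sales"}),
    "order-manager": frozenset({"orders"}),
    "purchasing": frozenset({"purchasing", "inventory"}),
    "inventory": frozenset({"inventory"}),
}


def tools_for(role: str) -> list[str]:
    """Return the names of catalog tools whose tags overlap the role's tags."""
    allowed = ROLE_TAGS.get(role, frozenset())
    return [tool.name for tool in CATALOG if tool.tags & allowed]
```

Because the lookup is a plain set intersection over static data, the tool set a specialist sees is the same on every request and can be asserted in a unit test, which is the point of keeping selection out of the prompt.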

Governed tools & approval boundary

A Spring AI / Java 21 MCP server owns the MySQL business data and exposes 10 read + 4 write tools. Approval is deliberately not an agent tool: the model can propose a write, but execution runs through a human-controlled REST path bound to a single-use, payload-hashed, TTL'd approval.
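The checks behind such an approval can be illustrated with a small, language-agnostic sketch. The production path lives in the Spring Boot service, so the Python below is only an assumption about the shape of the logic: `ApprovalStore`, `issue`, and `consume` are hypothetical names, and the in-memory dict stands in for whatever store the real system uses. The 15-minute TTL matches the architecture diagram.

```python
import hashlib
import hmac
import json
import secrets
import time
from dataclasses import dataclass

TTL_SECONDS = 15 * 60  # 15-minute TTL, as in the architecture diagram


def payload_hash(payload: dict) -> str:
    """Hash a canonical JSON form so key order does not change the digest."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class Approval:
    actor: str
    session: str
    tool: str
    digest: str
    expires_at: float
    used: bool = False


class ApprovalStore:
    """Hypothetical in-memory store; a real service would persist approvals."""

    def __init__(self) -> None:
        self._approvals: dict[str, Approval] = {}

    def issue(self, actor, session, tool, payload, now=None) -> str:
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(16)
        self._approvals[token] = Approval(
            actor, session, tool, payload_hash(payload), now + TTL_SECONDS
        )
        return token

    def consume(self, token, actor, session, tool, payload, now=None) -> bool:
        """Return True exactly once, and only if every binding still matches."""
        now = time.time() if now is None else now
        approval = self._approvals.get(token)
        if approval is None or approval.used or now > approval.expires_at:
            return False
        if (approval.actor, approval.session, approval.tool) != (actor, session, tool):
            return False
        if not hmac.compare_digest(approval.digest, payload_hash(payload)):
            return False
        approval.used = True  # one-time use: a replay of the same token fails
        return True
```

Binding the approval to a hash of the exact payload means the write that executes is the write the human saw: if the proposed arguments change after approval, the digest no longer matches and execution is refused.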

Analytics Agent (NL2SQL)

Self-service analytics over the warehouse, reached by Operations Copilot through the optional NL2SQL MCP service. A deterministic SQLGlot guard (SELECT-only, scope/fanout checks, auto-LIMIT) blocks unsafe SQL before it runs, and a Qdrant semantic layer cuts prompt context ~73% vs full-schema dumps. The agent also includes bounded SQL repair and result-equivalence regression evals across DuckDB and ClickHouse.

Knowledge Assistant (RAG)

A standalone LangGraph-orchestrated RAG system over enterprise documents: Milvus hybrid retrieval with intent-routed strategies, citation grounding with strict-evidence refusal, RBAC, and LLM-judge / citation evals with per-query observability.

Outcomes & evidence

~73% prompt-context reduction via the Qdrant semantic layer vs full-schema dumps.
Deterministic SQLGlot guard — SELECT-only, scope/fanout checks, auto-LIMIT; dangerous SQL is blocked before it reaches the warehouse.
Single-use, cryptographically bound approval on every write, validated for actor/session/tool binding, payload hash, expiry, and one-time use before execution.
Trace-grounded answers marked authoritative / derived / unverified, with source evidence from tool calls.
Eval harnesses for routing, tool-choice, grounding, result-equivalence, and live smoke — agent behavior you can measure before shipping.
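The authoritative / derived / unverified labels listed above could be assigned along the following lines. The exact rules are not described here, so this sketch rests on an assumption: a claim is authoritative when it cites captured tool traces and repeats their values directly, derived when it cites traces but was computed from them, and unverified when it cites nothing that was actually captured. `ToolTrace`, `Claim`, and `label` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class ToolTrace:
    trace_id: str
    tool: str
    result: dict


@dataclass
class Claim:
    text: str
    trace_ids: list[str] = field(default_factory=list)
    computed: bool = False  # True when the value was calculated from tool results


def label(claim: Claim, traces: list[ToolTrace]) -> str:
    """Label a claim from the traces captured during the same turn."""
    known = {trace.trace_id for trace in traces}
    # No citations, or a citation to a trace that was never captured.
    if not claim.trace_ids or not set(claim.trace_ids) <= known:
        return "unverified"
    return "derived" if claim.computed else "authoritative"
```

For example, with one captured trace `t1`, a claim citing `t1` directly is labeled authoritative, the same claim with `computed=True` is labeled derived, and a claim citing an unknown `t9` is labeled unverified.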

Stack

Python · FastAPI · DeepAgents · LangGraph · LangChain MCP adapters · Spring Boot / Spring AI · Java 21 · MyBatis-Plus · MySQL · MongoDB · ClickHouse · DuckDB · Milvus · Qdrant · SQLGlot · React · ECharts · Docker

Per-system deep dives

Operations Copilot · Analytics Agent (NL2SQL) · Knowledge Assistant (RAG)

Source

ecommerce-agent · ecommerce-mcp-server · nl2sql-data-agent · enterprise_rag