Raw SQL, shell, or code pass-through tools
A run_sql(query) or exec(command) tool hands the model a single free-text field of code. That gives it the widest possible blast radius and makes it the easiest tool to weaponize through injection, because the model writes the payload itself.
see_it · fix_it
The failure, then the fix
Each verdict below is the actual output of the MCP & Agent Tool Auditor run on the snippet, not a description of it.
[
  {
    "name": "run_sql",
    "description": "Run a read-only SQL query against the analytics replica and return the rows. Use this when the user asks for a metric.",
    "input_schema": {
      "type": "object",
      "properties": { "sql": { "type": "string", "description": "The SQL query to run" } },
      "required": ["sql"]
    }
  }
]
Warns · auditor verdict
Pass-through tool(s) that hand the model a single free-text field of code, SQL, or shell: run_sql. A run_sql(query) or exec(command) tool has the widest possible blast radius and is the easiest to weaponize through injection, since the model writes the payload itself.
[
  {
    "name": "get_orders_by_status",
    "description": "Return the count and total value of orders in a given status over a date range. Use this when the user asks for order metrics.",
    "input_schema": {
      "type": "object",
      "properties": {
        "status": { "type": "string", "enum": ["pending", "paid", "shipped", "refunded"], "description": "Order status to filter by" },
        "since": { "type": "string", "description": "Start date, ISO-8601 like 2026-01-01" }
      },
      "required": ["status", "since"]
    }
  }
]
Passes · auditor verdict
No tool exposes a raw code, SQL, or shell field, so the model can only act through specific, typed operations.
fix · Replace the free-text pass-through with narrow, intent-specific tools (get_orders_by_status instead of run_sql), or constrain it to a vetted allow-list of operations.
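On the server side, a narrow tool like get_orders_by_status can be backed by a fixed, parameterized query, so the model only ever supplies values and never SQL text. The following is a minimal sketch, not the implementation behind the snippet above: it assumes a SQLite database and a hypothetical orders table with status, total, and created_at columns.

```python
import sqlite3
from datetime import date

# Mirrors the enum in the tool schema; anything else is rejected before SQL runs.
ALLOWED_STATUSES = {"pending", "paid", "shipped", "refunded"}

# The SQL text is fixed. The model only supplies values for the two placeholders.
# Table and column names are assumptions made for this sketch.
ORDERS_QUERY = """
    SELECT COUNT(*), COALESCE(SUM(total), 0)
    FROM orders
    WHERE status = ? AND created_at >= ?
"""


def get_orders_by_status(conn: sqlite3.Connection, status: str, since: str) -> dict:
    """Handler for the get_orders_by_status tool call."""
    if not isinstance(status, str) or status not in ALLOWED_STATUSES:
        raise ValueError(f"unsupported status: {status!r}")
    # date.fromisoformat raises ValueError for anything that is not a valid date,
    # so trailing SQL fragments never reach the database.
    since_date = date.fromisoformat(since)
    count, total = conn.execute(
        ORDERS_QUERY, (status, since_date.isoformat())
    ).fetchone()
    return {
        "status": status,
        "since": since_date.isoformat(),
        "count": count,
        "total_value": total,
    }
```

Even if an injected instruction persuades the model to put SQL into status or since, the handler rejects the value or binds it as plain data.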
why_it_matters
A pass-through tool trades safety for convenience: instead of specific operations, it exposes one free-text field (query, command, code, script) that can do anything. An injected instruction therefore doesn't have to find a dangerous tool; it only has to get the model to write a dangerous string into a tool you already shipped. run_sql can read every table; exec can do anything the process can. The model authoring the payload is exactly what makes it dangerous.
Replace the pass-through with narrow, intent-specific tools: get_orders_by_status instead of run_sql, restart_service(name) instead of exec. If a general capability is genuinely needed, constrain it to a vetted allow-list of operations and validate every call strictly in code. The auditor flags a tool whose only parameter is a single free-text code or query field on a high-impact operation.
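To make the flagging rule concrete, here is a rough sketch of a check in the same spirit. It is not the auditor's actual implementation: the field-name set and the name hints that stand in for "high-impact" are assumptions chosen for illustration.

```python
# Field names that usually carry code; taken from the examples in this article.
CODE_FIELD_NAMES = {"sql", "query", "command", "cmd", "code", "script"}

# Hypothetical stand-in for "high-impact operation": substrings of the tool name.
HIGH_IMPACT_HINTS = ("sql", "exec", "shell", "eval")


def is_pass_through(tool: dict) -> bool:
    """Return True if the tool's only parameter is a free-text code field."""
    props = tool.get("input_schema", {}).get("properties", {})
    if len(props) != 1:
        return False  # typed, multi-parameter tools are not pass-throughs
    field_name, spec = next(iter(props.items()))
    # Free text means a plain string with no enum narrowing it.
    free_text = spec.get("type") == "string" and "enum" not in spec
    tool_name = tool.get("name", "").lower()
    high_impact = any(hint in tool_name for hint in HIGH_IMPACT_HINTS)
    return free_text and field_name.lower() in CODE_FIELD_NAMES and high_impact
```

Under these assumptions the run_sql definition above is flagged, while get_orders_by_status is not, because it has two typed parameters and no code field.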
more_failure_modes
Related ways tools break
Destructive tools with no guardrail
Tool selection · Descriptions that don't say when to use the tool
See all 6 failure modes, or read what tool calling is.
faq
Questions & answers
- Is it safe to give an LLM a run_sql or exec tool?
- It's the riskiest tool you can ship, because the model writes the payload, so an injected instruction only has to get a malicious string into a tool you already exposed. Prefer narrow, intent-specific tools. If you must offer a general one, make it strictly read-only, constrain it to an allow-list, and validate every call in code.
- What's the alternative to a generic database or shell tool?
- Decompose it into the specific operations the agent actually needs: get_orders_by_status rather than run_sql, restart_service(name) rather than exec(command). Typed, scoped tools let the model act without handing it a blank check, and they're far easier to audit and rate-limit.
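For the case in the first answer where a general tool is unavoidable, the three constraints (read-only, allow-list, validation in code) might look like the sketch below. It assumes SQLite; the query names, SQL text, and orders table are hypothetical.

```python
import sqlite3

# Vetted operations: the model picks a name and supplies values, never SQL text.
# Each entry is (fixed SQL, number of parameters it expects).
ALLOWED_QUERIES = {
    "order_count_by_status": ("SELECT COUNT(*) FROM orders WHERE status = ?", 1),
    "total_order_count": ("SELECT COUNT(*) FROM orders", 0),
}


def open_read_only(path: str) -> sqlite3.Connection:
    # A SQLite URI with mode=ro makes the connection itself refuse writes,
    # independent of what the handler code does.
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def run_named_query(conn: sqlite3.Connection, name: str, params=()) -> list:
    """Run one allow-listed query; reject unknown names and malformed parameters."""
    if name not in ALLOWED_QUERIES:
        raise ValueError(f"query not on the allow-list: {name!r}")
    sql, expected = ALLOWED_QUERIES[name]
    params = tuple(params)
    if len(params) != expected:
        raise ValueError(f"{name} expects {expected} parameter(s)")
    if not all(isinstance(p, (str, int, float)) for p in params):
        raise ValueError("parameters must be plain scalars")
    return conn.execute(sql, params).fetchall()
```

The read-only connection is the backstop: even a bug in the allow-list cannot turn a call into a write.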