
agent_tools

The 6 ways an agent's tools fail

A tool-calling agent never sees your code. It chooses what to call from the names, descriptions, and schemas in the tool block, so that is where it quietly breaks. Below are the six failure modes, each with a real before/after fix. You can also audit your own tool set with the free tool.

Tool selection

Two tools the model can't tell apart

When two tools have similar names or descriptions, the model has no reliable basis for choosing between them, so its choice is effectively arbitrary. That shows up as flaky, hard-to-reproduce agent behavior.
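As a rough illustration, the sketch below flags pairs of tools whose descriptions are nearly identical, then shows the same pair rewritten so each description names its own corpus and use case. The tool names, the dict layout, and the 0.8 threshold are all assumptions made for the example, not part of any particular provider's API; the same comparison could be run on tool names.

```python
from difflib import SequenceMatcher
from itertools import combinations


def confusable_pairs(tools, threshold=0.8):
    """Return (name, name) pairs whose descriptions are nearly identical.

    The 0.8 threshold is an arbitrary starting point, not a tuned value.
    """
    pairs = []
    for a, b in combinations(tools, 2):
        ratio = SequenceMatcher(
            None, a["description"].lower(), b["description"].lower()
        ).ratio()
        if ratio >= threshold:
            pairs.append((a["name"], b["name"]))
    return pairs


# Before: two hypothetical tools that differ only in their verb.
before = [
    {"name": "search_docs", "description": "Search documents and return matches."},
    {"name": "find_docs", "description": "Find documents and return matches."},
]

# After: each description says which corpus it covers and when to pick it.
after = [
    {
        "name": "search_help_center",
        "description": (
            "Search the public help center for customer-facing articles. "
            "Use for product how-to questions."
        ),
    },
    {
        "name": "search_internal_wiki",
        "description": (
            "Search the internal engineering wiki for runbooks and design docs. "
            "Use for incident or architecture questions."
        ),
    },
]

print(confusable_pairs(before))  # the near-duplicate pair is reported
print(confusable_pairs(after))   # no pairs reported
```

String similarity is only a proxy: two descriptions can be worded differently and still overlap in meaning, so a check like this catches the obvious cases and nothing more.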

The failure and the fix

Tool selection

Descriptions that don't say when to use the tool

The tool description is the model's only basis for choosing a tool. A missing description, or a one-liner that says what the tool is but not when to use it, leaves the model guessing, so the tool gets called at the wrong time or not at all.
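A before/after pair makes the difference concrete. The sketch below also includes a deliberately crude lint that flags descriptions that are very short or contain no "when to use" cue. The `get_weather` tool, the cue phrases, and the eight-word minimum are assumptions chosen for illustration.

```python
# Phrases that suggest the description explains when to call the tool.
# This list is a hypothetical starting point, not an exhaustive one.
USAGE_CUES = ("use this", "use when", "use for", "call this", "do not use")


def lacks_usage_guidance(tool, min_words=8):
    """Flag descriptions that are missing, very short, or carry no usage cue."""
    text = tool.get("description", "").lower()
    too_short = len(text.split()) < min_words
    no_cue = not any(cue in text for cue in USAGE_CUES)
    return too_short or no_cue


# Before: says what the tool is, not when to call it.
before = {"name": "get_weather", "description": "Gets weather."}

# After: states the output, the trigger, and a case where it should not be used.
after = {
    "name": "get_weather",
    "description": (
        "Return current conditions (temperature, wind, precipitation) for one city. "
        "Use this when the user asks about the weather right now. "
        "Do not use it for multi-day forecasts."
    ),
}

print(lacks_usage_guidance(before))
print(lacks_usage_guidance(after))
```

A keyword check cannot judge whether the guidance is actually good; it only catches descriptions that make no attempt at it.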

The failure and the fix

Parameter schema

Parameters with no description

A parameter with no description is one the model fills in by guessing at the format. Undescribed parameters are where malformed tool calls come from: the wrong date format, the wrong ID, the wrong unit.
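The sketch below lists every parameter that has no description, using a hypothetical `create_event` tool written as a JSON-Schema-style dict. The top-level key `parameters` is an assumption; providers name that field differently.

```python
def undescribed_params(tool):
    """Return the names of parameters that have no non-empty description."""
    props = tool.get("parameters", {}).get("properties", {})
    return [
        name
        for name, spec in props.items()
        if not spec.get("description", "").strip()
    ]


# Before: the model has to guess the timestamp format and the duration unit.
before = {
    "name": "create_event",
    "parameters": {
        "type": "object",
        "properties": {
            "start": {"type": "string"},
            "duration": {"type": "integer"},
        },
    },
}

# After: each parameter states its format or unit, with an example value.
after = {
    "name": "create_event",
    "parameters": {
        "type": "object",
        "properties": {
            "start": {
                "type": "string",
                "description": (
                    "Start time as an ISO 8601 timestamp with UTC offset, "
                    "e.g. 2025-03-14T09:00:00+01:00."
                ),
            },
            "duration": {
                "type": "integer",
                "description": "Length of the event in minutes, e.g. 30.",
            },
        },
    },
}

print(undescribed_params(before))
print(undescribed_params(after))
```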

The failure and the fix

Parameter schema

Schemas that don't mark anything required

With no required array, the model treats every argument as optional and routinely omits the ones the tool actually needs, so the call fails at the boundary instead of doing the work.
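To make this concrete, the sketch below shows a hypothetical `book_meeting_room` schema before and after adding a `required` array, plus two small helpers: one that checks whether a schema declares any required arguments, and one that reports which of them a given call left out. The tool and the `parameters` key are assumptions for the example.

```python
def declares_required(tool):
    """True if the schema has a non-empty `required` array."""
    return bool(tool.get("parameters", {}).get("required"))


def missing_arguments(tool, call_args):
    """Return the required argument names that are absent from a call."""
    required = tool.get("parameters", {}).get("required", [])
    return [name for name in required if name not in call_args]


properties = {
    "room_id": {"type": "string", "description": "Room identifier, e.g. 'R-204'."},
    "start": {"type": "string", "description": "ISO 8601 start time."},
    "note": {"type": "string", "description": "Optional free-text note."},
}

# Before: nothing is marked required, so every argument looks optional.
before = {
    "name": "book_meeting_room",
    "parameters": {"type": "object", "properties": properties},
}

# After: the two arguments the tool cannot work without are listed explicitly.
after = {
    "name": "book_meeting_room",
    "parameters": {
        "type": "object",
        "properties": properties,
        "required": ["room_id", "start"],
    },
}

call = {"room_id": "R-204"}  # a call that omits `start`
print(declares_required(before), missing_arguments(before, call))
print(declares_required(after), missing_arguments(after, call))
```

With the `before` schema the omission goes unnoticed until the tool itself fails; with the `after` schema it can be reported by name before the tool runs.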

The failure and the fix

Safety & injection

Destructive tools with no guardrail

A tool that deletes, pays, deploys, or runs code, with nothing in its contract about confirmation or scope, is the exact target a prompt-injection payload steers toward. One bad or injected call does irreversible damage.
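One possible guardrail, sketched below, puts the confirmation requirement in the tool's description and also enforces it in the dispatcher, so a single injected call cannot reach the side effect on its own. Everything here is hypothetical: the `delete_branch` tool, the `destructive` flag, the protected-branch list, and the `user_confirmed` argument, which stands in for whatever approval step the host application provides.

```python
class ConfirmationRequired(Exception):
    """Raised when a destructive tool is called without explicit user approval."""


PROTECTED_BRANCHES = {"main", "release"}


def delete_branch(repo, branch):
    # Scope limit: some targets are never deletable through the agent.
    if branch in PROTECTED_BRANCHES:
        raise PermissionError(f"{branch!r} is protected and cannot be deleted")
    # Stand-in for the real side effect.
    return f"deleted {repo}:{branch}"


TOOLS = {
    "delete_branch": {
        "description": (
            "Permanently delete one branch in one repository. Irreversible. "
            "Only call after the user has explicitly confirmed the exact branch name. "
            "Protected branches cannot be deleted."
        ),
        "destructive": True,
        "handler": delete_branch,
    },
}


def dispatch(name, args, user_confirmed=False):
    """Run a tool call, refusing destructive tools that lack user approval."""
    tool = TOOLS[name]
    if tool["destructive"] and not user_confirmed:
        raise ConfirmationRequired(f"{name} needs explicit user confirmation")
    return tool["handler"](**args)


args = {"repo": "web", "branch": "feature/old-login"}
try:
    dispatch("delete_branch", args)
except ConfirmationRequired as exc:
    print("blocked:", exc)

print(dispatch("delete_branch", args, user_confirmed=True))
```

The description tells the model what the contract is; the dispatcher check is what actually holds if the model is steered into ignoring it.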

The failure and the fix

Safety & injection

Raw SQL, shell, or code pass-through tools

A run_sql(query) or exec(command) tool hands the model a single free-text field of code, which is the widest possible blast radius and the easiest tool to weaponize through injection, because the model writes the payload itself.
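One common alternative is to replace the pass-through with a narrow tool that accepts typed arguments and binds them into a fixed, parameterized query. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the `orders` table, the `get_orders_for_customer` tool, and the row limit of 50 are assumptions invented for the example.

```python
import sqlite3


def make_db():
    """Build a small in-memory database for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id TEXT, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [("c1", 10.0), ("c1", 25.5), ("c2", 7.0)],
    )
    return conn


def get_orders_for_customer(conn, customer_id, limit=10):
    """Narrow tool: the SQL text is fixed and the model supplies only values."""
    limit = max(1, min(int(limit), 50))  # clamp to a sane range
    cursor = conn.execute(
        "SELECT id, total FROM orders WHERE customer_id = ? ORDER BY id LIMIT ?",
        (customer_id, limit),
    )
    return cursor.fetchall()


conn = make_db()
print(get_orders_for_customer(conn, "c1"))

# An injection attempt is treated as a literal customer id and matches nothing.
print(get_orders_for_customer(conn, "c1'; DROP TABLE orders; --"))
print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])
```

Because the values are bound as parameters rather than concatenated into the statement, the model can choose which customer to look up but cannot change what the query does.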

The failure and the fix