free_tool
Where do your agent's tokens go?
An agent re-sends the whole conversation on every model turn, so a handful of tool calls can quietly multiply your token bill. Paste a trace and see the turns, the re-sent context tax, an estimated cost, and the loops and fat tool results that drive it.
Anthropic Messages or OpenAI chat array. Runs entirely in your browser. Nothing is uploaded, sent to a server, or stored.
Estimated cost for this run
$0.0033
anthropic format · Claude Sonnet
Re-sent context tax
$0.0014
5.4x conversation
Input tokens re-sent per turn
the agentic context tax
Each bar is one model turn. The re-sent input climbs because every turn re-reads the whole prior conversation: turn 5 alone bills 189 input tokens.
Most input tokens are re-sent context, not new content
high
Across 5 model turns the agent is billed about 5.4x the size of the conversation in input tokens (81% of input spend is re-sent prefix). Each turn re-reads everything before it, so input grows faster than the work does. Trim what rides along: summarize old turns, drop stale tool results, and keep the system prompt lean.
Compact the running history once it gets long: replace older tool results with a short summary, and prune blocks the next turn does not need before the next model call.
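A minimal sketch of that compaction step, under loud assumptions: messages are simplified to hypothetical `{"role", "text"}` dicts rather than a real provider format, and plain truncation stands in for the "short summary" (a real summary would likely come from a model call or a tool-specific reducer). The names `compact_history`, `keep_recent` and `max_chars` are illustrative only.

```python
def compact_history(messages, keep_recent=4, max_chars=200):
    """Return a new list where tool results older than the last
    `keep_recent` messages are shortened to a stub.

    Assumption: each message is a dict with "role" and "text" keys.
    Truncation is a stand-in for a proper summary.
    """
    cutoff = max(len(messages) - keep_recent, 0)
    compacted = []
    for i, msg in enumerate(messages):
        is_old_tool_result = i < cutoff and msg["role"] == "tool"
        if is_old_tool_result and len(msg["text"]) > max_chars:
            stub = msg["text"][:max_chars] + " [truncated]"
            # Copy the message so the caller's history is not mutated.
            compacted.append({**msg, "text": stub})
        else:
            compacted.append(msg)
    return compacted
```

The message count and order are left unchanged, so each tool call still has a matching result; only the bulk of older results is dropped before the next model call.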
Possible redundant tool call
high
The same tool call appears twice with identical arguments: check_stock(m-101). If the second call is a deliberate re-check before acting, this is fine. If not, cache the first result instead of paying for it again.
If the second call is not a deliberate re-check, cache the first result by arguments so a repeat returns the cached value instead of re-billing it.
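One way to cache a tool result by its arguments is sketched below. `ToolCache`, `cache_key` and the `force` flag are hypothetical names, not part of the analyzer or of any provider SDK; the sketch assumes tool arguments are JSON-serializable and are passed to the tool as keyword arguments.

```python
import json


def cache_key(name, args):
    """Stable key: tool name plus arguments serialized with sorted keys."""
    return name + ":" + json.dumps(args, sort_keys=True, separators=(",", ":"))


class ToolCache:
    """Remembers tool results so an identical repeat call is not re-run."""

    def __init__(self):
        self._results = {}
        self.hits = 0

    def call(self, name, args, tool_fn, force=False):
        key = cache_key(name, args)
        # force=True is for a deliberate re-check that must see fresh data.
        if not force and key in self._results:
            self.hits += 1
            return self._results[key]
        result = tool_fn(**args)
        self._results[key] = result
        return result
```

With a hypothetical `check_stock(sku)` function, `cache.call("check_stock", {"sku": "m-101"}, check_stock)` runs the tool the first time and returns the stored value on an identical repeat, unless `force=True` is passed.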
Tool-call count drives the turn count
4 tool calls. That is a reasonable count. Adding more multiplies the re-sent context, so keep the loop tight.
First message stays lean
The first message is about 20 tokens, a reasonable share of the conversation. Keep it tight, since it is re-sent on every turn.
No single tool result bloats every later turn
The largest tool result is about 46 tokens, small enough that re-sending it on later turns is not a major cost.
These are estimates from the trace text
Tokens are approximated at about 4 characters per token, and the billed input per turn is modeled as the running sum of every message up to that turn. Real provider tokenizers and prompt caching will shift the exact numbers, but the shape of the context-tax curve holds.
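The estimation model above can be sketched in a few lines. This is an illustration, not the analyzer's actual code: the `{"role", "text"}` message shape and all function names are assumptions, and "up to that turn" is read here as every message before the assistant's reply.

```python
import math

CHARS_PER_TOKEN = 4  # rough heuristic from the text, not a real tokenizer


def estimate_tokens(text):
    """Approximate a token count at about 4 characters per token."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def billed_input_per_turn(messages):
    """For each assistant message (one model turn), return the estimated
    input tokens billed: the running sum of all messages before it."""
    billed = []
    running = 0
    for msg in messages:
        if msg["role"] == "assistant":
            billed.append(running)
        running += estimate_tokens(msg["text"])
    return billed


def context_tax_multiple(messages):
    """Total billed input divided by the size of the conversation itself."""
    total = sum(estimate_tokens(m["text"]) for m in messages)
    if total == 0:
        return 0.0
    return sum(billed_input_per_turn(messages)) / total
```

Because each turn's input includes everything before it, the per-turn list is non-decreasing, which is the rising curve in the chart.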
Token cost is where agents quietly get expensive, and where most of the spend is re-sent context nobody looks at. I help teams cut that down with caching, context pruning and a tighter loop. That's the kind of review I do.
Cut your agent's token bill: book a call
why_it_matters
The expensive part of an agent is the context it re-sends
Every tool call adds a model turn, and every turn re-reads the whole prior conversation before it can act. So input tokens do not grow linearly with the work; they grow with roughly the square of the number of tool calls. A trace that looks like a short chat can bill several times its own size in input tokens, and most of that is text the model has already seen.
That re-sent context is also context-window pressure: a fat system prompt or one huge tool result rides along on every following turn, eating budget and crowding out room for new reasoning. This analyzer reconstructs each turn, draws the context-tax curve, prices it, and flags the patterns you can actually cut: a possible loop, a bloated first message, or an oversized tool result.
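The square-law growth can be made concrete with a small worked sum. This is a sketch under simplifying assumptions that are not taken from the tool itself: the opening prompt is $s$ tokens, each tool call adds a roughly constant $c$ tokens (the assistant's call plus the tool result), there are $n$ tool calls and therefore $n+1$ model turns, and no prompt caching applies.

```latex
\begin{align*}
\text{input billed on turn } k &= s + (k-1)\,c, \qquad k = 1, \dots, n+1 \\
\text{total input billed} &= \sum_{k=1}^{n+1} \bigl(s + (k-1)\,c\bigr)
  = (n+1)\,s + c\,\frac{n(n+1)}{2}
\end{align*}
```

The conversation itself only grows to about $s + nc$ tokens, so once the $c\,n^2/2$ term dominates, the ratio of billed input to conversation size rises roughly in proportion to $n$.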
faq
Questions & answers
- What does the AI Agent Trace Analyzer measure?
- It parses your agent trace and reconstructs each model turn, then reports the number of turns, the tool calls, the total characters, the estimated tokens, the input tokens re-sent across turns (the agentic context tax), and an estimated cost against the model you pick. It also flags steep context-tax growth, repeated identical tool calls, a bloated first or system message, and an oversized tool result that rides along on every later turn.
- What trace formats can I paste?
- Two: an Anthropic Messages array, where content is a string or an array of blocks with tool_use and tool_result, and an OpenAI chat array, where assistant messages carry tool_calls and results come back as role tool messages. It detects which one you pasted, parses both defensively, and also accepts an object that wraps the array under a messages key.
- What is the agentic context tax it keeps talking about?
- Every tool call adds a model turn, and each turn re-reads the whole conversation before it can act, so the input tokens you are billed grow with roughly the square of the number of tool calls rather than linearly. The analyzer sums the input re-sent on every turn and shows it as a multiple of the conversation size, so you can see how much of your spend is text the model has already seen.
- How accurate are the token counts and cost?
- They are estimates. No exact tokenizer runs here, so tokens are approximated at about 4 characters per token, and the billed input per turn is modeled as the running sum of the conversation up to that turn. Real provider tokenizers and prompt caching shift the exact numbers, but the shape of the context-tax curve and the relative findings hold.
- How does it decide a tool call is a loop?
- It builds a signature from each tool call's name and its normalized arguments and counts repeats. When the same tool is called with the same arguments two or more times, it flags a possible loop, because that usually means the agent is retrying instead of using the result it already has. The duplicated call and its result are then re-sent on every later turn.
- Is my trace uploaded or stored anywhere?
- No. The whole analysis runs in your browser, so the trace you paste is never sent to a server or stored. It is safe to check a trace that contains internal prompts, tool arguments, or customer data.
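The repeat detection described in the FAQ above (a signature built from the tool name and normalized arguments, then a count of repeats) could look roughly like this. It is a sketch, not the analyzer's implementation: the flattened `{"name", "args"}` call shape and the function names are assumptions. It handles arguments given either as a dict, as in Anthropic `tool_use` input, or as a JSON string, as in OpenAI `tool_calls` arguments.

```python
import json
from collections import Counter


def normalize_args(args):
    """Return a canonical string for tool arguments.

    JSON strings are parsed first so key order and whitespace do not matter;
    text that is not valid JSON is compared after trimming whitespace.
    """
    if isinstance(args, str):
        try:
            args = json.loads(args)
        except json.JSONDecodeError:
            return args.strip()
    return json.dumps(args, sort_keys=True, separators=(",", ":"))


def find_repeated_calls(tool_calls):
    """Map each (name, normalized args) signature seen 2+ times to its count."""
    counts = Counter(
        (call["name"], normalize_args(call["args"])) for call in tool_calls
    )
    return {sig: n for sig, n in counts.items() if n >= 2}
```

A non-empty result is only a hint: as the finding text notes, a repeated call can be a deliberate re-check before acting.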
Want your agent's token bill cut for real?
The analyzer shows where the tokens go. I'll review your actual traces and tighten the loop: prompt caching, context pruning, batched tools and a step budget that keeps cost flat as you scale. Book a call, or leave your email.
Prefer proof first? See how this plays out in real case studies →