What does the AI Agent Trace Analyzer measure?

It parses your agent trace and reconstructs each model turn, then reports the number of turns, the tool calls, the total characters, the estimated tokens, the input tokens re-sent across turns (the agentic context tax), and an estimated cost against the model you pick. It also flags steep context-tax growth, repeated identical tool calls, a bloated first or system message, and an oversized tool result that rides along on every later turn.

What trace formats can I paste?

Two: an Anthropic Messages array, where content is a string or an array of blocks that include tool_use and tool_result, and an OpenAI chat array, where assistant messages carry tool_calls and results come back as messages with the role tool. The analyzer detects which one you pasted, parses both defensively, and also accepts an object that wraps the array under a messages key.
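As a rough illustration of that detection step, here is a minimal Python sketch. The function names unwrap_messages and detect_format are hypothetical, and the rules are an assumption based only on the description above, not the analyzer's actual code.

```python
from typing import Any


def unwrap_messages(trace: Any) -> list[dict]:
    """Accept a bare array, or an object that wraps it under a "messages" key."""
    if isinstance(trace, dict):
        trace = trace.get("messages", [])
    if not isinstance(trace, list):
        return []
    # Defensive: silently skip entries that are not message objects.
    return [m for m in trace if isinstance(m, dict)]


def detect_format(messages: list[dict]) -> str:
    """Return "openai", "anthropic", or "unknown" (assumed heuristics)."""
    for msg in messages:
        # OpenAI chat: tool_calls on assistant messages, results as role "tool".
        if msg.get("role") == "tool" or msg.get("tool_calls"):
            return "openai"
        # Anthropic Messages: content blocks typed tool_use / tool_result.
        content = msg.get("content")
        if isinstance(content, list):
            for block in content:
                if isinstance(block, dict) and block.get("type") in ("tool_use", "tool_result"):
                    return "anthropic"
    return "unknown"
```

A trace with no tool activity at all comes back as "unknown" in this sketch, since plain string content looks the same in both formats.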

What is the agentic context tax it keeps talking about?

Every tool call adds a model turn, and each turn re-reads the whole conversation before it can act, so the input tokens you are billed grow with roughly the square of the tool calls rather than linearly. The analyzer sums the input re-sent on every turn and shows it as a multiple of the conversation size, so you can see how much of your spend is text the model has already seen.
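To make the running sum concrete, here is a small Python sketch. It assumes every assistant message is one model turn and that each turn is billed for everything before it; the function names and the exact definition of the multiple are illustrative assumptions, not the analyzer's own code.

```python
def billed_input_per_turn(messages):
    """messages: list of (role, token_count) pairs in trace order.

    Returns the input tokens billed on each assistant turn, modeled as
    the sum of every message that came before that turn.
    """
    billed = []
    running = 0
    for role, tokens in messages:
        if role == "assistant":
            billed.append(running)  # this turn re-reads the whole prefix
        running += tokens
    return billed


def tax_multiple(messages):
    """Total billed input divided by the size of the conversation."""
    conversation = sum(tokens for _, tokens in messages)
    if conversation == 0:
        return 0.0
    return sum(billed_input_per_turn(messages)) / conversation


# Five turns, every message 10 tokens long (made-up numbers).
trace = [("user", 10), ("assistant", 10)] * 5
print(billed_input_per_turn(trace))  # [10, 30, 50, 70, 90]
print(tax_multiple(trace))           # 2.5
```

In this toy trace the conversation is 100 tokens, yet the five turns are billed 250 input tokens between them.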

How accurate are the token counts and cost?

They are estimates. No exact tokenizer runs here, so tokens are approximated at about 4 characters per token, and the billed input per turn is modeled as the running sum of the conversation up to that turn. Real provider tokenizers and prompt caching shift the exact numbers, but the shape of the context-tax curve and the relative findings hold.

How does it decide a tool call is a loop?

It builds a signature from each tool call's name and its normalized arguments and counts repeats. When the same tool is called with the same arguments two or more times, it flags a possible loop, because that usually means the agent is retrying instead of using the result it already has, which re-bills the same call on every later turn.
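A signature of that kind could look like the sketch below. The normalization shown (parse JSON-string arguments, sort keys, strip whitespace) is an assumption about what "normalized" means here, and call_signature and find_repeated_calls are hypothetical names.

```python
import json
from collections import Counter


def call_signature(name, arguments):
    """Build a stable string from a tool name and its arguments."""
    # OpenAI-style traces carry arguments as a JSON string; parse when possible.
    if isinstance(arguments, str):
        try:
            arguments = json.loads(arguments)
        except json.JSONDecodeError:
            pass  # keep the raw string if it is not valid JSON
    # Sorted keys and compact separators make key order and spacing irrelevant.
    normalized = json.dumps(arguments, sort_keys=True, separators=(",", ":"))
    return f"{name}:{normalized}"


def find_repeated_calls(calls):
    """calls: list of (name, arguments). Returns signatures seen two or more times."""
    counts = Counter(call_signature(name, args) for name, args in calls)
    return {sig: n for sig, n in counts.items() if n >= 2}
```

Run against the four tool calls in the sample trace on this page, only the check_stock call for m-101 would come back, with a count of 2.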

Is my trace uploaded or stored anywhere?

No. The whole analysis runs in your browser, so the trace you paste is never sent to a server or stored. It is safe to check a trace that contains internal prompts, tool arguments, or customer data.


Where do your agent's tokens go?

An agent re-sends the whole conversation on every model turn, so a handful of tool calls can quietly multiply your token bill. Paste a trace and see the turns, the re-sent context tax, an estimated cost, and the loops and fat tool results that drive it.

Paste your agent trace (JSON)

[
  {
    "role": "user",
    "content": "Find the cheapest in-stock 27-inch 4K monitor under $400 and add it to my cart."
  },
  {
    "role": "assistant",
    "content": [
      { "type": "text", "text": "I'll search the catalog for matching monitors." },
      { "type": "tool_use", "name": "search_products", "input": { "query": "27 inch 4K monitor", "max_price": 400 } }
    ]
  },
  {
    "role": "user",
    "content": [
      { "type": "tool_result", "content": "[{\"id\":\"m-101\",\"title\":\"Acme U27 4K\",\"price\":379,\"stock\":12},{\"id\":\"m-102\",\"title\":\"Beta Pro 27\",\"price\":349,\"stock\":0},{\"id\":\"m-103\",\"title\":\"Gamma View 4K\",\"price\":389,\"stock\":4}]" }
    ]
  },
  {
    "role": "assistant",
    "content": [
      { "type": "text", "text": "Beta Pro is cheapest but out of stock. Let me confirm stock on the next cheapest." },
      { "type": "tool_use", "name": "check_stock", "input": { "id": "m-101" } }
    ]
  },
  {
    "role": "user",
    "content": [
      { "type": "tool_result", "content": "{\"id\":\"m-101\",\"in_stock\":true,\"qty\":12,\"warehouse\":\"us-east\"}" }
    ]
  },
  {
    "role": "assistant",
    "content": [
      { "type": "text", "text": "Double-checking stock before I add it." },
      { "type": "tool_use", "name": "check_stock", "input": { "id": "m-101" } }
    ]
  },
  {
    "role": "user",
    "content": [
      { "type": "tool_result", "content": "{\"id\":\"m-101\",\"in_stock\":true,\"qty\":12,\"warehouse\":\"us-east\"}" }
    ]
  },
  {
    "role": "assistant",
    "content": [
      { "type": "text", "text": "Confirmed in stock at $379. Adding it to the cart now." },
      { "type": "tool_use", "name": "add_to_cart", "input": { "id": "m-101", "qty": 1 } }
    ]
  },
  {
    "role": "user",
    "content": [
      { "type": "tool_result", "content": "{\"cart_id\":\"c-555\",\"items\":1,\"subtotal\":379}" }
    ]
  },
  {
    "role": "assistant",
    "content": "Done. I added the Acme U27 4K ($379, in stock) to your cart. The cheaper Beta Pro was out of stock."
  }
]

Price against model

Anthropic Messages or OpenAI chat array. Runs entirely in your browser. Nothing is uploaded, sent to a server, or stored.

Estimated cost for this run

$0.0033

anthropic format · Claude Sonnet

Re-sent context tax

$0.0014

5.4x conversation

Model turns: 5

Tool calls: 4

Messages: 10 (839 chars)

Conversation: 214 tok, sent once

Billed input: 585 tok, re-sent across turns

Billed output: 105 tok

Input tokens re-sent per turn

the agentic context tax

Each bar is one model turn. The input re-sent climbs because every turn re-reads the whole prior conversation: turn 5 alone bills 189 input tokens.

Most input tokens are re-sent context, not new content

high

Across 5 model turns the agent is billed about 5.4x the size of the conversation in input tokens (81% of input spend is re-sent prefix). Each turn re-reads everything before it, so input grows faster than the work does. Trim what rides along: summarize old turns, drop stale tool results, and keep the system prompt lean.

Compact the running history once it gets long: replace older tool results with a short summary, and prune blocks the next turn does not need before the next model call.
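One simple way to do that pruning is sketched below for an Anthropic-style message list. Plain truncation stands in for a real summary here, and the function name, the keep_last parameter and the 80-character limit are all illustrative assumptions.

```python
import copy


def compact_tool_results(messages, keep_last=1, max_chars=80):
    """Return a copy where all but the last `keep_last` tool results are shortened."""
    compacted = copy.deepcopy(messages)  # never mutate the caller's history
    results = [
        block
        for msg in compacted
        if isinstance(msg.get("content"), list)
        for block in msg["content"]
        if isinstance(block, dict) and block.get("type") == "tool_result"
    ]
    older = results[:-keep_last] if keep_last > 0 else results
    for block in older:
        text = block.get("content")
        if isinstance(text, str) and len(text) > max_chars:
            # Truncation is a placeholder for a proper summary of the result.
            block["content"] = text[:max_chars] + " [truncated]"
    return compacted
```

You would call this on the history just before the next model request, so the most recent result stays intact while older ones shrink.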

Possible redundant tool call

high

The same tool call appears twice with identical arguments: check_stock(m-101). If the second call is a deliberate re-check before acting, this is fine. If not, cache the first result instead of paying for it again.

If the second call is not a deliberate re-check, cache the first result by arguments so a repeat returns the cached value instead of re-billing it.
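A cache keyed by tool name and arguments might look like this minimal sketch. ToolResultCache and the run_tool callback are hypothetical names, and it assumes the tool is read-only and its answer will not change during the run.

```python
import json


class ToolResultCache:
    """Return a stored result when the same tool is called with the same arguments."""

    def __init__(self, run_tool):
        self._run_tool = run_tool  # callable(name, arguments) -> result
        self._results = {}
        self.hits = 0

    def call(self, name, arguments):
        # Sorted keys so {"a": 1, "b": 2} and {"b": 2, "a": 1} share one entry.
        key = (name, json.dumps(arguments, sort_keys=True))
        if key in self._results:
            self.hits += 1
            return self._results[key]
        result = self._run_tool(name, arguments)
        self._results[key] = result
        return result
```

For something like a stock check, where the answer can change between calls, you would want an expiry time or no cache at all.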

Tool-call count drives the turn count

4 tool calls. That is a reasonable count. Adding more multiplies the re-sent context, so keep the loop tight.

First message stays lean

The first message is about 20 tokens, a reasonable share of the conversation. Keep it tight, since it is re-sent on every turn.

No single tool result bloats every later turn

The largest tool result is about 46 tokens, small enough that re-sending it on later turns is not a major cost.

These are estimates from the trace text

Tokens are approximated at about 4 characters per token, and the billed input per turn is modeled as the running sum of every message up to that turn. Real provider tokenizers and prompt caching will shift the exact numbers, but the shape of the context-tax curve holds.
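The arithmetic behind those estimates is small enough to sketch. The rounding rule and the per-million prices below are placeholder assumptions (the page does not state them), so check your provider's current price list before relying on the output.

```python
import math

CHARS_PER_TOKEN = 4  # the rough ratio quoted above


def estimate_tokens(text):
    """Approximate token count from character length (rounding up is an assumption)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_cost(input_tokens, output_tokens, input_price_per_mtok, output_price_per_mtok):
    """Cost in dollars, given prices per million tokens."""
    return (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000


# Token totals from the sample run above; $3 and $15 per million are assumed prices.
cost = estimate_cost(585, 105, input_price_per_mtok=3.0, output_price_per_mtok=15.0)
print(round(cost, 4))  # 0.0033
```

With those assumed prices the sketch lands on the same $0.0033 shown for the sample run, split about evenly between input and output.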

Token cost is where agents quietly get expensive, and where most of the spend is re-sent context nobody looks at. I help teams cut that down with caching, context pruning and a tighter loop. That's the kind of review I do.

Cut your agent's token bill: book a call

Tokens are estimated at about 4 characters per token, since no exact tokenizer runs here, and the billed input per turn is modeled as the running sum of the conversation so far. Real provider tokenizers and prompt caching shift the exact figures, but the shape of the context-tax curve holds. It runs entirely in your browser and uploads nothing.


The expensive part of an agent is the context it re-sends

Every tool call adds a model turn, and every turn re-reads the whole prior conversation before it can act. So input tokens do not grow linearly with the work; they grow with roughly the square of the number of tool calls. A trace that looks like a short chat can bill several times its own size in input tokens, and most of that is text the model has already seen.
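A simplified model shows where the square comes from. Assume an opening prompt of p tokens and that each turn adds about c tokens of tool call plus result; p, c and n are illustrative symbols rather than values the analyzer reports. Turn k then re-reads the prompt plus the k - 1 earlier steps, and summing over n turns gives:

```latex
\text{billed input}
  = \sum_{k=1}^{n} \bigl( p + (k-1)\,c \bigr)
  = n\,p + c\,\frac{n(n-1)}{2}
```

The second term grows with the square of n. With p = 20, c = 50 and n = 5, that is 100 + 500 = 600 billed input tokens for a conversation that only ever grew by 50 tokens per step.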

That re-sent context is also context-window pressure: a fat system prompt or one huge tool result rides along on every following turn, eating budget and crowding out room for new reasoning. This analyzer reconstructs each turn, draws the context-tax curve, prices it, and flags the patterns you can actually cut: a possible loop, a bloated first message, an oversized tool result.

See how I can help →


Want your agent's token bill cut for real?

The analyzer shows where the tokens go. I'll review your actual traces and tighten the loop: prompt caching, context pruning, batched tools and a step budget that keeps cost flat as you scale. Book a call, or leave your email.

Book a call

Prefer proof first? See how this plays out in real case studies →