What does a coding agent cost to run?
A coding agent reads files, runs commands, and edits code across several tool calls per task. Request volume is relatively low, but each request is heavy: a large context and a long tool loop, which is where the agentic context tax bites hardest. On Claude Sonnet this works out to about $5,488 a month; below is the figure for every model and what drives it.
assumptions
A planning estimate for this shape of workload. Tune any of it in the calculator.
- 400 tasks a day
- ~4,000 prompt tokens of code and instructions per request
- ~1,200 output tokens per turn
- 8 tool calls per task (read, edit, run, repeat)
- A flat serving line of $45 a month for a warm endpoint on GCP
monthly_cost · Claude Sonnet
$5,488 / month
- Input tokens (97.2k/req, agentic context): $3,499
- Output tokens (10.8k/req): $1,944
- GCP serving (warm endpoint, flat): $45.00
97.2k input · 10.8k output · 9 LLM turns / request
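The line items above can be reproduced with a short sketch. Three values in it are not stated on this page and are assumptions: each turn is taken to add about 1,700 tokens to the context (the 1,200 output tokens plus a hypothetical 500-token tool result, chosen because it yields 97.2k input tokens per request), a month is taken as 30 days, and the rates of $3 and $15 per million input and output tokens are the ones implied by the line items rather than quoted prices.

```python
# Sketch of the monthly estimate. Names and the per-turn growth figure are
# illustrative assumptions, not the calculator's actual implementation.
TASKS_PER_DAY = 400
DAYS_PER_MONTH = 30             # assumption: implied by the totals
PROMPT_TOKENS = 4_000           # code and instructions carried on every turn
OUTPUT_TOKENS_PER_TURN = 1_200
TOOL_RESULT_TOKENS = 500        # assumption: makes input come to 97.2k/req
TOOL_CALLS = 8
SERVING_FLAT_USD = 45.00

INPUT_USD_PER_MTOK = 3.00       # assumption: implied by the $3,499 line
OUTPUT_USD_PER_MTOK = 15.00     # assumption: implied by the $1,944 line


def tokens_per_task(tool_calls: int) -> tuple[int, int]:
    """Return (input_tokens, output_tokens) for one task."""
    turns = tool_calls + 1      # one LLM turn per tool call, plus the final answer
    growth = OUTPUT_TOKENS_PER_TURN + TOOL_RESULT_TOKENS
    # Turn k re-sends the prompt plus everything added in the k turns before it.
    input_tokens = sum(PROMPT_TOKENS + k * growth for k in range(turns))
    output_tokens = turns * OUTPUT_TOKENS_PER_TURN
    return input_tokens, output_tokens


def monthly_cost(tool_calls: int = TOOL_CALLS) -> dict[str, float]:
    """Return the monthly cost split into input, output, serving, and total."""
    input_tokens, output_tokens = tokens_per_task(tool_calls)
    tasks_per_month = TASKS_PER_DAY * DAYS_PER_MONTH
    input_usd = input_tokens * tasks_per_month / 1e6 * INPUT_USD_PER_MTOK
    output_usd = output_tokens * tasks_per_month / 1e6 * OUTPUT_USD_PER_MTOK
    total = input_usd + output_usd + SERVING_FLAT_USD
    return {"input": input_usd, "output": output_usd,
            "serving": SERVING_FLAT_USD, "total": total}


if __name__ == "__main__":
    print(tokens_per_task(TOOL_CALLS))   # (97200, 10800)
    for name, usd in monthly_cost().items():
        print(f"{name:>8}: ${usd:,.0f}")
```

Changing `TOOL_CALLS` or `TOOL_RESULT_TOKENS` is a quick way to see how sensitive the total is to the length of the tool loop.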
cost_by_model
A coding agent across every model
| model | provider | cost / month |
|---|---|---|
| Gemini 1.5 Flash | Google (Vertex) | $171 (cheapest) |
| GPT-4o mini | OpenAI | $298 |
| Claude Haiku | Anthropic | $1,497 |
| Gemini 1.5 Pro | Google (Vertex) | $2,151 |
| GPT-4o | OpenAI | $4,257 |
| Claude Sonnet (shown above) | Anthropic | $5,488 |
| Claude Opus | Anthropic | $27,261 |
public list prices as of 2026-06 · planning estimate, not a quote
what_drives_it
Where the money goes
Each of the 8 tool calls re-sends the whole conversation so far, so input tokens grow with roughly the square of the number of tool calls. This is the agentic context tax, and it is the dominant cost: input tokens account for $3,499 of the $5,488 monthly total on Claude Sonnet.
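To make the growth concrete, here is a minimal sketch of input tokens per task as a function of tool calls. It assumes a 4,000-token prompt on every turn and about 1,700 tokens added to the context per turn; the 1,700 figure is an inferred assumption (it reproduces the 97.2k input tokens per request shown above), and `input_tokens` is an illustrative name.

```python
PROMPT_TOKENS = 4_000     # carried on every turn
GROWTH_PER_TURN = 1_700   # assumption: output plus tool result added each turn


def input_tokens(tool_calls: int) -> int:
    """Total input tokens for one task.

    Closed form of summing PROMPT_TOKENS + k * GROWTH_PER_TURN
    over turns k = 0 .. tool_calls.
    """
    turns = tool_calls + 1
    linear_part = turns * PROMPT_TOKENS
    # k summed over 0..tool_calls is tool_calls * turns / 2: the quadratic term.
    quadratic_part = GROWTH_PER_TURN * tool_calls * turns // 2
    return linear_part + quadratic_part


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(f"{n:>2} tool calls -> {input_tokens(n):>7,} input tokens")
```

Under these assumptions, 8 tool calls give 97,200 input tokens and 16 give 299,200, so doubling the loop roughly triples the input bill. The growth sits between linear and quadratic because the fixed prompt contributes a linear term.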
The cheapest option here, Gemini 1.5 Flash, comes to about $171 a month against $5,488 on Claude Sonnet. Whether the cheaper model fits is a question for your evaluation set, not the price sheet. The bigger lever is usually the workload itself: caching re-sent context, trimming what each turn carries, and capping the tool loop move the bill more than swapping models does.
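A rough way to size two of these levers is to ask how much of the input is context already sent on the previous turn, and how much input disappears if the loop is capped. The sketch below assumes the same 4,000-token prompt and an inferred 1,700 tokens of growth per turn; the cap of 5 tool calls is a hypothetical value chosen only for illustration.

```python
PROMPT_TOKENS = 4_000     # carried on every turn
GROWTH_PER_TURN = 1_700   # assumption: output plus tool result added each turn


def turn_input(k: int) -> int:
    """Input tokens sent on turn k (turn 0 is the first LLM call)."""
    return PROMPT_TOKENS + k * GROWTH_PER_TURN


def total_input(tool_calls: int) -> int:
    """Input tokens for a whole task with the given number of tool calls."""
    return sum(turn_input(k) for k in range(tool_calls + 1))


def resent_input(tool_calls: int) -> int:
    """Tokens on turns 1..n that were already sent on the previous turn."""
    return sum(turn_input(k - 1) for k in range(1, tool_calls + 1))


if __name__ == "__main__":
    share = resent_input(8) / total_input(8)
    print(f"re-sent share of input at 8 tool calls: {share:.0%}")

    capped = total_input(5)   # hypothetical cap of 5 tool calls
    saving = 1 - capped / total_input(8)
    print(f"input tokens with a cap of 5: {capped:,} ({saving:.0%} fewer)")
```

Under these assumptions about 82% of input tokens are re-sent context, which is the portion a prompt cache could target. The actual saving depends on the provider's pricing for cached tokens, which this sketch does not model.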
faq
Questions & answers
- How much does a coding agent cost per month?
- On Claude Sonnet, about $5,488 a month at 400 tasks a day with the assumptions above. The cheapest model compared here, Gemini 1.5 Flash, runs about $171 for the same workload. Your real figure moves with volume and tokens, so tune it in the calculator.
- What makes a coding agent expensive?
- Each of the 8 tool calls re-sends the whole conversation so far, so input tokens grow with roughly the square of the number of tool calls. This is the agentic context tax, and it is the dominant cost.
- Which model is cheapest for a coding agent?
- Gemini 1.5 Flash, at about $171 a month for this workload. Cheaper is not automatically better: a model that needs retries or longer prompts can cost more in practice, so test the candidates on your own evaluation set before committing.
A cost estimate is a start. Making an agent cheap in production is the work.
Prompt caching, context trimming, and the right model per step usually cut an agent's bill by more than half.