Question 1

Is GPT-4o or Gemini 1.5 Flash cheaper?

Accepted Answer

Gemini 1.5 Flash is cheaper at list price. It runs $0.08 per million input tokens and $0.30 per million output tokens, against $2.50 and $10.00 for GPT-4o. On the agent workload in the table below (2,000 requests per day, 3 tool calls, RAG on), that works out to about $138 versus $4,532 per month, roughly 97% less.

Question 2

What is the price difference between GPT-4o and Gemini 1.5 Flash?

Accepted Answer

GPT-4o is $2.50 in and $10.00 out per million tokens; Gemini 1.5 Flash is $0.08 in and $0.30 out. That is a gap of $2.42 per million input tokens and $9.70 per million output tokens. Output costs roughly four times as much as input on both models, so the gap that matters most depends on how much your workload generates versus reads.
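
To see how the gap moves with the input:output mix, here is a minimal sketch that computes a blended price per million tokens. The prices are the ones quoted in this answer; the function name `blended_price` and the extra mixes (1:1 and 10:1) are illustrative assumptions, not figures from the comparison.

```python
# List prices in USD per 1M tokens, as quoted in the answer above.
PRICES = {
    "GPT-4o": {"input": 2.50, "output": 10.00},
    "Gemini 1.5 Flash": {"input": 0.08, "output": 0.30},
}


def blended_price(model: str, input_parts: float, output_parts: float) -> float:
    """Weighted average price per 1M tokens for an input:output mix."""
    price = PRICES[model]
    total_parts = input_parts + output_parts
    weighted = price["input"] * input_parts + price["output"] * output_parts
    return weighted / total_parts


# 3:1 is the mix used in the comparison table; the other mixes are
# hypothetical read-heavy and write-heavy workloads.
for input_parts, output_parts in [(3, 1), (1, 1), (10, 1)]:
    gpt = blended_price("GPT-4o", input_parts, output_parts)
    flash = blended_price("Gemini 1.5 Flash", input_parts, output_parts)
    print(
        f"{input_parts}:{output_parts} mix  "
        f"GPT-4o ${gpt:.3f}  Gemini 1.5 Flash ${flash:.3f}  "
        f"gap ${gpt - flash:.3f} per 1M tokens"
    )
```

The more of the traffic that is output, the larger the per-million gap in dollars, because output is the expensive side on both models.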

Question 3

Should I switch from GPT-4o to Gemini 1.5 Flash to cut cost?

Accepted Answer

Possibly. Gemini 1.5 Flash is about 97% cheaper on the same workload, and the absolute saving grows with volume and with tool calls, because each tool call re-sends the context. But a cheaper model that needs retries or longer prompts can cost more in practice, so price both on your own evaluation set and your actual token mix before you switch.
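
One way to put numbers on the retry and prompt-length caveat is the sketch below. It is an assumption-laden model, not a measurement: the per-request costs come from the "One chat request" row of the table, while the failure rates, the prompt inflation factor, and the function name `effective_cost` are hypothetical placeholders you would replace with results from your own evaluation set. It also treats every attempt as failing independently, which real workloads may not do.

```python
# USD per request, from the table row "One chat request" (1.5k in / 600 out).
GPT4O_REQUEST = 0.0097
FLASH_REQUEST = 0.0003


def effective_cost(cost_per_attempt: float, failure_rate: float,
                   prompt_inflation: float = 1.0) -> float:
    """Expected token cost per successful request.

    failure_rate: assumed probability that an attempt must be retried.
    prompt_inflation: assumed multiplier on cost from longer prompts.
    """
    if not 0.0 <= failure_rate < 1.0:
        raise ValueError("failure_rate must be in [0, 1)")
    # With independent attempts, the expected number of tries is 1 / (1 - p).
    expected_attempts = 1.0 / (1.0 - failure_rate)
    return cost_per_attempt * prompt_inflation * expected_attempts


# Hypothetical evaluation results; substitute your own.
gpt = effective_cost(GPT4O_REQUEST, failure_rate=0.05)
flash = effective_cost(FLASH_REQUEST, failure_rate=0.30, prompt_inflation=1.5)

# How many times more expensive Flash would have to get to reach parity.
break_even_multiplier = GPT4O_REQUEST / FLASH_REQUEST

print(f"GPT-4o effective:           ${gpt:.5f} per successful request")
print(f"Gemini 1.5 Flash effective: ${flash:.5f} per successful request")
print(f"Break-even multiplier:      {break_even_multiplier:.1f}x")
```

The sketch counts token spend only. Whether the cheaper model's answers are good enough is a separate question that only your evaluation set can answer.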

Metric	GPT-4o (OpenAI, mid tier)	Gemini 1.5 Flash (Google Vertex, cheap tier)
Input price (per 1M tokens)	$2.50	$0.08
Output price (per 1M tokens)	$10.00	$0.30
Blended price (3:1 input:output mix, per 1M tokens)	$4.38	$0.13
One chat request (1.5k in / 600 out, no tools)	$0.0097	$0.0003
Agent workload per month (2,000 req/day, 3 tool calls, RAG on)	$4,532	$138
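
The per-request and percentage figures can be checked with a few lines of arithmetic. This is a minimal sketch using only numbers from the table; the function name `request_cost` is an illustrative choice. The monthly agent figures are taken as given, since the table does not state the token counts per tool call or RAG lookup needed to rebuild them.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in USD of one request; prices are USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# "One chat request" row: 1.5k tokens in, 600 tokens out, no tools.
gpt_request = request_cost(1_500, 600, input_price=2.50, output_price=10.00)
flash_request = request_cost(1_500, 600, input_price=0.08, output_price=0.30)

# Monthly agent workload figures, copied from the table rather than derived.
gpt_monthly = 4_532
flash_monthly = 138
saving = 1 - flash_monthly / gpt_monthly

print(f"GPT-4o per request:           ${gpt_request:.5f}")
print(f"Gemini 1.5 Flash per request: ${flash_request:.5f}")
print(f"Monthly saving on agent load: {saving:.1%}")
```

To price your own traffic, swap in your measured token counts per request and multiply by your request volume.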

GPT-4o vs Gemini 1.5 Flash
