free_tool
Throughput & Concurrency Calculator
How many instances does your traffic actually need? Enter requests per second and latency and Little's Law gives you the in-flight concurrency to support — and the fleet size to serve it with headroom, instead of guessing and over-provisioning.
Instances needed
12 @ 70% util
actual utilization ≈ 66.7% once rounded up
- Required concurrency (L = λ·W): 400
- Throughput / instance: 625 rps
- Capacity at this fleet: 7,500 rps
Sizing a fleet or chasing a scaling cliff? I'll pressure-test the numbers against your real traffic shape and autoscaling config.
Size it with me: book a call
A first-order model (L = λ·W). Real fleets also see queueing, GC pauses and connection limits, but this is the number every capacity plan starts from. Share the link to compare scenarios.
how_it_works
One identity behind every capacity plan
Little's Law says the average number of requests in flight equals arrival rate × average time in system: L = λ·W. At 5,000 req/s and 80 ms average latency, about 400 requests are in flight on average, no matter how you slice it.
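As a rough sketch of that identity, the snippet below multiplies arrival rate by latency. The function name `in_flight` is made up for illustration, and the second and third scenarios are hypothetical inputs chosen to show that different rate/latency pairs can land on the same concurrency.

```python
def in_flight(rate_rps: float, latency_s: float) -> float:
    """Little's Law: average requests in flight, L = lambda * W."""
    return rate_rps * latency_s


# First row is the example from the text; the other two are hypothetical.
scenarios = [(5_000, 0.080), (10_000, 0.040), (2_500, 0.160)]

for rate_rps, latency_s in scenarios:
    concurrency = in_flight(rate_rps, latency_s)
    print(f"{rate_rps:>6} req/s x {latency_s * 1000:.0f} ms "
          f"-> ~{concurrency:.0f} in flight")
```

All three rows come out at roughly 400 in flight: doubling traffic while halving latency leaves the concurrency you have to support unchanged.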
From there it's division: a single instance holding 50 concurrent requests clears 50 ÷ 0.08 s = 625 req/s, so you need enough instances to cover 5,000 req/s at your target utilization. At 70%, that's 5,000 ÷ (625 × 0.7) ≈ 11.4, rounded up to 12. The lever that moves it most isn't more boxes; it's cutting W. With per-instance concurrency held fixed, halve latency and you roughly halve the fleet.
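The division can be sketched in a few lines. This is a minimal first-order model under the assumptions stated above (a fixed per-instance concurrency limit and a single average latency), not the calculator's actual implementation. `FleetPlan` and `size_fleet` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class FleetPlan:
    required_concurrency: float  # L = lambda * W
    per_instance_rps: float      # throughput of one instance at 100% util
    instances: int               # fleet size, rounded up
    fleet_capacity_rps: float    # instances * per_instance_rps
    actual_utilization: float    # rate / fleet capacity after rounding up


def size_fleet(rate_rps: float, latency_s: float,
               per_instance_concurrency: int,
               target_utilization: float) -> FleetPlan:
    """First-order fleet sizing from Little's Law (ignores queueing effects)."""
    if rate_rps < 0 or latency_s <= 0 or per_instance_concurrency <= 0:
        raise ValueError("rate must be >= 0; latency and concurrency must be > 0")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")

    required_concurrency = rate_rps * latency_s
    per_instance_rps = per_instance_concurrency / latency_s
    # Budget each instance at the target utilization, then round up.
    instances = max(1, math.ceil(rate_rps / (per_instance_rps * target_utilization)))
    fleet_capacity_rps = instances * per_instance_rps
    return FleetPlan(required_concurrency, per_instance_rps, instances,
                     fleet_capacity_rps, rate_rps / fleet_capacity_rps)


# The example from the text: 5,000 req/s, 80 ms, 50 concurrent per instance, 70% target.
plan = size_fleet(5_000, 0.080, 50, 0.70)
print(plan.instances, f"{plan.actual_utilization:.1%}")  # 12 instances, about 66.7%

# Halving latency with the same per-instance concurrency roughly halves the fleet.
print(size_fleet(5_000, 0.040, 50, 0.70).instances)      # 6 instances
```

Rounding up is why actual utilization lands a little below the target: 12 instances give 7,500 req/s of capacity, so 5,000 req/s runs the fleet at about 66.7% rather than 70%.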