Little's Law
Little's Law says the average number of requests in a system equals arrival rate times average time in the system (L = λ × W), which is how you size concurrency from throughput and latency.
also: Little's Law · L = λW · queueing
In capacity terms: concurrency = throughput × latency. If you serve 200 requests per second and each takes 250 ms, you have 200 × 0.25 = 50 requests in flight on average. That 50 is the average concurrency your service must sustain, with instantaneous load fluctuating around it, and it is the starting point for deciding how many threads, connections, or instances you need.
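As a minimal sketch of that arithmetic, the snippet below computes the in-flight count and turns it into an instance count. The function names and the figure of 16 concurrent requests per instance are assumptions made for illustration, not values from any particular stack.

```python
import math


def required_concurrency(throughput_rps: float, latency_s: float) -> float:
    """Little's Law: average requests in flight, L = lambda * W."""
    return throughput_rps * latency_s


def instances_needed(concurrency: float, slots_per_instance: int) -> int:
    """Round up, because a fraction of an instance cannot be provisioned."""
    return math.ceil(concurrency / slots_per_instance)


# 200 requests per second at 250 ms each, as in the example above.
in_flight = required_concurrency(throughput_rps=200, latency_s=0.25)
print(in_flight)  # 50.0

# Hypothetical capacity: each instance handles 16 requests concurrently.
print(instances_needed(in_flight, slots_per_instance=16))  # 4
```

Keep latency in seconds when throughput is per second; mixing milliseconds with requests per second is the usual way this calculation goes wrong by a factor of 1000.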
The law is unreasonably general. It holds for any stable system regardless of the arrival or service-time distribution, so you can apply it to HTTP handlers, database connection pools, queue workers, or a coffee shop line without modelling the internals. The catch is the word stable: the law relates long-run averages in steady state, where arrivals do not outpace completions. Size for peak throughput and tail latency rather than the daily average, or you will be under-provisioned exactly when it matters.
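To see how far apart the two sizings can land, here is a small sketch that applies the same formula to an average case and a peak case. The peak figures (600 requests per second, 500 ms p99 latency) are hypothetical numbers chosen for illustration.

```python
def in_flight(throughput_rps: float, latency_s: float) -> float:
    """Little's Law: L = lambda * W."""
    return throughput_rps * latency_s


# Average-day figures from the earlier example.
AVG_RPS, AVG_LATENCY_S = 200, 0.25
# Hypothetical peak-minute figures: higher traffic and slower tail latency.
PEAK_RPS, P99_LATENCY_S = 600, 0.5

average_case = in_flight(AVG_RPS, AVG_LATENCY_S)
peak_case = in_flight(PEAK_RPS, P99_LATENCY_S)

print(average_case)              # 50.0
print(peak_case)                 # 300.0
print(peak_case / average_case)  # 6.0
```

Under these assumed numbers, a pool sized from the averages would be six times too small during the peak, because both factors in the product grow at the same time.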
Questions & answers
- What is Little's Law used for?
- Capacity planning. It converts a throughput target and a latency figure into the concurrency you must support, which in turn tells you the thread count, connection pool size, or number of instances to provision. It also works in reverse to find the throughput a fixed pool can sustain.
- Does Little's Law need a specific arrival pattern?
- No, that is its strength. It holds for any stable queueing system regardless of how requests arrive or how service times are distributed. It only describes the long-run average, so use peak rates and tail latency when you size for the worst minute, not the mean.
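The reverse direction mentioned in the first answer can be sketched the same way: rearrange L = λ × W into λ = L / W to get the throughput a fixed pool can sustain. The function name and the pool size of 50 are illustrative assumptions.

```python
def sustainable_throughput(pool_size: int, latency_s: float) -> float:
    """Little's Law rearranged: lambda = L / W."""
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    return pool_size / latency_s


# A pool of 50 slots with 250 ms per request sustains 200 requests per second.
print(sustainable_throughput(pool_size=50, latency_s=0.25))  # 200.0

# If latency doubles to 500 ms, the same pool sustains half as much.
print(sustainable_throughput(pool_size=50, latency_s=0.5))   # 100.0
```

This is the ceiling for a fully busy pool, so treat it as an upper bound rather than a target to run at.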