glossary
The terms behind a system that holds up
Plain-English definitions for the reliability, performance, AI, and security concepts I work with every day. Each one links to a free tool that puts the number to work on your own stack.
Reliability & scale
The targets, budgets, and capacity math behind a service that stays up under load.
Availability (the Nines)
Availability is the fraction of time a service is up, usually quoted in 'nines': 99.9% (three nines) is about 43 minutes of downtime a month, 99.99% (four nines) about 4 minutes.
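As a rough sketch of the arithmetic behind those figures, assuming a 30-day month (the function name `downtime_minutes` is illustrative, not taken from any tool):

```python
def downtime_minutes(availability_pct: float, window_days: float = 30) -> float:
    """Allowed downtime in minutes for an availability target over a window."""
    window_minutes = window_days * 24 * 60  # 43,200 minutes in 30 days
    return window_minutes * (1 - availability_pct / 100)


for pct in (99.9, 99.99):
    # Prints 43.2 for three nines and 4.3 for four nines.
    print(f"{pct}% -> {downtime_minutes(pct):.1f} min of downtime per 30 days")
```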
Burn Rate
Burn rate is how fast you are spending an error budget relative to spending it evenly: 1x exhausts it exactly at the end of the window, while 14.4x exhausts a 30-day budget in about two days.
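A minimal sketch of that calculation, assuming a 30-day window and a burn rate defined as the observed error rate divided by the error rate the SLO allows (both function names are hypothetical):

```python
def burn_rate(observed_error_rate: float, slo: float) -> float:
    """How many times faster than 'evenly' the budget is being spent."""
    allowed_error_rate = 1 - slo  # e.g. 0.001 for a 99.9% SLO
    return observed_error_rate / allowed_error_rate


def days_to_exhaustion(rate: float, window_days: float = 30) -> float:
    """Days until the whole budget is gone if this burn rate holds."""
    return window_days / rate


rate = burn_rate(observed_error_rate=0.0144, slo=0.999)
print(f"burn rate {rate:.1f}x, budget gone in {days_to_exhaustion(rate):.1f} days")
```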
Concurrency
Concurrency is the number of requests a system is handling at the same instant, which sets how many workers, connections, and instances you need. It is distinct from throughput, the number of requests finished per second.
Cost of Downtime
The cost of downtime is what an outage costs you: the revenue lost per hour while you are down plus the hourly cost of the engineers firefighting, multiplied by how long the outage lasts.
Error Budget
An error budget is the amount of failure an SLO permits: the share of requests or minutes you are allowed to lose before you have to stop shipping and fix reliability.
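To make the budget concrete, here is a small sketch for a request-based SLO. The traffic and failure numbers are made up for illustration, and the function names are hypothetical:

```python
def allowed_failures(total_requests: int, slo: float) -> float:
    """Number of failed requests the SLO tolerates over the window."""
    return total_requests * (1 - slo)


def budget_remaining(total_requests: int, failed_requests: int, slo: float) -> float:
    """Fraction of the error budget still unspent (negative once overspent)."""
    return 1 - failed_requests / allowed_failures(total_requests, slo)


# Illustrative numbers: 10M requests in the window, 2,500 of them failed.
left = budget_remaining(total_requests=10_000_000, failed_requests=2_500, slo=0.999)
print(f"{left:.0%} of the error budget is left")
```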
Little's Law
Little's Law says the average number of requests in a system equals arrival rate times average time in the system (L = λ × W), which is how you size concurrency from throughput and latency.
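A short sketch of sizing concurrency with L = λ × W. The traffic figures are assumptions chosen for illustration, and `required_concurrency` is a hypothetical helper:

```python
import math


def required_concurrency(arrival_rate_rps: float, avg_latency_s: float) -> int:
    """Average requests in flight, L = lambda * W, rounded up to whole slots."""
    return math.ceil(arrival_rate_rps * avg_latency_s)


# 200 requests/second at 250 ms average latency keeps about 50 requests in flight.
print(required_concurrency(arrival_rate_rps=200, avg_latency_s=0.25))
```

Because this is an average, real capacity planning would typically add headroom for bursts on top of the result.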
Service Level Objective (SLO)
A Service Level Objective is the target reliability you commit a service to, written as a number like 99.9% of requests succeeding over a 30-day window.
Performance & delivery
How fast a page feels, where the time goes, and how rendering and caching change it.
Cache Hit Ratio
Cache hit ratio is the share of requests served from cache rather than the origin: a higher ratio means less origin load, lower latency, and lower egress and compute bills.
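A quick sketch of why small gains in hit ratio matter: origin traffic depends on the miss ratio, so moving from 90% to 95% halves the requests that reach your servers. The request volume is an illustrative assumption and `origin_requests` is a hypothetical name:

```python
def origin_requests(total_requests: int, hit_ratio: float) -> float:
    """Requests that miss the cache and fall through to the origin."""
    return total_requests * (1 - hit_ratio)


total = 1_000_000
for ratio in (0.90, 0.95):
    print(f"hit ratio {ratio:.0%}: {origin_requests(total, ratio):,.0f} origin requests")
```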
Cold Start
A cold start is the extra latency the first request pays when a serverless instance or container has to be created and initialised from nothing before it can serve traffic.
Content Delivery Network (CDN)
A CDN is a network of edge servers that cache and serve your content close to users, cutting latency and origin load by answering most requests without a round trip to your servers.
Core Web Vitals
Core Web Vitals are Google's three user-experience metrics: Largest Contentful Paint (loading), Interaction to Next Paint (responsiveness), and Cumulative Layout Shift (visual stability).
Latency Budget
A latency budget is a total p95 response-time target split across the hops a request takes (network, app, database, cache, third parties) so each layer knows the time it is allowed to spend.
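One way to sketch a budget like this in code. The 500 ms target, the per-hop allocations, and the measured values are all invented for illustration. Per-hop percentiles do not add up exactly to the end-to-end p95, so treat the sum as a planning guide, not a guarantee:

```python
P95_TARGET_MS = 500  # hypothetical end-to-end target

# Hypothetical allocation per hop, in milliseconds.
budget_ms = {"network": 80, "app": 150, "database": 120, "cache": 20, "third_parties": 100}


def unallocated_ms(target_ms: int, budget: dict) -> int:
    """Slack left after every hop has its allocation (negative means overcommitted)."""
    return target_ms - sum(budget.values())


def over_budget(measured: dict, budget: dict) -> dict:
    """Hops whose measured latency exceeds their allocation, with the overshoot in ms."""
    return {
        hop: measured[hop] - limit
        for hop, limit in budget.items()
        if measured.get(hop, 0) > limit
    }


measured_ms = {"network": 70, "app": 140, "database": 210, "cache": 5, "third_parties": 90}
print("slack:", unallocated_ms(P95_TARGET_MS, budget_ms), "ms")
print("over budget:", over_budget(measured_ms, budget_ms))
```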
p95 Latency
p95 latency is the response time that 95% of requests come in under. It is a tail-latency measure: unlike an average, it shows how slow things get for the worst-served 1 in 20 requests instead of hiding them behind the fast majority.
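A small sketch of the difference between a percentile and an average, using the nearest-rank method (one of several common percentile definitions; monitoring tools may interpolate differently). The sample data is invented:

```python
import math
import statistics


def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Invented data: 90 fast requests at 100 ms, 10 slow ones at 1500 ms.
latencies_ms = [100] * 90 + [1500] * 10

print("mean:", statistics.mean(latencies_ms))  # 240
print("p50: ", percentile(latencies_ms, 50))   # 100
print("p95: ", percentile(latencies_ms, 95))   # 1500
```

The mean of 240 ms describes almost no real request here, while the p95 exposes the slow tail.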
Rendering Strategies (SSR, SSG, ISR, CSR)
Rendering strategies decide where and when your HTML is built: at build time (SSG), per request on the server (SSR), built statically and regenerated in the background once a revalidation interval expires (ISR), or in the browser (CSR). Each trades freshness, speed, and cost differently.
Time to First Byte (TTFB)
Time to First Byte is the time from sending a request until the first byte of the response arrives. It captures DNS, connection setup, and server processing, and sets the floor for every page-load metric after it.
AI & agents
What you pay for, what the model can see, and what breaks when an LLM gets tools.
Agentic Context Tax
The agentic context tax is the way an AI agent's cost grows faster than its work: every tool call adds a turn, and each turn re-sends the whole conversation, so input tokens scale with roughly the square of the tool calls.
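The quadratic growth can be sketched with a simplified model: a fixed base prompt, plus a fixed number of tokens appended by each tool call, with the whole history re-sent on every turn. The token counts are assumptions for illustration, and real agents (especially with prompt caching or history trimming) will differ:

```python
def total_input_tokens(base_prompt: int, tokens_per_call: int, tool_calls: int) -> int:
    """Sum the input tokens sent across all turns of a simplified agent loop."""
    total = 0
    history = base_prompt
    for _ in range(tool_calls):
        total += history            # this turn re-sends everything so far
        history += tokens_per_call  # the tool call and its result join the history
    return total


for calls in (10, 20):
    tokens = total_input_tokens(base_prompt=1_000, tokens_per_call=500, tool_calls=calls)
    print(f"{calls} tool calls -> {tokens:,} input tokens")
```

Under these assumptions, doubling the tool calls from 10 to 20 raises input tokens from 32,500 to 115,000, roughly 3.5x.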
Context Window
The context window is the maximum number of tokens a model can consider at once, covering the prompt, any retrieved documents or conversation history, and the response. Exceed it and the oldest content is dropped or the call fails.
Idempotency
An operation is idempotent if running it twice has the same effect as running it once. It is what makes retries safe, so a duplicated request, message, or tool call does not double-charge or double-act.
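One common way to get this property is an idempotency key: the caller sends a unique key with each logical operation, and the server replays the stored result when it sees the key again. The sketch below is a hypothetical in-memory `PaymentService`, not a real payment API. A production version would need durable storage and protection against concurrent duplicates:

```python
class PaymentService:
    """Toy service that charges at most once per idempotency key."""

    def __init__(self) -> None:
        self._results = {}  # idempotency key -> receipt already returned
        self.charges = []   # side effects actually performed

    def charge(self, idempotency_key: str, customer: str, amount_cents: int) -> dict:
        # A retry with a known key returns the original receipt and does nothing else.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges.append((customer, amount_cents))
        receipt = {
            "charge_id": len(self.charges),
            "customer": customer,
            "amount_cents": amount_cents,
        }
        self._results[idempotency_key] = receipt
        return receipt


service = PaymentService()
first = service.charge("order-42", "alice", 1999)
retry = service.charge("order-42", "alice", 1999)  # e.g. a retry after a timeout
print(first == retry, len(service.charges))
```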
Prompt Injection
Prompt injection is an attack where untrusted text the model reads (a web page, a document, a tool result) contains instructions that hijack the model into ignoring its task or misusing its tools.
Retrieval-Augmented Generation (RAG)
RAG is the pattern of fetching relevant documents at query time and putting them in the prompt, so the model answers from your data rather than from its training data alone, with no retraining required.
Tokens (LLM)
A token is the unit a language model reads and writes: a chunk of text (often a word piece) produced by the model's tokenizer. Pricing, context limits, and speed are all measured in tokens, not words or characters.
Security & SEO
The headers, tokens, and markup that decide how safe a page is and how it is read.
Content Security Policy (CSP)
A Content Security Policy is an HTTP response header that tells the browser which sources of scripts, styles, images, and other content it is allowed to load. A strict policy is one of the strongest browser-side defences against cross-site scripting.
JSON Web Token (JWT)
A JWT is a compact, signed token (header, payload, signature) that carries claims like who the user is and when the token expires, so a server can verify a session without a database lookup.
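To show the three parts and the stateless check, here is a minimal HS256 sketch using only the Python standard library. The helper names (`sign`, `verify`) and the secret are illustrative assumptions. It handles only the `exp` claim and a single algorithm, so a real service should rely on a maintained JWT library instead:

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url without padding, as JWTs use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature with an HMAC-SHA256 signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if the signature and expiry check out, else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in claims and current >= claims["exp"]:
        raise ValueError("token expired")
    return claims


secret = b"demo-secret-do-not-reuse"
token = sign({"sub": "user-1", "exp": int(time.time()) + 3600}, secret)
print(verify(token, secret)["sub"])  # no database lookup needed, only the secret
```

Note that the payload is only encoded and signed, not encrypted, so anyone holding the token can read its claims.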
Structured Data (JSON-LD / Schema.org)
Structured data is machine-readable markup, usually JSON-LD following schema.org vocabulary, that describes what a page is about so search engines can understand it and show rich results.
Past the definitions, where does your stack actually stand?
I run a fixed-scope review across reliability, performance, cost, and AI readiness, and hand you a prioritized roadmap. Book a call to talk it through.