Knowledge Guide

CPU & Server-Count Playbook

The two formulas nobody teaches

CPU sizing is the dimension of capacity estimation that almost every guide skips. Two equivalent lenses:

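Little’s Law: concurrency N = λ × W (arrival rate × average latency, in matching time units).
CPU cores: cores = (QPS × latency_ms) ÷ (1000 × target_utilization), where latency_ms is the CPU time each request consumes in milliseconds and target_utilization is a fraction (0.7 for 70%).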
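Why the two lenses agree can be sketched in two lines. The symbols N_cpu (average number of busy cores), W_cpu (CPU seconds per request), and ρ (target utilization) are labels introduced here for illustration; they do not appear in the formulas above. Applying Little’s Law to time spent on the CPU gives the busy cores, and dividing by the target utilization adds the headroom:

```latex
\begin{align*}
N_{\text{cpu}} &= \lambda \times W_{\text{cpu}}
  = \text{QPS} \times \frac{\text{latency\_ms}}{1000} \\
\text{cores} &= \frac{N_{\text{cpu}}}{\rho}
  = \frac{\text{QPS} \times \text{latency\_ms}}{1000 \times \text{target\_utilization}}
\end{align*}
```

The factor of 1000 only converts milliseconds to seconds so that the units match QPS.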

Worked example

10,000 QPS, each request burns ~10 ms of CPU, target 70% utilization:

cores = (10,000 × 10) ÷ (1000 × 0.7) ≈ 143 cores
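A minimal Python sketch of this calculation, extended to a server count. The function names and the 16-cores-per-server machine size are assumptions made for illustration; the lesson does not specify a machine size, so substitute your own.

```python
import math


def cores_needed(qps: float, cpu_ms_per_request: float, target_utilization: float) -> float:
    """cores = (QPS x CPU ms per request) / (1000 x target utilization)."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be a fraction in (0, 1]")
    return (qps * cpu_ms_per_request) / (1000 * target_utilization)


def servers_needed(cores: float, cores_per_server: int) -> int:
    """Round up, since a fraction of a machine cannot be provisioned."""
    return math.ceil(cores / cores_per_server)


# Inputs from the worked example: 10,000 QPS, ~10 ms CPU per request, 70% target.
cores = cores_needed(qps=10_000, cpu_ms_per_request=10, target_utilization=0.7)

# ASSUMPTION: 16 cores per server is a hypothetical machine size, not from the lesson.
CORES_PER_SERVER = 16

print(f"cores needed:   {cores:.1f} (round up to {math.ceil(cores)})")
print(f"servers needed: {servers_needed(cores, CORES_PER_SERVER)}")
```

With these inputs the formula gives about 143 cores, which is 9 servers at the assumed 16 cores each.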

CPU-bound vs IO-bound — this decides HOW you scale

| | CPU-bound | IO-bound |
|---|---|---|
| Bottleneck | Computation (encoding, ML, compression) | Waiting on DB / network / disk |
| Concurrency vs cores | N ≈ cores | N ≫ cores (threads mostly wait) |
| Scale by | Add cores / machines | Add concurrency (threads/async) + fix the slow dependency |
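The "Concurrency vs cores" row can be checked with Little’s Law applied twice: once to CPU time (giving busy cores) and once to total latency (giving requests in flight). The sketch below is illustrative only; the function names and the 200 ms total latency for the IO-bound case are hypothetical values, not figures from the lesson.

```python
def in_flight(qps: float, latency_ms: float) -> float:
    """Little's Law, N = lambda x W, with W given in milliseconds."""
    return qps * latency_ms / 1000


def profile(qps: float, cpu_ms: float, total_latency_ms: float) -> tuple[float, float, float]:
    """Return (busy cores, requests in flight, in-flight requests per busy core)."""
    busy_cores = in_flight(qps, cpu_ms)             # requests actually executing on a CPU
    concurrency = in_flight(qps, total_latency_ms)  # requests anywhere in the system
    return busy_cores, concurrency, concurrency / busy_cores


# CPU-bound: nearly all of the latency is computation, so N is about the core count.
cpu_bound = profile(qps=10_000, cpu_ms=10, total_latency_ms=10)

# IO-bound: same CPU cost, but each request mostly waits on a dependency.
# ASSUMPTION: 200 ms total latency is a made-up figure for illustration.
io_bound = profile(qps=10_000, cpu_ms=10, total_latency_ms=200)

for name, (cores, n, ratio) in (("CPU-bound", cpu_bound), ("IO-bound", io_bound)):
    print(f"{name}: busy cores={cores:.0f}, in flight={n:.0f}, ratio={ratio:.0f}x")
```

In the IO-bound case the same 100 busy cores must hold 2,000 requests in flight, which is why the fix is more concurrency (threads or async) and a faster dependency rather than more cores.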

The kitchen analogy (why 70%, not 100%)

You hire 30 chefs for a 20-dish peak (about 67% busy) so a sudden rush doesn’t halt the kitchen. Likewise, run servers at ~70% so latency stays sane under bursts: queueing delay grows sharply as utilization approaches 100%.
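To put rough numbers on that blow-up, the sketch below uses the textbook single-server M/M/1 queue, where mean time in the system is the service time divided by (1 − utilization). Choosing M/M/1 is an assumption made here for simplicity; real multi-core fleets degrade less abruptly, but the curve has the same shape. The function name is hypothetical.

```python
def latency_multiplier(utilization: float) -> float:
    """Mean time in system relative to bare service time, under an M/M/1 model.

    ASSUMPTION: random (Poisson) arrivals, exponential service times, one server.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1); at 1.0 the queue grows without bound")
    return 1 / (1 - utilization)


for rho in (0.5, 0.7, 0.9, 0.95, 0.99):
    print(f"{rho:.0%} utilized -> latency ~{latency_multiplier(rho):.1f}x service time")
```

Under this model latency is roughly 2x the service time at 50% utilization, 3.3x at 70%, 10x at 90%, and 100x at 99%, which is the reasoning behind leaving about 30% headroom.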


Formulas are standard/public-domain engineering math. Approach and reference-table format adapted from the System Design Primer (CC BY 4.0), Jeff Dean’s latency numbers, the DesignGurus capacity-estimation guide, and Little’s Law.
