Knowledge Guide
HomeDatabasesIndexing & Storage

How Indexes Work — B+tree Internals

Why a lookup needs an index at all

WHERE id = 42 on an unindexed table is a full table scan — read every row, O(n). An index is a separate sorted structure that turns that into O(log n). Almost every relational database uses a B+tree for it — here's why and how.

Why a B+tree, not a binary tree or a hash

Disks (and SSDs) read in pages (~8 KB), and the cost is the number of page reads. A binary tree wastes that — one key per node, ~30 levels for a billion rows = ~30 reads. A B+tree packs hundreds of keys into one page (high fanout), so the tree is only ~3–4 levels deep for a billion rows → ~3–4 page reads per lookup. A hash index does equality in O(1) but can't do ranges or ORDER BY — which is why B+tree is the default.

A B+tree: root routing node points to sorted leaf nodes linked left-to-right; a lookup of 60 descends root to the middle leaf, range scans follow the leaf links
A B+tree: root routing node points to sorted leaf nodes linked left-to-right; a lookup of 60 descends root to the middle leaf, range scans follow the leaf links

Structure & the two operations

Point lookup (=): start at the root, follow the pointer for your key's range down to the leaf — ~3–4 reads. Range scan / ORDER BY / BETWEEN: descend to the start leaf, then walk the leaf links sideways — this sideways chain is exactly why B+tree beats hash for ranges and sorted output.

Inserts stay balanced via splits

Insert into a leaf; if it overflows, split it in two and push the middle key up to the parent (which may split too, up to the root). This keeps every leaf the same depth — lookups stay O(log n) forever. The cost: writes touch the index too (and may split pages) — write amplification.

Clustered vs secondary (the hop that surprises people)

Takeaways


Re-authored for this guide; B+tree diagram hand-authored as SVG. Follows CMU 15-445 (Andy Pavlo) and DDIA ch. 3. See also: Indexes in Practice, How a Query Executes.

🤖 Don't fully get this? Learn it with Claude

Stuck on How Indexes Work — B+tree Internals? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.

🎨 Explain it visually

Build the mental picture, not memorization.

I just read a lesson on **How Indexes Work — B+tree Internals** (Databases) and want to truly understand it. Explain How Indexes Work — B+tree Internals from first principles using ONE vivid real-world analogy and a visual mental model — draw it as ASCII art or a clear step-by-step diagram — with a concrete example using real numbers. Then ask me one question to check I got the mental picture, and wait for my reply. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🤔 Walk me through it (interactive)

Socratic — adapts to where you're stuck.

Teach me **How Indexes Work — B+tree Internals** interactively. Ask me ONE guiding question at a time, wait for my answer, and adapt to my confusion — build the idea with me step by step instead of explaining it all at once. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🧪 Quiz me & fix my gaps

Active recall exposes what you missed.

Quiz me on **How Indexes Work — B+tree Internals** with 5 questions, easy to tricky, ONE at a time. Tell me if each answer is right; at the end, explain clearly what I got wrong and why. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🧠 Make it stick

Intuition + hook + flashcards for long-term memory.

Help me remember **How Indexes Work — B+tree Internals** for the long term: give the one-sentence intuition, a memorable hook/mnemonic, a tiny worked example, and 3 active-recall flashcards (Q -> A). If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.

📝 My notes