Heap
Step 12 in the DSA path · 4 concepts · 9 problems
📘 Learn Heap from zero
A heap is a binary tree kept in an array where every parent beats its children (max-heap: parent largest; min-heap: parent smallest), so the extreme element sits at the root and is readable in O(1), with push/pop in O(log n). Reach for a heap whenever you repeatedly need the smallest or largest of a changing set — top-K, merging K sorted lists, scheduling by priority, or running statistics over a stream. The canonical example is Find Median from a Data Stream: keep a max-heap for the lower half and a min-heap for the upper half so the median is always at the two roots.
✨ Added by the guide to build intuition — not from the source course.
🏗️ Visual walkthrough — trace it step by step
lo = MAX-heap (lower half, root = largest of low)
hi = MIN-heap (upper half, root = smallest of high)
INVARIANTS we must always restore:
(1) every value in lo <= every value in hi
(2) size(lo) == size(hi) OR size(lo) == size(hi)+1
lo (max-heap) hi (min-heap)
[ empty ] [ empty ]
root=- root=-
Median rule:
odd total -> lo.root
even total -> (lo.root + hi.root) / 2We split the data so the two middle elements live at the two roots. Because lo holds the smaller half (largest on top) and hi holds the larger half (smallest on top), the median is always one peek (or two) away — no sorting needed. Stream to process: 5, 15, 1, 3.
add 5: lo empty -> push to lo, then rebalance.
lo (max): [5] hi (min): []
^root=5 root=-
sizes: lo=1, hi=0 -> invariant (2) holds (lo = hi+1)
roots: lo.root=5 (no hi to compare) -> invariant (1) holds
total=1 (odd) -> median = lo.root = 5First element goes to lo. With one element the lone value is trivially the median. The push is O(log n) and the peek that gives the median is O(1) — this is the whole point of using heaps here.
15 <= lo.root(5)? no -> belongs in upper half, push to hi.
lo (max): [5] hi (min): [15]
^root=5 ^root=15
Check invariant (1): lo.root(5) <= hi.root(15) OK
Check invariant (2): lo=1, hi=1 -> equal, OK
total=2 (even) -> median = (5 + 15)/2 = 10Each new value is routed by comparing to lo's root: anything not <= it goes to the upper heap. Because the roots already satisfy lo.root <= hi.root and sizes are equal, no rebalance is needed — the invariants are safe.
1 <= lo.root(5) -> belongs in lower half, push to lo.
lo (max): [5,1] hi (min): [15]
root=5 ^root=15
sift: 1<5, stays child
5
/
1
sizes: lo=2, hi=1 -> lo is ahead by 1, still OK (lo = hi+1)
roots: lo.root(5) <= hi.root(15) OK
total=3 (odd) -> median = lo.root = 51 is small, so it sinks into lo; the max-heap sift keeps 5 on top. Sizes are lo=2, hi=1, which is the allowed "lo leads by one" state, so the single middle element is exactly lo's root. Safe — no rebalance.
3 <= lo.root(5) -> push to lo first.
lo (max): [5,1,3] hi (min): [15]
5
/ \
1 3 sizes: lo=3, hi=1 <-- VIOLATES invariant (2)!
(lo leads by 2, allowed max is 1)The naive push overfills lo. Whenever size(lo) > size(hi)+1 we must move lo's root across to hi — this is the rebalance step that keeps the two halves the same size.
pop max from lo (=5), push 5 into hi.
lo (max): [3,1] hi (min): [5,15]
3 5
/ \
1 15
WHY this is safe:
- 5 was the LARGEST of the low half, so it is <= everything
already in hi -> invariant (1) still holds (3 <= 5).
- sizes now lo=2, hi=2 -> invariant (2) holds.
total=4 (even) -> median = (lo.root + hi.root)/2
= (3 + 5)/2 = 4Moving lo's root is always legal because it was the boundary value — the biggest of the small half is the smallest thing that can join the big half. Each transfer is O(log n), and after it the two roots straddle the true median: sorted data is 1,3,5,15 → median (3+5)/2 = 4.
function addNum(x): if lo empty OR x <= lo.root: lo.push(x) else: hi.push(x) # rebalance sizes (lo may lead by at most 1) if size(lo) > size(hi)+1: hi.push(lo.popMax()) elif size(hi) > size(lo): lo.push(hi.popMin()) function findMedian(): if size(lo) > size(hi): return lo.root else: return (lo.root + hi.root)/2 Cost: addNum O(log n), findMedian O(1)
The whole pattern reduces to: route by comparison, then restore the size invariant by shuffling one root across. Replaying 5,15,1,3 through this gives running medians 5 → 10 → 5 → 4, exactly matching our trace — the heap roots are the streaming median for free.
🎯 Guided practice
Problem 1 (easy) — "Take Gifts From the Richest Pile" (greedy-with-a-heap). Given gift piles [25, 64, 9, 4, 100], repeat 4 times: take the richest pile, replace it with floor(sqrt(pile)), put it back. Return the total remaining.
- Recognize the signal: "repeatedly take the largest, modify, reinsert" → priority queue. We need the max each round over changing data → a max-heap.
- Build: heapify all gifts into a max-heap in O(n) (in most languages: negate values into a min-heap). Heap =
{100,64,25,9,4}. - Round 1: pop 100, push
floor(sqrt(100))=10. Round 2: pop 64, push 8. Round 3: pop 25, push 5. Round 4: heap is{10,9,8,5,4}, pop 10, push 3. - Result: remaining heap holds
{9,8,5,4,3}→ sum = 29. Each round is one pop + one push = O(log n); total O(n + k log n), space O(n). Lesson: the heap silently re-orders after every mutation, so the next "max" is always one peek away.
Problem 2 (medium) — Find the Median of a Number Stream (Two Heaps). Numbers arrive one at a time; after each, report the median. Stream: 5, 2, 8, 1.
- Recognize the signal: "running median over a stream" → the Two Heaps pattern. Keep a max-heap
lofor the smaller half and a min-heaphifor the larger half, with the median at their seam. - Invariant: every element in
lo≤ every element inhi, and0 ≤ size(lo) - size(hi) ≤ 1. To add x: push x ontolo, movelo's max tohi, then ifhiis now larger thanlo, movehi's min back tolo. This keeps both halves correctly split at the seam. - Trace (heaps shown as multisets). Add 5: lo={5}, hi={}. Median = 5. Add 2: lo={2}, hi={5}. Median = (2+5)/2 = 3.5. Add 8: lo={2,5}, hi={8}. Median = top of lo = 5. Add 1: lo={1,2}, hi={5,8}. Median = (2+5)/2 = 3.5.
- Read the median in O(1): if
lois larger, it'slo's root; if sizes are equal, average the two roots. Insert is O(log n) (rebalancing is a constant number of heap ops); space O(n). Lesson: when one extreme isn't enough but the middle is, two opposing heaps turn an O(n) median scan into O(log n) per element.
✨ Added by the guide — work these before the full problem set.
Lessons in this topic
- ○Introduction to Heap
- ○Heap Operations
- ○Introduction to Heap
- ○Heap Operations
- ○easy Take Gifts From the Richest Pile
- ○easy Sort Characters By Frequency
- ○medium Minimum Cost to Connect Sticks
- ○medium Find the Median of a Number Stream
- ○medium K-th Smallest Prime Fraction
- ○medium Sort Characters By Frequency
- ○hard Smallest Range Covering Elements from K Lists
- ○hard Rearrange String
- ○hard Median of Two Sorted Arrays