Minimum Spanning Tree — Kruskal & Prim
Minimum Spanning Tree
A minimum spanning tree (MST) connects every vertex of a weighted, undirected, connected graph using exactly V-1 edges with the smallest possible total weight, and both classic algorithms work because of one structural fact — the cut property: for any partition of the vertices into two non-empty sets, the single lightest edge crossing that cut must belong to some MST (if two crossing edges tie, either may be chosen). Kruskal and Prim are two different disciplined ways of repeatedly finding and taking a cut-crossing minimum edge until the graph is fully connected.
Recognize the pattern
- The problem says "connect all nodes/cities/machines at minimum total cost" over an undirected, weighted graph.
- You need one tree touching every vertex — not a shortest path between two fixed nodes (Dijkstra) and not all-pairs distances (Floyd–Warshall).
- Edges may have negative weights; MST is unaffected by sign because it never accumulates a running distance, it just compares raw edge weights.
- Signal phrases: "minimum cost to connect all", "minimum cost network / wiring / pipeline", "remove the most expensive redundant connections while staying connected".
Brute force vs. the greedy algorithms
Naive baseline: enumerate every spanning tree (or every subset of V-1 edges), reject the ones with a cycle or a disconnected component, keep the cheapest survivor. A complete graph on V vertices has V^(V-2) spanning trees (Cayley's formula), so this is super-exponential and only viable for toy graphs.
Kruskal and Prim both replace enumeration with a single greedy pass, using the cut property to certify every choice is safe without ever backtracking:
- Kruskal — edge-centric. Sort all
Eedges by weight ascending. Walk the sorted list; add an edge if and only if its two endpoints are currently in different components (checked/merged with a Union-Find/DSU structure). Stop afterV-1edges are added. It builds a forest that merges into one tree. - Prim — vertex-centric. Start from any vertex; maintain one growing tree and a min-heap of "frontier" edges (one endpoint inside the tree, one outside). Repeatedly pop the minimum-weight frontier edge, add its outside endpoint to the tree, and push that vertex's new edges into the heap. Stop when all vertices are in the tree.
Both are correct MST algorithms; they differ only in which cut they examine at each step — Kruskal looks at the cut implied by "the two components an edge would merge", Prim always looks at the cut between "tree so far" and "everything else".
Kruskal in code
class DSU {
int[] parent, rank;
DSU(int n) {
parent = new int[n];
rank = new int[n];
for (int i = 0; i < n; i++) parent[i] = i;
}
int find(int x) {
while (parent[x] != x) {
parent[x] = parent[parent[x]]; // path compression
x = parent[x];
}
return x;
}
boolean union(int a, int b) {
int ra = find(a), rb = find(b);
if (ra == rb) return false; // would create a cycle
if (rank[ra] < rank[rb]) { int t = ra; ra = rb; rb = t; }
parent[rb] = ra;
if (rank[ra] == rank[rb]) rank[ra]++;
return true;
}
}
class Edge {
int u, v, w;
Edge(int u, int v, int w) { this.u = u; this.v = v; this.w = w; }
}
int kruskalMST(int n, java.util.List<Edge> edges) {
// Use Integer.compare, not "a.w - b.w": subtraction on extreme
// int weights (e.g. one near Integer.MAX_VALUE, one very negative)
// can overflow and silently produce a wrong sort order.
edges.sort((a, b) -> Integer.compare(a.w, b.w));
DSU dsu = new DSU(n);
int total = 0, used = 0;
for (Edge e : edges) {
if (used == n - 1) break;
if (dsu.union(e.u, e.v)) {
total += e.w;
used++;
}
}
if (used != n - 1) throw new IllegalStateException("graph is disconnected");
return total;
}Prim in code
int primMST(int n, java.util.List<java.util.List<int[]>> adj) {
// adj.get(u) = list of {neighbor, weight}
boolean[] inTree = new boolean[n];
// Same rule as Kruskal: compare with Integer.compare, never a[0] - b[0],
// to avoid overflow when weights are near the int range's edges.
java.util.PriorityQueue<int[]> heap =
new java.util.PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
heap.add(new int[]{0, 0}); // {weight, vertex}, start at vertex 0
int total = 0, added = 0;
while (!heap.isEmpty() && added < n) {
int[] top = heap.poll();
int w = top[0], u = top[1];
if (inTree[u]) continue; // stale heap entry, discard
inTree[u] = true;
total += w;
added++;
for (int[] edge : adj.get(u)) {
int v = edge[0], vw = edge[1];
if (!inTree[v]) heap.add(new int[]{vw, v});
}
}
if (added != n) throw new IllegalStateException("graph is disconnected");
return total;
}Complexity, derived from first principles
Kruskal: sorting the edge list costs O(E log E). The main loop visits every edge once, doing one find and possibly one union, each nearly O(1) amortized (inverse-Ackermann, O(α(V)), with path compression + union by rank) — so the loop is O(E α(V)). Sorting dominates, giving O(E log E) time, which is the same as O(E log V) since E ≤ V^2 implies log E ≤ 2 log V. Space is O(V + E) for the DSU arrays and edge list.
Prim (binary-heap version): every vertex is popped once and every edge can push at most one heap entry, so there are O(E) heap insertions/removals, each O(log E) = O(log V) (heap size is bounded by E, and log E = O(log V)) — total O(E log V) time. Space is O(V + E) for the adjacency list plus O(E) for heap entries (including stale ones, which is why the if (inTree[u]) continue; check in the code above is required for correctness, not just efficiency).
Both reduce to the same asymptotic bound; the practical gap shows up in the constant factor and in which structure the graph's density favors (see "When to use" below).
Traced worked example
Graph: vertices A,B,C,D,E with edges A-C=1, B-C=2, D-E=2, A-B=4, B-D=5, C-D=8, C-E=10 (matches the diagram above — note B-D=5 and D-E=2, not the other way around).
Kruskal trace (sorted edges, Union-Find)
| Step | Edge (w) | find(u) vs find(v) | Action | Components after |
|---|---|---|---|---|
| 1 | A-C (1) | different | add (total=1) | {A,C} {B} {D} {E} |
| 2 | B-C (2) | different | add (total=3) | {A,B,C} {D} {E} |
| 3 | D-E (2) | different | add (total=5) | {A,B,C} {D,E} |
| 4 | A-B (4) | same | reject — cycle | unchanged |
| 5 | B-D (5) | different | add (total=10) | {A,B,C,D,E} — done, 4 edges |
Prim trace (start at A, min-heap frontier)
| Step | Tree so far | Frontier edges considered | Pop (min) | Total |
|---|---|---|---|---|
| 1 | {A} | A-B(4), A-C(1) | A-C (1) | 1 |
| 2 | {A,C} | A-B(4), B-C(2), C-D(8), C-E(10) | B-C (2) | 3 |
| 3 | {A,C,B} | B-D(5), C-D(8), C-E(10) [A-B, B-C stale-skip] | B-D (5) | 8 |
| 4 | {A,C,B,D} | D-E(2), C-D stale, C-E(10) | D-E (2) | 10 |
Both algorithms land on the same edge set {A-C, B-C, D-E, B-D} and the same total weight, 10 — as guaranteed, since every safe edge choice obeys the cut property regardless of which cut you examine first.
Pitfalls
- Forgetting the graph must be connected. If it isn't, Kruskal will finish its edge list with fewer than
V-1edges added and Prim's heap will empty with vertices unvisited — both must explicitly detect and report this (a "minimum spanning forest" otherwise), never silently return a partial-weight answer. - Skipping Union-Find and checking cycles by DFS/BFS instead. Correct but turns Kruskal from near-linear-after-sort into
O(E·V)— a common performance bug in interview code. - Forgetting path compression / union by rank. Without them DSU degrades toward
O(V)per operation on adversarial input (a long chain), turning Kruskal's loop intoO(E·V). - Stale heap entries in Prim. A vertex can be pushed onto the heap multiple times at different weights as the frontier shrinks; forgetting the
if (inTree[u]) continue;guard causes wrong totals or reprocessing. - Subtraction-based comparators overflow. Writing
(a, b) -> a.w - b.w(Kruskal) or(a, b) -> a[0] - b[0](Prim) is a classic Java anti-pattern: if weights can be nearInteger.MIN_VALUE/MAX_VALUE, the subtraction itself overflows and silently produces an inconsistent sort order. Always useInteger.compare(a, b)instead — it is the same cost and immune to overflow. The code above has already been written this way; treat any subtraction comparator you see elsewhere as a bug to fix. - Directed graphs. Neither algorithm is defined for directed graphs — that problem is the (harder) Minimum Arborescence, solved by Edmonds'/Chu–Liu.
- Assuming MST is unique. With tied edge weights there can be multiple valid MSTs (all with the same total weight) — do not assert a unique edge set unless the problem guarantees distinct weights.
- Negative weights are fine for MST correctness (unlike Dijkstra) since the algorithms never rely on distances accumulating along a path — only compare raw edge weights.
When to use / when NOT — trade-offs
| Situation | Prefer | Why |
|---|---|---|
Sparse graph, edge list already available (E ≈ O(V)) | Kruskal | O(E log E) sort dominates and is cheap when E is small; DSU overhead is tiny. |
Dense graph (E ≈ O(V^2)), adjacency matrix/list given | Prim (array-based, no heap) | An O(V^2) array-scan Prim beats O(E log V) = O(V^2 log V) heap-based Prim or Kruskal's sort when edges vastly outnumber vertices. |
| Streaming/dynamic edges, need MST-so-far incrementally | Kruskal | Union-Find naturally supports adding edges over time; Prim needs the tree to grow from a fixed frontier. |
| Need shortest path between two specific nodes, not a spanning tree | Neither — use Dijkstra (non-negative weights) or Bellman-Ford (negative weights) | MST minimizes total tree weight, not any single-pair path; MST paths between two nodes can be far longer than the true shortest path. |
| Need MST but edges can be negative and you also want shortest-path guarantees | MST (Kruskal/Prim) still correct for the MST part; do not conflate it with Bellman-Ford, which solves a different problem | MST tolerates negative weights natively; shortest-path algorithms need special handling (Bellman-Ford) or forbid negative weights (Dijkstra) — the two problems are not interchangeable even though both are "greedy-ish" on a graph. |
Rule of thumb interview line: "Kruskal wins on sparse graphs because sorting E edges is cheap and Union-Find is near-constant; Prim (heap-based) wins on dense graphs because its cost is driven by E log V and, in its array form, can even hit O(V^2), avoiding the sort entirely."
Takeaways
- Both algorithms are correct because of the cut property: the lightest edge across any cut is always safe to add.
- Kruskal is edge-centric and needs Union-Find; Prim is vertex-centric and needs a min-heap frontier with stale-entry handling.
- Complexity: Kruskal
O(E log E)from sorting; PrimO(E log V)from heap operations — asymptotically equal, but density decides the practical winner (sparse → Kruskal, dense → array-based Prim). - MST tolerates negative edge weights; it is not a shortest-path algorithm and must not be substituted for Dijkstra/Bellman-Ford.
- Always compare with
Integer.compare, never raw subtraction, in any comparator over weights that could be extreme.
Recall question
You are given a graph with 10,000 vertices and 49,995,000 edges (essentially complete). Which algorithm and which implementation variant minimizes runtime, and why?
Authored for this guide from first principles and standard treatments of greedy MST algorithms; cite Cormen/Leiserson/Rivest/Stein (CLRS), Chapters on Minimum Spanning Trees, and Sedgewick & Wayne, Algorithms, for further reading.
🤖 Don't fully get this? Learn it with Claude
Stuck on Minimum Spanning Tree — Kruskal & Prim? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.
Build the mental picture, not memorization.
I just read a lesson on **Minimum Spanning Tree — Kruskal & Prim** (DSA) and want to truly understand it. Explain Minimum Spanning Tree — Kruskal & Prim from first principles using ONE vivid real-world analogy and a visual mental model — draw it as ASCII art or a clear step-by-step diagram — with a concrete example using real numbers. Then ask me one question to check I got the mental picture, and wait for my reply. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Socratic — adapts to where you're stuck.
Teach me **Minimum Spanning Tree — Kruskal & Prim** interactively. Ask me ONE guiding question at a time, wait for my answer, and adapt to my confusion — build the idea with me step by step instead of explaining it all at once. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Active recall exposes what you missed.
Quiz me on **Minimum Spanning Tree — Kruskal & Prim** with 5 questions, easy to tricky, ONE at a time. Tell me if each answer is right; at the end, explain clearly what I got wrong and why. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Intuition + hook + flashcards for long-term memory.
Help me remember **Minimum Spanning Tree — Kruskal & Prim** for the long term: give the one-sentence intuition, a memorable hook/mnemonic, a tiny worked example, and 3 active-recall flashcards (Q -> A). If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.