Introduction to Graph
A graph models pairwise relationships by storing a set of vertices V and a set of edges E connecting them, and every graph algorithm's cost is bounded by how many vertices and edges it has to touch — so the first design decision, before any traversal, is how you store the edges, because that choice fixes the cost of every later operation.
Recognize the pattern
- The problem talks about entities and pairwise relationships between them (friends, routes, dependencies, links) rather than a linear or hierarchical order.
- You need to ask reachability questions: "can I get from X to Y", "are these connected", "what's the shortest hop count".
- Relationships can be one-way (prerequisite → course) or two-way (mutual friendship) — that tells you directed vs undirected.
- A tree that shows up in a problem is secretly a graph with V−1 edges and no cycles — if cycles or multiple parents are possible, you're in graph territory, not tree territory.
Representation: brute force vs optimal
Adjacency matrix ("brute force"): a V × V boolean/weight grid, matrix[u][v] = 1 if an edge exists. Checking whether an edge exists is O(1), but the matrix costs O(V²) space and O(V²) time just to build or scan, even for a sparse graph with only a handful of edges.
Adjacency list (the standard optimal choice): each vertex keeps a list of its neighbors. Space is O(V + E) — you only pay for edges that actually exist. Checking if a specific edge exists degrades to O(degree(u)) instead of O(1), but for the traversal and search problems that dominate interviews (BFS, DFS, shortest path, connectivity), you never need that O(1) lookup — you always want "give me all neighbors of u", which the list does directly.
Convention: an undirected self-loop at vertex X is stored twice in X's own adjacency list (standard practice, e.g. Sedgewick's algs4 Graph.addEdge) — once for each "end" of the loop — so that the total length of all adjacency lists always equals the sum of degrees, which equals 2|E|. Skipping this convention silently breaks that identity.
Rule of thumb: matrix wins only when the graph is dense (E ≈ V²) or you need frequent O(1) edge-existence checks (e.g. Floyd–Warshall). Otherwise, default to the adjacency list.
Complexity, derived
Take BFS/DFS traversal as the baseline operation on an adjacency list:
- Each of the
Vvertices is enqueued/visited exactly once → V unit operations. - While processing a vertex, you iterate its adjacency list; summed across all vertices, every undirected edge appears twice in the combined lists (once from each endpoint) → the total number of list entries scanned is 2|E|, which is still O(E), a constant factor away from |E|.
- Total work = V + O(E) → time O(V + E). Note 2|E| and V+E are two different numbers that are both O(V+E) — 2|E| is not itself "V+E", it collapses into the same asymptotic bucket only because constants don't matter for big-O.
- Space: the adjacency list itself is O(V + E); the visited-set / queue / recursion stack adds O(V) → space O(V + E).
With an adjacency matrix, step 2 becomes "scan an entire row of V entries to find neighbors" regardless of actual degree → V vertices × V scan = O(V²) time, no matter how sparse the graph is. This is the concrete reason sparse graphs must avoid the matrix.
Worked example
Graph: vertices {A,B,C,D,E}, undirected edges: A–B, A–C, B–C, C–D, C–E, and a self-loop at C. That's |E| = 6 edges and |V| = 5 vertices, so V + E = 11.
| Node | Adjacency list | Degree |
|---|---|---|
| A | [B, C] | 2 |
| B | [A, C] | 2 |
| C | [A, B, D, E, C(self), C(self)] | 1+1+1+1+2 = 6 |
| D | [C] | 1 |
| E | [C] | 1 |
Following the self-loop convention above, C's list stores the self-loop twice, so C's printed list has 6 entries — matching its degree of 6. Summing the printed list lengths: 2+2+6+1+1 = 12 entries total, and summing degrees gives the same 2+2+6+1+1 = 12 = 2 × |E| (6 edges, self-loop counted twice). This handshake identity — total adjacency-list entries = sum of degrees = 2|E| — is exactly why summing adjacency-list scans across all vertices gives O(E) and not O(V·E). It is a different number from V+E=11: 2|E|=12 counts each undirected edge's two appearances in the combined lists, while V+E=11 is the asymptotic-cost expression (V vertex visits plus E edges, each counted once in the cost model). Both are O(V+E) up to constants, but 12 ≠ 11 and the two should never be presented as the same number.
BFS from A: visit A (dist 0), scan A's 2-entry list, enqueue B and C → visit B (dist 1), scan B's 2-entry list, both neighbors already visited/queued → visit C (dist 1), scan C's 6-entry list, enqueue D and E → visit D (dist 2), scan D's 1-entry list → visit E (dist 2), scan E's 1-entry list. Vertices touched: 5. Total list entries scanned: 2+2+6+1+1 = 12, i.e. 2|E| — the full sum-of-degrees bound, since BFS from A eventually visits every vertex and thus scans every adjacency-list entry exactly once. The asymptotic cost is still reported as O(V+E) = O(11): the 12 is the concrete constant-factor number (2E), not a literal restatement of V+E, and the two must not be conflated.
Implementation: adjacency list + BFS
Both implementations below follow the same self-loop convention as the worked example: adding an edge (u, v) appends v to u's list and u to v's list; when u == v (a self-loop), that naturally appends the same vertex twice, keeping total list length equal to the sum of degrees.
Java
import java.util.*;
class Graph {
private final Map<Integer, List<Integer>> adj = new HashMap<>();
void addEdge(int u, int v) {
adj.computeIfAbsent(u, k -> new ArrayList<>()).add(v);
adj.computeIfAbsent(v, k -> new ArrayList<>()).add(u);
// when u == v (self-loop), both calls above target the same
// list, so v is appended twice -- matches the convention.
}
List<Integer> bfs(int start) {
List<Integer> order = new ArrayList<>();
Set<Integer> visited = new HashSet<>();
Deque<Integer> queue = new ArrayDeque<>();
queue.add(start);
visited.add(start);
while (!queue.isEmpty()) {
int u = queue.poll();
order.add(u);
for (int v : adj.getOrDefault(u, List.of())) {
if (visited.add(v)) queue.add(v);
}
}
return order; // time O(V+E), space O(V+E)
}
}Go
package main
type Graph struct {
adj map[int][]int
}
func NewGraph() *Graph {
return &Graph{adj: make(map[int][]int)}
}
func (g *Graph) AddEdge(u, v int) {
g.adj[u] = append(g.adj[u], v)
g.adj[v] = append(g.adj[v], u)
// u == v (self-loop) appends to the same slice twice,
// same convention as the Java version above.
}
func (g *Graph) BFS(start int) []int {
order := []int{}
visited := map[int]bool{start: true}
queue := []int{start}
for len(queue) > 0 {
u := queue[0]
queue = queue[1:]
order = append(order, u)
for _, v := range g.adj[u] {
if !visited[v] {
visited[v] = true
queue = append(queue, v)
}
}
}
return order // time O(V+E), space O(V+E)
}Both versions do the same two things the derivation above relies on: every vertex is enqueued once (V operations), and every adjacency-list entry is scanned once as it's iterated in the inner for loop (2|E| entries total across the whole run, which collapses to O(E)) — giving O(V+E) time and O(V+E) space in either language.
Pitfalls
- Forgetting that an undirected edge must be added to both endpoints' adjacency lists — a common off-by-one that silently makes BFS/DFS miss reachable nodes.
- Double-counting or forgetting self-loops when computing degree (an undirected self-loop adds 2 to degree, not 1) — and correspondingly forgetting to store it twice in the adjacency list, which breaks the sum-of-degrees = list-length identity used above.
- Using a matrix for a large sparse graph (e.g. V = 10⁵ with a few hundred thousand edges) — O(V²) memory blows up long before any algorithm runs.
- Confusing degree in a directed graph: total degree ≠ in-degree; interview questions on directed graphs almost always want in-degree and out-degree tracked separately (e.g. topological sort via in-degree).
- Conflating 2|E| (sum of degrees, a concrete count of adjacency-list entries) with the asymptotic expression V+E — they are different numbers that both happen to be O(V+E); don't state one as if it were a literal restatement of the other.
When to use / when not — trade-offs
Use an adjacency list (the default) whenever the graph is sparse or you mainly need traversal/neighbor enumeration — this covers the vast majority of interview problems (BFS, DFS, Dijkstra, topological sort, connected components).
Use an adjacency matrix instead when the graph is dense (E close to V²), when you need O(1) "is there an edge between u and v" checks repeatedly, or for algorithms like Floyd–Warshall that naturally iterate over all pairs anyway. Trade-off: matrix trades O(V²) memory for O(1) edge lookup; list trades O(degree) edge lookup for O(V+E) memory. There's also the edge list (a flat array of (u, v[, weight]) triples) — cheapest to build and best for algorithms that sort edges globally (Kruskal's MST), but poor for "give me neighbors of u" queries since that requires a full scan.
Takeaways
- A graph is just V and E — everything else (directed/undirected, weighted, cyclic) is a property layered on top.
- Representation choice is the real complexity lever: adjacency list gives O(V+E) traversal; adjacency matrix forces O(V²) regardless of sparsity.
- Sum of degrees = total adjacency-list entries = 2|E| in an undirected graph (self-loops stored twice) — this identity is why summed adjacency-list work is O(E), not O(V·E). It is a distinct number from V+E; both are O(V+E) asymptotically, but they must not be equated as literal values.
- Default to adjacency list unless you have a concrete reason (density, O(1) lookups, all-pairs algorithms) to reach for a matrix.
Recall: why does BFS/DFS on an adjacency list run in O(V+E) rather than O(V·E) or O(V²), and why is the sum of adjacency-list lengths (2|E|) a different number from V+E even though both are O(V+E)?
Synthesized from standard graph-theory references (CLRS, Sedgewick's Algorithms 4th ed.) and the original page's terminology (vertices, edges, adjacency, degree, digraphs).
🤖 Don't fully get this? Learn it with Claude
Stuck on Introduction to Graph? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.
Build the mental picture, not memorization.
I just read a lesson on **Introduction to Graph** (DSA) and want to truly understand it. Explain Introduction to Graph from first principles using ONE vivid real-world analogy and a visual mental model — draw it as ASCII art or a clear step-by-step diagram — with a concrete example using real numbers. Then ask me one question to check I got the mental picture, and wait for my reply. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Socratic — adapts to where you're stuck.
Teach me **Introduction to Graph** interactively. Ask me ONE guiding question at a time, wait for my answer, and adapt to my confusion — build the idea with me step by step instead of explaining it all at once. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Active recall exposes what you missed.
Quiz me on **Introduction to Graph** with 5 questions, easy to tricky, ONE at a time. Tell me if each answer is right; at the end, explain clearly what I got wrong and why. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Intuition + hook + flashcards for long-term memory.
Help me remember **Introduction to Graph** for the long term: give the one-sentence intuition, a memorable hook/mnemonic, a tiny worked example, and 3 active-recall flashcards (Q -> A). If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.