Knowledge Guide
HomeDatabases

SQL Fundamentals

Step 9 in the Databases path · 39 concepts · 0 problems

0 / 39 complete

📘 Learn SQL Fundamentals from zero

Start from zero. A relational database (RDBMS) stores data in tables: a table is a grid, each row is one record (one customer), each column is one attribute (name, city). SQL (Structured Query Language) is the language you use to ask questions of that data.

Analogy: think of a table as a giant spreadsheet, and SQL as a precise way to say "show me only these rows, only these columns, sorted this way, grouped like that." You describe what you want, and the database figures out how to fetch it — that is why SQL is called declarative.

Worked example. Suppose a sales table:

To get total paid sales per region, keeping only regions whose paid total exceeds 120:

SELECT region, SUM(amount) AS total FROM sales WHERE status = 'paid' GROUP BY region HAVING SUM(amount) > 120;

Walk the engine's logical execution order (the single most important mental model): FROM (load rows) → WHERE (drop the void row — a per-row filter that runs before any grouping) → GROUP BY (bucket by region) → HAVING (keep buckets where SUM(amount) > 120 — an aggregate filter) → SELECT (project columns) → ORDER BYLIMIT. Result: East, 300. West (50) is dropped by HAVING; the 999 void sale never reaches the sum because WHERE removed it first. Notice WHERE and HAVING do genuinely different jobs here.

The order is not the order you write it. You write SELECT first but it runs almost last — which is why a column alias defined in SELECT (like total) can't be referenced in WHERE or in most engines' HAVING, but can be used in ORDER BY.

Key insight: SQL is declarative and runs in a fixed logical order — FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT. Internalize that order and most "why doesn't my query work" questions answer themselves.

✨ Added by the guide to build intuition — not from the source course.

🎯 Guided practice

  1. Easy — filtering and NULLs. Given employees(id, name, manager_id, salary), list the names of employees who have no manager.

    Reasoning: "no manager" means manager_id is unknown, i.e. NULL. The trap is writing manager_id = NULL, which evaluates to UNKNOWN (never true) because under SQL's three-valued logic an unknown compared to anything — even another unknown — is unknown. The correct test for NULL is IS NULL.

    Answer: SELECT name FROM employees WHERE manager_id IS NULL;

    Pattern learned: NULL is a marker for "unknown," not a value — always test it with IS NULL / IS NOT NULL, never = or <>.

  2. Medium — JOIN + GROUP BY + HAVING, the workhorse pattern. Given customers(id, name) and orders(id, customer_id, amount), find the names of customers who have placed more than 2 orders, with their order count, highest first.

    Step 1 — what entities combine? We need customer names (in customers) and a count of orders (in orders), so we JOIN on customers.id = orders.customer_id. An INNER JOIN is correct because a customer with zero orders can never satisfy "more than 2" — keeping them via a LEFT JOIN would only add rows we are about to filter out.

    Step 2 — aggregate per customer. "Per customer" signals GROUP BY. We group by the customer and COUNT their order rows.

    Step 3 — filter the groups. "More than 2 orders" is a condition on an aggregate, so it goes in HAVING, not WHERE (a per-row filter runs before grouping and can't see COUNT(*)).

    Step 4 — sort. "Highest first" → ORDER BY order_count DESC (the SELECT alias is legal in ORDER BY because ORDER BY runs after SELECT).

    Answer:
    SELECT c.name, COUNT(*) AS order_count
    FROM customers c
    JOIN orders o ON c.id = o.customer_id
    GROUP BY c.id, c.name
    HAVING COUNT(*) > 2
    ORDER BY order_count DESC;

    Pattern learned: the JOIN → GROUP BY → HAVING → ORDER BY pipeline is the backbone of analytic SQL. WHERE filters rows before grouping, HAVING filters groups after. Group by the primary key (c.id) and carry c.name along — grouping by name alone would merge two distinct customers who share a name.

✨ Added by the guide — work these before the full problem set.

Lessons in this topic