Data Modeling

Step 4 in the Databases path · 4 concepts · 0 problems

0 / 4 complete

📘 Learn Data Modeling from zero

Data modeling is the discipline of deciding what data you will store and how the pieces relate, before you build the database. Think of it as the blueprint an architect draws before anyone pours concrete: you would never build a house by stacking bricks ad hoc, and you should not build a database by inventing columns on the fly.

First principles. Every model is built from three primitives: entities (the nouns/things you track), attributes (the properties of each thing), and relationships (how things connect, described by their cardinality: 1:1, 1:N, or M:N).

The three levels. You refine the model in stages: a conceptual model (business-facing, technology-agnostic — just entities and relationships), a logical model (adds attributes, primary keys, foreign keys, and normalization, still vendor-neutral), and a physical model (actual tables, data types, indexes, and partitioning for a specific engine like PostgreSQL).

Worked example — a library. Conceptual: entities Member, Book, and Loan; a Member borrows Books. Because one member borrows many books and one book is borrowed by many members over time, that is an M:N relationship — so you resolve it with a Loan associative (junction) entity that also carries its own attribute, due_date. Logical: Member(member_id PK, name), Book(book_id PK, title), Loan(loan_id PK, member_id FK, book_id FK, due_date). Physical: pick BIGINT for ids, DATE for due_date, and add an index on Loan.member_id for fast per-member lookups.

Key insight: a good data model is access-pattern-aware structure decided before code — get the entities, cardinalities, and level of normalization right, and queries become simple; get them wrong, and no amount of query tuning saves you.

✨ Added by the guide to build intuition — not from the source course.

🎯 Guided practice

Easy — Identify cardinality and resolve it.
Problem: A blogging platform has Authors and Posts. Each post is written by exactly one author; an author can write many posts. Model this.
Step 1 — name entities: Author and Post.
Step 2 — determine cardinality: one author → many posts, each post → one author. That is 1:N.
Step 3 — apply the rule: for 1:N, place the foreign key on the "many" side. No junction table is needed.
Answer: Author(author_id PK, name) and Post(post_id PK, title, author_id FK). The author_id living on Post is the whole pattern.
Medium — Normalize to 3NF and spot the M:N.
Problem: An orders sheet has columns: order_id, customer_name, customer_email, product1, product2, product3, product_category. Redesign it properly.
Step 1 — 1NF (atomic values, no repeating groups): the product1/2/3 columns are a repeating group. Split products into their own rows/entity. Now an order can contain many products and a product can appear in many orders → this is an M:N relationship.
Step 2 — resolve M:N with a junction table: introduce OrderItem(order_id FK, product_id FK) with a composite PK (order_id, product_id).
Step 3 — 2NF then 3NF (remove partial, then transitive dependencies): customer_email depends on the customer, not the order, so extract Customer. product_category depends on the product (a transitive dependency via the product, not on the order key), so it belongs on Product.
Answer: Customer(customer_id PK, name, email), Order(order_id PK, customer_id FK), Product(product_id PK, name, category), OrderItem(order_id FK, product_id FK, PK(order_id, product_id)).
Core pattern learned: repeating columns → split into a new entity; M:N → junction table with a composite key; an attribute that depends on something other than the whole key → move it to where its key lives.

✨ Added by the guide — work these before the full problem set.

Data Modeling

📘 Learn Data Modeling from zero

🎯 Guided practice

Lessons in this topic