High School Attendance
Problem
Table: Students
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| student_id | int |
| student_name | varchar |
| grade | int |
+---------------+---------+
student_id is the primary key for this table.
This table contains the information of students in a high school.
Table: Attendance
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| student_id | int |
| attendance_date | date |
| status | enum |
+---------------+---------+
(student_id, attendance_date) is the primary key for this table.
status is an ENUM of type ('Present', 'Absent').
Problem Definition
Write a solution to find the students who were absent for three or more consecutive days.
Return the result table ordered by student_id and start_date, where start_date is the first date of the consecutive absences.
Example
Input:
Students table:
+------------+--------------+-------+
| student_id | student_name | grade |
+------------+--------------+-------+
| 1 | Alice | 10 |
| 2 | Bob | 11 |
| 3 | Charlie | 10 |
+------------+--------------+-------+
Attendance table:
+------------+-----------------+--------+
| student_id | attendance_date | status |
+------------+-----------------+--------+
| 1 | 2020-11-01 | Absent |
| 1 | 2020-11-02 | Absent |
| 1 | 2020-11-03 | Absent |
| 2 | 2020-11-01 | Present|
| 2 | 2020-11-02 | Absent |
| 2 | 2020-11-03 | Absent |
| 2 | 2020-11-04 | Absent |
| 3 | 2020-11-01 | Present|
| 3 | 2020-11-02 | Present|
+------------+-----------------+--------+
Output:
+------------+-------------+
| student_id | start_date |
+------------+-------------+
| 1 | 2020-11-01 |
| 2 | 2020-11-02 |
+------------+-------------+
Try It Yourself
-- TODO: Write your user queries here
Solution
To identify students who were absent for three or more consecutive days, we can utilize SQL's window functions and Common Table Expressions (CTEs). This approach allows us to efficiently analyze attendance records, detect consecutive absences, and extract the required information for each qualifying student.
- Filter Absent Records: Isolate records where the student's status is 'Absent' to focus solely on absences.
- Assign Sequential Ranks: Use
ROW_NUMBER()to assign a sequential number to each absence per student based on theattendance_date. - Identify Consecutive Absences: Calculate a grouping identifier by subtracting the row number from the sequential rank. This helps in grouping consecutive absences.
- Aggregate Consecutive Absence Groups: Group the records based on the student and the calculated grouping identifier to identify sequences of consecutive absences.
- Filter Groups with Three or More Consecutive Absences: Select only those groups where the count of consecutive absences is three or more.
- Determine the Start Date of Absence Streaks: For each qualifying group, identify the earliest
attendance_dateas thestart_dateof the consecutive absence streak. - Order the Results: Present the final list ordered by
student_idandstart_datefor clarity.
SQL Query
WITH RankedAttendance AS ( SELECT student_id, attendance_date, status, ROW_NUMBER() OVER (PARTITION BY student_id ORDER BY attendance_date) AS s_rank FROM Attendance WHERE status = 'Absent' ), ConsecutiveAttendance AS ( SELECT student_id, MIN(attendance_date) AS start_date, COUNT(*) AS consecutive_days FROM ( SELECT student_id, attendance_date, s_rank - ROW_NUMBER() OVER (PARTITION BY student_id ORDER BY attendance_date) AS grouping_col FROM RankedAttendance ) AS grouped_absences GROUP BY student_id, grouping_col HAVING COUNT(*) >= 3 ) SELECT student_id, start_date FROM ConsecutiveAttendance ORDER BY student_id, start_date;
Step-by-Step Approach
Step 1: Filter Absent Records and Assign Sequential Ranks
Extract all absence records and assign a sequential rank to each absence per student based on the attendance_date. This ranking facilitates the detection of consecutive absences.
SQL Query:
WITH RankedAttendance AS ( SELECT student_id, attendance_date, status, ROW_NUMBER() OVER (PARTITION BY student_id ORDER BY attendance_date) AS s_rank FROM Attendance WHERE status = 'Absent' )
Explanation:
WITH RankedAttendance AS (...):- Defines a Common Table Expression (CTE) named
RankedAttendanceto store intermediate results.
- Defines a Common Table Expression (CTE) named
SELECT student_id, attendance_date, status, ROW_NUMBER() OVER (...) AS s_rank:student_id&attendance_date&status:- Selects relevant columns from the
Attendancetable.
- Selects relevant columns from the
ROW_NUMBER() OVER (PARTITION BY student_id ORDER BY attendance_date) AS s_rank:- Assigns a sequential number (
s_rank) to each absence perstudent_idordered byattendance_date. This helps in identifying the sequence of absences.
- Assigns a sequential number (
FROM Attendance:- Specifies the
Attendancetable as the data source.
- Specifies the
WHERE status = 'Absent':- Filters the records to include only those where the student's status is 'Absent'.
Output After Step 1:
Assuming the example input provided, the RankedAttendance CTE would produce:
+------------+-----------------+--------+--------+ | student_id | attendance_date | status | s_rank | +------------+-----------------+--------+--------+ | 1 | 2020-11-01 | Absent | 1 | | 1 | 2020-11-02 | Absent | 2 | | 1 | 2020-11-03 | Absent | 3 | | 2 | 2020-11-02 | Absent | 1 | | 2 | 2020-11-03 | Absent | 2 | | 2 | 2020-11-04 | Absent | 3 | +------------+-----------------+--------+--------+
Step 2: Calculate Grouping Identifier to Detect Consecutive Absences
Compute a grouping identifier (grouping_col) to help group consecutive absences. By subtracting the row number from the sequential rank, consecutive dates will have the same grouping_col.
SQL Query:
ConsecutiveAttendance AS ( SELECT student_id, MIN(attendance_date) AS start_date, COUNT(*) AS consecutive_days FROM ( SELECT student_id, attendance_date, s_rank - ROW_NUMBER() OVER (PARTITION BY student_id ORDER BY attendance_date) AS grouping_col FROM RankedAttendance ) AS grouped_absences GROUP BY student_id, grouping_col HAVING COUNT(*) >= 3 )
Explanation:
SELECT student_id, MIN(attendance_date) AS start_date, COUNT(*) AS consecutive_days:student_id:- Identifies the student.
MIN(attendance_date) AS start_date:- Determines the earliest date in the consecutive absence streak.
COUNT(*) AS consecutive_days:- Counts the number of consecutive absent days in the group.
FROM (...) AS grouped_absences:- Utilizes a subquery to compute the
grouping_col.
- Utilizes a subquery to compute the
SELECT student_id, attendance_date, s_rank - ROW_NUMBER() OVER (...) AS grouping_col:s_rank - ROW_NUMBER() OVER (...) AS grouping_col:- Calculates the
grouping_colby subtracting the row number from the sequential rank. Consecutive dates will result in the samegrouping_col, effectively grouping them together.
- Calculates the
FROM RankedAttendance:- References the
RankedAttendanceCTE from Step 1.
- References the
GROUP BY student_id, grouping_col:- Groups the records by
student_idandgrouping_colto identify distinct consecutive absence streaks.
- Groups the records by
HAVING COUNT(*) >= 3:- Filters the groups to include only those where the count of consecutive absences is three or more.
Output After Step 2:
Based on the example input, the ConsecutiveAttendance CTE would produce:
+------------+-------------+-----------------+ | student_id | start_date | consecutive_days| +------------+-------------+-----------------+ | 1 | 2020-11-01 | 3 | | 2 | 2020-11-02 | 3 | +------------+-------------+-----------------+
Step 3: Select and Order the Final Results
Retrieve the student_id and start_date of each qualifying student who was absent for three or more consecutive days. Order the results by student_id and start_date.
SQL Query:
SELECT student_id, start_date FROM ConsecutiveAttendance ORDER BY student_id, start_date;
Explanation:
SELECT student_id, start_date:- Selects the
student_idand the correspondingstart_dateof the consecutive absence streak.
- Selects the
FROM ConsecutiveAttendance:- Utilizes the
ConsecutiveAttendanceCTE from Step 2, which contains the grouped consecutive absences.
- Utilizes the
ORDER BY student_id, start_date:- Orders the final output first by
student_idin ascending order and then bystart_dateto organize the results clearly.
- Orders the final output first by
Final Output:
+------------+-------------+ | student_id | start_date | +------------+-------------+ | 1 | 2020-11-01 | | 2 | 2020-11-02 | +------------+-------------+
🤖 Don't fully get this? Learn it with Claude
Stuck on High School Attendance? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.
Build the mental picture, not memorization.
I just read a lesson on **High School Attendance** (Databases) and want to truly understand it. Explain High School Attendance from first principles using ONE vivid real-world analogy and a visual mental model — draw it as ASCII art or a clear step-by-step diagram — with a concrete example using real numbers. Then ask me one question to check I got the mental picture, and wait for my reply. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Socratic — adapts to where you're stuck.
Teach me **High School Attendance** interactively. Ask me ONE guiding question at a time, wait for my answer, and adapt to my confusion — build the idea with me step by step instead of explaining it all at once. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Active recall exposes what you missed.
Quiz me on **High School Attendance** with 5 questions, easy to tricky, ONE at a time. Tell me if each answer is right; at the end, explain clearly what I got wrong and why. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Intuition + hook + flashcards for long-term memory.
Help me remember **High School Attendance** for the long term: give the one-sentence intuition, a memorable hook/mnemonic, a tiny worked example, and 3 active-recall flashcards (Q -> A). If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.