Game Play Analysis V
Problem
Table: Activity
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| player_id | int |
| device_id | int |
| event_date | date |
| games_played | int |
+--------------+---------+
(player_id, event_date) is the primary key (combination of columns with unique values) of this table.
This table shows the activity of players of some games.
Each row is a record of a player who logged in and played a number of games (possibly 0) before logging out on someday using some device.
Problem Definition
The install date of a player is the first login day of that player.
We define day one retention of some date x to be the number of players whose install date is x and they logged back in on the day right after x, divided by the number of players whose install date is x, rounded to 2 decimal places.
Write a solution to report for each install date, the number of players that installed the game on that day, and the day one retention.
Example
Output
Try It Yourself
-- TODO: Write your user queries here
Solution
To determine the day one retention for each install date in the Activity table, we need to analyze player login patterns. Specifically, we want to identify players who installed the game on a particular day (their first login day) and then check if they returned to log in the day immediately following their install date. We can efficiently accomplishes this by leveraging window functions and aggregation techniques. Below is a comprehensive breakdown of the approach, step-by-step execution, and detailed explanations for each line of the SQL query.
Approach Overview
-
Identify Install Dates:
- Determine the first login day (
install_dt) for each player.
- Determine the first login day (
-
Calculate Installations per Date:
- Count the number of players who installed the game on each
install_dt.
- Count the number of players who installed the game on each
-
Determine Day One Retention:
- For each
install_dt, calculate the proportion of players who logged in again on the day immediately after their install date (install_dt + 1 day).
- For each
-
Present the Results:
- Display each
install_dtalongside the number of installations and the corresponding day one retention rate, rounded to two decimal places.
- Display each
SQL Query
SELECT install_dt, COUNT(DISTINCT player_id) AS installs, ROUND( SUM(CASE WHEN event_date = install_dt + INTERVAL 1 DAY THEN 1 ELSE 0 END) / COUNT(DISTINCT player_id), 2 ) AS Day1_retention FROM ( SELECT *, MIN(event_date) OVER (PARTITION BY player_id) AS install_dt FROM Activity ) TEMP GROUP BY install_dt;
Step-by-Step Approach
Step 1: Identify Install Dates
Determine the install date (install_dt) for each player, which is defined as the first day the player logged in.
SQL Snippet:
SELECT *, MIN(event_date) OVER (PARTITION BY player_id) AS install_dt FROM Activity
Explanation:
-
SELECT *,- Selects all columns from the
Activitytable for further processing.
- Selects all columns from the
-
MIN(event_date) OVER (PARTITION BY player_id) AS install_dtMIN(event_date) OVER (...): Utilizes a window function to calculate the minimumevent_datefor eachplayer_id. This represents the player's first login day.PARTITION BY player_id: Divides the data into partitions based onplayer_id, ensuring that the minimum date is calculated separately for each player.AS install_dt: Aliases the resulting minimum date asinstall_dt.
-
FROM Activity- Specifies the
Activitytable as the data source.
- Specifies the
Intermediate Output After Step 1:
+-----------+-----------+------------+--------------+------------+ | player_id | device_id | event_date | games_played | install_dt | +-----------+-----------+------------+--------------+------------+ | 1 | 2 | 2016-03-01 | 5 | 2016-03-01 | | 1 | 2 | 2016-03-02 | 6 | 2016-03-01 | | 2 | 3 | 2017-06-25 | 1 | 2017-06-25 | | 3 | 1 | 2016-03-01 | 0 | 2016-03-01 | | 3 | 4 | 2016-07-03 | 5 | 2016-03-01 | +-----------+-----------+------------+--------------+------------+
Step 2: Calculate Installations per Date and Day One Retention
For each install_dt, count the number of players who installed the game on that day and calculate the day one retention, which is the proportion of those players who logged back in on the day immediately following their install date.
SQL Snippet:
SELECT install_dt, COUNT(DISTINCT player_id) AS installs, ROUND( SUM(CASE WHEN event_date = install_dt + INTERVAL 1 DAY THEN 1 ELSE 0 END) / COUNT(DISTINCT player_id), 2 ) AS Day1_retention FROM ( -- Subquery from Step 1 ) TEMP GROUP BY install_dt;
Explanation:
-
SELECT install_dt,- Selects the
install_dtto group the results by each install date.
- Selects the
-
COUNT(DISTINCT player_id) AS installs,- Counts the number of unique players (
player_id) who haveinstall_dtequal to the currentinstall_dtbeing processed. COUNT(DISTINCT player_id)ensures that each player is only counted once per install date.AS installsaliases the count asinstalls.
- Counts the number of unique players (
-
ROUND( SUM(CASE WHEN event_date = install_dt + INTERVAL 1 DAY THEN 1 ELSE 0 END) / COUNT(DISTINCT player_id), 2 ) AS Day1_retentionSUM(CASE WHEN event_date = install_dt + INTERVAL 1 DAY THEN 1 ELSE 0 END):- For each row, checks if the
event_dateis exactly one day after theinstall_dt. - If true, adds
1to the sum; otherwise, adds0. - This effectively counts the number of players who returned on the day following their install date.
- For each row, checks if the
/ COUNT(DISTINCT player_id):- Divides the number of players who returned on day one by the total number of installations on that date.
- This calculates the retention rate as a decimal.
ROUND(..., 2):- Rounds the retention rate to two decimal places for readability.
AS Day1_retention:- Aliases the rounded value as
Day1_retention.
- Aliases the rounded value as
-
FROM ( SELECT ..., install_dt FROM Activity ) TEMP- References the subquery (
TEMP) from Step 1, which contains all activity records along with each player'sinstall_dt.
- References the subquery (
-
GROUP BY install_dt;- Groups the results by each
install_dtto perform aggregations (COUNTandSUM) within each group.
- Groups the results by each
Final Output:
+------------------------+----------+----------------+ | install_dt | installs | Day1_retention | +------------------------+----------+----------------+ | 2016-03-01T00:00:00.000Z | 2 | 0.50 | | 2017-06-25T00:00:00.000Z | 1 | 0.00 | +------------------------+----------+----------------+
🤖 Don't fully get this? Learn it with Claude
Stuck on Game Play Analysis V? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.
Build the mental picture, not memorization.
I just read a lesson on **Game Play Analysis V** (Databases) and want to truly understand it. Explain Game Play Analysis V from first principles using ONE vivid real-world analogy and a visual mental model — draw it as ASCII art or a clear step-by-step diagram — with a concrete example using real numbers. Then ask me one question to check I got the mental picture, and wait for my reply. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Socratic — adapts to where you're stuck.
Teach me **Game Play Analysis V** interactively. Ask me ONE guiding question at a time, wait for my answer, and adapt to my confusion — build the idea with me step by step instead of explaining it all at once. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Active recall exposes what you missed.
Quiz me on **Game Play Analysis V** with 5 questions, easy to tricky, ONE at a time. Tell me if each answer is right; at the end, explain clearly what I got wrong and why. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Intuition + hook + flashcards for long-term memory.
Help me remember **Game Play Analysis V** for the long term: give the one-sentence intuition, a memorable hook/mnemonic, a tiny worked example, and 3 active-recall flashcards (Q -> A). If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.