Knowledge Guide
HomeDatabasesSQL Practice Problems

Game Play Analysis V

Problem

Table: Activity

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| player_id    | int     |
| device_id    | int     |
| event_date   | date    |
| games_played | int     |
+--------------+---------+
(player_id, event_date) is the primary key (combination of columns with unique values) of this table.
This table shows the activity of players of some games.
Each row is a record of a player who logged in and played a number of games (possibly 0) before logging out on someday using some device.

Problem Definition

The install date of a player is the first login day of that player.

We define day one retention of some date x to be the number of players whose install date is x and they logged back in on the day right after x, divided by the number of players whose install date is x, rounded to 2 decimal places.

Write a solution to report for each install date, the number of players that installed the game on that day, and the day one retention.

Example

Image
Image

Output

Image
Image

Try It Yourself

java
-- TODO: Write your user queries here

Solution

To determine the day one retention for each install date in the Activity table, we need to analyze player login patterns. Specifically, we want to identify players who installed the game on a particular day (their first login day) and then check if they returned to log in the day immediately following their install date. We can efficiently accomplishes this by leveraging window functions and aggregation techniques. Below is a comprehensive breakdown of the approach, step-by-step execution, and detailed explanations for each line of the SQL query.

Approach Overview

  1. Identify Install Dates:

    • Determine the first login day (install_dt) for each player.
  2. Calculate Installations per Date:

    • Count the number of players who installed the game on each install_dt.
  3. Determine Day One Retention:

    • For each install_dt, calculate the proportion of players who logged in again on the day immediately after their install date (install_dt + 1 day).
  4. Present the Results:

    • Display each install_dt alongside the number of installations and the corresponding day one retention rate, rounded to two decimal places.

SQL Query

SELECT install_dt, COUNT(DISTINCT player_id) AS installs, ROUND( SUM(CASE WHEN event_date = install_dt + INTERVAL 1 DAY THEN 1 ELSE 0 END) / COUNT(DISTINCT player_id), 2 ) AS Day1_retention FROM ( SELECT *, MIN(event_date) OVER (PARTITION BY player_id) AS install_dt FROM Activity ) TEMP GROUP BY install_dt;

Step-by-Step Approach

Step 1: Identify Install Dates

Determine the install date (install_dt) for each player, which is defined as the first day the player logged in.

SQL Snippet:

SELECT *, MIN(event_date) OVER (PARTITION BY player_id) AS install_dt FROM Activity

Explanation:

  1. SELECT *,

    • Selects all columns from the Activity table for further processing.
  2. MIN(event_date) OVER (PARTITION BY player_id) AS install_dt

    • MIN(event_date) OVER (...): Utilizes a window function to calculate the minimum event_date for each player_id. This represents the player's first login day.
    • PARTITION BY player_id: Divides the data into partitions based on player_id, ensuring that the minimum date is calculated separately for each player.
    • AS install_dt: Aliases the resulting minimum date as install_dt.
  3. FROM Activity

    • Specifies the Activity table as the data source.

Intermediate Output After Step 1:

+-----------+-----------+------------+--------------+------------+ | player_id | device_id | event_date | games_played | install_dt | +-----------+-----------+------------+--------------+------------+ | 1 | 2 | 2016-03-01 | 5 | 2016-03-01 | | 1 | 2 | 2016-03-02 | 6 | 2016-03-01 | | 2 | 3 | 2017-06-25 | 1 | 2017-06-25 | | 3 | 1 | 2016-03-01 | 0 | 2016-03-01 | | 3 | 4 | 2016-07-03 | 5 | 2016-03-01 | +-----------+-----------+------------+--------------+------------+

Step 2: Calculate Installations per Date and Day One Retention

For each install_dt, count the number of players who installed the game on that day and calculate the day one retention, which is the proportion of those players who logged back in on the day immediately following their install date.

SQL Snippet:

SELECT install_dt, COUNT(DISTINCT player_id) AS installs, ROUND( SUM(CASE WHEN event_date = install_dt + INTERVAL 1 DAY THEN 1 ELSE 0 END) / COUNT(DISTINCT player_id), 2 ) AS Day1_retention FROM ( -- Subquery from Step 1 ) TEMP GROUP BY install_dt;

Explanation:

  1. SELECT install_dt,

    • Selects the install_dt to group the results by each install date.
  2. COUNT(DISTINCT player_id) AS installs,

    • Counts the number of unique players (player_id) who have install_dt equal to the current install_dt being processed.
    • COUNT(DISTINCT player_id) ensures that each player is only counted once per install date.
    • AS installs aliases the count as installs.
  3. ROUND( SUM(CASE WHEN event_date = install_dt + INTERVAL 1 DAY THEN 1 ELSE 0 END) / COUNT(DISTINCT player_id), 2 ) AS Day1_retention

    • SUM(CASE WHEN event_date = install_dt + INTERVAL 1 DAY THEN 1 ELSE 0 END):
      • For each row, checks if the event_date is exactly one day after the install_dt.
      • If true, adds 1 to the sum; otherwise, adds 0.
      • This effectively counts the number of players who returned on the day following their install date.
    • / COUNT(DISTINCT player_id):
      • Divides the number of players who returned on day one by the total number of installations on that date.
      • This calculates the retention rate as a decimal.
    • ROUND(..., 2):
      • Rounds the retention rate to two decimal places for readability.
    • AS Day1_retention:
      • Aliases the rounded value as Day1_retention.
  4. FROM ( SELECT ..., install_dt FROM Activity ) TEMP

    • References the subquery (TEMP) from Step 1, which contains all activity records along with each player's install_dt.
  5. GROUP BY install_dt;

    • Groups the results by each install_dt to perform aggregations (COUNT and SUM) within each group.

Final Output:

+------------------------+----------+----------------+ | install_dt | installs | Day1_retention | +------------------------+----------+----------------+ | 2016-03-01T00:00:00.000Z | 2 | 0.50 | | 2017-06-25T00:00:00.000Z | 1 | 0.00 | +------------------------+----------+----------------+
🤖 Don't fully get this? Learn it with Claude

Stuck on Game Play Analysis V? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.

🎨 Explain it visually

Build the mental picture, not memorization.

I just read a lesson on **Game Play Analysis V** (Databases) and want to truly understand it. Explain Game Play Analysis V from first principles using ONE vivid real-world analogy and a visual mental model — draw it as ASCII art or a clear step-by-step diagram — with a concrete example using real numbers. Then ask me one question to check I got the mental picture, and wait for my reply. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🤔 Walk me through it (interactive)

Socratic — adapts to where you're stuck.

Teach me **Game Play Analysis V** interactively. Ask me ONE guiding question at a time, wait for my answer, and adapt to my confusion — build the idea with me step by step instead of explaining it all at once. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🧪 Quiz me & fix my gaps

Active recall exposes what you missed.

Quiz me on **Game Play Analysis V** with 5 questions, easy to tricky, ONE at a time. Tell me if each answer is right; at the end, explain clearly what I got wrong and why. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🧠 Make it stick

Intuition + hook + flashcards for long-term memory.

Help me remember **Game Play Analysis V** for the long term: give the one-sentence intuition, a memorable hook/mnemonic, a tiny worked example, and 3 active-recall flashcards (Q -> A). If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.

📝 My notes