medium Employee Collaboration Networks
Problem Statement
Table: CollaborationAccepted
Each row in this table records an instance where one employee invites another to collaborate on a project, and the invitation is accepted.
+----------------+---------+
| Column Name | Type |
+----------------+---------+
| inviter_id | int |
| invitee_id | int |
| project_id | int |
| accept_date | date |
+----------------+---------+
The combination of (inviter_id, invitee_id, project_id) is the primary key for this table.
This table includes the ID of the inviter, the ID of the invitee, the project they are collaborating on, and the date the invitation was accepted.
Develop a solution to find the employee(s) who have collaborated on the most number of unique projects. In case of a tie, list all such employees.
Example
Input:
CollaborationAccepted table: +------------+------------+------------+-------------+ | inviter_id | invitee_id | project_id | accept_date | +------------+------------+------------+-------------+ | 1 | 2 | 100 | 2020-05-01 | | 1 | 3 | 101 | 2020-05-02 | | 2 | 3 | 102 | 2020-05-03 | | 3 | 4 | 103 | 2020-05-04 | | 2 | 4 | 104 | 2020-05-05 | +------------+------------+------------+-------------+
Output:
+----+--------------+ | id | num_projects | +----+--------------+ | 2 | 3 | | 3 | 3 | +----+--------------+
In this example, both employees 2 and 3 have collaborated on 3 unique projects, which is the highest among all employees.
Try It yourself
-- TODO: Write your user queries here
Solution
To identify the employee(s) who have collaborated on the most number of unique projects, we analyze the CollaborationAccepted table. We can accomplishes this by leveraging Common Table Expressions (CTEs) and set operations to aggregate and determine the desired results. Below is a comprehensive breakdown of the approach, step-by-step execution, and detailed explanations for each line of the SQL query.
Approach Overview
- Combine Collaborations: Merge the inviter and invitee collaborations to treat all collaborations uniformly.
- Count Unique Projects per Employee: Calculate the number of distinct projects each employee has collaborated on.
- Identify Maximum Project Count: Determine the highest number of unique projects any employee has collaborated on.
- Select Top Collaborators: Retrieve the employee(s) whose project count matches the maximum identified.
- Present the Final Results: Display the employee IDs alongside their respective number of unique projects.
SQL Query
WITH AllCollaborations AS ( -- Combine inviter and invitee collaborations SELECT inviter_id AS id, project_id FROM CollaborationAccepted UNION SELECT invitee_id AS id, project_id FROM CollaborationAccepted ), ProjectCounts AS ( -- Count unique projects per employee SELECT id, COUNT(DISTINCT project_id) AS num_projects FROM AllCollaborations GROUP BY id ) -- Select employees with the maximum number of projects SELECT id, num_projects FROM ProjectCounts WHERE num_projects = ( SELECT MAX(num_projects) FROM ProjectCounts ) ORDER BY id;
Step-by-Step Approach
Step 1: Combine Collaborations (AllCollaborations)
Merge the inviter and invitee collaborations to create a unified view of all collaborations, treating both roles equally.
SQL Snippet:
WITH AllCollaborations AS ( -- Combine inviter and invitee collaborations SELECT inviter_id AS id, project_id FROM CollaborationAccepted UNION SELECT invitee_id AS id, project_id FROM CollaborationAccepted ),
Explanation:
-
WITH AllCollaborations AS (- Initiates a Common Table Expression (CTE) named
AllCollaborations. CTEs allow for temporary result sets that can be referenced within the main query.
- Initiates a Common Table Expression (CTE) named
-
SELECT inviter_id AS id, project_id- Selects the
inviter_idfrom theCollaborationAcceptedtable and aliases it asid. - Retrieves the associated
project_idfor each invitation.
- Selects the
-
UNION- Combines the results of the first
SELECTwith the secondSELECT. - The
UNIONoperator ensures that duplicate records are eliminated, resulting in a list of unique (id, project_id) pairs.
- Combines the results of the first
-
SELECT invitee_id AS id, project_id- Selects the
invitee_idfrom theCollaborationAcceptedtable and aliases it asid. - Retrieves the associated
project_idfor each accepted invitation.
- Selects the
Intermediate Output After Step 1 (AllCollaborations):
+----+------------+ | id | project_id | +----+------------+ | 1 | 100 | | 2 | 100 | | 1 | 101 | | 3 | 101 | | 2 | 102 | | 3 | 102 | | 3 | 103 | | 4 | 103 | | 2 | 104 | | 4 | 104 | +----+------------+
Step 2: Count Unique Projects per Employee (ProjectCounts)
Calculate the number of distinct projects each employee has collaborated on.
SQL Snippet:
ProjectCounts AS ( -- Count unique projects per employee SELECT id, COUNT(DISTINCT project_id) AS num_projects FROM AllCollaborations GROUP BY id )
Explanation:
-
ProjectCounts AS (- Initiates a second CTE named
ProjectCounts.
- Initiates a second CTE named
-
SELECT id, COUNT(DISTINCT project_id) AS num_projects- Selects the
id(employee ID) from theAllCollaborationsCTE. - Counts the number of distinct
project_ids each employee has collaborated on. - Aliases the count as
num_projects.
- Selects the
-
GROUP BY id- Groups the data by
idto perform the aggregation for each employee.
- Groups the data by
Intermediate Output After Step 2 (ProjectCounts):
+----+--------------+ | id | num_projects | +----+--------------+ | 1 | 2 | | 2 | 3 | | 3 | 3 | | 4 | 2 | +----+--------------+
Step 3: Identify Maximum Project Count
Determine the highest number of unique projects any employee has collaborated on.
SQL Snippet:
SELECT id, num_projects FROM ProjectCounts WHERE num_projects = ( SELECT MAX(num_projects) FROM ProjectCounts )
Explanation:
-
SELECT id, num_projects- Selects the
idandnum_projectscolumns from theProjectCountsCTE.
- Selects the
-
WHERE num_projects = (- Introduces a condition to filter employees whose
num_projectsequals the maximum found.
- Introduces a condition to filter employees whose
-
SELECT MAX(num_projects)- Selects the highest value of
num_projectsacross all employees.
- Selects the highest value of
-
ORDER BY id;- Orders the final result set by
idin ascending order for clarity.
- Orders the final result set by
Final Output:
+----+--------------+ | id | num_projects | +----+--------------+ | 2 | 3 | | 3 | 3 | +----+--------------+
Explanation of Output:
-
Employee 2:
- Collaborated on 3 unique projects (100, 102, 104)
-
Employee 3:
- Collaborated on 3 unique projects (101, 102, 103)
🤖 Don't fully get this? Learn it with Claude
Stuck on Employee Collaboration Networks? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.
Progressively stronger hints — you still solve it.
I'm working on the problem **Employee Collaboration Networks** (Databases). Give me a HINT LADDER: start with the tiniest nudge, then wait. Only reveal the next, stronger hint when I ask. Do NOT show the full solution unless I type 'show solution'. Keep me doing the thinking. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
See the technique, not just code.
Explain the optimal approach to **Employee Collaboration Networks** with a VISUAL walkthrough: trace it on a small concrete example using ASCII art / a step-by-step diagram, narrate what changes each step, then give time & space complexity with a one-line derivation. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Catch bugs, edge cases, sub-optimality.
I'll paste my solution to **Employee Collaboration Networks**. Review it for correctness, missed edge cases, and time/space complexity, then coach me toward the optimal — don't just rewrite it. Ask me to paste my code now. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Lock in recognition with look-alikes.
Give me 2 problems that use the SAME underlying pattern as **Employee Collaboration Networks**. For each, let me attempt first, then review my answer and name the trigger signal that reveals the pattern. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.