Knowledge Guide
HomeDSAAdvanced Patterns

medium Distinct elements in subarray

Problem Statement

You are given an integer array nums[] and a list of queries queries[], where each query is defined as queries[i] = [lefti, righti]. For each query, determine the number of distinct integers in the subarray of nums that starts at index lefti and ends at index righti for each query.

Return an array answer[] where answer[i] is the result for the ith query.

Examples

  1. Example 1:

    • Input: nums = [2, 1, 2, 3, 4], queries = [[0, 2], [1, 3], [2, 4]]
    • Expected Output: [2, 3, 3]
    • Explanation:
      • For the first query [0, 2], the subarray is [2, 1, 2], which has 2 distinct elements: {2, 1}.
      • For the second query [1, 3], the subarray is [1, 2, 3], which has 3 distinct elements: {1, 2, 3}.
      • For the third query [2, 4], the subarray is [2, 3, 4], which has 3 distinct elements: {2, 3, 4}.
  2. Example 2:

    • Input: nums = [5, 3, 5, 7, 5, 3, 8], queries = [[0, 5], [2, 4], [3, 6]]
    • Expected Output: [3, 2, 4]
    • Explanation:
      • For the first query [0, 5], the subarray is [5, 3, 5, 7, 5, 3], which has 3 distinct elements: {5, 3, 7}.
      • For the second query [2, 4], the subarray is [5, 7, 5], which has 2 distinct elements: {5, 7}.
      • For the third query [3, 6], the subarray is [7, 5, 3, 8], which has 4 distinct elements: {7, 5, 3, 8}.
  3. Example 3:

    • Input: nums[] = [9, 9, 8, 8, 7, 6, 6, 5], queries[] = [[0, 3], [2, 5], [4, 7]]
    • Expected Output:answer[] = [2, 3, 3]
    • Explanation:
      • For the first query [0, 3], the subarray is [9, 9, 8, 8], which has 2 distinct elements: {9, 8}.
      • For the second query [2, 5], the subarray is [8, 8, 7, 6], which has 3 distinct elements: {8, 7, 6}.
      • For the third query [4, 7], the subarray is [7, 6, 6, 5], which has 3 distinct elements: {7, 6, 5}.

Try it yourself

Try solving this question here:

✅ Solution Distinct elements in subarray

Problem Statement

You are given an integer array nums[] and a list of queries queries[], where each query is defined as queries[i] = [lefti, righti]. For each query, determine the number of distinct integers in the subarray of nums that starts at index lefti and ends at index righti for each query.

Return an array answer[] where answer[i] is the result for the ith query.

Examples

  1. Example 1:

    • Input: nums = [2, 1, 2, 3, 4], queries = [[0, 2], [1, 3], [2, 4]]
    • Expected Output: [2, 3, 3]
    • Explanation:
      • For the first query [0, 2], the subarray is [2, 1, 2], which has 2 distinct elements: {2, 1}.
      • For the second query [1, 3], the subarray is [1, 2, 3], which has 3 distinct elements: {1, 2, 3}.
      • For the third query [2, 4], the subarray is [2, 3, 4], which has 3 distinct elements: {2, 3, 4}.
  2. Example 2:

    • Input: nums = [5, 3, 5, 7, 5, 3, 8], queries = [[0, 5], [2, 4], [3, 6]]
    • Expected Output: [3, 2, 4]
    • Explanation:
      • For the first query [0, 5], the subarray is [5, 3, 5, 7, 5, 3], which has 3 distinct elements: {5, 3, 7}.
      • For the second query [2, 4], the subarray is [5, 7, 5], which has 2 distinct elements: {5, 7}.
      • For the third query [3, 6], the subarray is [7, 5, 3, 8], which has 4 distinct elements: {7, 5, 3, 8}.
  3. Example 3:

    • Input: nums[] = [9, 9, 8, 8, 7, 6, 6, 5], queries[] = [[0, 3], [2, 5], [4, 7]]
    • Expected Output:answer[] = [2, 3, 3]
    • Explanation:
      • For the first query [0, 3], the subarray is [9, 9, 8, 8], which has 2 distinct elements: {9, 8}.
      • For the second query [2, 5], the subarray is [8, 8, 7, 6], which has 3 distinct elements: {8, 7, 6}.
      • For the third query [4, 7], the subarray is [7, 6, 6, 5], which has 3 distinct elements: {7, 6, 5}.

Solution

To solve this problem, we use Mo's algorithm, an efficient method designed to handle range queries. Mo's algorithm works by sorting the queries based on the block in which the left index falls (using the square root of the array length as the block size) and then sorting by the right index within each block.

Once the queries are sorted, the algorithm processes each query by adjusting the currentLeft and currentRight pointers to match the query's range. For each element added to or removed from the current subarray, the algorithm updates a frequency array to track how many times each element appears. If an element is added for the first time, the count of distinct elements is incremented; if it is removed completely, the count is decremented. After adjusting the pointers, the current count of distinct elements is stored as the result for the query.

This approach ensures that we compute the distinct element counts efficiently across multiple queries, leveraging the reduced pointer movement to optimize performance.

Step-by-Step Algorithm

  1. Initialization:

    • Step 1.1: Calculate the size of the input array nums[] and store it in n.
    • Step 1.2: Calculate the number of queries in queries[] and store it in q.
    • Step 1.3: Create an array answer[] of size q to store the results of each query.
    • Step 1.4: Determine the block size for Mo's algorithm by calculating the square root of n and store it in sqrtN.
    • Step 1.5: Create an array queryArr[] of size q to store the queries as Query objects with additional information (left, right, and index).
  2. Query Conversion:

    • Step 2.1: For each query in queries[], convert it into a Query object by storing left, right, and the original index of the query.
    • Step 2.2: Store each Query object into queryArr[].
  3. Sorting Queries:

    • Step 3.1: Sort queryArr[] based on the block number of left (left / sqrtN) and, within the same block, by right.
    • Step 3.2: This sorting helps in minimizing the movements of the pointers when processing the queries, which is key to optimizing Mo's algorithm.
  4. Frequency Array Initialization:

    • Step 4.1: Create a frequency array freq[] of size 100001 to keep track of the frequency of elements in the current subarray.
    • Step 4.2: Initialize two pointers currentLeft and currentRight to 0. These pointers represent the current range of the subarray.
    • Step 4.3: Initialize currentDistinctCount to 0. This variable tracks the number of distinct elements in the current subarray.
  5. Processing Queries:

    • Step 5.1: For each query in queryArr[], get the left and right values.
    • Step 5.2: Expand currentRight: Move the currentRight pointer from its current position to right.
      • Step 5.2.1: For each element added to the subarray, if its frequency in freq[] was 0 before adding, increment currentDistinctCount.
      • Step 5.2.2: Update the frequency of the element in freq[] by incrementing it.
    • Step 5.3: Shrink currentRight: Move the currentRight pointer leftwards if it exceeds right.
      • Step 5.3.1: For each element removed from the subarray, if its frequency in freq[] becomes 0 after removing, decrement currentDistinctCount.
      • Step 5.3.2: Update the frequency of the element in freq[] by decrementing it.
    • Step 5.4: Expand currentLeft: Move the currentLeft pointer from its current position to left.
      • Step 5.4.1: For each element added to the subarray, if its frequency in freq[] was 0 before adding, increment currentDistinctCount.
      • Step 5.4.2: Update the frequency of the element in freq[] by incrementing it.
    • Step 5.5: Shrink currentLeft: Move the currentLeft pointer rightwards if it exceeds left.
      • Step 5.5.1: For each element removed from the subarray, if its frequency in freq[] becomes 0 after removing, decrement currentDistinctCount.
      • Step 5.5.2: Update the frequency of the element in freq[] by decrementing it.
    • Step 5.6: Store the value of currentDistinctCount in answer[] at the index corresponding to the original query.
  6. Return Result:

    • Step 6.1: Return the answer[] array containing the number of distinct elements for each query.

Algorithm Walkthrough

Let's go through the algorithm step by step for the given input:

  • Input:
    • nums = [2, 1, 2, 3, 4]
    • queries = [[0, 2], [1, 3], [2, 4]]
Image
Image

Initial Setup

  • n = 5 (length of nums)
  • q = 3 (number of queries)
  • sqrtN = 2 (block size, calculated as sqrt(5))
  • Initialize answer[] = [0, 0, 0]
  • Create queryArr[] with the following entries:
    • Query(left=0, right=2, index=0)
    • Query(left=1, right=3, index=1)
    • Query(left=2, right=4, index=2)

Sorting Queries

The queries are sorted by their left block and then by the right index. After sorting:

  • queryArr[] = [Query(left=0, right=2, index=0), Query(left=1, right=3, index=1), Query(left=2, right=4, index=2)]

Processing Queries

Query 1: [0, 2]

  • Current State:
    • currentLeft = 0
    • currentRight = 0
    • currentDistinctCount = 0
    • freq[] = [0, 0, 0, 0, 0, ...]
  • Operations:
    • Expand currentRight to 2:
      • currentRight = 0: nums[0] = 2
        • freq[2] = 0, Update currentDistinctCount = 1
        • Update freq[] = [0, 0, 1, 0, 0, ...]
      • currentRight = 1: nums[1] = 1
        • freq[1] = 0, Update currentDistinctCount = 2
        • Update freq[] = [0, 1, 1, 0, 0, ...]
      • currentRight = 2: nums[2] = 2
        • freq[2] = 1, currentDistinctCount = 2 (no change, 2 was already in the subarray)
        • Update freq[] = [0, 1, 2, 0, 0, ...]
  • Store Result:
    • answer[0] = 2

Query 2: [1, 3]

  • Current State:
    • currentLeft = 0
    • currentRight = 2
    • currentDistinctCount = 2
    • freq[] = [0, 1, 2, 0, 0, ...]
  • Operations:
    • Expand currentRight to 3:
      • currentRight = 3: nums[3] = 3
        • freq[3] = 0, Update currentDistinctCount = 3
        • Update req[] = [0, 1, 2, 1, 0, ...]`
    • Shrink currentLeft to 1:
      • currentLeft = 0: Remove nums[0] = 2
        • freq[2] = 2, currentDistinctCount = 3 (no change, 2 still exists in the subarray)
        • Update req[] = [0, 1, 1, 1, 0, ...]
  • Store Result:
    • answer[1] = 3

Query 3: [2, 4]

  • Current State:
    • currentLeft = 1
    • currentRight = 3
    • currentDistinctCount = 3
    • freq[] = [0, 1, 1, 1, 0, ...]
  • Operations:
    • Expand currentRight to 4:
      • currentRight = 4: nums[4] = 4
        • freq[4] = 0, Update currentDistinctCount = 4
        • Update req[] = [0, 1, 1, 1, 1, ...]
    • Shrink currentLeft to 2:
      • currentLeft = 1: Remove nums[1] = 1
        • freq[1] = 1, Update currentDistinctCount = 3 (decrement, 1 is no longer in the subarray)
        • Update req[] = [0, 0, 1, 1, 1, ...]
  • Store Result:
    • answer[2] = 3

Final Result

  • The answer[] array will contain [2, 3, 3].

Code

java
import java.util.*;

class Solution {

  static class Query {

    int left, right, index;

    // Constructor to initialize query details
    Query(int left, int right, int index) {
      this.left = left;
      this.right = right;
      this.index = index;
    }
  }

  public int[] distinctElements(int[] nums, int[][] queries) {
    int n = nums.length;
    int q = queries.length;
    int[] answer = new int[q];
    int sqrtN = (int) Math.sqrt(n); // Determine the block size for Mo's algorithm

    Query[] queryArr = new Query[q];
    for (int i = 0; i < q; i++) {
      queryArr[i] = new Query(queries[i][0], queries[i][1], i);
    }

    // Sort queries by the block of `left` and then by `right` within each block
    Arrays.sort(queryArr, (a, b) -> {
      if (a.left / sqrtN != b.left / sqrtN) {
        return a.left / sqrtN - b.left / sqrtN;
      } else {
        return a.right - b.right;
      }
    });

    int[] freq = new int[100001]; // Frequency array for tracking element counts
    int currentLeft = 0,
      currentRight = 0,
      currentDistinctCount = 0;

    for (Query query : queryArr) {
      int left = query.left;
      int right = query.right;

      // Expand the current right boundary
      while (currentRight <= right) {
        if (freq[nums[currentRight]] == 0) currentDistinctCount++;
        freq[nums[currentRight]]++;
        currentRight++;
      }

      // Shrink the current right boundary
      while (currentRight > right + 1) {
        currentRight--;
        if (freq[nums[currentRight]] == 1) currentDistinctCount--;
        freq[nums[currentRight]]--;
      }

      // Expand the current left boundary
      while (currentLeft > left) {
        currentLeft--;
        if (freq[nums[currentLeft]] == 0) currentDistinctCount++;
        freq[nums[currentLeft]]++;
      }

      // Shrink the current left boundary
      while (currentLeft < left) {
        if (freq[nums[currentLeft]] == 1) currentDistinctCount--;
        freq[nums[currentLeft]]--;
        currentLeft++;
      }

      // Store the result for this query
      answer[query.index] = currentDistinctCount;
    }

    return answer;
  }

  public static void main(String[] args) {
    Solution solution = new Solution();

    // Example 1
    int[] nums1 = { 2, 1, 2, 3, 4 };
    int[][] queries1 = { { 0, 2 }, { 1, 3 }, { 2, 4 } };
    System.out.println(
      Arrays.toString(solution.distinctElements(nums1, queries1))
    ); // [2, 3, 3]

    // Example 2
    int[] nums2 = { 5, 3, 5, 7, 5, 3, 8 };
    int[][] queries2 = { { 0, 5 }, { 2, 4 }, { 3, 6 } };
    System.out.println(
      Arrays.toString(solution.distinctElements(nums2, queries2))
    ); // [3, 2, 4]

    // Example 3
    int[] nums3 = { 9, 9, 8, 8, 7, 6, 6, 5 };
    int[][] queries3 = { { 0, 3 }, { 2, 5 }, { 4, 7 } };
    System.out.println(
      Arrays.toString(solution.distinctElements(nums3, queries3))
    ); // [2, 3, 3]
  }
}

Complexity Analysis

Time Complexity

The overall time complexity is , according to Mo's algorithm.

Space Complexity

  • The primary space usage is the freq array, which is of size , i.e., in this case. Additionally, we use space for storing the results (O(q)), and the queries themselves, which is .

Thus, the space complexity is , which simplifies to assuming a constant maximum element value.

🤖 Don't fully get this? Learn it with Claude

Stuck on Distinct elements in subarray? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.

🪜 Hint ladder (no spoilers)

Progressively stronger hints — you still solve it.

I'm working on the problem **Distinct elements in subarray** (DSA). Give me a HINT LADDER: start with the tiniest nudge, then wait. Only reveal the next, stronger hint when I ask. Do NOT show the full solution unless I type 'show solution'. Keep me doing the thinking. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🎨 Explain the approach visually

See the technique, not just code.

Explain the optimal approach to **Distinct elements in subarray** with a VISUAL walkthrough: trace it on a small concrete example using ASCII art / a step-by-step diagram, narrate what changes each step, then give time & space complexity with a one-line derivation. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🔍 Review my solution

Catch bugs, edge cases, sub-optimality.

I'll paste my solution to **Distinct elements in subarray**. Review it for correctness, missed edge cases, and time/space complexity, then coach me toward the optimal — don't just rewrite it. Ask me to paste my code now. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🔁 Drill the pattern

Lock in recognition with look-alikes.

Give me 2 problems that use the SAME underlying pattern as **Distinct elements in subarray**. For each, let me attempt first, then review my answer and name the trigger signal that reveals the pattern. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.

📝 My notes