medium Extra Characters in a String

Problem Statement

Given a string s and an array of words words. Break string s into multiple non-overlapping substrings such that each substring should be part of the words. There are some characters left which are not part of any substring.

Return the minimum number of remaining characters in s, which are not part of any substring after string break-up.

Examples

Example 1:
- Input: s = "amazingracecar", words = ["race", "car"]
- Expected Output: 7
- Justification: The string s can be rearranged to form "racecar", leaving 'a', 'm', 'a', 'z', 'i', 'n', 'g' as extra.
Example 2:
- Input: s = "bookkeeperreading", words = ["keep", "read"]
- Expected Output: 9
- Justification: The words "keep" and "read" can be formed from s, but 'b', 'o', 'o', 'k', 'e', 'r', 'i', 'n', 'g' are extra.
Example 3:
- Input: s = "thedogbarksatnight", words = ["dog", "bark", "night"]
- Expected Output: 6
- Justification: The words "dog", "bark", and "night" can be formed, leaving 't', 'h', 'e', 's', 'a', 't' as extra characters.

Constraints:

1 <= str.length <= 50
1 <= dictionary.length <= 50
1 <= dictionary[i].length <= 50
dictionary[i] and s consists of only lowercase English letters
dictionary contains distinct words

Try it yourself

Try solving this question here:

✅ Solution Extra Characters in a String

Problem Statement

Return the minimum number of remaining characters in s, which are not part of any substring after string break-up.

Examples

Example 1:
- Input: s = "amazingracecar", dictionary = ["race", "car"]
- Expected Output: 7
- Justification: The string s can be rearranged to form "racecar", leaving 'a', 'm', 'a', 'z', 'i', 'n', 'g' as extra.
Example 2:
- Input: s = "bookkeeperreading", dictionary = ["keep", "read"]
- Expected Output: 9
- Justification: The words "keep" and "read" can be formed from s, but 'b', 'o', 'o', 'k', 'e', 'r', 'i', 'n', 'g' are extra.
Example 3:
- Input: s = "thedogbarksatnight", dictionary = ["dog", "bark", "night"]
- Expected Output: 6
- Justification: The words "dog", "bark", and "night" can be formed, leaving 't', 'h', 'e', 's', 'a', 't' as extra characters.

Constraints:

1 <= str.length <= 50
1 <= dictionary.length <= 50
1 <= dictionary[i].length <= 50
dictionary[i] and s consists of only lowercase English letters
dictionary contains distinct words

Solution

To solve the "Extra Characters in a String" problem, we utilize Dynamic Programming (DP) to efficiently determine the minimum number of extra characters that cannot be part of any word in the given dictionary. DP helps in breaking down the problem into smaller subproblems, ensuring that each subproblem is solved only once and its result is reused, thereby optimizing the overall computation.

We explore four distinct approaches to implement this DP solution:

Solution 1: Top Down Dynamic Programming with Substring Method
Solution 2: Bottom Up Dynamic Programming with Substring Method
Solution 3: Top Down Dynamic Programming with Trie
Solution 4: Bottom Up Dynamic Programming with Trie

Solution 1: Top Down Dynamic Programming with Substring Method

In the Top Down Dynamic Programming with Substring Method, we recursively explore all possible substrings of the input string s starting from the first character. At each step, we check if the current substring exists in the dictionary. If it does, we recursively solve the remaining part of the string. We use memoization to store the results of subproblems to avoid redundant computations. The goal is to minimize the number of extra characters by maximizing the number of dictionary words used.

This approach systematically breaks down the problem by considering every possible split of the string s. By leveraging memoization, we ensure that each substring is processed only once, significantly reducing the time complexity. This method is intuitive and straightforward, making it easy to implement and understand.

Step-by-Step Algorithm

Initialize Data Structures:
- Create a memoization array memo of size length to store the results for each starting index in the string.
- Convert the dictionary into a HashSet (wordSet) for quick lookup of words.
Recursive Function solve(index, length, s):
- Base Case: If index equals length (end of the string), return 0 because there are no extra characters left.
- Check Memoization: If the result for index is already computed (memo[index] != null), return the cached result.
- Count Extra Characters:
  - Call solve(index + 1, length, s) to treat the current character as extra and move to the next character. Add 1 to account for this extra character.
- Check Substrings:
  - Iterate over possible substrings starting at index:
    - Extract the substring s.substring(index, end + 1) and check if it exists in wordSet.
    - If the substring exists, recursively call solve(end + 1, length, s) to process the rest of the string.
    - Update minExtra with the smaller value between the current minExtra and the result of the recursive call.
- Store Result:
  - Save the computed minExtra value in memo[index] and return it.
Compute the Result:
- Start by calling solve(0, length, s) to process the entire string.
Return Final Output:
- The result from solve(0, length, s) represents the minimum number of extra characters.

Code

java

import java.util.*;

class Solution {

  Integer[] memo; // Memoization array to store results for each index
  HashSet<String> wordSet; // HashSet for quick dictionary lookup

  public int minExtraChar(String s, String[] dictionary) {
    int length = s.length();
    memo = new Integer[length]; // Initialize memoization array
    wordSet = new HashSet<>(Arrays.asList(dictionary)); // Populate HashSet with dictionary words
    return solve(0, length, s);
  }

  private int solve(int index, int length, String s) {
    // Base case: when we reach the end of the string
    if (index == length) {
      return 0;
    }

    // Return the cached result if already computed
    if (memo[index] != null) {
      return memo[index];
    }

    // Count the current character as an extra character
    int minExtra = solve(index + 1, length, s) + 1;

    // Try forming substrings starting from the current index
    for (int end = index; end < length; end++) {
      String substring = s.substring(index, end + 1); // Current substring
      if (wordSet.contains(substring)) {
        // Check if the substring is in the dictionary
        minExtra = Math.min(minExtra, solve(end + 1, length, s)); // Update minimum extra characters
      }
    }

    // Store the result in the memo and return
    return memo[index] = minExtra;
  }

  public static void main(String[] args) {
    Solution solution = new Solution();

    // Example 1
    String s1 = "amazingracecar";
    String[] dictionary1 = { "race", "car" };
    System.out.println(
      "Minimum extra characters: " + solution.minExtraChar(s1, dictionary1)
    );
    // Expected Output: 7

    // Example 2
    String s2 = "bookkeeperreading";
    String[] dictionary2 = { "keep", "read" };
    System.out.println(
      "Minimum extra characters: " + solution.minExtraChar(s2, dictionary2)
    );
    // Expected Output: 9

    // Example 3
    String s3 = "thedogbarksatnight";
    String[] dictionary3 = { "dog", "bark", "night" };
    System.out.println(
      "Minimum extra characters: " + solution.minExtraChar(s3, dictionary3)
    );
    // Expected Output: 6
  }
}

Complexity Analysis

Time Complexity

Number of States:
- The recursive function solve is called for each unique starting index, resulting in states where N is the length of the string s.
Substring Iteration:
- For each state, the algorithm iterates through substrings from index to length, which takes time in the worst case.
Substring Creation:
- Extracting a substring using s.substring(start, end + 1) takes time.
Overall Time Complexity:
- Combining the above factors, the total time complexity is $³$ .

Space Complexity

Memoization Array:
- The memo array stores results for up to N indices, consuming space.
HashSet for Dictionary:
- The wordSet stores up to K words, each with an average length of M, resulting in space.
Call Stack:
- The recursion depth can reach up to N in the worst case, requiring space for the call stack.
Overall Space Complexity:
- The total space complexity is .

Solution 2: Bottom-Up Dynamic Programming with Substring Method

In this approach, the main idea is to minimize the number of extra characters by leveraging the dictionary words to form substrings from the given string s. We iterate over the string from the end to the beginning, calculating the minimum extra characters needed for every starting index.

We use a DP array (dp) where dp[start] represents the minimum extra characters required for the substring s[start:]. Initially, for every index, we assume the current character is extra. Then, we check all possible substrings starting at the current index. If a substring matches a word in the dictionary, we update dp[start] to reflect the better result. Finally, the value at dp[0] gives the minimum extra characters needed for the entire string.

Compared to the top-down approach, this method avoids recursion and uses iteration, making it straightforward and stack-memory efficient. It iteratively calculates the result starting from the end of the string and moves towards the beginning.

Step-by-Step Algorithm

Convert Dictionary to HashSet:
- Create a HashSet called wordSet from the dictionary for fast lookup of substrings.
Initialize DP Array:
- Create a dp array of size length + 1 (where length is the length of s).
- dp[start] will store the minimum extra characters required for the substring s[start:].
Iterate Backward Over the String:
- Loop from start = length - 1 to 0:
  - Initialize dp[start] = dp[start + 1] + 1.
    - This assumes the current character at start is extra.
Check Substrings:
- For each start, loop over all possible substrings starting at start:
  - This step Tries forming substrings starting from the current index.
  - For end ranging from start to length - 1:
    - Extract substring = s.substring(start, end + 1).
    - If substring exists in wordSet:
      - Update dp[start] = Math.min(dp[start], dp[end + 1]).
Return Final Result:
- The result is stored in dp[0], representing the minimum extra characters required for the entire string.

Algorithm Walkthrough

Input:

String: s = "thedogbarksatnight".
Dictionary: dictionary = ["dog", "bark", "night"].

Initialization:

String Length: length = 18.
Dictionary as HashSet: wordSet = {"dog", "bark", "night"}.
DP Array: dp = [0, 0, ..., 0] (size 19, initialized with 0).

Steps:

Step 1: Start iterating from the end of the string.

At start = 18 (beyond the end of the string):
- dp[18] = 0 (Base case).
At start = 17 (character = 't'):
- Initialize dp[17] = dp[18] + 1 = 1.
- Substrings starting at 17: "t".
  - None of the substrings match wordSet.
- Result: dp[17] = 1.
At start = 16 (character = 'h'):
- Initialize dp[16] = dp[17] + 1 = 2.
- Substrings starting at 16: "h", "ht".
  - None of the substrings match wordSet.
- Result: dp[16] = 2.
At start = 15 (character = 'g'):
- Initialize dp[15] = dp[16] + 1 = 3.
- Substrings starting at 15: "g", "gh", "ght".
  - None of the substrings match wordSet.
- Result: dp[15] = 3.
At start = 14 (character = 'i'):
- Initialize dp[14] = dp[15] + 1 = 4.
- Substrings starting at 14: "i", "ig", "igh", "ight".
  - None of the substrings match wordSet.
- Result: dp[14] = 4.

Step 2: Handle substring "night" at index 13.

At start = 13 (character = 'n'):
- Initialize dp[13] = dp[14] + 1 = 5.
- Substrings starting at 13:
  - "n", "ni", "nig", "nigh", "night".
    - "night" matches wordSet.
    - Update dp[13] = dp[18] = 0.
- Result: dp[13] = 0.
At start = 12 (character = 't'):
- Initialize dp[12] = dp[13] + 1 = 1.
- Substrings starting at 12: "t", "tn", "tni", "tnig", "tnigh".
  - None of the substrings match wordSet.
- Result: dp[12] = 1.

Step 3: Handle substring "bark" at index 6.

At start = 6 (character = 'b'):
- Initialize dp[6] = dp[7] + 1 = 6 + 1 = 7.
- Substrings starting at 6:
  - "b", "ba", "bar", "bark".
    - "bark" matches wordSet.
    - Update dp[6] = dp[10] = 3.
- Result: dp[6] = 3.
At start = 5 (character = 'g'):
- Initialize dp[5] = dp[6] + 1 = 4.
- Substrings starting at 5: "g", "gb", "gba", "gbar", "gbark".
  - None of the substrings match wordSet.
- Result: dp[5] = 4.

Step 4: Handle substring "dog" at index 3.

At start = 3 (character = 'd'):
- Initialize dp[3] = dp[4] + 1 = 5 + 1 = 6.
- Substrings starting at 3:
  - "d", "do", "dog".
    - "dog" matches wordSet.
    - Update dp[3] = dp[6] = 3.
- Result: dp[3] = 3.
At start = 2 (character = 'e'):

Initialize dp[2] = dp[3] + 1 = 4.
Substrings starting at 2: "e", "ed", "edo", "edog", etc.
- None of the substrings match wordSet.
Result: dp[2] = 4.

Step 5: Continue Backward to the Beginning.

At start = 0 (character = 't'):
- Initialize dp[0] = dp[1] + 1 = 5 + 1 = 6.
- Substrings starting at 0:
  - "t", "th", "the", "thed", "thedo", "thedog".
    - No valid substring.
- Result: dp[0] = 6.

Final DP Array:

dp = [6, 5, 4, 3, 5, 4, 3, 6, 5, 4, 3, 2, 1, 0, 4, 3, 2, 1, 0].

Final Result:

Minimum Extra Characters: dp[0] = 6.
Explanation:
- "dog", "bark", and "night" are matched from the dictionary.
- Extra characters are 't', 'h', 'e', 's', 'a', 't'.

Code

java

import java.util.*;

class Solution {

  public int minExtraChar(String s, String[] dictionary) {
    int length = s.length(); // Length of the string
    var wordSet = new HashSet<>(Arrays.asList(dictionary)); // Convert dictionary to HashSet for quick lookups
    var dp = new int[length + 1]; // DP array to store minimum extra characters from each index

    // Start iterating from the end of the string to the beginning
    for (int start = length - 1; start >= 0; start--) {
      // Initialize dp[start] by counting the current character as extra
      dp[start] = dp[start + 1] + 1;

      // Try forming substrings starting from the current index
      for (int end = start; end < length; end++) {
        String substring = s.substring(start, end + 1); // Extract substring
        if (wordSet.contains(substring)) {
          // Check if substring exists in the dictionary
          dp[start] = Math.min(dp[start], dp[end + 1]); // Update dp[start] with the minimum value
        }
      }
    }

    // The result is the minimum extra characters starting from index 0
    return dp[0];
  }

  public static void main(String[] args) {
    Solution solution = new Solution();

    // Example 1
    String s1 = "amazingracecar";
    String[] dictionary1 = { "race", "car" };
    System.out.println(
      "Minimum extra characters: " + solution.minExtraChar(s1, dictionary1)
    );
    // Expected Output: 7

    // Example 2
    String s2 = "bookkeeperreading";
    String[] dictionary2 = { "keep", "read" };
    System.out.println(
      "Minimum extra characters: " + solution.minExtraChar(s2, dictionary2)
    );
    // Expected Output: 9

    // Example 3
    String s3 = "thedogbarksatnight";
    String[] dictionary3 = { "dog", "bark", "night" };
    System.out.println(
      "Minimum extra characters: " + solution.minExtraChar(s3, dictionary3)
    );
    // Expected Output: 6
  }
}

Complexity Analysis

Time Complexity

Outer Loop:
- Iterates over all starting indices of the string, contributing operations.
Inner Loop:
- For each start, the inner loop iterates over all possible substrings starting at start, contributing operations.
Substring Extraction:
- Extracting substrings with s.substring(start, end + 1) takes time in the worst case.
Overall Time Complexity:
- Combining all the factors, the time complexity is $³$ .

Space Complexity

DP Array:
- The dp array of size N + 1 contributes space.
HashSet for Dictionary:
- Storing the dictionary in a HashSet requires space, where M is the average length of words in the dictionary and K is the number of words.
Overall Space Complexity: .

Solution 3: Top Down Dynamic Programming with Trie

This solution uses a top-down dynamic programming approach with a Trie to efficiently determine the number of extra characters in a given substring that cannot form words in the dictionary. The Trie allows us to quickly validate substrings against dictionary words, while dynamic programming ensures that overlapping subproblems are solved only once.

The approach works as follows:

First, a Trie is built from the given dictionary words, where each node in the Trie represents a character in a dictionary word, and the end of a word is marked with a flag.
Then, the string s is processed recursively from left to right using a dp function. At each index, the function either counts the current character as extra and moves to the next index, or it tries to match substrings starting at the current index with words in the Trie. If a match is found, the remaining portion of the string is processed recursively.
Memoization is used to store the results of subproblems for each starting index, ensuring that each state is computed only once.

This approach is optimal because it combines the power of Tries (for fast substring matching) and dynamic programming (to reuse previous computations). By avoiding the repeated recomputation of substring matches, the solution achieves a significant speedup compared to naive methods.

Step-by-Step Algorithm

Build a Trie from the Dictionary:
- Create a root node for the Trie.
- For each word in the dictionary:
  - Traverse through its characters.
  - Add each character as a child node of the current node if it doesn't already exist.
  - Mark the last node of the word as the end of a word.
Initialize the DP Array:
- Create a memo array of size n + 1, where n is the length of the string s.
- Initialize all values in memo to null to indicate uncomputed states.
Recursive DP Function:
- Define a function dp(start) that computes the number of extra characters in the substring starting at index start.
- Base Case:
  - If start == n, return 0, as no characters are left to process.
- Memoization:
  - If memo[start] is not null, return its value to avoid redundant computation.
- Count Extra Characters:
  - Assume the character at start is extra and move to the next index, initializing minExtra as dp(start + 1) + 1.
- Trie Matching:
  - Starting from the root of the Trie, traverse the Trie while iterating through characters of the substring s[start:].
  - If a Trie node doesn't exist for a character, break out of the loop, as there is not valid substring starting from the current index.
  - If a Trie node marks the end of a word, update minExtra with a minimum of minExtra and result returned after recursively processing the remaining substring.
Return the Final Result:
- Call dp(0) to compute the result for the entire string.

Algorithm Walkthrough

Input:

String: s = "thedogbarksatnight".
Dictionary: dictionary = ["dog", "bark", "night"].

Step 1: Trie Construction

Insert "dog" into the Trie:
- Add nodes for 'd', 'o', 'g'.
- Mark 'g' as the end of a word.
Insert "bark" into the Trie:
- Add nodes for 'b', 'a', 'r', 'k'.
- Mark 'k' as the end of a word.
Insert "night" into the Trie:
- Add nodes for 'n', 'i', 'g', 'h', 't'.
- Mark 't' as the end of a word.

Step 2: Start DP Recursion from Index 0

The recursion starts with start = 0.

Call dp(0):
- Base Case:
  - The function calls itself recursively with start = 1:
    - This pattern continues until start = 18.
At start = 18:
- This is the base case: start == length. No characters remain, so return 0.

Step 3: Process Substrings from the End

After reaching the base case, the recursion starts processing substrings backward (from start = 17 to start = 0).

Processing start = 17:
- The character is 't'.
- Initialize minExtra = dp(18) + 1 = 1. Assume 't' is extra.
- No valid substrings start from 't', as it does not match any word in the Trie.
- Store and Return: memo[17] = 1.
Processing start = 16:
- The character is 'h'.
- Initialize minExtra = dp(17) + 1 = 2. Assume 'h' is extra.
- No valid substrings start from 'h', as it does not match any word in the Trie.
- Store and Return: memo[16] = 2.
Processing start = 15:
- The character is 'g'.
- Initialize minExtra = dp(16) + 1 = 3. Assume 'g' is extra.
- No valid substrings start from 'g', as it does not match any word in the Trie.
- Store and Return: memo[15] = 3.

Step 4: Process Substring "night" at start = 13

memo = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 3, 2, 1, 0]

Processing start = 13:
- The character is 'n'.
- Initialize minExtra = dp(14) + 1 = 4. Assume 'n' is extra.
- Trie Traversal:
  - Substrings:
    - "n", "ni", "nig", "nigh", "night".
    - "night" matches in the Trie.
    - Update minExtra = dp(18) = 0.
- Store and Return: memo[13] = 0.

Step 5: Process Substring "bark" at start = 6

memo = [ 0, 0, 0, 0, 0, 0, 3, 6, 5, 4, 3, 2, 1, 0, 4, 3, 2, 1, 0]

Processing start = 6:
- The character is 'b'.
- Initialize minExtra = dp(7) + 1 = 6 + 1 = 7. Assume 'b' is extra.
- Trie Traversal:
  - Substrings:
    - "b", "ba", "bar", "bark".
    - "bark" matches in the Trie.
    - Update minExtra = dp(10) = 3.
- Store and Return: memo[6] = 3.

Step 6: Process Substring "dog" at start = 3

memo = [ 0, 0, 0, 0, 5, 4, 3, 6, 5, 4, 3, 2, 1, 0, 4, 3, 2, 1, 0]

Processing start = 3:
- The character is 'd'.
- Initialize minExtra = dp(4) + 1 = 5 + 1 = 6. Assume 'd' is extra.
- Trie Traversal:
  - Substrings:
    - "d", "do", "dog".
    - "dog" matches in the Trie.
    - Update minExtra = dp(6) = 3.
- Store and Return: memo[3] = 3.

Step 7: Process Substring from start = 0

Processing start = 0:
- The character is 't'.
- Initialize minExtra = dp(1) + 1 = 5 + 1 = 6. Assume 't' is extra.
- Trie Traversal:
  - Substrings:
    - "t", "th", "the", "thed", "thedo", "thedog".
    - None of substrings are valid.
- Return: memo[0] = 6.

Final Result

Memoization Array:

memo = [ 6, 5, 4, 3, 5, 4, 3, 6, 5, 4, 3, 2, 1, 0, 4, 3, 2, 1, 0]

Output: 6, indicating six extra characters ('t', 'h', 'e', 's', 'a', 't').

Code

java

import java.util.*;

// class TrieNode {
//     TrieNode[] children = new TrieNode[26]; // Array to hold child nodes for each letter
//     boolean isEnd = false; // Flag to mark the end of a word
// }

class Solution {

  TrieNode root; // Root of the Trie
  Integer[] memo; // Memoization array to store results for each index

  public int minExtraChar(String s, String[] dictionary) {
    int length = s.length();
    root = buildTrie(dictionary); // Build the Trie using the dictionary words
    memo = new Integer[length + 1]; // Initialize memoization array
    return dp(0, length, s); // Start the top-down DP from index 0
  }

  private int dp(int start, int length, String s) {
    // Base case: If we reach the end of the string, no extra characters are left
    if (start == length) {
      return 0;
    }

    // If the result for this starting index is already computed, return it
    if (memo[start] != null) {
      return memo[start];
    }

    // Initialize the Trie traversal from the root
    TrieNode node = root;

    // Count the current character as extra and move to the next index
    int minExtra = dp(start + 1, length, s) + 1;

    // Traverse the Trie to check for valid dictionary substrings
    for (int end = start; end < length; end++) {
      char c = s.charAt(end); // Current character
      if (node.children[c - 'a'] == null) {
        // If no child exists for this character, break
        break;
      }
      node = node.children[c - 'a']; // Move to the next Trie node

      // If the current node marks the end of a valid word in the dictionary
      if (node.isEnd) {
        // Update the minimum extra characters by recursively processing the remaining substring
        minExtra = Math.min(minExtra, dp(end + 1, length, s));
      }
    }

    // Store the computed result in memoization array and return it
    return memo[start] = minExtra;
  }

  private TrieNode buildTrie(String[] dictionary) {
    TrieNode root = new TrieNode(); // Initialize the root of the Trie
    for (String word : dictionary) {
      TrieNode node = root; // Start from the root for each word
      for (char c : word.toCharArray()) {
        // Create a new Trie node if it doesn't already exist
        if (node.children[c - 'a'] == null) {
          node.children[c - 'a'] = new TrieNode();
        }
        node = node.children[c - 'a']; // Move to the next node
      }
      node.isEnd = true; // Mark the end of the word
    }
    return root;
  }

  public static void main(String[] args) {
    Solution solution = new Solution();

    // Example 1
    String s1 = "amazingracecar";
    String[] dictionary1 = { "race", "car" };
    System.out.println(
      "Minimum extra characters: " + solution.minExtraChar(s1, dictionary1)
    );
    // Expected Output: 7

    // Example 2
    String s2 = "bookkeeperreading";
    String[] dictionary2 = { "keep", "read" };
    System.out.println(
      "Minimum extra characters: " + solution.minExtraChar(s2, dictionary2)
    );
    // Expected Output: 9

    // Example 3
    String s3 = "thedogbarksatnight";
    String[] dictionary3 = { "dog", "bark", "night" };
    System.out.println(
      "Minimum extra characters: " + solution.minExtraChar(s3, dictionary3)
    );
    // Expected Output: 6
  }
}

Complexity Analysis

Time Complexity

DP Method:
- The dp method is called for N + 1 unique states, where N is the length of the string.
- Each state involves iterating over the substring starting at the current index, which takes time in the worst case.
- Thus, the cost of the dp method is .
Building the Trie:
- Constructing the Trie requires inserting K words, where each word has an average length of M.
- The cost of building the Trie is .

Overall Time Complexity: .

Space Complexity

Trie:
- The Trie requires space to store the dictionary words.
Memoization:
- The memo array has a size of N + 1, consuming space.
Recursive Stack:
- The recursion depth is at most N, adding stack space.

Overall Space Complexity: .

Solution 4: Bottom Up Dynamic Programming with Trie

This solution uses bottom-up dynamic programming with a Trie to efficiently calculate the number of extra characters in a given string that cannot form words from a provided dictionary. The Trie enables fast lookups for dictionary words while avoiding the need to repeatedly recompute substring matches. The dynamic programming (DP) array ensures that the solution is computed incrementally from the end of the string toward the beginning.

Here’s the approach:

Trie Construction: A Trie is built from the dictionary, where each character of a word is represented as a node. Words that exist in the dictionary are marked with an isEnd flag at the last node.
DP Initialization: A DP array dp is created to store the minimum extra characters for substrings starting from different indices.
Iterative Processing: Starting from the last character of the string, each character is assumed to be extra initially. Then, substrings starting from the current index are checked against the Trie. If a substring matches a word in the Trie, the DP value is updated to reflect fewer extra characters.
Result: The final result is stored in dp[0], representing the minimum extra characters required for the entire string.

This approach is optimal because it combines efficient substring matching using the Trie and incremental computation with the DP array.

Step-by-Step Algorithm

Trie Construction:
- Initialize the root of the Trie.
- For each word in the dictionary:
  - Start at the root of the Trie.
  - For each character in the word:
    - If the character doesn’t exist as a child of the current node, create a new node.
    - Move to the child node representing the character.
  - Mark the last node as the end of the word.
Initialize DP Array:
- Create a DP array dp of size n + 1, where n is the length of the string s.
- Set dp[n] = 0 since no extra characters are required after the end of the string.
To calculate the minimum extra characters for each substring starting at every index, we process the string iteratively from the last character to the first:

a. Initialize Extra Characters:
- Assume that the character at the current index start is extra.
- Set dp[start] = dp[start + 1] + 1. This means that the cost for the substring starting at start is at least one more than the cost for the substring starting at start + 1.
b. Traverse the Trie:
- Begin from the root of the Trie to validate substrings starting from start.
- Iterate through each possible end index starting from start to n - 1:
  - Check Trie Validity:
    - If the character s[end] is not a child of the current Trie node, break the loop, as no further substring starting at start can match a dictionary word.
  - Move to the Next Trie Node:
    - If the character s[end] exists in the Trie, move to the child node for s[end].
  - Check for a Complete Word:
    - If the current Trie node marks the end of a word, this substring matches a dictionary word.
    - Update dp[start] = min(dp[start], dp[end + 1]), as including this valid word reduces the number of extra characters needed for the remaining substring.
c. Continue to the Next Start Index:
- Repeat the process for the next start index until the first character is processed. The final result will be stored in dp[0].

Return the Result:
- Return dp[0] as the result, which represents the minimum extra characters for the entire string.

Code

java

// class TrieNode {
//     TrieNode[] children = new TrieNode[26]; // Array to hold child nodes for each letter
//     boolean isEnd = false; // Flag to mark the end of a word
// }

class Solution {

  public int minExtraChar(String s, String[] dictionary) {
    int n = s.length();
    TrieNode root = buildTrie(dictionary); // Build the Trie from the dictionary
    int[] dp = new int[n + 1]; // DP array to store minimum extra characters for each index

    // Iterate from the end of the string to the beginning
    for (int start = n - 1; start >= 0; start--) {
      dp[start] = dp[start + 1] + 1; // Initialize dp[start] by counting the current character as extra
      TrieNode node = root; // Start Trie traversal from the root

      // Traverse the Trie to check valid dictionary substrings starting from 'start'
      for (int end = start; end < n; end++) {
        char c = s.charAt(end); // Current character in the substring
        if (node.children[c - 'a'] == null) {
          // If no child exists for the character, break
          break;
        }
        node = node.children[c - 'a']; // Move to the next Trie node

        // If the current node marks the end of a valid word in the dictionary
        if (node.isEnd) {
          // Update dp[start] with the minimum value
          dp[start] = Math.min(dp[start], dp[end + 1]);
        }
      }
    }

    // The result is stored in dp[0], which represents the entire string
    return dp[0];
  }

  private TrieNode buildTrie(String[] dictionary) {
    TrieNode root = new TrieNode();
    for (String word : dictionary) {
      TrieNode node = root; // Start from the root for each word
      for (char c : word.toCharArray()) {
        // Traverse through each character in the word
        // Create a new Trie node if it doesn't already exist
        if (node.children[c - 'a'] == null) {
          node.children[c - 'a'] = new TrieNode();
        }
        node = node.children[c - 'a']; // Move to the next node
      }
      node.isEnd = true; // Mark the end of the word
    }
    return root;
  }

  public static void main(String[] args) {
    Solution solution = new Solution();

    // Example 1
    String s1 = "amazingracecar";
    String[] dictionary1 = { "race", "car" };
    System.out.println(
      "Minimum extra characters: " + solution.minExtraChar(s1, dictionary1)
    );
    // Expected Output: 7

    // Example 2
    String s2 = "bookkeeperreading";
    String[] dictionary2 = { "keep", "read" };
    System.out.println(
      "Minimum extra characters: " + solution.minExtraChar(s2, dictionary2)
    );
    // Expected Output: 9

    // Example 3
    String s3 = "thedogbarksatnight";
    String[] dictionary3 = { "dog", "bark", "night" };
    System.out.println(
      "Minimum extra characters: " + solution.minExtraChar(s3, dictionary3)
    );
    // Expected Output: 6
  }
}

Complexity Analysis

Time Complexity

Dynamic Programming (DP) Operations:
- The dp array is computed for each index start in the string s. For each start, we iterate over potential end values (start to n - 1) to form substrings.
- This results in a total of $²$ iterations since the outer loop runs N times, and the inner loop runs up to N times in the worst case.
Building the Trie:
- For each word in the dictionary (total of K words), we insert its characters into the Trie. Each word has an average length of M.
- Thus, building the Trie takes .

Overall Time Complexity: $²$ .

Space Complexity

Trie Storage:
- The Trie stores K words, each with an average length of M. Each character is represented by a node in the Trie.
- Hence, the Trie requires space.
DP Array:
- The dp array has a size of N + 1 to store results for each index of the string.
- This contributes space.

Overall Space Complexity: .

🤖 Don't fully get this? Learn it with Claude

Stuck on Extra Characters in a String? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.

🪜 Hint ladder (no spoilers)

Progressively stronger hints — you still solve it.

I'm working on the problem **Extra Characters in a String** (DSA). Give me a HINT LADDER: start with the tiniest nudge, then wait. Only reveal the next, stronger hint when I ask. Do NOT show the full solution unless I type 'show solution'. Keep me doing the thinking. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.

🎨 Explain the approach visually

See the technique, not just code.

Explain the optimal approach to **Extra Characters in a String** with a VISUAL walkthrough: trace it on a small concrete example using ASCII art / a step-by-step diagram, narrate what changes each step, then give time & space complexity with a one-line derivation. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.

🔍 Review my solution

Catch bugs, edge cases, sub-optimality.

I'll paste my solution to **Extra Characters in a String**. Review it for correctness, missed edge cases, and time/space complexity, then coach me toward the optimal — don't just rewrite it. Ask me to paste my code now. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.

🔁 Drill the pattern

Lock in recognition with look-alikes.

Give me 2 problems that use the SAME underlying pattern as **Extra Characters in a String**. For each, let me attempt first, then review my answer and name the trigger signal that reveals the pattern. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.

📝 My notes

← Implement Trie (Prefix Tree)Search Suggestions System →