medium Extra Characters in a String
Problem Statement
Given a string s and an array of words words. Break string s into multiple non-overlapping substrings such that each substring should be part of the words. There are some characters left which are not part of any substring.
Return the minimum number of remaining characters in s, which are not part of any substring after string break-up.
Examples
-
Example 1:
- Input:
s = "amazingracecar",words = ["race", "car"] - Expected Output:
7 - Justification: The string
scan be rearranged to form "racecar", leaving 'a', 'm', 'a', 'z', 'i', 'n', 'g' as extra.
- Input:
-
Example 2:
- Input:
s = "bookkeeperreading",words = ["keep", "read"] - Expected Output:
9 - Justification: The words "keep" and "read" can be formed from
s, but 'b', 'o', 'o', 'k', 'e', 'r', 'i', 'n', 'g' are extra.
- Input:
-
Example 3:
- Input:
s = "thedogbarksatnight",words = ["dog", "bark", "night"] - Expected Output:
6 - Justification: The words "dog", "bark", and "night" can be formed, leaving 't', 'h', 'e', 's', 'a', 't' as extra characters.
- Input:
Constraints:
1 <= str.length <= 501 <= dictionary.length <= 501 <= dictionary[i].length <= 50dictionary[i]andsconsists of only lowercase English lettersdictionarycontains distinct words
Try it yourself
Try solving this question here:
✅ Solution Extra Characters in a String
Problem Statement
Given a string s and an array of words words. Break string s into multiple non-overlapping substrings such that each substring should be part of the words. There are some characters left which are not part of any substring.
Return the minimum number of remaining characters in s, which are not part of any substring after string break-up.
Examples
-
Example 1:
- Input:
s = "amazingracecar",dictionary = ["race", "car"] - Expected Output:
7 - Justification: The string
scan be rearranged to form "racecar", leaving 'a', 'm', 'a', 'z', 'i', 'n', 'g' as extra.
- Input:
-
Example 2:
- Input:
s = "bookkeeperreading",dictionary = ["keep", "read"] - Expected Output:
9 - Justification: The words "keep" and "read" can be formed from
s, but 'b', 'o', 'o', 'k', 'e', 'r', 'i', 'n', 'g' are extra.
- Input:
-
Example 3:
- Input:
s = "thedogbarksatnight",dictionary = ["dog", "bark", "night"] - Expected Output:
6 - Justification: The words "dog", "bark", and "night" can be formed, leaving 't', 'h', 'e', 's', 'a', 't' as extra characters.
- Input:
Constraints:
1 <= str.length <= 501 <= dictionary.length <= 501 <= dictionary[i].length <= 50dictionary[i]andsconsists of only lowercase English lettersdictionarycontains distinct words
Solution
To solve the "Extra Characters in a String" problem, we utilize Dynamic Programming (DP) to efficiently determine the minimum number of extra characters that cannot be part of any word in the given dictionary. DP helps in breaking down the problem into smaller subproblems, ensuring that each subproblem is solved only once and its result is reused, thereby optimizing the overall computation.
We explore four distinct approaches to implement this DP solution:
- Solution 1: Top Down Dynamic Programming with Substring Method
- Solution 2: Bottom Up Dynamic Programming with Substring Method
- Solution 3: Top Down Dynamic Programming with Trie
- Solution 4: Bottom Up Dynamic Programming with Trie
Solution 1: Top Down Dynamic Programming with Substring Method
In the Top Down Dynamic Programming with Substring Method, we recursively explore all possible substrings of the input string s starting from the first character. At each step, we check if the current substring exists in the dictionary. If it does, we recursively solve the remaining part of the string. We use memoization to store the results of subproblems to avoid redundant computations. The goal is to minimize the number of extra characters by maximizing the number of dictionary words used.
This approach systematically breaks down the problem by considering every possible split of the string s. By leveraging memoization, we ensure that each substring is processed only once, significantly reducing the time complexity. This method is intuitive and straightforward, making it easy to implement and understand.
Step-by-Step Algorithm
-
Initialize Data Structures:
- Create a memoization array
memoof sizelengthto store the results for each starting index in the string. - Convert the
dictionaryinto aHashSet(wordSet) for quick lookup of words.
- Create a memoization array
-
Recursive Function
solve(index, length, s):- Base Case: If
indexequalslength(end of the string), return0because there are no extra characters left. - Check Memoization: If the result for
indexis already computed (memo[index] != null), return the cached result. - Count Extra Characters:
- Call
solve(index + 1, length, s)to treat the current character as extra and move to the next character. Add1to account for this extra character.
- Call
- Check Substrings:
- Iterate over possible substrings starting at
index:- Extract the substring
s.substring(index, end + 1)and check if it exists inwordSet. - If the substring exists, recursively call
solve(end + 1, length, s)to process the rest of the string. - Update
minExtrawith the smaller value between the currentminExtraand the result of the recursive call.
- Extract the substring
- Iterate over possible substrings starting at
- Store Result:
- Save the computed
minExtravalue inmemo[index]and return it.
- Save the computed
- Base Case: If
-
Compute the Result:
- Start by calling
solve(0, length, s)to process the entire string.
- Start by calling
-
Return Final Output:
- The result from
solve(0, length, s)represents the minimum number of extra characters.
- The result from
Code
import java.util.*;
class Solution {
Integer[] memo; // Memoization array to store results for each index
HashSet<String> wordSet; // HashSet for quick dictionary lookup
public int minExtraChar(String s, String[] dictionary) {
int length = s.length();
memo = new Integer[length]; // Initialize memoization array
wordSet = new HashSet<>(Arrays.asList(dictionary)); // Populate HashSet with dictionary words
return solve(0, length, s);
}
private int solve(int index, int length, String s) {
// Base case: when we reach the end of the string
if (index == length) {
return 0;
}
// Return the cached result if already computed
if (memo[index] != null) {
return memo[index];
}
// Count the current character as an extra character
int minExtra = solve(index + 1, length, s) + 1;
// Try forming substrings starting from the current index
for (int end = index; end < length; end++) {
String substring = s.substring(index, end + 1); // Current substring
if (wordSet.contains(substring)) {
// Check if the substring is in the dictionary
minExtra = Math.min(minExtra, solve(end + 1, length, s)); // Update minimum extra characters
}
}
// Store the result in the memo and return
return memo[index] = minExtra;
}
public static void main(String[] args) {
Solution solution = new Solution();
// Example 1
String s1 = "amazingracecar";
String[] dictionary1 = { "race", "car" };
System.out.println(
"Minimum extra characters: " + solution.minExtraChar(s1, dictionary1)
);
// Expected Output: 7
// Example 2
String s2 = "bookkeeperreading";
String[] dictionary2 = { "keep", "read" };
System.out.println(
"Minimum extra characters: " + solution.minExtraChar(s2, dictionary2)
);
// Expected Output: 9
// Example 3
String s3 = "thedogbarksatnight";
String[] dictionary3 = { "dog", "bark", "night" };
System.out.println(
"Minimum extra characters: " + solution.minExtraChar(s3, dictionary3)
);
// Expected Output: 6
}
}
Complexity Analysis
Time Complexity
-
Number of States:
- The recursive function
solveis called for each unique starting index, resulting instates where Nis the length of the strings.
- The recursive function
-
Substring Iteration:
- For each state, the algorithm iterates through substrings from
indextolength, which takestime in the worst case.
- For each state, the algorithm iterates through substrings from
-
Substring Creation:
- Extracting a substring using
s.substring(start, end + 1)takestime.
- Extracting a substring using
-
Overall Time Complexity:
- Combining the above factors, the total time complexity is
.
- Combining the above factors, the total time complexity is
Space Complexity
-
Memoization Array:
- The
memoarray stores results for up toNindices, consumingspace.
- The
-
HashSet for Dictionary:
- The
wordSetstores up toKwords, each with an average length ofM, resulting inspace.
- The
-
Call Stack:
- The recursion depth can reach up to
Nin the worst case, requiringspace for the call stack.
- The recursion depth can reach up to
-
Overall Space Complexity:
- The total space complexity is
.
- The total space complexity is
Solution 2: Bottom-Up Dynamic Programming with Substring Method
In this approach, the main idea is to minimize the number of extra characters by leveraging the dictionary words to form substrings from the given string s. We iterate over the string from the end to the beginning, calculating the minimum extra characters needed for every starting index.
We use a DP array (dp) where dp[start] represents the minimum extra characters required for the substring s[start:]. Initially, for every index, we assume the current character is extra. Then, we check all possible substrings starting at the current index. If a substring matches a word in the dictionary, we update dp[start] to reflect the better result. Finally, the value at dp[0] gives the minimum extra characters needed for the entire string.
Compared to the top-down approach, this method avoids recursion and uses iteration, making it straightforward and stack-memory efficient. It iteratively calculates the result starting from the end of the string and moves towards the beginning.
Step-by-Step Algorithm
-
Convert Dictionary to HashSet:
- Create a
HashSetcalledwordSetfrom the dictionary for fast lookup of substrings.
- Create a
-
Initialize DP Array:
- Create a
dparray of sizelength + 1(wherelengthis the length ofs). dp[start]will store the minimum extra characters required for the substrings[start:].
- Create a
-
Iterate Backward Over the String:
- Loop from
start = length - 1to0:- Initialize
dp[start] = dp[start + 1] + 1.- This assumes the current character at
startis extra.
- This assumes the current character at
- Initialize
- Loop from
-
Check Substrings:
- For each
start, loop over all possible substrings starting atstart:- This step Tries forming substrings starting from the current index.
- For
endranging fromstarttolength - 1:- Extract
substring = s.substring(start, end + 1). - If
substringexists inwordSet:- Update
dp[start] = Math.min(dp[start], dp[end + 1]).
- Update
- Extract
- For each
-
Return Final Result:
- The result is stored in
dp[0], representing the minimum extra characters required for the entire string.
- The result is stored in
Algorithm Walkthrough
Input:
- String:
s = "thedogbarksatnight". - Dictionary:
dictionary = ["dog", "bark", "night"].
Initialization:
- String Length:
length = 18. - Dictionary as HashSet:
wordSet = {"dog", "bark", "night"}. - DP Array:
dp = [0, 0, ..., 0](size 19, initialized with 0).
Steps:
Step 1: Start iterating from the end of the string.
-
At
start = 18(beyond the end of the string):dp[18] = 0(Base case).
-
At
start = 17(character = 't'):- Initialize
dp[17] = dp[18] + 1 = 1. - Substrings starting at
17:"t".- None of the substrings match
wordSet.
- None of the substrings match
- Result:
dp[17] = 1.
- Initialize
-
At
start = 16(character = 'h'):- Initialize
dp[16] = dp[17] + 1 = 2. - Substrings starting at
16:"h","ht".- None of the substrings match
wordSet.
- None of the substrings match
- Result:
dp[16] = 2.
- Initialize
-
At
start = 15(character = 'g'):- Initialize
dp[15] = dp[16] + 1 = 3. - Substrings starting at
15:"g","gh","ght".- None of the substrings match
wordSet.
- None of the substrings match
- Result:
dp[15] = 3.
- Initialize
-
At
start = 14(character = 'i'):- Initialize
dp[14] = dp[15] + 1 = 4. - Substrings starting at
14:"i","ig","igh","ight".- None of the substrings match
wordSet.
- None of the substrings match
- Result:
dp[14] = 4.
- Initialize
Step 2: Handle substring "night" at index 13.
-
At
start = 13(character = 'n'):- Initialize
dp[13] = dp[14] + 1 = 5. - Substrings starting at
13:"n","ni","nig","nigh","night"."night"matcheswordSet.- Update
dp[13] = dp[18] = 0.
- Result:
dp[13] = 0.
- Initialize
-
At
start = 12(character = 't'):- Initialize
dp[12] = dp[13] + 1 = 1. - Substrings starting at
12:"t","tn","tni","tnig","tnigh".- None of the substrings match
wordSet.
- None of the substrings match
- Result:
dp[12] = 1.
- Initialize
Step 3: Handle substring "bark" at index 6.
-
At
start = 6(character = 'b'):- Initialize
dp[6] = dp[7] + 1 = 6 + 1 = 7. - Substrings starting at
6:"b","ba","bar","bark"."bark"matcheswordSet.- Update
dp[6] = dp[10] = 3.
- Result:
dp[6] = 3.
- Initialize
-
At
start = 5(character = 'g'):- Initialize
dp[5] = dp[6] + 1 = 4. - Substrings starting at
5:"g","gb","gba","gbar","gbark".- None of the substrings match
wordSet.
- None of the substrings match
- Result:
dp[5] = 4.
- Initialize
Step 4: Handle substring "dog" at index 3.
-
At
start = 3(character = 'd'):- Initialize
dp[3] = dp[4] + 1 = 5 + 1 = 6. - Substrings starting at
3:"d","do","dog"."dog"matcheswordSet.- Update
dp[3] = dp[6] = 3.
- Result:
dp[3] = 3.
- Initialize
-
At
start = 2(character = 'e'):
- Initialize
dp[2] = dp[3] + 1 = 4. - Substrings starting at
2:"e","ed","edo","edog", etc.- None of the substrings match
wordSet.
- None of the substrings match
- Result:
dp[2] = 4.
Step 5: Continue Backward to the Beginning.
- At
start = 0(character = 't'):- Initialize
dp[0] = dp[1] + 1 = 5 + 1 = 6. - Substrings starting at
0:"t","th","the","thed","thedo","thedog".- No valid substring.
- Result:
dp[0] = 6.
- Initialize
Final DP Array:
dp = [6, 5, 4, 3, 5, 4, 3, 6, 5, 4, 3, 2, 1, 0, 4, 3, 2, 1, 0].
Final Result:
- Minimum Extra Characters:
dp[0] = 6. - Explanation:
"dog","bark", and"night"are matched from the dictionary.- Extra characters are
't','h','e','s','a','t'.
Code
import java.util.*;
class Solution {
public int minExtraChar(String s, String[] dictionary) {
int length = s.length(); // Length of the string
var wordSet = new HashSet<>(Arrays.asList(dictionary)); // Convert dictionary to HashSet for quick lookups
var dp = new int[length + 1]; // DP array to store minimum extra characters from each index
// Start iterating from the end of the string to the beginning
for (int start = length - 1; start >= 0; start--) {
// Initialize dp[start] by counting the current character as extra
dp[start] = dp[start + 1] + 1;
// Try forming substrings starting from the current index
for (int end = start; end < length; end++) {
String substring = s.substring(start, end + 1); // Extract substring
if (wordSet.contains(substring)) {
// Check if substring exists in the dictionary
dp[start] = Math.min(dp[start], dp[end + 1]); // Update dp[start] with the minimum value
}
}
}
// The result is the minimum extra characters starting from index 0
return dp[0];
}
public static void main(String[] args) {
Solution solution = new Solution();
// Example 1
String s1 = "amazingracecar";
String[] dictionary1 = { "race", "car" };
System.out.println(
"Minimum extra characters: " + solution.minExtraChar(s1, dictionary1)
);
// Expected Output: 7
// Example 2
String s2 = "bookkeeperreading";
String[] dictionary2 = { "keep", "read" };
System.out.println(
"Minimum extra characters: " + solution.minExtraChar(s2, dictionary2)
);
// Expected Output: 9
// Example 3
String s3 = "thedogbarksatnight";
String[] dictionary3 = { "dog", "bark", "night" };
System.out.println(
"Minimum extra characters: " + solution.minExtraChar(s3, dictionary3)
);
// Expected Output: 6
}
}
Complexity Analysis
Time Complexity
-
Outer Loop:
- Iterates over all starting indices of the string, contributing
operations.
- Iterates over all starting indices of the string, contributing
-
Inner Loop:
- For each
start, the inner loop iterates over all possible substrings starting atstart, contributingoperations.
- For each
-
Substring Extraction:
- Extracting substrings with
s.substring(start, end + 1)takestime in the worst case.
- Extracting substrings with
-
Overall Time Complexity:
- Combining all the factors, the time complexity is
.
- Combining all the factors, the time complexity is
Space Complexity
-
DP Array:
- The
dparray of sizeN + 1contributesspace.
- The
-
HashSet for Dictionary:
- Storing the
dictionaryin aHashSetrequiresspace, where Mis the average length of words in the dictionary andKis the number of words.
- Storing the
-
Overall Space Complexity:
.
Solution 3: Top Down Dynamic Programming with Trie
This solution uses a top-down dynamic programming approach with a Trie to efficiently determine the number of extra characters in a given substring that cannot form words in the dictionary. The Trie allows us to quickly validate substrings against dictionary words, while dynamic programming ensures that overlapping subproblems are solved only once.
The approach works as follows:
- First, a Trie is built from the given dictionary words, where each node in the Trie represents a character in a dictionary word, and the end of a word is marked with a flag.
- Then, the string
sis processed recursively from left to right using a dp function. At each index, the function either counts the current character as extra and moves to the next index, or it tries to match substrings starting at the current index with words in the Trie. If a match is found, the remaining portion of the string is processed recursively. - Memoization is used to store the results of subproblems for each starting index, ensuring that each state is computed only once.
This approach is optimal because it combines the power of Tries (for fast substring matching) and dynamic programming (to reuse previous computations). By avoiding the repeated recomputation of substring matches, the solution achieves a significant speedup compared to naive methods.
Step-by-Step Algorithm
-
Build a Trie from the Dictionary:
- Create a root node for the Trie.
- For each word in the dictionary:
- Traverse through its characters.
- Add each character as a child node of the current node if it doesn't already exist.
- Mark the last node of the word as the end of a word.
-
Initialize the DP Array:
- Create a
memoarray of sizen + 1, wherenis the length of the strings. - Initialize all values in
memotonullto indicate uncomputed states.
- Create a
-
Recursive DP Function:
- Define a function
dp(start)that computes the number of extra characters in the substring starting at indexstart. - Base Case:
- If
start == n, return0, as no characters are left to process.
- If
- Memoization:
- If
memo[start]is notnull, return its value to avoid redundant computation.
- If
- Count Extra Characters:
- Assume the character at
startis extra and move to the next index, initializingminExtraasdp(start + 1) + 1.
- Assume the character at
- Trie Matching:
- Starting from the root of the Trie, traverse the Trie while iterating through characters of the substring
s[start:]. - If a Trie node doesn't exist for a character, break out of the loop, as there is not valid substring starting from the current index.
- If a Trie node marks the end of a word, update
minExtrawith a minimum ofminExtraand result returned after recursively processing the remaining substring.
- Starting from the root of the Trie, traverse the Trie while iterating through characters of the substring
- Define a function
-
Return the Final Result:
- Call
dp(0)to compute the result for the entire string.
- Call
Algorithm Walkthrough
Input:
- String:
s = "thedogbarksatnight". - Dictionary:
dictionary = ["dog", "bark", "night"].
Step 1: Trie Construction
-
Insert
"dog"into the Trie:- Add nodes for
'd','o','g'. - Mark
'g'as the end of a word.
- Add nodes for
-
Insert
"bark"into the Trie:- Add nodes for
'b','a','r','k'. - Mark
'k'as the end of a word.
- Add nodes for
-
Insert
"night"into the Trie:- Add nodes for
'n','i','g','h','t'. - Mark
't'as the end of a word.
- Add nodes for
Step 2: Start DP Recursion from Index 0
- The recursion starts with
start = 0.
-
Call
dp(0):- Base Case:
- The function calls itself recursively with
start = 1:- This pattern continues until
start = 18.
- This pattern continues until
- The function calls itself recursively with
- Base Case:
-
At
start = 18:- This is the base case:
start == length. No characters remain, so return0.
- This is the base case:
Step 3: Process Substrings from the End
After reaching the base case, the recursion starts processing substrings backward (from start = 17 to start = 0).
-
Processing
start = 17:- The character is
't'. - Initialize
minExtra = dp(18) + 1 = 1. Assume't'is extra. - No valid substrings start from
't', as it does not match any word in the Trie. - Store and Return:
memo[17] = 1.
- The character is
-
Processing
start = 16:- The character is
'h'. - Initialize
minExtra = dp(17) + 1 = 2. Assume'h'is extra. - No valid substrings start from
'h', as it does not match any word in the Trie. - Store and Return:
memo[16] = 2.
- The character is
-
Processing
start = 15:- The character is
'g'. - Initialize
minExtra = dp(16) + 1 = 3. Assume'g'is extra. - No valid substrings start from
'g', as it does not match any word in the Trie. - Store and Return:
memo[15] = 3.
- The character is
Step 4: Process Substring "night" at start = 13
memo = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 3, 2, 1, 0]
- Processing
start = 13:- The character is
'n'. - Initialize
minExtra = dp(14) + 1 = 4. Assume'n'is extra. - Trie Traversal:
- Substrings:
"n","ni","nig","nigh","night"."night"matches in the Trie.- Update
minExtra = dp(18) = 0.
- Substrings:
- Store and Return:
memo[13] = 0.
- The character is
Step 5: Process Substring "bark" at start = 6
memo = [ 0, 0, 0, 0, 0, 0, 3, 6, 5, 4, 3, 2, 1, 0, 4, 3, 2, 1, 0]
- Processing
start = 6:- The character is
'b'. - Initialize
minExtra = dp(7) + 1 = 6 + 1 = 7. Assume'b'is extra. - Trie Traversal:
- Substrings:
"b","ba","bar","bark"."bark"matches in the Trie.- Update
minExtra = dp(10) = 3.
- Substrings:
- Store and Return:
memo[6] = 3.
- The character is
Step 6: Process Substring "dog" at start = 3
memo = [ 0, 0, 0, 0, 5, 4, 3, 6, 5, 4, 3, 2, 1, 0, 4, 3, 2, 1, 0]
- Processing
start = 3:- The character is
'd'. - Initialize
minExtra = dp(4) + 1 = 5 + 1 = 6. Assume'd'is extra. - Trie Traversal:
- Substrings:
"d","do","dog"."dog"matches in the Trie.- Update
minExtra = dp(6) = 3.
- Substrings:
- Store and Return:
memo[3] = 3.
- The character is
Step 7: Process Substring from start = 0
- Processing
start = 0:- The character is
't'. - Initialize
minExtra = dp(1) + 1 = 5 + 1 = 6. Assume't'is extra. - Trie Traversal:
- Substrings:
"t","th","the","thed","thedo","thedog".- None of substrings are valid.
- Substrings:
- Return:
memo[0] = 6.
- The character is
Final Result
- Memoization Array:
memo = [ 6, 5, 4, 3, 5, 4, 3, 6, 5, 4, 3, 2, 1, 0, 4, 3, 2, 1, 0] - Output:
6, indicating six extra characters ('t','h','e','s','a','t').
Code
import java.util.*;
// class TrieNode {
// TrieNode[] children = new TrieNode[26]; // Array to hold child nodes for each letter
// boolean isEnd = false; // Flag to mark the end of a word
// }
class Solution {
TrieNode root; // Root of the Trie
Integer[] memo; // Memoization array to store results for each index
public int minExtraChar(String s, String[] dictionary) {
int length = s.length();
root = buildTrie(dictionary); // Build the Trie using the dictionary words
memo = new Integer[length + 1]; // Initialize memoization array
return dp(0, length, s); // Start the top-down DP from index 0
}
private int dp(int start, int length, String s) {
// Base case: If we reach the end of the string, no extra characters are left
if (start == length) {
return 0;
}
// If the result for this starting index is already computed, return it
if (memo[start] != null) {
return memo[start];
}
// Initialize the Trie traversal from the root
TrieNode node = root;
// Count the current character as extra and move to the next index
int minExtra = dp(start + 1, length, s) + 1;
// Traverse the Trie to check for valid dictionary substrings
for (int end = start; end < length; end++) {
char c = s.charAt(end); // Current character
if (node.children[c - 'a'] == null) {
// If no child exists for this character, break
break;
}
node = node.children[c - 'a']; // Move to the next Trie node
// If the current node marks the end of a valid word in the dictionary
if (node.isEnd) {
// Update the minimum extra characters by recursively processing the remaining substring
minExtra = Math.min(minExtra, dp(end + 1, length, s));
}
}
// Store the computed result in memoization array and return it
return memo[start] = minExtra;
}
private TrieNode buildTrie(String[] dictionary) {
TrieNode root = new TrieNode(); // Initialize the root of the Trie
for (String word : dictionary) {
TrieNode node = root; // Start from the root for each word
for (char c : word.toCharArray()) {
// Create a new Trie node if it doesn't already exist
if (node.children[c - 'a'] == null) {
node.children[c - 'a'] = new TrieNode();
}
node = node.children[c - 'a']; // Move to the next node
}
node.isEnd = true; // Mark the end of the word
}
return root;
}
public static void main(String[] args) {
Solution solution = new Solution();
// Example 1
String s1 = "amazingracecar";
String[] dictionary1 = { "race", "car" };
System.out.println(
"Minimum extra characters: " + solution.minExtraChar(s1, dictionary1)
);
// Expected Output: 7
// Example 2
String s2 = "bookkeeperreading";
String[] dictionary2 = { "keep", "read" };
System.out.println(
"Minimum extra characters: " + solution.minExtraChar(s2, dictionary2)
);
// Expected Output: 9
// Example 3
String s3 = "thedogbarksatnight";
String[] dictionary3 = { "dog", "bark", "night" };
System.out.println(
"Minimum extra characters: " + solution.minExtraChar(s3, dictionary3)
);
// Expected Output: 6
}
}
Complexity Analysis
Time Complexity
- DP Method:
- The
dpmethod is called forN + 1unique states, whereNis the length of the string. - Each state involves iterating over the substring starting at the current index, which takes
time in the worst case. - Thus, the cost of the
dpmethod is.
- The
- Building the Trie:
- Constructing the Trie requires inserting
Kwords, where each word has an average length ofM. - The cost of building the Trie is
.
- Constructing the Trie requires inserting
Overall Time Complexity:
Space Complexity
- Trie:
- The Trie requires
space to store the dictionary words.
- The Trie requires
- Memoization:
- The
memoarray has a size ofN + 1, consumingspace.
- The
- Recursive Stack:
- The recursion depth is at most
N, addingstack space.
- The recursion depth is at most
Overall Space Complexity:
Solution 4: Bottom Up Dynamic Programming with Trie
This solution uses bottom-up dynamic programming with a Trie to efficiently calculate the number of extra characters in a given string that cannot form words from a provided dictionary. The Trie enables fast lookups for dictionary words while avoiding the need to repeatedly recompute substring matches. The dynamic programming (DP) array ensures that the solution is computed incrementally from the end of the string toward the beginning.
Here’s the approach:
- Trie Construction: A Trie is built from the dictionary, where each character of a word is represented as a node. Words that exist in the dictionary are marked with an
isEndflag at the last node. - DP Initialization: A DP array
dpis created to store the minimum extra characters for substrings starting from different indices. - Iterative Processing: Starting from the last character of the string, each character is assumed to be extra initially. Then, substrings starting from the current index are checked against the Trie. If a substring matches a word in the Trie, the DP value is updated to reflect fewer extra characters.
- Result: The final result is stored in
dp[0], representing the minimum extra characters required for the entire string.
This approach is optimal because it combines efficient substring matching using the Trie and incremental computation with the DP array.
Step-by-Step Algorithm
-
Trie Construction:
- Initialize the root of the Trie.
- For each word in the dictionary:
- Start at the root of the Trie.
- For each character in the word:
- If the character doesn’t exist as a child of the current node, create a new node.
- Move to the child node representing the character.
- Mark the last node as the end of the word.
-
Initialize DP Array:
- Create a DP array
dpof sizen + 1, wherenis the length of the strings. - Set
dp[n] = 0since no extra characters are required after the end of the string.
- Create a DP array
-
To calculate the minimum extra characters for each substring starting at every index, we process the string iteratively from the last character to the first:
-
a. Initialize Extra Characters:
- Assume that the character at the current index
startis extra. - Set
dp[start] = dp[start + 1] + 1. This means that the cost for the substring starting atstartis at least one more than the cost for the substring starting atstart + 1.
- Assume that the character at the current index
-
b. Traverse the Trie:
- Begin from the root of the Trie to validate substrings starting from
start. - Iterate through each possible
endindex starting fromstartton - 1:- Check Trie Validity:
- If the character
s[end]is not a child of the current Trie node, break the loop, as no further substring starting atstartcan match a dictionary word.
- If the character
- Move to the Next Trie Node:
- If the character
s[end]exists in the Trie, move to the child node fors[end].
- If the character
- Check for a Complete Word:
- If the current Trie node marks the end of a word, this substring matches a dictionary word.
- Update
dp[start] = min(dp[start], dp[end + 1]), as including this valid word reduces the number of extra characters needed for the remaining substring.
- Check Trie Validity:
- Begin from the root of the Trie to validate substrings starting from
-
c. Continue to the Next Start Index:
- Repeat the process for the next
startindex until the first character is processed. The final result will be stored indp[0].
- Repeat the process for the next
- Return the Result:
- Return
dp[0]as the result, which represents the minimum extra characters for the entire string.
- Return
Code
// class TrieNode {
// TrieNode[] children = new TrieNode[26]; // Array to hold child nodes for each letter
// boolean isEnd = false; // Flag to mark the end of a word
// }
class Solution {
public int minExtraChar(String s, String[] dictionary) {
int n = s.length();
TrieNode root = buildTrie(dictionary); // Build the Trie from the dictionary
int[] dp = new int[n + 1]; // DP array to store minimum extra characters for each index
// Iterate from the end of the string to the beginning
for (int start = n - 1; start >= 0; start--) {
dp[start] = dp[start + 1] + 1; // Initialize dp[start] by counting the current character as extra
TrieNode node = root; // Start Trie traversal from the root
// Traverse the Trie to check valid dictionary substrings starting from 'start'
for (int end = start; end < n; end++) {
char c = s.charAt(end); // Current character in the substring
if (node.children[c - 'a'] == null) {
// If no child exists for the character, break
break;
}
node = node.children[c - 'a']; // Move to the next Trie node
// If the current node marks the end of a valid word in the dictionary
if (node.isEnd) {
// Update dp[start] with the minimum value
dp[start] = Math.min(dp[start], dp[end + 1]);
}
}
}
// The result is stored in dp[0], which represents the entire string
return dp[0];
}
private TrieNode buildTrie(String[] dictionary) {
TrieNode root = new TrieNode();
for (String word : dictionary) {
TrieNode node = root; // Start from the root for each word
for (char c : word.toCharArray()) {
// Traverse through each character in the word
// Create a new Trie node if it doesn't already exist
if (node.children[c - 'a'] == null) {
node.children[c - 'a'] = new TrieNode();
}
node = node.children[c - 'a']; // Move to the next node
}
node.isEnd = true; // Mark the end of the word
}
return root;
}
public static void main(String[] args) {
Solution solution = new Solution();
// Example 1
String s1 = "amazingracecar";
String[] dictionary1 = { "race", "car" };
System.out.println(
"Minimum extra characters: " + solution.minExtraChar(s1, dictionary1)
);
// Expected Output: 7
// Example 2
String s2 = "bookkeeperreading";
String[] dictionary2 = { "keep", "read" };
System.out.println(
"Minimum extra characters: " + solution.minExtraChar(s2, dictionary2)
);
// Expected Output: 9
// Example 3
String s3 = "thedogbarksatnight";
String[] dictionary3 = { "dog", "bark", "night" };
System.out.println(
"Minimum extra characters: " + solution.minExtraChar(s3, dictionary3)
);
// Expected Output: 6
}
}
Complexity Analysis
Time Complexity
-
Dynamic Programming (DP) Operations:
- The
dparray is computed for each indexstartin the strings. For eachstart, we iterate over potentialendvalues (startton - 1) to form substrings. - This results in a total of
iterations since the outer loop runs Ntimes, and the inner loop runs up toNtimes in the worst case.
- The
-
Building the Trie:
- For each word in the dictionary (total of
Kwords), we insert its characters into the Trie. Each word has an average length ofM. - Thus, building the Trie takes
.
- For each word in the dictionary (total of
Overall Time Complexity:
Space Complexity
-
Trie Storage:
- The Trie stores
Kwords, each with an average length ofM. Each character is represented by a node in the Trie. - Hence, the Trie requires
space.
- The Trie stores
-
DP Array:
- The
dparray has a size ofN + 1to store results for each index of the string. - This contributes
space.
- The
Overall Space Complexity:
🤖 Don't fully get this? Learn it with Claude
Stuck on Extra Characters in a String? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.
Progressively stronger hints — you still solve it.
I'm working on the problem **Extra Characters in a String** (DSA). Give me a HINT LADDER: start with the tiniest nudge, then wait. Only reveal the next, stronger hint when I ask. Do NOT show the full solution unless I type 'show solution'. Keep me doing the thinking. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
See the technique, not just code.
Explain the optimal approach to **Extra Characters in a String** with a VISUAL walkthrough: trace it on a small concrete example using ASCII art / a step-by-step diagram, narrate what changes each step, then give time & space complexity with a one-line derivation. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Catch bugs, edge cases, sub-optimality.
I'll paste my solution to **Extra Characters in a String**. Review it for correctness, missed edge cases, and time/space complexity, then coach me toward the optimal — don't just rewrite it. Ask me to paste my code now. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Lock in recognition with look-alikes.
Give me 2 problems that use the SAME underlying pattern as **Extra Characters in a String**. For each, let me attempt first, then review my answer and name the trigger signal that reveals the pattern. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.