easy Index Pairs of a String
Problem Statement
Given a string text and a list of strings words, identify all [i, j] index pairs such that the substring text[i...j] is in words.
These index pairs should be returned in ascending order, first by the start index, then by the end index. Find every occurrence of each word within the text, ensuring that overlapping occurrences are also identified.
Examples
-
- Input: text =
"bluebirdskyscraper", words =["blue", "bird", "sky"] - Expected Output:
[[0, 3], [4, 7], [8, 10]] - Justification: The word "blue" is found from index 0 to 3, "bird" from 4 to 7, and "sky" from 8 to 10 in the string.
- Input: text =
-
- Input: text =
"programmingisfun", words =["pro", "is", "fun", "gram"] - Expected Output:
[[0, 2], [3, 6], [11, 12], [13, 15]] - Justification: "pro" is found from 0 to 2, "gram" from 3 to 6, "is" from 11 to 12, and "fun" from 13 to 15.
- Input: text =
-
- Input: text =
"interstellar", words =["stellar", "star", "inter"] - Expected Output:
[[0, 4], [5, 11]] - Justification: "inter" is found from 0 to 4, and "stellar" from 5 to 11. "star" is not found.
- Input: text =
Constraints:
1 <= text.length <= 1001 <= words.length <= 201 <= words[i].length <= 50textandwords[i]consist of lowercase English letters.- All the strings of words are unique.
Try it yourself
Try solving this question here:
✅ Solution Index Pairs of a String
Problem Statement
Given a string text and a list of strings words, identify all [i, j] index pairs such that the substring text[i...j] is in words.
These index pairs should be returned in ascending order, first by the start index, then by the end index. Find every occurrence of each word within the text, ensuring that overlapping occurrences are also identified.
Examples
-
- Input: text =
"bluebirdskyscraper", words =["blue", "bird", "sky"] - Expected Output:
[[0, 3], [4, 7], [8, 10]] - Justification: The word "blue" is found from index 0 to 3, "bird" from 4 to 7, and "sky" from 8 to 10 in the string.
- Input: text =
-
- Input: text =
"programmingisfun", words =["pro", "is", "fun", "gram"] - Expected Output:
[[0, 2], [3, 6], [11, 12], [13, 15]] - Justification: "pro" is found from 0 to 2, "gram" from 3 to 6, "is" from 11 to 12, and "fun" from 13 to 15.
- Input: text =
-
- Input: text =
"interstellar", words =["stellar", "star", "inter"] - Expected Output:
[[0, 4], [5, 11]] - Justification: "inter" is found from 0 to 4, and "stellar" from 5 to 11. "star" is not found.
- Input: text =
Constraints:
1 <= text.length <= 1001 <= words.length <= 201 <= words[i].length <= 50textandwords[i]consist of lowercase English letters.- All the strings of words are unique.
Solution
To solve this problem, we will use a trie data structure, which is particularly efficient for managing a set of strings and performing quick searches for patterns within a text. A trie, also known as a prefix tree, allows us to efficiently store and retrieve strings with a common prefix, making it ideal for our purpose of identifying substrings within a given text.
The approach involves two main phases: building the trie with the given list of words and then using it to find the index pairs in the text. The trie's structure enables us to search for each word in the text in an optimized manner, significantly reducing the number of comparisons needed compared to brute force methods.
Step-by-Step Algorithm
-
Initialize the Trie:
- Create a trie and insert all the words from the given list into it. This is done in the first part of the code.
-
Search and Record Index Pairs:
- Iterate over each character in the text. This character will act as the starting point for potential matches.
- For each starting character, initiate a pointer
pat the root of the trie. - Then, iterate over the text starting from the current starting point until the end of the text. In each iteration:
- Check if the current character exists as a child of the current trie node pointed by
p. - If it does not exist, break out of the inner loop as no further matching is possible from this starting point.
- If it exists, move the pointer
pto this child node. - Check if the current node
pis a leaf node (indicated byp.isEndbeingtrue). If it is, it means a complete word from the list has been found. - Record the start and end indices of this word. The start index is the position of the starting character, and the end index is the current position in the text.
- Check if the current character exists as a child of the current trie node pointed by
- Continue this process for each character in the text to ensure all occurrences, including overlapping ones, are found.
Algorithm Walkthrough
Using the input text = "programmingisfun", words = ["pro", "is", "fun", "gram"]:
-
Initialize the Trie with Words:
- "pro", "is", "fun", and "gram" are inserted into the trie.
-
Search and Record Index Pairs:
- Start at index 0 (
'p'in"programmingisfun"):- Pointer
pis at the root. Finds 'p', moves to 'r', then 'o'.p.isEndistrueat 'o', so record [0, 2].
- Pointer
- At index 1 (
'r'), no match is found. - similarly, At index 2 (
'o'), no match is found. - At index 3 (
'g'):- Finds 'g', 'r', 'a', 'm'.
p.isEndistrueat 'm', so record [3, 6].
- Finds 'g', 'r', 'a', 'm'.
- At index 11 (
'i'):- Finds 'i', 's'.
p.isEndistrueat 's', so record [11, 12].
- Finds 'i', 's'.
- At index 13 (
's'):- Finds 'f', 'u', 'n'.
p.isEndistrueat 'n', so record [13, 15].
- Finds 'f', 'u', 'n'.
- Start at index 0 (
-
Compile Results:
- The results [[0, 2], [3, 6], [11, 12], [13, 15]] are compiled into the 2D array
ansand returned.
- The results [[0, 2], [3, 6], [11, 12], [13, 15]] are compiled into the 2D array
By following these steps, the algorithm efficiently locates all occurrences of the words in the text, including overlapping ones, using the trie structure for optimized searching.
Code
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
// class TrieNode {
// TrieNode[] children = new TrieNode[26]; // Array to hold child nodes for each letter
// boolean isEnd = false; // Flag to mark the end of a word
// }
class Trie {
TrieNode root = new TrieNode();
// Inserts a word into the trie
public void insert(String word) {
TrieNode cur = root;
for (char c : word.toCharArray()) {
if (cur.children[c - 'a'] == null) {
cur.children[c - 'a'] = new TrieNode(); // Create a new node if not present
}
cur = cur.children[c - 'a'];
}
cur.isEnd = true; // Mark the end of a word
}
}
public class Solution {
public List<List<Integer>> indexPairs(String text, List<String> words) {
Trie trie = new Trie();
// Populate the trie with the list of words
for (String word : words) {
trie.insert(word);
}
List<List<Integer>> result = new ArrayList<>();
for (int i = 0; i < text.length(); i++) {
TrieNode p = trie.root;
for (int j = i; j < text.length(); j++) {
char currentChar = text.charAt(j);
if (p.children[currentChar - 'a'] == null) {
break; // Break if the character is not in the trie
}
p = p.children[currentChar - 'a'];
if (p.isEnd) {
result.add(Arrays.asList(i, j)); // Add index pair if word found as a list
}
}
}
return result; // Return a list of lists containing index pairs
}
// Main method for testing
public static void main(String[] args) {
Solution solution = new Solution();
// Test examples
String text1 = "bluebirdskyscraper";
List<String> words1 = Arrays.asList("blue", "bird", "sky");
System.out.println(solution.indexPairs(text1, words1));
String text2 = "programmingisfun";
List<String> words2 = Arrays.asList("pro", "is", "fun", "gram");
System.out.println(solution.indexPairs(text2, words2));
String text3 = "interstellar";
List<String> words3 = Arrays.asList("stellar", "star", "inter");
System.out.println(solution.indexPairs(text3, words3));
}
}
Time and Space Complexity Analysis
Time Complexity:
- Building the Trie:
, where N is the number of words and L is the average length of these words. - Finding Index Pairs:
, where T is the length of the text string. - Overall:
Space Complexity:
- Trie Storage:
, for storing N words each of average length L.
🤖 Don't fully get this? Learn it with Claude
Stuck on Index Pairs of a String? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.
Progressively stronger hints — you still solve it.
I'm working on the problem **Index Pairs of a String** (DSA). Give me a HINT LADDER: start with the tiniest nudge, then wait. Only reveal the next, stronger hint when I ask. Do NOT show the full solution unless I type 'show solution'. Keep me doing the thinking. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
See the technique, not just code.
Explain the optimal approach to **Index Pairs of a String** with a VISUAL walkthrough: trace it on a small concrete example using ASCII art / a step-by-step diagram, narrate what changes each step, then give time & space complexity with a one-line derivation. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Catch bugs, edge cases, sub-optimality.
I'll paste my solution to **Index Pairs of a String**. Review it for correctness, missed edge cases, and time/space complexity, then coach me toward the optimal — don't just rewrite it. Ask me to paste my code now. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Lock in recognition with look-alikes.
Give me 2 problems that use the SAME underlying pattern as **Index Pairs of a String**. For each, let me attempt first, then review my answer and name the trigger signal that reveals the pattern. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.