Knowledge Guide
HomeDSACompany Practice

medium Remove Duplicate Letters

Problem Statement

Given a string s, remove all duplicate letters from the input string while maintaining the original order of the letters.

Additionally, the returned string should be the smallest in lexicographical order among all possible results.

A string is in the smallest lexicographical order if it appears first in a dictionary. For example, "abc" is smaller than "acb" because "abc" comes first alphabetically.

Examples:

    • Input: "babac"
    • Expected Output: "abc"
    • Justification:
      • After removing 1 b and 1 a from the input string, we can get bac, and abc strings. The final answer is 'abc', which is the smallest lexicographical string without duplicate letters.
    • Input: "zabccde"
    • Expected Output: "zabcde"
    • Justification: Removing one of the 'c's forms 'zabcde', the smallest string in lexicographical order without duplicates.
    • Input: "mnopmn"
    • Expected Output: "mnop"
    • Justification: Removing the second 'm' and 'n' gives 'mnop', which is the smallest possible string without duplicate characters.

Constraints:

Try it yourself

Try solving this question here:

✅ Solution Remove Duplicate Letters

Problem Statement

Given a string s, remove all duplicate letters from the input string while maintaining the original order of the letters.

Additionally, the returned string should be the smallest in lexicographical order among all possible results.

A string is in the smallest lexicographical order if it appears first in a dictionary. For example, "abc" is smaller than "acb" because "abc" comes first alphabetically.

Examples:

    • Input: "babac"
    • Expected Output: "abc"
    • Justification:
      • After removing 1 b and 1 a from the input string, we can get bac, and abc strings. The final answer is 'abc', which is the smallest lexicographical string without duplicate letters.
    • Input: "zabccde"
    • Expected Output: "zabcde"
    • Justification: Removing one of the 'c's forms 'zabcde', the smallest string in lexicographical order without duplicates.
    • Input: "mnopmn"
    • Expected Output: "mnop"
    • Justification: Removing the second 'm' and 'n' gives 'mnop', which is the smallest possible string without duplicate characters.

Constraints:

  • 1 <= s.length <= 104
  • s consists of lowercase English letters.

Solution

To solve the given problem, begin by iterating through the string, and calculate the frequency count for each character. We will use the stack to maintain the order of characters and the set to track the uniqueness of characters.

Next, start traversing the string, and for every character encountered, check if it's already in the 'present' set. If it's not, compare it with the top element of the 'result' stack. If the stack is not empty, and the current character is lexicographically smaller than the stack's top, and the top character of the stack appears again in the string (indicated by a non-zero frequency count), repeatedly pop the stack and remove those elements from the 'present' set. Then, add the current character to the stack and the 'present' set.

This process, facilitated by the stack and set, ensures that the stack is built with unique characters in the smallest lexicographical order. After processing the entire string, pop elements from the stack to construct the final string, thereby obtaining the smallest lexicographical sequence without duplicate characters.

  1. Frequency Count (count):

    • Initialize a count dictionary (or hash map in some languages) to store the frequency of each character in the string s.
  2. Character Presence Tracking (present):

    • Use a set present to keep track of the characters that have been added to the resultant string. This set prevents duplicate characters in the result.
  3. Building the Result (result):

    • Create a stack result to construct the final string.
    • For each character c in the string s:
      • If c is not in present, proceed to compare it with the top character of result.
      • While result is not empty, and c is lexicographically smaller than the top character of result, and the frequency of the top character of result is more than 0 (indicating it appears again in the string), pop the top character from result and remove it from present.
      • Push c onto result and add it to present.
      • Decrement the frequency of c in count.
  4. Result Construction:

    • The stack result now contains the characters of the final string in reverse order. Pop elements from result to construct the output string in the correct order, from left to right.

This approach works because it ensures that the smallest character is placed first, respecting the original order and removing duplicates. The frequency count ensures that characters are not wrongly discarded.

Algorithm Walkthrough:

Input: "zabccde"

  1. Initialization:

    • Calculate the frequency of each character: {'z': 1, 'a': 1, 'b': 1, 'c': 2, 'd': 1, 'e': 1}.
    • Initialize an empty stack result and a set present to keep track of characters already in result.
  2. Iteration Over Characters:

    • 'z': Not in present. Add 'z' to result, add to present. Decrease frequency of 'z'.
    • 'a': Not in present. As 'a' is smaller than 'z', but 'z' won't appear again (frequency is now 0), we keep 'z'. Add 'a' to result, add to present. Decrease frequency of 'a'.
    • 'b': Not in present. Add 'b' to result, add to present. Decrease frequency of 'b'.
    • First 'c': Not in present. Add 'c' to result, add to present. Decrease frequency of 'c'.
    • Second 'c': Already in present. Skip it.
    • 'd': Not in present. Add 'd' to result, add to present. Decrease frequency of 'd'.
    • 'e': Not in present. Add 'e' to result, add to present. Decrease frequency of 'e'.
  3. Result Construction:

    • The result stack now contains ['z', 'a', 'b', 'c', 'd', 'e'].
    • Convert the stack to a string to get the final result: "zabcdef".

Expected Output: "zabcdef"

Image
Image

Code

Here is the code for this algorithm:

java
import java.util.*;

public class Solution {

  public String removeDuplicateLetters(String s) {
    HashMap<Character, Integer> count = new HashMap<>();
    HashSet<Character> present = new HashSet<>();
    Stack<Character> result = new Stack<>();

    // Counting the frequency of each character
    for (char c : s.toCharArray()) {
      count.put(c, count.getOrDefault(c, 0) + 1);
    }

    for (char c : s.toCharArray()) {
      if (!present.contains(c)) {
        // Ensure the smallest lexicographical order by removing higher characters that will appear again
        while (
          !result.isEmpty() && c < result.peek() && count.get(result.peek()) > 0
        ) {
          present.remove(result.pop());
        }
        result.push(c);
        present.add(c);
      }
      count.put(c, count.get(c) - 1); // Decreasing the frequency
    }

    // Building the result string from the stack
    StringBuilder sb = new StringBuilder();
    for (char c : result) {
      sb.append(c);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Solution sol = new Solution();
    System.out.println(sol.removeDuplicateLetters("babac")); // Output: "abc"
    System.out.println(sol.removeDuplicateLetters("zabccde")); // Output: "zabcde"
    System.out.println(sol.removeDuplicateLetters("mnopmn")); // Output: "mnop"
  }
}

Complexity Analysis:

  • Time Complexity: O(N) where N is the length of the string. Each character is visited once.
  • Space Complexity: O(1) as the extra space used does not depend on the input size. The character count and the stack size are bounded by the character set size (constant).
🤖 Don't fully get this? Learn it with Claude

Stuck on Remove Duplicate Letters? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.

🪜 Hint ladder (no spoilers)

Progressively stronger hints — you still solve it.

I'm working on the problem **Remove Duplicate Letters** (DSA). Give me a HINT LADDER: start with the tiniest nudge, then wait. Only reveal the next, stronger hint when I ask. Do NOT show the full solution unless I type 'show solution'. Keep me doing the thinking. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🎨 Explain the approach visually

See the technique, not just code.

Explain the optimal approach to **Remove Duplicate Letters** with a VISUAL walkthrough: trace it on a small concrete example using ASCII art / a step-by-step diagram, narrate what changes each step, then give time & space complexity with a one-line derivation. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🔍 Review my solution

Catch bugs, edge cases, sub-optimality.

I'll paste my solution to **Remove Duplicate Letters**. Review it for correctness, missed edge cases, and time/space complexity, then coach me toward the optimal — don't just rewrite it. Ask me to paste my code now. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🔁 Drill the pattern

Lock in recognition with look-alikes.

Give me 2 problems that use the SAME underlying pattern as **Remove Duplicate Letters**. For each, let me attempt first, then review my answer and name the trigger signal that reveals the pattern. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.

📝 My notes