# codetutor/backend/data/questions/append-characters-to-string-to-make-subsequence.yaml

title: Append Characters to String to Make Subsequence
slug: append-characters-to-string-to-make-subsequence
difficulty: medium
leetcode_id: 2486
leetcode_url: https://leetcode.com/problems/append-characters-to-string-to-make-subsequence/
categories:
  - strings
  - two-pointers
patterns:
  - two-pointers
  - greedy

function_signature: "def append_characters(s: str, t: str) -> int:"

test_cases:
  visible:
    - input: { s: "coaching", t: "coding" }
      expected: 4
    - input: { s: "abcde", t: "a" }
      expected: 0
    - input: { s: "z", t: "abcde" }
      expected: 5
  hidden:
    - input: { s: "abc", t: "abc" }
      expected: 0
    - input: { s: "abc", t: "d" }
      expected: 1
    - input: { s: "aaa", t: "aaaa" }
      expected: 1
    - input: { s: "xyz", t: "xyz" }
      expected: 0
    - input: { s: "abc", t: "axbxc" }
      expected: 4

description: |
  You are given two strings `s` and `t` consisting of only lowercase English letters.

  Return *the minimum number of characters that need to be appended to the end of* `s` *so that* `t` *becomes a **subsequence** of* `s`.

  A **subsequence** is a string that can be derived from another string by deleting some or no characters without changing the order of the remaining characters.

constraints: |
  - `1 <= s.length, t.length <= 10^5`
  - `s` and `t` consist only of lowercase English letters.

examples:
  - input: 's = "coaching", t = "coding"'
    output: "4"
    explanation: 'Append the characters "ding" to the end of s so that s = "coachingding". Now, t is a subsequence of s ("coachingding"). It can be shown that appending any 3 characters to the end of s will never make t a subsequence.'
  - input: 's = "abcde", t = "a"'
    output: "0"
    explanation: 't is already a subsequence of s ("abcde").'
  - input: 's = "z", t = "abcde"'
    output: "5"
    explanation: 'Append the characters "abcde" to the end of s so that s = "zabcde". Now, t is a subsequence of s ("zabcde"). It can be shown that appending any 4 characters to the end of s will never make t a subsequence.'

explanation:
  intuition: |
    Imagine you're reading through string `s` character by character, trying to "match" as many characters of `t` as possible, **in order**.

    Think of it like this: you have a checklist (string `t`) and you're scanning through a document (string `s`). Every time you find the next item on your checklist in the document, you check it off and move to the next item. Characters you don't need can be skipped.

    The key insight is that we want to find the **longest prefix of `t`** that already exists as a subsequence of `s`. Whatever remains of `t` after this matching process is exactly what we need to append.

    For example, with `s = "coaching"` and `t = "coding"`:
    - We find `'c'` in `s` at index 0 — match!
    - We find `'o'` in `s` at index 1 — match!
    - We don't find `'d'` anywhere after index 1 in `s`
    - So only `"co"` (2 characters) of `t` can be matched
    - We need to append the remaining 4 characters: `"ding"`

    This greedy matching works because we're always looking for the **earliest** possible match for each character, which leaves the most room for subsequent characters.
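
    The checklist scan can be sketched as follows. This is a minimal illustration, not part of the reference solutions; the helper name `matched_prefix_length` is made up for this example.

    ```python
    def matched_prefix_length(s: str, t: str) -> int:
        """Length of the longest prefix of t that is a subsequence of s."""
        j = 0  # index of the next unchecked item on the checklist (t)
        for ch in s:
            if j < len(t) and ch == t[j]:
                j += 1  # check the item off and move to the next one
        return j


    s, t = "coaching", "coding"
    k = matched_prefix_length(s, t)
    print(t[:k], t[k:])  # co ding
    ```

    The unmatched suffix `t[k:]` is what has to be appended, so the answer is its length.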

  approach: |
    We solve this using the **Two Pointers** technique:

    **Step 1: Initialise two pointers**

    - `i`: Pointer for string `s`, starting at `0`
    - `j`: Pointer for string `t`, starting at `0`

    &nbsp;

    **Step 2: Scan through both strings**

    - While `i < len(s)` and `j < len(t)`:
      - If `s[i] == t[j]`, we found a match — increment `j` to look for the next character of `t`
      - Always increment `i` to continue scanning through `s`

    &nbsp;

    **Step 3: Calculate the result**

    - After the loop, `j` represents how many characters of `t` we successfully matched
    - The answer is `len(t) - j` — the number of unmatched characters that must be appended

    &nbsp;

    This approach works because the greedy matching (taking the earliest match for each character) is optimal. There's no benefit to skipping a valid match, as that would only reduce our options for matching subsequent characters.
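
    One way to gain confidence in the greedy claim is to compare it against a brute-force definition on small inputs. The sketch below assumes a two-letter alphabet and strings of length 1 to 4; the function names are hypothetical and chosen for this example only.

    ```python
    from itertools import product


    def is_subsequence(t: str, s: str) -> bool:
        it = iter(s)
        # `ch in it` consumes the iterator up to and including the first match,
        # so later characters of t can only match later positions of s.
        return all(ch in it for ch in t)


    def append_characters_bruteforce(s: str, t: str) -> int:
        # Try the longest prefix of t first; the empty prefix always succeeds.
        for k in range(len(t), -1, -1):
            if is_subsequence(t[:k], s):
                return len(t) - k
        return len(t)  # not reached, kept for completeness


    def append_characters_greedy(s: str, t: str) -> int:
        j = 0
        for ch in s:
            if j < len(t) and ch == t[j]:
                j += 1
        return len(t) - j


    # Collect every small input pair on which the two functions disagree.
    mismatches = [
        (s, t)
        for n in range(1, 5)
        for m in range(1, 5)
        for s in map("".join, product("ab", repeat=n))
        for t in map("".join, product("ab", repeat=m))
        if append_characters_greedy(s, t) != append_characters_bruteforce(s, t)
    ]
    print(mismatches)  # []
    ```

    An empty list here is evidence rather than a proof; the exchange argument above is what establishes optimality in general.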

  common_pitfalls:
    - title: Checking Characters Out of Order
      description: |
        A common mistake is trying to find each character of `t` in `s` independently, without respecting the order requirement.

        For example, with `s = "abc"` and `t = "cab"`:
        - All characters of `t` exist in `s`
        - But `"cab"` is NOT a subsequence of `"abc"` because `'c'` appears after `'a'` and `'b'` in `s`

        The two-pointer approach naturally handles ordering by only looking forward in `s` for each subsequent character of `t`.
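
        A small sketch contrasting the two on the example above. `wrong_membership_check` is a deliberately buggy helper invented for this illustration.

        ```python
        def wrong_membership_check(s: str, t: str) -> int:
            # Buggy on purpose: counts characters of t that are absent from s,
            # ignoring the order in which they appear.
            return sum(1 for ch in t if ch not in s)


        def append_characters(s: str, t: str) -> int:
            j = 0
            for ch in s:
                if j < len(t) and ch == t[j]:
                    j += 1
            return len(t) - j


        print(wrong_membership_check("abc", "cab"))  # 0 (incorrect)
        print(append_characters("abc", "cab"))       # 2 (only 'c' matches, so "ab" is appended)
        ```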
      wrong_approach: "Check if each character of t exists in s"
      correct_approach: "Use two pointers to match characters in order"

    - title: Reversing the Pointer Logic
      description: |
        It's tempting to swap the roles of `s` and `t`, but the problem specifically asks for `t` to be a subsequence of `s`, not the other way around.

        We iterate through `s` with our main pointer and only advance the `t` pointer when we find a match. This ensures we're finding `t` within `s`.
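
        To see that the argument order matters, this sketch calls the same function with the roles swapped, using one of the visible test inputs.

        ```python
        def append_characters(s: str, t: str) -> int:
            j = 0
            for ch in s:
                if j < len(t) and ch == t[j]:
                    j += 1
            return len(t) - j


        s, t = "abcde", "a"
        print(append_characters(s, t))  # 0: t is already a subsequence of s
        print(append_characters(t, s))  # 4: roles swapped, answers a different question
        ```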
      wrong_approach: "Looking for s as a subsequence of t"
      correct_approach: "Always scan through s while matching against t"

    - title: Off-by-One in the Result
      description: |
        After the loop, `j` is the **count** of matched characters: it starts at `0` and is incremented once per match, so it also equals the index of the first unmatched character of `t`.

        If `j = 2` after the loop, we matched `t[0]` and `t[1]`, so 2 characters are matched. The number of characters left to append is `len(t) - j`, not `len(t) - j - 1`.
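
        A quick check on a fully matched input makes the off-by-one visible. The helper name `matched_count` is hypothetical.

        ```python
        def matched_count(s: str, t: str) -> int:
            j = 0
            for ch in s:
                if j < len(t) and ch == t[j]:
                    j += 1
            return j


        t = "abc"
        j = matched_count("abc", t)
        print(len(t) - j)      # 0: nothing needs to be appended
        print(len(t) - j - 1)  # -1: a negative count, which cannot be right
        ```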
      wrong_approach: "Return len(t) - j - 1"
      correct_approach: "Return len(t) - j"

  key_takeaways:
    - "**Subsequence matching pattern**: Use two pointers — one for the source string (scan all), one for the target (advance on match)"
    - "**Greedy is optimal**: Taking the earliest match leaves maximum room for subsequent characters"
    - "**Linear efficiency**: A single pass through both strings gives O(n + m) time with O(1) space"
    - "**Foundation for harder problems**: The same scan solves *Is Subsequence* directly, and reasoning about subsequences carries over to dynamic-programming problems such as *Longest Common Subsequence* and edit distance"

  time_complexity: "O(n + m). We traverse each string at most once, where `n = len(s)` and `m = len(t)`."
  space_complexity: "O(1). We only use two pointer variables regardless of input size."

solutions:
  - approach_name: Two Pointers
    is_optimal: true
    code: |
      def append_characters(s: str, t: str) -> int:
          # Pointer for string t - tracks how much we've matched
          j = 0

          # Scan through every character in s
          for char in s:
              # If we've matched all of t, we're done
              if j == len(t):
                  break
              # Found a match - advance the t pointer
              if char == t[j]:
                  j += 1

          # Characters remaining in t that weren't matched
          return len(t) - j
    explanation: |
      **Time Complexity:** O(n + m) — Single pass through `s`, and `j` advances at most `m` times.

      **Space Complexity:** O(1) — Only one pointer variable used.

      We greedily match characters of `t` as we scan through `s`. The number of unmatched characters at the end is exactly what we need to append.

  - approach_name: Two Pointers (Explicit Indices)
    is_optimal: true
    code: |
      def append_characters(s: str, t: str) -> int:
          i, j = 0, 0  # Pointers for s and t respectively
          n, m = len(s), len(t)

          # Continue until we exhaust either string
          while i < n and j < m:
              # Match found - advance t pointer
              if s[i] == t[j]:
                  j += 1
              # Always advance s pointer
              i += 1

          # Return count of unmatched characters in t
          return m - j
    explanation: |
      **Time Complexity:** O(n + m). The loop visits each character of `s` at most once and stops early once all of `t` is matched; `j` advances at most `m` times.

      **Space Complexity:** O(1) — Only two index variables used.

      This variant uses explicit index pointers instead of Python's `for` loop. The logic is identical: scan `s`, match against `t`, and return the unmatched suffix length.