# codetutor/backend/data/questions/interleaving-string.yaml

title: Interleaving String
slug: interleaving-string
difficulty: medium
leetcode_id: 97
leetcode_url: https://leetcode.com/problems/interleaving-string/
categories:
  - strings
  - dynamic-programming
patterns:
  - dynamic-programming

function_signature: "def is_interleave(s1: str, s2: str, s3: str) -> bool:"

test_cases:
  visible:
    - input: { s1: "aabcc", s2: "dbbca", s3: "aadbbcbcac" }
      expected: true
    - input: { s1: "aabcc", s2: "dbbca", s3: "aadbbbaccc" }
      expected: false
    - input: { s1: "", s2: "", s3: "" }
      expected: true
  hidden:
    - input: { s1: "a", s2: "b", s3: "ab" }
      expected: true
    - input: { s1: "a", s2: "b", s3: "ba" }
      expected: true
    - input: { s1: "abc", s2: "def", s3: "abcdef" }
      expected: true
    - input: { s1: "", s2: "abc", s3: "abc" }
      expected: true
    - input: { s1: "abc", s2: "", s3: "abc" }
      expected: true
    - input: { s1: "a", s2: "b", s3: "abc" }
      expected: false
    - input: { s1: "aaa", s2: "aaa", s3: "aaaaaa" }
      expected: true

description: |
  Given strings `s1`, `s2`, and `s3`, find whether `s3` is formed by an **interleaving** of `s1` and `s2`.

  An **interleaving** of two strings `s` and `t` is a configuration where `s` and `t` are divided into `n` and `m` substrings respectively, such that:

  - `s = s_1 + s_2 + ... + s_n`
  - `t = t_1 + t_2 + ... + t_m`
  - `|n - m| <= 1`
  - The **interleaving** is `s_1 + t_1 + s_2 + t_2 + s_3 + t_3 + ...` or `t_1 + s_1 + t_2 + s_2 + t_3 + s_3 + ...`

  **Note:** `a + b` is the concatenation of strings `a` and `b`.

constraints: |
  - `0 <= s1.length, s2.length <= 100`
  - `0 <= s3.length <= 200`
  - `s1`, `s2`, and `s3` consist of lowercase English letters.

examples:
  - input: 's1 = "aabcc", s2 = "dbbca", s3 = "aadbbcbcac"'
    output: "true"
    explanation: "One way to obtain s3 is: Split s1 into s1 = \"aa\" + \"bc\" + \"c\", and s2 into s2 = \"dbbc\" + \"a\". Interleaving the two splits, we get \"aa\" + \"dbbc\" + \"bc\" + \"a\" + \"c\" = \"aadbbcbcac\"."
  - input: 's1 = "aabcc", s2 = "dbbca", s3 = "aadbbbaccc"'
    output: "false"
    explanation: "It is impossible to interleave s2 with any other string to obtain s3."
  - input: 's1 = "", s2 = "", s3 = ""'
    output: "true"
    explanation: "Two empty strings trivially interleave to form an empty string."

explanation:
  intuition: |
    Imagine you have two decks of cards (representing `s1` and `s2`), and you want to merge them into a single pile (`s3`) while preserving the relative order of cards within each original deck. The question is: can `s3` be formed by repeatedly taking the top card from either deck, choosing freely which deck to draw from at each turn?

    The key insight is that at any point in building `s3`, you have a **choice**: take the next character from `s1` or from `s2`. This decision tree branches exponentially, but many branches lead to the same "state" — defined by how many characters we've used from each string.

    Think of it as navigating a 2D grid where the x-axis represents progress through `s1` and the y-axis represents progress through `s2`. Starting at `(0, 0)`, you want to reach `(len(s1), len(s2))`. At each cell `(i, j)`, you can move right (use a character from `s1`) or down (use a character from `s2`) — but only if that character matches the next character needed in `s3`.

    This grid perspective reveals the **optimal substructure**: whether we can reach `(i, j)` depends only on whether we could reach `(i-1, j)` or `(i, j-1)` with a matching character. This is the hallmark of dynamic programming.
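
    As a rough sketch of this grid view, the walk below starts at `(0, 0)` and records every cell it can step into. The helper name `reachable_states` is an illustrative assumption, not part of the reference solutions, and it is written for clarity rather than speed.

    ```python
    def reachable_states(s1: str, s2: str, s3: str) -> set[tuple[int, int]]:
        """Return every (i, j) reachable from (0, 0); hypothetical helper."""
        m, n = len(s1), len(s2)
        if m + n != len(s3):
            return set()  # lengths disagree, so no cell is a valid state

        seen = {(0, 0)}
        frontier = [(0, 0)]
        while frontier:
            i, j = frontier.pop()
            k = i + j  # index of the next character needed from s3
            steps = []
            if i < m and s1[i] == s3[k]:
                steps.append((i + 1, j))  # consume one character of s1
            if j < n and s2[j] == s3[k]:
                steps.append((i, j + 1))  # consume one character of s2
            for cell in steps:
                if cell not in seen:
                    seen.add(cell)
                    frontier.append(cell)
        return seen


    states = reachable_states("a", "b", "ab")
    print((1, 1) in states)  # True: the corner (len(s1), len(s2)) is reachable
    ```

    The answer to the problem is simply whether the corner `(len(s1), len(s2))` appears in the returned set. Because each cell is added at most once, the walk visits no more than `(m + 1) * (n + 1)` states.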

  approach: |
    We solve this using **2D Dynamic Programming**:

    **Step 1: Early termination check**

    - If `len(s1) + len(s2) != len(s3)`, return `False` immediately — the lengths don't match, so interleaving is impossible

    &nbsp;

    **Step 2: Initialize the DP table**

    - Create a 2D boolean table `dp` of size `(len(s1) + 1) x (len(s2) + 1)`
    - `dp[i][j]` represents: "Can the first `i` characters of `s1` and first `j` characters of `s2` interleave to form the first `i + j` characters of `s3`?"
    - Set `dp[0][0] = True` — empty strings trivially interleave to form an empty string

    &nbsp;

    **Step 3: Fill the first row and column**

    - First row (`dp[0][j]`): Can `s2[:j]` alone form `s3[:j]`? Only if all characters match sequentially
    - First column (`dp[i][0]`): Can `s1[:i]` alone form `s3[:i]`? Only if all characters match sequentially
    - These represent paths that use only one string

    &nbsp;

    **Step 4: Fill the rest of the table**

    - For each cell `dp[i][j]`, check two possibilities:
      - **Take from `s1`** (`dp[i-1][j]`): If `s1[:i-1]` and `s2[:j]` can form `s3[:i+j-1]` and `s1[i-1] == s3[i+j-1]`, then `dp[i][j] = True`
      - **Take from `s2`** (`dp[i][j-1]`): If `s1[:i]` and `s2[:j-1]` can form `s3[:i+j-1]` and `s2[j-1] == s3[i+j-1]`, then `dp[i][j] = True`
    - Either path being valid makes the current state valid

    &nbsp;

    **Step 5: Return the answer**

    - Return `dp[len(s1)][len(s2)]` — whether we can use all of both strings to form all of `s3`
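
    To make the table concrete, here is a small sketch that returns the whole `dp` table for inspection instead of only the final cell. The name `build_table` is a hypothetical helper. It assumes the length check from Step 1 has already passed, and it folds Steps 3 and 4 into one loop by guarding each transition with `i > 0` or `j > 0`.

    ```python
    def build_table(s1: str, s2: str, s3: str) -> list[list[bool]]:
        """Return the full dp table; assumes len(s1) + len(s2) == len(s3)."""
        m, n = len(s1), len(s2)
        dp = [[False] * (n + 1) for _ in range(m + 1)]
        dp[0][0] = True
        for i in range(m + 1):
            for j in range(n + 1):
                if i > 0 and dp[i - 1][j] and s1[i - 1] == s3[i + j - 1]:
                    dp[i][j] = True  # extend a valid prefix with s1[i-1]
                if j > 0 and dp[i][j - 1] and s2[j - 1] == s3[i + j - 1]:
                    dp[i][j] = True  # extend a valid prefix with s2[j-1]
        return dp


    for row in build_table("a", "b", "ab"):
        print(row)
    # [True, False]
    # [True, True]
    ```

    In this tiny example `dp[0][1]` is `False` because `s3` starts with `a`, not `b`, so the only valid route to the final cell takes `a` from `s1` first and then `b` from `s2`.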

  common_pitfalls:
    - title: Exponential Brute Force
      description: |
        A naive recursive approach tries every possible way to interleave:
        - At each position in `s3`, try matching with `s1` or `s2`
        - This leads to up to `2^(m+n)` recursive paths in the worst case

        With `s1.length, s2.length <= 100`, that bound is `2^200`, which is far too slow. Many recursive calls recompute the same subproblem (the same `(i, j)` position), and there are only `(m+1) * (n+1)` distinct positions, which makes this a natural fit for memoization or bottom-up DP.
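
        To get a feel for the gap, the sketch below counts recursive calls with and without a cache. The helper `count_calls` and the adversarial input are illustrative assumptions. The count includes calls that return immediately from the cache, and the helper assumes the lengths already match.

        ```python
        def count_calls(s1: str, s2: str, s3: str, use_memo: bool) -> int:
            """Count recursive calls with or without a cache; hypothetical helper."""
            m, n = len(s1), len(s2)
            memo: dict[tuple[int, int], bool] = {}
            calls = 0

            def solve(i: int, j: int) -> bool:
                nonlocal calls
                calls += 1
                if i == m and j == n:
                    return True
                if use_memo and (i, j) in memo:
                    return memo[(i, j)]
                k = i + j  # next position in s3
                result = (i < m and s1[i] == s3[k] and solve(i + 1, j)) or (
                    j < n and s2[j] == s3[k] and solve(i, j + 1)
                )
                if use_memo:
                    memo[(i, j)] = result
                return result

            solve(0, 0)
            return calls


        # Adversarial input: every path matches until the final character.
        s1, s2 = "a" * 8, "a" * 8
        s3 = "a" * 15 + "b"
        print(count_calls(s1, s2, s3, use_memo=False))  # grows exponentially
        print(count_calls(s1, s2, s3, use_memo=True))   # bounded by the grid size
        ```

        With the cache enabled, each of the `(m+1) * (n+1)` states is expanded at most once and makes at most two child calls, so the total stays within `1 + 2 * (m+1) * (n+1)`.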
      wrong_approach: "Recursive backtracking without memoization"
      correct_approach: "Dynamic programming with O(m*n) states"

    - title: Forgetting the Length Check
      description: |
        If `len(s1) + len(s2) != len(s3)`, it's impossible to interleave — every character from `s1` and `s2` must appear exactly once in `s3`.

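        Without this check, the DP only validates the first `len(s1) + len(s2)` characters of `s3`. With `s1 = "a"` and `s2 = "b"`, it would accept `s3 = "abc"` just as it accepts `s3 = "ab"`, even though the extra `c` makes the first one invalid. If `s3` is shorter than the combined length, indexing into `s3` can raise an `IndexError` instead. Always verify lengths first.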
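
        As an illustration of why the check matters, the sketch below runs the table fill with the length check deliberately removed. The function name `is_interleave_unchecked` is a hypothetical stand-in and is intentionally buggy: an over-long `s3` is accepted, and an under-long `s3` can raise `IndexError`.

        ```python
        def is_interleave_unchecked(s1: str, s2: str, s3: str) -> bool:
            """2D DP with the length check deliberately left out (for illustration)."""
            m, n = len(s1), len(s2)
            dp = [[False] * (n + 1) for _ in range(m + 1)]
            dp[0][0] = True
            for i in range(m + 1):
                for j in range(n + 1):
                    if i > 0 and dp[i - 1][j] and s1[i - 1] == s3[i + j - 1]:
                        dp[i][j] = True
                    if j > 0 and dp[i][j - 1] and s2[j - 1] == s3[i + j - 1]:
                        dp[i][j] = True
            return dp[m][n]


        print(is_interleave_unchecked("a", "b", "abc"))  # True, a false positive

        try:
            is_interleave_unchecked("a", "b", "a")
        except IndexError:
            print("IndexError: s3 is shorter than len(s1) + len(s2)")
        ```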
      wrong_approach: "Skip length validation"
      correct_approach: "Check len(s1) + len(s2) == len(s3) upfront"

    - title: Off-by-One Index Errors
      description: |
        The DP table has dimensions `(m+1) x (n+1)` to handle empty prefixes. When accessing characters:
        - `dp[i][j]` uses `s1[i-1]` and `s2[j-1]` (0-indexed strings)
        - The corresponding `s3` character is at index `i + j - 1`

        Confusing 0-indexed strings with 1-indexed DP indices is a common source of bugs. Draw the grid and trace through an example to verify your indexing.
      wrong_approach: "Using dp[i][j] with s1[i] and s2[j]"
      correct_approach: "Using dp[i][j] with s1[i-1] and s2[j-1]"

  key_takeaways:
    - "**2D DP for two sequences**: When combining or comparing two sequences, think of a 2D grid where axes represent progress through each sequence"
    - "**State definition is crucial**: Here, `dp[i][j]` captures whether prefixes of length `i` and `j` can form a prefix of `s3` — a clean, sufficient state"
    - "**Space optimization possible**: The follow-up asks for `O(s2.length)` space — since each row only depends on the previous row and current row, you can use a 1D array"
    - "**Early termination**: Simple checks like length validation can save significant computation and handle edge cases cleanly"

  time_complexity: "O(m * n). We fill a 2D table of size `(m+1) x (n+1)` where `m = len(s1)` and `n = len(s2)`, with O(1) work per cell."
  space_complexity: "O(m * n). We store a 2D boolean table. This can be optimized to O(n) by using a 1D array and updating in-place."

solutions:
  - approach_name: 2D Dynamic Programming
    is_optimal: true
    code: |
      def is_interleave(s1: str, s2: str, s3: str) -> bool:
          m, n = len(s1), len(s2)

          # Early termination: lengths must match
          if m + n != len(s3):
              return False

          # dp[i][j] = can s1[:i] and s2[:j] interleave to form s3[:i+j]?
          dp = [[False] * (n + 1) for _ in range(m + 1)]

          # Base case: empty strings form empty string
          dp[0][0] = True

          # Fill first column: using only s1
          for i in range(1, m + 1):
              dp[i][0] = dp[i - 1][0] and s1[i - 1] == s3[i - 1]

          # Fill first row: using only s2
          for j in range(1, n + 1):
              dp[0][j] = dp[0][j - 1] and s2[j - 1] == s3[j - 1]

          # Fill the rest of the table
          for i in range(1, m + 1):
              for j in range(1, n + 1):
                  # Current position in s3
                  k = i + j - 1
                  # Can we get here by taking s1[i-1] (from dp[i-1][j])?
                  from_s1 = dp[i - 1][j] and s1[i - 1] == s3[k]
                  # Can we get here by taking s2[j-1] (from dp[i][j-1])?
                  from_s2 = dp[i][j - 1] and s2[j - 1] == s3[k]
                  dp[i][j] = from_s1 or from_s2

          return dp[m][n]
    explanation: |
      **Time Complexity:** O(m * n) — We iterate through every cell in the DP table once.

      **Space Complexity:** O(m * n) — We store the full 2D table.

      This solution builds up the answer by considering all valid ways to consume characters from `s1` and `s2`. Each cell represents a subproblem that's computed exactly once.

  - approach_name: Space-Optimized DP (1D Array)
    is_optimal: true
    code: |
      def is_interleave(s1: str, s2: str, s3: str) -> bool:
          m, n = len(s1), len(s2)

          # Early termination: lengths must match
          if m + n != len(s3):
              return False

          # Use 1D array: dp[j] represents dp[i][j] for current row i
          dp = [False] * (n + 1)

          # Fill the DP table row by row
          for i in range(m + 1):
              for j in range(n + 1):
                  if i == 0 and j == 0:
                      dp[j] = True
                  elif i == 0:
                      # First row: only using s2
                      dp[j] = dp[j - 1] and s2[j - 1] == s3[j - 1]
                  elif j == 0:
                      # First column: only using s1
                      dp[j] = dp[j] and s1[i - 1] == s3[i - 1]
                  else:
                      # General case: take from s1 (old dp[j]) or from s2 (new dp[j - 1])
                      k = i + j - 1
                      dp[j] = (dp[j] and s1[i - 1] == s3[k]) or (
                          dp[j - 1] and s2[j - 1] == s3[k]
                      )

          return dp[n]
    explanation: |
      **Time Complexity:** O(m * n) — Same iteration as 2D approach.

      **Space Complexity:** O(n) — Only one row of the DP table is stored.

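      This answers the follow-up, which asks for `O(s2.length)` additional space. Each cell depends only on the same column in the previous row (`dp[i-1][j]`) and the preceding column in the current row (`dp[i][j-1]`), so we can overwrite a single array in place. When we reach index `j`, `dp[j]` still holds the previous row's value `dp[i-1][j]` (the `s1` transition), and `dp[j-1]` already holds the current row's value `dp[i][j-1]` (the `s2` transition).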

  - approach_name: Recursive with Memoization
    is_optimal: false
    code: |
      def is_interleave(s1: str, s2: str, s3: str) -> bool:
          m, n = len(s1), len(s2)

          if m + n != len(s3):
              return False

          # Memoization cache
          memo = {}

          def dp(i: int, j: int) -> bool:
              # Base case: used all characters
              if i == m and j == n:
                  return True

              # Check cache
              if (i, j) in memo:
                  return memo[(i, j)]

              k = i + j  # Current position in s3
              result = False

              # Try using next character from s1
              if i < m and s1[i] == s3[k]:
                  result = dp(i + 1, j)

              # Try using next character from s2
              if not result and j < n and s2[j] == s3[k]:
                  result = dp(i, j + 1)

              memo[(i, j)] = result
              return result

          return dp(0, 0)
    explanation: |
      **Time Complexity:** O(m * n) — Each unique `(i, j)` pair is computed once.

      **Space Complexity:** O(m * n) — For the memoization cache, plus O(m + n) recursion stack.

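      This top-down approach is often more intuitive to write. It explores the decision tree but caches results to avoid redundant computation. Recursion depth can reach `m + n`, which is at most 200 under these constraints and well under CPython's default recursion limit of 1000. Bottom-up DP avoids the recursion stack entirely and is generally preferred when inputs can be much longer.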
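
      As a variant, the same top-down idea can be sketched with `functools.lru_cache` from the standard library in place of the hand-written dictionary. The name `is_interleave_cached` is an assumption used here to keep it distinct from the reference solution; the logic is otherwise the same.

      ```python
      from functools import lru_cache


      def is_interleave_cached(s1: str, s2: str, s3: str) -> bool:
          m, n = len(s1), len(s2)
          if m + n != len(s3):
              return False

          @lru_cache(maxsize=None)
          def solve(i: int, j: int) -> bool:
              # Base case: both strings fully consumed
              if i == m and j == n:
                  return True
              k = i + j  # next position in s3
              if i < m and s1[i] == s3[k] and solve(i + 1, j):
                  return True
              return j < n and s2[j] == s3[k] and solve(i, j + 1)

          return solve(0, 0)


      print(is_interleave_cached("aabcc", "dbbca", "aadbbcbcac"))  # True
      print(is_interleave_cached("aabcc", "dbbca", "aadbbbaccc"))  # False
      ```

      The decorator keys the cache on the `(i, j)` arguments, so the complexity matches the dictionary version. The cache lives on the inner function and is discarded when the outer call returns.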