# codetutor/backend/data/questions/edit-distance.yaml

title: Edit Distance
slug: edit-distance
difficulty: medium
leetcode_id: 72
leetcode_url: https://leetcode.com/problems/edit-distance/
categories:
  - strings
  - dynamic-programming
patterns:
  - dynamic-programming

description: |
  Given two strings `word1` and `word2`, return *the minimum number of operations required to convert `word1` to `word2`*.

  You have the following three operations permitted on a word:

  - Insert a character
  - Delete a character
  - Replace a character

constraints: |
  - `0 <= word1.length, word2.length <= 500`
  - `word1` and `word2` consist of lowercase English letters.

examples:
  - input: 'word1 = "horse", word2 = "ros"'
    output: "3"
    explanation: 'horse -> rorse (replace ''h'' with ''r''), rorse -> rose (remove ''r''), rose -> ros (remove ''e'')'
  - input: 'word1 = "intention", word2 = "execution"'
    output: "5"
    explanation: 'intention -> inention (remove ''t''), inention -> enention (replace ''i'' with ''e''), enention -> exention (replace ''n'' with ''x''), exention -> exection (replace ''n'' with ''c''), exection -> execution (insert ''u'')'

explanation:
  intuition: |
    Imagine you have two words written on paper and you want to transform the first word into the second using the fewest edits possible. Each edit is either inserting a letter, deleting a letter, or replacing one letter with another.

    The key insight is that we can break this problem into **smaller subproblems**. If we're comparing two strings character by character from the end, at each position we face a decision:

    - If the current characters **match**, we don't need any operation for this position — we simply move to the smaller subproblem of the remaining prefixes.
    - If they **don't match**, we must perform one of three operations: insert, delete, or replace. Each operation leads to a different subproblem, and we pick the one with the minimum cost.

    Think of it like building a grid where rows correspond to prefixes of `word1` and columns correspond to prefixes of `word2`. Each cell `(i, j)` stores the minimum edits needed to convert the first `i` characters of `word1` to the first `j` characters of `word2`. We fill this grid from the base cases (empty strings) toward the full strings.

    This is a classic **dynamic programming** problem because:
    1. It has **optimal substructure**: the optimal solution to the full problem depends on optimal solutions to subproblems
    2. It has **overlapping subproblems**: the same subproblems are computed multiple times in a naive recursive approach
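
    To make the overlap concrete, here is a minimal sketch that counts how often a plain recursion (no caching) enters each subproblem. The names `naive_distance` and `calls` are illustrative only and are not part of the reference solutions.

    ```python
    from collections import Counter

    calls = Counter()  # how many times each (i, j) subproblem is entered


    def naive_distance(word1: str, word2: str, i: int, j: int) -> int:
        """Edit distance between word1[:i] and word2[:j], without caching."""
        calls[(i, j)] += 1
        if i == 0:
            return j  # insert the remaining j characters
        if j == 0:
            return i  # delete the remaining i characters
        if word1[i - 1] == word2[j - 1]:
            return naive_distance(word1, word2, i - 1, j - 1)
        return 1 + min(
            naive_distance(word1, word2, i, j - 1),      # insert
            naive_distance(word1, word2, i - 1, j),      # delete
            naive_distance(word1, word2, i - 1, j - 1),  # replace
        )


    distance = naive_distance("horse", "ros", 5, 3)
    repeated = {state: count for state, count in calls.items() if count > 1}
    print(distance)           # 3
    print(len(repeated) > 0)  # True: some subproblems were solved more than once
    ```

    Even on this small input, the state `(4, 2)` is reached from both `(5, 3)` and `(5, 2)`, which is the repeated work that a DP table or a cache removes.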

  approach: |
    We solve this using a **2D Dynamic Programming** approach:

    **Step 1: Define the state**

    - Let `dp[i][j]` represent the minimum number of operations to convert `word1[0..i-1]` to `word2[0..j-1]`
    - `dp[0][0]` = 0 (empty string to empty string requires 0 operations)

    **Step 2: Initialise the base cases**

    - `dp[i][0]` = `i` for all `i`: converting `word1[0..i-1]` to an empty string requires `i` deletions
    - `dp[0][j]` = `j` for all `j`: converting an empty string to `word2[0..j-1]` requires `j` insertions

    **Step 3: Fill the DP table**

    - For each cell `dp[i][j]`, compare `word1[i-1]` and `word2[j-1]`:
      - If they are **equal**: `dp[i][j] = dp[i-1][j-1]` (no operation needed)
      - If they are **different**: take the cheapest of three options, each costing one operation:
        - `dp[i-1][j] + 1`: **Delete** `word1[i-1]` (convert `word1[0..i-2]` to `word2[0..j-1]`, then delete `word1[i-1]`)
        - `dp[i][j-1] + 1`: **Insert** `word2[j-1]` (convert `word1[0..i-1]` to `word2[0..j-2]`, then insert `word2[j-1]`)
        - `dp[i-1][j-1] + 1`: **Replace** `word1[i-1]` with `word2[j-1]` (convert `word1[0..i-2]` to `word2[0..j-2]` first)

    **Step 4: Return the result**

    - Return `dp[m][n]` where `m = len(word1)` and `n = len(word2)`
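
    As a worked example of the four steps, the sketch below builds the full table for the first sample input and prints it row by row. The helper name `build_table` is an assumption made for illustration.

    ```python
    def build_table(word1: str, word2: str) -> list[list[int]]:
        """Return the (m+1) x (n+1) table of prefix edit distances."""
        m, n = len(word1), len(word2)
        dp = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            dp[i][0] = i  # i deletions
        for j in range(n + 1):
            dp[0][j] = j  # j insertions
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if word1[i - 1] == word2[j - 1]:
                    dp[i][j] = dp[i - 1][j - 1]
                else:
                    dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
        return dp


    table = build_table("horse", "ros")
    for row in table:
        print(row)
    # [0, 1, 2, 3]
    # [1, 1, 2, 3]
    # [2, 2, 1, 2]
    # [3, 2, 2, 2]
    # [4, 3, 3, 2]
    # [5, 4, 4, 3]
    ```

    The bottom-right cell, `table[5][3]`, holds the answer 3.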

  common_pitfalls:
    - title: Off-by-One Indexing Errors
      description: |
        The DP table has dimensions `(m+1) x (n+1)` to account for empty string cases. When accessing characters, remember that `dp[i][j]` corresponds to `word1[i-1]` and `word2[j-1]`.

        A common mistake is using `word1[i]` where `word1[i-1]` is needed, which causes an index-out-of-bounds error at `i = m` or compares the wrong characters.
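
        A tiny illustration of the index mismatch, using made-up inputs: the table index `m` is valid for `dp` but one past the end of the string.

        ```python
        word1, word2 = "ab", "ab"
        m, n = len(word1), len(word2)

        # Table indices run 0..m and 0..n, one more than the valid string indices.
        try:
            word1[m]  # wrong: treats the table index as a string index
            raised = False
        except IndexError:
            raised = True

        print(raised)                        # True
        print(word1[m - 1] == word2[n - 1])  # True: cell (m, n) compares the last characters
        ```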
      wrong_approach: "Using dp[i][j] with word1[i] and word2[j]"
      correct_approach: "Using dp[i][j] with word1[i-1] and word2[j-1]"

    - title: Forgetting Base Case Initialisation
      description: |
        The first row and first column of the DP table must be initialised explicitly. `dp[i][0] = i` represents deleting all characters from `word1`, and `dp[0][j] = j` represents inserting all characters of `word2`.

        Forgetting this initialisation leads to incorrect results because the recurrence relation depends on these base cases.
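
        The effect is easy to reproduce. The sketch below runs the same recurrence with and without the edge initialisation; the `init_edges` flag is a hypothetical switch added only for this comparison.

        ```python
        def distance(word1: str, word2: str, init_edges: bool) -> int:
            m, n = len(word1), len(word2)
            dp = [[0] * (n + 1) for _ in range(m + 1)]
            if init_edges:
                for i in range(m + 1):
                    dp[i][0] = i
                for j in range(n + 1):
                    dp[0][j] = j
            for i in range(1, m + 1):
                for j in range(1, n + 1):
                    if word1[i - 1] == word2[j - 1]:
                        dp[i][j] = dp[i - 1][j - 1]
                    else:
                        dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
            return dp[m][n]


        print(distance("abc", "", init_edges=True))    # 3
        print(distance("abc", "", init_edges=False))   # 0, which is wrong
        print(distance("ab", "b", init_edges=True))    # 1
        print(distance("ab", "b", init_edges=False))   # 0, which is wrong
        ```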
      wrong_approach: "Leaving dp[i][0] and dp[0][j] as 0"
      correct_approach: "Initialise dp[i][0] = i and dp[0][j] = j"

    - title: Confusing the Three Operations
      description: |
        It's easy to mix up which direction in the DP table corresponds to which operation:

        - Moving from `dp[i-1][j]` means we're "ignoring" the last character of `word1` — this is a **deletion**
        - Moving from `dp[i][j-1]` means we're "adding" a character to match `word2` — this is an **insertion**
        - Moving from `dp[i-1][j-1]` with different characters means we're **replacing**

        Drawing out the DP table for a small example helps build intuition.
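
        One way to check the mapping is to walk back through a filled table and name each move. This is a sketch only: `edit_script` is a hypothetical helper, and when several optimal scripts exist it returns just one of them.

        ```python
        def edit_script(word1: str, word2: str) -> list[str]:
            m, n = len(word1), len(word2)
            dp = [[0] * (n + 1) for _ in range(m + 1)]
            for i in range(m + 1):
                dp[i][0] = i
            for j in range(n + 1):
                dp[0][j] = j
            for i in range(1, m + 1):
                for j in range(1, n + 1):
                    if word1[i - 1] == word2[j - 1]:
                        dp[i][j] = dp[i - 1][j - 1]
                    else:
                        dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])

            # Walk back from dp[m][n], naming the move that produced each cell.
            ops = []
            i, j = m, n
            while i > 0 or j > 0:
                if i > 0 and j > 0 and word1[i - 1] == word2[j - 1]:
                    i, j = i - 1, j - 1  # diagonal, characters match: no operation
                elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
                    ops.append(f"replace {word1[i - 1]} with {word2[j - 1]}")
                    i, j = i - 1, j - 1  # diagonal, characters differ
                elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
                    ops.append(f"delete {word1[i - 1]}")
                    i -= 1               # came from the row above
                else:
                    ops.append(f"insert {word2[j - 1]}")
                    j -= 1               # came from the column to the left
            return ops[::-1]


        print(edit_script("horse", "ros"))
        # ['replace h with r', 'delete r', 'delete e']
        ```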
      wrong_approach: "Guessing which transition is which operation"
      correct_approach: "Understand that delete shrinks word1, insert grows word1, replace transforms a character"

  key_takeaways:
    - "**Classic DP pattern**: Edit Distance is a foundational problem that demonstrates 2D dynamic programming with string comparison"
    - "**Three-way minimum**: When multiple choices exist at each step, take the minimum of all valid options"
    - "**Base cases matter**: Proper initialisation of the DP table edges (empty string transformations) is crucial"
    - "**Space optimisation possible**: Since each row only depends on the previous row, you can reduce space from O(mn) to O(n) using a rolling array"

  time_complexity: "O(m * n). We fill a 2D table of size `(m+1) x (n+1)` where `m` and `n` are the lengths of the two strings."
  space_complexity: "O(m * n). We use a 2D DP table to store intermediate results. Keeping only two rows reduces this to O(n), or O(min(m, n)) if the shorter string indexes the columns."

solutions:
  - approach_name: 2D Dynamic Programming
    is_optimal: true
    code: |
      def min_distance(word1: str, word2: str) -> int:
          m, n = len(word1), len(word2)

          # Create DP table with dimensions (m+1) x (n+1)
          # dp[i][j] = min operations to convert word1[0..i-1] to word2[0..j-1]
          dp = [[0] * (n + 1) for _ in range(m + 1)]

          # Base case: converting word1[0..i-1] to empty string needs i deletions
          for i in range(m + 1):
              dp[i][0] = i

          # Base case: converting empty string to word2[0..j-1] needs j insertions
          for j in range(n + 1):
              dp[0][j] = j

          # Fill the DP table
          for i in range(1, m + 1):
              for j in range(1, n + 1):
                  # If characters match, no operation needed
                  if word1[i - 1] == word2[j - 1]:
                      dp[i][j] = dp[i - 1][j - 1]
                  else:
                      # Take minimum of insert, delete, replace (each costs 1)
                      dp[i][j] = 1 + min(
                          dp[i][j - 1],      # Insert
                          dp[i - 1][j],      # Delete
                          dp[i - 1][j - 1]   # Replace
                      )

          return dp[m][n]
    explanation: |
      **Time Complexity:** O(m * n) — We iterate through all cells in the `(m+1) x (n+1)` DP table.

      **Space Complexity:** O(m * n) — We store the entire DP table.

      This bottom-up DP approach builds the solution from smaller subproblems. Each cell represents the minimum edit distance for a prefix pair, and we fill the table row by row until we reach the answer at `dp[m][n]`.

  - approach_name: Space-Optimised DP
    is_optimal: true
    code: |
      def min_distance(word1: str, word2: str) -> int:
          m, n = len(word1), len(word2)

          # Use only two rows since each row depends only on the previous row
          prev = list(range(n + 1))  # Previous row
          curr = [0] * (n + 1)       # Current row

          for i in range(1, m + 1):
              # First column: converting word1[0..i-1] to empty string
              curr[0] = i

              for j in range(1, n + 1):
                  if word1[i - 1] == word2[j - 1]:
                      # Characters match, no operation needed
                      curr[j] = prev[j - 1]
                  else:
                      # Minimum of insert, delete, replace
                      curr[j] = 1 + min(
                          curr[j - 1],   # Insert (from left in current row)
                          prev[j],       # Delete (from above in previous row)
                          prev[j - 1]    # Replace (from diagonal in previous row)
                      )

              # Swap rows for next iteration
              prev, curr = curr, prev

          # Result is in prev because we swapped at the end
          return prev[n]
    explanation: |
      **Time Complexity:** O(m * n) — Same iteration through all cells.

      **Space Complexity:** O(n) — We only keep two rows of the DP table at any time.

      Since each row only depends on the previous row, we can reduce space from O(m * n) to O(n) by using a rolling array technique. We alternate between two arrays, updating the current row based on the previous row.
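
      The row length can be reduced further to O(min(m, n)). With all three operations costing 1, the distance is symmetric (an insertion in one direction is a deletion in the other), so the two words can be swapped to make the shorter one index the columns. A minimal sketch, where the function name `min_distance_short_rows` is an assumption for illustration:

      ```python
      def min_distance_short_rows(word1: str, word2: str) -> int:
          # Unit-cost edit distance is symmetric, so swapping the arguments
          # does not change the answer; make word2 the shorter string.
          if len(word2) > len(word1):
              word1, word2 = word2, word1
          m, n = len(word1), len(word2)

          prev = list(range(n + 1))  # row for the empty prefix of word1
          curr = [0] * (n + 1)
          for i in range(1, m + 1):
              curr[0] = i
              for j in range(1, n + 1):
                  if word1[i - 1] == word2[j - 1]:
                      curr[j] = prev[j - 1]
                  else:
                      curr[j] = 1 + min(curr[j - 1], prev[j], prev[j - 1])
              prev, curr = curr, prev
          return prev[n]


      print(min_distance_short_rows("ros", "horse"))            # 3
      print(min_distance_short_rows("intention", "execution"))  # 5
      ```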

  - approach_name: Recursive with Memoisation
    is_optimal: false
    code: |
      from functools import lru_cache


      def min_distance(word1: str, word2: str) -> int:
          @lru_cache(maxsize=None)
          def dp(i: int, j: int) -> int:
              # Base case: if word1 is exhausted, insert remaining of word2
              if i == 0:
                  return j
              # Base case: if word2 is exhausted, delete remaining of word1
              if j == 0:
                  return i

              # If characters match, no operation needed
              if word1[i - 1] == word2[j - 1]:
                  return dp(i - 1, j - 1)

              # Try all three operations and take minimum
              return 1 + min(
                  dp(i, j - 1),      # Insert
                  dp(i - 1, j),      # Delete
                  dp(i - 1, j - 1)   # Replace
              )

          return dp(len(word1), len(word2))
    explanation: |
      **Time Complexity:** O(m * n) — Each unique state `(i, j)` is computed once due to memoisation.

      **Space Complexity:** O(m * n) — For the memoisation cache, plus O(m + n) for the recursion stack.

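      This top-down approach mirrors the recursive definition of the problem directly, which many readers find easier to follow. We recursively break down the problem and cache results to avoid redundant computation. However, it uses more space because of the recursion stack, and with both words near the 500-character limit the recursion can go about `m + n` = 1000 frames deep, which reaches CPython's default recursion limit of 1000.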