codetutor/backend/data/questions/word-search-ii.yaml
2025-05-30 19:18:33 +01:00

title: Word Search II
slug: word-search-ii
difficulty: hard
leetcode_id: 212
leetcode_url: https://leetcode.com/problems/word-search-ii/
categories:
- arrays
- strings
- recursion
patterns:
- trie
- backtracking
- matrix-traversal
description: |
Given an `m x n` `board` of characters and a list of strings `words`, return *all words on the board*.
Each word must be constructed from letters of sequentially adjacent cells, where **adjacent cells** are horizontally or vertically neighboring. The same letter cell may not be used more than once in a word.
constraints: |
- `m == board.length`
- `n == board[i].length`
- `1 <= m, n <= 12`
- `board[i][j]` is a lowercase English letter
- `1 <= words.length <= 3 * 10^4`
- `1 <= words[i].length <= 10`
- `words[i]` consists of lowercase English letters
- All the strings of `words` are unique
examples:
- input: 'board = [["o","a","a","n"],["e","t","a","e"],["i","h","k","r"],["i","f","l","v"]], words = ["oath","pea","eat","rain"]'
output: '["eat","oath"]'
explanation: "Both 'eat' and 'oath' can be constructed from adjacent cells on the board. 'pea' and 'rain' cannot be formed using the available paths."
- input: 'board = [["a","b"],["c","d"]], words = ["abcb"]'
output: "[]"
explanation: "The word 'abcb' would require revisiting the cell 'b', which is not allowed."
explanation:
intuition: |
Imagine you're solving a word search puzzle from a newspaper, but instead of finding one word, you need to find thousands.
The naive approach would be to run Word Search I (the single-word version) for each word in the list. But with up to 30,000 words and a board that allows paths of length 10, this becomes prohibitively slow — you'd repeat the same board traversals over and over.
The key insight is to **flip the problem around**: instead of searching for each word separately, we search the board once and check all words simultaneously. To do this efficiently, we use a **Trie (prefix tree)** to store all the words. As we explore paths on the board using DFS/backtracking, we traverse the Trie in parallel. If the current path isn't a valid prefix of any word, we can prune immediately.
Think of it like this: you're walking through a maze (the board), carrying a map of all possible destinations (the Trie). At each intersection, you check your map — if no destination lies along this path, turn back. If you reach a destination (a complete word), collect it and potentially continue (since "cat" being a word doesn't mean "cats" isn't also there).
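The prefix check behind that pruning can be sketched in a few lines. This is a minimal illustration, assuming a Trie built from nested dictionaries with `$` as an end-of-word marker; the helper names `build_trie` and `is_prefix` are hypothetical and are not part of the solutions further down.

```python
def build_trie(words: list[str]) -> dict:
    """Build a nested-dict Trie; the key '$' (an assumed marker) ends a word."""
    root: dict = {}
    for word in words:
        node = root
        for char in word:
            node = node.setdefault(char, {})
        node["$"] = word
    return root


def is_prefix(trie: dict, path: str) -> bool:
    """Return True if `path` is a prefix of at least one stored word."""
    node = trie
    for char in path:
        if char not in node:
            return False  # no word continues along this path, so prune here
        node = node[char]
    return True


trie = build_trie(["oath", "pea", "eat", "rain"])
print(is_prefix(trie, "oa"))  # True: "oath" starts with "oa"
print(is_prefix(trie, "ot"))  # False: no word starts with "ot"
```

Each check costs at most one dictionary lookup per character, regardless of how many words are stored.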
approach: |
We solve this using a **Trie + Backtracking** approach:
**Step 1: Build the Trie**
- Create a Trie data structure and insert all words from the input list
- Each node stores children (a dictionary mapping characters to nodes) and an optional word marker
- Store the complete word at terminal nodes for easy retrieval when found
&nbsp;
**Step 2: Set up the backtracking search**
- Iterate through each cell `(i, j)` on the board as a potential starting point
- Only start DFS if the cell's character exists in the Trie root's children
&nbsp;
**Step 3: DFS with Trie navigation**
- At each cell, check if the current character exists in the current Trie node's children
- If not, return immediately (pruning)
- If yes, move to that Trie child node and continue exploring
- Mark the cell as visited (temporarily replace with `#`) to prevent reuse in the same path
- Explore all four directions: up, down, left, right
- Restore the cell's original character when backtracking
&nbsp;
**Step 4: Collect found words**
- When a Trie node contains a complete word, add it to the result list
- Remove the word from the Trie (set the word marker to None) to avoid duplicates
- Continue exploring since longer words may share this prefix
&nbsp;
**Step 5: Optimisation — Trie pruning**
- After finding a word, if a Trie node has no children and no word, we can remove it
- This progressively shrinks the Trie as words are found, speeding up later searches
&nbsp;
**Step 6: Return results**
- Return the list of all found words
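Steps 4 and 5 can be seen in isolation on the Trie alone. The sketch below assumes a nested-dict Trie with `$` as the end-of-word marker and a hypothetical helper `remove_word`. It also assumes the word is present in the Trie. In the full solution the same clean-up happens implicitly as the DFS unwinds.

```python
def build_trie(words: list[str]) -> dict:
    """Build a nested-dict Trie; the key '$' (an assumed marker) ends a word."""
    root: dict = {}
    for word in words:
        node = root
        for char in word:
            node = node.setdefault(char, {})
        node["$"] = word
    return root


def remove_word(root: dict, word: str) -> None:
    """Clear the word marker, then delete nodes left with no children and no marker."""
    path = [root]
    for char in word:
        path.append(path[-1][char])
    path[-1].pop("$", None)
    # Walk back towards the root, removing nodes that are now empty
    for depth in range(len(word), 0, -1):
        if path[depth]:  # still has children or a word marker: stop pruning
            break
        del path[depth - 1][word[depth - 1]]


trie = build_trie(["oath", "oat", "eat"])
remove_word(trie, "oath")
print(trie["o"]["a"]["t"])  # {'$': 'oat'}: only the 'h' node was removed
remove_word(trie, "oat")
print(sorted(trie))         # ['e']: the whole 'o' branch is gone
```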
common_pitfalls:
- title: Running Word Search I for Each Word
description: |
The most intuitive approach is to reuse the Word Search I solution for each word:
```python
for word in words:
    if exists_on_board(board, word):
        result.append(word)
```
With `k` words of length up to `L` and a board of size `m × n`, this gives **O(k × m × n × 4^L)** time complexity. For the maximum constraints (`k = 30,000`, `m = n = 12`, `L = 10`), that bound is roughly 4.5 × 10^12 operations, which is far too slow.
The Trie approach searches all words simultaneously, reducing this to roughly **O(m × n × 4^L)** with effective pruning.
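As a rough sanity check, the two bounds can be evaluated at the maximum constraints. These are worst-case upper bounds taken from the formulas above, not measured running times, and the variable names are only illustrative.

```python
k, m, n, L = 30_000, 12, 12, 10  # maximum constraints from the problem

single_pass = m * n * 4**L       # bound for one traversal of the board
brute_force = k * single_pass    # one traversal per word

print(f"{single_pass:.2e}")  # 1.51e+08
print(f"{brute_force:.2e}")  # 4.53e+12
```

The gap between the two figures is exactly the factor `k`, which is what sharing one traversal across all words removes.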
wrong_approach: "Iterate through words and search each separately"
correct_approach: "Build a Trie and search all words in one board traversal"
- title: Forgetting to Handle Duplicates
description: |
The same word may be reachable via multiple paths on the board. For example, "aba" might appear both horizontally and vertically (diagonal moves are not allowed).
Without proper handling, you'll add duplicates to your result. The solution is to either:
- Use a set for results
- Remove the word from the Trie after finding it (preferred, as it also improves performance)
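To see how duplicates arise, the sketch below appends a word once for every path that spells it. The helper `all_matches` is hypothetical and deliberately skips deduplication. It assumes a non-empty word, as the constraints guarantee.

```python
def all_matches(board: list[list[str]], word: str) -> list[str]:
    """Append `word` once per distinct path that spells it (no deduplication)."""
    rows, cols = len(board), len(board[0])
    found: list[str] = []

    def dfs(row: int, col: int, idx: int) -> None:
        if not (0 <= row < rows and 0 <= col < cols) or board[row][col] != word[idx]:
            return
        if idx == len(word) - 1:
            found.append(word)
            return
        saved, board[row][col] = board[row][col], "#"  # mark as visited
        for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            dfs(row + dr, col + dc, idx + 1)
        board[row][col] = saved  # restore for other paths

    for i in range(rows):
        for j in range(cols):
            dfs(i, j, 0)
    return found


board = [["a", "b"], ["b", "a"]]
matches = all_matches(board, "ab")
print(len(matches))          # 4: each 'a' has two adjacent 'b' cells
print(sorted(set(matches)))  # ['ab']
```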
wrong_approach: "Append found words to a list without deduplication"
correct_approach: "Remove word from Trie after finding, or use a result set"
- title: Not Restoring Board State
description: |
When marking cells as visited during DFS, you must restore them when backtracking. A common bug is forgetting to restore, which corrupts the board for other paths.
```python
# Wrong: board[i][j] stays as '#' for other searches
board[i][j] = '#'
dfs(...)
# Missing: board[i][j] = original_char
```
Always restore the cell after exploring all directions from it.
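The effect of the bug is easy to reproduce. The sketch below is a single-word search with a hypothetical `restore` flag, added purely for demonstration, that switches the restoration step on or off.

```python
def exists(board: list[list[str]], word: str, restore: bool = True) -> bool:
    """Single-word search; `restore=False` reproduces the bug described above."""
    rows, cols = len(board), len(board[0])

    def dfs(row: int, col: int, idx: int) -> bool:
        if idx == len(word):
            return True
        if not (0 <= row < rows and 0 <= col < cols) or board[row][col] != word[idx]:
            return False
        original = board[row][col]
        board[row][col] = "#"  # mark as visited
        found = any(dfs(row + dr, col + dc, idx + 1)
                    for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)))
        if restore:
            board[row][col] = original  # undo the mark when backtracking
        return found

    return any(dfs(i, j, 0) for i in range(rows) for j in range(cols))


broken = [["a", "b"], ["c", "d"]]
print(exists(broken, "ab", restore=False))  # True
print(broken)                               # [['#', '#'], ['c', 'd']]
print(exists(broken, "ba", restore=False))  # False: the letters are gone

intact = [["a", "b"], ["c", "d"]]
print(exists(intact, "ab"), exists(intact, "ba"))  # True True
```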
wrong_approach: "Mark visited without restoration"
correct_approach: "Save original character, mark as '#', restore after DFS returns"
- title: Missing Trie Pruning Optimisation
description: |
Without pruning empty Trie branches, the Trie structure remains full even as words are found. This means the algorithm keeps checking paths that can no longer lead to any words.
For example, after finding "oath", if no other words start with "oat" or "oa" or "o", we should remove those nodes to avoid exploring "o..." prefixes again.
This optimisation can significantly improve average-case performance.
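One way to observe the effect is to count DFS calls with and without pruning. The sketch below is a compact variant of the Trie search, assuming a nested-dict Trie with `$` as the end-of-word marker and a hypothetical `prune` flag. The board is a deliberately extreme case, so the size of the saving should not be read as typical.

```python
def find_words(board: list[list[str]], words: list[str], prune: bool):
    """Return (found words, number of DFS calls); `prune` toggles branch removal."""
    root: dict = {}
    for word in words:
        node = root
        for char in word:
            node = node.setdefault(char, {})
        node["$"] = word  # '$' is an assumed end-of-word marker

    rows, cols = len(board), len(board[0])
    found: list[str] = []
    calls = 0

    def dfs(row: int, col: int, parent: dict) -> None:
        nonlocal calls
        calls += 1
        char = board[row][col]
        node = parent[char]
        if "$" in node:
            found.append(node.pop("$"))  # collect once, then clear the marker
        board[row][col] = "#"
        for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            r, c = row + dr, col + dc
            if 0 <= r < rows and 0 <= c < cols and board[r][c] in node:
                dfs(r, c, node)
        board[row][col] = char
        if prune and not node:
            del parent[char]  # drop the now-empty branch

    for i in range(rows):
        for j in range(cols):
            if board[i][j] in root:
                dfs(i, j, root)
    return found, calls


print(find_words([["a"] * 3 for _ in range(3)], ["a"], prune=False))  # (['a'], 9)
print(find_words([["a"] * 3 for _ in range(3)], ["a"], prune=True))   # (['a'], 1)
```

Without pruning, the emptied `a` node stays in the root, so every cell still triggers a DFS call that cannot find anything.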
wrong_approach: "Keep the full Trie structure throughout"
correct_approach: "Remove childless, wordless nodes after finding words"
key_takeaways:
- "**Trie for multi-pattern search**: When searching for many patterns in the same data, a Trie lets you check all patterns simultaneously rather than iterating through each"
- "**Prune early, prune often**: The power of the Trie approach comes from pruning — rejecting paths as soon as they can't lead to any word"
- "**Backtracking template**: Mark visited → explore all directions → restore state. This pattern appears in many grid/graph problems"
- "**Optimise the Trie dynamically**: Removing found words and empty branches prevents redundant work and can dramatically improve performance"
time_complexity: "O(m × n × 4^L) where `m × n` is the board size and `L` is the maximum word length. Each cell can be a starting point, and from each cell we explore up to 4 directions for up to `L` steps. The Trie pruning makes this much faster in practice."
space_complexity: "O(N) where `N` is the total number of characters across all words, for storing the Trie. The recursion stack adds O(L) for the maximum word length. The board modification for visited marking is O(1) extra space."
solutions:
- approach_name: Trie + Backtracking
is_optimal: true
code: |
class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode
        self.word = None  # Stores complete word at terminal nodes


class Solution:
    def findWords(self, board: list[list[str]], words: list[str]) -> list[str]:
        # Step 1: Build the Trie from all words
        root = TrieNode()
        for word in words:
            node = root
            for char in word:
                if char not in node.children:
                    node.children[char] = TrieNode()
                node = node.children[char]
            node.word = word  # Mark end of word

        result = []
        rows, cols = len(board), len(board[0])

        def backtrack(row: int, col: int, parent: TrieNode) -> None:
            char = board[row][col]
            node = parent.children[char]

            # Found a word: add to result and remove from Trie
            if node.word:
                result.append(node.word)
                node.word = None  # Prevent duplicates

            # Mark cell as visited
            board[row][col] = '#'

            # Explore all four directions
            for dr, dc in [(0, 1), (1, 0), (0, -1), (-1, 0)]:
                new_row, new_col = row + dr, col + dc
                # Check bounds and if next char exists in Trie
                if (0 <= new_row < rows and 0 <= new_col < cols
                        and board[new_row][new_col] in node.children):
                    backtrack(new_row, new_col, node)

            # Restore cell for other paths
            board[row][col] = char

            # Optimisation: prune empty branches from Trie
            if not node.children:
                del parent.children[char]

        # Step 2: Start DFS from each cell
        for i in range(rows):
            for j in range(cols):
                if board[i][j] in root.children:
                    backtrack(i, j, root)

        return result
explanation: |
**Time Complexity:** O(m × n × 4^L) — We potentially start from each cell and explore paths up to length L, with 4 directions at each step. Trie pruning significantly reduces this in practice.
**Space Complexity:** O(N) — The Trie stores all characters from all words. Recursion stack adds O(L).
This solution combines three techniques: a Trie for efficient prefix matching, backtracking for exploring all valid paths, and progressive pruning to eliminate dead branches. The key insight is that we traverse the Trie and board simultaneously, allowing us to prune paths that can't possibly lead to any word.
- approach_name: Brute Force (Word Search I per word)
is_optimal: false
code: |
class Solution:
    def findWords(self, board: list[list[str]], words: list[str]) -> list[str]:
        rows, cols = len(board), len(board[0])
        result = []

        def search_word(word: str) -> bool:
            """Search for a single word on the board."""
            def dfs(row: int, col: int, idx: int) -> bool:
                # Found complete word
                if idx == len(word):
                    return True
                # Check bounds and character match
                if (row < 0 or row >= rows or col < 0 or col >= cols
                        or board[row][col] != word[idx]):
                    return False
                # Mark visited
                temp = board[row][col]
                board[row][col] = '#'
                # Explore all directions
                found = (dfs(row + 1, col, idx + 1) or
                         dfs(row - 1, col, idx + 1) or
                         dfs(row, col + 1, idx + 1) or
                         dfs(row, col - 1, idx + 1))
                # Restore cell
                board[row][col] = temp
                return found

            # Try starting from each cell
            for i in range(rows):
                for j in range(cols):
                    if dfs(i, j, 0):
                        return True
            return False

        # Search for each word separately
        for word in words:
            if search_word(word):
                result.append(word)
        return result
explanation: |
**Time Complexity:** O(k × m × n × 4^L) — For each of k words, we potentially explore all cells and paths.
**Space Complexity:** O(L) — Recursion stack depth equals maximum word length.
This approach applies the Word Search I solution to each word independently. While correct, it's extremely slow for large word lists because it repeats board traversals and doesn't share work between words with common prefixes. Included to illustrate why the Trie optimisation is essential.