questions S-W
273
backend/data/questions/word-ladder.yaml
Normal file
@@ -0,0 +1,273 @@
title: Word Ladder
slug: word-ladder
difficulty: hard
leetcode_id: 127
leetcode_url: https://leetcode.com/problems/word-ladder/
categories:
- strings
- graphs
- hash-tables
patterns:
- bfs

description: |
A **transformation sequence** from word `beginWord` to word `endWord` using a dictionary `wordList` is a sequence of words `beginWord -> s1 -> s2 -> ... -> sk` such that:

- Every adjacent pair of words differs by a single letter.
- Every `si` for `1 <= i <= k` is in `wordList`. Note that `beginWord` does not need to be in `wordList`.
- `sk == endWord`

Given two words, `beginWord` and `endWord`, and a dictionary `wordList`, return *the **number of words** in the **shortest transformation sequence** from `beginWord` to `endWord`, or `0` if no such sequence exists*.

constraints: |
- `1 <= beginWord.length <= 10`
- `endWord.length == beginWord.length`
- `1 <= wordList.length <= 5000`
- `wordList[i].length == beginWord.length`
- `beginWord`, `endWord`, and `wordList[i]` consist of lowercase English letters
- `beginWord != endWord`
- All the words in `wordList` are **unique**

examples:
- input: 'beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"]'
output: "5"
explanation: 'One shortest transformation sequence is "hit" -> "hot" -> "dot" -> "dog" -> "cog", which is 5 words long.'
- input: 'beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"]'
output: "0"
explanation: 'The endWord "cog" is not in wordList, therefore there is no valid transformation sequence.'

explanation:
intuition: |
Imagine each word as a node in a graph. Two nodes are connected by an edge if they differ by exactly one letter. The problem then becomes: **find the shortest path** from `beginWord` to `endWord` in this graph.
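
To make the edge rule concrete, here is a minimal sketch of the "differ by exactly one letter" test. The helper name `differs_by_one` is an assumption made for illustration and does not appear in the solutions below, which generate neighbours instead of comparing pairs.

```python
def differs_by_one(a: str, b: str) -> bool:
    """Return True when two equal-length words differ in exactly one position."""
    if len(a) != len(b):
        return False
    # Count the positions where the letters disagree.
    return sum(x != y for x, y in zip(a, b)) == 1


print(differs_by_one("hit", "hot"))  # True: only the middle letter changes
print(differs_by_one("hit", "cog"))  # False: all three letters differ
```

Two words share an edge in the graph exactly when this check returns `True`.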

Why BFS? When searching for the shortest path in an unweighted graph (where every edge has the same "cost"), **Breadth-First Search** is the ideal algorithm. BFS explores all nodes at distance 1, then all nodes at distance 2, and so on. The first time we reach `endWord`, we are guaranteed to have found the shortest path.

Think of it like ripples spreading outward from a stone dropped in water. Starting from `beginWord`, we explore all words reachable by changing one letter. Then from each of those words, we explore their one-letter neighbours. The "ripple" that first touches `endWord` tells us the shortest transformation length.

The key insight is recognising this as a **graph shortest-path problem** disguised as a string manipulation problem. Once you see the graph structure, BFS becomes the natural choice.

approach: |
We solve this using **Breadth-First Search (BFS)** with a word set for O(1) lookups:

**Step 1: Handle early termination**

- Convert `wordList` to a set for O(1) membership checks
- If `endWord` is not in that set, return `0` immediately since no valid transformation exists

**Step 2: Initialise BFS data structures**

- `queue`: Contains tuples of `(current_word, transformation_length)`, starting with `(beginWord, 1)`
- `visited`: A set to track words we've already processed, preventing cycles

**Step 3: Process the BFS queue**

- Dequeue the front word and its current transformation length
- If this word equals `endWord`, return the transformation length (shortest path found)
- Otherwise, generate all possible one-letter transformations

**Step 4: Generate neighbour words efficiently**

- For each position in the word, try replacing it with every letter from `a` to `z`
- If the new word exists in `wordList` and hasn't been visited:
  - Mark it as visited
  - Add it to the queue with `length + 1`

**Step 5: Return result**

- If the queue empties without finding `endWord`, return `0`

This approach guarantees we find the shortest path because BFS explores all words at distance `d` before any word at distance `d+1`.
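
The level-by-level guarantee can be seen by recording each BFS "ripple" explicitly. The sketch below is an illustration only: the function name `bfs_levels` is hypothetical, and it skips the early `endWord` check from Step 1 so that the levels stay visible.

```python
from string import ascii_lowercase


def bfs_levels(begin_word: str, end_word: str, word_list: list[str]) -> list[set[str]]:
    """Return the BFS frontiers, stopping once end_word appears or nothing is left."""
    word_set = set(word_list)
    visited = {begin_word}
    levels = [{begin_word}]

    while levels[-1] and end_word not in levels[-1]:
        next_level = set()
        for word in levels[-1]:
            for i in range(len(word)):
                for c in ascii_lowercase:
                    candidate = word[:i] + c + word[i + 1:]
                    if candidate in word_set and candidate not in visited:
                        visited.add(candidate)
                        next_level.add(candidate)
        levels.append(next_level)

    return levels


levels = bfs_levels("hit", "cog", ["hot", "dot", "dog", "lot", "log", "cog"])
for depth, words in enumerate(levels, start=1):
    print(depth, sorted(words))
```

For the first example the frontiers are `hit`, then `hot`, then `dot`/`lot`, then `dog`/`log`, then `cog`, so `endWord` first appears in the fifth level, matching the expected answer of 5.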

common_pitfalls:
- title: Using DFS Instead of BFS
description: |
DFS will find *a* path but not necessarily the *shortest* path. DFS explores one branch deeply before backtracking, so it might find a longer transformation sequence first.

In the first example, `hit -> hot -> lot -> log -> cog` and `hit -> hot -> dot -> dog -> cog` are both shortest sequences (5 words), so DFS gives the right answer only if it happens to find one of them first. Depending on exploration order, it could just as easily return `hit -> hot -> dot -> lot -> log -> cog` (6 words), and on larger inputs the first path found can be much longer.

BFS guarantees shortest path in unweighted graphs because it explores level by level.
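
As a hedged illustration of this pitfall, the sketch below runs a plain DFS that returns the first path it finds. The helper names (`differs_by_one`, `dfs_first_path`) and the particular ordering of the word list are assumptions chosen to expose the problem; a different ordering could let DFS get lucky.

```python
def differs_by_one(a: str, b: str) -> bool:
    """True when two equal-length words differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def dfs_first_path(word: str, end_word: str, word_list: list[str], visited: set[str]) -> list[str]:
    """Return the first path DFS stumbles on, or [] if there is none."""
    if word == end_word:
        return [word]
    # Neighbours are tried in word_list order, so the result depends on that order.
    for candidate in word_list:
        if candidate not in visited and differs_by_one(word, candidate):
            visited.add(candidate)
            rest = dfs_first_path(candidate, end_word, word_list, visited)
            if rest:
                return [word] + rest
    return []


words = ["hot", "dot", "lot", "log", "dog", "cog"]
path = dfs_first_path("hit", "cog", words, {"hit"})
print(" -> ".join(path), f"({len(path)} words)")
```

With this ordering DFS wanders through `hit -> hot -> dot -> lot -> log -> dog -> cog`, a 7-word sequence, even though the shortest sequence has 5 words.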
wrong_approach: "Use DFS with path tracking"
correct_approach: "Use BFS to guarantee shortest path"

- title: Comparing Every Word Pair (O(n^2) Neighbour Check)
description: |
A naive approach compares every word against every other word to find neighbours differing by one letter. With `n` words of length `m`, this is O(n^2 * m) just for building the graph.

Instead, for each word, generate all possible one-letter variations and check if they exist in the word set. This produces `n * m * 26` candidates, each costing O(m) to build and hash, for O(26 * n * m^2) in total, which is much faster than the pairwise scan when `n` is large relative to `m`.

With `wordList.length <= 5000` and word length up to 10, the optimised approach checks about 1.3M candidate words, versus up to 250M character comparisons for the naive approach.
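
A minimal sketch of the variation-based neighbour search follows, together with the arithmetic behind the two estimates. The function name `neighbours` is hypothetical; the solutions below inline the same loops rather than calling a helper.

```python
from string import ascii_lowercase


def neighbours(word: str, word_set: set[str]) -> list[str]:
    """Return every word in word_set that differs from `word` by one letter."""
    found = []
    for i in range(len(word)):
        for c in ascii_lowercase:
            if c == word[i]:
                continue  # skip the unchanged word itself
            candidate = word[:i] + c + word[i + 1:]
            if candidate in word_set:
                found.append(candidate)
    return found


example_set = {"hot", "dot", "dog", "lot", "log", "cog"}
print(neighbours("hot", example_set))  # ['dot', 'lot']

# Worst-case counts at the constraint limits (n = 5000 words, m = 10 letters).
n, m = 5000, 10
print(n * m * 26)  # 1300000 candidate words to look up
print(n * n * m)   # 250000000 character comparisons for the pairwise scan
```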
wrong_approach: "Compare every pair of words"
correct_approach: "Generate variations and check set membership"

- title: Forgetting to Check if endWord Exists
description: |
If `endWord` is not in `wordList`, no valid transformation can exist. Failing to check this upfront means BFS runs to exhaustion before returning `0`.

Always validate inputs first: `if endWord not in word_set: return 0`.

- title: Not Marking Words as Visited
description: |
Without tracking visited words, BFS can revisit the same word multiple times from different paths, leading to:
- Infinite loops in graphs with cycles
- Exponential time complexity as the same subgraphs are explored repeatedly

Mark words as visited **when adding to the queue**, not when dequeuing. This prevents adding duplicates to the queue.
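
The difference between the two marking strategies can be measured by counting queue insertions. The sketch below is an illustration under assumptions: `count_enqueues` and its `mark_on_enqueue` flag are hypothetical names, and the two-letter word list is made up so that one word (`bb`) is reachable from two different parents.

```python
from collections import deque
from string import ascii_lowercase


def count_enqueues(begin_word: str, word_list: list[str], mark_on_enqueue: bool) -> int:
    """Run BFS over the whole graph and count how many items enter the queue."""
    word_set = set(word_list)
    queue = deque([begin_word])
    visited = {begin_word} if mark_on_enqueue else set()
    enqueued = 1

    while queue:
        word = queue.popleft()
        if not mark_on_enqueue:
            if word in visited:
                continue  # a duplicate entry that was queued earlier
            visited.add(word)
        for i in range(len(word)):
            for c in ascii_lowercase:
                candidate = word[:i] + c + word[i + 1:]
                if candidate in word_set and candidate not in visited:
                    if mark_on_enqueue:
                        visited.add(candidate)
                    queue.append(candidate)
                    enqueued += 1

    return enqueued


words = ["ab", "ba", "bb"]
print(count_enqueues("aa", words, mark_on_enqueue=True))   # 4
print(count_enqueues("aa", words, mark_on_enqueue=False))  # 5
```

Marking on dequeue lets `bb` enter the queue twice, once from `ab` and once from `ba`. On this tiny graph that is a single wasted entry, but the duplicates multiply as more paths converge on the same word.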
wrong_approach: "Process words without tracking visited"
correct_approach: "Mark visited when enqueuing to prevent duplicates"

key_takeaways:
- "**Graph recognition**: Many string transformation problems are graph shortest-path problems in disguise. When you see 'minimum steps' or 'shortest sequence', think BFS"
- "**BFS for shortest path**: In unweighted graphs, BFS guarantees the shortest path. This is fundamental and appears in many problems"
- "**Optimise neighbour generation**: Instead of comparing all pairs, generate possible variations and check set membership. This replaces O(n^2 * m) pairwise comparisons with O(n * m * alphabet_size) set lookups"
- "**Foundation for Word Ladder II**: Word Ladder II (LeetCode 126) asks for all shortest paths, requiring you to track parent pointers during BFS"

time_complexity: "O(n * m^2 * 26) where `n` is the number of words and `m` is the word length. Each word is expanded at most once into `m * 26` candidate words, and building and hashing each candidate costs O(m)."
space_complexity: "O(n * m). The visited set and queue can each hold up to `n` words of length `m`."

solutions:
- approach_name: BFS with Set Lookup
is_optimal: true
code: |
from collections import deque

def ladder_length(begin_word: str, end_word: str, word_list: list[str]) -> int:
    # Convert to set for O(1) lookups
    word_set = set(word_list)

    # Early termination: end_word must be in the word list
    if end_word not in word_set:
        return 0

    # BFS setup: (current_word, transformation_count)
    queue = deque([(begin_word, 1)])
    visited = {begin_word}

    while queue:
        current_word, length = queue.popleft()

        # Try changing each character position
        for i in range(len(current_word)):
            # Try all 26 letters
            for c in 'abcdefghijklmnopqrstuvwxyz':
                # Build the new word with one character changed
                next_word = current_word[:i] + c + current_word[i+1:]

                # Found the target!
                if next_word == end_word:
                    return length + 1

                # Valid unvisited word? Add to queue
                if next_word in word_set and next_word not in visited:
                    visited.add(next_word)
                    queue.append((next_word, length + 1))

    # No path found
    return 0
explanation: |
**Time Complexity:** O(n * m^2 * 26) where n is the word list size and m is word length. The extra factor of m comes from building and hashing each candidate word.

**Space Complexity:** O(n * m) for the visited set and queue.

BFS explores words level by level, guaranteeing the first path found to `endWord` is the shortest. We optimise neighbour finding by generating all single-character variations rather than comparing against all words.

- approach_name: Bidirectional BFS
is_optimal: true
code: |
def ladder_length(begin_word: str, end_word: str, word_list: list[str]) -> int:
    word_set = set(word_list)

    if end_word not in word_set:
        return 0

    # Search from both ends simultaneously
    front = {begin_word}
    back = {end_word}
    visited = set()
    length = 1

    while front and back:
        # Always expand the smaller frontier for efficiency
        if len(front) > len(back):
            front, back = back, front

        next_front = set()

        for word in front:
            for i in range(len(word)):
                for c in 'abcdefghijklmnopqrstuvwxyz':
                    next_word = word[:i] + c + word[i+1:]

                    # Frontiers meet! Path found
                    if next_word in back:
                        return length + 1

                    if next_word in word_set and next_word not in visited:
                        visited.add(next_word)
                        next_front.add(next_word)

        front = next_front
        length += 1

    return 0
explanation: |
**Time Complexity:** O(n * m^2 * 26) in the worst case, but often faster in practice due to the smaller search space.

**Space Complexity:** O(n * m) for the visited set and frontiers.

Bidirectional BFS searches from both `beginWord` and `endWord` simultaneously. When the two search frontiers meet, we've found the shortest path. This reduces the search space from O(b^d) to O(b^(d/2)) where b is branching factor and d is depth, providing significant speedup on large graphs.

- approach_name: BFS with Wildcard Preprocessing
is_optimal: false
code: |
from collections import deque, defaultdict

def ladder_length(begin_word: str, end_word: str, word_list: list[str]) -> int:
    if end_word not in word_list:
        return 0

    # Preprocess: group words by wildcard patterns
    # "hot" -> ["*ot", "h*t", "ho*"]
    word_len = len(begin_word)
    patterns = defaultdict(list)

    for word in word_list:
        for i in range(word_len):
            pattern = word[:i] + '*' + word[i+1:]
            patterns[pattern].append(word)

    # BFS using pattern lookup
    queue = deque([(begin_word, 1)])
    visited = {begin_word}

    while queue:
        current_word, length = queue.popleft()

        # Find neighbours through shared patterns
        for i in range(word_len):
            pattern = current_word[:i] + '*' + current_word[i+1:]

            for neighbour in patterns[pattern]:
                if neighbour == end_word:
                    return length + 1

                if neighbour not in visited:
                    visited.add(neighbour)
                    queue.append((neighbour, length + 1))

    return 0
explanation: |
**Time Complexity:** O(n * m^2) for preprocessing plus O(n * m * (m + 26)) for BFS, since each dequeued word builds `m` patterns of length `m` and each bucket holds at most 26 words.

**Space Complexity:** O(n * m^2) for the pattern dictionary.

This approach preprocesses words into "wildcard buckets" (e.g., `h*t` contains both `hot` and `hat`), so finding neighbours becomes a dictionary lookup. It spends extra memory to avoid generating and testing 26 candidates per position, and pays off most when the word list is dense (many words share patterns).
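
To show what the buckets look like, here is a small sketch of the preprocessing step on its own. The function name `build_buckets` and the three-word list are assumptions for illustration; the solution above builds the same dictionary inline under the name `patterns`.

```python
from collections import defaultdict


def build_buckets(word_list: list[str]) -> dict[str, list[str]]:
    """Group words by each pattern obtained by replacing one letter with '*'."""
    buckets = defaultdict(list)
    for word in word_list:
        for i in range(len(word)):
            buckets[word[:i] + "*" + word[i + 1:]].append(word)
    return buckets


buckets = build_buckets(["hot", "hat", "dot"])
print(buckets["h*t"])  # ['hot', 'hat']
print(buckets["*ot"])  # ['hot', 'dot']
```

Any two words in the same bucket differ only at the wildcard position, so each bucket is a group of mutual neighbours.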