title: Word Ladder
slug: word-ladder
difficulty: hard
leetcode_id: 127
leetcode_url: https://leetcode.com/problems/word-ladder/
categories:
  - strings
  - graphs
  - hash-tables
patterns:
  - bfs
description: |
  A **transformation sequence** from word `beginWord` to word `endWord` using a dictionary `wordList` is a sequence of words `beginWord -> s1 -> s2 -> ... -> sk` such that:

  - Every adjacent pair of words differs by a single letter.
  - Every `si` for `1 <= i <= k` is in `wordList`. Note that `beginWord` does not need to be in `wordList`.
  - `sk == endWord`

  Given two words, `beginWord` and `endWord`, and a dictionary `wordList`, return *the **number of words** in the **shortest transformation sequence** from `beginWord` to `endWord`, or `0` if no such sequence exists*.
constraints: |
  - `1 <= beginWord.length <= 10`
  - `endWord.length == beginWord.length`
  - `1 <= wordList.length <= 5000`
  - `wordList[i].length == beginWord.length`
  - `beginWord`, `endWord`, and `wordList[i]` consist of lowercase English letters
  - `beginWord != endWord`
  - All the words in `wordList` are **unique**
examples:
  - input: 'beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"]'
    output: "5"
    explanation: 'One shortest transformation sequence is "hit" -> "hot" -> "dot" -> "dog" -> "cog", which is 5 words long.'
  - input: 'beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"]'
    output: "0"
    explanation: 'The endWord "cog" is not in wordList, therefore there is no valid transformation sequence.'
explanation:
  intuition: |
    Imagine each word as a node in a graph. Two nodes are connected by an edge if they differ by exactly one letter. The problem then becomes: **find the shortest path** from `beginWord` to `endWord` in this graph.

    Why BFS? When searching for the shortest path in an unweighted graph (where every edge has the same "cost"), **Breadth-First Search** is the ideal algorithm.
    BFS explores all nodes at distance 1, then all nodes at distance 2, and so on. The first time we reach `endWord`, we are guaranteed to have found the shortest path.

    Think of it like ripples spreading outward from a stone dropped in water. Starting from `beginWord`, we explore all words reachable by changing one letter. Then from each of those words, we explore their one-letter neighbours. The "ripple" that first touches `endWord` tells us the shortest transformation length.

    The key insight is recognising this as a **graph shortest-path problem** disguised as a string manipulation problem. Once you see the graph structure, BFS becomes the natural choice.
  approach: |
    We solve this using **Breadth-First Search (BFS)** with a word set for O(1) lookups:

    **Step 1: Handle early termination**

    - If `endWord` is not in `wordList`, return `0` immediately since no valid transformation exists
    - Convert `wordList` to a set for O(1) membership checks

    **Step 2: Initialise BFS data structures**

    - `queue`: Contains tuples of `(current_word, transformation_length)`, starting with `(beginWord, 1)`
    - `visited`: A set to track words we've already processed, preventing cycles

    **Step 3: Process the BFS queue**

    - Dequeue the front word and its current transformation length
    - If this word equals `endWord`, return the transformation length (shortest path found)
    - Otherwise, generate all possible one-letter transformations

    **Step 4: Generate neighbour words efficiently**

    - For each position in the word, try replacing it with every letter from `a` to `z`
    - If the new word exists in `wordList` and hasn't been visited:
      - Mark it as visited
      - Add it to the queue with `length + 1`

    **Step 5: Return result**

    - If the queue empties without finding `endWord`, return `0`

    This approach guarantees we find the shortest path because BFS explores all words at distance `d` before any word at distance `d+1`.
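
    Step 4 above can be sketched in isolation as follows. This is a minimal illustration rather than part of the reference solutions: the helper name `one_letter_neighbours` is hypothetical, and it assumes lowercase English words as stated in the constraints.

    ```python
    from string import ascii_lowercase


    def one_letter_neighbours(word: str, word_set: set[str]) -> list[str]:
        """Return the words in word_set that differ from word by exactly one letter."""
        neighbours = []
        for i in range(len(word)):
            for c in ascii_lowercase:
                if c == word[i]:
                    continue  # skip the unchanged word itself
                candidate = word[:i] + c + word[i + 1:]
                if candidate in word_set:
                    neighbours.append(candidate)
        return neighbours


    # Hypothetical usage with the word list from the first example
    words = {"hot", "dot", "dog", "lot", "log", "cog"}
    print(one_letter_neighbours("hit", words))          # ['hot']
    print(sorted(one_letter_neighbours("hot", words)))  # ['dot', 'lot']
    ```

    Skipping the unchanged letter keeps the word from being reported as its own neighbour. A `visited` set would still be needed in the BFS itself.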
  common_pitfalls:
    - title: Using DFS Instead of BFS
      description: |
        DFS will find *a* path but not necessarily the *shortest* path. DFS explores one branch deeply before backtracking, so it might find a longer transformation sequence first.

        In the first example, `hit -> hot -> dot -> dog -> cog` and `hit -> hot -> lot -> log -> cog` are both 5 words long, so DFS happens to return a shortest path whichever branch it follows. On other inputs, the first path DFS finds can be much longer than the shortest one.

        BFS guarantees the shortest path in unweighted graphs because it explores level by level.
      wrong_approach: "Use DFS with path tracking"
      correct_approach: "Use BFS to guarantee shortest path"
    - title: Comparing Every Word Pair (O(n^2) Neighbour Check)
      description: |
        A naive approach compares every word against every other word to find neighbours differing by one letter. With `n` words of length `m`, this is O(n^2 * m) just for building the graph.

        Instead, for each word, generate all possible one-letter variations and check if they exist in the word set. This takes O(n * m * 26) = O(n * m) set lookups, which is much faster when `n` is large.

        With `wordList.length <= 5000` and word length up to 10, the optimised approach does ~1.3M lookups vs potentially 250M character comparisons for the naive approach.
      wrong_approach: "Compare every pair of words"
      correct_approach: "Generate variations and check set membership"
    - title: Forgetting to Check if endWord Exists
      description: |
        If `endWord` is not in `wordList`, no valid transformation can exist. Failing to check this upfront means BFS runs to exhaustion before returning `0`.

        Always validate inputs first: `if end_word not in word_set: return 0`.
    - title: Not Marking Words as Visited
      description: |
        Without tracking visited words, BFS can revisit the same word multiple times from different paths, leading to:

        - Infinite loops in graphs with cycles
        - Exponential time complexity as the same subgraphs are explored repeatedly

        Mark words as visited **when adding to the queue**, not when dequeuing.
        This prevents adding duplicates to the queue.
      wrong_approach: "Process words without tracking visited"
      correct_approach: "Mark visited when enqueuing to prevent duplicates"
  key_takeaways:
    - "**Graph recognition**: Many string transformation problems are graph shortest-path problems in disguise. When you see 'minimum steps' or 'shortest sequence', think BFS"
    - "**BFS for shortest path**: In unweighted graphs, BFS guarantees the shortest path. This is fundamental and appears in many problems"
    - "**Optimise neighbour generation**: Instead of comparing all pairs, generate possible variations and check set membership. This replaces O(n^2 * m) pairwise comparison with O(n * m * alphabet_size) set lookups"
    - "**Foundation for Word Ladder II**: Word Ladder II (LeetCode 126) asks for all shortest paths, which requires tracking parent pointers during BFS"
  time_complexity: "O(n * m * 26) set lookups, where `n` is the number of words and `m` is the word length. For each word, we generate `m * 26` variations. Building and hashing each variation costs a further O(m), so the character-level work is O(n * m^2 * 26)."
  space_complexity: "O(n * m). The visited set and queue can each hold up to `n` words of length `m`."
solutions:
  - approach_name: BFS with Set Lookup
    is_optimal: true
    code: |
      from collections import deque

      def ladder_length(begin_word: str, end_word: str, word_list: list[str]) -> int:
          # Convert to set for O(1) lookups
          word_set = set(word_list)

          # Early termination: end_word must be in the word list
          if end_word not in word_set:
              return 0

          # BFS setup: (current_word, transformation_count)
          queue = deque([(begin_word, 1)])
          visited = {begin_word}

          while queue:
              current_word, length = queue.popleft()

              # Try changing each character position
              for i in range(len(current_word)):
                  # Try all 26 letters
                  for c in 'abcdefghijklmnopqrstuvwxyz':
                      # Build the new word with one character changed
                      next_word = current_word[:i] + c + current_word[i+1:]

                      # Found the target!
                      if next_word == end_word:
                          return length + 1

                      # Valid unvisited word?
                      # Add to queue
                      if next_word in word_set and next_word not in visited:
                          visited.add(next_word)
                          queue.append((next_word, length + 1))

          # No path found
          return 0
    explanation: |
      **Time Complexity:** O(n * m * 26) where n is the word list size and m is word length.

      **Space Complexity:** O(n * m) for the visited set and queue.

      BFS explores words level by level, guaranteeing the first path found to `endWord` is the shortest. We optimise neighbour finding by generating all single-character variations rather than comparing against all words.
  - approach_name: Bidirectional BFS
    is_optimal: true
    code: |
      def ladder_length(begin_word: str, end_word: str, word_list: list[str]) -> int:
          word_set = set(word_list)

          if end_word not in word_set:
              return 0

          # Search from both ends simultaneously
          front = {begin_word}
          back = {end_word}
          # Seed with both endpoints so neither is re-added to a frontier
          visited = {begin_word, end_word}
          length = 1

          while front and back:
              # Always expand the smaller frontier for efficiency
              if len(front) > len(back):
                  front, back = back, front

              next_front = set()

              for word in front:
                  for i in range(len(word)):
                      for c in 'abcdefghijklmnopqrstuvwxyz':
                          next_word = word[:i] + c + word[i+1:]

                          # Frontiers meet! Path found
                          if next_word in back:
                              return length + 1

                          if next_word in word_set and next_word not in visited:
                              visited.add(next_word)
                              next_front.add(next_word)

              front = next_front
              length += 1

          return 0
    explanation: |
      **Time Complexity:** O(n * m * 26), but often faster in practice due to smaller search space.

      **Space Complexity:** O(n * m) for the visited set and frontiers.

      Bidirectional BFS searches from both `beginWord` and `endWord` simultaneously. When the two search frontiers meet, we've found the shortest path. This reduces the search space from O(b^d) to O(b^(d/2)) where b is branching factor and d is depth, providing significant speedup on large graphs.
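
      To give a feel for the O(b^d) versus O(b^(d/2)) claim, here is a small arithmetic sketch. The values of `b` and `d` are illustrative assumptions, not measurements from any real word list.

      ```python
      # Illustrative values only: branching factor b and depth d are assumptions
      b, d = 10, 6

      one_sided = b ** d               # rough node count for a single BFS of depth d
      two_sided = 2 * b ** (d // 2)    # two searches, each reaching depth d / 2

      print(one_sided)  # 1000000
      print(two_sided)  # 2000
      ```

      In Word Ladder a frontier can never hold more than `n` words, so the real saving is bounded by the word list size and is usually less dramatic than this estimate suggests.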
  - approach_name: BFS with Wildcard Preprocessing
    is_optimal: false
    code: |
      from collections import deque, defaultdict

      def ladder_length(begin_word: str, end_word: str, word_list: list[str]) -> int:
          if end_word not in word_list:
              return 0

          # Preprocess: group words by wildcard patterns
          # "hot" -> ["*ot", "h*t", "ho*"]
          word_len = len(begin_word)
          patterns = defaultdict(list)

          for word in word_list:
              for i in range(word_len):
                  pattern = word[:i] + '*' + word[i+1:]
                  patterns[pattern].append(word)

          # BFS using pattern lookup
          queue = deque([(begin_word, 1)])
          visited = {begin_word}

          while queue:
              current_word, length = queue.popleft()

              # Find neighbours through shared patterns
              for i in range(word_len):
                  pattern = current_word[:i] + '*' + current_word[i+1:]

                  for neighbour in patterns[pattern]:
                      if neighbour == end_word:
                          return length + 1

                      if neighbour not in visited:
                          visited.add(neighbour)
                          queue.append((neighbour, length + 1))

          return 0
    explanation: |
      **Time Complexity:** O(n * m^2) for preprocessing plus O(n * m) for BFS.

      **Space Complexity:** O(n * m^2) for the pattern dictionary.

      This approach preprocesses words into "wildcard buckets" (e.g., `h*t` contains both `hot` and `hat`). Finding neighbours becomes a dictionary lookup. This trades extra memory for cheaper neighbour lookups. Best when the word list is dense (many words share patterns).
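
      The bucket structure can be inspected on its own with a short sketch. The helper name `build_patterns` is hypothetical and simply isolates the preprocessing loop from the solution above, applied to the word list from the first example.

      ```python
      from collections import defaultdict


      def build_patterns(word_list: list[str]) -> dict[str, list[str]]:
          """Group words by every pattern obtained by masking one position with '*'."""
          patterns = defaultdict(list)
          for word in word_list:
              for i in range(len(word)):
                  patterns[word[:i] + "*" + word[i + 1:]].append(word)
          return patterns


      buckets = build_patterns(["hot", "dot", "dog", "lot", "log", "cog"])
      print(buckets["*ot"])  # ['hot', 'dot', 'lot']
      print(buckets["do*"])  # ['dot', 'dog']
      print(buckets["h*t"])  # ['hot']
      ```

      `beginWord` does not need its own bucket entries: looking up the patterns of `hit` (such as `h*t`) is enough to reach `hot`.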