name: Trie
slug: trie
difficulty_level: 3
pattern_type: data_structure
display_order: 17

description: >
  A tree-like data structure for efficient string prefix operations. Each edge
  is labeled with a character, so the path from the root to a node spells out a
  prefix. Tries enable O(m) search, insert, and prefix queries, where m is the
  length of the word or prefix.

when_to_use: |
  - Autocomplete systems
  - Spell checkers
  - Word dictionary with prefix search
  - Word break problems
  - IP routing (longest prefix matching)

metaphor: |
  Imagine a filing cabinet where files are organized by name, one letter per
  drawer. To find "apple," you open drawer 'a', then find sub-drawer 'p', then
  'p', then 'l', then 'e'. You don't search through all files; you navigate
  directly to the right location. Finding "application" shares the same path
  up to "appl" before diverging.

  Another analogy: a phone book organized as a tree. Instead of a flat
  alphabetical list, common prefixes are grouped, making it fast to find all
  names starting with "Joh" or check if "Johnson" exists.

core_concept: |
  A **Trie** (pronounced "try") stores strings character by character:

  - **Root**: Empty node representing the empty prefix
  - **Edges**: Labeled with characters
  - **Nodes**: Represent prefixes; may be marked as "end of word"

  Key insight: all words sharing a prefix share the same path from root.
  Each operation therefore costs time proportional to the key length m,
  regardless of how many words are stored:

  - **Insert word**: O(m), creating any missing nodes along the path from the root
  - **Search word**: O(m), following the path and then checking the end marker
  - **Starts with prefix**: O(m), following the path only; no end check needed

  **Trade-off**: Tries typically use more memory than hash sets (each character
  gets its own node, with per-node overhead), but they support prefix queries
  that a hash set can only answer by scanning every key.
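
  The trade-off can be made concrete with a minimal sketch. It assumes a
  nested-dict trie and a hypothetical `END` marker key (neither is part of the
  template in this file), and it contrasts a prefix query on a trie with the
  full scan that a hash set needs:

  ```python
  END = "$"  # hypothetical end-of-word key; assumes no word contains "$"


  def build(words):
      """Build a nested-dict trie: each key is a character, each value a child dict."""
      root = {}
      for word in words:
          node = root
          for char in word:
              node = node.setdefault(char, {})
          node[END] = True
      return root


  def has_prefix_trie(root, prefix):
      """Follow one path of at most len(prefix) steps."""
      node = root
      for char in prefix:
          if char not in node:
              return False
          node = node[char]
      return True


  def has_prefix_set(words, prefix):
      """A hash set has no prefix structure, so every stored word is checked."""
      return any(word.startswith(prefix) for word in words)


  def count_nodes(node):
      """Count character nodes below `node`, ignoring end markers."""
      return sum(1 + count_nodes(child)
                 for key, child in node.items() if key != END)


  words = {"app", "apple", "apply", "apt", "bat"}
  root = build(words)
  shared = count_nodes(root)              # nodes after prefix sharing
  total = sum(len(word) for word in words)  # characters without sharing
  ```

  Shared prefixes reduce the node count below the total character count, but
  each node is a separate dict, which is where the extra memory goes.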

visualization: |
  **Trie containing: ["app", "apple", "apply", "apt", "bat"]**

  ```
        (root)
        /    \
       a      b
       |      |
       p      a
      / \     |
     p*  t*   t*
     |
     l
    / \
   e*  y*

  * = end of word marker

  Paths:
  - "app"   → a-p-p*
  - "apple" → a-p-p-l-e*
  - "apply" → a-p-p-l-y*
  - "apt"   → a-p-t*
  - "bat"   → b-a-t*
  ```

  **Search for "apple":**

  ```
  Start at root
  → 'a': found, move to 'a' node
  → 'p': found, move to 'p' node
  → 'p': found, move to second 'p' node
  → 'l': found, move to 'l' node
  → 'e': found, move to 'e' node
  → end of word marker? Yes!

  "apple" exists ✓
  ```

  **Search for "app":**

  ```
  Follow path a-p-p
  → end of word marker on second 'p'? Yes!

  "app" exists ✓
  ```

  **Starts with "ap":**

  ```
  Follow path a-p
  → reached end of prefix successfully

  Words with prefix "ap" exist ✓
  ```

code_template: |
  from typing import Optional


  class TrieNode:
      def __init__(self):
          self.children = {}   # maps a character to its child TrieNode
          self.is_end = False  # True if a word ends at this node


  class Trie:
      def __init__(self):
          self.root = TrieNode()

      def insert(self, word: str) -> None:
          """Insert a word into the trie."""
          node = self.root
          for char in word:
              if char not in node.children:
                  node.children[char] = TrieNode()
              node = node.children[char]
          node.is_end = True

      def search(self, word: str) -> bool:
          """Check if word exists in trie."""
          node = self._traverse(word)
          return node is not None and node.is_end

      def starts_with(self, prefix: str) -> bool:
          """Check if any word starts with prefix."""
          return self._traverse(prefix) is not None

      def _traverse(self, s: str) -> Optional[TrieNode]:
          """Follow string s from the root; return the final node or None."""
          node = self.root
          for char in s:
              if char not in node.children:
                  return None
              node = node.children[char]
          return node


  class WordDictionary:
      """Trie with wildcard search support."""

      def __init__(self):
          self.root = TrieNode()

      def add_word(self, word: str) -> None:
          node = self.root
          for char in word:
              if char not in node.children:
                  node.children[char] = TrieNode()
              node = node.children[char]
          node.is_end = True

      def search(self, word: str) -> bool:
          """Search with '.' as wildcard for any character."""
          def dfs(node: TrieNode, i: int) -> bool:
              if i == len(word):
                  return node.is_end

              char = word[i]

              if char == '.':
                  # Try all children
                  return any(dfs(child, i + 1)
                             for child in node.children.values())
              else:
                  if char not in node.children:
                      return False
                  return dfs(node.children[char], i + 1)

          return dfs(self.root, 0)


  def word_break(s: str, word_dict: list[str]) -> bool:
      """Check if string can be segmented into dictionary words."""
      trie = Trie()
      for word in word_dict:
          trie.insert(word)

      n = len(s)
      dp = [False] * (n + 1)  # dp[i]: s[:i] can be segmented
      dp[0] = True  # Empty string can be segmented

      for i in range(n):
          if not dp[i]:
              continue

          # Walk the trie along s[i:], marking every position where a word ends
          node = trie.root
          for j in range(i, n):
              if s[j] not in node.children:
                  break
              node = node.children[s[j]]
              if node.is_end:
                  dp[j + 1] = True

      return dp[n]


  def find_words_with_prefix(trie: Trie, prefix: str) -> list[str]:
      """Find all words starting with prefix."""
      node = trie._traverse(prefix)
      if node is None:
          return []

      results = []

      def dfs(node: TrieNode, path: str) -> None:
          if node.is_end:
              results.append(path)
          for char, child in node.children.items():
              dfs(child, path + char)

      dfs(node, prefix)
      return results

recognition_signals:
  - "prefix"
  - "autocomplete"
  - "word dictionary"
  - "spell check"
  - "word search"
  - "word break"
  - "longest common prefix"
  - "starts with"
  - "implement trie"
  - "wildcard"

common_mistakes:
  - title: Confusing search vs starts_with
    description: |
      Search checks if the exact word exists (must have end marker).
      Starts_with only checks if the prefix path exists.
    fix: |
      For search, always check `node.is_end` at the end:
      ```python
      def search(self, word):
          node = self._traverse(word)
          return node is not None and node.is_end
      ```

  - title: Not handling empty string
    description: |
      Empty string is a valid prefix (everything starts with it) but may not
      be a valid word in the dictionary.
    fix: |
      `starts_with("")` should return True if the trie has any words. A plain
      path traversal returns True even for an empty trie, so add an explicit
      check if that distinction matters.
      `search("")` should return True only if the empty string was explicitly inserted.
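
      One way to get this behavior is sketched below. The stricter
      `starts_with` is an assumption about what the caller wants, not a
      required change: it treats a prefix as present only when the node it
      reaches ends a word or has children, so an empty trie reports False
      for the empty prefix.

      ```python
      class TrieNode:
          def __init__(self):
              self.children = {}
              self.is_end = False


      class Trie:
          def __init__(self):
              self.root = TrieNode()

          def insert(self, word):
              node = self.root
              for char in word:
                  if char not in node.children:
                      node.children[char] = TrieNode()
                  node = node.children[char]
              node.is_end = True  # for word == "", this marks the root itself

          def _traverse(self, s):
              node = self.root
              for char in s:
                  node = node.children.get(char)
                  if node is None:
                      return None
              return node

          def search(self, word):
              node = self._traverse(word)
              return node is not None and node.is_end

          def starts_with(self, prefix):
              # Stricter than a bare traversal: the node must lead to a word.
              node = self._traverse(prefix)
              return node is not None and (node.is_end or bool(node.children))


      trie = Trie()
      empty_prefix_on_empty_trie = trie.starts_with("")  # nothing stored yet
      trie.insert("apple")
      empty_prefix_after_insert = trie.starts_with("")
      empty_word_before_insert = trie.search("")
      trie.insert("")
      empty_word_after_insert = trie.search("")
      ```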

  - title: Using array instead of dict for children
    description: |
      Using `children = [None] * 26` assumes only lowercase letters. This breaks
      on uppercase letters, digits, punctuation, or any other character set.
    fix: |
      Use a dictionary for flexibility:
      ```python
      self.children = {}  # Works for any characters
      ```
      Or use an array only when the character set is known and fixed.
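
      If the array layout is kept, a sketch like the following makes the
      alphabet assumption explicit. `ArrayTrieNode`, `index_of`, and the
      choice to raise `ValueError` are illustrative names and decisions,
      not part of the template:

      ```python
      ALPHABET_SIZE = 26  # assumption: keys contain only 'a'..'z'


      class ArrayTrieNode:
          __slots__ = ("children", "is_end")

          def __init__(self):
              self.children = [None] * ALPHABET_SIZE
              self.is_end = False


      def index_of(char):
          """Map 'a'..'z' to 0..25 and reject anything else explicitly."""
          idx = ord(char) - ord("a")
          if not 0 <= idx < ALPHABET_SIZE:
              raise ValueError(f"unsupported character: {char!r}")
          return idx


      def insert(root, word):
          # Validate first so a bad word leaves the trie untouched.
          indices = [index_of(char) for char in word]
          node = root
          for idx in indices:
              if node.children[idx] is None:
                  node.children[idx] = ArrayTrieNode()
              node = node.children[idx]
          node.is_end = True


      def search(root, word):
          node = root
          for char in word:
              idx = ord(char) - ord("a")
              # Characters outside the alphabet cannot be stored, so no match.
              if not 0 <= idx < ALPHABET_SIZE or node.children[idx] is None:
                  return False
              node = node.children[idx]
          return node.is_end


      root = ArrayTrieNode()
      insert(root, "bat")
      ```

      Without the range check, `ord("B") - ord("a")` is negative and would
      silently index from the end of the list.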

  - title: Memory leaks when deleting words
    description: |
      Simply clearing `is_end` doesn't free memory for nodes that are no longer
      part of any word.
    fix: |
      For deletion, either: (1) accept that memory isn't freed (common), or
      (2) implement proper deletion that removes orphaned nodes bottom-up.
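
      Option (2) could look like the sketch below. The free functions
      `insert` and `delete` and the boolean return value are assumptions
      made for illustration. The idea is to record the path while
      descending, then walk back up and drop every node that has no
      children and does not end another word:

      ```python
      class TrieNode:
          def __init__(self):
              self.children = {}
              self.is_end = False


      def insert(root, word):
          node = root
          for char in word:
              if char not in node.children:
                  node.children[char] = TrieNode()
              node = node.children[char]
          node.is_end = True


      def delete(root, word):
          """Remove word and prune orphaned nodes. Return True if it was present."""
          path = [root]  # path[i] is the node reached after word[:i]
          for char in word:
              node = path[-1].children.get(char)
              if node is None:
                  return False
              path.append(node)
          if not path[-1].is_end:
              return False
          path[-1].is_end = False

          # Walk from the last node toward the root, removing dead ends.
          for i in range(len(word), 0, -1):
              node = path[i]
              if node.is_end or node.children:
                  break  # still needed by another word
              del path[i - 1].children[word[i - 1]]
          return True


      root = TrieNode()
      for w in ("app", "apple", "apt"):
          insert(root, w)
      removed = delete(root, "apple")
      second_p = root.children["a"].children["p"].children["p"]
      ```

      Here deleting "apple" removes the 'l' and 'e' nodes but stops at the
      second 'p', which still marks the end of "app".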

variations:
  - name: Basic Trie
    description: |
      Standard insert, search, and prefix check operations.
    example: "Implement Trie (Prefix Tree)"

  - name: Wildcard search
    description: |
      Support '.' as wildcard matching any single character. Requires DFS
      to explore all possibilities when encountering wildcard.
    example: "Design Add and Search Words Data Structure"

  - name: Word search in grid
    description: |
      Use Trie to efficiently search for multiple words in a 2D grid.
      Prune branches that don't match any word prefix.
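
      A minimal sketch of this idea follows. It assumes a non-empty
      rectangular grid of single characters, 4-directional moves, and that
      neither the grid nor the words contain the hypothetical marker
      characters "$" and "#". The trie is a nested dict rather than the
      `TrieNode` class, to keep the example short:

      ```python
      def find_words(board, words):
          """Return the sorted list of words that can be traced in the grid."""
          WORD = "$"  # hypothetical key storing the full word at its last node
          root = {}
          for word in words:
              node = root
              for char in word:
                  node = node.setdefault(char, {})
              node[WORD] = word

          rows, cols = len(board), len(board[0])
          found = set()

          def dfs(r, c, node):
              char = board[r][c]
              child = node.get(char)
              if child is None:
                  return  # prune: no word has this prefix
              if WORD in child:
                  found.add(child[WORD])
              board[r][c] = "#"  # mark the cell as used on the current path
              for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                  nr, nc = r + dr, c + dc
                  if 0 <= nr < rows and 0 <= nc < cols and board[nr][nc] != "#":
                      dfs(nr, nc, child)
              board[r][c] = char  # restore the cell when backtracking

          for r in range(rows):
              for c in range(cols):
                  dfs(r, c, root)
          return sorted(found)


      board = [list("oaan"), list("etae"), list("ihkr"), list("iflv")]
      result = find_words(board, ["oath", "pea", "eat", "rain"])
      ```

      All words are searched in a single pass over the grid, because each
      DFS step advances through the shared trie instead of one word at a time.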
    example: "Word Search II"

  - name: Autocomplete
    description: |
      Find all words starting with a given prefix. DFS from the prefix
      endpoint to collect all words.
    example: "Design Search Autocomplete System"

  - name: Compressed Trie (Radix Tree)
    description: |
      Merge chains of single-child nodes into one node with a string label.
      Saves space for sparse tries.
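
      The merging step can be sketched as a one-off conversion of an
      existing trie. The `compress` function and its output shape (a dict
      mapping each merged label to an `(is_end, subtree)` pair) are
      assumptions for illustration; a real radix tree would also split
      labels on insert, which is not shown here:

      ```python
      class TrieNode:
          def __init__(self):
              self.children = {}
              self.is_end = False


      def insert(root, word):
          node = root
          for char in word:
              if char not in node.children:
                  node.children[char] = TrieNode()
              node = node.children[char]
          node.is_end = True


      def compress(node):
          """Return {label: (is_end, subtree)} with single-child chains merged."""
          out = {}
          for char, child in node.children.items():
              label = char
              # Extend the label while the node is a pure pass-through:
              # it ends no word and has exactly one child.
              while not child.is_end and len(child.children) == 1:
                  next_char, next_child = next(iter(child.children.items()))
                  label += next_char
                  child = next_child
              out[label] = (child.is_end, compress(child))
          return out


      root = TrieNode()
      for w in ("app", "apple", "apply", "apt", "bat"):
          insert(root, w)
      radix = compress(root)
      ```

      For the example words, "bat" collapses into a single label, while
      "ap" becomes a shared label that branches into "p" and "t".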
    example: "Longest Common Prefix optimizations"

related_patterns:
  - dfs
  - backtracking
  - dynamic-programming

prerequisite_patterns: []