title: Implement Trie (Prefix Tree)
slug: implement-trie-prefix-tree
difficulty: medium
leetcode_id: 208
leetcode_url: https://leetcode.com/problems/implement-trie-prefix-tree/
categories:
  - strings
  - hash-tables
patterns:
  - trie
description: |
  A [**trie**](https://en.wikipedia.org/wiki/Trie) (pronounced as "try") or **prefix tree** is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spellchecker.

  Implement the `Trie` class:

  - `Trie()` Initialises the trie object.
  - `void insert(String word)` Inserts the string `word` into the trie.
  - `boolean search(String word)` Returns `true` if the string `word` is in the trie (i.e., was inserted before), and `false` otherwise.
  - `boolean startsWith(String prefix)` Returns `true` if there is a previously inserted string `word` that has the prefix `prefix`, and `false` otherwise.
constraints: |
  - `1 <= word.length, prefix.length <= 2000`
  - `word` and `prefix` consist only of lowercase English letters.
  - At most `3 * 10^4` calls **in total** will be made to `insert`, `search`, and `startsWith`.
examples:
  - input: |
      ["Trie", "insert", "search", "search", "startsWith", "insert", "search"]
      [[], ["apple"], ["apple"], ["app"], ["app"], ["app"], ["app"]]
    output: "[null, null, true, false, true, null, true]"
    explanation: |
      Trie trie = new Trie();
      trie.insert("apple");
      trie.search("apple");   // return True
      trie.search("app");     // return False
      trie.startsWith("app"); // return True
      trie.insert("app");
      trie.search("app");     // return True
explanation:
  intuition: |
    Imagine building a word-completion system like the one in your phone's keyboard. When you type "app", the system suggests "apple", "application", "approve", and so on. How can we efficiently store thousands of words and quickly find all words that start with a given prefix?

    A **trie** (prefix tree) is the perfect data structure for this.
    Think of it like a tree where each node represents a single character, and paths from the root to nodes spell out words or prefixes. Unlike a hash table, which stores complete words, a trie shares common prefixes among words, which makes prefix-based operations efficient.

    Visualise it like a family tree of letters:

    ```
          root
           |
           a
           |
           p
           |
           p
          / \
         l   (end of "app")
         |
         e
         |
    (end of "apple")
    ```

    The key insight is that **each node stores its children** (the next possible characters) and a **flag indicating if a complete word ends here**. This allows us to distinguish between "app" being a complete word vs. just a prefix of "apple".
  approach: |
    We implement the Trie using nodes, where each node contains:
    - A dictionary/hashmap mapping characters to child nodes
    - A boolean flag indicating if a word ends at this node

    **Step 1: Define the TrieNode structure**
    - `children`: A dictionary to store child nodes, keyed by character
    - `is_end_of_word`: A boolean flag, initially `False`

    **Step 2: Implement insert(word)**
    - Start at the root node
    - For each character in the word:
      - If the character doesn't exist in the current node's children, create a new node
      - Move to the child node for this character
    - After processing all characters, mark the final node as `is_end_of_word = True`

    **Step 3: Implement search(word)**
    - Start at the root node
    - For each character in the word:
      - If the character doesn't exist in the current node's children, return `False`
      - Move to the child node for this character
    - After processing all characters, return the value of `is_end_of_word`
      - This distinguishes between finding a prefix vs.
        a complete word

    **Step 4: Implement startsWith(prefix)**
    - Follow the same traversal as `search`
    - The only difference: return `True` if we successfully traverse all characters
    - We don't need to check `is_end_of_word` since we only care if the prefix exists

    The beauty of this approach is that all three operations walk the trie one character at a time: `search` and `startsWith` share the same traversal and differ only in their final check, while `insert` follows the same path and creates any missing nodes along the way.
  common_pitfalls:
    - title: Confusing search() with startsWith()
      description: |
        A common mistake is implementing `search()` the same way as `startsWith()`. Both traverse the trie, but they have different success conditions.

        For example, if we insert "apple" and then call `search("app")`, we should return `False` because "app" was never inserted as a complete word. However, `startsWith("app")` should return `True` because "apple" starts with "app".

        The fix: `search()` must check `is_end_of_word` after traversal, while `startsWith()` only needs to confirm the path exists.
      wrong_approach: "Return True after traversing the prefix successfully in search()"
      correct_approach: "Check is_end_of_word flag after traversal for search(), not for startsWith()"
    - title: Using Arrays Instead of Hash Maps
      description: |
        Some implementations use a fixed-size array of 26 elements (for lowercase letters a-z) instead of a hash map. While this works, it has drawbacks:
        - Wastes memory for sparse nodes (most nodes won't have all 26 children)
        - Less flexible if requirements change (e.g., supporting uppercase or other characters)

        Using a hash map is usually more space-efficient when nodes are sparse, and it is more adaptable.
      wrong_approach: "Fixed array children[26] for every node"
      correct_approach: "Dictionary/hash map for children"
    - title: Forgetting to Initialise the Root
      description: |
        The root node is special: it doesn't represent any character but serves as the starting point for all operations. Forgetting to initialise it in the constructor leads to null pointer errors (an `AttributeError` in Python).
        Always create an empty root node in `__init__()` with an empty children dictionary.
  key_takeaways:
    - "**Trie fundamentals**: Each node has children (a map of character → node) and an `is_end_of_word` flag"
    - "**Prefix sharing**: Tries naturally share common prefixes, making them memory-efficient for related words"
    - "**O(m) operations**: All operations (insert, search, startsWith) run in O(m) time where m is the word/prefix length, independent of how many words are stored"
    - "**Foundation for advanced problems**: Tries are widely used for autocomplete, spell checking, word search, and problems like Word Search II"
  time_complexity: "O(m) for all operations, where `m` is the length of the word or prefix being processed. We traverse at most `m` nodes."
  space_complexity: "O(n * m) in the worst case, where `n` is the number of words and `m` is the average word length. However, shared prefixes reduce actual space usage significantly."
solutions:
  - approach_name: Hash Map Based Trie
    is_optimal: true
    code: |
      class TrieNode:
          def __init__(self):
              # Maps character -> child TrieNode
              self.children: dict[str, 'TrieNode'] = {}
              # True if a complete word ends at this node
              self.is_end_of_word: bool = False


      class Trie:
          def __init__(self):
              # Root node doesn't represent any character
              self.root = TrieNode()

          def insert(self, word: str) -> None:
              node = self.root
              for char in word:
                  # Create child node if it doesn't exist
                  if char not in node.children:
                      node.children[char] = TrieNode()
                  # Move to the child node
                  node = node.children[char]
              # Mark the end of the word
              node.is_end_of_word = True

          def search(self, word: str) -> bool:
              node = self._traverse(word)
              # Word exists only if we found the path AND it's marked as end
              return node is not None and node.is_end_of_word

          def startsWith(self, prefix: str) -> bool:
              # Prefix exists if we can traverse to it (don't need end marker)
              return self._traverse(prefix) is not None

          def _traverse(self, s: str) -> TrieNode | None:
              """Helper to traverse the trie
              following string s.
              Returns the final node if path exists, None otherwise."""
              node = self.root
              for char in s:
                  if char not in node.children:
                      return None
                  node = node.children[char]
              return node
    explanation: |
      **Time Complexity:** O(m) for all operations, where m is the length of the input string.

      **Space Complexity:** O(n * m) worst case for storing n words of average length m.

      This implementation uses a hash map for children, providing O(1) average-case lookup per character. The `_traverse` helper method eliminates code duplication between `search` and `startsWith`.
  - approach_name: Array Based Trie
    is_optimal: false
    code: |
      class TrieNode:
          def __init__(self):
              # Fixed array for 26 lowercase letters (a=0, b=1, ..., z=25)
              self.children: list[TrieNode | None] = [None] * 26
              self.is_end_of_word: bool = False


      class Trie:
          def __init__(self):
              self.root = TrieNode()

          def insert(self, word: str) -> None:
              node = self.root
              for char in word:
                  # Convert character to index (a=0, b=1, etc.)
                  index = ord(char) - ord('a')
                  if node.children[index] is None:
                      node.children[index] = TrieNode()
                  node = node.children[index]
              node.is_end_of_word = True

          def search(self, word: str) -> bool:
              node = self._traverse(word)
              return node is not None and node.is_end_of_word

          def startsWith(self, prefix: str) -> bool:
              return self._traverse(prefix) is not None

          def _traverse(self, s: str) -> TrieNode | None:
              node = self.root
              for char in s:
                  index = ord(char) - ord('a')
                  if node.children[index] is None:
                      return None
                  node = node.children[index]
              return node
    explanation: |
      **Time Complexity:** O(m) for all operations, the same as the hash map version.

      **Space Complexity:** O(n * 26 * m) worst case, since each node allocates 26 slots.

      This approach uses a fixed-size array instead of a hash map. It has O(1) guaranteed lookup (no hash collisions), but wastes memory for sparse nodes. It is useful when you know the character set is small and fixed.
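
As a quick sanity check, the example sequence from the problem statement can be replayed through a small driver. The block below is only an illustrative sketch: `run_ops` is a hypothetical helper, not part of the LeetCode interface, and the nested-dictionary `Trie` with its `_END` sentinel is a compact stand-in (an assumption, not the solution code) so the block runs on its own. It assumes only the `insert`, `search`, and `startsWith` method names from the problem.

```python
class Trie:
    """Compact stand-in trie built from nested dicts (illustrative assumption)."""

    # Hypothetical sentinel key marking "a word ends here"; it cannot clash
    # with input characters because words contain only lowercase letters.
    _END = "$"

    def __init__(self):
        self.root = {}

    def insert(self, word: str) -> None:
        node = self.root
        for char in word:
            # Reuse the existing child or create an empty one
            node = node.setdefault(char, {})
        node[self._END] = True

    def _traverse(self, s: str):
        # Return the node reached by following s, or None if the path breaks
        node = self.root
        for char in s:
            if char not in node:
                return None
            node = node[char]
        return node

    def search(self, word: str) -> bool:
        node = self._traverse(word)
        return node is not None and self._END in node

    def startsWith(self, prefix: str) -> bool:
        return self._traverse(prefix) is not None


def run_ops(ops, args):
    """Replay LeetCode-style operation lists (hypothetical helper)."""
    trie = None
    results = []
    for op, arg in zip(ops, args):
        if op == "Trie":
            trie = Trie()
            results.append(None)
        else:
            # Look up the method by name and call it with its arguments
            results.append(getattr(trie, op)(*arg))
    return results


ops = ["Trie", "insert", "search", "search", "startsWith", "insert", "search"]
args = [[], ["apple"], ["apple"], ["app"], ["app"], ["app"], ["app"]]
print(run_ops(ops, args))
# [None, None, True, False, True, None, True]
```

Because the driver relies only on the three public method names, either `Trie` class from the solutions could be swapped in for the stand-in and should produce the same list.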