title: LFU Cache
slug: lfu-cache
difficulty: hard
leetcode_id: 460
leetcode_url: https://leetcode.com/problems/lfu-cache/
categories:
  - hash-tables
  - linked-lists
patterns:
  - heap
function_signature: "class LFUCache: def __init__(self, capacity: int); def get(self, key: int) -> int; def put(self, key: int, value: int) -> None"
test_cases:
  visible:
    - input:
        operations: ["LFUCache", "put", "put", "get", "put", "get", "get", "put", "get", "get", "get"]
        arguments: [[2], [1, 1], [2, 2], [1], [3, 3], [2], [3], [4, 4], [1], [3], [4]]
      expected: [null, null, null, 1, null, -1, 3, null, -1, 3, 4]
    - input:
        operations: ["LFUCache", "put", "get"]
        arguments: [[1], [1, 1], [1]]
      expected: [null, null, 1]
    - input:
        operations: ["LFUCache", "put", "put", "get", "get"]
        arguments: [[2], [1, 10], [2, 20], [1], [2]]
      expected: [null, null, null, 10, 20]
  hidden:
    - input:
        operations: ["LFUCache", "put", "put", "put", "get"]
        arguments: [[2], [1, 1], [2, 2], [3, 3], [1]]
      expected: [null, null, null, null, -1]
    - input:
        operations: ["LFUCache", "put", "get", "put", "get", "get"]
        arguments: [[1], [1, 1], [1], [2, 2], [1], [2]]
      expected: [null, null, 1, null, -1, 2]
    - input:
        operations: ["LFUCache", "put", "put", "get", "get", "put", "get", "get", "get"]
        arguments: [[2], [1, 1], [2, 2], [1], [1], [3, 3], [2], [3], [1]]
      expected: [null, null, null, 1, 1, null, -1, 3, 1]
    - input:
        operations: ["LFUCache", "put", "put", "put", "put", "get"]
        arguments: [[3], [1, 1], [2, 2], [3, 3], [4, 4], [4]]
      expected: [null, null, null, null, null, 4]
    - input:
        operations: ["LFUCache", "put", "get", "put", "get", "put", "get"]
        arguments: [[2], [1, 100], [1], [1, 200], [1], [2, 300], [2]]
      expected: [null, null, 100, null, 200, null, 300]
    - input:
        operations: ["LFUCache", "get"]
        arguments: [[2], [1]]
      expected: [null, -1]
description: |
  Design and implement a data structure for a **Least Frequently Used (LFU)** cache.

  Implement the `LFUCache` class:

  - `LFUCache(int capacity)` Initialises the object with the `capacity` of the data structure.
  - `int get(int key)` Gets the value of the `key` if the `key` exists in the cache. Otherwise, returns `-1`.
  - `void put(int key, int value)` Updates the value of the `key` if present, or inserts the `key` if not already present. When the cache reaches its `capacity`, it should invalidate and remove the **least frequently used** key before inserting a new item. For this problem, when there is a **tie** (i.e., two or more keys with the same frequency), the **least recently used** key would be invalidated.

  To determine the least frequently used key, a **use counter** is maintained for each key in the cache. The key with the smallest **use counter** is the least frequently used key.

  When a key is first inserted into the cache, its **use counter** is set to `1` (due to the `put` operation). The **use counter** for a key in the cache is incremented when either a `get` or `put` operation is called on it.

  The functions `get` and `put` must each run in **O(1)** average time complexity.
constraints: |
  - `1 <= capacity <= 10^4`
  - `0 <= key <= 10^5`
  - `0 <= value <= 10^9`
  - At most `2 * 10^5` calls will be made to `get` and `put`.
examples:
  - input: |
      ["LFUCache", "put", "put", "get", "put", "get", "get", "put", "get", "get", "get"]
      [[2], [1, 1], [2, 2], [1], [3, 3], [2], [3], [4, 4], [1], [3], [4]]
    output: "[null, null, null, 1, null, -1, 3, null, -1, 3, 4]"
    explanation: |
      LFUCache lfu = new LFUCache(2);
      lfu.put(1, 1);   // cache=[1,_], cnt(1)=1
      lfu.put(2, 2);   // cache=[2,1], cnt(2)=1, cnt(1)=1
      lfu.get(1);      // return 1, cache=[1,2], cnt(1)=2
      lfu.put(3, 3);   // 2 is the LFU key because cnt(2)=1 is smallest, invalidate 2
      lfu.get(2);      // return -1 (not found)
      lfu.get(3);      // return 3, cache=[3,1], cnt(3)=2
      lfu.put(4, 4);   // Both 1 and 3 have cnt=2, but 1 is LRU, invalidate 1
      lfu.get(1);      // return -1 (not found)
      lfu.get(3);      // return 3, cnt(3)=3
      lfu.get(4);      // return 4, cnt(4)=2
explanation:
  intuition: |
    Think of the LFU cache like a **library with limited shelf space**. Each book has two properties: how many times it's been checked out (frequency) and when it was last touched (recency). When the shelves are full and a new book arrives, you remove the book with the fewest checkouts. If two books tie on checkout count, you remove the one that was touched longer ago.

    The challenge is achieving O(1) operations. A naive approach might scan all items to find the minimum frequency, but that's O(n). The key insight is to **group items by their frequency** using a combination of data structures:

    1. **Hash map for key lookup**: instant access to any item's data and frequency
    2. **Frequency buckets**: group all items with the same frequency together
    3. **Ordered list within each bucket**: track recency order for tie-breaking

    When an item's frequency increases (from access), we simply move it from one bucket to the next. When we need to evict, we go to the lowest frequency bucket and remove the oldest item (the tail of that bucket's list).

    The trick to maintaining O(1) is tracking the `min_freq` variable. It only ever increases by 1 (when an access empties the minimum bucket) or resets to 1 (when new items are inserted). The increment is always safe: the item that just left bucket `f` lands in bucket `f + 1`, so the new minimum bucket is guaranteed to be non-empty. We never need to search for the minimum.
  approach: |
    We solve this using **Two Hash Maps + Doubly-Linked Lists**:

    **Step 1: Define the data structures**

    - `key_to_node`: Hash map from key to node (stores value, frequency, and list position)
    - `freq_to_list`: Hash map from frequency to a doubly-linked list of nodes with that frequency
    - `min_freq`: Integer tracking the current minimum frequency in the cache
    - `capacity`: Maximum number of items the cache can hold
    - `size`: Current number of items in the cache

    **Step 2: Implement the `get` operation**

    - If key doesn't exist, return `-1`
    - If key exists:
      - Remove the node from its current frequency list
      - Increment the node's frequency
      - Add the node to the new frequency list (at the head, marking it as most recently used)
      - Update `min_freq` if the old frequency list is now empty and was the minimum
      - Return the value

    **Step 3: Implement the `put` operation**

    - If capacity is 0, do nothing
    - If key exists:
      - Update the value
      - Call the same "touch" logic as `get` to update frequency
    - If key doesn't exist:
      - If cache is at capacity, evict the LFU item (tail of `freq_to_list[min_freq]`)
      - Create a new node with frequency 1
      - Add to `key_to_node` and `freq_to_list[1]`
      - Reset `min_freq` to 1 (new items always have the lowest possible frequency)

    **Step 4: Implement helper for moving nodes between frequency lists**

    - Remove node from old frequency's list
    - If that list becomes empty and it was `min_freq`, increment `min_freq`
    - Add node to new frequency's list at the head (most recently used position)

    Using doubly-linked lists allows O(1) removal from anywhere and O(1) insertion at head. The hash maps provide O(1) key lookup and O(1) access to any frequency bucket.
  common_pitfalls:
    - title: Using a Min-Heap for Frequency Tracking
      description: |
        A natural instinct is to use a min-heap to always know the minimum frequency. However, heaps have O(log n) operations for insertion and deletion. The problem requires O(1) average time. Instead, track `min_freq` as a simple integer that only changes in predictable ways: it resets to 1 on insert, and may increment by 1 when we access items and empty a frequency bucket.
      wrong_approach: "Min-heap to find lowest frequency"
      correct_approach: "Track min_freq integer, only increments or resets to 1"
    - title: Not Handling the Tie-Breaker Correctly
      description: |
        When multiple keys have the same frequency, the **least recently used** among them should be evicted. Within each frequency bucket, you need an ordered structure. Using a set or unordered collection loses recency information. A doubly-linked list with newest items at the head and oldest at the tail provides O(1) access to the LRU item for eviction.
      wrong_approach: "Set or unordered collection for frequency groups"
      correct_approach: "Doubly-linked list with head=MRU, tail=LRU"
    - title: Forgetting to Update min_freq on Access
      description: |
        When you access an item and increment its frequency, you might empty its old frequency bucket. If that bucket was the minimum, you need to update `min_freq`. For example, if `min_freq=2` and the only item with frequency 2 gets accessed (now frequency 3), `min_freq` should become 3. Forgetting this leads to evicting items from empty buckets.
      wrong_approach: "Only update min_freq on eviction"
      correct_approach: "Check if old frequency bucket is empty after access"
    - title: Zero Capacity Edge Case
      description: |
        The constraints above guarantee `capacity >= 1`, so a zero-capacity cache does not appear in these tests. The boundary is still worth guarding: with capacity 0, all `put` operations should be no-ops and all `get` operations should return -1. Without the guard, the first `put` would try to evict from a bucket that does not exist. Always check `if capacity == 0` at the start of `put`.
  key_takeaways:
    - "**Compound data structures**: Complex cache problems often require combining multiple data structures (hash maps + linked lists) to achieve O(1) for different operations"
    - "**Frequency bucketing**: Grouping items by frequency and tracking the minimum avoids expensive searches"
    - "**Doubly-linked lists for O(1) removal**: When you need to remove items from the middle of a sequence in O(1), doubly-linked lists are the answer"
    - "**LFU vs LRU**: LRU only tracks recency; LFU tracks frequency with recency as tie-breaker. LFU is more complex but can be more cache-efficient for certain access patterns"
  time_complexity: "O(1) for both `get` and `put` operations. Hash map lookups, linked list insertions/deletions, and frequency updates are all constant time."
  space_complexity: "O(capacity). We store at most `capacity` items, each with constant overhead for hash map entries and list nodes."
solutions:
  - approach_name: Two Hash Maps with Doubly-Linked Lists
    is_optimal: true
    code: |
      from typing import Optional


      class Node:
          """Doubly-linked list node storing key, value, and frequency."""

          def __init__(self, key: int, value: int):
              self.key = key
              self.value = value
              self.freq = 1  # New items start with frequency 1
              self.prev = None
              self.next = None


      class DoublyLinkedList:
          """Doubly-linked list with sentinel nodes for O(1) operations."""

          def __init__(self):
              # Sentinel nodes simplify edge cases
              self.head = Node(0, 0)  # Dummy head (MRU side)
              self.tail = Node(0, 0)  # Dummy tail (LRU side)
              self.head.next = self.tail
              self.tail.prev = self.head
              self.size = 0

          def add_first(self, node: Node) -> None:
              """Add node right after head (most recently used position)."""
              node.next = self.head.next
              node.prev = self.head
              self.head.next.prev = node
              self.head.next = node
              self.size += 1

          def remove(self, node: Node) -> None:
              """Remove a node from anywhere in the list in O(1)."""
              node.prev.next = node.next
              node.next.prev = node.prev
              self.size -= 1

          def remove_last(self) -> Optional[Node]:
              """Remove and return the tail node (least recently used)."""
              if self.size == 0:
                  return None
              last = self.tail.prev
              self.remove(last)
              return last

          def is_empty(self) -> bool:
              return self.size == 0


      class LFUCache:
          def __init__(self, capacity: int):
              self.capacity = capacity
              self.size = 0
              self.min_freq = 0
              # Maps key -> Node
              self.key_to_node: dict[int, Node] = {}
              # Maps frequency -> DoublyLinkedList of nodes with that frequency
              self.freq_to_list: dict[int, DoublyLinkedList] = {}

          def _update_freq(self, node: Node) -> None:
              """Move node from current frequency bucket to next frequency bucket."""
              freq = node.freq
              # Remove from current frequency list
              self.freq_to_list[freq].remove(node)

              # If this was the min frequency list and it's now empty, increment min_freq
              if freq == self.min_freq and self.freq_to_list[freq].is_empty():
                  self.min_freq += 1

              # Increment frequency and add to new list
              node.freq += 1
              if node.freq not in self.freq_to_list:
                  self.freq_to_list[node.freq] = DoublyLinkedList()
              self.freq_to_list[node.freq].add_first(node)

          def get(self, key: int) -> int:
              if key not in self.key_to_node:
                  return -1

              node = self.key_to_node[key]
              # Update frequency (this also marks it as most recently used)
              self._update_freq(node)
              return node.value

          def put(self, key: int, value: int) -> None:
              if self.capacity == 0:
                  return

              if key in self.key_to_node:
                  # Key exists: update value and frequency
                  node = self.key_to_node[key]
                  node.value = value
                  self._update_freq(node)
              else:
                  # New key: check if we need to evict
                  if self.size >= self.capacity:
                      # Evict LFU (and LRU among ties)
                      lfu_list = self.freq_to_list[self.min_freq]
                      evicted = lfu_list.remove_last()
                      del self.key_to_node[evicted.key]
                      self.size -= 1

                  # Insert new node with frequency 1
                  new_node = Node(key, value)
                  self.key_to_node[key] = new_node
                  if 1 not in self.freq_to_list:
                      self.freq_to_list[1] = DoublyLinkedList()
                  self.freq_to_list[1].add_first(new_node)
                  self.min_freq = 1  # New items always have the minimum frequency
                  self.size += 1
    explanation: |
      **Time Complexity:** O(1) for both `get` and `put`.

      - Hash map lookups: O(1)
      - Doubly-linked list add/remove: O(1)
      - Frequency bucket access: O(1)

      **Space Complexity:** O(capacity). We maintain at most `capacity` nodes, each stored once in `key_to_node` and once in a frequency list. Emptied frequency buckets are never deleted in this implementation, so `freq_to_list` can also hold one empty list per frequency ever reached (bounded by the number of operations). Deleting a bucket as soon as it empties keeps the total strictly O(capacity).
  - approach_name: OrderedDict per Frequency (Python-Specific)
    is_optimal: true
    code: |
      from collections import OrderedDict, defaultdict


      class LFUCache:
          def __init__(self, capacity: int):
              self.capacity = capacity
              self.min_freq = 0
              # Maps key -> (value, frequency)
              self.key_to_val_freq: dict[int, tuple[int, int]] = {}
              # Maps frequency -> OrderedDict of keys (maintains insertion order)
              # OrderedDict gives us O(1) deletion by key and O(1) popitem(last=False)
              self.freq_to_keys: dict[int, OrderedDict] = defaultdict(OrderedDict)

          def _update_freq(self, key: int) -> None:
              """Increment frequency of key and move to appropriate bucket."""
              value, freq = self.key_to_val_freq[key]

              # Remove from current frequency bucket
              del self.freq_to_keys[freq][key]

              # Update min_freq if we emptied the minimum bucket
              if not self.freq_to_keys[freq] and freq == self.min_freq:
                  self.min_freq += 1

              # Add to next frequency bucket (appended last, so it is the MRU entry)
              new_freq = freq + 1
              self.freq_to_keys[new_freq][key] = None  # Value doesn't matter
              self.key_to_val_freq[key] = (value, new_freq)

          def get(self, key: int) -> int:
              if key not in self.key_to_val_freq:
                  return -1

              self._update_freq(key)
              return self.key_to_val_freq[key][0]

          def put(self, key: int, value: int) -> None:
              if self.capacity == 0:
                  return

              if key in self.key_to_val_freq:
                  # Update existing key
                  _, freq = self.key_to_val_freq[key]
                  self.key_to_val_freq[key] = (value, freq)
                  self._update_freq(key)
              else:
                  # Evict if at capacity
                  if len(self.key_to_val_freq) >= self.capacity:
                      # popitem(last=False) removes oldest (LRU) from min freq bucket
                      evicted_key, _ = self.freq_to_keys[self.min_freq].popitem(last=False)
                      del self.key_to_val_freq[evicted_key]

                  # Insert new key with frequency 1
                  self.key_to_val_freq[key] = (value, 1)
                  self.freq_to_keys[1][key] = None
                  self.min_freq = 1
    explanation: |
      **Time Complexity:** O(1) average for both `get` and `put`. Python's `OrderedDict` maintains insertion order and provides O(1) deletion by key and O(1) `popitem(last=False)`. We use it as a pseudo-linked-list where order represents recency: the first entry in a bucket is its LRU key and the last entry is its MRU key.

      **Space Complexity:** O(capacity) for the stored keys. As in the linked-list version, emptied buckets stay in `freq_to_keys` unless they are deleted explicitly.

      This approach is more Pythonic and concise, leveraging built-in data structures. The trade-off is that it's language-specific and relies on Python's `OrderedDict` implementation details.
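
      For checking either solution against the test cases, a brute-force reference can serve as an oracle. The sketch below is an illustration under stated assumptions rather than part of the solutions above: `NaiveLFUCache` and `run_ops` are hypothetical names, and eviction scans every entry, so `put` is O(n) and would not meet the problem's complexity requirement.

      ```python
      class NaiveLFUCache:
          """Brute-force reference (hypothetical helper): eviction scans all entries."""

          def __init__(self, capacity: int):
              self.capacity = capacity
              self.tick = 0  # Logical clock used to order accesses by recency
              # Maps key -> [value, use_count, last_used_tick]
              self.entries: dict[int, list[int]] = {}

          def _touch(self, key: int) -> None:
              """Bump the use counter and record the access time."""
              self.tick += 1
              entry = self.entries[key]
              entry[1] += 1
              entry[2] = self.tick

          def get(self, key: int) -> int:
              if key not in self.entries:
                  return -1
              self._touch(key)
              return self.entries[key][0]

          def put(self, key: int, value: int) -> None:
              if self.capacity <= 0:
                  return
              if key in self.entries:
                  self.entries[key][0] = value
                  self._touch(key)
                  return
              if len(self.entries) >= self.capacity:
                  # Smallest (use_count, last_used_tick) is the LFU key, LRU on ties
                  victim = min(
                      self.entries,
                      key=lambda k: (self.entries[k][1], self.entries[k][2]),
                  )
                  del self.entries[victim]
              self.tick += 1
              self.entries[key] = [value, 1, self.tick]


      def run_ops(cache_cls, operations, arguments):
          """Replay a LeetCode-style operation list and collect the outputs."""
          cache = None
          results = []
          for op, args in zip(operations, arguments):
              if op == "LFUCache":
                  cache = cache_cls(*args)
                  results.append(None)
              elif op == "put":
                  cache.put(*args)
                  results.append(None)
              else:  # "get"
                  results.append(cache.get(*args))
          return results
      ```

      Passing the same `operations` and `arguments` lists to `run_ops` with `NaiveLFUCache` and with either `LFUCache` should yield identical result lists. A mismatch usually points at a `min_freq` or tie-breaking bug in the O(1) version.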