310 lines
8.9 KiB
YAML
name: Heap / Priority Queue
slug: heap
difficulty_level: 3
pattern_type: data_structure
display_order: 16

description: >
A data structure that efficiently maintains the minimum or maximum element,
supporting O(log n) insertion and extraction. Heaps are essential when you
repeatedly need to access the smallest or largest element from a changing set.

when_to_use: |
- Finding K largest/smallest elements
- K-way merge of sorted lists
- Finding median from data stream
- Task scheduling by priority
- Dijkstra's shortest path algorithm

metaphor: |
Imagine a hospital emergency room where patients are treated by urgency, not
arrival time. A priority queue (heap) lets you always know who's next without
sorting everyone whenever someone new arrives. The most urgent patient "bubbles
up" to the front automatically.

Another analogy: a to-do list that always shows your most important task first.
When you add or complete tasks, the list reorganizes itself so the highest
priority is always accessible in O(1) time.

core_concept: |
A **heap** is a complete binary tree where each parent is no larger (min-heap)
or no smaller (max-heap) than its children. This property guarantees:

- **Peek min/max**: O(1), because it's always at the root
- **Insert**: O(log n), bubbling up to maintain the heap property
- **Extract min/max**: O(log n), removing the root and bubbling down to reheapify

Key insight: heaps don't fully sort the data. They only guarantee that every
parent comes before its children, which puts the min/max at the root. This
partial ordering is enough for many problems and is cheaper to maintain than
full sorted order.

**When to use heaps:**
- Need repeated access to min/max element
- Data changes frequently (insertions/deletions)
- Full sorting is overkill (only need top K, not all elements sorted)
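
As a rough illustration of the operations listed above, the sketch below uses Python's standard `heapq` module, the same module the code template in this file relies on. The sample values are arbitrary. It shows that the backing list is only partially ordered, yet repeated extraction yields the elements in sorted order.

```python
import heapq

data = [7, 3, 5, 1, 6, 2, 4]   # arbitrary sample values
heap = list(data)
heapq.heapify(heap)            # build a min-heap in place, O(n)

smallest = heap[0]             # peek: the minimum sits at index 0, O(1)
heapq.heappush(heap, 0)        # insert: O(log n)
new_smallest = heap[0]

# Heap property: every parent is <= each of its children.
heap_ok = all(heap[(i - 1) // 2] <= heap[i] for i in range(1, len(heap)))

# Extract-min repeatedly: each pop is O(log n) and values come out sorted.
drained = [heapq.heappop(heap) for _ in range(len(heap))]
```

Draining the whole heap this way is effectively heapsort; the pattern pays off when only a few extractions are needed or when insertions and extractions are interleaved.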

visualization: |
**Min-Heap Structure:**

```
Array: [1, 3, 2, 7, 6, 4, 5]

As tree:
        1          (index 0)
      /   \
     3     2       (indices 1, 2)
    / \   / \
   7   6 4   5     (indices 3, 4, 5, 6)

Parent of index i: (i-1) // 2
Left child: 2*i + 1
Right child: 2*i + 2
```

**Inserting 0 into heap:**

```
Add 0 at end:
        1
      /   \
     3     2
    / \   / \
   7   6 4   5
  /
 0

Bubble up (0 < 7, swap):
        1
      /   \
     3     2
    / \   / \
   0   6 4   5
  /
 7

Bubble up (0 < 3, swap):
        1
      /   \
     0     2
    / \   / \
   3   6 4   5
  /
 7

Bubble up (0 < 1, swap):
        0
      /   \
     1     2
    / \   / \
   3   6 4   5
  /
 7
```
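
The bubble-up steps traced above can be sketched directly on the array form, using the parent formula `(i-1) // 2` from the structure diagram. The helper names `sift_up` and `heap_insert` are illustrative only; in practice `heapq.heappush` does this work.

```python
def sift_up(heap: list[int], i: int) -> None:
    """Move heap[i] up until its parent is no larger (min-heap)."""
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] >= heap[parent]:
            break                      # heap property holds, stop
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent


def heap_insert(heap: list[int], value: int) -> None:
    """Append at the end of the array, then bubble up."""
    heap.append(value)
    sift_up(heap, len(heap) - 1)


heap = [1, 3, 2, 7, 6, 4, 5]
heap_insert(heap, 0)
print(heap)  # [0, 1, 2, 3, 6, 4, 5, 7]
```

The loop performs at most one swap per tree level, which is where the O(log n) insertion bound comes from.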

**Top K Elements using Min-Heap:**

```
Find 3 largest from [3, 1, 4, 1, 5, 9, 2, 6]

Maintain min-heap of size 3:

Process 3: heap = [3]
Process 1: heap = [1, 3]
Process 4: heap = [1, 3, 4]
Process 1: 1 <= heap[0]=1, skip
Process 5: 5 > 1, remove 1, add 5 → heap = [3, 5, 4]
Process 9: 9 > 3, remove 3, add 9 → heap = [4, 5, 9]
Process 2: 2 <= 4, skip
Process 6: 6 > 4, remove 4, add 6 → heap = [5, 6, 9]

Result: [5, 6, 9] are the top 3
```
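
As a quick cross-check of the trace above, the standard library's `heapq.nlargest` and `heapq.nsmallest` return the same elements, sorted from the relevant end. This is only a convenience sketch; the variable names are illustrative.

```python
import heapq

nums = [3, 1, 4, 1, 5, 9, 2, 6]

top3 = heapq.nlargest(3, nums)       # largest first
bottom3 = heapq.nsmallest(3, nums)   # smallest first

print(top3)     # [9, 6, 5]
print(bottom3)  # [1, 1, 2]
```

These helpers suit one-off queries on a list that is already in memory. The explicit size-k heap is still the tool for streams, where elements arrive one at a time.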

code_template: |
import heapq

def find_k_largest(nums: list[int], k: int) -> list[int]:
    """Find k largest elements using min-heap (returned in heap order, not sorted)."""
    if k <= 0:
        return []  # Guard: heap[0] below would fail on an empty heap

    # Min-heap of size k keeps k largest
    heap = []

    for num in nums:
        if len(heap) < k:
            heapq.heappush(heap, num)
        elif num > heap[0]:
            heapq.heapreplace(heap, num)  # Pop min, push new

    return heap


def find_k_smallest(nums: list[int], k: int) -> list[int]:
    """Find k smallest elements using max-heap (negated values)."""
    if k <= 0:
        return []  # Guard: heap[0] below would fail on an empty heap

    # Max-heap (negated) of size k keeps k smallest
    heap = []

    for num in nums:
        if len(heap) < k:
            heapq.heappush(heap, -num)
        elif num < -heap[0]:
            heapq.heapreplace(heap, -num)  # Pop current max, push new

    return [-x for x in heap]


def merge_k_sorted_lists(lists: list[list[int]]) -> list[int]:
    """Merge k sorted lists using min-heap."""
    heap = []
    result = []

    # Initialize heap with first element from each list
    for i, lst in enumerate(lists):
        if lst:
            heapq.heappush(heap, (lst[0], i, 0))

    while heap:
        val, list_idx, elem_idx = heapq.heappop(heap)
        result.append(val)

        # Add next element from same list
        if elem_idx + 1 < len(lists[list_idx]):
            next_val = lists[list_idx][elem_idx + 1]
            heapq.heappush(heap, (next_val, list_idx, elem_idx + 1))

    return result


class MedianFinder:
    """Find median from data stream using two heaps."""

    def __init__(self):
        self.small = []  # Max-heap (negated) for smaller half
        self.large = []  # Min-heap for larger half

    def add_num(self, num: int) -> None:
        # Add to max-heap (smaller half)
        heapq.heappush(self.small, -num)

        # Balance: largest of small should be <= smallest of large
        if self.large and -self.small[0] > self.large[0]:
            heapq.heappush(self.large, -heapq.heappop(self.small))

        # Size balance: small can have at most 1 more element
        if len(self.small) > len(self.large) + 1:
            heapq.heappush(self.large, -heapq.heappop(self.small))
        elif len(self.large) > len(self.small):
            heapq.heappush(self.small, -heapq.heappop(self.large))

    def find_median(self) -> float:
        if len(self.small) > len(self.large):
            return -self.small[0]
        return (-self.small[0] + self.large[0]) / 2


def kth_smallest_in_matrix(matrix: list[list[int]], k: int) -> int:
    """Find kth smallest in row-wise and column-wise sorted matrix."""
    # Assumes a square n x n matrix and 1 <= k <= n * n
    n = len(matrix)
    heap = [(matrix[0][0], 0, 0)]
    visited = {(0, 0)}

    # Pop the k - 1 smallest values; the kth is then at the heap root
    for _ in range(k - 1):
        val, r, c = heapq.heappop(heap)

        # Add right neighbor
        if c + 1 < n and (r, c + 1) not in visited:
            visited.add((r, c + 1))
            heapq.heappush(heap, (matrix[r][c + 1], r, c + 1))

        # Add bottom neighbor
        if r + 1 < n and (r + 1, c) not in visited:
            visited.add((r + 1, c))
            heapq.heappush(heap, (matrix[r + 1][c], r + 1, c))

    return heap[0][0]

recognition_signals:
- "kth largest"
- "kth smallest"
- "top k"
- "merge sorted"
- "median"
- "priority"
- "schedule"
- "Dijkstra"
- "frequency"
- "closest points"

common_mistakes:
- title: Using max-heap when min-heap needed (or vice versa)
description: |
Python's heapq is a min-heap: heap[0] and heappop() always give the smallest
element. Code that pops from it expecting the largest values gets the
k smallest instead of the k largest.
fix: |
For max-heap behavior, negate values:
```python
heapq.heappush(heap, -num)  # Push negative
max_val = -heapq.heappop(heap)  # Negate back
```

- title: Wrong heap size for "top K" problems
description: |
For "k largest," heapifying all n elements into a max-heap and extracting k
times is O(n + k log n) time but needs O(n) space. A min-heap capped at
size k is O(n log k) time and O(k) space, and it also works on streams.
fix: |
For k largest: use min-heap of size k; once full, replace the smallest when a larger element arrives.
For k smallest: use max-heap of size k; once full, replace the largest when a smaller element arrives.

- title: Forgetting tuple comparison order
description: |
When heap contains tuples, Python compares by first element, then second,
etc. If first elements are equal, comparison moves to second element, which
raises TypeError when those elements don't support `<`.
fix: |
Put the comparison key first in the tuple:
```python
heapq.heappush(heap, (priority, item))
```
If items aren't comparable, put a unique counter between the priority and
the item as a tiebreaker: (priority, count, item).
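
A minimal sketch of the counter tiebreaker, assuming a hypothetical `Task` class that defines no ordering. The `push` and `pop` helpers are illustrative names, not part of `heapq`.

```python
import heapq
import itertools


class Task:
    """Hypothetical payload type that defines no ordering."""

    def __init__(self, name: str) -> None:
        self.name = name


counter = itertools.count()   # unique, increasing tiebreaker values
heap: list[tuple[int, int, Task]] = []


def push(priority: int, task: Task) -> None:
    # The counter value is unique, so tuple comparison is always decided
    # by (priority, count) and never reaches the Task object.
    heapq.heappush(heap, (priority, next(counter), task))


def pop() -> Task:
    _priority, _count, task = heapq.heappop(heap)
    return task


push(2, Task("write report"))
push(1, Task("fix outage"))
push(1, Task("page on-call"))

order = [pop().name for _ in range(3)]
print(order)  # ['fix outage', 'page on-call', 'write report']
```

A side effect of the increasing counter is that entries with equal priority come out in insertion order.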

- title: Modifying heap elements directly
description: |
Changing an element's value after it's in the heap breaks heap property.
fix: |
heapq doesn't support "decrease key" directly. Either: (1) use lazy deletion
(push a new entry, treat the old one as invalid, skip it when popped), or
(2) re-heapify the entire heap with heapq.heapify, which costs O(n).
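
One possible shape for lazy deletion is sketched below. The `current` dictionary and the `set_priority` and `pop_min` helpers are assumptions made for illustration; the idea is only that a popped entry counts as valid when it matches the latest recorded priority for its key.

```python
import heapq

heap: list[tuple[int, str]] = []   # (priority, key) entries, possibly stale
current: dict[str, int] = {}       # key -> latest valid priority


def set_priority(key: str, priority: int) -> None:
    """Insert a key or change its priority without touching old entries."""
    current[key] = priority
    heapq.heappush(heap, (priority, key))   # any older entry becomes stale


def pop_min() -> tuple[int, str]:
    """Pop the smallest valid entry, skipping stale ones."""
    while heap:
        priority, key = heapq.heappop(heap)
        if current.get(key) == priority:
            del current[key]
            return priority, key
    raise KeyError("pop from empty priority queue")


set_priority("a", 5)
set_priority("b", 3)
set_priority("a", 1)   # "decrease key": the (5, "a") entry is now stale

first = pop_min()      # (1, 'a')
second = pop_min()     # (3, 'b')
```

The trade-off is memory: stale entries stay in the heap until they surface, so the heap can grow beyond the number of live keys.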

variations:
- name: Top K elements
description: |
Keep k largest using min-heap of size k, or k smallest using max-heap
of size k.
example: "Kth Largest Element, Top K Frequent Elements"

- name: K-way merge
description: |
Merge k sorted lists efficiently by maintaining heap of current elements
from each list.
example: "Merge K Sorted Lists, Smallest Range Covering K Lists"

- name: Two heaps (median)
description: |
Maintain two heaps: max-heap for smaller half, min-heap for larger half.
The median is the root of the bigger heap, or the average of both roots
when the heaps are the same size.
example: "Find Median from Data Stream, Sliding Window Median"

- name: Dijkstra's algorithm
description: |
Min-heap tracks vertices by shortest known distance. Extract minimum,
relax edges, push improved distances onto the heap.
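
A minimal sketch of this loop, assuming the graph is given as an adjacency dictionary of `(neighbor, weight)` pairs with non-negative weights. The function name, graph format, and sample graph are illustrative assumptions. Because `heapq` has no decrease-key, improved distances are pushed as new entries and outdated ones are skipped when popped.

```python
import heapq


def dijkstra(graph: dict[str, list[tuple[str, int]]], source: str) -> dict[str, int]:
    """Shortest distances from source; edge weights must be non-negative."""
    dist = {source: 0}
    heap = [(0, source)]                 # (distance so far, vertex)

    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                     # stale entry: a shorter path is known
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd             # relax edge u -> v
                heapq.heappush(heap, (nd, v))

    return dist


graph = {
    "A": [("B", 4), ("C", 1)],
    "C": [("B", 2), ("D", 5)],
    "B": [("D", 1)],
}
print(dijkstra(graph, "A"))  # {'A': 0, 'B': 3, 'C': 1, 'D': 4}
```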
example: "Network Delay Time, Cheapest Flights Within K Stops"

- name: Task scheduling
description: |
Prioritize tasks by some criterion (deadline, duration). Process the
highest-priority task first.
example: "Task Scheduler, Meeting Rooms III"

related_patterns:
- binary-search
- two-pointers

prerequisite_patterns: []