questions F-L

2025-05-25 11:47:04 +01:00
parent 798e0ba1df
commit 5dbe52df0d
54 changed files with 11235 additions and 0 deletions


@@ -0,0 +1,248 @@
title: Find in Mountain Array
slug: find-in-mountain-array
difficulty: hard
leetcode_id: 1095
leetcode_url: https://leetcode.com/problems/find-in-mountain-array/
categories:
- arrays
- binary-search
patterns:
- binary-search
description: |
*(This problem is an **interactive problem**.)*
You may recall that an array `arr` is a **mountain array** if and only if:
- `arr.length >= 3`
- There exists some `i` with `0 < i < arr.length - 1` such that:
  - `arr[0] < arr[1] < ... < arr[i - 1] < arr[i]`
  - `arr[i] > arr[i + 1] > ... > arr[arr.length - 1]`
Given a mountain array `mountainArr`, return the **minimum** index such that `mountainArr.get(index) == target`. If such an index does not exist, return `-1`.
**You cannot access the mountain array directly.** You may only access the array using a `MountainArray` interface:
- `MountainArray.get(k)` returns the element of the array at index `k` (0-indexed).
- `MountainArray.length()` returns the length of the array.
Submissions making more than `100` calls to `MountainArray.get` will be judged *Wrong Answer*. Also, any solutions that attempt to circumvent the judge will result in disqualification.
constraints: |
- `3 <= mountainArr.length() <= 10^4`
- `0 <= target <= 10^9`
- `0 <= mountainArr.get(index) <= 10^9`
examples:
- input: "mountainArr = [1,2,3,4,5,3,1], target = 3"
output: "2"
explanation: "3 exists in the array, at index=2 and index=5. Return the minimum index, which is 2."
- input: "mountainArr = [0,1,2,4,2,1], target = 3"
output: "-1"
explanation: "3 does not exist in the array, so we return -1."
explanation:
intuition: |
Imagine a mountain with a single peak. You're standing at the base and need to find a specific elevation marker — but you can only check the elevation at a limited number of points (100 checks maximum).
The key insight is that a mountain array is actually **two sorted arrays joined at the peak**: the left side is strictly increasing, and the right side is strictly decreasing. This structure is perfect for binary search!
Think of it like this: if we can find the peak, we've essentially split the problem into two simpler binary searches:
1. Search the ascending left side (standard binary search)
2. If not found, search the descending right side (reversed binary search)
But here's the crucial detail: we want the **minimum index**. Since the left side has smaller indices than the right side, we should search the left side first. If we find the target there, we're done — no need to check the right side.
The challenge is that each `get()` call is expensive (limited to 100 total), so we must use binary search for all three operations: finding the peak and searching both sides.
approach: |
We solve this using **Three Binary Searches**:
**Step 1: Find the peak index**
- Use binary search to locate the peak (maximum element)
- At each midpoint, compare `arr[mid]` with `arr[mid + 1]`
- If `arr[mid] < arr[mid + 1]`, we're on the ascending side — peak is to the right
- If `arr[mid] > arr[mid + 1]`, we're on the descending side or at the peak — search left
- When `left == right`, we've found the peak
&nbsp;
**Step 2: Binary search the ascending (left) side**
- Search from index `0` to `peak` using standard binary search
- If `arr[mid] < target`, move right; if `arr[mid] > target`, move left
- If found, return immediately (this guarantees minimum index)
&nbsp;
**Step 3: Binary search the descending (right) side**
- Only if not found on the left side
- Search from index `peak + 1` to `n - 1`
- Since this side is **decreasing**, the comparisons are reversed:
  - If `arr[mid] > target`, move right (smaller values are to the right)
  - If `arr[mid] < target`, move left
&nbsp;
**Step 4: Return the result**
- If found on either side, return that index
- Otherwise, return `-1`
common_pitfalls:
- title: Exceeding the Call Limit
description: |
With at most `100` calls to `MountainArray.get()` and array length up to `10^4`, a linear scan is not an option.
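Peak finding costs two `get()` calls per iteration (about `2 * 14 = 28` for `n = 10^4`), and each of the two side searches costs one call per iteration (at most about `14` each). The total is therefore roughly `28 + 14 + 14 = 56` calls, well within the limit. Caching values you've already fetched can trim redundant calls further.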
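To check the call budget empirically, here is a minimal sketch that wraps a plain list in a counting stand-in for the judge's interface. `CountingMountainArray`, its `calls` attribute, and the `find_in_mountain_array` helper are hypothetical names introduced for illustration; they are not part of the LeetCode API.

```python
class CountingMountainArray:
    """Hypothetical stand-in for the judge's MountainArray that counts get() calls."""

    def __init__(self, values: list[int]) -> None:
        self._values = values
        self.calls = 0

    def get(self, index: int) -> int:
        self.calls += 1
        return self._values[index]

    def length(self) -> int:
        return len(self._values)


def find_in_mountain_array(target: int, mountain_arr) -> int:
    n = mountain_arr.length()

    # Peak search: two get() calls per iteration
    left, right = 0, n - 1
    while left < right:
        mid = (left + right) // 2
        if mountain_arr.get(mid) < mountain_arr.get(mid + 1):
            left = mid + 1
        else:
            right = mid
    peak = left

    def search(lo: int, hi: int, ascending: bool) -> int:
        # One get() call per iteration; direction flips on the descending side
        while lo <= hi:
            mid = (lo + hi) // 2
            val = mountain_arr.get(mid)
            if val == target:
                return mid
            if (val < target) == ascending:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1

    idx = search(0, peak, True)
    return idx if idx != -1 else search(peak + 1, n - 1, False)


# A mountain of the maximum allowed length, searched for a missing target
big = CountingMountainArray(list(range(5000)) + list(range(5000, 0, -1)))
result = find_in_mountain_array(10**9, big)
print(result, big.calls)
```

A missing target forces all three searches to run to completion, which makes it a reasonable stress case for the call count.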
wrong_approach: "Linear scan or excessive get() calls"
correct_approach: "Three binary searches with O(log n) calls each"
- title: Forgetting to Search Left Side First
description: |
The problem asks for the **minimum index**. If the target appears on both the ascending and descending sides (like `3` in `[1,2,3,4,5,3,1]`), you must return the smaller index.
Always search the left (ascending) side first and return immediately if found. Only search the right side if the left search fails.
wrong_approach: "Searching right side first or both sides without priority"
correct_approach: "Search ascending side first, return immediately if found"
- title: Incorrect Binary Search Direction on Descending Side
description: |
The descending (right) side of the mountain is sorted in **reverse order**. Standard binary search logic must be inverted:
- In ascending order: `arr[mid] < target` means move right
- In descending order: `arr[mid] < target` means move **left** (larger values are to the left)
Mixing up these directions causes incorrect results.
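A minimal sketch of the inverted comparison on a plain descending list (the helper name `search_descending` is an assumption for illustration; the real solution goes through `MountainArray.get`):

```python
def search_descending(values: list[int], target: int) -> int:
    """Binary search in a strictly decreasing list; returns an index or -1."""
    lo, hi = 0, len(values) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if values[mid] == target:
            return mid
        if values[mid] > target:
            lo = mid + 1  # values shrink to the right, so go right
        else:
            hi = mid - 1  # larger values are to the left
    return -1


print(search_descending([9, 7, 4, 2], 4))  # 2
print(search_descending([9, 7, 4, 2], 5))  # -1
```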
wrong_approach: "Using same comparison logic for both sides"
correct_approach: "Invert comparisons for the descending side"
- title: Off-by-One in Peak Finding
description: |
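When finding the peak, be careful with boundary conditions. The peak can never be at index `0` or `n - 1` (by definition of a mountain array), so you can initialise `left = 1` and `right = n - 2` for safety.
Starting from `left = 0` and `right = n - 1` also works: with a `left < right` loop and a floor midpoint, `mid < right` always holds, so `mid + 1` never leaves the array. If you change the loop condition or the midpoint rounding, re-check that `mid + 1` stays within bounds.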
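A minimal sketch of peak finding with the tighter `[1, n - 2]` bounds, written against a plain list rather than the `MountainArray` interface (the function name `find_peak` is an assumption for illustration):

```python
def find_peak(values: list[int]) -> int:
    """Return the peak index of a mountain list (assumes len(values) >= 3)."""
    left, right = 1, len(values) - 2  # the peak is never at either end
    while left < right:
        mid = (left + right) // 2     # floor midpoint, so mid < right
        if values[mid] < values[mid + 1]:
            left = mid + 1            # still climbing: peak is to the right
        else:
            right = mid               # descending, or already at the peak
    return left


print(find_peak([1, 2, 3, 4, 5, 3, 1]))  # 4
print(find_peak([0, 1, 0]))              # 1
```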
key_takeaways:
- "**Decompose the problem**: A mountain array is two sorted subarrays — find the peak first, then binary search each half"
- "**Binary search on structure**: When data has a predictable structure (sorted, bitonic, rotated), binary search can dramatically reduce search time"
- "**Order matters for ties**: When finding minimum/maximum index, search the appropriate half first to short-circuit early"
- "**Interactive problems**: Limited API calls force O(log n) solutions — linear scans are not acceptable"
time_complexity: "O(log n). Three binary searches, each taking O(log n) time in the worst case."
space_complexity: "O(1). We only use a constant number of variables for indices and bounds."
solutions:
- approach_name: Triple Binary Search
is_optimal: true
code: |
# MountainArray interface is provided by the judge:
# class MountainArray:
#     def get(self, index: int) -> int: ...
#     def length(self) -> int: ...
class Solution:
    def findInMountainArray(self, target: int, mountain_arr: 'MountainArray') -> int:
        n = mountain_arr.length()
        # Step 1: Find the peak index using binary search
        left, right = 0, n - 1
        while left < right:
            mid = (left + right) // 2
            # If mid is less than mid+1, peak is to the right
            if mountain_arr.get(mid) < mountain_arr.get(mid + 1):
                left = mid + 1
            else:
                # Peak is at mid or to the left
                right = mid
        peak = left
        # Step 2: Binary search on ascending (left) side [0, peak]
        left, right = 0, peak
        while left <= right:
            mid = (left + right) // 2
            val = mountain_arr.get(mid)
            if val == target:
                return mid  # Found on left side = minimum index
            elif val < target:
                left = mid + 1
            else:
                right = mid - 1
        # Step 3: Binary search on descending (right) side [peak+1, n-1]
        left, right = peak + 1, n - 1
        while left <= right:
            mid = (left + right) // 2
            val = mountain_arr.get(mid)
            if val == target:
                return mid
            # Descending order: larger values on left, smaller on right
            elif val > target:
                left = mid + 1  # Move right to find smaller values
            else:
                right = mid - 1  # Move left to find larger values
        return -1  # Target not found in either half
explanation: |
**Time Complexity:** O(log n) — Three binary searches, each O(log n).
**Space Complexity:** O(1) — Only constant extra space for variables.
We first locate the peak using binary search by comparing adjacent elements. Then we search the ascending left side with standard binary search. If not found, we search the descending right side with inverted comparisons. Peak finding makes two `get()` calls per iteration and each side search makes one, so the total is at most about `2 * log2(n) + log2(n) + log2(n) = 4 * log2(10^4) ≈ 4 * 14 = 56` calls, well within the 100-call limit.
- approach_name: Triple Binary Search with Caching
is_optimal: false
code: |
class Solution:
    def findInMountainArray(self, target: int, mountain_arr: 'MountainArray') -> int:
        n = mountain_arr.length()
        cache = {}  # Cache to avoid redundant get() calls
        def get(i: int) -> int:
            if i not in cache:
                cache[i] = mountain_arr.get(i)
            return cache[i]
        # Find peak
        left, right = 0, n - 1
        while left < right:
            mid = (left + right) // 2
            if get(mid) < get(mid + 1):
                left = mid + 1
            else:
                right = mid
        peak = left
        # Search ascending side
        left, right = 0, peak
        while left <= right:
            mid = (left + right) // 2
            val = get(mid)
            if val == target:
                return mid
            elif val < target:
                left = mid + 1
            else:
                right = mid - 1
        # Search descending side
        left, right = peak + 1, n - 1
        while left <= right:
            mid = (left + right) // 2
            val = get(mid)
            if val == target:
                return mid
            elif val > target:
                left = mid + 1
            else:
                right = mid - 1
        return -1
explanation: |
**Time Complexity:** O(log n) — Same as the optimal solution.
**Space Complexity:** O(log n) — Cache stores at most O(log n) values.
This variation adds a cache dictionary to avoid redundant `get()` calls. While the asymptotic complexity is the same, caching can reduce the actual number of API calls when the same index is accessed multiple times (e.g., the peak index might be checked during both the peak-finding phase and the left-side search). This is a practical optimisation for interactive problems with strict call limits.


@@ -0,0 +1,189 @@
title: Find K Closest Elements
slug: find-k-closest-elements
difficulty: medium
leetcode_id: 658
leetcode_url: https://leetcode.com/problems/find-k-closest-elements/
categories:
- arrays
- binary-search
- two-pointers
patterns:
- binary-search
- two-pointers
description: |
Given a **sorted** integer array `arr`, two integers `k` and `x`, return the `k` closest integers to `x` in the array. The result should also be sorted in ascending order.
An integer `a` is closer to `x` than an integer `b` if:
- `|a - x| < |b - x|`, or
- `|a - x| == |b - x|` and `a < b`
constraints: |
- `1 <= k <= arr.length`
- `1 <= arr.length <= 10^4`
- `arr` is sorted in **ascending** order
- `-10^4 <= arr[i], x <= 10^4`
examples:
- input: "arr = [1,2,3,4,5], k = 4, x = 3"
output: "[1,2,3,4]"
explanation: "Elements 1 and 5 are both at distance 2 from x = 3. The tie goes to the smaller value, so 1 is kept and 5 is dropped."
- input: "arr = [1,1,2,3,4,5], k = 4, x = -1"
output: "[1,1,2,3]"
explanation: "x = -1 is smaller than every element, so the k closest elements are simply the first k values of the array."
explanation:
intuition: |
Imagine you're standing at position `x` on a number line, and the sorted array represents points along that line. You need to find the `k` points closest to where you're standing.
The key insight is that the answer is always a **contiguous subarray** of length `k`. Why? Because the array is sorted! If the elements at indices `i` and `j` (with `j > i + 1`) are both in your answer, then every element between them is at least as close to `x` as the farther of the two, so it belongs in the answer as well.
Think of it like this: you're looking for a **sliding window** of size `k` that captures the `k` closest elements. The question becomes: where should this window start?
Instead of searching for elements, we can **binary search for the left boundary** of this window. For any starting position, we compare whether the left edge or the element just past the right edge is further from `x`. This tells us whether to move the window left or right.
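The contiguity claim can be spot-checked by brute force. The sketch below is an illustration only: the helper names `k_closest_by_sorting` and `is_contiguous_window` are assumptions, and the random ranges are arbitrary.

```python
import random


def k_closest_by_sorting(arr: list[int], k: int, x: int) -> list[int]:
    # Rank by (distance, value), keep the best k, then restore ascending order
    ranked = sorted(arr, key=lambda a: (abs(a - x), a))
    return sorted(ranked[:k])


def is_contiguous_window(arr: list[int], window: list[int]) -> bool:
    # True if `window` equals some slice arr[i:i + k]
    k = len(window)
    return any(arr[i:i + k] == window for i in range(len(arr) - k + 1))


random.seed(0)
for _ in range(200):
    arr = sorted(random.randint(-20, 20) for _ in range(random.randint(1, 12)))
    k = random.randint(1, len(arr))
    x = random.randint(-25, 25)
    assert is_contiguous_window(arr, k_closest_by_sorting(arr, k, x))
```

Every brute-force answer matches some slice of the sorted input, which is what makes searching for a window start sufficient.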
approach: |
We solve this using **Binary Search for Window Start**:
**Step 1: Define the search space**
- We're searching for the starting index of a window of size `k`
- The starting index can range from `0` to `len(arr) - k`
- Set `left = 0`, `right = len(arr) - k`
&nbsp;
**Step 2: Binary search for optimal start position**
- While `left < right`:
  - Calculate `mid = left + (right - left) // 2`
  - Compare `x - arr[mid]` with `arr[mid + k] - x`
  - If `x - arr[mid] > arr[mid + k] - x`:
    - The left edge is further from `x` than the element just past the right edge
    - Move the window right: `left = mid + 1`
  - Else:
    - The left edge is closer (or equal), keep it as a candidate
    - `right = mid`
&nbsp;
**Step 3: Return the window**
- Return `arr[left:left + k]`
&nbsp;
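Why compare `x - arr[mid]` with `arr[mid + k] - x` instead of using absolute values? Because `arr[mid] <= arr[mid + k]`, the signed differences give the right answer in every case. If `x` lies between the two elements, both differences are the actual distances. If both elements are below `x`, the right-hand side is negative, so the window moves right. If both are above `x`, the left-hand side is negative, so the window stays or moves left. On a tie the left element is kept, which matches the "prefer the smaller value" rule.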
common_pitfalls:
- title: Sorting with Custom Key
description: |
A common first approach is to sort the array by distance to `x`:
```python
sorted(arr, key=lambda a: (abs(a - x), a))[:k]
```
This works but has **O(n log n)** time complexity. Since the array is already sorted, we can do better with O(log n + k) using binary search.
wrong_approach: "Sort by distance, take first k"
correct_approach: "Binary search for window start position"
- title: Using Absolute Values in Comparison
description: |
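When comparing distances during binary search, using `abs(arr[mid] - x)` vs `abs(arr[mid + k] - x)` can lead to subtle bugs, most visibly with duplicate values: if `arr[mid] == arr[mid + k]` and both are smaller than `x`, the absolute distances tie and the window stays put, even though shifting right would bring in closer elements.
The comparison `x - arr[mid] > arr[mid + k] - x` works because:
- If both are on the same side of `x`, one difference is negative (or zero), so the sign alone pushes the window towards `x`
- If they straddle `x`, both differences are the true distances, and a tie keeps the smaller (left) value
Using absolute values discards the sign, so it needs extra logic to recover the direction in the same-side case.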
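A small counterexample makes the difference concrete. The sketch below runs the same window search with both comparisons; the `use_abs` flag and the function name `closest_window` are assumptions introduced for illustration.

```python
def closest_window(arr: list[int], k: int, x: int, use_abs: bool = False) -> list[int]:
    left, right = 0, len(arr) - k
    while left < right:
        mid = left + (right - left) // 2
        if use_abs:
            # Flawed variant: the sign information is thrown away
            move_right = abs(arr[mid] - x) > abs(arr[mid + k] - x)
        else:
            move_right = x - arr[mid] > arr[mid + k] - x
        if move_right:
            left = mid + 1
        else:
            right = mid
    return arr[left:left + k]


arr = [1, 1, 2, 2, 2, 2, 2, 3, 3]
print(closest_window(arr, 3, 3))                # [2, 3, 3]
print(closest_window(arr, 3, 3, use_abs=True))  # [2, 2, 2]
```

With `x = 3` the correct answer is `[2, 3, 3]`; the absolute-value variant gets stuck on the run of `2`s because equal values produce equal distances.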
wrong_approach: "abs(arr[mid] - x) vs abs(arr[mid + k] - x)"
correct_approach: "x - arr[mid] vs arr[mid + k] - x"
- title: Wrong Search Space Bounds
description: |
The right bound must be `len(arr) - k`, not `len(arr) - 1`. We're searching for the *start* of a window of size `k`, so the maximum valid start index is `n - k`.
If `arr = [1,2,3,4,5]` and `k = 3`, valid start indices are 0, 1, 2 (giving windows [1,2,3], [2,3,4], [3,4,5]).
wrong_approach: "right = len(arr) - 1"
correct_approach: "right = len(arr) - k"
key_takeaways:
- "**Contiguous subarray insight**: In a sorted array, the k closest elements form a contiguous window"
- "**Binary search for boundaries**: Instead of searching for elements, search for the optimal window position"
- "**Comparison without abs()**: When comparing distances on opposite sides, signed arithmetic handles it correctly"
- "**Foundation for window problems**: This technique extends to other problems about finding optimal subarrays in sorted data"
time_complexity: "O(log(n - k) + k). Binary search takes O(log(n - k)), and returning the slice takes O(k)."
space_complexity: "O(k). The returned list contains k elements. The binary search itself uses O(1) extra space."
solutions:
- approach_name: Binary Search for Window Start
is_optimal: true
code: |
def find_closest_elements(arr: list[int], k: int, x: int) -> list[int]:
    # Search for the starting index of the k-element window
    left, right = 0, len(arr) - k
    while left < right:
        mid = left + (right - left) // 2
        # Compare left edge distance vs element just past right edge
        if x - arr[mid] > arr[mid + k] - x:
            # Left edge is further, move window right
            left = mid + 1
        else:
            # Left edge is closer (or equal), keep as candidate
            right = mid
    # Return the k-element window starting at left
    return arr[left:left + k]
explanation: |
**Time Complexity:** O(log(n - k) + k) — Binary search over n - k + 1 positions, plus slicing k elements.
**Space Complexity:** O(k) — Output array of k elements.
We binary search for the optimal starting position of a window of size k. The comparison `x - arr[mid] > arr[mid + k] - x` determines if the left boundary or the element just past the right boundary is further from x. This guides us toward the optimal window.
- approach_name: Two Pointers (Shrinking Window)
is_optimal: false
code: |
def find_closest_elements(arr: list[int], k: int, x: int) -> list[int]:
    left, right = 0, len(arr) - 1
    # Shrink window until it has exactly k elements
    while right - left >= k:
        # Compare distances of left and right edges to x
        if abs(arr[left] - x) > abs(arr[right] - x):
            # Left edge is further, exclude it
            left += 1
        else:
            # Right edge is further (or equal), exclude it
            # Equal case: prefer smaller value (left), so exclude right
            right -= 1
    return arr[left:right + 1]
explanation: |
**Time Complexity:** O(n - k) — We shrink the window n - k times.
**Space Complexity:** O(k) — Output array of k elements.
Start with the full array and repeatedly remove the element furthest from x until k elements remain. When distances are equal, remove the larger (right) element to satisfy the tie-breaking rule. Simpler to understand than binary search but slower for small k.
- approach_name: Sort by Distance
is_optimal: false
code: |
def find_closest_elements(arr: list[int], k: int, x: int) -> list[int]:
    # Sort by distance to x, then by value for tie-breaking
    sorted_arr = sorted(arr, key=lambda a: (abs(a - x), a))
    # Take k closest and sort by value for output
    result = sorted(sorted_arr[:k])
    return result
explanation: |
**Time Complexity:** O(n log n) — Sorting dominates.
**Space Complexity:** O(n) — Sorted copy of the array.
Sort all elements by their distance to x (with value as tie-breaker), take the first k, then sort again by value for the output. This ignores the fact that the input is already sorted, making it less efficient than the binary search approach.


@@ -0,0 +1,200 @@
title: Find Median from Data Stream
slug: find-median-from-data-stream
difficulty: hard
leetcode_id: 295
leetcode_url: https://leetcode.com/problems/find-median-from-data-stream/
categories:
- heap
- sorting
patterns:
- heap
description: |
The **median** is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.
- For example, for `arr = [2, 3, 4]`, the median is `3`.
- For example, for `arr = [2, 3]`, the median is `(2 + 3) / 2 = 2.5`.
Implement the `MedianFinder` class:
- `MedianFinder()` initialises the `MedianFinder` object.
- `void addNum(int num)` adds the integer `num` from the data stream to the data structure.
- `double findMedian()` returns the median of all elements so far. Answers within `10^-5` of the actual answer will be accepted.
constraints: |
- `-10^5 <= num <= 10^5`
- There will be at least one element in the data structure before calling `findMedian`.
- At most `5 * 10^4` calls will be made to `addNum` and `findMedian`.
examples:
- input: |
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]
output: "[null, null, null, 1.5, null, 2.0]"
explanation: |
MedianFinder medianFinder = new MedianFinder();
medianFinder.addNum(1); // arr = [1]
medianFinder.addNum(2); // arr = [1, 2]
medianFinder.findMedian(); // return 1.5 (i.e., (1 + 2) / 2)
medianFinder.addNum(3); // arr = [1, 2, 3]
medianFinder.findMedian(); // return 2.0
explanation:
intuition: |
Imagine you're watching numbers flow by on a conveyor belt, and at any moment someone might ask: "What's the median of all numbers you've seen so far?"
The naive approach would be to keep a sorted list and insert each new number in its correct position. But insertion into a sorted list takes O(n) time, which becomes too slow with many operations.
Here's the key insight: **you don't need the entire sorted list to find the median**. You only need quick access to the middle element(s). Think of splitting the numbers into two halves:
- The **smaller half** — all numbers less than or equal to the median
- The **larger half** — all numbers greater than or equal to the median
If you had instant access to the **maximum of the smaller half** and the **minimum of the larger half**, you could compute the median immediately. This is exactly what two heaps provide:
- A **max-heap** for the smaller half (gives you the largest of the small numbers)
- A **min-heap** for the larger half (gives you the smallest of the large numbers)
By keeping these heaps balanced (differing in size by at most 1), the median is always at the top of one or both heaps.
approach: |
We solve this using the **Two Heaps** pattern:
**Step 1: Initialise two heaps**
- `max_heap`: A max-heap to store the smaller half of numbers (in Python, we negate values since `heapq` is a min-heap)
- `min_heap`: A min-heap to store the larger half of numbers
- We maintain the invariant: `len(max_heap) >= len(min_heap)` and they differ by at most 1
&nbsp;
**Step 2: Adding a number**
- First, add the new number to `max_heap` (the smaller half)
- Then, move the largest from `max_heap` to `min_heap` to ensure no element in `max_heap` is larger than any element in `min_heap`
- If `min_heap` becomes larger than `max_heap`, move one element back to balance
This "add-then-balance" approach ensures both heaps stay balanced and maintain the correct ordering.
&nbsp;
**Step 3: Finding the median**
- If total count is odd: the median is the top of `max_heap` (the larger heap)
- If total count is even: the median is the average of both heap tops
&nbsp;
This approach guarantees O(log n) insertion and O(1) median retrieval.
common_pitfalls:
- title: Sorted List Insertion Trap
description: |
A tempting first approach is to maintain a sorted list using binary search insertion:
- Use `bisect.insort()` to insert each number in O(log n) search time
- But the actual insertion into the list still takes O(n) time due to shifting elements
With up to `5 * 10^4` operations, this O(n) insertion leads to O(n^2) total time, which may cause TLE.
wrong_approach: "Sorted list with binary search insertion"
correct_approach: "Two heaps for O(log n) insertion"
- title: Single Heap Mistake
description: |
You might think one heap is enough — just keep all elements and find the middle. But heaps only give you efficient access to one extreme (min or max), not the middle.
Finding the median in a single heap requires removing half the elements, which is O(n log n) per query.
wrong_approach: "Single heap with repeated extraction"
correct_approach: "Two heaps splitting at the median"
- title: Heap Imbalance
description: |
If the heaps become unbalanced (size difference > 1), the median calculation breaks. For example, if `max_heap` has 5 elements and `min_heap` has 2, the top of `max_heap` is not the median.
Always rebalance after each insertion to maintain the invariant: `0 <= len(max_heap) - len(min_heap) <= 1`.
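One way to guard against imbalance while developing is to assert the invariant after every insertion and cross-check against `statistics.median` from the standard library. This is a testing sketch under assumptions: the free functions `add_num`, `median`, and `check_stream` are hypothetical helpers that mirror the two-heap logic, not part of the required class interface.

```python
import heapq
import random
import statistics


def add_num(max_heap: list[int], min_heap: list[int], num: int) -> None:
    heapq.heappush(max_heap, -num)                          # into the lower half
    heapq.heappush(min_heap, -heapq.heappop(max_heap))      # largest moves up
    if len(min_heap) > len(max_heap):                       # rebalance
        heapq.heappush(max_heap, -heapq.heappop(min_heap))


def median(max_heap: list[int], min_heap: list[int]) -> float:
    if len(max_heap) > len(min_heap):
        return -max_heap[0]
    return (-max_heap[0] + min_heap[0]) / 2


def check_stream(nums: list[int]) -> bool:
    max_heap, min_heap, seen = [], [], []
    for num in nums:
        add_num(max_heap, min_heap, num)
        seen.append(num)
        # Size invariant: lower half holds the same count or exactly one more
        assert 0 <= len(max_heap) - len(min_heap) <= 1
        # Ordering invariant: top of lower half never exceeds top of upper half
        if min_heap:
            assert -max_heap[0] <= min_heap[0]
        assert median(max_heap, min_heap) == statistics.median(seen)
    return True


random.seed(0)
check_stream([random.randint(-10**5, 10**5) for _ in range(500)])
```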
wrong_approach: "Inserting without rebalancing"
correct_approach: "Rebalance heaps after every insertion"
- title: Python Heap Negation
description: |
Python's `heapq` module only provides a min-heap. To simulate a max-heap, you must negate values when pushing and negate again when popping.
Forgetting to negate leads to incorrect ordering — you'd get the minimum of the smaller half instead of the maximum.
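A minimal sketch of the negation trick using only `heapq.heappush` and `heapq.heappop` (the sample values are arbitrary):

```python
import heapq

max_heap: list[int] = []
for value in [5, 1, 8, 3]:
    heapq.heappush(max_heap, -value)  # store negated values

largest = -heapq.heappop(max_heap)    # negate again on the way out
next_largest = -max_heap[0]           # peek without popping

print(largest, next_largest)  # 8 5
```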
wrong_approach: "Using heapq as max-heap without negation"
correct_approach: "Negate values: push -x, pop and negate result"
key_takeaways:
- "**Two Heaps pattern**: Split data at the median using a max-heap for the lower half and min-heap for the upper half"
- "**Streaming data structure**: This design handles continuous data with O(log n) updates and O(1) queries"
- "**Heap balancing invariant**: Keep heap sizes within 1 of each other to ensure the median is always accessible at the tops"
- "**Foundation for variations**: This technique extends to finding other percentiles or handling weighted medians"
time_complexity: "O(log n) per `addNum` call due to heap insertion and rebalancing. O(1) per `findMedian` call since we only access heap tops."
space_complexity: "O(n) where n is the total number of elements added, as all elements are stored across the two heaps."
solutions:
- approach_name: Two Heaps
is_optimal: true
code: |
import heapq
class MedianFinder:
    def __init__(self):
        # Max-heap for smaller half (store negated values)
        self.max_heap = []
        # Min-heap for larger half
        self.min_heap = []
    def addNum(self, num: int) -> None:
        # Always add to max_heap first (negate for max-heap behaviour)
        heapq.heappush(self.max_heap, -num)
        # Move largest from max_heap to min_heap
        # This ensures max_heap elements <= min_heap elements
        heapq.heappush(self.min_heap, -heapq.heappop(self.max_heap))
        # Rebalance: max_heap should have equal or one more element
        if len(self.min_heap) > len(self.max_heap):
            heapq.heappush(self.max_heap, -heapq.heappop(self.min_heap))
    def findMedian(self) -> float:
        # Odd total: median is top of max_heap
        if len(self.max_heap) > len(self.min_heap):
            return -self.max_heap[0]
        # Even total: average of both tops
        return (-self.max_heap[0] + self.min_heap[0]) / 2
explanation: |
**Time Complexity:** O(log n) for `addNum` — each heap operation is O(log n). O(1) for `findMedian` — just accessing heap tops.
**Space Complexity:** O(n) — storing all n elements across two heaps.
We maintain two heaps that split the data at the median. The max-heap holds the smaller half, the min-heap holds the larger half. After each insertion, we rebalance to keep sizes within 1. The median is always accessible at the top(s) of the heaps.
- approach_name: Sorted List with Binary Search
is_optimal: false
code: |
import bisect
class MedianFinder:
    def __init__(self):
        # Maintain a sorted list of all numbers
        self.nums = []
    def addNum(self, num: int) -> None:
        # Binary search to find insertion point: O(log n)
        # But actual insertion shifts elements: O(n)
        bisect.insort(self.nums, num)
    def findMedian(self) -> float:
        n = len(self.nums)
        mid = n // 2
        # Odd length: return middle element
        if n % 2 == 1:
            return self.nums[mid]
        # Even length: return average of two middle elements
        return (self.nums[mid - 1] + self.nums[mid]) / 2
explanation: |
**Time Complexity:** O(n) for `addNum` — binary search is O(log n) but list insertion is O(n). O(1) for `findMedian` — direct index access.
**Space Complexity:** O(n) — storing all n elements in a list.
This approach maintains a sorted list. It is conceptually simple and gives O(1) median lookup, but the O(n) insertion time scales poorly for large inputs. It's included to illustrate why the two-heap design is the better fit.


@@ -0,0 +1,164 @@
title: Find Minimum in Rotated Sorted Array
slug: find-minimum-in-rotated-sorted-array
difficulty: medium
leetcode_id: 153
leetcode_url: https://leetcode.com/problems/find-minimum-in-rotated-sorted-array/
categories:
- arrays
- binary-search
patterns:
- binary-search
description: |
Suppose an array of length `n` sorted in ascending order is **rotated** between `1` and `n` times. For example, the array `nums = [0,1,2,4,5,6,7]` might become:
- `[4,5,6,7,0,1,2]` if it was rotated 4 times
- `[0,1,2,4,5,6,7]` if it was rotated 7 times (back to original)
Given the sorted rotated array `nums` of **unique** elements, return *the minimum element of this array*.
You must write an algorithm that runs in **O(log n)** time.
constraints: |
- `n == nums.length`
- `1 <= n <= 5000`
- `-5000 <= nums[i] <= 5000`
- All the integers of `nums` are **unique**
- `nums` is sorted and rotated between 1 and n times
examples:
- input: "nums = [3,4,5,1,2]"
output: "1"
explanation: "Original array was [1,2,3,4,5] rotated 3 times."
- input: "nums = [4,5,6,7,0,1,2]"
output: "0"
explanation: "Original array was [0,1,2,4,5,6,7] rotated 4 times."
- input: "nums = [11,13,15,17]"
output: "11"
explanation: "Array was rotated 4 times (full rotation), so minimum is first element."
explanation:
intuition: |
Visualise a rotated sorted array: it's like taking a sorted array, cutting it somewhere, and swapping the two pieces. This creates a **pivot point** — the place where the large values suddenly drop to small values.
For example, in `[4,5,6,7,0,1,2]`, the pivot is between 7 and 0. The minimum element is always at this pivot point!
Think of it like this: the array has two sorted "halves". One half has larger values, the other has smaller values. The minimum is the first element of the smaller half.
How do we find it with binary search? Compare `nums[mid]` with `nums[right]`:
- If `nums[mid] > nums[right]`: We're in the "larger" half. The pivot (minimum) must be to the right.
- If `nums[mid] <= nums[right]`: We're in the "smaller" half or at the minimum. The pivot is at `mid` or to the left.
Why compare with `right` instead of `left`? Because `nums[right]` always belongs to the "smaller" half (or the array is fully sorted), so `nums[mid] > nums[right]` holds exactly when `mid` is in the "larger" half, regardless of how much the array was rotated.
approach: |
We solve this using **Modified Binary Search**:
**Step 1: Initialise pointers**
- `left = 0`, `right = len(nums) - 1`
- The minimum must be somewhere in `[left, right]`
&nbsp;
**Step 2: Binary search with right comparison**
- While `left < right`:
  - Calculate `mid = left + (right - left) // 2`
  - If `nums[mid] > nums[right]`:
    - The pivot (minimum) is in the right half
    - Set `left = mid + 1` (exclude mid; it's too large)
  - Else (`nums[mid] <= nums[right]`):
    - The pivot is at `mid` or in the left half
    - Set `right = mid` (keep mid in consideration)
&nbsp;
**Step 3: Return the minimum**
- When `left == right`, we've found the minimum
- Return `nums[left]`
&nbsp;
This works because we're essentially searching for the "boundary" where the array transitions from large values to small values.
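As a sanity check, the search can be run against every rotation of a small sorted list. This is an illustrative sketch; the `base` values are taken from the problem's example and the loop is only a test harness.

```python
def find_min(nums: list[int]) -> int:
    left, right = 0, len(nums) - 1
    while left < right:
        mid = left + (right - left) // 2
        if nums[mid] > nums[right]:
            left = mid + 1   # mid is in the larger segment
        else:
            right = mid      # mid may itself be the minimum
    return nums[left]


base = [0, 1, 2, 4, 5, 6, 7]
for shift in range(len(base)):
    rotated = base[shift:] + base[:shift]
    assert find_min(rotated) == min(base)
```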
common_pitfalls:
- title: Comparing with Left Instead of Right
description: |
Comparing `nums[mid]` with `nums[left]` doesn't work consistently. Consider `[2, 1]`:
- `mid = 0`, `nums[mid] = 2`, `nums[left] = 2`
- `nums[mid] > nums[left]` is false, but the minimum is on the right!
Comparing with `nums[right]` works because the right element is always either in the "smaller half" (after pivot) or the array isn't rotated.
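The failure is easy to reproduce. The sketch below is a deliberately flawed variant (the name `find_min_left_compare` is an assumption for illustration) that swaps in `nums[left]` for the comparison:

```python
def find_min_left_compare(nums: list[int]) -> int:
    """Deliberately flawed: compares nums[mid] with nums[left]."""
    left, right = 0, len(nums) - 1
    while left < right:
        mid = left + (right - left) // 2
        if nums[mid] > nums[left]:
            left = mid + 1
        else:
            right = mid
    return nums[left]


print(find_min_left_compare([2, 1]))     # 2, but the minimum is 1
print(find_min_left_compare([1, 2, 3]))  # 3, but the minimum is 1
```

The second call shows that the flawed variant also breaks on an array that is already in sorted order.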
wrong_approach: "if nums[mid] > nums[left]: search right"
correct_approach: "if nums[mid] > nums[right]: search right"
- title: Using left <= right Loop Condition
description: |
For this problem, use `while left < right`. When `left == right`, we've found the answer. With `<=`, the `else` branch sets `right = mid` once `left == right`, which no longer shrinks the range, so the loop never terminates.
wrong_approach: "while left <= right"
correct_approach: "while left < right"
- title: Excluding mid Incorrectly
description: |
When `nums[mid] <= nums[right]`, `mid` could be the minimum! We must keep it in consideration by setting `right = mid`, not `right = mid - 1`.
When `nums[mid] > nums[right]`, we know `mid` is definitely not the minimum (it's larger than something to its right), so `left = mid + 1` is safe.
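A short sketch of what goes wrong when `mid` is discarded too eagerly. The function name `find_min_excluding_mid` is an assumption for illustration, and the bug is intentional:

```python
def find_min_excluding_mid(nums: list[int]) -> int:
    """Deliberately flawed: drops mid even when it may be the minimum."""
    left, right = 0, len(nums) - 1
    while left < right:
        mid = left + (right - left) // 2
        if nums[mid] > nums[right]:
            left = mid + 1
        else:
            right = mid - 1  # bug: mid itself could be the minimum
    return nums[left]


print(find_min_excluding_mid([3, 1, 2]))  # 3, but the minimum is 1
```

On `[3, 1, 2]` the first midpoint lands exactly on the minimum, and `right = mid - 1` throws it away.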
wrong_approach: "right = mid - 1 when nums[mid] <= nums[right]"
correct_approach: "right = mid (keep mid as a candidate)"
key_takeaways:
- "**Binary search on rotated arrays**: Compare with the right boundary to determine which half contains the answer"
- "**Understanding the structure**: A rotated sorted array has two sorted segments — find the boundary between them"
- "**Careful with boundary updates**: `mid + 1` vs `mid` depends on whether mid can be the answer"
- "**Foundation for harder problems**: This technique extends to searching for any element in rotated arrays"
time_complexity: "O(log n). Each iteration halves the search space."
space_complexity: "O(1). Only a constant number of variables are used."
solutions:
- approach_name: Binary Search
is_optimal: true
code: |
def find_min(nums: list[int]) -> int:
    left, right = 0, len(nums) - 1
    while left < right:
        mid = left + (right - left) // 2
        if nums[mid] > nums[right]:
            # Mid is in the "larger" half
            # Minimum must be to the right of mid
            left = mid + 1
        else:
            # Mid is in the "smaller" half (or at the minimum)
            # Minimum is at mid or to the left
            right = mid
    # left == right, pointing to the minimum
    return nums[left]
explanation: |
**Time Complexity:** O(log n) — Search space halves each iteration.
**Space Complexity:** O(1) — Constant extra space.
We compare `nums[mid]` with `nums[right]` to determine which half contains the minimum. If `nums[mid] > nums[right]`, the pivot (and therefore the minimum) lies strictly to the right of `mid`. Otherwise, the minimum is at `mid` or to its left. The loop converges to the exact position of the minimum.
- approach_name: Linear Scan
is_optimal: false
code: |
def find_min(nums: list[int]) -> int:
    # Find where sorted order breaks
    for i in range(1, len(nums)):
        if nums[i] < nums[i - 1]:
            return nums[i]
    # No break found: array wasn't rotated (or rotated fully)
    return nums[0]
explanation: |
**Time Complexity:** O(n) — Scans through the array.
**Space Complexity:** O(1) — Constant extra space.
Find the point where the sorted order breaks (current element less than previous). The element at that point is the minimum. If no break is found, the array wasn't rotated, so return the first element. This doesn't meet the O(log n) requirement but is useful for understanding the problem.

@@ -0,0 +1,226 @@
title: Find the Duplicate Number
slug: find-the-duplicate-number
difficulty: medium
leetcode_id: 287
leetcode_url: https://leetcode.com/problems/find-the-duplicate-number/
categories:
- arrays
- two-pointers
patterns:
- fast-slow-pointers
- binary-search
description: |
Given an array of integers `nums` containing `n + 1` integers where each integer is in the range `[1, n]` inclusive.
There is only **one repeated number** in `nums`, return *this repeated number*.
You must solve the problem **without** modifying the array `nums` and using only constant extra space.
constraints: |
- `1 <= n <= 10^5`
- `nums.length == n + 1`
- `1 <= nums[i] <= n`
- All the integers in `nums` appear only **once** except for **precisely one integer** which appears **two or more** times
examples:
- input: "nums = [1,3,4,2,2]"
output: "2"
explanation: "The number 2 appears twice in the array."
- input: "nums = [3,1,3,4,2]"
output: "3"
explanation: "The number 3 appears twice in the array."
- input: "nums = [3,3,3,3,3]"
output: "3"
explanation: "The number 3 appears five times in the array."
explanation:
intuition: |
This problem has a beautiful constraint: the array has `n + 1` elements but values are only in the range `[1, n]`. By the **Pigeonhole Principle**, at least one value must repeat.
The key insight is to view the array as a **linked list** where each value points to the next index. Since values are in `[1, n]` and we have indices `[0, n]`, treating `nums[i]` as "next pointer" creates a valid linked structure.
Think of it like this: if we start at index `0` and repeatedly jump to `nums[current_index]`, we create a sequence. Because one number repeats, two different indices point to the same location — this creates a **cycle**! The duplicate number is the entry point of this cycle.
For example, with `nums = [1,3,4,2,2]`:
- Index 0 → value 1 → jump to index 1
- Index 1 → value 3 → jump to index 3
- Index 3 → value 2 → jump to index 2
- Index 2 → value 4 → jump to index 4
- Index 4 → value 2 → jump to index 2 (cycle!)
The cycle exists because both index 3 and index 4 have value `2`. Floyd's Tortoise and Hare algorithm finds exactly where this cycle begins.
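The walk above can be reproduced with a short sketch. The helper name `trace_walk` is hypothetical, and it uses a set purely for illustration, so it does not respect the O(1) space requirement:

```python
def trace_walk(nums: list[int]) -> list[int]:
    """Follow i -> nums[i] from index 0 until an index is visited twice."""
    path = [0]
    seen = {0}
    while True:
        nxt = nums[path[-1]]
        path.append(nxt)
        if nxt in seen:
            # The first index visited twice is the cycle entrance.
            return path
        seen.add(nxt)


print(trace_walk([1, 3, 4, 2, 2]))  # [0, 1, 3, 2, 4, 2]
```

The last entry of the path is the first index reached twice, which is the cycle entrance and also the duplicated value.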
approach: |
We solve this using **Floyd's Cycle Detection** (Tortoise and Hare):
**Step 1: Detect the cycle**
- `slow`: Moves one step at a time (`slow = nums[slow]`)
- `fast`: Moves two steps at a time (`fast = nums[nums[fast]]`)
- Both start at index `0`
- Keep moving until they meet — this proves a cycle exists
&nbsp;
**Step 2: Find the cycle entrance**
- Reset `slow` to index `0`, keep `fast` at the meeting point
- Move both pointers one step at a time
- The point where they meet again is the duplicate number
&nbsp;
**Why does this work?**
Let's say the distance from start to cycle entrance is `F`, and the cycle length is `C`. When slow and fast first meet:
- Slow has traveled `F + a` steps (where `a` is distance into the cycle)
- Fast has traveled `2(F + a)` steps
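- Both pointers are in the cycle, so fast is ahead of slow by a whole number of laps: `2(F + a) - (F + a) = kC` for some integer `k >= 1`, so `F + a = kC`
This means `F = kC - a`. When we reset slow to start and both move at the same speed, slow travels `F` steps to reach the entrance, while fast travels the same `F = kC - a` steps from its position `a` into the cycle. That puts fast at offset `kC` from the entrance, which is a whole number of laps, so it is also at the entrance!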
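As a numeric sanity check of this argument, the sketch below measures `F`, `C`, and the number of steps slow takes before the first meeting, with both pointers starting at index `0`. The helper name `floyd_distances` is hypothetical, and it uses a dictionary for bookkeeping, so it is an illustration rather than part of the O(1) solution:

```python
def floyd_distances(nums: list[int]) -> tuple[int, int, int]:
    """Return (F, C, steps) for the walk i -> nums[i] starting at index 0."""
    # Record the step at which each index is first visited.
    first_seen = {}
    node, step = 0, 0
    while node not in first_seen:
        first_seen[node] = step
        node = nums[node]
        step += 1
    tail = first_seen[node]  # F: steps from index 0 to the cycle entrance
    cycle = step - tail      # C: length of the cycle

    # Phase 1 of Floyd's algorithm, counting slow's steps until the meeting.
    slow = fast = 0
    steps = 0
    while True:
        slow = nums[slow]
        fast = nums[nums[fast]]
        steps += 1
        if slow == fast:
            break
    return tail, cycle, steps


F, C, steps = floyd_distances([1, 3, 4, 2, 2])
print(F, C, steps)     # 3 2 4
print(steps % C == 0)  # True
```

Here slow has taken `F + a = 4` steps at the meeting, which is two full laps of the length-2 cycle, so walking `F` more steps from the meeting point ends at the entrance.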
&nbsp;
**Step 3: Return the result**
- The meeting point in phase 2 is the duplicate value
common_pitfalls:
- title: Using Extra Space
description: |
A common first instinct is to use a hash set to track seen numbers:
```python
seen = set()
for num in nums:
    if num in seen:
        return num
    seen.add(num)
```
While this works and runs in O(n) time, it uses O(n) space. The problem explicitly requires **O(1) space**, so this approach violates the constraints.
wrong_approach: "Hash set to track seen numbers"
correct_approach: "Floyd's cycle detection using the array itself"
- title: Modifying the Array
description: |
Another tempting approach is to mark visited indices by negating values:
```python
for num in nums:
    idx = abs(num)
    if nums[idx] < 0:
        return idx
    nums[idx] = -nums[idx]
```
This is O(n) time and O(1) space, but it **modifies the input array**, which the problem forbids. The cycle detection approach leaves the array untouched.
wrong_approach: "Negating values to mark as visited"
correct_approach: "Read-only traversal with two pointers"
- title: Sorting the Array
description: |
Sorting and finding adjacent duplicates is intuitive but has two problems:
- It modifies the array (or requires O(n) space for a copy)
- It's O(n log n) time, not optimal
The cycle detection method achieves O(n) time with O(1) space without modification.
wrong_approach: "Sort and find adjacent duplicates"
correct_approach: "Floyd's algorithm for O(n) time, O(1) space"
- title: Confusing Index with Value
description: |
In Floyd's algorithm, we treat values as pointers to indices. A common mistake is confusing when to use the value versus the index.
Remember: `slow = nums[slow]` means "jump to the index that equals the current value." The duplicate is a **value**, not an index — it's what gets returned after phase 2.
key_takeaways:
- "**Cycle detection pattern**: When array values can be treated as pointers (value in valid index range), consider Floyd's algorithm"
- "**Pigeonhole Principle**: With `n + 1` items in `n` slots, at least one slot must have multiple items — guaranteeing a duplicate exists"
- "**Creative problem reframing**: Transforming an array duplicate problem into a linked list cycle problem unlocks an elegant O(1) space solution"
- "**Two-phase approach**: First detect *that* a cycle exists (fast catches slow), then find *where* it starts (both at same speed)"
time_complexity: "O(n). Each pointer traverses at most O(n) steps in both phases."
space_complexity: "O(1). Only two pointer variables are used, regardless of input size."
solutions:
- approach_name: Floyd's Cycle Detection
is_optimal: true
code: |
def find_duplicate(nums: list[int]) -> int:
    # Phase 1: Find the intersection point in the cycle
    slow = nums[0]
    fast = nums[0]
    # Move slow by 1, fast by 2 until they meet
    while True:
        slow = nums[slow]  # One step
        fast = nums[nums[fast]]  # Two steps
        if slow == fast:
            break
    # Phase 2: Find the entrance to the cycle (the duplicate)
    slow = nums[0]  # Reset slow to start
    # Move both at same speed until they meet at cycle entrance
    while slow != fast:
        slow = nums[slow]
        fast = nums[fast]
    # The meeting point is the duplicate number
    return slow
explanation: |
**Time Complexity:** O(n) — Each pointer visits at most n nodes in each phase.
**Space Complexity:** O(1) — Only two pointer variables used.
By treating array values as "next pointers," we transform this into a cycle detection problem. The duplicate causes a cycle because two different indices hold the same value and therefore point to the same next index. Floyd's algorithm finds the cycle entrance in linear time with constant space.
- approach_name: Binary Search on Value Range
is_optimal: false
code: |
def find_duplicate(nums: list[int]) -> int:
    # Search the value range [1, n], not the array indices
    low, high = 1, len(nums) - 1
    while low < high:
        mid = (low + high) // 2
        # Count numbers <= mid
        count = sum(1 for num in nums if num <= mid)
        # If count > mid, duplicate is in [low, mid]
        # Otherwise, duplicate is in [mid+1, high]
        if count > mid:
            high = mid
        else:
            low = mid + 1
    return low
explanation: |
**Time Complexity:** O(n log n) — Binary search over n values, each iteration scans n elements.
**Space Complexity:** O(1) — Only a few variables used.
This approach binary searches the *value* range, not the array. If there are more than `mid` numbers in `[1, mid]`, the duplicate must be in that range (Pigeonhole Principle). While not optimal, this demonstrates binary search on answer space rather than on array indices.
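To make the counting argument concrete, this sketch tabulates how many elements are `<= v` for every candidate value `v`. The helper name `counts_at_most` is hypothetical, and building the whole table costs O(n²) time and O(n) space, so it is only meant for inspecting small inputs:

```python
def counts_at_most(nums: list[int]) -> dict[int, int]:
    """Map each value v in [1, n] to the number of elements <= v."""
    n = len(nums) - 1
    return {v: sum(1 for x in nums if x <= v) for v in range(1, n + 1)}


print(counts_at_most([1, 3, 4, 2, 2]))  # {1: 1, 2: 3, 3: 4, 4: 5}
```

The count first exceeds `v` at `v = 2` and stays above it afterwards, which is the monotone condition the binary search relies on.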
- approach_name: Hash Set
is_optimal: false
code: |
def find_duplicate(nums: list[int]) -> int:
    seen = set()
    for num in nums:
        # If we've seen this number before, it's the duplicate
        if num in seen:
            return num
        seen.add(num)
    return -1  # Should never reach here given constraints
explanation: |
**Time Complexity:** O(n) — Single pass through the array.
**Space Complexity:** O(n) — Hash set stores up to n elements.
The most intuitive approach: track seen numbers and return when we find a repeat. While this violates the O(1) space constraint, it's included to show the trade-off between space and algorithmic complexity. Understanding why this isn't acceptable motivates learning Floyd's algorithm.

@@ -0,0 +1,180 @@
title: Find the Town Judge
slug: find-the-town-judge
difficulty: easy
leetcode_id: 997
leetcode_url: https://leetcode.com/problems/find-the-town-judge/
categories:
- arrays
- graphs
- hash-tables
patterns:
- greedy
description: |
In a town, there are `n` people labeled from `1` to `n`. There is a rumor that one of these people is secretly the town judge.
If the town judge exists, then:
1. The town judge trusts nobody.
2. Everybody (except for the town judge) trusts the town judge.
3. There is exactly one person that satisfies properties **1** and **2**.
You are given an array `trust` where `trust[i] = [a_i, b_i]` representing that the person labeled `a_i` trusts the person labeled `b_i`. If a trust relationship does not exist in the `trust` array, then such a trust relationship does not exist.
Return *the label of the town judge if the town judge exists and can be identified, or return* `-1` *otherwise*.
constraints: |
- `1 <= n <= 1000`
- `0 <= trust.length <= 10^4`
- `trust[i].length == 2`
- All the pairs of `trust` are **unique**
- `a_i != b_i`
- `1 <= a_i, b_i <= n`
examples:
- input: "n = 2, trust = [[1,2]]"
output: "2"
explanation: "Person 1 trusts person 2, but person 2 trusts no one. Since person 2 is trusted by everyone else (just person 1) and trusts nobody, person 2 is the town judge."
- input: "n = 3, trust = [[1,3],[2,3]]"
output: "3"
explanation: "Both person 1 and person 2 trust person 3, while person 3 trusts nobody. Person 3 satisfies both conditions."
- input: "n = 3, trust = [[1,3],[2,3],[3,1]]"
output: "-1"
explanation: "Person 3 is trusted by everyone else, but person 3 also trusts person 1. Since the town judge must trust nobody, there is no valid town judge."
explanation:
intuition: |
Think of the trust relationships as a directed graph where each person is a node, and an edge from `a` to `b` means "person `a` trusts person `b`."
The town judge has a very specific signature in this graph:
- **Zero outgoing edges**: They trust nobody
- **Exactly `n-1` incoming edges**: Everyone else trusts them
Imagine you're counting votes at a town meeting. Each trust relationship is like a vote of confidence. The judge receives votes from everyone but casts no votes themselves. If we track the "net trust score" for each person (votes received minus votes cast), the judge would have a score of exactly `n-1`.
This insight transforms a graph problem into a simple counting problem: instead of building complex data structures, we just need to track a single number for each person.
approach: |
We solve this using a **Trust Score Approach**:
**Step 1: Initialise a trust score array**
- Create an array `trust_score` of size `n+1` (using 1-based indexing to match person labels)
- Each position starts at `0`, representing the net trust balance for that person
&nbsp;
**Step 2: Process each trust relationship**
- For each `[a, b]` pair in the `trust` array:
- Decrement `trust_score[a]` by 1 (person `a` trusts someone, so they lose a point)
- Increment `trust_score[b]` by 1 (person `b` is trusted, so they gain a point)
&nbsp;
**Step 3: Find the town judge**
- Iterate through people `1` to `n`
- The town judge is the person with `trust_score[i] == n - 1`
- This means they received `n-1` trust votes and cast 0 votes themselves
&nbsp;
**Step 4: Return the result**
- If found, return the judge's label
- If no one has a trust score of `n-1`, return `-1`
&nbsp;
This approach works because the trust score naturally captures both conditions: trusting nobody (no deductions) and being trusted by everyone else (exactly `n-1` additions).
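As a worked example, the sketch below computes the net scores for two of the sample inputs. The helper name `net_trust_scores` is hypothetical, and it returns the scores for people `1..n` only:

```python
def net_trust_scores(n: int, trust: list[list[int]]) -> list[int]:
    """Return the net trust score (received minus cast) for people 1..n."""
    scores = [0] * (n + 1)  # index 0 is unused
    for a, b in trust:
        scores[a] -= 1  # a casts a vote
        scores[b] += 1  # b receives a vote
    return scores[1:]


print(net_trust_scores(3, [[1, 3], [2, 3]]))          # [-1, -1, 2]
print(net_trust_scores(3, [[1, 3], [2, 3], [3, 1]]))  # [0, -1, 1]
```

In the first case person 3 reaches `n - 1 = 2` and is the judge. In the second, person 3 also trusts someone, so their score drops to `1` and nobody qualifies.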
common_pitfalls:
- title: Using Two Separate Arrays
description: |
A common approach is to maintain two arrays: one for "trusts count" (outgoing edges) and one for "trusted by count" (incoming edges). Then checking if `trusted_by[i] == n-1` and `trusts[i] == 0`.
While correct, this uses unnecessary space. The single trust score approach combines both conditions into one value, halving space usage and simplifying the logic.
wrong_approach: "Two arrays tracking incoming and outgoing separately"
correct_approach: "Single array with net trust score (incoming - outgoing)"
- title: Forgetting the Single Person Case
description: |
When `n = 1` and `trust = []`, the single person is the town judge by definition. They trust nobody (vacuously true since there's no one to trust) and are trusted by everyone else (vacuously true since there's no one else).
The trust score approach handles this naturally: person 1 has a score of `0`, and we need `n - 1 = 0`, so they qualify as the judge.
wrong_approach: "Special-casing n=1 with extra conditionals"
correct_approach: "Let the algorithm handle it naturally"
- title: Using 0-Based Indexing Incorrectly
description: |
People are labeled from `1` to `n`, not `0` to `n-1`. Using a size-`n` array with 0-based indexing requires translating indices, which is error-prone.
Using a size `n+1` array and ignoring index 0 keeps the code simple and matches the problem's labeling directly.
wrong_approach: "Size-n array with index translation"
correct_approach: "Size-(n+1) array with 1-based indexing"
key_takeaways:
- "**Graph degree insight**: In directed graphs, problems about nodes with specific in-degree and out-degree can often be solved by tracking net degree (in - out)"
- "**Space optimisation**: When tracking two related quantities (trusts vs trusted-by), consider if a single combined metric suffices"
- "**Constraint-driven design**: The judge's unique property (trusted by `n-1`, trusts `0`) translates directly to a net score of `n-1`"
- "**Foundation for graph problems**: This in-degree/out-degree counting technique appears in problems like finding celebrities, detecting cycles, and topological sorting"
time_complexity: "O(n + t) where `t` is the length of the trust array. We initialise an array of size `n + 1`, iterate through all `t` trust relationships, then check `n` people."
space_complexity: "O(n). We use a single array of size `n+1` to store the trust score for each person."
solutions:
- approach_name: Trust Score
is_optimal: true
code: |
def find_judge(n: int, trust: list[list[int]]) -> int:
    # Use n+1 size for 1-based indexing (people labeled 1 to n)
    trust_score = [0] * (n + 1)
    # Process each trust relationship
    for a, b in trust:
        # Person a trusts someone, so they can't be the judge
        trust_score[a] -= 1
        # Person b is trusted, gaining one vote
        trust_score[b] += 1
    # Find the person with trust score of n-1
    # This means: trusted by n-1 people, trusts nobody
    for i in range(1, n + 1):
        if trust_score[i] == n - 1:
            return i
    # No valid town judge found
    return -1
explanation: |
**Time Complexity:** O(n + t) where t is the number of trust relationships. We process each relationship once and scan through n people.
**Space Complexity:** O(n) for the trust score array.
The key insight is that a net trust score of `n-1` uniquely identifies the judge: they received votes from all `n-1` other people (contributing +n-1) and cast no votes themselves (contributing 0).
- approach_name: Two Arrays (In-degree and Out-degree)
is_optimal: false
code: |
def find_judge(n: int, trust: list[list[int]]) -> int:
    # Track how many people each person trusts (out-degree)
    trusts_count = [0] * (n + 1)
    # Track how many people trust each person (in-degree)
    trusted_by_count = [0] * (n + 1)
    for a, b in trust:
        trusts_count[a] += 1
        trusted_by_count[b] += 1
    # Judge trusts nobody (out-degree = 0) and is trusted by all others (in-degree = n-1)
    for i in range(1, n + 1):
        if trusts_count[i] == 0 and trusted_by_count[i] == n - 1:
            return i
    return -1
explanation: |
**Time Complexity:** O(n + t) where t is the number of trust relationships.
**Space Complexity:** O(n) but uses two arrays instead of one.
This approach explicitly tracks in-degree and out-degree separately, making the logic clearer but using twice the space. The optimal solution combines these into a single net score.

@@ -0,0 +1,212 @@
title: First Missing Positive
slug: first-missing-positive
difficulty: hard
leetcode_id: 41
leetcode_url: https://leetcode.com/problems/first-missing-positive/
categories:
- arrays
- hash-tables
patterns:
- matrix-traversal
description: |
Given an unsorted integer array `nums`, return the *smallest positive integer* that is *not present* in `nums`.
You must implement an algorithm that runs in `O(n)` time and uses `O(1)` auxiliary space.
constraints: |
- `1 <= nums.length <= 10^5`
- `-2^31 <= nums[i] <= 2^31 - 1`
examples:
- input: "nums = [1,2,0]"
output: "3"
explanation: "The numbers in the range [1,2] are all in the array."
- input: "nums = [3,4,-1,1]"
output: "2"
explanation: "1 is in the array but 2 is missing."
- input: "nums = [7,8,9,11,12]"
output: "1"
explanation: "The smallest positive integer 1 is missing."
explanation:
intuition: |
At first glance, this problem seems straightforward — just find the smallest positive integer not in the array. But the real challenge lies in the **O(n) time and O(1) space** constraints. These constraints rule out sorting (O(n log n)) and hash sets (O(n) space).
The key insight is to **use the array itself as a hash table**. Think of it like assigning seats in a row: if you have `n` seats numbered 1 through `n`, you want each person with ticket number `i` to sit in seat `i`. After everyone is seated, you walk through the row and find the first empty seat — that's your answer.
Why does this work? The first missing positive must be in the range `[1, n+1]` where `n` is the array length. If all numbers 1 through `n` are present, the answer is `n+1`. Otherwise, some number in `[1, n]` is missing, and we want the smallest one.
By placing each value `x` at index `x-1` (so value `1` goes to index `0`, value `2` goes to index `1`, etc.), we transform the array into a lookup table. Then a single scan reveals the first position where the value doesn't match its expected index.
approach: |
We solve this using **Cyclic Sort** (in-place rearrangement):
**Step 1: Rearrange the array**
- Iterate through each position in the array
- For each element `nums[i]`, if it's a positive integer in the range `[1, n]` and not already in its correct position, swap it to where it belongs
- Continue swapping at the current position until the element there is either out of range or already correct
- This ensures each valid value ends up at index `value - 1`
&nbsp;
**Step 2: Find the first missing positive**
- Scan through the rearranged array
- The first index `i` where `nums[i] != i + 1` indicates that `i + 1` is missing
- Return `i + 1` as the answer
&nbsp;
**Step 3: Handle the all-present case**
- If all positions contain their expected values (1, 2, 3, ..., n), the answer is `n + 1`
&nbsp;
The cyclic sort approach works because we're essentially building a perfect hash function: value `x` maps to index `x - 1`. By rearranging in-place, we use constant extra space while achieving linear time.
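The rearrangement step can be observed in isolation with a small sketch. The helper name `place_values` is hypothetical, and it works on a copy so the result can be compared with the input, which means this illustration itself uses O(n) extra space:

```python
def place_values(nums: list[int]) -> list[int]:
    """Return a copy of nums with each value x in [1, n] moved to index x - 1."""
    arr = list(nums)  # copy for illustration only
    n = len(arr)
    for i in range(n):
        # Keep swapping until arr[i] is out of range or its slot already holds it.
        while 1 <= arr[i] <= n and arr[i] != arr[arr[i] - 1]:
            target = arr[i] - 1
            arr[i], arr[target] = arr[target], arr[i]
    return arr


print(place_values([3, 4, -1, 1]))  # [1, -1, 3, 4]
print(place_values([1, 2, 0]))      # [1, 2, 0]
```

In the first result, index 1 does not hold `2`, so `2` is the first missing positive. In the second, the first mismatch is at index 2, giving `3`.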
common_pitfalls:
- title: Using a Hash Set
description: |
The most natural approach is to use a hash set to store all positive numbers, then iterate from 1 upward to find the first missing:
```python
seen = set(nums)
for i in range(1, len(nums) + 2):
    if i not in seen:
        return i
```
While this is O(n) time, it uses **O(n) space** for the hash set, violating the space constraint. The problem explicitly requires O(1) auxiliary space.
wrong_approach: "Hash set for O(1) lookup"
correct_approach: "Use the array itself as a hash table via cyclic sort"
- title: Sorting the Array
description: |
Another tempting approach is to sort the array first, then scan for the first missing positive:
```python
nums.sort()
# Find first missing...
```
Sorting takes **O(n log n)** time, which violates the O(n) time constraint. Even if you're okay with that, this approach still requires careful handling of duplicates and negatives.
wrong_approach: "Sort first, then scan"
correct_approach: "Cyclic sort achieves O(n) time"
- title: Infinite Loop During Swapping
description: |
When implementing the swap logic, you must check if the target position already contains the correct value:
```python
# Wrong: may infinite loop if duplicates exist
while 1 <= nums[i] <= n:
swap(nums[i], nums[nums[i] - 1])
# Correct: stop if already in place or duplicate
while 1 <= nums[i] <= n and nums[i] != nums[nums[i] - 1]:
swap(...)
```
Without the second condition, swapping identical values creates an infinite loop.
wrong_approach: "Only check range bounds"
correct_approach: "Also check if target position already has the correct value"
- title: Forgetting the n+1 Case
description: |
If the array contains exactly the values `1, 2, 3, ..., n`, then no number in `[1, n]` is missing, so the answer is `n + 1`. Make sure your final scan handles this edge case, typically by returning `n + 1` if every position holds its expected value.
wrong_approach: "Only scan the array without a fallback"
correct_approach: "Return n + 1 if all positions are correct"
key_takeaways:
- "**Cyclic sort pattern**: When values have a natural position (like 1 to n mapping to indices 0 to n-1), consider rearranging the array in-place"
- "**Array as hash table**: The array itself can serve as a constant-space lookup structure when the value range is bounded"
- "**Constraint-driven design**: The O(1) space requirement is the key hint that we must modify the input array rather than use auxiliary data structures"
- "**Related problems**: This technique applies to finding duplicates, missing numbers, and other permutation-based problems"
time_complexity: "O(n). Every swap puts at least one value into its final position, so there are at most n swaps in total, and we make two linear passes through the array."
space_complexity: "O(1). We only use a constant number of variables; all rearrangement happens in-place."
solutions:
- approach_name: Cyclic Sort
is_optimal: true
code: |
def first_missing_positive(nums: list[int]) -> int:
    n = len(nums)
    # Phase 1: Place each value at its correct index
    # Value x should be at index x-1
    for i in range(n):
        # Keep swapping until current element is in place or invalid
        while 1 <= nums[i] <= n and nums[i] != nums[nums[i] - 1]:
            # Swap nums[i] to its correct position
            correct_idx = nums[i] - 1
            nums[i], nums[correct_idx] = nums[correct_idx], nums[i]
    # Phase 2: Find first position where value doesn't match index + 1
    for i in range(n):
        if nums[i] != i + 1:
            return i + 1
    # All values 1 to n are present, so answer is n + 1
    return n + 1
explanation: |
**Time Complexity:** O(n) — Although there's a nested while loop, each element is moved at most once to its final position, giving O(n) total swaps.
**Space Complexity:** O(1) — Only a few variables are used; the array is modified in-place.
The algorithm works in two phases: first, we rearrange the array so that value `i` sits at index `i-1`. Then we scan to find the first mismatch. This clever use of the input array as a hash table satisfies both the time and space constraints.
- approach_name: Hash Set
is_optimal: false
code: |
def first_missing_positive(nums: list[int]) -> int:
    # Store all positive numbers in a set
    num_set = set(nums)
    # Check each positive integer starting from 1
    for i in range(1, len(nums) + 2):
        if i not in num_set:
            return i
    # This line is never reached given the loop bounds
    return len(nums) + 1
explanation: |
**Time Complexity:** O(n) — Building the set and scanning are both linear.
**Space Complexity:** O(n) — The hash set stores up to n elements.
This approach is intuitive and correct, but uses O(n) extra space, violating the problem's constraints. It's included to illustrate the natural solution that the cyclic sort approach improves upon.
- approach_name: Index Marking
is_optimal: true
code: |
def first_missing_positive(nums: list[int]) -> int:
    n = len(nums)
    # Step 1: Replace non-positive and out-of-range values with n+1
    for i in range(n):
        if nums[i] <= 0 or nums[i] > n:
            nums[i] = n + 1
    # Step 2: Mark presence by negating values at corresponding indices
    for i in range(n):
        val = abs(nums[i])
        if val <= n:
            # Mark index val-1 as "seen" by making it negative
            nums[val - 1] = -abs(nums[val - 1])
    # Step 3: Find first positive value (indicates missing number)
    for i in range(n):
        if nums[i] > 0:
            return i + 1
    return n + 1
explanation: |
**Time Complexity:** O(n) — Three linear passes through the array.
**Space Complexity:** O(1) — Only modifies the array in-place.
This alternative approach uses the sign of each element as a flag. After replacing invalid values with `n+1`, we mark the presence of value `x` by negating the element at index `x-1`. Finally, the first positive element indicates the missing number. Both this and cyclic sort are optimal solutions.

@@ -0,0 +1,219 @@
title: 4Sum
slug: four-sum
difficulty: medium
leetcode_id: 18
leetcode_url: https://leetcode.com/problems/4sum/
categories:
- arrays
- two-pointers
- sorting
patterns:
- two-pointers
description: |
Given an array `nums` of `n` integers, return *an array of all the **unique** quadruplets* `[nums[a], nums[b], nums[c], nums[d]]` such that:
- `0 <= a, b, c, d < n`
- `a`, `b`, `c`, and `d` are **distinct**
- `nums[a] + nums[b] + nums[c] + nums[d] == target`
You may return the answer in **any order**.
constraints: |
- `1 <= nums.length <= 200`
- `-10^9 <= nums[i] <= 10^9`
- `-10^9 <= target <= 10^9`
examples:
- input: "nums = [1,0,-1,0,-2,2], target = 0"
output: "[[-2,-1,1,2],[-2,0,0,2],[-1,0,0,1]]"
explanation: "Three unique quadruplets sum to 0: [-2,-1,1,2], [-2,0,0,2], and [-1,0,0,1]."
- input: "nums = [2,2,2,2,2], target = 8"
output: "[[2,2,2,2]]"
explanation: "The only quadruplet that sums to 8 is [2,2,2,2]."
explanation:
intuition: |
If you've solved **3Sum**, 4Sum follows the same reduction strategy: fix one element and solve a smaller problem.
Think of it like peeling an onion. 3Sum reduces to 2Sum by fixing one element. Similarly, **4Sum reduces to 3Sum** by fixing the first element, which then reduces to 2Sum. Each layer peels away one dimension of complexity.
The key insight is that after sorting, we can use **two nested loops** to fix the first two elements, then apply the familiar two-pointer technique to find the remaining pair. This gives us O(n³) time — the best we can do when there can be O(n³) valid quadruplets.
Sorting remains essential for two reasons:
1. **Two pointers work**: Adjusting sum by moving pointers left or right
2. **Duplicate skipping**: Adjacent duplicates become neighbours we can easily skip
approach: |
We solve this using **Sort + Two Nested Loops + Two Pointers**:
**Step 1: Sort the array**
- Sorting enables two-pointer technique and easy duplicate detection
- Time: O(n log n), dominated by the O(n³) main algorithm
&nbsp;
**Step 2: Fix the first element**
- For each `i` from 0 to n-4:
- Skip if `i > 0` and `nums[i] == nums[i-1]` (avoid duplicate quadruplets)
- **Early termination**: If `nums[i] + nums[i+1] + nums[i+2] + nums[i+3] > target`, break (smallest possible sum exceeds target)
- **Skip if too small**: If `nums[i] + nums[n-3] + nums[n-2] + nums[n-1] < target`, continue (largest possible sum is still less than target)
&nbsp;
**Step 3: Fix the second element**
- For each `j` from i+1 to n-3:
- Skip if `nums[j] == nums[j-1]` and `j > i + 1` (avoid duplicates)
- Apply similar early termination and skip optimisations
&nbsp;
**Step 4: Two-pointer search for remaining pair**
- Set `left = j + 1`, `right = n - 1`
- Calculate `total = nums[i] + nums[j] + nums[left] + nums[right]`
- If `total < target`: move `left` right
- If `total > target`: move `right` left
- If `total == target`: found a quadruplet!
- Add to result, skip duplicates, move both pointers
&nbsp;
**Step 5: Return all unique quadruplets**
Duplicate skipping happens at all four levels: outer loop, second loop, left pointer, and right pointer.
common_pitfalls:
- title: Integer Overflow
description: |
With constraints `-10^9 <= nums[i] <= 10^9` and `-10^9 <= target <= 10^9`, the sum of four numbers can reach `4 × 10^9`, which **overflows 32-bit integers**.
In languages like C++ or Java, you must use `long long` or `long` types for the sum calculation. In Python, integers have arbitrary precision, so this isn't an issue — but be aware when porting to other languages.
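The arithmetic can be checked directly in Python. The helper `wrap_int32` is a hypothetical stand-in that mimics two's-complement wraparound, which is how Java's `int` addition behaves; in C++, signed overflow is undefined behaviour, so the wrapped value shown here is not guaranteed there:

```python
INT32_MAX = 2**31 - 1  # 2147483647


def wrap_int32(x: int) -> int:
    """Reduce x to the signed 32-bit range, as two's-complement addition would."""
    return (x + 2**31) % 2**32 - 2**31


worst_case = 4 * 10**9  # four elements at the upper bound of the constraints
print(worst_case > INT32_MAX)  # True
print(wrap_int32(worst_case))  # -294967296
```

A wrapped sum like this would compare incorrectly against `target`, which is why the wider type is needed before adding.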
wrong_approach: "Using int for sum in C++/Java"
correct_approach: "Use long/long long or cast during addition"
- title: Incorrect Duplicate Skipping at Second Level
description: |
When skipping duplicates for the second element `j`, you must check `j > i + 1` before comparing `nums[j] == nums[j-1]`.
Without this check, you might skip the very first valid `j` after `i`, missing valid quadruplets.
Example: `nums = [0,0,0,0]`, `target = 0` — if you skip when `j == i + 1`, you'd incorrectly skip `j = 1` when comparing to `nums[0]`.
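The effect of the guard can be seen with a stripped-down sketch. The function `four_sum_sketch` and its `guard_second_level` flag are hypothetical, and the early-termination checks are omitted to keep it short:

```python
def four_sum_sketch(nums: list[int], target: int,
                    guard_second_level: bool = True) -> list[list[int]]:
    nums = sorted(nums)
    n = len(nums)
    result = []
    for i in range(n - 3):
        if i > 0 and nums[i] == nums[i - 1]:
            continue
        for j in range(i + 1, n - 2):
            is_dup = nums[j] == nums[j - 1]
            # With the guard off, j == i + 1 is skipped too (the bug).
            if is_dup and (j > i + 1 or not guard_second_level):
                continue
            left, right = j + 1, n - 1
            while left < right:
                total = nums[i] + nums[j] + nums[left] + nums[right]
                if total < target:
                    left += 1
                elif total > target:
                    right -= 1
                else:
                    result.append([nums[i], nums[j], nums[left], nums[right]])
                    while left < right and nums[left] == nums[left + 1]:
                        left += 1
                    while left < right and nums[right] == nums[right - 1]:
                        right -= 1
                    left += 1
                    right -= 1
    return result


print(four_sum_sketch([0, 0, 0, 0], 0))                            # [[0, 0, 0, 0]]
print(four_sum_sketch([0, 0, 0, 0], 0, guard_second_level=False))  # []
```

With the guard disabled, `j = 1` is skipped because it equals `nums[0]`, even though `nums[0]` is the element fixed by `i` rather than a previous choice of `j`.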
wrong_approach: "if nums[j] == nums[j-1]: continue (always)"
correct_approach: "if j > i + 1 and nums[j] == nums[j-1]: continue"
- title: Missing Early Termination Optimisations
description: |
Unlike 3Sum where you can break when `nums[i] > 0` (since target is 0), 4Sum has a variable target. The optimisations become:
- **Break** if `nums[i] + nums[i+1] + nums[i+2] + nums[i+3] > target` — smallest sum exceeds target
- **Continue** if `nums[i] + nums[n-3] + nums[n-2] + nums[n-1] < target` — largest sum too small
Without these, you may exceed the time limit (TLE) on edge cases with skewed distributions.
wrong_approach: "No early termination checks"
correct_approach: "Check minimum and maximum possible sums at each level"
key_takeaways:
- "**Generalise N-sum**: Fix k-2 elements with nested loops, then apply two pointers — this pattern works for any kSum"
- "**Time complexity is O(n^(k-1))**: For 4Sum, it's O(n³); for kSum in general, O(n^(k-1)) is optimal when there can be that many solutions"
- "**Early termination matters**: Checking minimum and maximum possible sums can dramatically prune the search space"
- "**Duplicate handling at every level**: Each nested loop needs its own duplicate skip logic with the correct boundary check"
time_complexity: "O(n³). Sorting is O(n log n), then two nested O(n) loops each contain an O(n) two-pointer search."
space_complexity: "O(log n) to O(n). Depends on the sorting algorithm — O(log n) for in-place sorts, O(n) for others. The output is not counted as extra space."
solutions:
- approach_name: Sort + Two Pointers
is_optimal: true
code: |
def four_sum(nums: list[int], target: int) -> list[list[int]]:
    nums.sort()  # Enable two pointers and duplicate detection
    result = []
    n = len(nums)
    for i in range(n - 3):
        # Skip duplicates for first element
        if i > 0 and nums[i] == nums[i - 1]:
            continue
        # Early termination: smallest possible sum exceeds target
        if nums[i] + nums[i + 1] + nums[i + 2] + nums[i + 3] > target:
            break
        # Skip: largest possible sum with nums[i] is still too small
        if nums[i] + nums[n - 3] + nums[n - 2] + nums[n - 1] < target:
            continue
        for j in range(i + 1, n - 2):
            # Skip duplicates for second element (note: j > i + 1)
            if j > i + 1 and nums[j] == nums[j - 1]:
                continue
            # Early termination for inner loop
            if nums[i] + nums[j] + nums[j + 1] + nums[j + 2] > target:
                break
            # Skip if largest sum with nums[i], nums[j] is too small
            if nums[i] + nums[j] + nums[n - 2] + nums[n - 1] < target:
                continue
            # Two pointers for remaining pair
            left, right = j + 1, n - 1
            while left < right:
                total = nums[i] + nums[j] + nums[left] + nums[right]
                if total < target:
                    left += 1
                elif total > target:
                    right -= 1
                else:
                    # Found a quadruplet
                    result.append([nums[i], nums[j], nums[left], nums[right]])
                    # Skip duplicates for left pointer
                    while left < right and nums[left] == nums[left + 1]:
                        left += 1
                    # Skip duplicates for right pointer
                    while left < right and nums[right] == nums[right - 1]:
                        right -= 1
                    # Move both pointers
                    left += 1
                    right -= 1
    return result
explanation: |
**Time Complexity:** O(n³) — O(n log n) sort + two nested O(n) loops with O(n) two-pointer search inside.
**Space Complexity:** O(log n) to O(n) — Sorting space; output not counted.
We sort the array, then use two nested loops to fix the first two elements. For each pair, two pointers find the remaining pair that completes the target sum. Early termination and skip optimisations prune many unnecessary iterations.
- approach_name: Brute Force
is_optimal: false
code: |
def four_sum(nums: list[int], target: int) -> list[list[int]]:
    n = len(nums)
    result = set()  # Use set to avoid duplicates
    # Try all possible quadruplets
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                for l in range(k + 1, n):
                    if nums[i] + nums[j] + nums[k] + nums[l] == target:
                        # Sort tuple to handle duplicates
                        quad = tuple(sorted([nums[i], nums[j], nums[k], nums[l]]))
                        result.add(quad)
    return [list(q) for q in result]
explanation: |
**Time Complexity:** O(n⁴) — Four nested loops checking all combinations.
**Space Complexity:** O(k) — Where k is the number of unique quadruplets stored in the set.
This naive approach checks every possible combination of four elements. While correct, it's too slow for larger inputs. With n=200, this means roughly 65 million quadruplets to check. The optimal solution reduces this to O(n³) by using sorting and two pointers.

@@ -0,0 +1,178 @@
title: Gas Station
slug: gas-station
difficulty: medium
leetcode_id: 134
leetcode_url: https://leetcode.com/problems/gas-station/
categories:
- arrays
patterns:
- greedy
description: |
There are `n` gas stations along a circular route, where the amount of gas at the i<sup>th</sup> station is `gas[i]`.
You have a car with an unlimited gas tank and it costs `cost[i]` of gas to travel from the i<sup>th</sup> station to its next (i + 1)<sup>th</sup> station. You begin the journey with an empty tank at one of the gas stations.
Given two integer arrays `gas` and `cost`, return *the starting gas station's index if you can travel around the circuit once in the clockwise direction, otherwise return* `-1`. If there exists a solution, it is **guaranteed** to be **unique**.
constraints: |
- `n == gas.length == cost.length`
- `1 <= n <= 10^5`
- `0 <= gas[i], cost[i] <= 10^4`
- The input is generated such that the answer is unique
examples:
- input: "gas = [1,2,3,4,5], cost = [3,4,5,1,2]"
output: "3"
explanation: "Start at station 3 (index 3) and fill up with 4 units of gas. Your tank = 0 + 4 = 4. Travel to station 4: tank = 4 - 1 + 5 = 8. Travel to station 0: tank = 8 - 2 + 1 = 7. Travel to station 1: tank = 7 - 3 + 2 = 6. Travel to station 2: tank = 6 - 4 + 3 = 5. Travel to station 3: cost is 5, gas is just enough. Return 3."
- input: "gas = [2,3,4], cost = [3,4,3]"
output: "-1"
explanation: "Starting from any station, you cannot complete the circuit. For example, starting at station 2 with 4 gas, you can reach station 1 but cannot travel back to station 2 (requires 4 gas, you only have 3)."
explanation:
intuition: |
Imagine you're planning a road trip around a circular route with gas stations. At each station, you can fill up some gas, but travelling to the next station costs some gas. The question is: **can you find a starting point where you never run out of fuel?**
The key insight comes from two observations:
**Observation 1: Total gas must be enough.** If the total gas available across all stations is less than the total cost to travel the entire circuit, it's impossible to complete the trip from *any* starting point. Conversely, if total gas >= total cost, a solution is **guaranteed** to exist.
**Observation 2: If you fail at station `j`, skip all previous candidates.** Suppose you start at station `i` and the tank first goes negative on the leg out of station `j`. You might think: "Maybe I should try starting at `i+1`?" But you arrived at every station between `i` and `j` with a non-negative tank, so starting at one of them with an empty tank can only leave you with the same or less fuel by the time you leave `j`. If the head start from `i` was not enough to get past `j`, none of those later starts can get past it either. So **all stations from `i` to `j` are invalid starting points**.
This means when we fail, we can jump our candidate start directly to `j+1`, making this a linear-time algorithm.
approach: |
We solve this using a **Single Pass Greedy Approach**:
**Step 1: Initialise tracking variables**
- `total_tank`: Tracks the cumulative surplus/deficit across all stations (used to check if a solution exists)
- `current_tank`: Tracks the current fuel level from our candidate starting point
- `start_station`: The index of our current candidate starting point, initialised to `0`
&nbsp;
**Step 2: Iterate through each station**
- For each station `i`, calculate `gas[i] - cost[i]` (the net gain/loss at this station)
- Add this value to both `total_tank` and `current_tank`
- If `current_tank` becomes negative, it means we can't reach station `i+1` from our current `start_station`
- When this happens, reset `start_station` to `i+1` and reset `current_tank` to `0`
&nbsp;
**Step 3: Check feasibility and return**
- After the loop, if `total_tank >= 0`, a solution exists and `start_station` is our answer
- If `total_tank < 0`, return `-1` (not enough total gas)
&nbsp;
The greedy choice — skipping all stations between our failed start and the failure point — is valid because those intermediate stations would only give us less fuel to work with.
common_pitfalls:
- title: Trying Every Starting Point
description: |
A brute force approach would try starting at each station and simulate the entire trip:
- For each starting station `i`, simulate travelling all `n` stations
- Check if the tank ever goes negative
This results in **O(n^2) time complexity**. With `n` up to `10^5`, this means up to 10 billion operations — a guaranteed **Time Limit Exceeded (TLE)**.
wrong_approach: "Nested loops simulating from each start"
correct_approach: "Single pass with smart candidate elimination"
- title: Not Understanding Why We Can Skip Stations
description: |
When you fail at station `j` starting from station `i`, it might seem wasteful to skip directly to `j+1`. Why not try `i+1`?
The reason is mathematical: you arrived at each of the stations `i+1`, `i+2`, ..., `j` with non-negative fuel (otherwise you would have failed earlier), but still could not get past `j`. Starting at any of those intermediate stations replaces that non-negative carry-over with an empty tank, so you'd have the same or *less* accumulated fuel when you try to leave `j`.
For example, if stations give net gains of `[+3, -1, -1, -2]` and you fail at index 3 starting from index 0, starting at index 1 means you miss the +3 from station 0, making failure even more certain.
wrong_approach: "Increment start by 1 when failing"
correct_approach: "Jump start to failure_point + 1"
- title: Forgetting to Check Total Feasibility
description: |
Just finding a valid `start_station` candidate isn't enough. You must verify that the **total gas >= total cost** for the entire circuit.
The `total_tank` variable serves this purpose. Even if we find a candidate, if `total_tank < 0` at the end, no solution exists.
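To see why the check matters, here is a minimal sketch that returns both the candidate and the total, assuming a hypothetical helper name `candidate_and_total`. On the second example the loop still produces a candidate, and only the negative total reveals that it is bogus.

```python
def candidate_and_total(gas: list[int], cost: list[int]) -> tuple[int, int]:
    """Return (start_station candidate, total_tank) from the single greedy pass."""
    total_tank = 0
    current_tank = 0
    start_station = 0
    for i in range(len(gas)):
        net = gas[i] - cost[i]
        total_tank += net
        current_tank += net
        if current_tank < 0:
            start_station = i + 1
            current_tank = 0
    return start_station, total_tank


print(candidate_and_total([2, 3, 4], [3, 4, 3]))              # (2, -1)
print(candidate_and_total([1, 2, 3, 4, 5], [3, 4, 5, 1, 2]))  # (3, 0)
```

Returning the candidate `2` for the first input without looking at `total_tank` would be wrong, since the expected answer there is `-1`.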
wrong_approach: "Only tracking current_tank"
correct_approach: "Track both current_tank and total_tank"
key_takeaways:
- "**Greedy elimination**: When a candidate fails, use problem structure to eliminate multiple candidates at once, not just one"
- "**Global vs local tracking**: Use separate variables for local decisions (`current_tank`) and global feasibility (`total_tank`)"
- "**Circular problems**: Often can be solved with a single linear pass by tracking cumulative state"
- "**Proof intuition**: If total resources >= total cost, a valid starting point must exist — this is a key insight for many resource allocation problems"
time_complexity: "O(n). We traverse both arrays exactly once, performing constant-time operations at each station."
space_complexity: "O(1). We only use three integer variables (`total_tank`, `current_tank`, `start_station`) regardless of input size."
solutions:
- approach_name: Single Pass Greedy
is_optimal: true
code: |
def can_complete_circuit(gas: list[int], cost: list[int]) -> int:
    # Track total surplus to check if solution exists
    total_tank = 0
    # Track current surplus from candidate start
    current_tank = 0
    # Our candidate starting station
    start_station = 0
    for i in range(len(gas)):
        # Net gain/loss at this station
        net = gas[i] - cost[i]
        total_tank += net
        current_tank += net
        # If we can't reach the next station from current start
        if current_tank < 0:
            # All stations from start to i are invalid
            # Try starting from the next station
            start_station = i + 1
            current_tank = 0
    # If total gas >= total cost, solution exists at start_station
    # Otherwise, impossible to complete the circuit
    return start_station if total_tank >= 0 else -1
explanation: |
**Time Complexity:** O(n) — Single pass through both arrays.
**Space Complexity:** O(1) — Only three integer variables used.
The key insight is that if we fail to reach station `j` from station `i`, all stations between `i` and `j` are also invalid starting points. Combined with tracking total feasibility, this gives us an elegant linear solution.
- approach_name: Brute Force
is_optimal: false
code: |
def can_complete_circuit(gas: list[int], cost: list[int]) -> int:
    n = len(gas)
    # Try each station as a starting point
    for start in range(n):
        tank = 0
        can_complete = True
        # Simulate travelling around the circuit
        for i in range(n):
            # Current station index (wrapping around)
            station = (start + i) % n
            # Fill up and travel to next station
            tank += gas[station] - cost[station]
            # Ran out of gas before reaching next station
            if tank < 0:
                can_complete = False
                break
        if can_complete:
            return start
    return -1
explanation: |
**Time Complexity:** O(n^2) — For each of n starting points, we simulate travelling n stations.
**Space Complexity:** O(1) — Only tracking tank and loop variables.
This approach is correct but inefficient. It tries every possible starting station and simulates the full circuit. With n up to 10^5, this will cause TLE. Included to illustrate why the greedy optimisation is necessary.

@@ -0,0 +1,195 @@
title: Generate Parentheses
slug: generate-parentheses
difficulty: medium
leetcode_id: 22
leetcode_url: https://leetcode.com/problems/generate-parentheses/
categories:
- strings
- recursion
patterns:
- backtracking
description: |
Given `n` pairs of parentheses, write a function to *generate all combinations of well-formed parentheses*.
A well-formed parentheses string has equal numbers of opening and closing parentheses, with every closing parenthesis matching a preceding opening one.
constraints: |
- `1 <= n <= 8`
examples:
- input: "n = 3"
output: '["((()))","(()())","(())()","()(())","()()()"]'
explanation: "All 5 valid combinations of 3 pairs of parentheses."
- input: "n = 1"
output: '["()"]'
explanation: "With just one pair, there's only one valid combination."
explanation:
intuition: |
Imagine you're building a string character by character, and at each step you can choose to add either an opening `(` or a closing `)` parenthesis.
The key insight is that not every choice is valid. A string of parentheses is **well-formed** if, at every point while reading left-to-right, the number of closing parentheses never exceeds the number of opening ones, and the two counts are equal at the end. Think of it like a balance: each `(` adds +1 to the balance, and each `)` subtracts 1. The balance must never go negative and must finish at zero.
This naturally leads to a **decision tree** approach. At each position, we branch based on what characters we *can* legally add:
- We can add `(` if we haven't used all `n` opening parentheses yet
- We can add `)` if we have more opening parentheses than closing ones (i.e., there's an unmatched `(` to close)
By exploring all valid paths through this decision tree, we generate exactly the set of well-formed parentheses strings — no more, no less.
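The balance idea can be sketched as a small checker. The names `balance_trace` and `is_well_formed` are hypothetical helpers used only for illustration; they are not part of the solutions listed for this problem.

```python
def balance_trace(s: str) -> list[int]:
    """Running balance after each character: +1 for '(', -1 for ')'."""
    trace = []
    balance = 0
    for ch in s:
        balance += 1 if ch == '(' else -1
        trace.append(balance)
    return trace


def is_well_formed(s: str) -> bool:
    """Well-formed means the balance never dips below zero and ends at zero."""
    trace = balance_trace(s)
    return all(b >= 0 for b in trace) and (not trace or trace[-1] == 0)


print(balance_trace("(())()"))  # [1, 2, 1, 0, 1, 0]
print(balance_trace("())("))    # [1, 0, -1, 0]
```

The second trace ends at zero but dips to `-1` on the way, which is why counting alone is not enough.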
approach: |
We solve this using **Backtracking** — systematically building candidates and abandoning paths that can't lead to valid solutions.
**Step 1: Define the recursive state**
- `current`: The string we're building
- `open_count`: Number of `(` parentheses used so far
- `close_count`: Number of `)` parentheses used so far
&nbsp;
**Step 2: Identify base case**
- When `len(current) == 2 * n`, we've placed all parentheses
- Add the completed string to our results list
&nbsp;
**Step 3: Define recursive choices**
- **Add `(`**: Only if `open_count < n` (we haven't used all opening parentheses)
- **Add `)`**: Only if `close_count < open_count` (there's an unmatched `(` to close)
&nbsp;
**Step 4: Backtrack after each choice**
- After exploring a path, the recursion naturally "unwinds"
- Since we pass strings (immutable in Python), backtracking is implicit
- With mutable structures, you'd explicitly remove the last character
&nbsp;
The constraints on when we can add each character ensure we only generate valid combinations, making this more efficient than generating all permutations and filtering.
common_pitfalls:
- title: Generating All Permutations Then Filtering
description: |
A naive approach might generate all possible strings of `(` and `)` characters, then filter for valid ones.
With `n = 8`, that's `2^16 = 65,536` strings to generate and validate, but only `1,430` are valid (the 8th Catalan number). This wastes significant computation on invalid strings.
The backtracking approach only explores valid paths, never generating invalid strings in the first place.
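The gap between generated and kept strings can be measured with a short sketch of the naive approach. The helper names are hypothetical; `catalan` uses the closed form `C(2n, n) / (n + 1)`.

```python
from itertools import product
from math import comb


def is_well_formed(s: str) -> bool:
    balance = 0
    for ch in s:
        balance += 1 if ch == '(' else -1
        if balance < 0:
            return False
    return balance == 0


def count_generate_then_filter(n: int) -> tuple[int, int]:
    """Return (strings generated, strings kept) for generate-then-filter."""
    generated = 0
    kept = 0
    for chars in product('()', repeat=2 * n):
        generated += 1
        if is_well_formed(''.join(chars)):
            kept += 1
    return generated, kept


def catalan(n: int) -> int:
    return comb(2 * n, n) // (n + 1)


generated, kept = count_generate_then_filter(8)
print(generated, kept, catalan(8))  # 65536 1430 1430
```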
wrong_approach: "Generate all 2^(2n) strings, filter valid ones"
correct_approach: "Use constraints during generation to only build valid strings"
- title: Forgetting the Close Constraint
description: |
It's tempting to think you can always add a `)` as long as you haven't used all `n` of them. But consider building with `n = 2`:
Starting with `()`, if you add `)` next you get `())` — this is invalid because the third character closes a parenthesis that was never opened.
The rule is: you can only add `)` when `close_count < open_count`, not just when `close_count < n`.
wrong_approach: "Add ) whenever close_count < n"
correct_approach: "Add ) only when close_count < open_count"
- title: Modifying Strings In-Place Incorrectly
description: |
In languages with mutable strings or when using a list to build the string, forgetting to backtrack (remove the last character after recursion) leads to corrupted results.
In Python, passing `current + '('` creates a new string, so backtracking is automatic. But if using a list like `current.append('(')`, you must call `current.pop()` after the recursive call returns.
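A minimal sketch of the list-based variant, assuming a hypothetical function name `generate_parenthesis_mutable`, shows where the explicit `pop()` calls go:

```python
def generate_parenthesis_mutable(n: int) -> list[str]:
    result = []
    current = []  # shared, mutable buffer reused across all recursive calls

    def backtrack(open_count: int, close_count: int) -> None:
        if len(current) == 2 * n:
            result.append(''.join(current))  # snapshot the buffer as a string
            return
        if open_count < n:
            current.append('(')
            backtrack(open_count + 1, close_count)
            current.pop()  # undo the choice before trying the next one
        if close_count < open_count:
            current.append(')')
            backtrack(open_count, close_count + 1)
            current.pop()  # undo the choice

    backtrack(0, 0)
    return result


print(generate_parenthesis_mutable(3))
# ['((()))', '(()())', '(())()', '()(())', '()()()']
```

Dropping either `pop()` leaves stale characters in `current`, so later branches start from the wrong prefix.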
key_takeaways:
- "**Backtracking pattern**: Build solutions incrementally, using constraints to prune invalid paths early"
- "**Decision tree thinking**: Visualise recursive problems as trees where each node is a choice point"
- "**Catalan numbers**: The count of valid parentheses combinations follows the Catalan sequence — this appears in many combinatorial problems"
- "**Constraint propagation**: Encoding validity rules into the recursion conditions is more efficient than post-hoc filtering"
time_complexity: "O(4^n / √n). The number of valid combinations is the n<sup>th</sup> Catalan number, which grows as 4^n / (n√n), and each valid string takes O(n) to construct."
space_complexity: "O(n). The recursion depth is at most `2n` (the length of each string), and we store the current string being built."
solutions:
- approach_name: Backtracking
is_optimal: true
code: |
def generate_parenthesis(n: int) -> list[str]:
    result = []
    def backtrack(current: str, open_count: int, close_count: int):
        # Base case: we've placed all 2n parentheses
        if len(current) == 2 * n:
            result.append(current)
            return
        # Choice 1: Add opening parenthesis if we haven't used all n
        if open_count < n:
            backtrack(current + '(', open_count + 1, close_count)
        # Choice 2: Add closing parenthesis if it won't make string invalid
        if close_count < open_count:
            backtrack(current + ')', open_count, close_count + 1)
    backtrack('', 0, 0)
    return result
explanation: |
**Time Complexity:** O(4^n / √n) — The number of valid sequences is the n<sup>th</sup> Catalan number.
**Space Complexity:** O(n) — Recursion stack depth plus the current string being built.
We recursively build strings by making valid choices at each step. The constraints (`open_count < n` and `close_count < open_count`) ensure we never explore invalid paths, making this efficient despite the exponential output size.
- approach_name: Iterative with Stack
is_optimal: false
code: |
def generate_parenthesis(n: int) -> list[str]:
    result = []
    # Stack holds tuples of (current_string, open_count, close_count)
    stack = [('', 0, 0)]
    while stack:
        current, open_count, close_count = stack.pop()
        # Base case: complete string
        if len(current) == 2 * n:
            result.append(current)
            continue
        # Add closing parenthesis option first (will be processed second due to LIFO)
        if close_count < open_count:
            stack.append((current + ')', open_count, close_count + 1))
        # Add opening parenthesis option
        if open_count < n:
            stack.append((current + '(', open_count + 1, close_count))
    return result
explanation: |
**Time Complexity:** O(4^n / √n) — Same as recursive, we explore all valid paths.
**Space Complexity:** O(n²) — The stack holds at most O(n) pending states (one untaken branch per level), each a string of up to `2n` characters; the output list is not counted.
This converts the recursion to an explicit stack, which can be useful in languages with limited recursion depth. The logic is identical; we just manage the call stack manually. Pending states live on the heap rather than the call stack, and each one carries its own copy of the partial string.
- approach_name: Dynamic Programming
is_optimal: false
code: |
def generate_parenthesis(n: int) -> list[str]:
    # dp[i] contains all valid strings with i pairs of parentheses
    dp = [[] for _ in range(n + 1)]
    dp[0] = ['']  # Base case: empty string for 0 pairs
    for i in range(1, n + 1):
        # Build strings for i pairs using smaller subproblems
        # Pattern: "(" + dp[j] + ")" + dp[i-1-j] for all valid j
        for j in range(i):
            for left in dp[j]:
                for right in dp[i - 1 - j]:
                    dp[i].append('(' + left + ')' + right)
    return dp[n]
explanation: |
**Time Complexity:** O(4^n / √n) — We generate all Catalan(n) strings.
**Space Complexity:** O(4^n / √n) — We store all valid strings for all values up to n.
This builds solutions bottom-up. For `i` pairs, we consider all ways to split: `j` pairs inside the first `()` and `i-1-j` pairs after it. While correct, this uses more memory than backtracking since it stores all intermediate results.

@@ -0,0 +1,188 @@
title: Greatest Common Divisor of Strings
slug: greatest-common-divisor-of-strings
difficulty: easy
leetcode_id: 1071
leetcode_url: https://leetcode.com/problems/greatest-common-divisor-of-strings/
categories:
- strings
- math
patterns:
- greedy
description: |
For two strings `s` and `t`, we say "`t` divides `s`" if and only if `s = t + t + t + ... + t` (i.e., `t` is concatenated with itself one or more times).
Given two strings `str1` and `str2`, return *the largest string* `x` *such that* `x` *divides both* `str1` *and* `str2`.
constraints: |
- `1 <= str1.length, str2.length <= 1000`
- `str1` and `str2` consist of English uppercase letters.
examples:
- input: 'str1 = "ABCABC", str2 = "ABC"'
output: '"ABC"'
explanation: '"ABC" divides both strings. "ABCABC" = "ABC" + "ABC" and "ABC" = "ABC".'
- input: 'str1 = "ABABAB", str2 = "ABAB"'
output: '"AB"'
explanation: '"AB" divides both strings. "ABABAB" = "AB" + "AB" + "AB" and "ABAB" = "AB" + "AB".'
- input: 'str1 = "LEET", str2 = "CODE"'
output: '""'
explanation: "There is no string that divides both str1 and str2."
explanation:
intuition: |
This problem cleverly connects string manipulation to a fundamental mathematical concept: the **Greatest Common Divisor (GCD)**.
Think of it like this: if a string `x` can "divide" both `str1` and `str2`, then `x` repeated some number of times equals `str1`, and `x` repeated another number of times equals `str2`. This is exactly analogous to how a number `d` divides both `a` and `b` if `a = d * m` and `b = d * n` for some integers `m` and `n`.
The key insight is that **if a common divisor string exists, the length of the GCD string must be the GCD of the two string lengths**. Why? Because if `x` divides both strings, then `len(str1)` must be a multiple of `len(x)` and `len(str2)` must be a multiple of `len(x)`. The *largest* such length is exactly `gcd(len(str1), len(str2))`.
But there's one more critical check: **not all string pairs have a common divisor**. For example, `"LEET"` and `"CODE"` have no common divisor because they're fundamentally incompatible. The elegant way to check compatibility is: if `str1 + str2 == str2 + str1`, then a common divisor exists. If the strings are "made of the same building block," the order of concatenation doesn't matter.
approach: |
We solve this using a **GCD-based Approach**:
**Step 1: Check if a common divisor exists**
- Concatenate `str1 + str2` and `str2 + str1`
- If these are not equal, the strings have no common divisor — return an empty string
- This check works because if both strings are built from the same repeating pattern, the order of concatenation won't matter
&nbsp;
**Step 2: Calculate the GCD of the string lengths**
- Use the Euclidean algorithm to find `gcd(len(str1), len(str2))`
- This gives us the length of the largest possible common divisor string
&nbsp;
**Step 3: Return the GCD string**
- Return the prefix of `str1` (or `str2`) with length equal to the GCD
- Since we've verified compatibility in Step 1, this prefix is guaranteed to divide both strings
&nbsp;
The mathematical foundation makes this solution both elegant and efficient — we avoid brute-force checking of all possible divisor strings.
common_pitfalls:
- title: Brute Force All Prefixes
description: |
A naive approach might try every possible prefix of the shorter string and check if it divides both strings. For each candidate prefix of length `k`, you'd verify if `str1` and `str2` are composed entirely of that prefix.
While correct, this is unnecessarily slow. With strings up to length 1000, and checking each prefix by iterating through both strings, you could do up to O(n^2) work.
The GCD approach reduces this to O(n + m) work for the concatenation check plus O(log(min(n, m))) for the GCD calculation.
wrong_approach: "Try every prefix and check divisibility"
correct_approach: "Use mathematical GCD on lengths after compatibility check"
- title: Forgetting the Compatibility Check
description: |
You might be tempted to just compute `gcd(len(str1), len(str2))` and return that prefix. But this fails for cases like `str1 = "LEET"`, `str2 = "CODE"`.
The GCD of 4 and 4 is 4, but `"LEET"` does not equal `"CODE"` — there's no common divisor string at all! The `str1 + str2 == str2 + str1` check catches this: `"LEETCODE"` ≠ `"CODELEET"`.
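The difference is easy to see side by side in a minimal sketch. Both function names are hypothetical and exist only to contrast the two behaviours.

```python
from math import gcd


def naive_gcd_prefix(str1: str, str2: str) -> str:
    """Pitfall: returns a prefix without checking the strings are compatible."""
    return str1[:gcd(len(str1), len(str2))]


def checked_gcd_prefix(str1: str, str2: str) -> str:
    """Runs the concatenation check before taking the prefix."""
    if str1 + str2 != str2 + str1:
        return ""
    return str1[:gcd(len(str1), len(str2))]


print(repr(naive_gcd_prefix("LEET", "CODE")))      # 'LEET'
print(repr(checked_gcd_prefix("LEET", "CODE")))    # ''
print(repr(checked_gcd_prefix("ABABAB", "ABAB")))  # 'AB'
```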
wrong_approach: "Just return str1[:gcd(len(str1), len(str2))]"
correct_approach: "First verify str1 + str2 == str2 + str1"
- title: Checking Wrong String for Prefix
description: |
After finding the GCD length, some might try to construct the result by repeating characters or using complex logic. Simply take a prefix of either string — since we've verified they're compatible, both strings start with the same pattern.
wrong_approach: "Complex construction of the result string"
correct_approach: "Return str1[:gcd_length] directly"
key_takeaways:
- "**Mathematical insight**: String divisibility mirrors integer divisibility — the GCD concept transfers directly"
- "**Compatibility check first**: The `str1 + str2 == str2 + str1` test elegantly verifies that a common pattern exists"
- "**Euclidean algorithm**: The GCD of two numbers can be computed efficiently in O(log(min(a, b))) time"
- "**Pattern recognition**: Look for mathematical analogies when problems involve repetition or divisibility"
time_complexity: "O(n + m). We perform two string concatenations of total length `n + m`, one equality check of length `n + m`, and a GCD calculation in O(log(min(n, m)))."
space_complexity: "O(n + m). We create two concatenated strings of length `n + m` for the compatibility check."
solutions:
- approach_name: GCD of Lengths
is_optimal: true
code: |
from math import gcd
def gcd_of_strings(str1: str, str2: str) -> str:
    # Check if a common divisor pattern exists
    # If both strings are made of the same repeating unit,
    # concatenation order doesn't matter
    if str1 + str2 != str2 + str1:
        return ""
    # The GCD string length must be the GCD of both lengths
    gcd_length = gcd(len(str1), len(str2))
    # Return the prefix of that length
    return str1[:gcd_length]
explanation: |
**Time Complexity:** O(n + m) — String concatenation and comparison dominate.
**Space Complexity:** O(n + m) — For the concatenated strings.
This elegant solution leverages the mathematical relationship between string divisibility and integer GCD. The concatenation equality check `str1 + str2 == str2 + str1` is a brilliant way to verify that both strings share a common building block pattern.
- approach_name: Iterative GCD Check
is_optimal: false
code: |
def gcd_of_strings(str1: str, str2: str) -> str:
    def gcd(a: int, b: int) -> int:
        # Euclidean algorithm
        while b:
            a, b = b, a % b
        return a
    def divides(s: str, t: str) -> bool:
        # Check if s divides t (t is s repeated)
        if len(t) % len(s) != 0:
            return False
        times = len(t) // len(s)
        return s * times == t
    # Find GCD length
    gcd_len = gcd(len(str1), len(str2))
    candidate = str1[:gcd_len]
    # Verify it actually divides both strings
    if divides(candidate, str1) and divides(candidate, str2):
        return candidate
    return ""
explanation: |
**Time Complexity:** O(n + m) — Checking divisibility for both strings.
**Space Complexity:** O(n + m) — The `divides` helper builds repeated copies as long as `str1` and `str2`; the candidate itself is only `gcd(n, m)` characters.
This approach is more explicit: it computes the candidate GCD string, then verifies it actually divides both inputs. While correct and intuitive, it's slightly less elegant than the concatenation trick which handles the compatibility check in one comparison.
- approach_name: Brute Force
is_optimal: false
code: |
def gcd_of_strings(str1: str, str2: str) -> str:
    # Try all possible prefix lengths, from largest to smallest
    min_len = min(len(str1), len(str2))
    for length in range(min_len, 0, -1):
        # Skip if lengths aren't divisible
        if len(str1) % length != 0 or len(str2) % length != 0:
            continue
        # Get candidate prefix
        candidate = str1[:length]
        # Check if candidate divides both strings
        times1 = len(str1) // length
        times2 = len(str2) // length
        if candidate * times1 == str1 and candidate * times2 == str2:
            return candidate
    return ""
explanation: |
**Time Complexity:** O(min(n, m) * (n + m)) — For each candidate length, we check both strings.
**Space Complexity:** O(n + m) — For the repeated candidate strings.
This brute force approach tries every possible prefix length from largest to smallest. While it works, it's inefficient because it doesn't leverage the mathematical insight that the answer length must be `gcd(n, m)`. Included to illustrate why the GCD approach is superior.

@@ -0,0 +1,156 @@
title: Group Anagrams
slug: group-anagrams
difficulty: medium
leetcode_id: 49
leetcode_url: https://leetcode.com/problems/group-anagrams/
categories:
- strings
- hash-tables
- sorting
patterns:
- hashing
description: |
Given an array of strings `strs`, group the **anagrams** together. You can return the answer in **any order**.
An **anagram** is a word or phrase formed by rearranging the letters of a different word or phrase, using all the original letters exactly once.
constraints: |
- `1 <= strs.length <= 10^4`
- `0 <= strs[i].length <= 100`
- `strs[i]` consists of lowercase English letters
examples:
- input: 'strs = ["eat","tea","tan","ate","nat","bat"]'
output: '[["bat"],["nat","tan"],["ate","eat","tea"]]'
explanation: "Words with the same letters are grouped together."
- input: 'strs = [""]'
output: '[[""]]'
explanation: "Empty string forms its own group."
- input: 'strs = ["a"]'
output: '[["a"]]'
explanation: "Single character forms its own group."
explanation:
intuition: |
What makes two words anagrams? They have exactly the same letters in exactly the same quantities. "eat" and "tea" both have one 'e', one 'a', and one 't'.
Think of it like this: if you sort the letters of any anagram, you get the same result. `sorted("eat") = "aet"` and `sorted("tea") = "aet"`. This sorted form is a **canonical representation** — a fingerprint that's identical for all anagrams.
So the strategy is simple: for each word, compute its fingerprint (sorted letters), and group words with the same fingerprint together. A hash map is perfect for this — the fingerprint is the key, and each key maps to a list of original words.
There's an alternative fingerprint: instead of sorting, count each letter's frequency. `"eat"` becomes a tuple of 26 counts with a `1` at the positions for `a`, `e`, and `t` and `0` everywhere else. Computing it is O(k) instead of O(k log k), which is better for long strings.
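As a minimal sketch of the counting fingerprint, assuming lowercase input as the constraints state (the helper name `count_key` is hypothetical):

```python
def count_key(word: str) -> tuple[int, ...]:
    """26-slot letter-frequency tuple; identical for all anagrams of `word`."""
    counts = [0] * 26
    for ch in word:
        counts[ord(ch) - ord('a')] += 1
    return tuple(counts)


key = count_key("eat")
print(key[0], key[4], key[19])       # 1 1 1  (slots for 'a', 'e', 't')
print(key == count_key("tea"))       # True
print(key == count_key("tan"))       # False
```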
approach: |
We solve this using **Hash Map with Sorted String Keys**:
**Step 1: Create a hash map for grouping**
- Use a `defaultdict(list)` so we can append to non-existent keys
- Keys will be the canonical form (sorted string)
- Values will be lists of original strings
&nbsp;
**Step 2: Process each string**
- For each string `s` in the input:
- Compute the key: `''.join(sorted(s))`
- Append the original string to `groups[key]`
&nbsp;
**Step 3: Return all groups**
- Return `list(groups.values())` — each value is one anagram group
&nbsp;
Why does sorting work? Two strings are anagrams if and only if they contain the same characters. Sorting arranges characters in a canonical order, so anagrams produce identical sorted strings.
common_pitfalls:
- title: Using Unhashable Types as Dictionary Keys
description: |
In Python, `sorted(s)` returns a **list**, which can't be a dictionary key (lists are mutable, hence unhashable).
You must convert to a hashable type:
- `''.join(sorted(s))` → string key
- `tuple(sorted(s))` → tuple key
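A short sketch shows the failure and both fixes. The variable names are illustrative only.

```python
groups = {}

# A list key raises TypeError because lists are unhashable.
try:
    groups[sorted("eat")] = ["eat"]
    list_key_failed = False
except TypeError:
    list_key_failed = True

# Either hashable form works as a key.
groups[''.join(sorted("eat"))] = ["eat"]       # string key 'aet'
groups[tuple(sorted("tan"))] = ["tan"]         # tuple key ('a', 'n', 't')

print(list_key_failed)  # True
print(list(groups))     # ['aet', ('a', 'n', 't')]
```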
wrong_approach: "groups[sorted(s)].append(s)"
correct_approach: "groups[''.join(sorted(s))].append(s)"
- title: Forgetting Empty Strings
description: |
An empty string `""` is a valid input. `sorted("")` returns `[]`, and `''.join([])` returns `""`. The algorithm handles this correctly, but edge case testing is important.
wrong_approach: "Assuming all strings are non-empty"
correct_approach: "Empty strings are handled naturally — they form their own group"
- title: Using Regular Dict Without Default
description: |
With a regular `dict`, you must check if a key exists before appending:
```python
if key not in groups:
    groups[key] = []
groups[key].append(s)
```
Using `defaultdict(list)` eliminates this boilerplate.
wrong_approach: "groups[key].append(s) with regular dict (KeyError)"
correct_approach: "Use defaultdict(list) for automatic list creation"
key_takeaways:
- "**Canonical form for grouping**: Anagrams share a canonical representation (sorted or counted)"
- "**Hash map for grouping**: When grouping by some property, use that property as the key"
- "**Sorting vs counting**: Sorting is O(k log k), counting is O(k) — counting is faster for long strings"
- "**defaultdict simplifies code**: Eliminates key-existence checks when building lists"
time_complexity: "O(n × k log k). We process n strings, and sorting each string of length k takes O(k log k). With the counting approach, this becomes O(n × k)."
space_complexity: "O(n × k). We store all n strings in the hash map. Each string has length up to k."
solutions:
- approach_name: Sorted String Key
is_optimal: true
code: |
from collections import defaultdict
def group_anagrams(strs: list[str]) -> list[list[str]]:
    # Map: sorted string -> list of original strings
    groups = defaultdict(list)
    for s in strs:
        # All anagrams sort to the same string
        key = ''.join(sorted(s))
        groups[key].append(s)
    # Return all groups (order doesn't matter)
    return list(groups.values())
explanation: |
**Time Complexity:** O(n × k log k) — Sorting each of n strings of average length k.
**Space Complexity:** O(n × k) — Storing all strings in the hash map.
Sorting gives each string a canonical form. All anagrams produce the same sorted string, so they end up in the same bucket. Simple, readable, and efficient enough for most cases.
- approach_name: Character Count Key
is_optimal: true
code: |
from collections import defaultdict
def group_anagrams(strs: list[str]) -> list[list[str]]:
    groups = defaultdict(list)
    for s in strs:
        # Count frequency of each letter (a-z)
        count = [0] * 26
        for c in s:
            count[ord(c) - ord('a')] += 1
        # Use tuple of counts as key (tuples are hashable)
        groups[tuple(count)].append(s)
    return list(groups.values())
explanation: |
**Time Complexity:** O(n × k) — Counting is O(k) per string, better than O(k log k) sorting.
**Space Complexity:** O(n × k) — Same as sorted approach.
Instead of sorting, we count the frequency of each letter. Two strings are anagrams if and only if they have identical character counts. The count array is converted to a tuple (hashable) for use as a dictionary key. This is faster for long strings.

@@ -0,0 +1,170 @@
title: Guess Number Higher or Lower
slug: guess-number-higher-or-lower
difficulty: easy
leetcode_id: 374
leetcode_url: https://leetcode.com/problems/guess-number-higher-or-lower/
categories:
- binary-search
patterns:
- binary-search
description: |
We are playing the Guess Game. The game is as follows:
I pick a number from `1` to `n`. You have to guess which number I picked (the number I picked stays the same throughout the game).
Every time you guess wrong, I will tell you whether the number I picked is higher or lower than your guess.
You call a pre-defined API `int guess(int num)`, which returns three possible results:
- `-1`: Your guess is higher than the number I picked (i.e. `num > pick`).
- `1`: Your guess is lower than the number I picked (i.e. `num < pick`).
- `0`: Your guess is equal to the number I picked (i.e. `num == pick`).
Return *the number that I picked*.
constraints: |
- `1 <= n <= 2^31 - 1`
- `1 <= pick <= n`
examples:
- input: "n = 10, pick = 6"
output: "6"
explanation: "Using binary search, we narrow down the range until we find 6."
- input: "n = 1, pick = 1"
output: "1"
explanation: "There's only one number, so the answer is 1."
- input: "n = 2, pick = 1"
output: "1"
explanation: "The first guess is mid = 1, and the API returns 0, so the answer is found immediately."
explanation:
intuition: |
Imagine playing a number guessing game with a friend. They're thinking of a number between 1 and 100, and after each guess, they tell you "higher" or "lower". What's the smartest strategy?
The optimal approach is to **always guess the middle of the remaining range**. If they say "higher", you eliminate all numbers below your guess. If they say "lower", you eliminate all numbers above. Each guess cuts the search space in half.
Think of it like searching for a word in a dictionary. You don't start from page 1 and check every page — you open to the middle, see if your word comes before or after, and repeat. This is the essence of **binary search**.
The key insight is that the API feedback (`-1`, `0`, `1`) directly tells us which half of the range to keep searching. We're guaranteed to find the answer because we systematically narrow down until only one number remains.
approach: |
We solve this using **Binary Search**:
**Step 1: Initialise the search boundaries**
- `low`: Set to `1` (the smallest possible number)
- `high`: Set to `n` (the largest possible number)
&nbsp;
**Step 2: Binary search loop**
- While `low <= high`, calculate the middle point: `mid = low + (high - low) // 2`
- Call `guess(mid)` to get feedback
- If result is `0`: We found the number — return `mid`
- If result is `-1`: Our guess is too high — search in the lower half by setting `high = mid - 1`
- If result is `1`: Our guess is too low — search in the upper half by setting `low = mid + 1`
&nbsp;
**Step 3: Return the result**
- The loop is guaranteed to find the answer since `1 <= pick <= n`
&nbsp;
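Note: We use `mid = low + (high - low) // 2` instead of `(low + high) // 2` to avoid integer overflow in languages with fixed-width integers (such as Java or C++), where `low + high` can exceed the maximum 32-bit value. Python integers are arbitrary-precision, so in Python the safe form is a portable habit rather than a necessity.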
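The steps above can be exercised locally with a stand-in for the judge's API. This is a sketch under stated assumptions: `make_guess` is a hypothetical helper that fakes `guess` for a chosen `pick` and counts calls, and `guess_number` takes the API as a parameter here (the real problem provides it as a global).

```python
def make_guess(pick: int):
    """Build a stand-in for the judge's guess API (hypothetical helper)."""
    calls = 0

    def guess(num: int) -> int:
        nonlocal calls
        calls += 1
        if num > pick:
            return -1  # guess is too high
        if num < pick:
            return 1   # guess is too low
        return 0

    def call_count() -> int:
        return calls

    return guess, call_count


def guess_number(n: int, guess) -> int:
    low, high = 1, n
    while low <= high:
        mid = low + (high - low) // 2
        result = guess(mid)
        if result == 0:
            return mid
        if result == -1:
            high = mid - 1
        else:
            low = mid + 1
    return -1


n = 2**31 - 1
guess, call_count = make_guess(pick=n)
found = guess_number(n, guess)
print(found == n, call_count())
```

Even for the largest allowed `n`, the call counter stays within 31, which matches the logarithmic bound discussed in this section.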
common_pitfalls:
- title: Integer Overflow in Midpoint Calculation
description: |
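A classic bug is calculating the midpoint as `(low + high) // 2`. In languages with fixed-width integers (Java, C++), when `low` and `high` are both large (close to `2^31 - 1`), their sum overflows. Python integers are arbitrary-precision and do not overflow, but the safe form costs nothing and ports directly to other languages.
For example, if `low = 2^30` and `high = 2^31 - 1`, then `low + high` exceeds the 32-bit signed integer limit, causing incorrect behaviour (typically a negative midpoint) or runtime errors.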
Always use `low + (high - low) // 2` to safely compute the midpoint.
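Because Python integers do not overflow, the effect has to be simulated to be seen. The sketch below emulates 32-bit two's-complement wraparound with a hypothetical helper, `wrap_int32`, which is an illustration and not part of any library.

```python
def wrap_int32(x: int) -> int:
    # Emulate two's-complement 32-bit signed wraparound (illustrative helper)
    return (x + 2**31) % 2**32 - 2**31


low, high = 2**30, 2**31 - 1

# What a 32-bit language would store for low + high: the sum wraps negative
wrapped_sum = wrap_int32(low + high)

# The safe form never produces an intermediate value above high
safe_mid = low + (high - low) // 2

print(wrapped_sum, safe_mid)
```

The wrapped sum is negative, so any midpoint derived from it falls outside `[low, high]`, while the safe form stays inside the range.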
wrong_approach: "(low + high) // 2"
correct_approach: "low + (high - low) // 2"
- title: Linear Search
description: |
Guessing numbers one by one from `1` to `n` works but is extremely inefficient. With `n` up to `2^31 - 1` (over 2 billion), a linear approach could require billions of guesses.
Binary search guarantees finding the answer in at most `floor(log2(n)) + 1` guesses, which is 31 guesses for the maximum `n`.
wrong_approach: "Loop from 1 to n, calling guess(i)"
correct_approach: "Binary search halving the range each time"
- title: Off-by-One Errors
description: |
When updating `low` and `high`, you must exclude the middle element since we already checked it:
- When `guess(mid)` returns `-1` (guess too high), set `high = mid - 1`, not `high = mid`
- When `guess(mid)` returns `1` (guess too low), set `low = mid + 1`, not `low = mid`
Using `mid` instead of `mid - 1` or `mid + 1` can cause infinite loops.
key_takeaways:
- "**Binary search foundation**: This problem teaches the core binary search template — divide the search space in half based on a condition"
- "**Overflow prevention**: Always use `low + (high - low) // 2` for midpoint calculation in production code"
- "**Interactive problems**: Problems with API calls follow the same patterns — the API response guides which half to search"
- "**Logarithmic efficiency**: Binary search reduces `O(n)` to `O(log n)`, essential for large input ranges"
time_complexity: "O(log n). Each guess eliminates half of the remaining candidates, so we need at most log2(n) guesses."
space_complexity: "O(1). We only use three variables (`low`, `high`, `mid`) regardless of the input size."
solutions:
- approach_name: Binary Search
is_optimal: true
code: |
# The guess API is already defined for you.
# @param num: your guess
# @return -1 if num is higher than the picked number
#          1 if num is lower than the picked number
#          otherwise return 0
# def guess(num: int) -> int:
def guess_number(n: int) -> int:
    low, high = 1, n
    while low <= high:
        # Safe midpoint calculation to avoid overflow
        mid = low + (high - low) // 2
        result = guess(mid)
        if result == 0:
            # Found the number
            return mid
        elif result == -1:
            # Guess is too high, search lower half
            high = mid - 1
        else:
            # Guess is too low, search upper half
            low = mid + 1
    # Should never reach here given problem constraints
    return -1
explanation: |
**Time Complexity:** O(log n) — Each iteration halves the search space.
**Space Complexity:** O(1) — Only three integer variables used.
This is the classic binary search template. We maintain a range `[low, high]` and repeatedly guess the middle value. Based on the API response, we eliminate half the range until we find the target.
- approach_name: Linear Search
is_optimal: false
code: |
def guess_number(n: int) -> int:
    # Check each number from 1 to n
    for i in range(1, n + 1):
        if guess(i) == 0:
            return i
    return -1
explanation: |
**Time Complexity:** O(n) — In the worst case, we check every number.
**Space Complexity:** O(1) — Only loop variable used.
This brute force approach checks numbers sequentially. While correct, it's far too slow for large `n` (up to 2 billion). With `n = 2^31 - 1`, this could require over 2 billion API calls, causing Time Limit Exceeded. Included to illustrate why binary search is necessary.

@@ -0,0 +1,191 @@
title: Hand of Straights
slug: hand-of-straights
difficulty: medium
leetcode_id: 846
leetcode_url: https://leetcode.com/problems/hand-of-straights/
categories:
- arrays
- hash-tables
- sorting
patterns:
- greedy
- heap
description: |
Alice has some number of cards and she wants to rearrange the cards into groups so that each group is of size `groupSize`, and consists of `groupSize` consecutive cards.
Given an integer array `hand` where `hand[i]` is the value written on the i<sup>th</sup> card and an integer `groupSize`, return `true` if she can rearrange the cards, or `false` otherwise.
constraints: |
- `1 <= hand.length <= 10^4`
- `0 <= hand[i] <= 10^9`
- `1 <= groupSize <= hand.length`
examples:
- input: "hand = [1,2,3,6,2,3,4,7,8], groupSize = 3"
output: "true"
explanation: "Alice's hand can be rearranged as [1,2,3], [2,3,4], [6,7,8]."
- input: "hand = [1,2,3,4,5], groupSize = 4"
output: "false"
explanation: "Alice's hand cannot be rearranged into groups of 4 because 5 is not divisible by 4."
explanation:
intuition: |
Imagine you're organising a deck of cards into runs of consecutive numbers, like arranging cards in a hand of rummy.
The key insight is that **the smallest card in any valid arrangement must start a group**. Why? Because no smaller card exists to precede it in a consecutive sequence. So if the smallest card is `3`, you must form a group starting at `3` (i.e., `3, 4, 5` for `groupSize = 3`).
Think of it like this: you're forced to "use up" the smallest remaining card first. Once you commit to starting a group with that card, you must find the next `groupSize - 1` consecutive cards to complete the group. If any of those cards are missing, the arrangement is impossible.
This greedy approach works because:
- Every card must belong to exactly one group
- The smallest card has no choice — it must start a group
- By always processing the smallest unused card, we systematically build all possible groups
approach: |
We solve this using a **Greedy Approach with Hash Map Counting**:
**Step 1: Check divisibility**
- If `len(hand)` is not divisible by `groupSize`, return `false` immediately
- We need exactly `n / groupSize` complete groups
&nbsp;
**Step 2: Count card frequencies**
- Use a hash map to count occurrences of each card value
- This allows O(1) lookups and decrements
&nbsp;
**Step 3: Sort unique card values**
- Sort the unique card values (or use a min-heap)
- This ensures we always process the smallest available card first
&nbsp;
**Step 4: Greedily form groups**
- For each smallest card value with count > 0:
- Attempt to form a group starting at this value
- For each of the `groupSize` consecutive values starting at this value:
- If the count is 0 (card unavailable), return `false`
- Decrement the count of each card used
- If all groups formed successfully, return `true`
&nbsp;
The greedy choice of always starting from the smallest available card guarantees correctness because that card has no other valid placement.
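To make the greedy steps concrete, the sketch below returns the groups it builds instead of a boolean. The function name `form_groups` and its `None`-on-failure convention are assumptions made for illustration; it forms one group at a time rather than in batches.

```python
from collections import Counter
from typing import Optional


def form_groups(hand: list[int], group_size: int) -> Optional[list[list[int]]]:
    """Return the groups the greedy builds, or None if impossible (illustrative)."""
    if len(hand) % group_size != 0:
        return None
    counts = Counter(hand)
    groups = []
    for card in sorted(counts):
        # Each remaining copy of the smallest card must start its own group
        while counts[card] > 0:
            group = list(range(card, card + group_size))
            for value in group:
                if counts[value] == 0:
                    return None  # a required consecutive card is missing
                counts[value] -= 1
            groups.append(group)
    return groups


print(form_groups([1, 2, 3, 6, 2, 3, 4, 7, 8], 3))
print(form_groups([1, 2, 3, 4, 5], 4))
```

Reading a missing key from a `Counter` yields `0` without inserting it, which is why the availability check needs no explicit membership test.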
common_pitfalls:
- title: Forgetting the Divisibility Check
description: |
Before any complex logic, check if `len(hand) % groupSize == 0`.
For example, with `hand = [1,2,3,4,5]` and `groupSize = 4`, it's impossible to form complete groups of 4 from 5 cards. This quick check avoids unnecessary computation.
wrong_approach: "Skipping the divisibility check and processing all cards"
correct_approach: "Return false immediately if total cards aren't divisible by groupSize"
- title: Not Processing Cards in Sorted Order
description: |
If you try to form groups starting from arbitrary cards, you might use up cards needed for smaller sequences.
For example, with `hand = [1,2,3,4,5,6]` and `groupSize = 3`, starting a group at `2` gives `[2,3,4]` and leaves `1`, `5`, and `6`, which cannot form a consecutive group, even though `[1,2,3]` and `[4,5,6]` is a valid arrangement. A group started from an arbitrary card can strand smaller cards that then have no group to join.
By always starting groups from the **smallest available card**, you ensure deterministic and correct grouping.
wrong_approach: "Processing cards in arbitrary order"
correct_approach: "Sort cards and always start groups from the smallest value"
- title: Using Cards Multiple Times
description: |
Each card can only belong to one group. When forming a group, you must decrement the count for each card used.
A common bug is checking if a card exists but forgetting to reduce its count, leading to the same card being "used" in multiple groups.
wrong_approach: "Checking card existence without decrementing counts"
correct_approach: "Decrement count immediately after using each card"
key_takeaways:
- "**Greedy with constraints**: When elements have no flexibility in placement (smallest must start a group), greedy works"
- "**Hash map for frequency tracking**: Counting occurrences enables efficient lookups and updates in O(1)"
- "**Sort to establish processing order**: Sorting unique values ensures we always handle the most constrained element first"
- "**Early termination**: Simple checks like divisibility can save significant computation"
time_complexity: "O(n log n). Sorting the unique card values dominates. The grouping phase visits each card at most once, contributing O(n)."
space_complexity: "O(n). The hash map stores counts for up to `n` unique card values."
solutions:
- approach_name: Greedy with Hash Map
is_optimal: true
code: |
from collections import Counter
def is_n_straight_hand(hand: list[int], group_size: int) -> bool:
    # Quick check: total cards must be divisible by group size
    if len(hand) % group_size != 0:
        return False
    # Count frequency of each card value
    card_count = Counter(hand)
    # Process cards in sorted order (smallest first)
    for card in sorted(card_count):
        # If this card has remaining copies, it must start a group
        count = card_count[card]
        if count > 0:
            # Try to form 'count' groups starting at this card
            for i in range(group_size):
                # Need 'count' copies of each consecutive card
                if card_count[card + i] < count:
                    return False  # Not enough cards to complete groups
                card_count[card + i] -= count
    return True
explanation: |
**Time Complexity:** O(n log n) — Sorting unique values takes O(k log k) where k ≤ n, and we process each card once.
**Space Complexity:** O(n) — Hash map stores up to n entries.
We count card frequencies, then iterate through sorted values. When a card has remaining copies, we greedily form as many groups as possible starting from that card. If any consecutive card is missing, we return false.
- approach_name: Min-Heap Approach
is_optimal: false
code: |
from collections import Counter
import heapq
def is_n_straight_hand(hand: list[int], group_size: int) -> bool:
    if len(hand) % group_size != 0:
        return False
    card_count = Counter(hand)
    # Min-heap of unique card values
    min_heap = list(card_count.keys())
    heapq.heapify(min_heap)
    while min_heap:
        # Get smallest card (must start a group)
        smallest = min_heap[0]
        # Form one group starting at smallest
        for i in range(group_size):
            card = smallest + i
            if card_count[card] == 0:
                return False  # Card unavailable
            card_count[card] -= 1
            # Remove from heap if exhausted
            if card_count[card] == 0:
                # Only remove if it's the heap minimum
                if card != min_heap[0]:
                    return False  # Gap in sequence
                heapq.heappop(min_heap)
    return True
explanation: |
**Time Complexity:** O(n log n) — Heap operations for each card removal.
**Space Complexity:** O(n) — Hash map and heap storage.
This approach uses a min-heap to always access the smallest card. We form one group at a time, removing cards from the heap when exhausted. The constraint that we can only pop the heap minimum ensures consecutive sequences are valid. This is slightly less efficient than the hash map approach but demonstrates an alternative technique.

@@ -0,0 +1,195 @@
title: Happy Number
slug: happy-number
difficulty: easy
leetcode_id: 202
leetcode_url: https://leetcode.com/problems/happy-number/
categories:
- math
- hash-tables
patterns:
- fast-slow-pointers
function_signature: "def is_happy(n: int) -> bool:"
test_cases:
visible:
- input: { n: 19 }
expected: true
- input: { n: 2 }
expected: false
- input: { n: 1 }
expected: true
hidden:
- input: { n: 7 }
expected: true
- input: { n: 4 }
expected: false
- input: { n: 100 }
expected: true
description: |
Write an algorithm to determine if a number `n` is happy.
A **happy number** is a number defined by the following process:
- Starting with any positive integer, replace the number by the sum of the squares of its digits.
- Repeat the process until the number equals `1` (where it will stay), or it **loops endlessly in a cycle** which does not include `1`.
- Those numbers for which this process **ends in 1** are happy.
Return `true` if `n` is a happy number, and `false` if not.
constraints: |
- `1 <= n <= 2^31 - 1`
examples:
- input: "n = 19"
output: "true"
explanation: "1^2 + 9^2 = 82 -> 8^2 + 2^2 = 68 -> 6^2 + 8^2 = 100 -> 1^2 + 0^2 + 0^2 = 1"
- input: "n = 2"
output: "false"
explanation: "The sequence 2 -> 4 -> 16 -> 37 -> 58 -> 89 -> 145 -> 42 -> 20 -> 4 enters a cycle that never reaches 1."
explanation:
intuition: |
Think of this problem as following a path through a maze of numbers. Starting from `n`, you compute the sum of squared digits to get the next number, then repeat. The key insight is that this sequence must eventually do one of two things: either reach `1` (happy!) or enter a cycle (unhappy).
Why must it cycle? Because the sum of squared digits for any number has an upper bound. For a number with `d` digits, the maximum sum is `d * 81` (when all digits are `9`). For the largest input (`2^31 - 1`, which has 10 digits), the sum is therefore at most `10 * 81 = 810`. So after at most one step, you're working with numbers in a bounded range, and a sequence confined to finitely many values that never reaches `1` must eventually repeat.
This cycle-detection insight opens up two elegant solutions:
1. **Hash Set**: Track every number you've seen. If you see a repeat before reaching `1`, there's a cycle.
2. **Floyd's Cycle Detection (Fast-Slow Pointers)**: Use two "runners" through the sequence at different speeds. If there's a cycle, the fast runner will eventually lap the slow runner.
The fast-slow pointer approach is particularly elegant because it uses O(1) space instead of O(n) for storing visited numbers.
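The bound and the two possible outcomes can be checked directly. In this sketch, `trace` is a hypothetical helper that records the sequence until it reaches `1` or revisits a value; it is for illustration only and is not one of the solutions.

```python
def get_next(num: int) -> int:
    # Sum of the squares of the decimal digits
    total = 0
    while num > 0:
        num, digit = divmod(num, 10)
        total += digit * digit
    return total


def trace(n: int, limit: int = 20) -> list[int]:
    """Follow the sequence until 1, a repeat, or `limit` steps (illustrative)."""
    seq = [n]
    while seq[-1] != 1 and len(seq) <= limit:
        nxt = get_next(seq[-1])
        seq.append(nxt)
        if nxt in seq[:-1]:
            break  # the cycle has closed
    return seq


# Any value below 1000 maps to at most 3 * 81 = 243, so the sequence stays bounded
assert max(get_next(m) for m in range(1, 1000)) == 243

print(trace(19))
print(trace(2))
```

The first trace ends at `1`; the second ends on a value it has already visited, which is the cycle that both the hash set and Floyd's algorithm detect.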
approach: |
We solve this using **Floyd's Cycle Detection** (also known as the tortoise and hare algorithm):
**Step 1: Define a helper function**
- `get_next(n)`: Computes the sum of squares of digits
- Extract each digit using modulo and integer division
- Square each digit and accumulate the sum
&nbsp;
**Step 2: Initialise two pointers**
- `slow`: Starts at `n`, moves one step at a time
- `fast`: Starts at `get_next(n)`, moves two steps at a time
&nbsp;
**Step 3: Run the cycle detection loop**
- While `fast != 1` and `fast != slow`:
- Move `slow` one step: `slow = get_next(slow)`
- Move `fast` two steps: `fast = get_next(get_next(fast))`
- If they meet before reaching `1`, there's a cycle (unhappy)
- If `fast` reaches `1`, the number is happy
&nbsp;
**Step 4: Return the result**
- Return `fast == 1`
- If `fast` is `1`, we found happiness; otherwise we detected a cycle
common_pitfalls:
- title: Infinite Loop Without Cycle Detection
description: |
A naive approach might just keep computing the next number forever:
```python
while n != 1:
    n = sum_of_squares(n)
return True
```
This will never terminate for unhappy numbers like `2`, which cycle endlessly through `2 -> 4 -> 16 -> 37 -> 58 -> 89 -> 145 -> 42 -> 20 -> 4 -> ...`
You **must** detect cycles, either with a hash set or Floyd's algorithm.
wrong_approach: "Loop until n equals 1"
correct_approach: "Track visited numbers or use Floyd's cycle detection"
- title: Forgetting Edge Cases
description: |
The number `1` is already happy (sum of squares of `1` is `1`). Single-digit numbers like `7` are also happy (`7 -> 49 -> 97 -> 130 -> 10 -> 1`).
Make sure your initial setup handles these correctly. With Floyd's algorithm, initialising `slow = n` and `fast = get_next(n)` naturally handles `n = 1` because `fast` immediately becomes `1`.
- title: Integer Overflow in get_next
description: |
Overflow is not a concern in `get_next`: the sum of squared digits is bounded (at most 810 for a 10-digit input), so every intermediate value fits comfortably in a 32-bit integer.
The real choice is how to extract digits. String conversion works but is slower; the arithmetic approach using `n % 10` and `n // 10` avoids building a string on every step.
key_takeaways:
- "**Cycle detection pattern**: Floyd's algorithm (fast-slow pointers) is useful whenever you need to detect cycles in a sequence with O(1) space"
- "**Bounded sequences**: Recognising that the sequence values are bounded (max ~810) proves that cycles must occur for non-happy numbers"
- "**Math vs Hash Table tradeoff**: The hash set approach is simpler to understand but uses O(k) space where k is the number of distinct values visited before a repeat; Floyd's uses O(1)"
- "**Related problems**: This pattern applies to Linked List Cycle, Find the Duplicate Number, and other sequence-based cycle problems"
time_complexity: "O(log n). The number of digits in n is O(log n), and we process each number in the sequence. The sequence length is bounded by a constant for any starting value."
space_complexity: "O(1) for Floyd's algorithm, or O(log n) for the hash set approach (storing visited numbers)."
solutions:
- approach_name: Floyd's Cycle Detection
is_optimal: true
code: |
def is_happy(n: int) -> bool:
    def get_next(num: int) -> int:
        """Calculate sum of squares of digits."""
        total = 0
        while num > 0:
            digit = num % 10  # Extract last digit
            total += digit * digit  # Add its square
            num //= 10  # Remove last digit
        return total
    # Floyd's algorithm: slow moves 1 step, fast moves 2 steps
    slow = n
    fast = get_next(n)
    # Continue until fast reaches 1 or they meet (cycle detected)
    while fast != 1 and slow != fast:
        slow = get_next(slow)  # One step
        fast = get_next(get_next(fast))  # Two steps
    # Happy if we reached 1, unhappy if cycle detected
    return fast == 1
explanation: |
**Time Complexity:** O(log n) — Each number has O(log n) digits to process, and the sequence is bounded.
**Space Complexity:** O(1) — Only uses two pointer variables regardless of input size.
Floyd's cycle detection elegantly solves the problem: if a cycle exists, the fast pointer will eventually catch up to the slow pointer. If no cycle exists (happy number), fast reaches 1 first.
- approach_name: Hash Set
is_optimal: false
code: |
def is_happy(n: int) -> bool:
    def get_next(num: int) -> int:
        """Calculate sum of squares of digits."""
        total = 0
        while num > 0:
            digit = num % 10
            total += digit * digit
            num //= 10
        return total
    # Track all numbers we've seen
    seen = set()
    while n != 1 and n not in seen:
        seen.add(n)  # Mark current number as visited
        n = get_next(n)  # Move to next in sequence
    # Happy if we reached 1, unhappy if we saw a repeat
    return n == 1
explanation: |
**Time Complexity:** O(log n) — Same as Floyd's approach.
**Space Complexity:** O(log n) — Stores visited numbers in the set.
This approach is more intuitive: just remember what you've seen. If you see a number twice before reaching 1, you're in a cycle. The tradeoff is using extra memory for the set.

@@ -0,0 +1,213 @@
title: House Robber II
slug: house-robber-ii
difficulty: medium
leetcode_id: 213
leetcode_url: https://leetcode.com/problems/house-robber-ii/
categories:
- arrays
- dynamic-programming
patterns:
- dynamic-programming
description: |
You are a professional robber planning to rob houses along a street. Each house has a certain amount of money stashed. All houses at this place are **arranged in a circle**. That means the first house is the neighbour of the last one. Meanwhile, adjacent houses have a security system connected, and **it will automatically contact the police if two adjacent houses were broken into on the same night**.
Given an integer array `nums` representing the amount of money of each house, return *the maximum amount of money you can rob tonight without alerting the police*.
constraints: |
- `1 <= nums.length <= 100`
- `0 <= nums[i] <= 1000`
examples:
- input: "nums = [2,3,2]"
output: "3"
explanation: "You cannot rob house 1 (money = 2) and then rob house 3 (money = 2), because they are adjacent houses."
- input: "nums = [1,2,3,1]"
output: "4"
explanation: "Rob house 1 (money = 1) and then rob house 3 (money = 3). Total amount you can rob = 1 + 3 = 4."
- input: "nums = [1,2,3]"
output: "3"
explanation: "Rob house 3 (money = 3). With three houses in a circle, every pair is adjacent, so only one house can be robbed."
explanation:
intuition: |
This problem is a clever extension of the classic House Robber problem. The twist? The houses are arranged in a **circle**, meaning the first and last houses are neighbours.
Think of it like this: imagine the houses arranged around a cul-de-sac instead of a straight street. If you rob the first house, you can't rob the last one (they share a fence). Conversely, if you rob the last house, you can't rob the first.
The key insight is that **you can never rob both the first and last house** — they're mutually exclusive. This transforms the circular problem into two linear problems:
- **Scenario A**: Rob from houses `0` to `n-2` (exclude the last house)
- **Scenario B**: Rob from houses `1` to `n-1` (exclude the first house)
The answer is simply the maximum of these two scenarios. Each scenario is just the original House Robber problem, which we solve with dynamic programming!
approach: |
We solve this by **reducing the circular problem to two linear problems**:
**Step 1: Handle the edge case**
- If there's only one house, return `nums[0]` — no circular constraint applies
&nbsp;
**Step 2: Define a helper function for linear House Robber**
- This function solves the original problem on a subarray
- Use two variables (`prev1`, `prev2`) to track the maximum money achievable
- Recurrence: `current = max(nums[i] + prev2, prev1)`
&nbsp;
**Step 3: Run the helper on two scenarios**
- `rob_linear(nums[0:n-1])`: Exclude the last house (can rob the first)
- `rob_linear(nums[1:n])`: Exclude the first house (can rob the last)
- These two scenarios cover every valid plan: a plan that robs both the first and the last house is invalid, and neither scenario permits it
&nbsp;
**Step 4: Return the maximum**
- `max(scenario_a, scenario_b)` gives the optimal answer
- One of these scenarios will contain the true optimal solution
common_pitfalls:
- title: Treating It Like a Linear Array
description: |
A common mistake is to directly apply the House Robber I solution without considering the circular constraint.
For `nums = [2, 3, 2]`:
- Linear approach might yield `2 + 2 = 4` (houses 0 and 2)
- But houses 0 and 2 are adjacent in a circle!
- Correct answer is `3` (just house 1)
Always remember: in a circle, index `0` and index `n-1` are neighbours.
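The difference is easy to reproduce. The following sketch runs the linear recurrence on the whole array and then on the two circular scenarios; `rob_circular` is an illustrative name for the reduction described in this section.

```python
def rob_linear(houses: list[int]) -> int:
    # Classic House Robber on a straight street
    prev2 = prev1 = 0
    for money in houses:
        # Right-hand side uses the old values of prev2 and prev1
        prev2, prev1 = prev1, max(money + prev2, prev1)
    return prev1


def rob_circular(nums: list[int]) -> int:
    if len(nums) == 1:
        return nums[0]
    # Exclude the last house, or exclude the first house
    return max(rob_linear(nums[:-1]), rob_linear(nums[1:]))


nums = [2, 3, 2]
print(rob_linear(nums), rob_circular(nums))
```

On `[2, 3, 2]` the linear version reports 4 by taking both end houses, while the circular version correctly reports 3.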
wrong_approach: "Apply House Robber I directly"
correct_approach: "Split into two linear subproblems excluding first or last house"
- title: Forgetting the Single House Case
description: |
When `nums.length == 1`, both scenarios (`nums[0:0]` and `nums[1:1]`) would be empty arrays, returning 0.
But with one house, there are no neighbours — you can simply rob it! Always handle this edge case explicitly by returning `nums[0]` when `n == 1`.
wrong_approach: "Let the helper function handle all cases"
correct_approach: "Check n == 1 before splitting into scenarios"
- title: Off-by-One in Array Slicing
description: |
When excluding the last house, use `nums[0:n-1]` (indices 0 to n-2 inclusive).
When excluding the first house, use `nums[1:n]` (indices 1 to n-1 inclusive).
Python's slice notation is `[start:end)` — end is exclusive. A common error is:
- `nums[0:n-2]` — misses one house
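- `nums[1:n+1]`: the end index is past the list; Python silently clamps the slice to `nums[1:n]`, but the same bound in an index-based loop raises `IndexError`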
Double-check your slice boundaries match the scenarios described.
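A quick check of the slice boundaries on a small list, as a sketch; the variable names are illustrative.

```python
nums = [1, 2, 3, 1]
n = len(nums)

without_last = nums[0:n - 1]   # indices 0..n-2, the "exclude last" scenario
without_first = nums[1:n]      # indices 1..n-1, the "exclude first" scenario
too_short = nums[0:n - 2]      # drops one house too many
clamped = nums[1:n + 1]        # end past the list is clamped; no error is raised

print(without_last, without_first, too_short, clamped)
```

Printing the slices before wiring them into the DP makes a boundary mistake visible immediately, since a wrong slice still runs without raising an error.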
key_takeaways:
- "**Problem reduction**: Convert a harder problem (circular) into simpler subproblems (linear)"
- "**Mutual exclusion insight**: When constraints create mutually exclusive choices, solve each case separately"
- "**Reuse existing solutions**: House Robber II builds directly on House Robber I — recognise when you can leverage solved subproblems"
- "**Pattern for circular arrays**: Many circular array problems can be solved by breaking the cycle and running linear algorithms twice"
time_complexity: "O(n). We run the linear House Robber algorithm twice, each taking O(n) time, giving O(2n) = O(n)."
space_complexity: "O(1). The space-optimised linear algorithm uses only two variables, and we run it twice sequentially."
solutions:
- approach_name: Two-Pass Dynamic Programming
is_optimal: true
code: |
def rob(nums: list[int]) -> int:
    # Edge case: single house has no circular constraint
    if len(nums) == 1:
        return nums[0]
    def rob_linear(houses: list[int]) -> int:
        """Solve the linear House Robber problem."""
        prev2 = 0  # Max money from two houses back
        prev1 = 0  # Max money from previous house
        for money in houses:
            # Rob this house + prev2, or skip and keep prev1
            current = max(money + prev2, prev1)
            prev2 = prev1
            prev1 = current
        return prev1
    n = len(nums)
    # Scenario A: exclude last house (can rob first)
    # Scenario B: exclude first house (can rob last)
    return max(rob_linear(nums[:n-1]), rob_linear(nums[1:]))
explanation: |
**Time Complexity:** O(n) — Two linear passes through subarrays of size n-1.
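**Space Complexity:** O(1) for the DP state (two variables per pass). Note that the slices `nums[:n-1]` and `nums[1:]` each copy O(n) elements in Python, so auxiliary space is O(n) unless index ranges are used instead.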
By excluding either the first or last house, we break the circular constraint and can apply the standard House Robber DP approach. The maximum of both scenarios gives us the optimal answer because any valid solution must exclude at least one of the endpoints.
- approach_name: Two-Pass with Explicit Ranges
is_optimal: false
code: |
def rob(nums: list[int]) -> int:
    n = len(nums)
    # Edge case: single house
    if n == 1:
        return nums[0]
    def rob_range(start: int, end: int) -> int:
        """Rob houses from index start to end (inclusive)."""
        prev2 = 0
        prev1 = 0
        for i in range(start, end + 1):
            current = max(nums[i] + prev2, prev1)
            prev2 = prev1
            prev1 = current
        return prev1
    # Exclude last house OR exclude first house
    return max(rob_range(0, n - 2), rob_range(1, n - 1))
explanation: |
**Time Complexity:** O(n) — Two passes through subarrays.
**Space Complexity:** O(1) — Constant extra space.
This version uses explicit index ranges instead of array slicing. It avoids creating subarray copies (each Python slice copies O(n) elements), so it is the variant that truly uses O(1) extra space. The logic is identical: solve two linear subproblems and take the maximum.
- approach_name: DP with Array (Educational)
is_optimal: false
code: |
def rob(nums: list[int]) -> int:
    n = len(nums)
    if n == 1:
        return nums[0]
    if n == 2:
        return max(nums[0], nums[1])
    def rob_linear(houses: list[int]) -> int:
        """Standard House Robber with DP array."""
        m = len(houses)
        if m == 1:
            return houses[0]
        dp = [0] * m
        dp[0] = houses[0]
        dp[1] = max(houses[0], houses[1])
        for i in range(2, m):
            dp[i] = max(houses[i] + dp[i - 2], dp[i - 1])
        return dp[m - 1]
    # Two scenarios: exclude last or exclude first
    return max(rob_linear(nums[:-1]), rob_linear(nums[1:]))
explanation: |
**Time Complexity:** O(n) — Two linear passes.
**Space Complexity:** O(n) — DP arrays of size n-1 for each pass.
This version explicitly builds the DP table, making the recurrence relation easier to trace. Each `dp[i]` represents the maximum money from houses 0 to i in that subarray. While less space-efficient, this is useful for understanding the DP transition before optimising to O(1) space.


@@ -0,0 +1,246 @@
title: House Robber III
slug: house-robber-iii
difficulty: medium
leetcode_id: 337
leetcode_url: https://leetcode.com/problems/house-robber-iii/
categories:
- trees
- dynamic-programming
patterns:
- dfs
- dynamic-programming
description: |
The thief has found himself a new place for his thievery again. There is only one entrance to this area, called `root`.
Besides the `root`, each house has one and only one parent house. After a tour, the smart thief realised that all houses in this place form a **binary tree**. It will automatically contact the police if **two directly-linked houses were broken into on the same night**.
Given the `root` of the binary tree, return *the maximum amount of money the thief can rob without alerting the police*.
constraints: |
- The number of nodes in the tree is in the range `[1, 10^4]`
- `0 <= Node.val <= 10^4`
examples:
- input: "root = [3,2,3,null,3,null,1]"
output: "7"
explanation: "Maximum amount of money the thief can rob = 3 + 3 + 1 = 7 (root and two grandchildren)."
- input: "root = [3,4,5,1,3,null,1]"
output: "9"
explanation: "Maximum amount of money the thief can rob = 4 + 5 = 9 (the two children of root)."
explanation:
intuition: |
Picture a family tree where each person holds some cash. You want to collect as much money as possible, but there's a catch: if you take money from someone, you can't take from their direct parent or children — only from grandparents, grandchildren, or unrelated branches.
The key insight is that at every node, you face a **binary choice**:
- **Rob this node**: You get its value, but you *cannot* rob its children. However, you *can* rob its grandchildren (the children's children).
- **Skip this node**: You don't get its value, but you're free to rob its children (and potentially their children too).
Think of it like this: each node needs to report back two pieces of information to its parent — *"Here's how much you can get if you rob me, and here's how much you can get if you skip me."* The parent then uses both pieces to make its own optimal decision.
This naturally suggests a **post-order DFS** approach: process children first, collect their "rob/skip" information, then compute the current node's optimal values.
approach: |
We solve this using **Tree DP with DFS**, where each node returns a pair of values: `(rob_this_node, skip_this_node)`.
**Step 1: Define what each node returns**
- `rob`: Maximum money if we rob this node (includes node's value, but excludes children)
- `skip`: Maximum money if we skip this node (children are free to be robbed or skipped)
&nbsp;
**Step 2: Handle the base case**
- For a `null` node (empty subtree), return `(0, 0)` — no money either way
- This provides the termination condition for our recursion
&nbsp;
**Step 3: Recurse on children (post-order DFS)**
- Call the function on `left` child, getting `(left_rob, left_skip)`
- Call the function on `right` child, getting `(right_rob, right_skip)`
- We now know the optimal values for both subtrees
&nbsp;
**Step 4: Calculate current node's values**
- `rob_current = node.val + left_skip + right_skip`
- If we rob this node, children must be skipped
- `skip_current = max(left_rob, left_skip) + max(right_rob, right_skip)`
- If we skip this node, each child independently chooses its best option
&nbsp;
**Step 5: Return the answer**
- At the root, return `max(rob_root, skip_root)`
- This gives the global maximum across all valid robbery plans
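To make the pair propagation concrete, the sketch below runs the recurrence on Example 2 (`root = [3,4,5,1,3,null,1]`) and records the `(rob, skip)` pair at each node. The `Node` class, the path labels, and the `pairs` dictionary are illustrative assumptions, not part of the LeetCode interface.

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def rob_pairs(root):
    """Return (answer, pairs), where pairs maps a path label to (rob, skip)."""
    pairs = {}
    def dfs(node, label):
        if node is None:
            return (0, 0)
        left_rob, left_skip = dfs(node.left, label + "L")
        right_rob, right_skip = dfs(node.right, label + "R")
        rob = node.val + left_skip + right_skip
        skip = max(left_rob, left_skip) + max(right_rob, right_skip)
        pairs[label] = (rob, skip)
        return (rob, skip)
    best = max(dfs(root, "root"))
    return best, pairs

# Example 2: root = [3,4,5,1,3,null,1]
tree = Node(3, Node(4, Node(1), Node(3)), Node(5, None, Node(1)))
best, pairs = rob_pairs(tree)
for label, pair in pairs.items():
    print(label, pair)
```

For this tree the left child reports `(4, 4)`, the right child reports `(5, 1)`, and the root ends up with `(8, 9)`. Skipping the root and robbing both children gives 9, matching the expected output.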
common_pitfalls:
- title: Naive Recursion Without Memoisation
description: |
A tempting approach is to write a simple recursive function:
```python
def rob(node):
    if not node:
        return 0
    # Rob this node + grandchildren
    rob_this = node.val
    if node.left:
        rob_this += rob(node.left.left) + rob(node.left.right)
    if node.right:
        rob_this += rob(node.right.left) + rob(node.right.right)
    # Skip this node, rob children
    skip_this = rob(node.left) + rob(node.right)
    return max(rob_this, skip_this)
```
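This recalculates the same subtrees multiple times. For example, `rob(node.left.left)` is computed once directly (when robbing the current node together with its grandchildren) and again inside `rob(node.left)` (when skipping the current node). The repeated work compounds at every level, so the running time is **exponential in the worst case** (commonly bounded as O(2^n); even a skewed tree produces Fibonacci-like growth in the number of calls) and will cause TLE.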
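The blow-up can be observed by counting calls. The sketch below is a rough experiment, not a benchmark: `make_chain`, `count_naive_calls`, and `count_pair_calls` are hypothetical helpers, and the left-skewed chain is just one convenient worst-case shape.

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def make_chain(length):
    """Build a left-skewed tree (a chain) with `length` nodes, each holding 1."""
    root = None
    for _ in range(length):
        root = Node(1, left=root)
    return root

def count_naive_calls(root):
    """Count invocations of the naive recursion (calls on None included)."""
    calls = 0
    def rob(node):
        nonlocal calls
        calls += 1
        if not node:
            return 0
        rob_this = node.val
        if node.left:
            rob_this += rob(node.left.left) + rob(node.left.right)
        if node.right:
            rob_this += rob(node.right.left) + rob(node.right.right)
        return max(rob_this, rob(node.left) + rob(node.right))
    rob(root)
    return calls

def count_pair_calls(root):
    """Count invocations of the pair-returning DFS (calls on None included)."""
    calls = 0
    def dfs(node):
        nonlocal calls
        calls += 1
        if not node:
            return (0, 0)
        left_rob, left_skip = dfs(node.left)
        right_rob, right_skip = dfs(node.right)
        return (node.val + left_skip + right_skip,
                max(left_rob, left_skip) + max(right_rob, right_skip))
    dfs(root)
    return calls

for length in (10, 15, 20):
    chain = make_chain(length)
    print(length, count_naive_calls(chain), count_pair_calls(chain))
```

On a chain, each naive call spawns a call on the child and another on the grandchild, so the call count grows roughly like the Fibonacci numbers, while the pair-returning DFS makes exactly `2n + 1` calls for `n` nodes.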
wrong_approach: "Simple recursion visiting same nodes repeatedly"
correct_approach: "Return (rob, skip) pair so each node is visited exactly once"
- title: Forgetting the Skip Option Gives Freedom
description: |
When you skip a node, you're not forced to rob its children — you simply have the *option* to rob them.
The correct formula for `skip_current` is:
```
skip = max(left_rob, left_skip) + max(right_rob, right_skip)
```
A common mistake is writing `skip = left_rob + right_rob`, which forces robbing both children. But sometimes skipping a child yields more money (e.g., if the grandchildren have higher values).
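A small tree is enough to expose the difference. In the sketch below, `rob_tree` and its `buggy_skip` flag are hypothetical names used only for this comparison; the tree is a chain with values 2, 1, 1, 2, where the best plan robs the top and bottom nodes and skips the two in between.

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def rob_tree(root, buggy_skip=False):
    """Tree DP; with buggy_skip=True it uses skip = left_rob + right_rob."""
    def dfs(node):
        if node is None:
            return (0, 0)
        left_rob, left_skip = dfs(node.left)
        right_rob, right_skip = dfs(node.right)
        rob = node.val + left_skip + right_skip
        if buggy_skip:
            # Buggy: forces both children to be robbed
            skip = left_rob + right_rob
        else:
            # Correct: each child picks its better option
            skip = max(left_rob, left_skip) + max(right_rob, right_skip)
        return (rob, skip)
    return max(dfs(root))

# Chain 2 -> 1 -> 1 -> 2 (each node has only a left child)
chain = Node(2, Node(1, Node(1, Node(2))))
correct = rob_tree(chain)                   # 4: rob the top and bottom nodes
wrong = rob_tree(chain, buggy_skip=True)    # 3: cannot skip two levels in a row
```

The buggy version loses the plan that skips two consecutive levels, which is exactly the freedom the `max` terms preserve.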
wrong_approach: "skip = left_rob + right_rob"
correct_approach: "skip = max(left_rob, left_skip) + max(right_rob, right_skip)"
- title: Confusing Tree DP with Array DP
description: |
Unlike House Robber I (array) where you track `dp[i-1]` and `dp[i-2]`, tree DP tracks relationships via parent-child edges, not indices.
You can't simply apply the array recurrence `dp[i] = max(nums[i] + dp[i-2], dp[i-1])` because:
- Trees have multiple children (not just one "previous" element)
- The "skip two" concept becomes "skip direct link" (rob grandchildren, not `i-2`)
- Each node can have 0, 1, or 2 children
The pair-returning approach `(rob, skip)` is the tree analogue of the space-optimised array DP.
key_takeaways:
- "**Tree DP pattern**: Return multiple values (rob/skip) from recursion to avoid redundant computation"
- "**Post-order traversal**: Process children first, then compute parent's answer from children's results"
- "**Binary choice at each node**: Rob (take value, skip children) vs Skip (children choose freely)"
- "**Generalises House Robber**: Same core constraint (no adjacent), different data structure (tree vs array)"
time_complexity: "O(n). Each node is visited exactly once during the DFS traversal, and we do O(1) work per node."
space_complexity: "O(h) where h is the tree height. The recursion stack can grow as deep as the tree. In the worst case (skewed tree), this is O(n); for a balanced tree, it's O(log n)."
solutions:
- approach_name: Tree DP with DFS
is_optimal: true
code: |
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right
def rob(root: TreeNode | None) -> int:
    def dfs(node: TreeNode | None) -> tuple[int, int]:
        # Base case: null node contributes nothing
        if not node:
            return (0, 0)
        # Post-order: process children first
        left_rob, left_skip = dfs(node.left)
        right_rob, right_skip = dfs(node.right)
        # If we rob this node, we must skip both children
        rob_current = node.val + left_skip + right_skip
        # If we skip this node, each child chooses its best option
        skip_current = max(left_rob, left_skip) + max(right_rob, right_skip)
        return (rob_current, skip_current)
    rob_root, skip_root = dfs(root)
    return max(rob_root, skip_root)
explanation: |
**Time Complexity:** O(n) — Each node visited exactly once.
**Space Complexity:** O(h) — Recursion stack depth equals tree height.
Each node returns a pair: (max if robbed, max if skipped). The parent combines these to compute its own pair. At the root, we take the maximum of both options. This eliminates redundant computation by ensuring each subtree is evaluated exactly once.
- approach_name: Naive Recursion (TLE)
is_optimal: false
code: |
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right
def rob(root: TreeNode | None) -> int:
    if not root:
        return 0
    # Option 1: Rob this node + grandchildren
    rob_this = root.val
    if root.left:
        rob_this += rob(root.left.left) + rob(root.left.right)
    if root.right:
        rob_this += rob(root.right.left) + rob(root.right.right)
    # Option 2: Skip this node, consider children
    skip_this = rob(root.left) + rob(root.right)
    return max(rob_this, skip_this)
explanation: |
**Time Complexity:** O(2^n) — Exponential due to overlapping subproblems.
**Space Complexity:** O(h) — Recursion stack depth.
This approach correctly identifies the two choices (rob or skip) but recalculates the same subtrees multiple times. For example, `rob(root.left.left)` is computed once directly (as a grandchild of the root) and again inside `rob(root.left)`. This causes TLE on large trees. Included to illustrate why the pair-returning (or memoised) approach is needed.
- approach_name: Recursion with Memoisation
is_optimal: false
code: |
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right
def rob(root: TreeNode | None) -> int:
    memo = {}
    def helper(node: TreeNode | None) -> int:
        if not node:
            return 0
        if node in memo:
            return memo[node]
        # Option 1: Rob this node + grandchildren
        rob_this = node.val
        if node.left:
            rob_this += helper(node.left.left) + helper(node.left.right)
        if node.right:
            rob_this += helper(node.right.left) + helper(node.right.right)
        # Option 2: Skip this node, consider children
        skip_this = helper(node.left) + helper(node.right)
        memo[node] = max(rob_this, skip_this)
        return memo[node]
    return helper(root)
explanation: |
**Time Complexity:** O(n) — Each node computed once due to memoisation.
**Space Complexity:** O(n) — Hash map stores result for each node, plus O(h) recursion stack.
Adding memoisation to the naive approach fixes the exponential blowup. However, this uses O(n) extra space for the hash map, whereas the pair-returning approach achieves the same time complexity with only O(h) space. This solution is correct and efficient, but the optimal approach is more elegant.


@@ -0,0 +1,182 @@
title: House Robber
slug: house-robber
difficulty: medium
leetcode_id: 198
leetcode_url: https://leetcode.com/problems/house-robber/
categories:
- arrays
- dynamic-programming
patterns:
- dynamic-programming
description: |
You are a professional robber planning to rob houses along a street. Each house has a certain amount of money stashed. The only constraint stopping you from robbing each of them is that adjacent houses have security systems connected and **it will automatically contact the police if two adjacent houses were broken into on the same night**.
Given an integer array `nums` representing the amount of money of each house, return *the maximum amount of money you can rob tonight without alerting the police*.
constraints: |
- `1 <= nums.length <= 100`
- `0 <= nums[i] <= 400`
examples:
- input: "nums = [1,2,3,1]"
output: "4"
explanation: "Rob house 1 (money = 1) and then rob house 3 (money = 3). Total amount you can rob = 1 + 3 = 4."
- input: "nums = [2,7,9,3,1]"
output: "12"
explanation: "Rob house 1 (money = 2), rob house 3 (money = 9) and rob house 5 (money = 1). Total amount you can rob = 2 + 9 + 1 = 12."
explanation:
intuition: |
Imagine walking down a street, deciding which houses to rob. At each house, you face a simple choice: **rob it or skip it**.
If you rob the current house, you can't rob the previous one (they're adjacent). But if you skip the current house, you keep whatever maximum you could achieve up to the previous house.
Think of it like this: for every house, you're asking *"What's better — taking this house plus the best I could do two houses ago, or skipping this house and keeping the best I could do at the previous house?"*
This is the core insight: the **optimal decision at each house depends only on the optimal results for the previous two houses**. This is the "optimal substructure" property, and because later houses reuse those same sub-results (overlapping subproblems), it is a textbook dynamic programming problem.
The key realisation is that you don't need to track *which specific houses* you robbed — you only need to track the **maximum money possible** up to each point.
approach: |
We solve this using **Dynamic Programming with Space Optimisation**:
**Step 1: Define the recurrence relation**
- Let `dp[i]` represent the maximum money we can rob from houses `0` to `i`
- At each house `i`, we have two choices:
- **Rob house `i`**: Take `nums[i]` plus the best from two houses back: `nums[i] + dp[i-2]`
- **Skip house `i`**: Keep the best from the previous house: `dp[i-1]`
- The recurrence: `dp[i] = max(nums[i] + dp[i-2], dp[i-1])`
&nbsp;
**Step 2: Recognise we only need two variables**
- The recurrence only looks back two steps (`dp[i-1]` and `dp[i-2]`)
- Instead of storing an entire array, use two variables:
- `prev1`: Maximum money up to the previous house (i.e., `dp[i-1]`)
- `prev2`: Maximum money up to two houses back (i.e., `dp[i-2]`)
&nbsp;
**Step 3: Iterate through each house**
- For each house, calculate: `current = max(nums[i] + prev2, prev1)`
- Update variables: `prev2 = prev1`, then `prev1 = current`
- This "slides" our window of knowledge forward by one house
&nbsp;
**Step 4: Return the result**
- After processing all houses, `prev1` contains the maximum money achievable
- Return `prev1`
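The sliding of `prev2` and `prev1` can be traced on Example 2. The sketch below is illustrative: `rob_trace` is a hypothetical helper, and it starts both variables at 0, an assumed initialisation that is equivalent and avoids special-casing a single house.

```python
def rob_trace(nums):
    """Return the answer and a list of (house, prev2, prev1) snapshots."""
    prev2, prev1 = 0, 0
    trace = []
    for i, money in enumerate(nums):
        current = max(money + prev2, prev1)  # rob house i, or skip it
        prev2, prev1 = prev1, current        # slide the window forward
        trace.append((i, prev2, prev1))
    return prev1, trace

answer, trace = rob_trace([2, 7, 9, 3, 1])
for house, prev2, prev1 in trace:
    print(f"after house {house}: prev2={prev2}, prev1={prev1}")
```

After each house, `prev1` takes the values 2, 7, 11, 11, 12, so the final answer is 12. House 3 leaves `prev1` unchanged because `3 + 7 = 10` is worse than the 11 already achievable.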
common_pitfalls:
- title: The Greedy Trap (Alternating Houses)
description: |
A common first instinct is to simply take every other house — either all odd-indexed or all even-indexed houses.
This fails for cases like `nums = [2, 1, 1, 2]`:
- Even indices (0, 2): `2 + 1 = 3`
- Odd indices (1, 3): `1 + 2 = 3`
- But optimal is indices (0, 3): `2 + 2 = 4`
The pattern of which houses to rob isn't regular — it depends on the actual values. You might skip two houses in a row if the third house has a high value.
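The gap between the two strategies is easy to reproduce. In this sketch, `rob_alternating` is a hypothetical name for the every-other-house greedy, and `rob_dp` is a compact form of the DP recurrence.

```python
def rob_alternating(nums):
    """Greedy: the better of 'all even indices' and 'all odd indices'."""
    return max(sum(nums[0::2]), sum(nums[1::2]))

def rob_dp(nums):
    """DP: at each house, rob it (plus best from two back) or skip it."""
    prev2 = prev1 = 0
    for money in nums:
        prev2, prev1 = prev1, max(money + prev2, prev1)
    return prev1

nums = [2, 1, 1, 2]
greedy_result = rob_alternating(nums)  # 3
dp_result = rob_dp(nums)               # 4 (houses 0 and 3)
```

The greedy version never considers skipping two houses in a row, so it cannot find the plan that robs indices 0 and 3.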
wrong_approach: "Take every other house"
correct_approach: "DP considering all valid combinations"
- title: Off-by-One Errors in Base Cases
description: |
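The array-based DP seeds `dp[0]` and `dp[1]` before the loop, so small inputs need explicit handling:
- If there's only one house, return `nums[0]`
- If there are two houses, return `max(nums[0], nums[1])`
Forgetting these edge cases leads to index out-of-bounds errors or incorrect results for small inputs. The space-optimised version avoids most of this by starting from `prev2 = 0` and `prev1 = nums[0]`.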
wrong_approach: "Start iteration at index 0 without base cases"
correct_approach: "Handle n=1 and n=2 explicitly, then iterate from index 2"
- title: Confusing the Variable Updates
description: |
When using the space-optimised approach, the order of updates matters:
```python
# WRONG: prev2 is overwritten before it is used
prev2 = prev1
prev1 = max(nums[i] + prev2, prev1)
# CORRECT: Calculate first, then update in order
current = max(nums[i] + prev2, prev1)
prev2 = prev1
prev1 = current
```
Always calculate the new value *before* updating the variables it depends on.
key_takeaways:
- "**Classic DP pattern**: When optimal solutions depend on previous optimal solutions, think dynamic programming"
- "**Space optimisation**: If recurrence only looks back a fixed number of steps, replace the array with variables"
- "**Greedy doesn't always work**: Problems with non-local dependencies (like adjacency constraints) often need DP"
- "**Foundation for variants**: This logic extends to House Robber II (circular street) and House Robber III (binary tree)"
time_complexity: "O(n). We iterate through the array exactly once, making a constant-time decision at each house."
space_complexity: "O(1). We only use two variables (`prev1` and `prev2`) regardless of input size, thanks to space optimisation."
solutions:
- approach_name: Dynamic Programming (Space Optimised)
is_optimal: true
code: |
def rob(nums: list[int]) -> int:
    # Edge case: only one house
    if len(nums) == 1:
        return nums[0]
    # prev2 = max money from two houses back
    # prev1 = max money from previous house
    prev2 = 0
    prev1 = nums[0]
    for i in range(1, len(nums)):
        # Choice: rob this house + prev2, or skip and keep prev1
        current = max(nums[i] + prev2, prev1)
        # Slide the window forward
        prev2 = prev1
        prev1 = current
    return prev1
explanation: |
**Time Complexity:** O(n) — Single pass through the array.
**Space Complexity:** O(1) — Only two variables used.
We iterate once, at each step choosing the better option: rob current house (add to best from 2 houses back) or skip (keep best from previous house). The space optimisation works because we only ever look back two positions.
- approach_name: Dynamic Programming (Array)
is_optimal: false
code: |
def rob(nums: list[int]) -> int:
    n = len(nums)
    # Edge cases
    if n == 1:
        return nums[0]
    if n == 2:
        return max(nums[0], nums[1])
    # dp[i] = max money from houses 0..i
    dp = [0] * n
    dp[0] = nums[0]
    dp[1] = max(nums[0], nums[1])
    for i in range(2, n):
        # Rob house i (add to best from i-2) or skip (keep best from i-1)
        dp[i] = max(nums[i] + dp[i - 2], dp[i - 1])
    return dp[n - 1]
explanation: |
**Time Complexity:** O(n) — Single pass through the array.
**Space Complexity:** O(n) — We store the entire DP array.
This version explicitly builds the DP table, making the recurrence relation clearer. Each `dp[i]` represents the maximum money achievable from houses 0 to i. While correct, it uses more space than necessary since we only need the last two values.


@@ -0,0 +1,204 @@
title: Implement Queue using Stacks
slug: implement-queue-using-stacks
difficulty: easy
leetcode_id: 232
leetcode_url: https://leetcode.com/problems/implement-queue-using-stacks/
categories:
- stack
- queue
patterns:
- monotonic-stack
description: |
Implement a first in first out (FIFO) queue using only two stacks. The implemented queue should support all the functions of a normal queue (`push`, `peek`, `pop`, and `empty`).
Implement the `MyQueue` class:
- `void push(int x)` Pushes element `x` to the back of the queue.
- `int pop()` Removes the element from the front of the queue and returns it.
- `int peek()` Returns the element at the front of the queue.
- `boolean empty()` Returns `true` if the queue is empty, `false` otherwise.
**Notes:**
- You must use **only** standard operations of a stack, which means only `push to top`, `peek/pop from top`, `size`, and `is empty` operations are valid.
- Depending on your language, the stack may not be supported natively. You may simulate a stack using a list or deque (double-ended queue) as long as you use only a stack's standard operations.
constraints: |
- `1 <= x <= 9`
- At most `100` calls will be made to `push`, `pop`, `peek`, and `empty`
- All the calls to `pop` and `peek` are valid
examples:
- input: |
["MyQueue", "push", "push", "peek", "pop", "empty"]
[[], [1], [2], [], [], []]
output: "[null, null, null, 1, 1, false]"
explanation: |
MyQueue myQueue = new MyQueue();
myQueue.push(1); // queue is: [1]
myQueue.push(2); // queue is: [1, 2] (leftmost is front of the queue)
myQueue.peek(); // return 1
myQueue.pop(); // return 1, queue is [2]
myQueue.empty(); // return false
explanation:
intuition: |
Think of this problem like having two buckets to simulate a conveyor belt.
A **queue** follows First-In-First-Out (FIFO) order — the first item added is the first to leave, like a line at a coffee shop. A **stack** follows Last-In-First-Out (LIFO) order — the last item added is the first to leave, like a stack of plates.
The key insight is that **reversing a stack gives you the opposite order**. If you push elements `1, 2, 3` onto a stack, they come out as `3, 2, 1`. But if you pop them all onto a *second* stack, they reverse again to `1, 2, 3` — exactly FIFO order!
So we use two stacks:
- An **input stack** where new elements are pushed
- An **output stack** from which elements are popped
When we need to pop or peek and the output stack is empty, we transfer all elements from the input stack to the output stack. This reversal converts the LIFO order to FIFO order.
approach: |
We solve this using a **Two-Stack Approach** with lazy transfer:
**Step 1: Initialise two stacks**
- `input_stack`: Where we push new elements
- `output_stack`: Where we pop/peek elements from
&nbsp;
**Step 2: Push operation**
- Simply push the element onto `input_stack`
- This is always O(1)
&nbsp;
**Step 3: Pop/Peek operation**
- If `output_stack` is empty, transfer all elements from `input_stack` to `output_stack`
- Each transfer reverses the order, converting LIFO to FIFO
- Pop or peek from `output_stack`
&nbsp;
**Step 4: Empty check**
- Queue is empty only when both stacks are empty
&nbsp;
The lazy transfer approach is key: we only move elements when necessary, which gives us amortised O(1) time per operation.
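The amortised bound can be checked by counting how many elements are actually moved between the stacks. The sketch below is a stripped-down illustration: `CountingQueue` and its `moves` counter are assumptions added for the experiment, not part of the required interface.

```python
class CountingQueue:
    """Two-stack queue that counts elements moved from input to output."""
    def __init__(self):
        self.input_stack = []
        self.output_stack = []
        self.moves = 0
    def push(self, x):
        self.input_stack.append(x)
    def pop(self):
        # Lazy transfer: only when the output stack has run dry
        if not self.output_stack:
            while self.input_stack:
                self.output_stack.append(self.input_stack.pop())
                self.moves += 1
        return self.output_stack.pop()

queue = CountingQueue()
popped = []
for x in range(1, 6):        # push 1..5
    queue.push(x)
popped.append(queue.pop())   # first pop transfers all five elements
for x in range(6, 9):        # push 6..8 while output still holds 2..5
    queue.push(x)
while queue.input_stack or queue.output_stack:
    popped.append(queue.pop())
```

Eight elements were pushed and `queue.moves` ends at 8: one transfer of five elements and a later transfer of three. A single pop can be expensive, but the total transfer work never exceeds the number of pushes.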
common_pitfalls:
- title: Transferring on Every Operation
description: |
A naive approach might transfer elements between stacks on every push or pop. This leads to O(n) for every operation.
The correct approach uses **lazy transfer**: only move elements from input to output when output is empty and we need to pop/peek. Each element is pushed at most twice (once onto input, once onto output) and popped at most twice, giving amortised O(1) per operation.
wrong_approach: "Transfer between stacks on every operation"
correct_approach: "Lazy transfer only when output stack is empty"
- title: Forgetting to Check Output Stack First
description: |
When implementing pop/peek, you must first check if the output stack has elements before transferring from input. If you always transfer, you'll break the FIFO order.
For example, if output has `[1]` and input has `[2, 3]` (tops on the right), transferring would make output `[1, 3, 2]`, so the next pop would return `2` instead of `1`.
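The scenario above can be reproduced with a deliberately broken queue. `AlwaysTransferQueue` is a hypothetical class written only to show the bug; it differs from the correct solution in that `_transfer` ignores whether the output stack is empty.

```python
class AlwaysTransferQueue:
    """Deliberately buggy: transfers on every pop/peek, even if output is non-empty."""
    def __init__(self):
        self.input_stack = []
        self.output_stack = []
    def push(self, x):
        self.input_stack.append(x)
    def peek(self):
        self._transfer()
        return self.output_stack[-1]
    def pop(self):
        self._transfer()
        return self.output_stack.pop()
    def _transfer(self):
        # Bug: no 'if not self.output_stack' guard
        while self.input_stack:
            self.output_stack.append(self.input_stack.pop())

buggy = AlwaysTransferQueue()
buggy.push(1)
first_peek = buggy.peek()   # output is now [1]
buggy.push(2)
buggy.push(3)               # input is now [2, 3]
wrong_front = buggy.pop()   # output becomes [1, 3, 2]; returns 2, not 1
```

Adding the emptiness guard to `_transfer` is the whole fix: newer elements then wait in the input stack until every older element has left the output stack.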
wrong_approach: "Always transfer from input to output"
correct_approach: "Only transfer when output is empty"
- title: Not Handling Peek Efficiently
description: |
Some implementations might pop, save the value, then push back for peek. This is unnecessary.
Since we have access to the top of the output stack, we can simply return the top element without modification.
key_takeaways:
- "**Data structure simulation**: You can simulate one data structure with another by understanding their fundamental properties"
- "**Amortised analysis**: Each element is pushed and popped at most twice total, so n operations take O(n) time overall"
- "**Lazy evaluation**: Deferring work (transfers) until necessary often improves average performance"
- "**Related problem**: The inverse problem — implementing a stack using queues (LeetCode 225) — uses similar reversal logic"
time_complexity: "O(1) amortised for all operations. Each element is moved between stacks at most once, so n operations take O(n) total."
space_complexity: "O(n). We store all n elements across the two stacks."
solutions:
- approach_name: Two Stacks with Lazy Transfer
is_optimal: true
code: |
class MyQueue:
    def __init__(self):
        # Input stack: where we push new elements
        self.input_stack = []
        # Output stack: where we pop/peek from
        self.output_stack = []
    def push(self, x: int) -> None:
        # Always push to input stack - O(1)
        self.input_stack.append(x)
    def pop(self) -> int:
        # Ensure output stack has elements
        self._transfer_if_needed()
        # Pop from output stack - FIFO order
        return self.output_stack.pop()
    def peek(self) -> int:
        # Ensure output stack has elements
        self._transfer_if_needed()
        # Return top of output stack without removing
        return self.output_stack[-1]
    def empty(self) -> bool:
        # Queue is empty only when both stacks are empty
        return not self.input_stack and not self.output_stack
    def _transfer_if_needed(self) -> None:
        # Only transfer when output is empty - lazy evaluation
        if not self.output_stack:
            # Move all elements from input to output
            # This reverses order: LIFO -> FIFO
            while self.input_stack:
                self.output_stack.append(self.input_stack.pop())
explanation: |
**Time Complexity:** O(1) amortised for all operations.
- `push`: O(1) — direct append
- `pop`/`peek`: O(1) amortised — each element is transferred at most once
- `empty`: O(1) — two boolean checks
**Space Complexity:** O(n) — storing n elements across both stacks.
The key insight is lazy transfer: we only move elements when the output stack is empty. Since each element moves from input to output at most once, the amortised cost per operation is O(1).
- approach_name: Two Stacks with Eager Transfer
is_optimal: false
code: |
class MyQueue:
    def __init__(self):
        self.stack1 = []  # Main storage
        self.stack2 = []  # Temporary for reversal
    def push(self, x: int) -> None:
        # Move everything to stack2
        while self.stack1:
            self.stack2.append(self.stack1.pop())
        # Push new element to bottom of stack1
        self.stack1.append(x)
        # Move everything back
        while self.stack2:
            self.stack1.append(self.stack2.pop())
    def pop(self) -> int:
        # Top of stack1 is front of queue
        return self.stack1.pop()
    def peek(self) -> int:
        return self.stack1[-1]
    def empty(self) -> bool:
        return not self.stack1
explanation: |
**Time Complexity:** O(n) for push, O(1) for pop/peek/empty.
**Space Complexity:** O(n) — storing n elements across both stacks.
This approach maintains FIFO order at all times by doing expensive work during push. Every push transfers all elements twice. While pop/peek become O(1), push is O(n), making this less efficient than lazy transfer when pushes are frequent.


@@ -0,0 +1,216 @@
title: Implement Stack using Queues
slug: implement-stack-using-queues
difficulty: easy
leetcode_id: 225
leetcode_url: https://leetcode.com/problems/implement-stack-using-queues/
categories:
- stack
- queue
patterns:
- monotonic-stack
description: |
Implement a last-in-first-out (LIFO) stack using only two queues. The implemented stack should support all the functions of a normal stack (`push`, `top`, `pop`, and `empty`).
Implement the `MyStack` class:
- `void push(int x)` Pushes element `x` to the top of the stack.
- `int pop()` Removes the element on the top of the stack and returns it.
- `int top()` Returns the element on the top of the stack.
- `boolean empty()` Returns `true` if the stack is empty, `false` otherwise.
**Notes:**
- You must use **only** standard operations of a queue, which means only `push to back`, `peek/pop from front`, `size`, and `is empty` operations are valid.
- Depending on your language, the queue may not be supported natively. You may simulate a queue using a list or deque (double-ended queue) as long as you use only a queue's standard operations.
**Follow-up:** Can you implement the stack using only one queue?
constraints: |
- `1 <= x <= 9`
- At most `100` calls will be made to `push`, `pop`, `top`, and `empty`
- All the calls to `pop` and `top` are valid
examples:
- input: |
["MyStack", "push", "push", "top", "pop", "empty"]
[[], [1], [2], [], [], []]
output: "[null, null, null, 2, 2, false]"
explanation: |
MyStack myStack = new MyStack();
myStack.push(1); // stack is: [1]
myStack.push(2); // stack is: [1, 2] (rightmost is top of stack)
myStack.top(); // return 2
myStack.pop(); // return 2, stack is [1]
myStack.empty(); // return false
explanation:
intuition: |
Think of this problem like having a queue of people, but you want the *last* person who joined to be served first — the opposite of how a normal queue works.
A **stack** follows Last-In-First-Out (LIFO) order — the most recently added item is the first to leave, like a stack of plates. A **queue** follows First-In-First-Out (FIFO) order — the first item added is the first to leave, like a line at a coffee shop.
The key insight is that **rotating a queue puts the back element at the front**. If we push a new element to a queue and then rotate all the *other* elements behind it (by dequeuing from front and enqueuing to back), the new element ends up at the front — exactly where we need it for LIFO access!
For example, if the queue contains `[1, 2]` (front to back) and we push `3`:
1. Queue becomes `[1, 2, 3]`
2. First rotation: dequeue `1`, enqueue `1` → `[2, 3, 1]`
3. Second rotation: dequeue `2`, enqueue `2` → `[3, 1, 2]`
Now `3` (the most recent) is at the front, ready to be popped first!
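The same walk-through can be reproduced in code. `push_with_trace` below is a hypothetical helper that records the queue (front to back) after the append and after each rotation; it uses only `append` and `popleft` on the deque.

```python
from collections import deque

def push_with_trace(queue, x):
    """Append x, rotate the older elements behind it, and record each state."""
    queue.append(x)
    states = [list(queue)]
    # One rotation per element that was already in the queue
    for _ in range(len(queue) - 1):
        queue.append(queue.popleft())
        states.append(list(queue))
    return states

states = push_with_trace(deque([1, 2]), 3)
for state in states:
    print(state)
```

The recorded states are `[1, 2, 3]`, `[2, 3, 1]`, and `[3, 1, 2]`, matching the steps above.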
approach: |
We solve this using a **Single Queue with Rotation** approach:
**Step 1: Initialise one queue**
- `queue`: A deque used with only queue operations (append to back, pop from front)
&nbsp;
**Step 2: Push operation**
- Append the new element to the back of the queue
- Rotate the queue by moving all *previous* elements behind the new one
- Specifically: pop from front and append to back, `n-1` times (where `n` is the current size)
- After rotation, the new element is at the front
&nbsp;
**Step 3: Pop operation**
- Simply pop from the front of the queue
- Since we maintain LIFO order, the front element is always the most recently pushed
&nbsp;
**Step 4: Top operation**
- Return the front element without removing it
- Use index access `queue[0]` or peek operation
&nbsp;
**Step 5: Empty check**
- Return whether the queue is empty
&nbsp;
This approach makes push O(n) but keeps pop and top at O(1). Some operation has to pay for the reordering; putting the cost on push keeps every read of the top element cheap.
common_pitfalls:
- title: Rotating the Wrong Number of Times
description: |
When pushing a new element, you need to rotate exactly `n-1` elements (the elements that were already in the queue before the push), not `n` elements.
If you rotate `n` times, you'll move the new element to the back again, undoing the work:
- Push `3` to `[1, 2]` → `[1, 2, 3]`
- Rotate 3 times: `[2, 3, 1]` → `[3, 1, 2]` → `[1, 2, 3]` (back to original!)
The correct rotation count is `len(queue) - 1` after appending.
wrong_approach: "Rotate n times after pushing"
correct_approach: "Rotate n-1 times (previous size) after pushing"
- title: Using Two Queues Unnecessarily
description: |
The problem mentions "two queues", but the follow-up asks if you can do it with one. Many solutions use two queues and swap between them, which adds complexity without benefit.
The single-queue rotation approach is simpler and equally efficient. The two-queue approach might seem more intuitive at first, but it requires more bookkeeping.
- title: Confusing Queue Operations with Deque Operations
description: |
In Python, `collections.deque` supports both `appendleft` and `append`, but a true queue only allows:
- `append` (enqueue to back)
- `popleft` (dequeue from front)
- `len` and checking if empty
Using `appendleft` or `pop` (from back) violates the "queue only" constraint. Make sure your solution only uses valid queue operations.
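One way to guard against accidental deque-only calls is to hide the deque behind a wrapper that exposes nothing but queue operations. The `Queue` class and its method names (`enqueue`, `dequeue`, `front`) are assumptions made for this sketch; the stack built on top follows the single-queue rotation idea.

```python
from collections import deque

class Queue:
    """Hypothetical wrapper exposing only standard queue operations."""
    def __init__(self):
        self._items = deque()
    def enqueue(self, x):
        self._items.append(x)          # push to back
    def dequeue(self):
        return self._items.popleft()   # pop from front
    def front(self):
        return self._items[0]          # peek at front
    def __len__(self):
        return len(self._items)

class MyStack:
    """Stack built on the restricted Queue via rotation on push."""
    def __init__(self):
        self.queue = Queue()
    def push(self, x):
        self.queue.enqueue(x)
        # Rotate the older elements behind the new one
        for _ in range(len(self.queue) - 1):
            self.queue.enqueue(self.queue.dequeue())
    def pop(self):
        return self.queue.dequeue()
    def top(self):
        return self.queue.front()
    def empty(self):
        return len(self.queue) == 0

stack = MyStack()
stack.push(1)
stack.push(2)
top_value = stack.top()   # 2
popped = stack.pop()      # 2
```

Because `MyStack` can only reach the deque through `Queue`, a stray `appendleft` or back-side `pop` would raise an `AttributeError` instead of silently breaking the constraint.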
key_takeaways:
- "**Data structure simulation**: You can simulate one data structure with another by understanding their ordering properties"
- "**Rotation technique**: Moving elements from front to back of a queue is a powerful way to reorder elements"
- "**Trade-off decisions**: Making push expensive (O(n)) keeps pop/top cheap (O(1)) — choose based on expected usage patterns"
- "**Related problem**: The inverse problem — implementing a queue using stacks (LeetCode 232) — uses a similar reversal concept"
time_complexity: "O(n) for push, O(1) for pop/top/empty. Each push rotates n-1 elements, while other operations access the front directly."
space_complexity: "O(n). We store all n elements in a single queue."
solutions:
- approach_name: Single Queue with Rotation
is_optimal: true
code: |
from collections import deque
class MyStack:
    def __init__(self):
        # Single queue to simulate stack
        self.queue = deque()
    def push(self, x: int) -> None:
        # Add new element to back
        self.queue.append(x)
        # Rotate all previous elements behind it
        # This puts the new element at the front
        for _ in range(len(self.queue) - 1):
            self.queue.append(self.queue.popleft())
    def pop(self) -> int:
        # Front of queue is top of stack (most recent)
        return self.queue.popleft()
    def top(self) -> int:
        # Return front without removing
        return self.queue[0]
    def empty(self) -> bool:
        return len(self.queue) == 0
explanation: |
**Time Complexity:**
- `push`: O(n) — rotate n-1 elements
- `pop`: O(1) — remove from front
- `top`: O(1) — access front element
- `empty`: O(1) — check length
**Space Complexity:** O(n) — storing n elements in the queue.
The key insight is rotating during push: after adding a new element to the back, we cycle all previous elements behind it. This ensures the most recently pushed element is always at the front, ready for O(1) pop/top access.
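To make the rotation concrete, here is a small trace of the queue contents after each push. It is only a sketch: `push_with_rotation` is a hypothetical standalone helper that mirrors the `push` method above, not part of the solution class.

```python
from collections import deque

def push_with_rotation(queue: deque, x: int) -> None:
    """Append x, then cycle the older elements behind it (hypothetical helper)."""
    queue.append(x)
    for _ in range(len(queue) - 1):
        queue.append(queue.popleft())

queue = deque()
snapshots = []
for value in (1, 2, 3):
    push_with_rotation(queue, value)
    snapshots.append(list(queue))

print(snapshots)  # [[1], [2, 1], [3, 2, 1]]
```

After every push the newest element sits at the front, which is why `pop` and `top` only need to look at index 0.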
- approach_name: Two Queues with Transfer
is_optimal: false
code: |
from collections import deque

class MyStack:
    def __init__(self):
        self.q1 = deque()  # Main queue
        self.q2 = deque()  # Temporary queue

    def push(self, x: int) -> None:
        # Push to temporary queue first
        self.q2.append(x)
        # Move all elements from q1 to q2
        while self.q1:
            self.q2.append(self.q1.popleft())
        # Swap queues - q2 becomes the new main queue
        self.q1, self.q2 = self.q2, self.q1

    def pop(self) -> int:
        # Front of q1 is top of stack
        return self.q1.popleft()

    def top(self) -> int:
        return self.q1[0]

    def empty(self) -> bool:
        return len(self.q1) == 0
explanation: |
**Time Complexity:**
- `push`: O(n) — transfer all elements
- `pop`: O(1) — remove from front
- `top`: O(1) — access front element
- `empty`: O(1) — check length
**Space Complexity:** O(n) — storing n elements across two queues.
This approach uses two queues. On each push, we add the new element to an empty queue, then transfer all elements from the main queue. This puts the new element at the front. While correct, the single-queue rotation approach is simpler.


@@ -0,0 +1,245 @@
title: Implement Trie (Prefix Tree)
slug: implement-trie-prefix-tree
difficulty: medium
leetcode_id: 208
leetcode_url: https://leetcode.com/problems/implement-trie-prefix-tree/
categories:
- strings
- hash-tables
patterns:
- trie
description: |
A [**trie**](https://en.wikipedia.org/wiki/Trie) (pronounced as "try") or **prefix tree** is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spellchecker.
Implement the `Trie` class:
- `Trie()` Initialises the trie object.
- `void insert(String word)` Inserts the string `word` into the trie.
- `boolean search(String word)` Returns `true` if the string `word` is in the trie (i.e., was inserted before), and `false` otherwise.
- `boolean startsWith(String prefix)` Returns `true` if there is a previously inserted string `word` that has the prefix `prefix`, and `false` otherwise.
constraints: |
- `1 <= word.length, prefix.length <= 2000`
- `word` and `prefix` consist only of lowercase English letters.
- At most `3 * 10^4` calls **in total** will be made to `insert`, `search`, and `startsWith`.
examples:
- input: |
["Trie", "insert", "search", "search", "startsWith", "insert", "search"]
[[], ["apple"], ["apple"], ["app"], ["app"], ["app"], ["app"]]
output: "[null, null, true, false, true, null, true]"
explanation: |
Trie trie = new Trie();
trie.insert("apple");
trie.search("apple"); // return True
trie.search("app"); // return False
trie.startsWith("app"); // return True
trie.insert("app");
trie.search("app"); // return True
explanation:
intuition: |
Imagine building a word-completion system like the one in your phone's keyboard. When you type "app", the system suggests "apple", "application", "approve", and so on. How can we efficiently store thousands of words and quickly find all words that start with a given prefix?
A **trie** (prefix tree) is the perfect data structure for this. Think of it like a tree where each node represents a single character, and paths from the root to nodes spell out words or prefixes. Unlike a hash table which stores complete words, a trie shares common prefixes among words, making it extremely efficient for prefix-based operations.
Visualise it like a family tree of letters:
```
      root
       |
       a
       |
       p
       |
       p
      / \
     l   (end of "app")
     |
     e
     |
(end of "apple")
```
The key insight is that **each node stores its children** (the next possible characters) and a **flag indicating if a complete word ends here**. This allows us to distinguish between "app" being a complete word vs. just a prefix of "apple".
approach: |
We implement the Trie using nodes, where each node contains:
- A dictionary/hashmap mapping characters to child nodes
- A boolean flag indicating if a word ends at this node
&nbsp;
**Step 1: Define the TrieNode structure**
- `children`: A dictionary to store child nodes, keyed by character
- `is_end_of_word`: A boolean flag, initially `False`
&nbsp;
**Step 2: Implement insert(word)**
- Start at the root node
- For each character in the word:
- If the character doesn't exist in current node's children, create a new node
- Move to the child node for this character
- After processing all characters, mark the final node as `is_end_of_word = True`
&nbsp;
**Step 3: Implement search(word)**
- Start at the root node
- For each character in the word:
- If the character doesn't exist in current node's children, return `False`
- Move to the child node for this character
- After processing all characters, return the value of `is_end_of_word`
- This distinguishes between finding a prefix vs. a complete word
&nbsp;
**Step 4: Implement startsWith(prefix)**
- Follow the same traversal as `search`
- The only difference: return `True` if we successfully traverse all characters
- We don't need to check `is_end_of_word` since we only care if the prefix exists
&nbsp;
The beauty of this approach is that all three operations share the same traversal logic, just with different termination conditions.
common_pitfalls:
- title: Confusing search() with startsWith()
description: |
A common mistake is implementing `search()` the same way as `startsWith()` — both traverse the trie, but they have different success conditions.
For example, if we insert "apple" and then call `search("app")`, we should return `False` because "app" was never inserted as a complete word. However, `startsWith("app")` should return `True` because "apple" starts with "app".
The fix: `search()` must check `is_end_of_word` after traversal, while `startsWith()` only needs to confirm the path exists.
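The difference can be seen in a minimal sketch that uses nested dictionaries instead of node objects. This is an illustration only: the `END` marker key and the helper names (`insert`, `walk`, `search`, `starts_with`) are assumptions made for this example, not the solution's API.

```python
from typing import Optional

END = "$"  # hypothetical end-of-word marker; never clashes with a-z

def insert(trie: dict, word: str) -> None:
    node = trie
    for ch in word:
        node = node.setdefault(ch, {})
    node[END] = True

def walk(trie: dict, s: str) -> Optional[dict]:
    """Follow s through the trie; return the final node or None."""
    node = trie
    for ch in s:
        if ch not in node:
            return None
        node = node[ch]
    return node

def search(trie: dict, word: str) -> bool:
    node = walk(trie, word)
    return node is not None and END in node  # path exists AND a word ends here

def starts_with(trie: dict, prefix: str) -> bool:
    return walk(trie, prefix) is not None  # path exists, end marker irrelevant

trie: dict = {}
insert(trie, "apple")
results = (search(trie, "app"), starts_with(trie, "app"))
print(results)  # (False, True)
```

Both functions share the same walk; only the final check differs.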
wrong_approach: "Return True after traversing the prefix successfully in search()"
correct_approach: "Check is_end_of_word flag after traversal for search(), not for startsWith()"
- title: Using Arrays Instead of Hash Maps
description: |
Some implementations use a fixed-size array of 26 elements (for lowercase letters a-z) instead of a hash map. While this works, it has drawbacks:
- Wastes memory for sparse nodes (most nodes won't have all 26 children)
- Less flexible if requirements change (e.g., supporting uppercase or other characters)
Using a hash map is more space-efficient for typical use cases and more adaptable.
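As a rough illustration of the sparsity argument, the sketch below counts child slots rather than bytes. It assumes one trie node per distinct prefix; `count_nodes` is a hypothetical helper. In Python a dictionary carries its own per-object overhead, so treat the numbers as a slot count and not a memory measurement.

```python
def count_nodes(words: list[str]) -> int:
    """Number of trie nodes (root included): one per distinct prefix."""
    prefixes = {word[:i] for word in words for i in range(len(word) + 1)}
    return len(prefixes)

words = ["app", "apple", "apply"]
nodes = count_nodes(words)

array_slots = nodes * 26   # every node reserves 26 child slots
dict_entries = nodes - 1   # each non-root node is one entry in its parent's map

print(nodes, array_slots, dict_entries)  # 7 182 6
```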
wrong_approach: "Fixed array children[26] for every node"
correct_approach: "Dictionary/hash map for children"
- title: Forgetting to Initialise the Root
description: |
The root node is special — it doesn't represent any character but serves as the starting point for all operations. Forgetting to initialise it in the constructor leads to null pointer errors.
Always create an empty root node in `__init__()` with an empty children dictionary.
key_takeaways:
- "**Trie fundamentals**: Each node has children (a map of character → node) and an `is_end_of_word` flag"
- "**Prefix sharing**: Tries naturally share common prefixes, making them memory-efficient for related words"
- "**O(m) operations**: All operations (insert, search, startsWith) run in O(m) time where m is the word/prefix length — independent of how many words are stored"
- "**Foundation for advanced problems**: Tries are essential for autocomplete, spell checking, word search, and problems like Word Search II"
time_complexity: "O(m) for all operations, where `m` is the length of the word or prefix being processed. We traverse at most `m` nodes."
space_complexity: "O(n * m) in the worst case, where `n` is the number of words and `m` is the average word length. However, shared prefixes reduce actual space usage significantly."
solutions:
- approach_name: Hash Map Based Trie
is_optimal: true
code: |
class TrieNode:
def __init__(self):
# Maps character -> child TrieNode
self.children: dict[str, 'TrieNode'] = {}
# True if a complete word ends at this node
self.is_end_of_word: bool = False
class Trie:
def __init__(self):
# Root node doesn't represent any character
self.root = TrieNode()
def insert(self, word: str) -> None:
node = self.root
for char in word:
# Create child node if it doesn't exist
if char not in node.children:
node.children[char] = TrieNode()
# Move to the child node
node = node.children[char]
# Mark the end of the word
node.is_end_of_word = True
def search(self, word: str) -> bool:
node = self._traverse(word)
# Word exists only if we found the path AND it's marked as end
return node is not None and node.is_end_of_word
def startsWith(self, prefix: str) -> bool:
# Prefix exists if we can traverse to it (don't need end marker)
return self._traverse(prefix) is not None
def _traverse(self, s: str) -> TrieNode | None:
"""Helper to traverse the trie following string s.
Returns the final node if path exists, None otherwise."""
node = self.root
for char in s:
if char not in node.children:
return None
node = node.children[char]
return node
explanation: |
**Time Complexity:** O(m) for all operations, where m is the length of the input string.
**Space Complexity:** O(n * m) worst case for storing n words of average length m.
This implementation uses a hash map for children, providing O(1) average-case lookup per character. The `_traverse` helper method eliminates code duplication between `search` and `startsWith`.
- approach_name: Array Based Trie
is_optimal: false
code: |
class TrieNode:
def __init__(self):
# Fixed array for 26 lowercase letters (a=0, b=1, ..., z=25)
self.children: list[TrieNode | None] = [None] * 26
self.is_end_of_word: bool = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word: str) -> None:
node = self.root
for char in word:
# Convert character to index (a=0, b=1, etc.)
index = ord(char) - ord('a')
if node.children[index] is None:
node.children[index] = TrieNode()
node = node.children[index]
node.is_end_of_word = True
def search(self, word: str) -> bool:
node = self._traverse(word)
return node is not None and node.is_end_of_word
def startsWith(self, prefix: str) -> bool:
return self._traverse(prefix) is not None
def _traverse(self, s: str) -> TrieNode | None:
node = self.root
for char in s:
index = ord(char) - ord('a')
if node.children[index] is None:
return None
node = node.children[index]
return node
explanation: |
**Time Complexity:** O(m) for all operations — same as hash map version.
**Space Complexity:** O(n * 26 * m) worst case, since each node allocates 26 slots.
This approach uses a fixed-size array instead of a hash map. It has O(1) guaranteed lookup (no hash collisions), but wastes memory for sparse nodes. Useful when you know the character set is small and fixed.


@@ -0,0 +1,194 @@
title: Insert Interval
slug: insert-interval
difficulty: medium
leetcode_id: 57
leetcode_url: https://leetcode.com/problems/insert-interval/
categories:
- arrays
patterns:
- intervals
description: |
You are given an array of non-overlapping intervals `intervals` where `intervals[i] = [start_i, end_i]` represent the start and the end of the i<sup>th</sup> interval and `intervals` is sorted in ascending order by `start_i`. You are also given an interval `newInterval = [start, end]` that represents the start and end of another interval.
Insert `newInterval` into `intervals` such that `intervals` is still sorted in ascending order by `start_i` and `intervals` still does not have any overlapping intervals (merge overlapping intervals if necessary).
Return `intervals` *after the insertion*.
**Note** that you don't need to modify `intervals` in-place. You can make a new array and return it.
constraints: |
- `0 <= intervals.length <= 10^4`
- `intervals[i].length == 2`
- `0 <= start_i <= end_i <= 10^5`
- `intervals` is sorted by `start_i` in **ascending** order
- `newInterval.length == 2`
- `0 <= start <= end <= 10^5`
examples:
- input: "intervals = [[1,3],[6,9]], newInterval = [2,5]"
output: "[[1,5],[6,9]]"
explanation: "The new interval [2,5] overlaps with [1,3], so they merge into [1,5]. The interval [6,9] doesn't overlap and remains unchanged."
- input: "intervals = [[1,2],[3,5],[6,7],[8,10],[12,16]], newInterval = [4,8]"
output: "[[1,2],[3,10],[12,16]]"
explanation: "The new interval [4,8] overlaps with [3,5], [6,7], and [8,10]. These all merge into [3,10]. Intervals [1,2] and [12,16] don't overlap."
- input: "intervals = [], newInterval = [5,7]"
output: "[[5,7]]"
explanation: "When there are no existing intervals, simply return the new interval."
explanation:
intuition: |
Imagine you have a timeline with several non-overlapping time blocks already scheduled, and you need to add a new meeting. Some existing blocks might overlap with your new meeting and need to be combined into one larger block.
The key insight is that the intervals are **already sorted** by start time. This means we can process them in order, and any interval that overlaps with our new interval must be *consecutive* in the list. There can't be a non-overlapping interval sandwiched between two overlapping ones.
Think of it as walking through a sorted list of events: first, we encounter events that end before our new event starts (no overlap, keep them). Then we hit events that overlap with ours (merge them all together). Finally, we see events that start after our merged event ends (no overlap, keep them too).
This three-phase approach lets us solve the problem in a single pass through the intervals.
approach: |
We solve this using a **Single Pass with Three Phases**:
**Step 1: Initialise result list**
- `result`: Empty list to store our final intervals
&nbsp;
**Step 2: Add all intervals that come before the new interval**
- Iterate through intervals while `intervals[i].end < newInterval.start`
- These intervals end before our new interval starts, so no overlap
- Add each of these directly to `result`
&nbsp;
**Step 3: Merge all overlapping intervals**
- Continue iterating while `intervals[i].start <= newInterval.end`
- These intervals overlap with our new interval: they start at or before its end, and Step 2 has already skipped every interval that ends before it starts
- Expand `newInterval` to encompass each overlapping interval:
- `newInterval.start = min(newInterval.start, intervals[i].start)`
- `newInterval.end = max(newInterval.end, intervals[i].end)`
- After processing all overlaps, add the merged `newInterval` to `result`
&nbsp;
**Step 4: Add all remaining intervals**
- Any remaining intervals start after our merged interval ends
- Add each of these directly to `result`
&nbsp;
**Step 5: Return the result**
- Return the `result` list containing all non-overlapping intervals
common_pitfalls:
- title: Forgetting to Handle Edge Cases
description: |
The new interval might need to be inserted at the very beginning (before all existing intervals), at the very end (after all existing intervals), or the input might be empty.
For example, with `intervals = [[3,5],[6,9]]` and `newInterval = [1,2]`, the new interval comes before everything. With `newInterval = [10,12]`, it comes after everything.
The three-phase approach handles these naturally: if no intervals come "before," the merge phase starts immediately. If no intervals overlap, the merge loop never runs and we add the new interval unchanged. If no intervals come "after," we are done as soon as the merged interval is appended.
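A quick way to convince yourself is to run the edge cases through a compact version of the three-phase loop. This is a sketch that assumes a non-mutating variant (it copies the bounds into local `start` and `end` instead of modifying the input list).

```python
def insert(intervals: list[list[int]], new_interval: list[int]) -> list[list[int]]:
    start, end = new_interval
    result = []
    i, n = 0, len(intervals)
    # Before: intervals that end before the new one starts
    while i < n and intervals[i][1] < start:
        result.append(intervals[i])
        i += 1
    # Merge: intervals that start at or before the new one ends
    while i < n and intervals[i][0] <= end:
        start = min(start, intervals[i][0])
        end = max(end, intervals[i][1])
        i += 1
    result.append([start, end])
    # After: everything left over
    result.extend(intervals[i:])
    return result

existing = [[3, 5], [6, 9]]
print(insert(existing, [1, 2]))    # [[1, 2], [3, 5], [6, 9]]
print(insert(existing, [10, 12]))  # [[3, 5], [6, 9], [10, 12]]
print(insert([], [5, 7]))          # [[5, 7]]
```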
wrong_approach: "Assuming the new interval always overlaps with something"
correct_approach: "Handle all three phases even if some are empty"
- title: Incorrect Overlap Detection
description: |
Two intervals `[a, b]` and `[c, d]` overlap if and only if `a <= d` AND `c <= b`. A common mistake is checking only one condition.
For example, `[1, 5]` and `[3, 7]` overlap because `1 <= 7` AND `3 <= 5`.
In our algorithm, we use: "no overlap before" means `end < newStart`, and "overlap" means `start <= newEnd`. These conditions partition all intervals correctly.
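The two-sided check can be written as a small predicate. The sketch below also includes a deliberately wrong one-sided version to show the false positive; both function names are hypothetical.

```python
def overlaps(a: list[int], b: list[int]) -> bool:
    """Closed intervals [a0, a1] and [b0, b1] overlap iff both conditions hold."""
    return a[0] <= b[1] and b[0] <= a[1]

def one_sided(a: list[int], b: list[int]) -> bool:
    """Buggy version: checks only one of the two conditions."""
    return a[0] <= b[1]

print(overlaps([1, 5], [3, 7]))   # True
print(overlaps([1, 2], [3, 4]))   # False
print(one_sided([1, 2], [3, 4]))  # True (false positive)
print(overlaps([4, 8], [8, 10]))  # True: touching endpoints count as overlap
```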
wrong_approach: "Checking only if starts overlap or only if ends overlap"
correct_approach: "Check both conditions: interval.end >= new.start AND interval.start <= new.end"
- title: Mutating the New Interval Incorrectly
description: |
When merging, you must expand `newInterval` using both `min` for the start and `max` for the end. A common bug is only updating one bound.
For example, merging `[4, 8]` with `[3, 5]` should give `[3, 8]`, not `[4, 8]` or `[3, 5]`.
wrong_approach: "Only updating end or only updating start during merge"
correct_approach: "Always update: start = min(start, interval.start), end = max(end, interval.end)"
key_takeaways:
- "**Intervals pattern**: When intervals are sorted, overlapping intervals are always consecutive, enabling single-pass solutions"
- "**Three-phase structure**: Before, during, and after overlap is a common pattern for interval insertion and merging problems"
- "**Overlap condition**: Two intervals `[a,b]` and `[c,d]` overlap if and only if `max(a,c) <= min(b,d)`"
- "**Foundation for harder problems**: This technique extends to Merge Intervals, Meeting Rooms, and interval scheduling problems"
time_complexity: "O(n). We traverse the list of intervals exactly once, performing constant-time operations for each interval."
space_complexity: "O(n). We create a new result list that stores all intervals. In the worst case (no merging), this contains n+1 intervals."
solutions:
- approach_name: Single Pass with Three Phases
is_optimal: true
code: |
def insert(intervals: list[list[int]], newInterval: list[int]) -> list[list[int]]:
    result = []
    i = 0
    n = len(intervals)

    # Phase 1: Add all intervals that end before newInterval starts
    while i < n and intervals[i][1] < newInterval[0]:
        result.append(intervals[i])
        i += 1

    # Phase 2: Merge all overlapping intervals with newInterval
    while i < n and intervals[i][0] <= newInterval[1]:
        # Expand newInterval to include the overlapping interval
        newInterval[0] = min(newInterval[0], intervals[i][0])
        newInterval[1] = max(newInterval[1], intervals[i][1])
        i += 1
    # Add the merged interval
    result.append(newInterval)

    # Phase 3: Add all intervals that start after newInterval ends
    while i < n:
        result.append(intervals[i])
        i += 1
    return result
explanation: |
**Time Complexity:** O(n) — Single pass through all intervals.
**Space Complexity:** O(n) — Result list stores up to n+1 intervals.
We process intervals in three phases: (1) add non-overlapping intervals before, (2) merge all overlapping intervals into one, (3) add non-overlapping intervals after. The sorted property guarantees overlapping intervals are consecutive.
- approach_name: Binary Search Optimisation
is_optimal: false
code: |
import bisect

def insert(intervals: list[list[int]], newInterval: list[int]) -> list[list[int]]:
    if not intervals:
        return [newInterval]

    # Find where overlaps might start and end using binary search
    starts = [interval[0] for interval in intervals]
    ends = [interval[1] for interval in intervals]

    # Find first interval that might overlap (ends >= newInterval start)
    left = bisect.bisect_left(ends, newInterval[0])
    # Find last interval that might overlap (starts <= newInterval end)
    right = bisect.bisect_right(starts, newInterval[1])

    # If there are overlapping intervals, merge them
    if left < right:
        newInterval[0] = min(newInterval[0], intervals[left][0])
        newInterval[1] = max(newInterval[1], intervals[right - 1][1])

    # Build result: before + merged + after
    return intervals[:left] + [newInterval] + intervals[right:]
explanation: |
**Time Complexity:** O(n) — While binary search is O(log n), slicing creates new lists in O(n).
**Space Complexity:** O(n) — Creating lists for starts, ends, and the result.
This approach uses binary search to find the range of overlapping intervals quickly. While binary search itself is O(log n), the overall complexity remains O(n) due to list slicing. This approach is more elegant but not faster in practice. It's included to show how binary search can identify overlap boundaries.
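To see what the two `bisect` calls return, here is a trace on the second example from the problem statement. The variable names are chosen for illustration.

```python
import bisect

intervals = [[1, 2], [3, 5], [6, 7], [8, 10], [12, 16]]
new_start, new_end = 4, 8

starts = [s for s, _ in intervals]  # [1, 3, 6, 8, 12]
ends = [e for _, e in intervals]    # [2, 5, 7, 10, 16]

left = bisect.bisect_left(ends, new_start)    # first interval with end >= 4
right = bisect.bisect_right(starts, new_end)  # first interval with start > 8

print(left, right)            # 1 4
print(intervals[left:right])  # [[3, 5], [6, 7], [8, 10]]
```

The slice `intervals[left:right]` is exactly the run of overlapping intervals, which is why only its first start and last end are needed for the merge.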


@@ -0,0 +1,182 @@
title: Insert into a Binary Search Tree
slug: insert-into-a-binary-search-tree
difficulty: medium
leetcode_id: 701
leetcode_url: https://leetcode.com/problems/insert-into-a-binary-search-tree/
categories:
- trees
patterns:
- tree-traversal
description: |
You are given the `root` node of a binary search tree (BST) and a `val` to insert into the tree. Return *the root node of the BST after the insertion*. It is **guaranteed** that the new value does not exist in the original BST.
**Notice** that there may exist multiple valid ways for the insertion, as long as the tree remains a BST after insertion. You can return **any of them**.
constraints: |
- `0 <= Number of nodes <= 10^4`
- `-10^8 <= Node.val <= 10^8`
- All values `Node.val` are **unique**
- `-10^8 <= val <= 10^8`
- It's **guaranteed** that `val` does not exist in the original BST
examples:
- input: "root = [4,2,7,1,3], val = 5"
output: "[4,2,7,1,3,5]"
explanation: "Insert 5 as the left child of 7, since 5 < 7 and 7 has no left child. Another valid answer would insert 5 elsewhere while maintaining BST properties."
- input: "root = [40,20,60,10,30,50,70], val = 25"
output: "[40,20,60,10,30,50,70,null,null,25]"
explanation: "Go left from 40 (since 25 < 40), right from 20 (since 25 > 20), then left from 30 (since 25 < 30). Since 30 has no left child, insert 25 there."
- input: "root = [], val = 5"
output: "[5]"
explanation: "When the tree is empty, the new value becomes the root."
explanation:
intuition: |
Think of a BST as a decision tree for binary search. At each node, you ask: "Is my value smaller or larger than this node?" The answer tells you which direction to go — left for smaller, right for larger.
The key insight is that **every value has exactly one "correct" leaf position** where it can be inserted without restructuring the tree. You simply follow the BST property until you hit an empty spot (`None`), and that's where the new node belongs.
Imagine you're looking up a word in a dictionary. You flip to the middle, decide if your word comes before or after, and keep narrowing down. When you reach the exact spot where your word *would* be if it existed, that's where you insert it.
This means insertion is essentially a **search that ends at a null pointer**, and we replace that null with our new node.
approach: |
We solve this using a **Recursive BST Traversal**:
**Step 1: Handle the base case**
- If `root` is `None`, we've found the insertion point
- Create and return a new `TreeNode` with the given value
&nbsp;
**Step 2: Decide which subtree to explore**
- If `val < root.val`, the new node belongs in the **left subtree**
- If `val > root.val`, the new node belongs in the **right subtree**
&nbsp;
**Step 3: Recursively insert and connect**
- Make a recursive call on the appropriate child
- Assign the result back to `root.left` or `root.right`
- This automatically handles the case where the child was `None`
&nbsp;
**Step 4: Return the root**
- Return the (unchanged) root to maintain the tree structure
- The recursive assignment ensures the new node gets properly linked
&nbsp;
The recursion naturally terminates when we reach a `None` child, which becomes the insertion point. By assigning the recursive result back to the parent's child pointer, we elegantly connect the new node.
common_pitfalls:
- title: Forgetting to Handle the Empty Tree
description: |
When `root` is `None` (empty tree), you must return a new node as the root. Some solutions only handle the case of inserting into existing trees, causing a crash or returning `None` for empty input.
Always check for `root is None` as your base case and return the new node.
wrong_approach: "Only handling non-empty trees"
correct_approach: "Check if root is None and return new TreeNode(val)"
- title: Not Connecting the New Node
description: |
A common mistake is to traverse to the correct position but forget to actually link the new node to its parent. Simply creating a new node isn't enough — you must assign it to the parent's `left` or `right` pointer.
The recursive approach handles this elegantly by assigning `root.left = insert_into_bst(root.left, val)`.
wrong_approach: "Creating node without linking to parent"
correct_approach: "Assign recursive result back to parent's child pointer"
- title: Modifying Existing Node Values
description: |
BST insertion adds a **new node**, not modifying an existing one. Don't try to swap values or restructure the tree — simply find the correct empty spot and insert there.
The problem guarantees the value doesn't exist, so you'll always reach a `None` position.
wrong_approach: "Changing existing node values"
correct_approach: "Always insert as a new leaf node"
key_takeaways:
- "**BST property drives insertion**: Left for smaller, right for larger — no complex logic needed"
- "**Recursion simplifies tree operations**: The base case handles insertion, recursive calls handle navigation"
- "**Insertion is search + create**: Follow the search path until hitting `None`, then insert"
- "**Foundation for BST operations**: This pattern extends to deletion, search, and validation problems"
time_complexity: "O(h) where h is the height of the tree. In a balanced BST, h = log(n), giving O(log n). In the worst case (skewed tree), h = n, giving O(n)."
space_complexity: "O(h) for the recursion stack. In a balanced tree this is O(log n), worst case O(n) for a skewed tree. The iterative approach achieves O(1) space."
solutions:
- approach_name: Recursive
is_optimal: true
code: |
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def insert_into_bst(root: TreeNode | None, val: int) -> TreeNode:
    # Base case: found the insertion point
    if root is None:
        return TreeNode(val)
    # Decide which subtree to insert into
    if val < root.val:
        # Value belongs in left subtree
        root.left = insert_into_bst(root.left, val)
    else:
        # Value belongs in right subtree
        root.right = insert_into_bst(root.right, val)
    # Return root to maintain tree structure
    return root
explanation: |
**Time Complexity:** O(h) — We traverse from root to a leaf, where h is the tree height.
**Space Complexity:** O(h) — Recursion stack depth equals the path length.
The recursive approach elegantly handles both navigation and connection. When we hit `None`, we return the new node, and the parent's assignment (`root.left = ...`) automatically links it.
- approach_name: Iterative
is_optimal: true
code: |
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def insert_into_bst(root: TreeNode | None, val: int) -> TreeNode:
    # Handle empty tree
    if root is None:
        return TreeNode(val)
    # Find the correct position
    current = root
    while True:
        if val < current.val:
            # Go left
            if current.left is None:
                # Found insertion point
                current.left = TreeNode(val)
                break
            current = current.left
        else:
            # Go right
            if current.right is None:
                # Found insertion point
                current.right = TreeNode(val)
                break
            current = current.right
    return root
explanation: |
**Time Complexity:** O(h) — Same traversal as recursive approach.
**Space Complexity:** O(1) — No recursion stack, only a single pointer.
The iterative approach uses a while loop to find the insertion point, then directly links the new node. This avoids recursion overhead and is slightly more efficient in practice, though both have the same time complexity.
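One way to sanity-check either version is to confirm that an in-order traversal is still sorted after the insertion. The sketch below rebuilds the first example by repeated insertion; the `inorder` helper and the compact recursive insert are written for this illustration only.

```python
from typing import Optional

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def insert_into_bst(root: Optional[TreeNode], val: int) -> TreeNode:
    if root is None:
        return TreeNode(val)
    if val < root.val:
        root.left = insert_into_bst(root.left, val)
    else:
        root.right = insert_into_bst(root.right, val)
    return root

def inorder(node: Optional[TreeNode]) -> list[int]:
    """In-order traversal; sorted output means the BST property holds."""
    if node is None:
        return []
    return inorder(node.left) + [node.val] + inorder(node.right)

root = None
for value in (4, 2, 7, 1, 3):  # builds the tree [4,2,7,1,3]
    root = insert_into_bst(root, value)

root = insert_into_bst(root, 5)
print(inorder(root))        # [1, 2, 3, 4, 5, 7]
print(root.right.left.val)  # 5 (left child of 7)
```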


@@ -0,0 +1,215 @@
title: Integer Break
slug: integer-break
difficulty: medium
leetcode_id: 343
leetcode_url: https://leetcode.com/problems/integer-break/
categories:
- dynamic-programming
- math
patterns:
- dynamic-programming
- greedy
description: |
Given an integer `n`, break it into the sum of `k` **positive integers**, where `k >= 2`, and maximize the product of those integers.
Return *the maximum product you can get*.
constraints: |
- `2 <= n <= 58`
examples:
- input: "n = 2"
output: "1"
explanation: "2 = 1 + 1, 1 x 1 = 1."
- input: "n = 10"
output: "36"
explanation: "10 = 3 + 3 + 4, 3 x 3 x 4 = 36."
explanation:
intuition: |
Imagine you have a rope of length `n` and you must cut it into at least two pieces. You want the **product of the piece lengths** to be as large as possible.
The key mathematical insight is: **3s are magical**. When you break a number into parts, using 3s (with some 2s for adjustment) produces the maximum product.
Why? Consider that for any number greater than 4, breaking off a 3 gives a better product than keeping it whole. For example, `6` as a single piece contributes 6 to the product, but `3 + 3` contributes `3 x 3 = 9`.
Think of it like this: you're trying to pack as many 3s as possible because they're the most "efficient" multiplier. The exceptions are:
- If the remainder is 1, you should take one 3 back and make two 2s instead (because `2 x 2 = 4 > 3 x 1 = 3`)
- If the remainder is 2, just keep it as a 2
This greedy insight can also be approached with dynamic programming, where we build up optimal products for smaller numbers.
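The claim about breaking off a 3 is easy to check numerically. This sketch computes the gain from replacing a piece `k` with the two pieces `3` and `k - 3`, which is `3 * (k - 3) - k`; the variable names are illustrative.

```python
# Gain from replacing a single piece k with the pieces 3 and (k - 3)
gains = {k: 3 * (k - 3) - k for k in range(2, 9)}
print(gains)  # {2: -5, 3: -3, 4: -1, 5: 1, 6: 3, 7: 5, 8: 7}

# The gain is positive for every piece from 5 up to the constraint limit
always_better = all(3 * (k - 3) > k for k in range(5, 59))
print(always_better)  # True

# Remainder-1 adjustment: 2 x 2 beats 3 x 1
print(2 * 2, 3 * 1)  # 4 3
```

The gain is negative for 2, 3 and 4, which is why those pieces are left whole once they appear in a split.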
approach: |
We can solve this using either a **Mathematical (Greedy)** approach or **Dynamic Programming**. The math approach is O(1), but DP helps understand the structure.
**Mathematical Approach:**
**Step 1: Handle base cases**
- If `n == 2`: Return `1` (must split into `1 + 1`)
- If `n == 3`: Return `2` (must split into `1 + 2`, giving `1 x 2 = 2`)
&nbsp;
**Step 2: Divide n by 3 to determine the split**
- If `n % 3 == 0`: Use all 3s. The answer is `3^(n/3)`
- If `n % 3 == 1`: Use one fewer 3 and add two 2s. The answer is `3^(n // 3 - 1) x 4`, where `//` is integer division
- If `n % 3 == 2`: Use all 3s plus one 2. The answer is `3^(n // 3) x 2`
&nbsp;
**Dynamic Programming Approach:**
**Step 1: Initialise the DP array**
- Create array `dp` of size `n + 1` where `dp[i]` represents the maximum product for integer `i`
- Set `dp[1] = 1` as the base case
&nbsp;
**Step 2: Fill the DP table**
- For each `i` from `2` to `n`:
- Try every possible first cut `j` from `1` to `i - 1`
- The product is either `j x (i - j)` (if we don't break further) or `j x dp[i - j]` (if we continue breaking)
- Take the maximum across all cuts
&nbsp;
**Step 3: Return the result**
- Return `dp[n]`
common_pitfalls:
- title: Forgetting Base Cases
description: |
For `n = 2` and `n = 3`, the optimal "unforced" choice would be to not break at all, but the problem requires at least 2 pieces.
For `n = 2`: Must return `1` (from `1 + 1`)
For `n = 3`: Must return `2` (from `1 + 2`), not `3`
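When using DP, remember that 2 and 3 appearing as *parts* of a larger number can contribute their full value (2 and 3), since the "must break" constraint only applies to the original number. In the DP solution, `dp[2] = 1` and `dp[3] = 2` still hold the forced-split answers; the `j x (i - j)` option is what lets an unbroken 2 or 3 count at full value.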
wrong_approach: "Returning 2 for n=2 or 3 for n=3"
correct_approach: "Handle n=2 and n=3 as special cases with forced splits"
- title: Not Considering Both Options in DP
description: |
When computing `dp[i]`, for each cut position `j`, you must consider two options:
- `j x (i - j)`: Don't break the remaining part further
- `j x dp[i - j]`: Break the remaining part optimally
Missing the first option means you miss cases where not breaking further is optimal. For example, when `i = 4` and `j = 2`, the answer is `2 x 2 = 4`, not `2 x dp[2] = 2 x 1 = 2`.
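The effect of dropping the first option can be reproduced directly. In this sketch the `consider_unbroken` flag is a hypothetical switch added purely to compare the buggy and correct transitions.

```python
def integer_break(n: int, consider_unbroken: bool = True) -> int:
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        for j in range(1, i):
            candidate = j * dp[i - j]  # break (i - j) further
            if consider_unbroken:
                candidate = max(candidate, j * (i - j))  # keep (i - j) whole
            dp[i] = max(dp[i], candidate)
    return dp[n]

print(integer_break(4, consider_unbroken=False))  # 3 (wrong)
print(integer_break(4))                           # 4 (correct: 2 x 2)
```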
wrong_approach: "Only considering j x dp[i - j]"
correct_approach: "max(j x (i - j), j x dp[i - j]) for each j"
- title: Using 1s in the Split
description: |
Including 1 in your split is almost always suboptimal. For any factor of 1, you could add that 1 to another factor to increase the product.
For example, `3 + 3 + 1` gives `3 x 3 x 1 = 9`, but `3 + 4` gives `3 x 4 = 12`.
The only time 1 appears is in forced base cases (`n = 2` and `n = 3`).
wrong_approach: "Splitting into parts that include 1"
correct_approach: "Only use 2s and 3s (except for base cases)"
key_takeaways:
- "**The power of 3**: For maximising products, 3 is the optimal factor (provable via calculus or discrete analysis)"
- "**Greedy meets math**: Sometimes a mathematical insight replaces the need for DP entirely, reducing O(n^2) to O(1)"
- "**DP transition structure**: The pattern of choosing whether to break further (`j x (i-j)` vs `j x dp[i-j]`) appears in many partition problems"
- "**Related problems**: This connects to *Cutting a Rod*, *Partition Equal Subset Sum*, and other optimisation over partitions"
time_complexity: "O(1) for the mathematical approach. O(n^2) for dynamic programming, as we compute each `dp[i]` by iterating through all possible cuts."
space_complexity: "O(1) for the mathematical approach. O(n) for dynamic programming to store the `dp` array."
solutions:
- approach_name: Mathematical (Greedy)
is_optimal: true
code: |
def integer_break(n: int) -> int:
# Base cases: must split, but would prefer not to
if n == 2:
return 1 # 1 + 1 = 2, product = 1
if n == 3:
return 2 # 1 + 2 = 3, product = 2
# For n >= 4, use as many 3s as possible
if n % 3 == 0:
# n is divisible by 3, use all 3s
return 3 ** (n // 3)
elif n % 3 == 1:
# Remainder 1: take one 3 back, use 2 + 2 instead
# (because 2 x 2 = 4 > 3 x 1 = 3)
return 3 ** (n // 3 - 1) * 4
else:
# Remainder 2: use all 3s plus one 2
return 3 ** (n // 3) * 2
explanation: |
**Time Complexity:** O(1) — Just arithmetic operations (exponentiation is O(log n) but with small exponents here).
**Space Complexity:** O(1) — Only a few variables used.
The mathematical insight is that 3 is the optimal factor. For any `n >= 5`, breaking off a 3 and multiplying gives a larger product than keeping the number whole. We handle remainders: if `n % 3 == 1`, we use `2 + 2` instead of `3 + 1` since `4 > 3`.
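As a sanity check rather than a proof, the closed form can be compared against the DP recurrence. The sketch below restates both approaches in compact form; the helper names and the upper bound of 58 are choices made for this illustration only.

```python
def integer_break_math(n: int) -> int:
    # Forced splits for the two smallest inputs: 2 -> 1, 3 -> 2
    if n <= 3:
        return n - 1
    q, r = divmod(n, 3)
    if r == 0:
        return 3 ** q
    if r == 1:
        return 3 ** (q - 1) * 4  # trade 3 + 1 for 2 + 2
    return 3 ** q * 2


def integer_break_dp(n: int) -> int:
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        for j in range(1, i):
            dp[i] = max(dp[i], j * (i - j), j * dp[i - j])
    return dp[n]


# The two local facts the greedy argument relies on
assert all(3 * (n - 3) > n for n in range(5, 59))
assert 2 * 2 > 3 * 1

# The closed form agrees with the DP on every tested input
assert all(integer_break_math(n) == integer_break_dp(n) for n in range(2, 59))
print(integer_break_math(10))  # 3 * 3 * 4 = 36
```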
- approach_name: Dynamic Programming
is_optimal: false
code: |
def integer_break(n: int) -> int:
# dp[i] = maximum product for integer i
dp = [0] * (n + 1)
dp[1] = 1 # Base case
for i in range(2, n + 1):
for j in range(1, i):
# Option 1: don't break (i - j) further
# Option 2: break (i - j) optimally using dp
product = max(j * (i - j), j * dp[i - j])
dp[i] = max(dp[i], product)
return dp[n]
explanation: |
**Time Complexity:** O(n^2) — For each `i` from 2 to n, we try all cuts from 1 to i-1.
**Space Complexity:** O(n) — We store the dp array of size n+1.
For each number `i`, we try every possible first cut `j`. The remaining part `i - j` can either stay whole (giving `j * (i - j)`) or be broken further (giving `j * dp[i - j]`). We take the maximum across all possibilities.
- approach_name: Recursion with Memoization
is_optimal: false
code: |
def integer_break(n: int) -> int:
memo = {}
def helper(num: int, must_break: bool) -> int:
# If we've computed this before, return cached result
if (num, must_break) in memo:
return memo[(num, must_break)]
# Base case
if num <= 1:
return num
# If we don't have to break, we can return num itself
if not must_break:
result = num
else:
result = 0
# Try all possible first cuts
for i in range(1, num):
# First piece is i, remaining is num - i (which can stay whole)
product = i * helper(num - i, False)
result = max(result, product)
memo[(num, must_break)] = result
return result
# Start with must_break=True since we need at least 2 pieces
return helper(n, True)
explanation: |
**Time Complexity:** O(n^2) — Each subproblem is solved once, and each takes O(n) to compute.
**Space Complexity:** O(n) — Memoization cache plus recursion stack.
This top-down approach explicitly tracks whether we're forced to break. The original call has `must_break=True`, but recursive calls for remaining parts use `must_break=False` since they can stay whole if that's optimal.

@@ -0,0 +1,249 @@
title: Interleaving String
slug: interleaving-string
difficulty: medium
leetcode_id: 97
leetcode_url: https://leetcode.com/problems/interleaving-string/
categories:
- strings
- dynamic-programming
patterns:
- dynamic-programming
description: |
Given strings `s1`, `s2`, and `s3`, find whether `s3` is formed by an **interleaving** of `s1` and `s2`.
An **interleaving** of two strings `s` and `t` is a configuration where `s` and `t` are divided into `n` and `m` substrings respectively, such that:
- `s = s1 + s2 + ... + sn`
- `t = t1 + t2 + ... + tm`
- `|n - m| <= 1`
- The **interleaving** is `s1 + t1 + s2 + t2 + s3 + t3 + ...` or `t1 + s1 + t2 + s2 + t3 + s3 + ...`
**Note:** `a + b` is the concatenation of strings `a` and `b`.
constraints: |
- `0 <= s1.length, s2.length <= 100`
- `0 <= s3.length <= 200`
- `s1`, `s2`, and `s3` consist of lowercase English letters.
examples:
- input: 's1 = "aabcc", s2 = "dbbca", s3 = "aadbbcbcac"'
output: "true"
explanation: "One way to obtain s3 is: Split s1 into s1 = \"aa\" + \"bc\" + \"c\", and s2 into s2 = \"dbbc\" + \"a\". Interleaving the two splits, we get \"aa\" + \"dbbc\" + \"bc\" + \"a\" + \"c\" = \"aadbbcbcac\"."
- input: 's1 = "aabcc", s2 = "dbbca", s3 = "aadbbbaccc"'
output: "false"
explanation: "It is impossible to interleave s2 with any other string to obtain s3."
- input: 's1 = "", s2 = "", s3 = ""'
output: "true"
explanation: "Two empty strings trivially interleave to form an empty string."
explanation:
intuition: |
Imagine you have two decks of cards (representing `s1` and `s2`), and you want to merge them into a single pile (`s3`) while preserving the relative order of cards within each original deck. The question is: can `s3` be formed by picking cards alternately (or in any valid interleaving pattern) from the tops of these two decks?
The key insight is that at any point in building `s3`, you have a **choice**: take the next character from `s1` or from `s2`. This decision tree branches exponentially, but many branches lead to the same "state" — defined by how many characters we've used from each string.
Think of it as navigating a 2D grid where the rows represent progress through `s1` and the columns represent progress through `s2`. Starting at `(0, 0)`, you want to reach `(len(s1), len(s2))`. At each cell `(i, j)`, you can move down (use a character from `s1`) or right (use a character from `s2`), but only if that character matches the next character needed in `s3`.
This grid perspective reveals the **optimal substructure**: whether we can reach `(i, j)` depends only on whether we could reach `(i-1, j)` or `(i, j-1)` with a matching character. This is the hallmark of dynamic programming.
approach: |
We solve this using **2D Dynamic Programming**:
**Step 1: Early termination check**
- If `len(s1) + len(s2) != len(s3)`, return `False` immediately — the lengths don't match, so interleaving is impossible
&nbsp;
**Step 2: Initialize the DP table**
- Create a 2D boolean table `dp` of size `(len(s1) + 1) x (len(s2) + 1)`
- `dp[i][j]` represents: "Can the first `i` characters of `s1` and first `j` characters of `s2` interleave to form the first `i + j` characters of `s3`?"
- Set `dp[0][0] = True` — empty strings trivially interleave to form an empty string
&nbsp;
**Step 3: Fill the first row and column**
- First row (`dp[0][j]`): Can `s2[:j]` alone form `s3[:j]`? Only if all characters match sequentially
- First column (`dp[i][0]`): Can `s1[:i]` alone form `s3[:i]`? Only if all characters match sequentially
- These represent paths that use only one string
&nbsp;
**Step 4: Fill the rest of the table**
- For each cell `dp[i][j]`, check two possibilities:
- **From above** (`dp[i-1][j]`): If `s1[:i-1]` and `s2[:j]` can form `s3[:i+j-1]` and `s1[i-1] == s3[i+j-1]`, then `dp[i][j] = True`
- **From the left** (`dp[i][j-1]`): If `s1[:i]` and `s2[:j-1]` can form `s3[:i+j-1]` and `s2[j-1] == s3[i+j-1]`, then `dp[i][j] = True`
- Either path being valid makes the current state valid
&nbsp;
**Step 5: Return the answer**
- Return `dp[len(s1)][len(s2)]` — whether we can use all of both strings to form all of `s3`
common_pitfalls:
- title: Exponential Brute Force
description: |
A naive recursive approach tries every possible way to interleave:
- At each position in `s3`, try matching with `s1` or `s2`
- This leads to `2^(m+n)` possibilities in the worst case
With `s1.length, s2.length <= 100`, this means up to `2^200` operations — astronomically too slow. The key insight is that many recursive calls compute the same subproblem (same `(i, j)` position), making this a perfect candidate for memoization or bottom-up DP.
wrong_approach: "Recursive backtracking without memoization"
correct_approach: "Dynamic programming with O(m*n) states"
- title: Forgetting the Length Check
description: |
If `len(s1) + len(s2) != len(s3)`, it's impossible to interleave — every character from `s1` and `s2` must appear exactly once in `s3`.
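Without this early check, the DP only inspects the first `m + n` characters of `s3`, so `s1 = "a"`, `s2 = "b"`, `s3 = "abc"` would be wrongly accepted (the trailing `c` is never examined). If `s3` is shorter than `m + n`, indexing `s3[i + j - 1]` can raise an `IndexError` instead. Always verify lengths first.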
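A small sketch of the failure mode, assuming a hypothetical `check_length` flag added only for this illustration. With the check disabled, the table is filled from the first `m + n` characters of `s3` and the extra character goes unnoticed:

```python
def is_interleave(s1: str, s2: str, s3: str, check_length: bool = True) -> bool:
    # check_length is an illustrative switch, not part of the reference solution
    m, n = len(s1), len(s2)
    if check_length and m + n != len(s3):
        return False
    dp = [[False] * (n + 1) for _ in range(m + 1)]
    dp[0][0] = True
    for i in range(m + 1):
        for j in range(n + 1):
            # Consume s1[i-1] (move down one row)
            if i > 0 and dp[i - 1][j] and s1[i - 1] == s3[i + j - 1]:
                dp[i][j] = True
            # Consume s2[j-1] (move right one column)
            if j > 0 and dp[i][j - 1] and s2[j - 1] == s3[i + j - 1]:
                dp[i][j] = True
    return dp[m][n]


print(is_interleave("a", "b", "abc"))                      # False
print(is_interleave("a", "b", "abc", check_length=False))  # True (false positive)
```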
wrong_approach: "Skip length validation"
correct_approach: "Check len(s1) + len(s2) == len(s3) upfront"
- title: Off-by-One Index Errors
description: |
The DP table has dimensions `(m+1) x (n+1)` to handle empty prefixes. When accessing characters:
- `dp[i][j]` uses `s1[i-1]` and `s2[j-1]` (0-indexed strings)
- The corresponding `s3` character is at index `i + j - 1`
Confusing 0-indexed strings with 1-indexed DP indices is a common source of bugs. Draw the grid and trace through an example to verify your indexing.
wrong_approach: "Using dp[i][j] with s1[i] and s2[j]"
correct_approach: "Using dp[i][j] with s1[i-1] and s2[j-1]"
key_takeaways:
- "**2D DP for two sequences**: When combining or comparing two sequences, think of a 2D grid where axes represent progress through each sequence"
- "**State definition is crucial**: Here, `dp[i][j]` captures whether prefixes of length `i` and `j` can form a prefix of `s3` — a clean, sufficient state"
- "**Space optimization possible**: The follow-up asks for `O(s2.length)` space — since each row only depends on the previous row and current row, you can use a 1D array"
- "**Early termination**: Simple checks like length validation can save significant computation and handle edge cases cleanly"
time_complexity: "O(m * n). We fill a 2D table of size `(m+1) x (n+1)` where `m = len(s1)` and `n = len(s2)`, with O(1) work per cell."
space_complexity: "O(m * n). We store a 2D boolean table. This can be optimized to O(n) by using a 1D array and updating in-place."
solutions:
- approach_name: 2D Dynamic Programming
is_optimal: true
code: |
def is_interleave(s1: str, s2: str, s3: str) -> bool:
m, n = len(s1), len(s2)
# Early termination: lengths must match
if m + n != len(s3):
return False
# dp[i][j] = can s1[:i] and s2[:j] interleave to form s3[:i+j]?
dp = [[False] * (n + 1) for _ in range(m + 1)]
# Base case: empty strings form empty string
dp[0][0] = True
# Fill first column: using only s1
for i in range(1, m + 1):
dp[i][0] = dp[i - 1][0] and s1[i - 1] == s3[i - 1]
# Fill first row: using only s2
for j in range(1, n + 1):
dp[0][j] = dp[0][j - 1] and s2[j - 1] == s3[j - 1]
# Fill the rest of the table
for i in range(1, m + 1):
for j in range(1, n + 1):
# Current position in s3
k = i + j - 1
# Can we get here from above (dp[i-1][j], consuming s1[i-1])?
from_s1 = dp[i - 1][j] and s1[i - 1] == s3[k]
# Can we get here from the left (dp[i][j-1], consuming s2[j-1])?
from_s2 = dp[i][j - 1] and s2[j - 1] == s3[k]
dp[i][j] = from_s1 or from_s2
return dp[m][n]
explanation: |
**Time Complexity:** O(m * n) — We iterate through every cell in the DP table once.
**Space Complexity:** O(m * n) — We store the full 2D table.
This solution builds up the answer by considering all valid ways to consume characters from `s1` and `s2`. Each cell represents a subproblem that's computed exactly once.
- approach_name: Space-Optimized DP (1D Array)
is_optimal: true
code: |
def is_interleave(s1: str, s2: str, s3: str) -> bool:
m, n = len(s1), len(s2)
# Early termination: lengths must match
if m + n != len(s3):
return False
# Use 1D array: dp[j] represents dp[i][j] for current row i
dp = [False] * (n + 1)
# Fill the DP table row by row
for i in range(m + 1):
for j in range(n + 1):
if i == 0 and j == 0:
dp[j] = True
elif i == 0:
# First row: only using s2
dp[j] = dp[j - 1] and s2[j - 1] == s3[j - 1]
elif j == 0:
# First column: only using s1
dp[j] = dp[j] and s1[i - 1] == s3[i - 1]
else:
# General case: from above (s1, old dp[j]) or from the left (s2, dp[j-1])
k = i + j - 1
dp[j] = (dp[j] and s1[i - 1] == s3[k]) or \
(dp[j - 1] and s2[j - 1] == s3[k])
return dp[n]
explanation: |
**Time Complexity:** O(m * n) — Same iteration as 2D approach.
**Space Complexity:** O(n) — Only one row of the DP table is stored.
This answers the follow-up question. Since each row only depends on the current and previous row values, we can overwrite the array in-place. `dp[j]` holds the "from above" value before we update it, and `dp[j-1]` holds the already-updated "from left" value.
- approach_name: Recursive with Memoization
is_optimal: false
code: |
def is_interleave(s1: str, s2: str, s3: str) -> bool:
m, n = len(s1), len(s2)
if m + n != len(s3):
return False
# Memoization cache
memo = {}
def dp(i: int, j: int) -> bool:
# Base case: used all characters
if i == m and j == n:
return True
# Check cache
if (i, j) in memo:
return memo[(i, j)]
k = i + j # Current position in s3
result = False
# Try using next character from s1
if i < m and s1[i] == s3[k]:
result = dp(i + 1, j)
# Try using next character from s2
if not result and j < n and s2[j] == s3[k]:
result = dp(i, j + 1)
memo[(i, j)] = result
return result
return dp(0, 0)
explanation: |
**Time Complexity:** O(m * n) — Each unique `(i, j)` pair is computed once.
**Space Complexity:** O(m * n) — For the memoization cache, plus O(m + n) recursion stack.
This top-down approach is often more intuitive to write. It explores the decision tree but caches results to avoid redundant computation. The bottom-up DP is generally preferred for avoiding stack overflow on large inputs.

@@ -0,0 +1,214 @@
title: Invert Binary Tree
slug: invert-binary-tree
difficulty: easy
leetcode_id: 226
leetcode_url: https://leetcode.com/problems/invert-binary-tree/
categories:
- trees
- recursion
patterns:
- tree-traversal
- dfs
- bfs
description: |
Given the `root` of a binary tree, invert the tree, and return *its root*.
Inverting a binary tree means swapping the left and right children of every node in the tree, creating a mirror image of the original structure.
constraints: |
- The number of nodes in the tree is in the range `[0, 100]`
- `-100 <= Node.val <= 100`
examples:
- input: "root = [4,2,7,1,3,6,9]"
output: "[4,7,2,9,6,3,1]"
explanation: "The tree is inverted by swapping left and right children at every level. Node 4's children swap (2↔7), then node 7's children swap (6↔9) and node 2's children swap (1↔3)."
- input: "root = [2,1,3]"
output: "[2,3,1]"
explanation: "The root's left child (1) and right child (3) are swapped."
- input: "root = []"
output: "[]"
explanation: "An empty tree remains empty after inversion."
explanation:
intuition: |
Imagine holding a mirror up to a binary tree. The reflection you see is the inverted tree — every left branch becomes a right branch, and vice versa.
The key insight is that **inverting a tree is a recursive operation**: to invert a tree rooted at any node, you simply swap its left and right children, then recursively invert each subtree. This naturally follows the structure of the tree itself.
Think of it like this: if someone asked you to mirror-flip a family tree diagram, you'd swap each parent's children, then do the same for each of those children's subtrees. The operation is the same at every level — a perfect fit for recursion.
The beauty of this problem is that the recursive solution directly mirrors the problem definition: invert the left subtree, invert the right subtree, then swap them.
approach: |
We solve this using a **Recursive (DFS) Approach**:
**Step 1: Handle the base case**
- If the node is `None`, return `None` immediately
- This handles empty trees and serves as the recursion termination condition
&nbsp;
**Step 2: Swap the children**
- Store the left child in a temporary variable (or use simultaneous assignment)
- Assign the right child to the left
- Assign the stored left child to the right
- This mirrors the current node's immediate children
&nbsp;
**Step 3: Recursively invert subtrees**
- Call `invert_tree` on the new left child (which was originally the right child)
- Call `invert_tree` on the new right child (which was originally the left child)
- This ensures all descendants are also inverted
&nbsp;
**Step 4: Return the root**
- Return the current node after its subtree has been fully inverted
- This allows the recursion to build back up correctly
&nbsp;
The order of swapping vs recursing doesn't matter — you can swap first then recurse, or recurse first then swap. Both produce the same result because every node gets visited and swapped exactly once.
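The claim about ordering can be checked with a short sketch. The two function names and the `level_order` helper below are made up for this illustration; both variants are run on the first example tree.

```python
from collections import deque


class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def invert_swap_first(root):
    # Swap at the current node, then recurse (pre-order)
    if root is None:
        return None
    root.left, root.right = root.right, root.left
    invert_swap_first(root.left)
    invert_swap_first(root.right)
    return root


def invert_recurse_first(root):
    # Invert both subtrees, then swap them (post-order)
    if root is None:
        return None
    left = invert_recurse_first(root.left)
    right = invert_recurse_first(root.right)
    root.left, root.right = right, left
    return root


def level_order(root):
    # Flatten the tree level by level, skipping missing children
    out, queue = [], deque([root] if root else [])
    while queue:
        node = queue.popleft()
        out.append(node.val)
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)
    return out


def build_example():
    # The tree [4,2,7,1,3,6,9] from the first example
    return TreeNode(4, TreeNode(2, TreeNode(1), TreeNode(3)),
                    TreeNode(7, TreeNode(6), TreeNode(9)))


print(level_order(invert_swap_first(build_example())))     # [4, 7, 2, 9, 6, 3, 1]
print(level_order(invert_recurse_first(build_example())))  # [4, 7, 2, 9, 6, 3, 1]
```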
common_pitfalls:
- title: Forgetting the Base Case
description: |
Without checking for `None`, your recursion will crash when it tries to access `.left` or `.right` on a null node.
Always start recursive tree functions with:
```python
if not root:
return None
```
wrong_approach: "Directly accessing root.left without null check"
correct_approach: "Check if root is None before any operations"
- title: Only Swapping at One Level
description: |
A common mistake is to swap the root's children but forget to recursively process the subtrees.
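For example, with `[4,2,7,1,3,6,9]`, only swapping at the root gives `[4,7,2,6,9,1,3]`: the subtrees move with their parents, but the grandchildren inside each subtree are still in their original order. The expected result is `[4,7,2,9,6,3,1]`, so you need to continue swapping at every level.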
wrong_approach: "Only swapping root.left and root.right"
correct_approach: "Recursively invert both subtrees after swapping"
- title: Overwriting Before Saving
description: |
If you write `root.left = root.right` first, you lose the original left child before you can assign it to the right.
Use either a temporary variable or Python's simultaneous assignment:
```python
root.left, root.right = root.right, root.left
```
wrong_approach: "Sequential assignment without temp variable"
correct_approach: "Simultaneous swap or use temporary variable"
key_takeaways:
- "**Recursive tree operations**: Many tree problems have elegant recursive solutions where you process subtrees and combine results"
- "**Base case discipline**: Always handle the `None` case first in tree recursion"
- "**Multiple valid approaches**: This can be solved with DFS (recursive or stack-based) or BFS (queue-based) — all with the same complexity"
- "**Famous interview problem**: This problem gained notoriety when a senior engineer reportedly couldn't solve it on a whiteboard, highlighting that even simple problems require practice"
time_complexity: "O(n). We visit each node exactly once to swap its children, where n is the number of nodes in the tree."
space_complexity: "O(h). The recursion stack can grow to the height of the tree. In the worst case (skewed tree), this is O(n). For a balanced tree, it's O(log n)."
solutions:
- approach_name: Recursive DFS
is_optimal: true
code: |
class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right
def invert_tree(root: TreeNode | None) -> TreeNode | None:
# Base case: empty tree or leaf's null child
if not root:
return None
# Swap the left and right children
root.left, root.right = root.right, root.left
# Recursively invert both subtrees
invert_tree(root.left)
invert_tree(root.right)
# Return the root of the inverted tree
return root
explanation: |
**Time Complexity:** O(n) — Each node is visited exactly once.
**Space Complexity:** O(h) — Recursion stack depth equals tree height.
This elegant solution directly mirrors the problem definition. We swap children at the current node, then recursively handle subtrees. The simultaneous assignment `root.left, root.right = root.right, root.left` safely swaps without needing a temporary variable.
- approach_name: Iterative BFS
is_optimal: true
code: |
from collections import deque
def invert_tree(root: TreeNode | None) -> TreeNode | None:
if not root:
return None
# Use a queue for level-order traversal
queue = deque([root])
while queue:
# Process the next node
node = queue.popleft()
# Swap its children
node.left, node.right = node.right, node.left
# Add children to queue for processing
if node.left:
queue.append(node.left)
if node.right:
queue.append(node.right)
return root
explanation: |
**Time Complexity:** O(n) — Each node is visited exactly once.
**Space Complexity:** O(w): the queue holds nodes from at most two adjacent levels at a time, where w is the maximum width of the tree. In the worst case (a complete tree), the last level has about n/2 nodes, so this is O(n).
This iterative approach uses BFS to visit nodes level by level. At each node, we swap its children and add them to the queue. This avoids recursion stack overhead but uses explicit queue memory instead.
- approach_name: Iterative DFS (Stack)
is_optimal: true
code: |
def invert_tree(root: TreeNode | None) -> TreeNode | None:
if not root:
return None
# Use a stack for depth-first traversal
stack = [root]
while stack:
# Process the next node
node = stack.pop()
# Swap its children
node.left, node.right = node.right, node.left
# Add children to stack for processing
if node.left:
stack.append(node.left)
if node.right:
stack.append(node.right)
return root
explanation: |
**Time Complexity:** O(n) — Each node is visited exactly once.
**Space Complexity:** O(h) — Stack depth equals tree height in the worst case.
This converts the recursive DFS to an iterative approach using an explicit stack. The logic is identical to the recursive version but avoids potential stack overflow for very deep trees. The order of processing differs from BFS but the final result is the same.

@@ -0,0 +1,207 @@
title: IPO
slug: ipo
difficulty: hard
leetcode_id: 502
leetcode_url: https://leetcode.com/problems/ipo/
categories:
- arrays
- heap
- sorting
patterns:
- heap
- greedy
description: |
Suppose LeetCode will start its **IPO** soon. In order to sell a good price of its shares to Venture Capital, LeetCode would like to work on some projects to increase its capital before the **IPO**. Since it has limited resources, it can only finish at most `k` distinct projects before the **IPO**. Help LeetCode design the best way to maximize its total capital after finishing at most `k` distinct projects.
You are given `n` projects where the i<sup>th</sup> project has a pure profit `profits[i]` and a minimum capital of `capital[i]` is needed to start it.
Initially, you have `w` capital. When you finish a project, you will obtain its pure profit and the profit will be added to your total capital.
Pick a list of **at most** `k` distinct projects from given projects to **maximize your final capital**, and return *the final maximized capital*.
The answer is guaranteed to fit in a 32-bit signed integer.
constraints: |
- `1 <= k <= 10^5`
- `0 <= w <= 10^9`
- `n == profits.length`
- `n == capital.length`
- `1 <= n <= 10^5`
- `0 <= profits[i] <= 10^4`
- `0 <= capital[i] <= 10^9`
examples:
- input: "k = 2, w = 0, profits = [1,2,3], capital = [0,1,1]"
output: "4"
explanation: "Since your initial capital is 0, you can only start the project indexed 0. After finishing it you will obtain profit 1 and your capital becomes 1. With capital 1, you can either start the project indexed 1 or the project indexed 2. Since you can choose at most 2 projects, you need to finish the project indexed 2 to get the maximum capital. Therefore, output the final maximized capital, which is 0 + 1 + 3 = 4."
- input: "k = 3, w = 0, profits = [1,2,3], capital = [0,1,2]"
output: "6"
explanation: "With initial capital 0, start project 0 (profit 1, capital becomes 1). With capital 1, start project 1 (profit 2, capital becomes 3). With capital 3, start project 2 (profit 3, capital becomes 6)."
explanation:
intuition: |
Imagine you're an investor with limited starting capital, and you want to grow your wealth as quickly as possible by completing projects. Each project requires a minimum investment (capital) to start, but once completed, you pocket the profit and can reinvest.
The key insight is a **greedy observation**: at any point, among all the projects you *can* afford (those with `capital[i] <= current_capital`), you should always pick the one with the **highest profit**. Why? Because maximising your capital at each step opens up more project options for future rounds.
Think of it like this: you're standing at the edge of a pool of projects. Some are within reach (you can afford them), others are too expensive. Among those within reach, grab the most valuable one. After completing it, your reach extends further, potentially unlocking more lucrative projects.
This greedy strategy works because:
1. Completing a project never decreases your capital (profits are non-negative)
2. More capital means more options — you can only unlock projects, never lose access to previously affordable ones
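3. Every affordable project stays affordable, so picking the highest-profit one now never blocks a later choice; swapping it for a lower-profit project could only leave you with less capital and fewer options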
The challenge is efficiently finding the highest-profit affordable project at each step, which is where the **two-heap** (or sorted array + max-heap) approach shines.
approach: |
We solve this using a **Greedy approach with a Max-Heap**:
**Step 1: Pair and sort projects by capital requirement**
- Create pairs of `(capital[i], profits[i])` for each project
- Sort these pairs by capital requirement in ascending order
- This allows us to efficiently unlock projects as our capital grows
&nbsp;
**Step 2: Initialise tracking variables**
- `current_capital`: Set to `w` (our starting capital)
- `max_heap`: Empty heap to store profits of affordable projects (use negative values for max-heap in Python)
- `project_index`: Set to `0` to track which projects we've processed
&nbsp;
**Step 3: Repeat up to k times (greedy selection)**
- **Unlock projects**: While there are unprocessed projects and the next project's capital requirement is within our budget, push its profit onto the max-heap and move to the next project
- **Select best project**: If the heap is non-empty, pop the maximum profit and add it to `current_capital`
- **Early exit**: If no projects are affordable (heap is empty), we cannot proceed further — break early
&nbsp;
**Step 4: Return the result**
- Return `current_capital` after completing up to `k` projects
&nbsp;
The sorting ensures we process projects in order of affordability, while the max-heap lets us instantly retrieve the highest-profit option among all currently affordable projects.
common_pitfalls:
- title: Brute Force Selection
description: |
A naive approach might scan all projects at each step to find the best affordable one:
- For each of `k` rounds, scan all `n` projects
- Check if affordable and track the maximum profit
This results in **O(k × n)** time complexity. With `k` and `n` both up to `10^5`, this means up to 10 billion operations — causing **Time Limit Exceeded (TLE)**.
The heap-based approach reduces this to O(n log n): one sort, at most n heap pushes, and at most min(k, n) heap pops, each costing O(log n).
wrong_approach: "Linear scan for best affordable project each round"
correct_approach: "Max-heap to track affordable projects"
- title: Using a Min-Heap Instead of Max-Heap
description: |
Python's `heapq` module implements a min-heap by default. If you push profits directly, you'll get the *smallest* profit, not the largest.
Always negate profits when pushing (`-profit`) and negate again when popping to get the actual maximum. Alternatively, use a max-heap wrapper.
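A minimal sketch of the negation trick using only `heapq.heappush` and `heapq.heappop`; the sample profits are arbitrary values chosen for illustration:

```python
import heapq

profits = [1, 3, 2]

# Pushing raw profits: the pop returns the smallest value
min_heap = []
for p in profits:
    heapq.heappush(min_heap, p)
smallest = heapq.heappop(min_heap)

# Pushing negated profits: negating the pop returns the largest value
max_heap = []
for p in profits:
    heapq.heappush(max_heap, -p)
largest = -heapq.heappop(max_heap)

print(smallest, largest)  # 1 3
```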
wrong_approach: "heappush(heap, profit)"
correct_approach: "heappush(heap, -profit) and negate when popping"
- title: Forgetting Early Termination
description: |
If at any point no projects are affordable (heap is empty after unlocking), continuing the loop is wasteful. More importantly, trying to pop from an empty heap causes an error.
Always check if the heap is non-empty before popping. If empty, break out of the loop early — no further progress is possible.
wrong_approach: "Always iterate k times"
correct_approach: "Break early if no affordable projects remain"
- title: Not Sorting by Capital
description: |
Without sorting by capital requirement, you'd need to scan all projects each round to find affordable ones. Sorting by capital allows linear unlocking as your capital grows — once a project is unaffordable, all subsequent ones (in sorted order) are too.
wrong_approach: "Check all projects for affordability each round"
correct_approach: "Sort by capital, unlock in order as budget grows"
key_takeaways:
- "**Greedy + Heap pattern**: When repeatedly selecting the 'best' option from a growing set, use a heap to efficiently track candidates"
- "**Two-phase processing**: Sort by one criterion (capital) to control unlocking, heap by another (profit) to optimise selection"
- "**Greedy validity**: This greedy approach works because completing projects only increases capital, never restricting future options"
- "**Real-world analogy**: This mirrors investment strategies where you reinvest profits to access larger opportunities — a common pattern in scheduling and resource allocation problems"
time_complexity: "O(n log n). Sorting takes O(n log n), and we perform at most n heap pushes and k heap pops, each O(log n)."
space_complexity: "O(n). We store all projects as pairs and the heap can hold up to n profit values."
solutions:
- approach_name: Greedy with Max-Heap
is_optimal: true
code: |
import heapq
def find_maximized_capital(k: int, w: int, profits: list[int], capital: list[int]) -> int:
n = len(profits)
# Pair projects as (capital_required, profit) and sort by capital
projects = sorted(zip(capital, profits))
current_capital = w
max_heap = [] # Max-heap (using negative values)
project_index = 0
for _ in range(k):
# Unlock all projects we can now afford
while project_index < n and projects[project_index][0] <= current_capital:
# Push negative profit for max-heap behavior
heapq.heappush(max_heap, -projects[project_index][1])
project_index += 1
# If no affordable projects, we're done
if not max_heap:
break
# Take the most profitable affordable project
current_capital += -heapq.heappop(max_heap)
return current_capital
explanation: |
**Time Complexity:** O(n log n) — Sorting dominates; heap operations are O(log n) each.
**Space Complexity:** O(n) — Storage for sorted pairs and heap.
We sort projects by capital requirement, then greedily select the highest-profit affordable project at each step using a max-heap. The sorted order ensures we efficiently unlock projects as our capital grows.
- approach_name: Brute Force
is_optimal: false
code: |
def find_maximized_capital(k: int, w: int, profits: list[int], capital: list[int]) -> int:
n = len(profits)
current_capital = w
completed = [False] * n # Track which projects are done
for _ in range(k):
best_profit = -1
best_index = -1
# Find the best affordable project
for i in range(n):
if not completed[i] and capital[i] <= current_capital:
if profits[i] > best_profit:
best_profit = profits[i]
best_index = i
# No affordable project found
if best_index == -1:
break
# Complete the best project
completed[best_index] = True
current_capital += best_profit
return current_capital
explanation: |
**Time Complexity:** O(k × n) — For each of k rounds, scan all n projects.
**Space Complexity:** O(n) — Boolean array to track completed projects.
This approach scans all projects each round to find the best affordable one. While correct, it's too slow for large inputs where k and n approach 10^5. Included to illustrate why the heap optimisation is necessary.
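Because both approaches apply the same greedy rule, the slow one is a convenient oracle for testing the fast one. The sketch below restates both in compact form under hypothetical names (`capital_heap`, `capital_scan`) and compares them on small random inputs; the seed and value ranges are arbitrary.

```python
import heapq
import random


def capital_heap(k, w, profits, capital):
    # Sort by capital requirement, keep affordable profits in a max-heap
    projects = sorted(zip(capital, profits))
    heap, idx = [], 0
    for _ in range(k):
        while idx < len(projects) and projects[idx][0] <= w:
            heapq.heappush(heap, -projects[idx][1])
            idx += 1
        if not heap:
            break
        w -= heapq.heappop(heap)  # popped value is negative, so this adds profit
    return w


def capital_scan(k, w, profits, capital):
    # Linear scan for the best affordable, unfinished project each round
    done = [False] * len(profits)
    for _ in range(k):
        best = -1
        for i in range(len(profits)):
            if not done[i] and capital[i] <= w:
                if best == -1 or profits[i] > profits[best]:
                    best = i
        if best == -1:
            break
        done[best] = True
        w += profits[best]
    return w


random.seed(0)
for _ in range(200):
    n = random.randint(1, 8)
    profits = [random.randint(0, 10) for _ in range(n)]
    capital = [random.randint(0, 15) for _ in range(n)]
    k, w = random.randint(1, 10), random.randint(0, 5)
    assert capital_heap(k, w, profits, capital) == capital_scan(k, w, profits, capital)

print(capital_heap(2, 0, [1, 2, 3], [0, 1, 1]))  # 4
```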

@@ -0,0 +1,218 @@
title: Island Perimeter
slug: island-perimeter
difficulty: easy
leetcode_id: 463
leetcode_url: https://leetcode.com/problems/island-perimeter/
categories:
- arrays
- math
patterns:
- matrix-traversal
description: |
You are given a `row x col` grid representing a map where `grid[i][j] = 1` represents land and `grid[i][j] = 0` represents water.
Grid cells are connected **horizontally/vertically** (not diagonally). The `grid` is completely surrounded by water, and there is exactly one island (i.e., one or more connected land cells).
The island doesn't have "lakes", meaning the water inside isn't connected to the water around the island. One cell is a square with side length 1. The grid is rectangular, width and height don't exceed 100.
Determine the perimeter of the island.
constraints: |
- `row == grid.length`
- `col == grid[i].length`
- `1 <= row, col <= 100`
- `grid[i][j]` is `0` or `1`
- There is exactly one island in `grid`
examples:
- input: "grid = [[0,1,0,0],[1,1,1,0],[0,1,0,0],[1,1,0,0]]"
output: "16"
explanation: "The perimeter is formed by counting the edges of land cells that touch water or the grid boundary."
- input: "grid = [[1]]"
output: "4"
explanation: "A single land cell has 4 sides, all contributing to the perimeter."
- input: "grid = [[1,0]]"
output: "4"
explanation: "The single land cell is surrounded by water on one side and grid boundaries on the others."
explanation:
intuition: |
Imagine looking at the island from above, like a map. Each land cell is a square with 4 sides. The **perimeter** is the total length of the island's outer boundary — every edge where land meets water or the edge of the grid.
Think of it like this: if you placed a fence around every land cell, you'd have 4 fence segments per cell. But when two land cells are **adjacent** (share an edge), those touching sides are *internal* to the island — they shouldn't count toward the perimeter.
The key insight is: **each land cell contributes 4 to the perimeter, minus 1 for each land neighbour it has**. Counted per shared edge rather than per cell, that is minus 2 for each pair of adjacent land cells: the shared edge would otherwise be counted once by each cell, but it is internal to the island, so each of the two cells loses 1 from its contribution.
Alternatively, you can think of it as: for each land cell, count how many of its 4 sides touch water or the boundary. That's its direct contribution to the perimeter.
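As a quick sanity check, the two ways of counting can be written side by side and compared on small grids. This is only an illustrative sketch: the helper names `perimeter_by_exposed_sides` and `perimeter_by_subtraction` are made up for this example, and counting per cell means each land neighbour removes one side from that cell.

```python
# Offsets to the four orthogonal neighbours: up, down, left, right.
DIRECTIONS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def perimeter_by_exposed_sides(grid):
    """Sum, over land cells, of the sides facing water or the grid edge."""
    rows, cols = len(grid), len(grid[0])
    total = 0
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] != 1:
                continue
            for di, dj in DIRECTIONS:
                ni, nj = i + di, j + dj
                inside = 0 <= ni < rows and 0 <= nj < cols
                if not inside or grid[ni][nj] == 0:
                    total += 1
    return total


def perimeter_by_subtraction(grid):
    """4 per land cell, minus 1 for each land neighbour of that cell."""
    rows, cols = len(grid), len(grid[0])
    total = 0
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] != 1:
                continue
            neighbours = 0
            for di, dj in DIRECTIONS:
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols and grid[ni][nj] == 1:
                    neighbours += 1
            total += 4 - neighbours
    return total


example = [[0, 1, 0, 0], [1, 1, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
print(perimeter_by_exposed_sides(example), perimeter_by_subtraction(example))
```

On the first example grid there are 7 land cells and 6 shared edges, so both views give `4 * 7 - 2 * 6 = 16`.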
approach: |
We solve this using a **Simple Counting** approach:
**Step 1: Initialise the counter**
- `perimeter`: Set to `0` to accumulate the total perimeter
&nbsp;
**Step 2: Iterate through every cell in the grid**
- Use nested loops to visit each cell at position `(i, j)`
- If `grid[i][j] == 1` (it's a land cell), count its perimeter contribution
&nbsp;
**Step 3: For each land cell, check all 4 sides**
- **Top side**: If `i == 0` (top boundary) or `grid[i-1][j] == 0` (water above), add 1
- **Bottom side**: If `i == rows-1` (bottom boundary) or `grid[i+1][j] == 0` (water below), add 1
- **Left side**: If `j == 0` (left boundary) or `grid[i][j-1] == 0` (water left), add 1
- **Right side**: If `j == cols-1` (right boundary) or `grid[i][j+1] == 0` (water right), add 1
&nbsp;
**Step 4: Return the total perimeter**
- After checking all cells, return the accumulated `perimeter`
&nbsp;
This approach works because we directly count the edges that form the island's boundary — any edge touching water or the grid boundary contributes to the perimeter.
common_pitfalls:
- title: Counting Internal Edges
description: |
A common mistake is to count 4 for every land cell without subtracting shared edges between adjacent land cells.
For example, if two land cells are horizontally adjacent, the edge between them is internal — it's not part of the perimeter. You must either subtract these internal edges or only count edges that touch water/boundary.
With the grid `[[1,1]]`, simply counting `4 * 2 = 8` is wrong. The correct answer is `6` because the two cells share one edge.
wrong_approach: "Count 4 for every land cell"
correct_approach: "Count edges touching water or boundary only"
- title: Index Out of Bounds
description: |
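When checking neighbours, it's easy to accidentally access `grid[i+1][j]` when `i == rows - 1` (or `grid[i][j+1]` when `j == cols - 1`), which raises an `IndexError` in Python. The opposite edge is subtler: `grid[i-1][j]` with `i == 0` does not raise in Python, because the index `-1` wraps around to the last row and silently reads the wrong cell.
Always check boundary conditions **before** accessing neighbouring cells. The order of conditions matters: `i == rows - 1 or grid[i+1][j] == 0` short-circuits correctly, but `grid[i+1][j] == 0 or i == rows - 1` raises on the last row.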
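The effect of condition order can be seen on the smallest possible grid. The snippet below is a small sketch specific to Python's indexing rules; the variable names `safe`, `raised`, and `wrapped` are chosen only for this illustration.

```python
grid = [[1]]  # a single land cell, so row 0 is also the last row
rows = len(grid)
i, j = 0, 0

# Boundary test first: `or` short-circuits, so grid[i + 1] is never evaluated.
safe = i == rows - 1 or grid[i + 1][j] == 0

# Neighbour test first: grid[i + 1] is evaluated and raises IndexError.
try:
    unsafe = grid[i + 1][j] == 0 or i == rows - 1
    raised = False
except IndexError:
    raised = True

# A negative index does not raise in Python: it wraps around to the last row.
wrapped = grid[i - 1][j]  # reads grid[-1][0]

print(safe, raised, wrapped)
```

In languages without negative indexing, the `i - 1` and `j - 1` accesses fail as well, so putting the boundary test first is the portable habit.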
wrong_approach: "Check neighbour first, then boundary"
correct_approach: "Check boundary first (short-circuit evaluation)"
- title: Overcomplicating with DFS/BFS
description: |
While DFS or BFS can solve this problem, they're unnecessary complexity for this particular task. The problem states there's exactly one island with no lakes, so you don't need to track visited cells or flood-fill.
A simple double loop examining each cell independently is cleaner and equally efficient at O(rows * cols).
wrong_approach: "Implement full DFS/BFS traversal"
correct_approach: "Simple iteration checking each cell's edges"
key_takeaways:
- "**Count contributions**: Each land cell contributes its edges that touch water or boundaries"
- "**Boundary checks first**: Use short-circuit evaluation to avoid index errors when checking neighbours"
- "**Matrix traversal pattern**: Iterating through a 2D grid with nested loops is fundamental for many problems"
- "**Simplicity wins**: Don't overcomplicate — this problem doesn't need DFS/BFS despite being tagged as such"
time_complexity: "O(m * n). We visit each cell in the grid exactly once, where m is the number of rows and n is the number of columns."
space_complexity: "O(1). We only use a single counter variable regardless of input size."
solutions:
- approach_name: Simple Counting
is_optimal: true
code: |
def island_perimeter(grid: list[list[int]]) -> int:
    rows, cols = len(grid), len(grid[0])
    perimeter = 0
    for i in range(rows):
        for j in range(cols):
            # Only process land cells
            if grid[i][j] == 1:
                # Check all 4 sides - add 1 for each edge touching water/boundary
                # Top: boundary or water above
                if i == 0 or grid[i - 1][j] == 0:
                    perimeter += 1
                # Bottom: boundary or water below
                if i == rows - 1 or grid[i + 1][j] == 0:
                    perimeter += 1
                # Left: boundary or water to the left
                if j == 0 or grid[i][j - 1] == 0:
                    perimeter += 1
                # Right: boundary or water to the right
                if j == cols - 1 or grid[i][j + 1] == 0:
                    perimeter += 1
    return perimeter
explanation: |
**Time Complexity:** O(m * n) — We iterate through every cell once.
**Space Complexity:** O(1) — Only a counter variable is used.
For each land cell, we check its 4 neighbours. If a neighbour is water or out of bounds, that edge contributes to the perimeter. This direct approach is clean and efficient.
- approach_name: Count Land and Subtract Neighbours
is_optimal: true
code: |
def island_perimeter(grid: list[list[int]]) -> int:
    rows, cols = len(grid), len(grid[0])
    land_cells = 0
    neighbour_edges = 0
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] == 1:
                land_cells += 1
                # Count neighbours (only check right and down to avoid double counting)
                if i < rows - 1 and grid[i + 1][j] == 1:
                    neighbour_edges += 1
                if j < cols - 1 and grid[i][j + 1] == 1:
                    neighbour_edges += 1
    # Each land cell contributes 4, each shared edge removes 2 from perimeter
    return land_cells * 4 - neighbour_edges * 2
explanation: |
**Time Complexity:** O(m * n) — Single pass through the grid.
**Space Complexity:** O(1) — Two counter variables.
This alternative approach uses the formula: `perimeter = 4 * land_cells - 2 * shared_edges`. Each land cell starts with 4 sides. Each pair of adjacent land cells shares an edge, removing 2 from the total perimeter (1 from each cell). We only check right and down neighbours to avoid counting each shared edge twice.
- approach_name: DFS Traversal
is_optimal: false
code: |
def island_perimeter(grid: list[list[int]]) -> int:
    rows, cols = len(grid), len(grid[0])
    visited = set()
    def dfs(i: int, j: int) -> int:
        # Out of bounds or water - this edge contributes 1 to perimeter
        if i < 0 or i >= rows or j < 0 or j >= cols or grid[i][j] == 0:
            return 1
        # Already visited - don't count again
        if (i, j) in visited:
            return 0
        visited.add((i, j))
        # Explore all 4 directions and sum up perimeter
        return (dfs(i - 1, j) + dfs(i + 1, j) +
                dfs(i, j - 1) + dfs(i, j + 1))
    # Find the first land cell and start DFS
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] == 1:
                return dfs(i, j)
    return 0
explanation: |
**Time Complexity:** O(m * n) — Each cell is visited at most once.
**Space Complexity:** O(m * n) — Visited set and recursion stack in worst case.
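DFS explores the island by recursively visiting connected land cells. When we hit water or the boundary, that's a perimeter edge (return 1). When we hit a visited cell, return 0 to avoid double counting. While correct, this uses extra space and is overkill for this problem. In Python, the recursion can also exceed the default recursion limit (1000 frames in CPython) on a long, winding island, since a 100 x 100 grid can hold a path of several thousand land cells.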


@@ -0,0 +1,178 @@
title: Jump Game II
slug: jump-game-ii
difficulty: medium
leetcode_id: 45
leetcode_url: https://leetcode.com/problems/jump-game-ii/
categories:
- arrays
- dynamic-programming
patterns:
- greedy
description: |
You are given a **0-indexed** array of integers `nums` of length `n`. You are initially positioned at index `0`.
Each element `nums[i]` represents the maximum length of a forward jump from index `i`. In other words, if you are at index `i`, you can jump to any index `i + j` where:
- `0 <= j <= nums[i]` and
- `i + j < n`
Return *the minimum number of jumps to reach index* `n - 1`. The test cases are generated such that you can reach index `n - 1`.
constraints: |
- `1 <= nums.length <= 10^4`
- `0 <= nums[i] <= 1000`
- It's guaranteed that you can reach `nums[n - 1]`
examples:
- input: "nums = [2,3,1,1,4]"
output: "2"
explanation: "The minimum number of jumps to reach the last index is 2. Jump 1 step from index 0 to 1, then 3 steps to the last index."
- input: "nums = [2,3,0,1,4]"
output: "2"
explanation: "Jump 1 step from index 0 to 1, then 3 steps to the last index."
explanation:
intuition: |
Imagine you're standing at the start of a path with numbered tiles, and each tile tells you the maximum distance you can leap forward. Your goal is to reach the end in as few jumps as possible.
Think of it like a **level-based exploration**: from your current position, you can reach a range of tiles. Within that range, you want to pick the tile that lets you jump the *farthest* on your next move. This is the **greedy insight** — at each "level" (jump), choose the landing spot that maximises your future reach.
Visualise it as expanding waves: your first jump creates a "wave" of reachable positions. From all positions in that wave, you determine how far the next wave can extend. Each wave represents one jump.
The key observation is that you don't need to try every possible path. By always tracking the farthest point reachable within your current jump's range, you guarantee the minimum number of jumps. This works because the indices reachable within `k` jumps always form a contiguous range starting at index `0`: any index short of the farthest one can be reached by taking a shorter final jump, so extending the range as far as possible never rules out a better option.
approach: |
We solve this using a **Greedy (BFS-like) Approach**:
**Step 1: Handle edge cases**
- If the array has only one element, we're already at the destination — return `0` jumps
&nbsp;
**Step 2: Initialise tracking variables**
- `jumps`: Counter for the number of jumps made, starting at `0`
- `current_end`: The farthest index reachable with the current number of jumps (initially `0`)
- `farthest`: The farthest index we can reach from any position within the current range (initially `0`)
&nbsp;
**Step 3: Iterate through the array**
- For each index `i` from `0` to `n - 2` (we don't need to process the last index):
- Update `farthest` to be the maximum of `farthest` and `i + nums[i]`
- When we reach `current_end` (the boundary of our current jump range):
- Increment `jumps` — we must take another jump
- Update `current_end` to `farthest` — this is now our new reachable boundary
- If `current_end` reaches or exceeds the last index, we can stop
&nbsp;
**Step 4: Return the result**
- Return `jumps` after processing the array
&nbsp;
This approach works because we're essentially doing a BFS level by level. Level `k` holds the positions that first become reachable on the `k`-th jump. We greedily extend to the farthest reachable point at each level, ensuring minimum jumps.
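To make the level structure concrete, the sketch below records the index range covered by each jump instead of only counting jumps. The helper name `jump_levels` is hypothetical, and the function assumes (as the problem guarantees) that the last index is reachable.

```python
def jump_levels(nums):
    """Return the (start, end) index range first reached by each jump count.

    levels[k] is the range of indices whose minimum jump count is k.
    """
    n = len(nums)
    start, end = 0, 0
    levels = [(start, end)]  # level 0 is just the starting index
    while end < n - 1:
        # Farthest index reachable from anywhere in the current level.
        farthest = max(i + nums[i] for i in range(start, end + 1))
        if farthest <= end:
            raise ValueError("last index is not reachable")
        start, end = end + 1, min(farthest, n - 1)
        levels.append((start, end))
    return levels


levels = jump_levels([2, 3, 1, 1, 4])
print(levels)           # [(0, 0), (1, 2), (3, 4)]
print(len(levels) - 1)  # 2 jumps
```

The number of jumps is the number of levels after the starting one. The single-pass solution computes the same boundaries without storing them: `current_end` plays the role of `end`, and `farthest` is accumulated as the loop walks through the level.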
common_pitfalls:
- title: Using Dynamic Programming When Greedy Suffices
description: |
A natural first approach is DP: let `dp[i]` be the minimum jumps to reach index `i`. For each position, check all positions that can reach it.
While correct, this takes **O(n^2) time** in the worst case. For `n = 10^4`, that is on the order of 10^8 operations, which may cause TLE.
The greedy approach achieves **O(n)** by recognising that we don't need to track exact paths — just the farthest reachable point at each jump level.
wrong_approach: "DP with O(n^2) transitions"
correct_approach: "Greedy tracking of reachable range per jump"
- title: Processing the Last Index
description: |
A subtle bug occurs when iterating through all indices including `n - 1`. If the last index happens to equal `current_end`, you'd incorrectly count an extra jump.
We only need to iterate to `n - 2`. Once we know we can reach the last index, we're done. Processing the last index itself is unnecessary and can inflate the jump count.
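The difference is easy to reproduce. In the sketch below, `jump_including_last` is a deliberately buggy variant written for this illustration (it is not one of the solutions), and `jump_excluding_last` is the corrected loop.

```python
def jump_including_last(nums):
    """Deliberately buggy: the loop also visits index n - 1."""
    jumps = current_end = farthest = 0
    for i in range(len(nums)):  # bug: should stop at n - 2
        farthest = max(farthest, i + nums[i])
        if i == current_end:
            jumps += 1
            current_end = farthest
    return jumps


def jump_excluding_last(nums):
    """Corrected loop: the last index only needs to be reached, not processed."""
    jumps = current_end = farthest = 0
    for i in range(len(nums) - 1):
        farthest = max(farthest, i + nums[i])
        if i == current_end:
            jumps += 1
            current_end = farthest
    return jumps


print(jump_including_last([2, 3, 1, 1, 4]))  # 3 (one jump too many)
print(jump_excluding_last([2, 3, 1, 1, 4]))  # 2
```

The extra jump appears whenever `current_end` lands exactly on `n - 1`, which includes the single-element input `[0]`: the buggy loop reports 1 jump where 0 are needed.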
wrong_approach: "Iterating i from 0 to n - 1"
correct_approach: "Iterating i from 0 to n - 2"
- title: Forgetting to Update Farthest Before Checking Boundary
description: |
The order of operations matters. You must update `farthest = max(farthest, i + nums[i])` *before* checking if `i == current_end`.
If you check the boundary first and then update farthest, you miss accounting for the current position's reach, potentially getting a wrong answer.
wrong_approach: "Check boundary, then update farthest"
correct_approach: "Update farthest, then check boundary"
key_takeaways:
- "**Greedy as implicit BFS**: When optimising for minimum steps in reachability problems, think of expanding 'waves' or 'levels' of positions reachable in k jumps"
- "**Track ranges, not paths**: Instead of enumerating all possible paths (exponential), track the reachable range at each step (linear)"
- "**Foundation for Jump Game variants**: This pattern extends to problems with obstacles, costs, or different movement rules"
- "**Recognise when DP is overkill**: If the problem has optimal substructure but greedy choice works, prefer the simpler O(n) greedy solution"
time_complexity: "O(n). We traverse the array exactly once, processing each element in constant time."
space_complexity: "O(1). We only use three variables (`jumps`, `current_end`, `farthest`), regardless of input size."
solutions:
- approach_name: Greedy (BFS-like)
is_optimal: true
code: |
def jump(nums: list[int]) -> int:
    n = len(nums)
    # Already at destination
    if n <= 1:
        return 0
    jumps = 0  # Number of jumps taken
    current_end = 0  # Farthest we can reach with current jumps
    farthest = 0  # Farthest we can reach from positions in current range
    # Don't process last index - we just need to reach it
    for i in range(n - 1):
        # Update the farthest point reachable from current position
        farthest = max(farthest, i + nums[i])
        # Reached the end of current jump's range
        if i == current_end:
            jumps += 1  # Must take another jump
            current_end = farthest  # Extend range to farthest reachable
            # Early exit if we can reach the end
            if current_end >= n - 1:
                break
    return jumps
explanation: |
**Time Complexity:** O(n) — Single pass through the array.
**Space Complexity:** O(1) — Only three integer variables used.
We simulate a BFS where each "level" represents positions reachable with the same number of jumps. At each level, we track the farthest position we can reach, then "jump" to extend our range. The greedy choice of always extending to the farthest point guarantees minimum jumps.
- approach_name: Dynamic Programming
is_optimal: false
code: |
def jump(nums: list[int]) -> int:
    n = len(nums)
    # dp[i] = minimum jumps to reach index i
    dp = [float('inf')] * n
    dp[0] = 0  # Start position needs 0 jumps
    for i in range(n):
        # Skip unreachable positions
        if dp[i] == float('inf'):
            continue
        # Update all positions reachable from i
        for j in range(1, nums[i] + 1):
            if i + j < n:
                dp[i + j] = min(dp[i + j], dp[i] + 1)
    return dp[n - 1]
explanation: |
**Time Complexity:** O(n × m) where m is the average jump length — can be O(n^2) in worst case.
**Space Complexity:** O(n) — DP array storing minimum jumps to each index.
For each position, we update all positions reachable from it. While correct, this is slower than the greedy approach because we're doing redundant work. Included to illustrate why greedy is preferred when it works.


@@ -0,0 +1,206 @@
title: Jump Game VII
slug: jump-game-vii
difficulty: medium
leetcode_id: 1871
leetcode_url: https://leetcode.com/problems/jump-game-vii/
categories:
- strings
- dynamic-programming
patterns:
- bfs
- sliding-window
- dynamic-programming
description: |
You are given a **0-indexed** binary string `s` and two integers `minJump` and `maxJump`. In the beginning, you are standing at index `0`, which is equal to `'0'`. You can move from index `i` to index `j` if the following conditions are fulfilled:
- `i + minJump <= j <= min(i + maxJump, s.length - 1)`, and
- `s[j] == '0'`.
Return `true` *if you can reach index* `s.length - 1` *in* `s`*, or* `false` *otherwise*.
constraints: |
- `2 <= s.length <= 10^5`
- `s[i]` is either `'0'` or `'1'`
- `s[0] == '0'`
- `1 <= minJump <= maxJump < s.length`
examples:
- input: 's = "011010", minJump = 2, maxJump = 3'
output: "true"
explanation: "In the first step, move from index 0 to index 3. In the second step, move from index 3 to index 5."
- input: 's = "01101110", minJump = 2, maxJump = 3'
output: "false"
explanation: "There is no way to reach the last index starting from index 0."
explanation:
intuition: |
Imagine you're hopping across stepping stones in a river, where `'0'` represents a safe stone and `'1'` represents water. From any stone, you can only jump forward between `minJump` and `maxJump` steps.
The naive approach would be to try every possible jump from every reachable position — but with up to 10^5 characters and a jump range that could span thousands of indices, this becomes extremely slow.
The key insight is that we need to efficiently track **which positions are reachable** and then, for each new position, check if **any** reachable position can jump to it. Instead of checking every source position individually, we can use a **sliding window** or **prefix sum** to answer "is there any reachable position in the valid jump range?" in O(1) time.
Think of it like this: as you scan left to right, you maintain a count of how many reachable positions fall within the "jump window" for the current index. If that count is positive and the current character is `'0'`, you can reach it.
approach: |
We solve this using **Dynamic Programming with a sliding-window count** (a running form of the prefix-sum optimization):
**Step 1: Set up the DP array**
- Create a boolean array `dp` where `dp[i]` indicates whether index `i` is reachable
- Set `dp[0] = True` since we start at index 0
- Initialize a counter `reachable` to track reachable positions in the current window
&nbsp;
**Step 2: Iterate through the string**
- For each index `i` from 1 to `n-1`:
- First, check if a new position has entered our "from" window: if `i >= minJump` and `dp[i - minJump]` is true, increment `reachable`
- Then, check if a position has left the window: if `i > maxJump` and `dp[i - maxJump - 1]` is true, decrement `reachable`
- If `reachable > 0` and `s[i] == '0'`, mark `dp[i] = True`
&nbsp;
**Step 3: Return the result**
- Return `dp[n - 1]` — whether the last index is reachable
&nbsp;
This approach works because the sliding window maintains a count of all reachable positions that could potentially jump to the current index. We add positions as they enter the jump range and remove them as they exit.
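The same range query can also be answered with an explicit prefix-sum array instead of a running counter. The following is a minimal sketch of that variant, not one of the listed solutions; the names `can_reach_prefix` and `prefix` are introduced here for illustration.

```python
def can_reach_prefix(s: str, min_jump: int, max_jump: int) -> bool:
    n = len(s)
    dp = [False] * n
    dp[0] = True
    # prefix[k] = number of reachable indices among 0 .. k - 1
    prefix = [0] * (n + 1)
    prefix[1] = 1
    for i in range(1, n):
        lo = max(0, i - max_jump)  # leftmost index that can jump to i
        hi = i - min_jump          # rightmost index that can jump to i
        if s[i] == '0' and hi >= 0:
            # Any reachable source in [lo, hi]? hi < i, so prefix[hi + 1] is final.
            dp[i] = prefix[hi + 1] - prefix[lo] > 0
        prefix[i + 1] = prefix[i] + dp[i]
    return dp[n - 1]


print(can_reach_prefix("011010", 2, 3))    # True
print(can_reach_prefix("01101110", 2, 3))  # False
```

Both variants do O(1) work per index. The running counter avoids the extra `prefix` array, while the prefix-sum form makes the window bounds `lo` and `hi` explicit, which can be easier to check for off-by-one mistakes.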
common_pitfalls:
- title: The BFS/DFS Timeout
description: |
A natural approach is to use BFS or DFS to explore all reachable positions. However, with a string of length `10^5` and a jump range that could span thousands of indices, this approach degenerates to O(n * (maxJump - minJump)) which is potentially O(n^2).
For example, with `n = 100000`, `minJump = 1`, and `maxJump = 50000`, each position could have 50,000 neighbors to explore!
wrong_approach: "Plain BFS/DFS exploring all neighbors in jump range"
correct_approach: "BFS with a farthest pointer that skips already-scanned indices, or DP with sliding window"
- title: Checking Every Source Position
description: |
For each index `i`, you might think to loop through all `j` from `i - maxJump` to `i - minJump` to check if any `dp[j]` is true. This is O(n * range) which is too slow.
The sliding window/prefix sum optimization reduces this to O(1) per index by maintaining a running count of reachable positions in the valid range.
wrong_approach: "For each i, loop through all j in jump range"
correct_approach: "Maintain sliding window count of reachable positions"
- title: Off-by-One Errors in Window Bounds
description: |
The jump constraints are `i + minJump <= j <= i + maxJump`. When working backwards (asking "can I reach index j?"), the valid source range is `j - maxJump <= i <= j - minJump`.
Be careful about when a position enters and exits the window:
- Position `j - minJump` enters the window when we reach index `j`
- Position `j - maxJump - 1` exits the window when we reach index `j`
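A tiny helper makes these bounds easy to check by hand. `source_window` is a hypothetical name used only in this sketch; it lists the indices that are allowed to jump to `j`.

```python
def source_window(j: int, min_jump: int, max_jump: int) -> list[int]:
    """Indices i with i + min_jump <= j <= i + max_jump, clipped at 0."""
    lo = max(0, j - max_jump)
    hi = j - min_jump
    return list(range(lo, hi + 1))  # empty when j < min_jump


print(source_window(5, 2, 3))  # [2, 3]
print(source_window(6, 2, 3))  # [3, 4]
```

Moving from `j = 5` to `j = 6` with `minJump = 2` and `maxJump = 3`, index `4 = j - minJump` enters the window and index `2 = j - maxJump - 1` leaves it, matching the two rules above.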
wrong_approach: "Incorrect window boundary calculations"
correct_approach: "Carefully track when positions enter (at i - minJump) and exit (at i - maxJump - 1)"
key_takeaways:
- "**Sliding window for range queries**: When you need to check if *any* value in a range satisfies a condition, maintain a running count as the window slides"
- "**Prefix sum for cumulative queries**: This pattern appears frequently — counting elements in ranges can be done in O(1) with preprocessing"
- "**Optimizing DP transitions**: When DP transitions involve checking a range of previous states, look for ways to avoid the inner loop"
- "**Jump Game series**: This problem extends the classic Jump Game pattern — earlier versions use greedy, this one requires DP with optimization"
time_complexity: "O(n). We iterate through the string once, and each position enters and exits the sliding window at most once."
space_complexity: "O(n). We use a DP array of size `n` to track reachability of each position."
solutions:
- approach_name: DP with Sliding Window
is_optimal: true
code: |
def can_reach(s: str, min_jump: int, max_jump: int) -> bool:
    n = len(s)
    # dp[i] = True if we can reach index i
    dp = [False] * n
    dp[0] = True  # Start at index 0
    # Count of reachable positions in the current jump window
    reachable = 0
    for i in range(1, n):
        # Position (i - min_jump) just entered our "can jump from" window
        if i >= min_jump and dp[i - min_jump]:
            reachable += 1
        # Position (i - max_jump - 1) just left the window
        if i > max_jump and dp[i - max_jump - 1]:
            reachable -= 1
        # If any reachable position can jump here and it's a '0', mark reachable
        if reachable > 0 and s[i] == '0':
            dp[i] = True
    return dp[n - 1]
explanation: |
**Time Complexity:** O(n) — Single pass through the string with O(1) work per index.
**Space Complexity:** O(n) — DP array to track reachability.
The sliding window maintains a count of positions that could jump to the current index. As we move right, positions enter the window when they're exactly `minJump` away, and exit when they're more than `maxJump` away.
- approach_name: BFS with Optimization
is_optimal: false
code: |
from collections import deque
def can_reach(s: str, min_jump: int, max_jump: int) -> bool:
    n = len(s)
    if s[n - 1] == '1':
        return False
    queue = deque([0])
    # Track the farthest index we've added to avoid duplicates
    farthest = 0
    while queue:
        i = queue.popleft()
        # Start of jump range: don't re-explore already visited indices
        start = max(i + min_jump, farthest + 1)
        end = min(i + max_jump, n - 1)
        for j in range(start, end + 1):
            if s[j] == '0':
                if j == n - 1:
                    return True
                queue.append(j)
        # Update farthest to avoid revisiting
        farthest = max(farthest, i + max_jump)
    return False
explanation: |
**Time Complexity:** O(n) — Each index is added to the queue at most once due to the `farthest` optimization.
**Space Complexity:** O(n) — Queue can hold up to n positions in the worst case.
This BFS approach uses a `farthest` pointer to avoid re-exploring indices. When processing position `i`, we only explore indices beyond what we've already added. This ensures each index is visited at most once, giving linear time.
- approach_name: Brute Force DP
is_optimal: false
code: |
def can_reach(s: str, min_jump: int, max_jump: int) -> bool:
    n = len(s)
    dp = [False] * n
    dp[0] = True
    for i in range(1, n):
        if s[i] == '1':
            continue
        # Check all positions that could jump to i
        for j in range(max(0, i - max_jump), i - min_jump + 1):
            if dp[j]:
                dp[i] = True
                break
    return dp[n - 1]
explanation: |
**Time Complexity:** O(n * (maxJump - minJump)) — For each position, we check up to `maxJump - minJump + 1` previous positions.
**Space Complexity:** O(n) — DP array.
This approach is correct but too slow for large inputs. With `n = 10^5` and a large jump range, this becomes O(n^2) and will TLE. Included to illustrate why the sliding window optimization is necessary.


@@ -0,0 +1,182 @@
title: Jump Game
slug: jump-game
difficulty: medium
leetcode_id: 55
leetcode_url: https://leetcode.com/problems/jump-game/
categories:
- arrays
- dynamic-programming
patterns:
- greedy
- dynamic-programming
description: |
You are given an integer array `nums`. You are initially positioned at the array's **first index**, and each element in the array represents your maximum jump length at that position.
Return `true` *if you can reach the last index*, or `false` *otherwise*.
constraints: |
- `1 <= nums.length <= 10^4`
- `0 <= nums[i] <= 10^5`
examples:
- input: "nums = [2,3,1,1,4]"
output: "true"
explanation: "Jump 1 step from index 0 to 1, then 3 steps to the last index."
- input: "nums = [3,2,1,0,4]"
output: "false"
explanation: "You will always arrive at index 3 no matter what. Its maximum jump length is 0, which makes it impossible to reach the last index."
explanation:
intuition: |
Imagine you're hopping across stepping stones to reach the other side of a river. Each stone tells you the *maximum* distance you can jump from it — but you can choose to jump any shorter distance too.
The key insight is that you don't need to track every possible path. Instead, think about it this way: **what's the farthest position you can possibly reach?** As you walk through the array, each position extends your reach. If at any point you find yourself stuck (your current position is beyond your maximum reach), you know you'll never make it.
Think of it like filling a gas tank: at each position, you're potentially adding "fuel" (jump range) to extend how far you can go. The question becomes: can you keep extending your reach until it covers the finish line?
This greedy approach works because we only care about *whether* we can reach the end, not *how* we reach it. If we can reach position `i`, and from `i` we can jump to position `j`, then we can definitely reach `j` — we don't need to track the exact path.
approach: |
We solve this using a **Greedy Approach** by tracking the maximum reachable index:
**Step 1: Initialise the maximum reach**
- `max_reach`: Set to `0` initially (we start at index 0, which we can trivially reach)
&nbsp;
**Step 2: Iterate through the array**
- For each index `i`, first check if `i > max_reach`
- If yes, we're stuck — we can't even reach this position, so return `false`
- If no, calculate how far we can reach from here: `i + nums[i]`
- Update `max_reach` to be the maximum of its current value and `i + nums[i]`
- This ensures we always track the farthest point we could possibly reach
&nbsp;
**Step 3: Return the result**
- If we complete the loop without getting stuck, return `true`
- We know we can reach the end because `max_reach` must be at least `n - 1`
&nbsp;
The greedy choice at each step (always extend our reach as far as possible) guarantees we find a solution if one exists.
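Recording `max_reach` after every index shows how the scan behaves on the two examples. The helper `max_reach_trace` is a hypothetical name used for this sketch; it follows the same steps as above but also returns the intermediate values.

```python
def max_reach_trace(nums):
    """Return (reachable, trace), where trace[i] is max_reach after index i."""
    max_reach = 0
    trace = []
    for i, step in enumerate(nums):
        if i > max_reach:
            # Stuck: index i cannot be reached, so neither can the end.
            return False, trace
        max_reach = max(max_reach, i + step)
        trace.append(max_reach)
    return True, trace


print(max_reach_trace([2, 3, 1, 1, 4]))  # (True, [2, 4, 4, 4, 8])
print(max_reach_trace([3, 2, 1, 0, 4]))  # (False, [3, 3, 3, 3])
```

In the second example the reach stalls at 3, so the scan stops as soon as it arrives at index 4.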
common_pitfalls:
- title: Simulating Every Possible Jump Path
description: |
A common first instinct is to use plain recursion (without memoisation) to explore all possible jump sequences. This leads to **exponential time complexity** because from each position, you have up to `nums[i]` choices of where to jump next.
With `nums.length <= 10^4` and `nums[i] <= 10^5`, this approach will cause a **Time Limit Exceeded (TLE)** error. The greedy approach avoids this by recognising that we only need to track the maximum reach, not every individual path.
wrong_approach: "Recursively exploring all jump combinations"
correct_approach: "Track maximum reachable index in single pass"
- title: Forgetting to Check Reachability Before Updating
description: |
A subtle bug occurs when you update `max_reach` without first checking if the current index is reachable. Consider `nums = [0, 2, 3]`:
- At index 0: `max_reach = 0 + 0 = 0`
- At index 1: If you don't check reachability, you'd calculate `max_reach = 1 + 2 = 3`
- But index 1 was never reachable from index 0!
Always check `i <= max_reach` before processing position `i`.
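The `[0, 2, 3]` case can be run directly. In this sketch, `can_jump_unchecked` is a deliberately buggy variant written for illustration, and `can_jump_checked` adds the reachability test.

```python
def can_jump_unchecked(nums):
    """Deliberately buggy: extends max_reach from indices that may be unreachable."""
    max_reach = 0
    for i, step in enumerate(nums):
        max_reach = max(max_reach, i + step)
    return max_reach >= len(nums) - 1


def can_jump_checked(nums):
    """Stops as soon as the current index lies beyond max_reach."""
    max_reach = 0
    for i, step in enumerate(nums):
        if i > max_reach:
            return False
        max_reach = max(max_reach, i + step)
    return True


print(can_jump_unchecked([0, 2, 3]))  # True (wrong: index 1 is never reachable)
print(can_jump_checked([0, 2, 3]))    # False
```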
wrong_approach: "Update max_reach without checking if current index is reachable"
correct_approach: "Check i <= max_reach before processing each position"
- title: Off-by-One Errors with Array Length
description: |
Remember that the goal is to reach the **last index** (position `n - 1`), not to jump beyond the array. Your condition should check whether `max_reach >= n - 1`, not `max_reach >= n`.
For a single-element array `[0]`, you're already at the last index, so the answer is `true` even though you can't jump anywhere.
key_takeaways:
- "**Greedy reachability**: When you only need to know *if* a destination is reachable (not *how*), tracking the maximum reachable position is often sufficient"
- "**Single-pass efficiency**: By maintaining running state (`max_reach`), we avoid expensive path enumeration and achieve O(n) time"
- "**Foundation for Jump Game II**: This problem extends to finding the *minimum* number of jumps (LeetCode #45), which uses a similar greedy interval approach"
- "**Early termination**: The greedy approach allows us to return `false` as soon as we detect we're stuck, avoiding unnecessary computation"
time_complexity: "O(n). We traverse the array exactly once, performing constant-time operations at each index."
space_complexity: "O(1). We only use a single variable (`max_reach`) regardless of input size."
solutions:
- approach_name: Greedy (Maximum Reach)
is_optimal: true
code: |
def can_jump(nums: list[int]) -> bool:
    # Track the farthest index we can reach
    max_reach = 0
    for i in range(len(nums)):
        # If current index is beyond our reach, we're stuck
        if i > max_reach:
            return False
        # Update max reach from current position
        # We can jump up to nums[i] steps from index i
        max_reach = max(max_reach, i + nums[i])
    # If we processed all indices, we can reach the end
    return True
explanation: |
**Time Complexity:** O(n) — Single pass through the array.
**Space Complexity:** O(1) — Only one variable used.
We iterate through each index, checking if it's reachable and updating our maximum reach. If we ever find ourselves at an unreachable position, we return `false`. Otherwise, completing the loop means the end is reachable.
- approach_name: Greedy (Backward)
is_optimal: true
code: |
def can_jump(nums: list[int]) -> bool:
    # Start with the goal at the last index
    goal = len(nums) - 1
    # Work backwards through the array
    for i in range(len(nums) - 2, -1, -1):
        # If we can reach the goal from position i,
        # then position i becomes our new goal
        if i + nums[i] >= goal:
            goal = i
    # If goal moved all the way to index 0, we can reach the end
    return goal == 0
explanation: |
**Time Complexity:** O(n) — Single pass through the array (backwards).
**Space Complexity:** O(1) — Only one variable used.
This alternative greedy approach works backwards from the end. We ask: "What's the leftmost position that can reach my current goal?" Each time we find such a position, it becomes the new goal. If the goal reaches index 0, we know the end is reachable from the start.
- approach_name: Dynamic Programming
is_optimal: false
code: |
def can_jump(nums: list[int]) -> bool:
    n = len(nums)
    # dp[i] indicates whether index i is reachable
    dp = [False] * n
    dp[0] = True  # Starting position is always reachable
    for i in range(n):
        # Skip unreachable positions
        if not dp[i]:
            continue
        # Mark all positions reachable from i
        for j in range(1, nums[i] + 1):
            if i + j < n:
                dp[i + j] = True
        # Early exit if we've reached the end
        if dp[n - 1]:
            return True
    return dp[n - 1]
explanation: |
**Time Complexity:** O(n × max(nums[i])) — For each position, we may mark up to `nums[i]` subsequent positions.
**Space Complexity:** O(n) — Boolean array tracking reachability of each position.
This DP approach explicitly tracks which positions are reachable. While correct, it's slower than the greedy approach because it does redundant work marking positions that have already been marked. Included to illustrate the progression from DP thinking to greedy optimisation.

@@ -0,0 +1,197 @@
title: K Closest Points to Origin
slug: k-closest-points-to-origin
difficulty: medium
leetcode_id: 973
leetcode_url: https://leetcode.com/problems/k-closest-points-to-origin/
categories:
- arrays
- heap
- sorting
patterns:
- heap
description: |
Given an array of `points` where `points[i] = [x_i, y_i]` represents a point on the **X-Y** plane and an integer `k`, return the `k` closest points to the origin `(0, 0)`.
The distance between two points on the **X-Y** plane is the Euclidean distance (i.e., `sqrt((x1 - x2)^2 + (y1 - y2)^2)`).
You may return the answer in **any order**. The answer is **guaranteed** to be **unique** (except for the order that it is in).
constraints: |
- `1 <= k <= points.length <= 10^4`
- `-10^4 <= x_i, y_i <= 10^4`
examples:
- input: "points = [[1,3],[-2,2]], k = 1"
output: "[[-2,2]]"
explanation: "The distance between (1, 3) and the origin is sqrt(10). The distance between (-2, 2) and the origin is sqrt(8). Since sqrt(8) < sqrt(10), (-2, 2) is closer to the origin. We only want the closest k = 1 points from the origin, so the answer is just [[-2,2]]."
- input: "points = [[3,3],[5,-1],[-2,4]], k = 2"
output: "[[3,3],[-2,4]]"
explanation: "The answer [[-2,4],[3,3]] would also be accepted since any order is valid."
explanation:
intuition: |
Imagine you have a map with several pins representing locations, and you're standing at the center (the origin). You need to find the `k` pins closest to you.
The **core insight** is that we need to efficiently select the k smallest values from a collection — this is the classic **top-k problem**. While sorting all points would work, it's more work than necessary. We don't need full ordering; we just need to identify which k points are closest.
Think of it like this: imagine you're a bouncer at a club with a capacity of `k` people. As people (points) arrive, you only let them in if there's room or if they're "better" (closer) than someone already inside. You don't need to rank everyone perfectly — you just need to maintain the best k at any moment.
A **max-heap of size k** is perfect for this. The heap always holds the k closest points seen so far. When we encounter a new point, we compare it to the *farthest* point in our heap (the max). If the new point is closer, we evict the farthest and add the new one.
**Key optimization**: Since we only care about *relative* distances, we can compare squared distances (`x^2 + y^2`) instead of actual Euclidean distances. This avoids expensive square root calculations without affecting correctness.
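A quick check of that claim, as a minimal sketch: sorting by squared distance and sorting by true Euclidean distance should give the same order. The sample points are taken from the examples above; the variable names are illustrative.

```python
import math

points = [[1, 3], [-2, 2], [3, 3], [5, -1], [-2, 4]]

# Order by squared distance (integer arithmetic only)
by_squared = sorted(points, key=lambda p: p[0] * p[0] + p[1] * p[1])

# Order by true Euclidean distance (needs a square root per point)
by_euclid = sorted(points, key=lambda p: math.hypot(p[0], p[1]))

print(by_squared == by_euclid)  # True
```

Because the square root is strictly increasing on non-negative numbers, dropping it never changes which of two points is closer.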
approach: |
We solve this using a **Max-Heap of Size K** approach:
**Step 1: Define a distance function**
- Create a helper to compute squared Euclidean distance: `x^2 + y^2`
- We use squared distance to avoid the expensive `sqrt()` operation — comparing `d1^2` vs `d2^2` gives the same ordering as `d1` vs `d2`
&nbsp;
**Step 2: Build a max-heap of size k**
- Iterate through each point in the input
- Push each point onto a max-heap (in Python, negate the distance for a max-heap using `heapq`)
- If the heap size exceeds `k`, pop the largest (farthest) point
&nbsp;
**Step 3: Extract results from the heap**
- After processing all points, the heap contains exactly the k closest points
- Extract and return these points
&nbsp;
**Why this works**: By maintaining a max-heap of size k, the root is always the *farthest* among our k candidates. When a closer point arrives, it replaces the farthest, ensuring we always have the k closest. This is more efficient than sorting when `k << n`.
common_pitfalls:
- title: Computing Actual Euclidean Distance
description: |
A common mistake is to compute the actual Euclidean distance using `sqrt(x^2 + y^2)` for every point.
While mathematically correct, the `sqrt()` function is computationally expensive. Since we only need to *compare* distances (not their exact values), squared distances work just as well: if `a^2 < b^2` and both are positive, then `a < b`.
This optimisation can provide a noticeable performance boost, especially with `10^4` points.
wrong_approach: "Using sqrt(x^2 + y^2) for distance"
correct_approach: "Using x^2 + y^2 for distance comparison"
- title: Sorting All Points
description: |
The naive approach is to sort all n points by distance and take the first k.
This gives O(n log n) time complexity regardless of k. When k is small (e.g., k = 10 with n = 10,000), we're doing far more work than necessary.
The heap approach is O(n log k), which is significantly faster when `k << n`. For k = 10 and n = 10,000, `log2(k) ≈ 3.3` versus `log2(n) ≈ 13.3`, so that's roughly 4x fewer comparisons.
wrong_approach: "Sort all points, take first k"
correct_approach: "Use a max-heap of size k"
- title: Using a Min-Heap Instead of Max-Heap
description: |
If you use a min-heap and push all n points, you'd need to pop k times at the end. This requires O(n) space for the full heap.
A max-heap of size k is more memory-efficient (O(k) space) and naturally evicts the farthest point when a closer one arrives. In Python, since `heapq` is a min-heap by default, negate the distances to simulate a max-heap.
wrong_approach: "Min-heap with all n points"
correct_approach: "Max-heap limited to size k"
key_takeaways:
- "**Top-k pattern**: When you need the k smallest/largest elements, a heap of size k is often optimal — O(n log k) beats O(n log n) sorting when `k << n`"
- "**Squared distance optimisation**: Avoid `sqrt()` when comparing distances — squared distances preserve ordering and are faster to compute"
- "**Max-heap for k smallest**: Use a max-heap to track k smallest values; the root lets you quickly check if a new element belongs"
- "**Related problems**: This pattern applies to Kth Largest Element, Top K Frequent Elements, and similar selection problems"
time_complexity: "O(n log k). We iterate through all n points, and each heap operation (push/pop) takes O(log k) time since the heap is capped at size k."
space_complexity: "O(k). The heap stores at most k points at any time."
solutions:
- approach_name: Max-Heap
is_optimal: true
code: |
import heapq
def k_closest(points: list[list[int]], k: int) -> list[list[int]]:
# Max-heap to store k closest points (negate distance for max-heap)
max_heap = []
for x, y in points:
# Squared distance avoids expensive sqrt()
dist = x * x + y * y
# Push negative distance to simulate max-heap
heapq.heappush(max_heap, (-dist, [x, y]))
# If heap exceeds size k, remove the farthest point
if len(max_heap) > k:
heapq.heappop(max_heap)
# Extract the k closest points from the heap
return [point for _, point in max_heap]
explanation: |
**Time Complexity:** O(n log k) — We process n points, each heap operation is O(log k).
**Space Complexity:** O(k) — The heap stores at most k elements.
By maintaining a max-heap of size k, we efficiently track the k closest points. The negative distance trick converts Python's min-heap into a max-heap, ensuring the farthest point is always at the root for quick comparison and removal.
- approach_name: Sort by Distance
is_optimal: false
code: |
def k_closest(points: list[list[int]], k: int) -> list[list[int]]:
# Sort all points by squared distance from origin
points.sort(key=lambda p: p[0] * p[0] + p[1] * p[1])
# Return the first k points
return points[:k]
explanation: |
**Time Complexity:** O(n log n) — Sorting dominates the complexity.
**Space Complexity:** O(1) to O(n) — Depends on the sorting algorithm used.
This approach is simpler to implement and may be preferred when k is close to n. However, for small k values relative to n, the heap approach is more efficient. The simplicity makes this a good choice when optimisation isn't critical.
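If brevity is the goal, Python's standard library offers a middle ground. The sketch below uses `heapq.nsmallest` with a `key` function; the function name `k_closest_nsmallest` is an illustrative assumption. Unlike the in-place sort, it leaves `points` unmodified.

```python
import heapq


def k_closest_nsmallest(points: list[list[int]], k: int) -> list[list[int]]:
    # nsmallest returns the k items with the smallest key, in ascending key order
    return heapq.nsmallest(k, points, key=lambda p: p[0] * p[0] + p[1] * p[1])


print(k_closest_nsmallest([[3, 3], [5, -1], [-2, 4]], 2))  # [[3, 3], [-2, 4]]
```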
- approach_name: Quickselect
is_optimal: false
code: |
import random
def k_closest(points: list[list[int]], k: int) -> list[list[int]]:
def dist(point: list[int]) -> int:
return point[0] * point[0] + point[1] * point[1]
def partition(left: int, right: int, pivot_idx: int) -> int:
pivot_dist = dist(points[pivot_idx])
# Move pivot to end
points[pivot_idx], points[right] = points[right], points[pivot_idx]
store_idx = left
# Move all closer points to the left
for i in range(left, right):
if dist(points[i]) < pivot_dist:
points[store_idx], points[i] = points[i], points[store_idx]
store_idx += 1
# Move pivot to its final position
points[store_idx], points[right] = points[right], points[store_idx]
return store_idx
left, right = 0, len(points) - 1
while left < right:
pivot_idx = random.randint(left, right)
pivot_idx = partition(left, right, pivot_idx)
if pivot_idx == k:
break
elif pivot_idx < k:
left = pivot_idx + 1
else:
right = pivot_idx - 1
return points[:k]
explanation: |
**Time Complexity:** O(n) average, O(n^2) worst case — Quickselect has linear average time.
**Space Complexity:** O(1) — In-place partitioning.
Quickselect is theoretically optimal with O(n) average time. It partitions the array around a pivot, similar to quicksort, but only recurses into the partition containing the k-th element. The randomised pivot selection helps avoid worst-case scenarios. However, the heap approach is often preferred in practice due to its guaranteed O(n log k) bound.

@@ -0,0 +1,174 @@
title: Koko Eating Bananas
slug: koko-eating-bananas
difficulty: medium
leetcode_id: 875
leetcode_url: https://leetcode.com/problems/koko-eating-bananas/
categories:
- arrays
- binary-search
patterns:
- binary-search
description: |
Koko loves to eat bananas. There are `n` piles of bananas, the i<sup>th</sup> pile has `piles[i]` bananas. The guards have gone and will come back in `h` hours.
Koko can decide her bananas-per-hour eating speed of `k`. Each hour, she chooses some pile of bananas and eats `k` bananas from that pile. If the pile has less than `k` bananas, she eats all of them instead and will not eat any more bananas during that hour.
Koko likes to eat slowly but still wants to finish eating all the bananas before the guards return.
Return *the minimum integer* `k` *such that she can eat all the bananas within* `h` *hours*.
constraints: |
- `1 <= piles.length <= 10^4`
- `piles.length <= h <= 10^9`
- `1 <= piles[i] <= 10^9`
examples:
- input: "piles = [3,6,7,11], h = 8"
output: "4"
explanation: "At speed k = 4, Koko takes ceil(3/4) + ceil(6/4) + ceil(7/4) + ceil(11/4) = 1 + 2 + 2 + 3 = 8 hours, which exactly meets the deadline."
- input: "piles = [30,11,23,4,20], h = 5"
output: "30"
explanation: "With only 5 hours and 5 piles, Koko must finish each pile in exactly 1 hour. She needs k = 30 (the largest pile) to eat any pile in one hour."
- input: "piles = [30,11,23,4,20], h = 6"
output: "23"
explanation: "With 6 hours for 5 piles, Koko has one extra hour. At k = 23, the pile of 30 takes ceil(30/23) = 2 hours, while all others take 1 hour each, totaling 6 hours."
explanation:
intuition: |
Imagine you're given a dial that controls Koko's eating speed. Turn it up, and she finishes faster but eats more per hour than necessary. Turn it down, and she enjoys smaller bites but risks running out of time.
The key insight is that this dial creates a **monotonic relationship**: if Koko can finish at speed `k`, she can definitely finish at any speed greater than `k`. Conversely, if she can't finish at speed `k`, she can't finish at any slower speed either.
This monotonicity is the hallmark of a **binary search on the answer** problem. Instead of searching for an element in an array, we're searching for the minimum valid value in a range of possible speeds.
Think of it like this: imagine all possible speeds from `1` to `max(piles)` laid out on a number line. At some point, there's a boundary — speeds below it fail (too slow), and speeds at or above it succeed. Binary search efficiently finds that boundary.
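The boundary can be made visible by evaluating every speed for the first example. This is only an illustration (the helper name `can_finish` is an assumption), since checking every speed is exactly what binary search lets us avoid.

```python
def can_finish(piles: list[int], h: int, k: int) -> bool:
    """True if speed k finishes every pile within h hours."""
    return sum((pile + k - 1) // k for pile in piles) <= h


piles, h = [3, 6, 7, 11], 8
feasible = [can_finish(piles, h, k) for k in range(1, max(piles) + 1)]

print(feasible)
# [False, False, False, True, True, True, True, True, True, True, True]
print(feasible.index(True) + 1)  # 4, the first speed that works
```

The list is a run of `False` followed by a run of `True`, which is the sorted structure binary search needs.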
approach: |
We use **Binary Search on the Answer** to find the minimum valid eating speed:
**Step 1: Define the search space**
- `left`: Set to `1` — the minimum possible speed (eating at least one banana per hour)
- `right`: Set to `max(piles)` — eating faster than the largest pile is wasteful since Koko can only eat from one pile per hour
&nbsp;
**Step 2: Implement the feasibility check**
- For a given speed `k`, calculate the total hours needed to eat all piles
- For each pile, the hours needed is `ceil(pile / k)`, which equals `(pile + k - 1) // k` using integer math
- If total hours `<= h`, the speed is feasible
&nbsp;
**Step 3: Binary search for the minimum valid speed**
- Calculate `mid = (left + right) // 2`
- If `mid` is a feasible speed, it might be our answer, but there could be a smaller valid speed — search left by setting `right = mid`
- If `mid` is not feasible, we need a faster speed — search right by setting `left = mid + 1`
- Continue until `left == right`
&nbsp;
**Step 4: Return the result**
- Return `left` (or `right`) as the minimum speed that allows Koko to finish on time
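The four steps can also be factored into a reusable "first value where a predicate becomes true" helper. This is a sketch under the assumption that the predicate is monotonic and true at `hi`; the names `first_true` and `min_speed` are hypothetical.

```python
from typing import Callable


def first_true(lo: int, hi: int, pred: Callable[[int], bool]) -> int:
    """Smallest x in [lo, hi] with pred(x) true, assuming pred(hi) is true
    and pred never switches back from true to false."""
    while lo < hi:
        mid = (lo + hi) // 2
        if pred(mid):
            hi = mid        # mid works, but a smaller value might too
        else:
            lo = mid + 1    # mid fails, so the answer lies to the right
    return lo


def min_speed(piles: list[int], h: int) -> int:
    def feasible(k: int) -> bool:
        return sum((pile + k - 1) // k for pile in piles) <= h

    return first_true(1, max(piles), feasible)


print(min_speed([3, 6, 7, 11], 8))  # 4
```

Separating the search from the feasibility check makes it easy to reuse the same loop for other "minimum value satisfying a condition" problems.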
common_pitfalls:
- title: Trying All Speeds Linearly
description: |
A naive approach checks every speed from `1` to `max(piles)` and returns the first one that works.
With `max(piles)` up to `10^9`, this linear search performs up to a billion iterations — far too slow.
Binary search reduces this to at most `log2(10^9) ≈ 30` iterations.
wrong_approach: "Linear search from 1 to max(piles)"
correct_approach: "Binary search on the speed range"
- title: Incorrect Ceiling Division
description: |
When calculating hours for a pile, we need `ceil(pile / k)`. Using regular integer division `pile // k` gives the floor, which undercounts.
For example, `pile = 7, k = 4`: floor is `1`, but Koko actually needs `2` hours (one hour for 4 bananas, another for the remaining 3).
Use the formula `(pile + k - 1) // k` or Python's `math.ceil(pile / k)` for correct results.
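A short comparison of the options, using the numbers from the example above. The helper name `ceil_div` is an illustrative assumption.

```python
import math


def ceil_div(a: int, b: int) -> int:
    """Ceiling of a / b for a >= 0 and b > 0, using only integers."""
    return (a + b - 1) // b


pile, k = 7, 4
print(pile // k)            # 1: floor division undercounts
print(ceil_div(pile, k))    # 2
print(-(-pile // k))        # 2: equivalent negation trick
print(math.ceil(pile / k))  # 2
```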
wrong_approach: "pile // k (floor division)"
correct_approach: "(pile + k - 1) // k (ceiling division)"
- title: Wrong Search Space Bounds
description: |
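Setting `right` too high (e.g., `sum(piles)`) wastes iterations. The maximum useful speed is `max(piles)` because eating faster doesn't help: Koko still spends one hour per pile regardless. Using `h` as the upper bound is worse than wasteful, because the required speed can exceed `h` (e.g., `piles = [100], h = 1` needs `k = 100`).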
Setting `left` to `0` causes division by zero errors. The minimum meaningful speed is `1`.
wrong_approach: "left = 0 or right = sum(piles)"
correct_approach: "left = 1, right = max(piles)"
- title: Off-by-One in Binary Search
description: |
When searching for a minimum valid value:
- If `mid` works, set `right = mid` (not `mid - 1`) because `mid` could be the answer
- If `mid` fails, set `left = mid + 1`
Using `right = mid - 1` when `mid` is valid might skip the answer. The loop condition `left < right` ensures we converge correctly.
wrong_approach: "right = mid - 1 when mid is feasible"
correct_approach: "right = mid when mid is feasible"
key_takeaways:
- "**Binary search on the answer**: When asked to find the minimum/maximum value satisfying a condition, and the condition is monotonic, binary search applies"
- "**Monotonicity is key**: If a speed `k` works, all larger speeds work too — this sorted property enables binary search"
- "**Ceiling division pattern**: `(a + b - 1) // b` computes `ceil(a / b)` using only integers, avoiding floating-point issues"
- "**Similar problems**: This pattern applies to Capacity To Ship Packages Within D Days, Split Array Largest Sum, and Magnetic Force Between Two Balls"
time_complexity: "O(n log m). Binary search runs `O(log m)` iterations where `m = max(piles)`, and each feasibility check scans all `n` piles."
space_complexity: "O(1). We only use a constant number of variables for the search bounds and hour calculations."
solutions:
- approach_name: Binary Search on Answer
is_optimal: true
code: |
def min_eating_speed(piles: list[int], h: int) -> int:
# Search space: minimum speed 1, maximum speed is largest pile
left, right = 1, max(piles)
while left < right:
mid = (left + right) // 2
# Calculate total hours needed at speed mid
hours_needed = sum((pile + mid - 1) // mid for pile in piles)
if hours_needed <= h:
# Speed mid works, but maybe we can go slower
right = mid
else:
# Too slow, need to eat faster
left = mid + 1
return left
explanation: |
**Time Complexity:** O(n log m) — Binary search over `m = max(piles)` speeds, each iteration scans `n` piles.
**Space Complexity:** O(1) — Only constant extra space used.
We binary search for the minimum speed where Koko can finish on time. The feasibility check sums up the hours needed for each pile using ceiling division.
- approach_name: Linear Search
is_optimal: false
code: |
def min_eating_speed(piles: list[int], h: int) -> int:
# Try every speed from 1 up to max pile
for k in range(1, max(piles) + 1):
# Calculate hours needed at this speed
hours_needed = sum((pile + k - 1) // k for pile in piles)
# Return first speed that works
if hours_needed <= h:
return k
return max(piles)
explanation: |
**Time Complexity:** O(n × m) — Checks up to `m = max(piles)` speeds, each requiring O(n) time.
**Space Complexity:** O(1) — Only constant extra space used.
This brute force approach tries every possible speed starting from 1. While correct, it times out on large inputs where `max(piles)` can be up to `10^9`. Included to illustrate why binary search is essential.

@@ -0,0 +1,186 @@
title: Kth Largest Element in a Stream
slug: kth-largest-element-in-a-stream
difficulty: easy
leetcode_id: 703
leetcode_url: https://leetcode.com/problems/kth-largest-element-in-a-stream/
categories:
- heap
- arrays
patterns:
- heap
description: |
Design a class to find the `k`<sup>th</sup> largest element in a stream.
Note that it is the `k`<sup>th</sup> largest element in the sorted order, not the `k`<sup>th</sup> distinct element.
Implement the `KthLargest` class:
- `KthLargest(int k, int[] nums)` — Initializes the object with the integer `k` and the stream of test scores `nums`.
- `int add(int val)` — Adds a new test score `val` to the stream and returns the element representing the `k`<sup>th</sup> largest element in the pool of test scores so far.
constraints: |
- `0 <= nums.length <= 10^4`
- `1 <= k <= nums.length + 1`
- `-10^4 <= nums[i] <= 10^4`
- `-10^4 <= val <= 10^4`
- At most `10^4` calls will be made to `add`
examples:
- input: |
["KthLargest", "add", "add", "add", "add", "add"]
[[3, [4, 5, 8, 2]], [3], [5], [10], [9], [4]]
output: "[null, 4, 5, 5, 8, 8]"
explanation: |
KthLargest kthLargest = new KthLargest(3, [4, 5, 8, 2]);
kthLargest.add(3); // return 4 (stream: [2,3,4,5,8], 3rd largest = 4)
kthLargest.add(5); // return 5 (stream: [2,3,4,5,5,8], 3rd largest = 5)
kthLargest.add(10); // return 5 (stream: [2,3,4,5,5,8,10], 3rd largest = 5)
kthLargest.add(9); // return 8 (stream: [2,3,4,5,5,8,9,10], 3rd largest = 8)
kthLargest.add(4); // return 8 (stream: [2,3,4,4,5,5,8,9,10], 3rd largest = 8)
- input: |
["KthLargest", "add", "add", "add", "add"]
[[4, [7, 7, 7, 7, 8, 3]], [2], [10], [9], [9]]
output: "[null, 7, 7, 7, 8]"
explanation: |
KthLargest kthLargest = new KthLargest(4, [7, 7, 7, 7, 8, 3]);
kthLargest.add(2); // return 7 (4th largest = 7)
kthLargest.add(10); // return 7 (4th largest = 7)
kthLargest.add(9); // return 7 (4th largest = 7)
kthLargest.add(9); // return 8 (4th largest = 8)
explanation:
intuition: |
Imagine you're running a leaderboard for the top `k` players in a game. You don't need to track *everyone* — just the top `k`. When a new player joins:
- If they're not good enough to crack the top `k`, you ignore them
- If they are, they bump out the current `k`<sup>th</sup> place player
The key insight is: **the `k`<sup>th</sup> largest element is always the smallest element in the top `k` group**. If we maintain exactly `k` elements (the `k` largest seen so far), the minimum of this group is our answer.
A **min-heap** of size `k` is perfect for this. The heap property guarantees the smallest element sits at the top. After each insertion, if our heap grows beyond `k` elements, we pop the smallest — ensuring we always keep exactly the `k` largest values, with the `k`<sup>th</sup> largest conveniently sitting at the heap's root.
Think of it like a bouncer at an exclusive club: the venue only holds `k` people. When someone new arrives, if they're more important than the least important person inside, they swap places. The bouncer (heap root) always knows who's on the bubble.
approach: |
We use a **Min-Heap of size k** to solve this efficiently:
**Step 1: Initialise the heap**
- Create an empty min-heap
- Add all elements from the initial `nums` array to the heap
- After adding each element, if heap size exceeds `k`, pop the minimum
&nbsp;
**Step 2: Implement the add operation**
- Push the new value onto the heap
- If heap size exceeds `k`, pop the minimum (it's no longer in the top `k`)
- Return the heap's minimum — this is the `k`<sup>th</sup> largest
&nbsp;
**Why this works:**
- The heap always contains exactly the `k` largest elements seen so far
- The min-heap property ensures the smallest of these (the `k`<sup>th</sup> largest overall) is at the root
- We only pop elements smaller than the `k`<sup>th</sup> largest, preserving correctness
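The steps above can be sketched with `heapq.heapify` for the initial array and `heapq.heappushpop` for updates. This is one possible variant, not the only way to write it, and the class name `KthLargestSketch` is an assumption. The calls replay the first example.

```python
import heapq


class KthLargestSketch:
    def __init__(self, k: int, nums: list[int]):
        self.k = k
        self.heap = list(nums)
        heapq.heapify(self.heap)            # O(n) build
        while len(self.heap) > k:           # trim down to the k largest
            heapq.heappop(self.heap)

    def add(self, val: int) -> int:
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, val)
        else:
            # Push val, then drop the smallest; the heap size stays at k
            heapq.heappushpop(self.heap, val)
        return self.heap[0]


stream = KthLargestSketch(3, [4, 5, 8, 2])
print([stream.add(v) for v in [3, 5, 10, 9, 4]])  # [4, 5, 5, 8, 8]
```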
common_pitfalls:
- title: Using a Max-Heap Instead of Min-Heap
description: |
A max-heap gives you the *largest* element at the root, but we need the `k`<sup>th</sup> largest. With a max-heap of size `k`, you'd have to traverse to find the minimum.
The trick is counter-intuitive: use a **min-heap** of size `k`. The root gives you the minimum of the `k` largest elements — which is exactly the `k`<sup>th</sup> largest overall.
wrong_approach: "Max-heap of size k"
correct_approach: "Min-heap of size k"
- title: Keeping All Elements
description: |
Storing all `n` elements and sorting to find the `k`<sup>th</sup> largest gives O(n log n) per query. With up to `10^4` calls to `add`, this becomes too slow.
By maintaining only `k` elements in the heap, each `add` operation is O(log k), which is much faster when `k << n`.
wrong_approach: "Sort all elements on each query"
correct_approach: "Maintain a fixed-size heap of k elements"
- title: Forgetting to Handle Initial Array
description: |
The constructor receives an initial array `nums` that may have more than `k` elements. You must process these through the heap first, trimming down to size `k` before any `add` calls.
If you skip this step, your heap won't be properly initialised and the first few `add` calls will return wrong results.
wrong_approach: "Ignore nums in constructor"
correct_approach: "Heapify nums and trim to size k in constructor"
key_takeaways:
- "**Min-heap for k largest**: A min-heap of size `k` efficiently tracks the `k`<sup>th</sup> largest element — it's the heap's root"
- "**Bounded heap pattern**: Maintain a fixed-size heap by popping after each push when size exceeds `k`"
- "**O(log k) vs O(log n)**: Limiting heap size to `k` gives faster operations than keeping all elements"
- "**Foundation for streaming problems**: This pattern applies to any 'top k' problem in a data stream (e.g., top k frequent, k closest points)"
time_complexity: "O(n log k) for initialisation where `n` is the size of `nums`, and O(log k) for each `add` call. Each heap operation (push/pop) takes O(log k) time since the heap never exceeds size `k`."
space_complexity: "O(k). We only store at most `k` elements in the heap at any time, regardless of how many elements are added to the stream."
solutions:
- approach_name: Min-Heap
is_optimal: true
code: |
import heapq
class KthLargest:
def __init__(self, k: int, nums: list[int]):
self.k = k
self.heap = []
# Add initial elements to the heap
for num in nums:
heapq.heappush(self.heap, num)
# Keep only the k largest elements
if len(self.heap) > k:
heapq.heappop(self.heap)
def add(self, val: int) -> int:
# Add new value to the heap
heapq.heappush(self.heap, val)
# If heap exceeds k, remove the smallest
if len(self.heap) > self.k:
heapq.heappop(self.heap)
# The root of min-heap is the kth largest
return self.heap[0]
explanation: |
**Time Complexity:** O(n log k) for constructor, O(log k) per `add` call — heap operations on a heap of size `k`.
**Space Complexity:** O(k) — the heap stores at most `k` elements.
We maintain a min-heap of exactly `k` elements. The smallest element in this heap (the root) is the `k`<sup>th</sup> largest overall. When adding a new element, if the heap grows beyond `k`, we pop the smallest — it's no longer in the top `k`.
- approach_name: Sorted List
is_optimal: false
code: |
import bisect
class KthLargest:
def __init__(self, k: int, nums: list[int]):
self.k = k
# Keep a sorted list of the k largest elements
self.sorted_list = sorted(nums, reverse=True)[:k]
self.sorted_list.reverse() # Ascending order for bisect
def add(self, val: int) -> int:
# Insert in sorted position
bisect.insort(self.sorted_list, val)
# Keep only k largest (remove smallest if needed)
if len(self.sorted_list) > self.k:
self.sorted_list.pop(0)
# Return the kth largest (smallest in our k-size list)
return self.sorted_list[0]
explanation: |
**Time Complexity:** O(n log n) for constructor, O(k) per `add` call — `bisect.insort` is O(k) due to shifting elements.
**Space Complexity:** O(k) — stores at most `k` elements.
This approach uses a sorted list with binary search insertion. While the space is the same, the O(k) insertion time makes it slower than the heap approach for large `k`. The heap's O(log k) operations are more efficient.

@@ -0,0 +1,211 @@
title: Kth Largest Element in an Array
slug: kth-largest-element-in-an-array
difficulty: medium
leetcode_id: 215
leetcode_url: https://leetcode.com/problems/kth-largest-element-in-an-array/
categories:
- arrays
- sorting
- heap
patterns:
- heap
- binary-search
description: |
Given an integer array `nums` and an integer `k`, return *the* `k`<sup>th</sup> *largest element in the array*.
Note that it is the `k`<sup>th</sup> largest element in the sorted order, not the `k`<sup>th</sup> distinct element.
Can you solve it without sorting?
constraints: |
- `1 <= k <= nums.length <= 10^5`
- `-10^4 <= nums[i] <= 10^4`
examples:
- input: "nums = [3,2,1,5,6,4], k = 2"
output: "5"
explanation: "The sorted array is [1,2,3,4,5,6]. The 2nd largest element is 5."
- input: "nums = [3,2,3,1,2,4,5,5,6], k = 4"
output: "4"
explanation: "The sorted array is [1,2,2,3,3,4,5,5,6]. The 4th largest element is 4."
explanation:
intuition: |
Imagine you have a collection of exam scores and you want to find the student who ranked `k`<sup>th</sup> from the top. The most straightforward approach would be to sort all scores and pick the `k`<sup>th</sup> one from the end — but can we do better?
Think of it like this: if you only need to find *one* specific ranking, do you really need to sort *everything*? This is similar to finding the tallest person in a room versus sorting everyone by height — the first task is much simpler.
The key insight is that we don't need a fully sorted array. We only need to find the element that would be at position `n - k` if the array were sorted (0-indexed). This opens the door to more efficient approaches:
1. **Heap approach**: Maintain a "top k" collection using a min-heap of size `k`. Any element smaller than our current `k`<sup>th</sup> largest can be discarded.
2. **Quickselect approach**: Use the partitioning logic from quicksort, but only recurse into the half that contains our target position.
Both avoid the full `O(n log n)` cost of sorting when we only need partial ordering.
approach: |
We'll focus on the **Min-Heap approach** as the primary solution due to its consistent performance and clarity:
**Step 1: Understand the heap strategy**
- We maintain a min-heap of size `k`
- The min-heap always contains the `k` largest elements seen so far
- The root of the heap (minimum of these `k` elements) is our answer
&nbsp;
**Step 2: Initialise the heap**
- Create an empty min-heap
- We'll use Python's `heapq` which implements a min-heap
&nbsp;
**Step 3: Process each element**
- For each number in the array:
- If the heap has fewer than `k` elements, push the number
- Otherwise, if the number is larger than the heap's minimum (root), replace the root with this number
- This ensures we always keep the `k` largest elements
&nbsp;
**Step 4: Return the result**
- The root of the heap is the `k`<sup>th</sup> largest element
- Return `heap[0]`
&nbsp;
**Why this works**: By keeping exactly `k` elements and always removing the smallest when we exceed capacity, we guarantee that the smallest element in our heap is larger than all discarded elements — making it exactly the `k`<sup>th</sup> largest overall.
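Step 3 as written (push while the heap is not full, otherwise replace the root only for larger values) could look like the sketch below. The function name `find_kth_largest_replace` is an assumption; `heapq.heapreplace` pops the root and pushes the new value in one operation.

```python
import heapq


def find_kth_largest_replace(nums: list[int], k: int) -> int:
    heap: list[int] = []
    for num in nums:
        if len(heap) < k:
            # Still filling the heap with the first k elements
            heapq.heappush(heap, num)
        elif num > heap[0]:
            # num beats the current kth largest, so it replaces the root
            heapq.heapreplace(heap, num)
    return heap[0]


print(find_kth_largest_replace([3, 2, 1, 5, 6, 4], 2))  # 5
```

Elements no larger than the root are skipped without touching the heap, which saves a push and a pop for each of them.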
common_pitfalls:
- title: Off-by-One with Heap Size
description: |
A common mistake is confusion about when to push vs. replace in the heap.
Pushing every element and then popping whenever the size exceeds `k` is correct: if the element you just pushed is the smallest, popping it straight back out is exactly the intended behaviour. The bugs come from hand-written push-or-replace conditionals, such as comparing against the root before the heap holds `k` elements or popping when the size is exactly `k`.
If you want to avoid the wasted push, check that the heap is full and that the new element is larger than the root before replacing it, or call `heapq.heappushpop` once the heap holds `k` elements.
wrong_approach: "Complex conditional logic that's easy to get wrong"
correct_approach: "Push then pop if size > k, or use heappushpop for efficiency"
- title: Using Max-Heap Incorrectly
description: |
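Some attempt to use a max-heap of the entire array and pop `k-1` times, then read the root. This is correct, and its running time is competitive:
- Building a max-heap: `O(n)`
- Popping `k-1` times: `O(k log n)`
- Total: `O(n + k log n)`
The real cost is memory: the heap holds all `n` elements (`O(n)` extra space, or a mutated input), and the whole array must be available up front. A min-heap of size `k` runs in `O(n log k)` time with only `O(k)` space and also works when elements arrive one at a time.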
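For comparison, the full-array max-heap variant described in this pitfall can be sketched with negated values, alongside the standard-library shortcut `heapq.nlargest`. Both function names are illustrative assumptions.

```python
import heapq


def find_kth_largest_max_heap(nums: list[int], k: int) -> int:
    # heapq is a min-heap, so negate values to simulate a max-heap
    max_heap = [-x for x in nums]       # O(n) extra space
    heapq.heapify(max_heap)             # O(n)
    for _ in range(k - 1):              # discard the k - 1 largest values
        heapq.heappop(max_heap)
    return -max_heap[0]


def find_kth_largest_nlargest(nums: list[int], k: int) -> int:
    # nlargest returns the k largest values in descending order
    return heapq.nlargest(k, nums)[-1]


print(find_kth_largest_max_heap([3, 2, 1, 5, 6, 4], 2))   # 5
print(find_kth_largest_nlargest([3, 2, 1, 5, 6, 4], 2))   # 5
```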
wrong_approach: "Max-heap of all elements, pop k-1 times"
correct_approach: "Min-heap of size k, maintaining the k largest"
- title: Forgetting Python's heapq is Min-Heap Only
description: |
Python's `heapq` only provides a min-heap. To simulate a max-heap, you must negate values when pushing and negate again when popping.
For this problem, a min-heap is actually what we want — we keep the `k` largest elements by discarding elements smaller than our current `k`<sup>th</sup> largest.
wrong_approach: "Assuming heapq has a max-heap option"
correct_approach: "Use min-heap directly for finding kth largest"
key_takeaways:
- "**Partial ordering insight**: When you only need one specific rank, you don't need to sort everything — use a heap or quickselect instead"
- "**Min-heap for top-k**: A min-heap of size `k` naturally maintains the `k` largest elements, with the `k`<sup>th</sup> largest at the root"
- "**Trade-off awareness**: Heap gives `O(n log k)` guaranteed; Quickselect gives `O(n)` average but `O(n^2)` worst case"
- "**Foundation pattern**: This technique applies to streaming data, top-k frequent elements, and many ranking problems"
time_complexity: "O(n log k). We iterate through all `n` elements, and each heap operation (push/pop) takes `O(log k)` time since the heap size is bounded by `k`."
space_complexity: "O(k). We maintain a heap containing at most `k` elements."
solutions:
- approach_name: Min-Heap
is_optimal: true
code: |
import heapq
def find_kth_largest(nums: list[int], k: int) -> int:
# Min-heap to store the k largest elements
heap = []
for num in nums:
# Add current number to heap
heapq.heappush(heap, num)
# If heap exceeds size k, remove the smallest
# This ensures we keep only the k largest elements
if len(heap) > k:
heapq.heappop(heap)
# The root of min-heap is the kth largest
return heap[0]
explanation: |
**Time Complexity:** O(n log k) — We process each of `n` elements with heap operations costing `O(log k)`.
**Space Complexity:** O(k) — The heap stores at most `k` elements.
This approach maintains a min-heap of the `k` largest elements seen so far. By keeping the heap size at `k` and using a min-heap, the smallest element in our collection (the root) is always the `k`<sup>th</sup> largest overall.
- approach_name: Quickselect
is_optimal: true
code: |
import random
def find_kth_largest(nums: list[int], k: int) -> int:
# Convert kth largest to index in sorted array
# kth largest = element at index (n - k) in ascending order
target_index = len(nums) - k
def quickselect(left: int, right: int) -> int:
# Random pivot to avoid worst-case on sorted input
pivot_idx = random.randint(left, right)
pivot = nums[pivot_idx]
# Move pivot to end
nums[pivot_idx], nums[right] = nums[right], nums[pivot_idx]
# Partition: elements < pivot go to the left
store_idx = left
for i in range(left, right):
if nums[i] < pivot:
nums[store_idx], nums[i] = nums[i], nums[store_idx]
store_idx += 1
# Move pivot to its final sorted position
nums[store_idx], nums[right] = nums[right], nums[store_idx]
# Check if we found the target
if store_idx == target_index:
return nums[store_idx]
elif store_idx < target_index:
# Target is in the right partition
return quickselect(store_idx + 1, right)
else:
# Target is in the left partition
return quickselect(left, store_idx - 1)
return quickselect(0, len(nums) - 1)
explanation: |
**Time Complexity:** O(n) average, O(n^2) worst case — Average case is linear because we only recurse into one half. Random pivot selection makes worst case very unlikely.
**Space Complexity:** O(log n) average for recursion stack, O(n) worst case.
Quickselect uses the partitioning logic from quicksort but only recurses into the partition containing our target index. This reduces the expected work from `O(n log n)` to `O(n)`.
- approach_name: Sorting
is_optimal: false
code: |
def find_kth_largest(nums: list[int], k: int) -> int:
# Sort in descending order
nums.sort(reverse=True)
# Return the kth element (0-indexed, so k-1)
return nums[k - 1]
explanation: |
**Time Complexity:** O(n log n) — Dominated by the sorting step.
**Space Complexity:** O(1) to O(n) — Depends on the sorting algorithm used (in-place vs. not).
The simplest approach: sort and index. While not optimal for this specific problem, it's worth knowing as a baseline. For small arrays or when `k` is close to `n`, the practical difference may be negligible.


@@ -0,0 +1,212 @@
title: Kth Smallest Element in a BST
slug: kth-smallest-element-in-a-bst
difficulty: medium
leetcode_id: 230
leetcode_url: https://leetcode.com/problems/kth-smallest-element-in-a-bst/
categories:
- trees
- recursion
patterns:
- dfs
- tree-traversal
description: |
Given the `root` of a binary search tree, and an integer `k`, return the `k`<sup>th</sup> smallest value (**1-indexed**) of all the values of the nodes in the tree.
constraints: |
- The number of nodes in the tree is `n`
- `1 <= k <= n <= 10^4`
- `0 <= Node.val <= 10^4`
examples:
- input: "root = [3,1,4,null,2], k = 1"
output: "1"
explanation: "The inorder traversal of the tree is [1, 2, 3, 4]. The 1st smallest element is 1."
- input: "root = [5,3,6,2,4,null,null,1], k = 3"
output: "3"
explanation: "The inorder traversal is [1, 2, 3, 4, 5, 6]. The 3rd smallest element is 3."
explanation:
intuition: |
The key insight comes from the fundamental property of a **Binary Search Tree (BST)**: for any node, all values in its left subtree are smaller, and all values in its right subtree are larger.
What does this mean for traversal? If you visit nodes in **inorder** order (left → current → right), you get all values in **sorted ascending order**!
Imagine walking through the tree: you go as far left as possible first (smallest values), then visit the current node, then explore the right subtree. This naturally produces a sorted sequence.
So finding the k<sup>th</sup> smallest becomes simple: perform an inorder traversal and return the k<sup>th</sup> element you encounter. No need to collect all values first — you can count as you go and stop early once you've found it.
Think of it like this: the BST is already "pre-sorted" by its structure. The inorder traversal simply reads this sorted order.
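The "count as you go and stop early" idea can be sketched with a lazy generator. This is one possible illustration, assuming a hypothetical minimal `Node` class and a helper named `kth_smallest_lazy`; neither name comes from the problem statement.

```python
from itertools import islice
from typing import Iterator, Optional


class Node:
    """Hypothetical minimal tree node, for illustration only."""

    def __init__(self, val: int,
                 left: Optional["Node"] = None,
                 right: Optional["Node"] = None):
        self.val = val
        self.left = left
        self.right = right


def inorder(node: Optional[Node]) -> Iterator[int]:
    # Lazily yield values in ascending order: left, current, right.
    if node is None:
        return
    yield from inorder(node.left)
    yield node.val
    yield from inorder(node.right)


def kth_smallest_lazy(root: Optional[Node], k: int) -> int:
    # islice skips the first k - 1 values; next() pulls the kth one,
    # after which the generator is never resumed.
    return next(islice(inorder(root), k - 1, None))


# Tree from the first example: [3,1,4,null,2]
root = Node(3, Node(1, None, Node(2)), Node(4))
print(kth_smallest_lazy(root, 1))  # 1
```

Because the generator is only advanced `k` times, nodes after the k<sup>th</sup> one are never visited.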
approach: |
We solve this using **Inorder Traversal with Early Termination**:
**Step 1: Understand the traversal pattern**
- Inorder traversal visits: left subtree → current node → right subtree
- For a BST, this visits nodes in ascending sorted order
- We count each node we visit until we reach the k<sup>th</sup> one
&nbsp;
**Step 2: Set up tracking variables**
- `count`: Track how many nodes we've visited (starts at `0`)
- `result`: Store the k<sup>th</sup> smallest value once found
&nbsp;
**Step 3: Perform inorder traversal**
- Recursively traverse the left subtree
- Increment `count` when visiting the current node
- If `count == k`, we've found our answer — store it and stop
- Otherwise, recursively traverse the right subtree
&nbsp;
**Step 4: Early termination**
- Once we've found the k<sup>th</sup> element, there's no need to continue traversing
- We can use a flag or check if result is set to stop recursion early
&nbsp;
This approach leverages the BST property to avoid sorting, achieving O(H + k) time where H is the tree height.
common_pitfalls:
- title: Collecting All Values Then Sorting
description: |
A naive approach collects all node values into a list, sorts it, and returns the k<sup>th</sup> element.
This works but wastes both time and space. With n nodes, you'd use O(n) space for the list and O(n log n) time for sorting.
Since BST's inorder traversal is already sorted, we can get O(H + k) time and O(H) space instead.
wrong_approach: "Collect all values, sort, return kth"
correct_approach: "Use inorder traversal property — already sorted"
- title: Not Using Early Termination
description: |
Even with inorder traversal, visiting all n nodes is wasteful when k is small. If k = 1, why traverse the entire tree?
Always check if you've found the k<sup>th</sup> element and stop early. This improves average case performance significantly, especially when k << n.
wrong_approach: "Complete full inorder traversal"
correct_approach: "Stop as soon as kth element is found"
- title: Off-by-One Errors with 1-Indexed k
description: |
The problem states k is **1-indexed**, meaning k = 1 refers to the smallest element, not the second smallest.
Be careful with your counter: if it starts at `0` and you increment it *before* checking, the k<sup>th</sup> element is found when `count == k`, not `count == k - 1`.
Always clarify the indexing in your head before implementing.
wrong_approach: "Checking count == k - 1 after the counter has already been incremented"
correct_approach: "Increment count first, then check if count equals k"
key_takeaways:
- "**BST inorder = sorted order**: This fundamental property is the key to many BST problems"
- "**Early termination**: Stop traversing once you have the answer — don't process unnecessary nodes"
- "**Iterative vs recursive**: Both work; iterative uses an explicit stack and can be easier to control for early termination"
- "**Follow-up insight**: For frequent queries with modifications, augment nodes with subtree sizes for O(H) queries"
time_complexity: "O(H + k). We descend H levels to reach the leftmost node, then visit k nodes. For a balanced tree, this is O(log n + k)."
space_complexity: "O(H). The recursion stack or explicit stack holds at most H nodes, where H is the tree height — O(log n) for balanced, O(n) for skewed."
solutions:
- approach_name: Inorder Traversal (Recursive)
is_optimal: true
code: |
class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right
def kth_smallest(root: TreeNode | None, k: int) -> int:
count = 0
result = 0
def inorder(node: TreeNode | None) -> bool:
nonlocal count, result
if not node:
return False
# Traverse left subtree first (smaller values)
if inorder(node.left):
return True # Already found, stop early
# Visit current node
count += 1
if count == k:
result = node.val
return True # Found the kth smallest
# Traverse right subtree (larger values)
return inorder(node.right)
inorder(root)
return result
explanation: |
**Time Complexity:** O(H + k) — Descend to leftmost node (H steps), then visit k nodes.
**Space Complexity:** O(H) — Recursion stack depth equals tree height.
We perform inorder traversal, counting nodes as we visit them. The `True` return value signals that we've found the answer and should stop recursing. This early termination avoids unnecessary traversal when k is small.
- approach_name: Inorder Traversal (Iterative with Stack)
is_optimal: true
code: |
def kth_smallest(root: TreeNode | None, k: int) -> int:
stack = []
current = root
count = 0
while stack or current:
# Go as far left as possible
while current:
stack.append(current)
current = current.left
# Process the leftmost unvisited node
current = stack.pop()
count += 1
# Check if this is the kth smallest
if count == k:
return current.val
# Move to the right subtree
current = current.right
return -1 # Should never reach here if k is valid
explanation: |
**Time Complexity:** O(H + k) — Same as recursive approach.
**Space Complexity:** O(H) — Explicit stack replaces recursion stack.
The iterative approach uses an explicit stack to simulate recursion. We push nodes while going left, pop to visit, then move right. This gives more control over termination and avoids potential stack overflow for very deep trees.
- approach_name: Collect and Sort (Suboptimal)
is_optimal: false
code: |
def kth_smallest(root: TreeNode | None, k: int) -> int:
values = []
def collect(node: TreeNode | None) -> None:
if not node:
return
# Collect all values via any traversal
values.append(node.val)
collect(node.left)
collect(node.right)
collect(root)
values.sort()
return values[k - 1] # k is 1-indexed
explanation: |
**Time Complexity:** O(n log n) — Collecting is O(n), sorting is O(n log n).
**Space Complexity:** O(n) — Store all n values.
This approach ignores the BST property entirely. It collects all values, sorts them, and returns the k<sup>th</sup> element. While correct, it's inefficient and doesn't leverage the fact that inorder traversal of a BST is already sorted. Included to illustrate why understanding data structure properties matters.


@@ -0,0 +1,204 @@
title: Largest Rectangle in Histogram
slug: largest-rectangle-in-histogram
difficulty: hard
leetcode_id: 84
leetcode_url: https://leetcode.com/problems/largest-rectangle-in-histogram/
categories:
- arrays
- stack
patterns:
- monotonic-stack
description: |
Given an array of integers `heights` representing the histogram's bar height where the width of each bar is `1`, return *the area of the largest rectangle in the histogram*.
constraints: |
- `1 <= heights.length <= 10^5`
- `0 <= heights[i] <= 10^4`
examples:
- input: "heights = [2,1,5,6,2,3]"
output: "10"
explanation: "The largest rectangle is formed using bars at indices 2 and 3 (heights 5 and 6), with width 2 and height 5, giving area = 10 units."
- input: "heights = [2,4]"
output: "4"
explanation: "The largest rectangle is the single bar of height 4 with width 1, giving area = 4 units."
explanation:
intuition: |
Imagine you're standing at each bar in the histogram, trying to figure out how far you can extend a rectangle horizontally while keeping that bar's height as the minimum.
For any bar at position `i`, the largest rectangle that uses this bar's height extends:
- **Left**: until we hit a bar shorter than `heights[i]`
- **Right**: until we hit a bar shorter than `heights[i]`
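The area is then `heights[i] * (right_boundary - left_boundary - 1)`, where `left_boundary` and `right_boundary` are the indices of those first shorter bars (or `-1` and `n` if no such bar exists).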
The brute force approach would check every bar and scan left and right to find boundaries — but that's O(n^2). The key insight is that a **monotonic stack** can find these boundaries efficiently.
Think of it like this: as you scan left-to-right, maintain a stack of bar indices in **increasing order of height**. When you encounter a bar shorter than the stack's top, you've found the right boundary for all taller bars on the stack. Pop them off and calculate their areas — the new stack top gives you their left boundary.
This works because the stack invariant guarantees that for any bar we pop, all bars between its left boundary (the new stack top) and right boundary (current index) are at least as tall.
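To make the boundary idea concrete, here is a minimal two-pass sketch that computes each bar's nearest shorter neighbour on both sides and then applies the area formula. It is a variant of the single-pass solution, not a replacement for it; the names `nearest_smaller_bounds` and `largest_rectangle_from_bounds` are assumptions made for illustration.

```python
def nearest_smaller_bounds(heights: list[int]) -> tuple[list[int], list[int]]:
    n = len(heights)
    left = [-1] * n   # index of the nearest shorter bar on the left, or -1
    right = [n] * n   # index of the nearest shorter bar on the right, or n
    stack: list[int] = []

    # Left-to-right pass: discard bars at least as tall as the current one.
    for i in range(n):
        while stack and heights[stack[-1]] >= heights[i]:
            stack.pop()
        left[i] = stack[-1] if stack else -1
        stack.append(i)

    # Right-to-left pass: same idea, mirrored.
    stack.clear()
    for i in range(n - 1, -1, -1):
        while stack and heights[stack[-1]] >= heights[i]:
            stack.pop()
        right[i] = stack[-1] if stack else n
        stack.append(i)

    return left, right


def largest_rectangle_from_bounds(heights: list[int]) -> int:
    left, right = nearest_smaller_bounds(heights)
    # Width is the number of bars strictly between the two boundaries.
    return max(
        (h * (r - l - 1) for h, l, r in zip(heights, left, right)),
        default=0,
    )


print(largest_rectangle_from_bounds([2, 1, 5, 6, 2, 3]))  # 10
```

The two-pass form uses extra arrays but keeps the left and right boundaries explicit, which can make the formula easier to check by hand.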
approach: |
We solve this using a **Monotonic Stack**:
**Step 1: Initialise variables**
- `stack`: Empty list to store indices of bars in increasing height order
- `max_area`: Set to `0` to track the largest rectangle found
&nbsp;
**Step 2: Iterate through each bar**
- For each index `i` with height `heights[i]`:
- While the stack is non-empty AND the current height is less than the height at the stack's top index:
- Pop the top index as `height_idx` — this bar can't extend further right
- Calculate the width: if stack is empty, width is `i` (bar extends to the beginning); otherwise width is `i - stack[-1] - 1`
- Calculate area as `heights[height_idx] * width`
- Update `max_area` if this area is larger
- Push current index `i` onto the stack
&nbsp;
**Step 3: Process remaining bars in the stack**
- After the loop, bars remaining in the stack extend all the way to the right end
- Pop each remaining index and calculate its area using `n` as the right boundary
- Width calculation: if stack becomes empty, width is `n`; otherwise width is `n - stack[-1] - 1`
&nbsp;
**Step 4: Return the result**
- Return `max_area`
&nbsp;
The monotonic stack ensures each bar is pushed and popped at most once, giving O(n) time complexity.
common_pitfalls:
- title: Forgetting to Process Remaining Stack
description: |
After iterating through all bars, some indices may still be on the stack. These represent bars that extend all the way to the right edge of the histogram.
Forgetting to process these remaining bars will miss valid rectangles. For example, with `heights = [2, 4]`, after the loop the stack contains both indices. Without processing, you'd return `0` instead of `4`.
wrong_approach: "Only calculate areas during the main loop"
correct_approach: "Process remaining stack elements using array length as right boundary"
- title: Incorrect Width Calculation
description: |
When popping a bar from the stack, its left boundary isn't always the immediately preceding bar — it's the bar at the new stack top (or the start if stack is empty).
A common mistake is calculating width as `i - popped_index`, but this is wrong. The correct width is `i - stack[-1] - 1` (or just `i` if stack is empty).
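For example, in `[1, 5, 6, 2, 1]`, the first pops happen at index 3 and the naive formula happens to agree: popping index 2 (height 6) gives `3 - 2 = 1` and `3 - 1 - 1 = 1`, and popping index 1 (height 5) gives `3 - 1 = 2` and `3 - 0 - 1 = 2`. The formulas diverge once earlier bars have been popped. At index 4 we pop index 3 (height 2), whose rectangle also spans the already-popped bars of heights 5 and 6: the naive width is `4 - 3 = 1`, but the correct width is `4 - 0 - 1 = 3`.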
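The difference between the two width formulas can be checked directly. The sketch below is an illustration only: the function `max_area` and its `naive_width` flag are hypothetical, and the input `[2, 1, 2]` is chosen because the naive formula gives a wrong final answer on it.

```python
def max_area(heights: list[int], naive_width: bool = False) -> int:
    stack: list[int] = []
    best = 0
    n = len(heights)
    for i in range(n + 1):
        # A virtual bar of height 0 at index n flushes the stack.
        current = heights[i] if i < n else 0
        while stack and current < heights[stack[-1]]:
            popped = stack.pop()
            if naive_width:
                width = i - popped  # the mistaken formula
            else:
                width = i if not stack else i - stack[-1] - 1
            best = max(best, heights[popped] * width)
        if i < n:
            stack.append(i)
    return best


print(max_area([2, 1, 2]))                    # 3
print(max_area([2, 1, 2], naive_width=True))  # 2
```

With the naive width, the bar of height 1 is credited with width 2 instead of 3, so the full-width rectangle is missed.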
wrong_approach: "Width = current_index - popped_index"
correct_approach: "Width = current_index - new_stack_top - 1 (or current_index if stack empty)"
- title: Brute Force Time Limit Exceeded
description: |
The naive O(n^2) approach — for each bar, scan left and right to find boundaries — will cause TLE with `heights.length <= 10^5`.
10^5 elements means up to 10^10 operations, which is far too slow. The monotonic stack reduces this to O(n) by computing all boundaries in a single pass.
wrong_approach: "For each bar, scan left and right to find boundaries"
correct_approach: "Use monotonic stack to find boundaries in O(n)"
key_takeaways:
- "**Monotonic stack pattern**: When you need to find the next smaller/larger element for all positions, a monotonic stack provides O(n) efficiency"
- "**Width calculation insight**: The left boundary for a popped element is always the new stack top, not the previous element"
- "**Process the stack after iteration**: Elements remaining in the stack have implicit right boundaries at the array end"
- "**Foundation for related problems**: This technique extends to problems like Maximal Rectangle, Trapping Rain Water, and daily temperatures"
time_complexity: "O(n). Each bar index is pushed onto and popped from the stack at most once, giving 2n operations total."
space_complexity: "O(n). In the worst case (strictly increasing heights), all n indices are stored on the stack simultaneously."
solutions:
- approach_name: Monotonic Stack
is_optimal: true
code: |
def largest_rectangle_area(heights: list[int]) -> int:
stack = [] # Store indices of bars in increasing height order
max_area = 0
n = len(heights)
for i in range(n):
# Pop bars that can't extend further right
while stack and heights[i] < heights[stack[-1]]:
height_idx = stack.pop()
height = heights[height_idx]
# Width extends from new stack top to current index
width = i if not stack else i - stack[-1] - 1
max_area = max(max_area, height * width)
stack.append(i)
# Process remaining bars - they extend to the right edge
while stack:
height_idx = stack.pop()
height = heights[height_idx]
# Width extends from new stack top to array end
width = n if not stack else n - stack[-1] - 1
max_area = max(max_area, height * width)
return max_area
explanation: |
**Time Complexity:** O(n) — Each index is pushed and popped at most once.
**Space Complexity:** O(n) — Stack may contain all indices in worst case.
We maintain a stack of indices in increasing height order. When we encounter a shorter bar, we pop taller bars and calculate their areas since they can't extend further right. The stack top after popping gives the left boundary.
- approach_name: Monotonic Stack with Sentinel
is_optimal: true
code: |
def largest_rectangle_area(heights: list[int]) -> int:
# Add sentinel bars: 0-height at start and end
heights = [0] + heights + [0]
stack = [0] # Stack starts with left sentinel
max_area = 0
for i in range(1, len(heights)):
while heights[i] < heights[stack[-1]]:
height = heights[stack.pop()]
# Width between current index and new stack top
width = i - stack[-1] - 1
max_area = max(max_area, height * width)
stack.append(i)
return max_area
explanation: |
**Time Complexity:** O(n) — Same as basic approach.
**Space Complexity:** O(n) — Stack plus modified heights array.
Adding sentinel bars of height 0 at both ends eliminates edge case handling. The left sentinel ensures the stack is never empty, and the right sentinel forces all remaining bars to be processed. This leads to cleaner code at the cost of an O(n) copy of the input with two extra entries.
- approach_name: Brute Force
is_optimal: false
code: |
def largest_rectangle_area(heights: list[int]) -> int:
max_area = 0
n = len(heights)
for i in range(n):
height = heights[i]
# Find left boundary - first bar shorter than current
left = i
while left > 0 and heights[left - 1] >= height:
left -= 1
# Find right boundary - first bar shorter than current
right = i
while right < n - 1 and heights[right + 1] >= height:
right += 1
# Calculate area with current bar's height
width = right - left + 1
max_area = max(max_area, height * width)
return max_area
explanation: |
**Time Complexity:** O(n^2) — For each bar, we scan left and right.
**Space Complexity:** O(1) — Only tracking indices and area.
For each bar, we expand left and right while adjacent bars are at least as tall. This finds the maximum width rectangle using each bar's height. While correct, this approach is too slow for large inputs (TLE on LeetCode) because it re-scans the same regions repeatedly.


@@ -0,0 +1,200 @@
title: Last Stone Weight II
slug: last-stone-weight-ii
difficulty: medium
leetcode_id: 1049
leetcode_url: https://leetcode.com/problems/last-stone-weight-ii/
categories:
- dynamic-programming
- arrays
patterns:
- dynamic-programming
description: |
You are given an array of integers `stones` where `stones[i]` is the weight of the i<sup>th</sup> stone.
We are playing a game with the stones. On each turn, we choose any two stones and smash them together. Suppose the stones have weights `x` and `y` with `x <= y`. The result of this smash is:
- If `x == y`, both stones are destroyed, and
- If `x != y`, the stone of weight `x` is destroyed, and the stone of weight `y` has new weight `y - x`.
At the end of the game, there is **at most one** stone left.
Return *the smallest possible weight of the left stone*. If there are no stones left, return `0`.
constraints: |
- `1 <= stones.length <= 30`
- `1 <= stones[i] <= 100`
examples:
- input: "stones = [2,7,4,1,8,1]"
output: "1"
explanation: "Smash 2 and 4 to get 2, leaving [2,7,1,8,1]. Smash 7 and 8 to get 1, leaving [2,1,1,1]. Smash 2 and 1 to get 1, leaving [1,1,1]. Smash 1 and 1 to get 0, leaving [1], which is the optimal value."
- input: "stones = [31,26,33,21,40]"
output: "5"
explanation: "Smash 40 and 31 to get 9, leaving [26,33,21,9]. Smash 33 and 26 to get 7, leaving [21,9,7]. Smash 21 and 9 to get 12, leaving [12,7]. Smash 12 and 7 to get 5. This matches the best partition, {31, 26, 21} (sum 78) versus {33, 40} (sum 73), whose difference is 5."
explanation:
intuition: |
At first glance, this looks like a simulation problem — keep smashing stones until one remains. But simulating every possible order of smashing would be exponentially complex. There must be a deeper insight.
Here's the key realisation: **smashing is equivalent to assigning signs**. When we smash stones `x` and `y`, we get `|y - x|`. If we keep smashing the results, we're essentially computing `|(a - b) - c|` = `|a - b - c|` or similar expressions. In the end, each original stone contributes either positively or negatively to the final result.
Think of it like this: imagine labelling each stone with `+` or `-`. The final stone's weight equals `|(sum of stones with + signs) - (sum of stones with - signs)|`. We want to minimise this difference.
This transforms the problem into: **partition the stones into two groups such that the absolute difference between their sums is minimised**. This is the classic "minimum subset sum difference" problem, which is a variant of the 0/1 knapsack.
If the total sum is `S`, and one group has sum `subset_sum`, the other has sum `S - subset_sum`. The difference is `|S - 2 * subset_sum|`. To minimise this, we want `subset_sum` as close to `S / 2` as possible.
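The sign-assignment view can be sanity-checked by brute force on small inputs. This is an exponential-time sketch for intuition only, and the helper name `min_signed_sum` is an assumption. Not every sign pattern corresponds to a valid smashing order (all `+` does not, for instance), but the minimum over all patterns should match the expected answers for the two examples.

```python
from itertools import product


def min_signed_sum(stones: list[int]) -> int:
    # Try every +/- labelling and keep the smallest absolute total.
    return min(
        abs(sum(sign * weight for sign, weight in zip(signs, stones)))
        for signs in product((1, -1), repeat=len(stones))
    )


print(min_signed_sum([2, 7, 4, 1, 8, 1]))     # 1
print(min_signed_sum([31, 26, 33, 21, 40]))   # 5
```

With 30 stones this would need about a billion labellings, which is why the partition is computed with DP instead.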
approach: |
We solve this using **0/1 Knapsack DP** to find the closest achievable sum to half the total:
**Step 1: Calculate the target**
- Compute `total = sum(stones)`
- Our goal: find the largest `subset_sum <= total // 2` that we can form
- The answer will be `total - 2 * subset_sum`
&nbsp;
**Step 2: Create the DP set**
- Use a set `dp` to track all achievable sums
- Initialise with `{0}` — we can always achieve sum 0 (empty subset)
&nbsp;
**Step 3: Process each stone**
- For each stone, we can either include it or not (0/1 knapsack)
- For each existing sum `s` in our set, `s + stone` is now also achievable
- Add these new sums to our set (but only up to `total // 2` to save space)
&nbsp;
**Step 4: Find the best partition**
- The largest value in `dp` that doesn't exceed `total // 2` is our best `subset_sum`
- Return `total - 2 * subset_sum`
&nbsp;
The set-based approach is elegant and efficient for this problem's constraints. With `stones.length <= 30` and `stones[i] <= 100`, the maximum total is 3000, making this approach very practical.
common_pitfalls:
- title: Trying to Simulate Smashing
description: |
The naive approach of simulating all possible smashing orders has exponential complexity. With 30 stones, there are far too many orderings to try.
The key insight is recognising this as a partitioning problem, not a simulation problem. Once you see that smashing assigns implicit +/- signs, the path to DP becomes clear.
wrong_approach: "Recursively try all pairs of stones to smash"
correct_approach: "Reduce to minimum partition difference using DP"
- title: Using Unbounded Knapsack
description: |
Unlike Coin Change where each coin can be used infinitely, here each stone can only be used **once**. This is 0/1 knapsack, not unbounded knapsack.
If you iterate incorrectly, you might count the same stone multiple times, leading to wrong answers.
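A small experiment shows how the iteration order decides whether a stone is reused. This is a sketch under the assumption of a boolean-array DP; the function `reachable_sums` and its `forwards` flag are hypothetical names used only here.

```python
def reachable_sums(stones: list[int], target: int, forwards: bool) -> list[int]:
    dp = [False] * (target + 1)
    dp[0] = True  # the empty subset
    for stone in stones:
        if forwards:
            order = range(stone, target + 1)      # unbounded-knapsack order
        else:
            order = range(target, stone - 1, -1)  # 0/1-knapsack order
        for s in order:
            if dp[s - stone]:
                dp[s] = True
    return [s for s, ok in enumerate(dp) if ok]


print(reachable_sums([3], 6, forwards=True))   # [0, 3, 6]
print(reachable_sums([3], 6, forwards=False))  # [0, 3]
```

Going forwards, `dp[3]` is set and then immediately reused to set `dp[6]`, as if the single stone of weight 3 existed twice.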
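wrong_approach: "Updating a boolean dp array with sums iterated forwards, so a stone can be reused within its own pass"
correct_approach: "Iterate sums backwards, or collect new sums in a separate set, so each stone is used at most once"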
- title: Forgetting to Limit the Target
description: |
Since we want subset_sum closest to `total / 2`, we only need to track sums up to `total // 2`. Tracking larger sums is redundant — if one group has sum `> total / 2`, the other has sum `< total / 2`, and we'd already have that smaller sum.
This optimisation keeps memory usage reasonable.
wrong_approach: "Track all possible sums up to total"
correct_approach: "Only track sums up to total // 2"
key_takeaways:
- "**Problem reduction**: Recognising that smashing stones = partitioning with +/- signs transforms an intractable simulation into a classic DP problem"
- "**0/1 Knapsack pattern**: Each stone can be used at most once — this is the defining characteristic of 0/1 knapsack"
- "**Minimum partition difference**: Finding two subsets with minimum sum difference is equivalent to finding one subset closest to half the total"
- "**Set-based DP**: Using a set to track achievable sums is clean and efficient for moderate constraints"
time_complexity: "O(n × S) where n is the number of stones and S is the total sum. For each stone, we potentially add up to S/2 new sums."
space_complexity: "O(S) where S is the total sum. The set stores at most S/2 + 1 achievable sums."
solutions:
- approach_name: Set-Based DP
is_optimal: true
code: |
def last_stone_weight_ii(stones: list[int]) -> int:
total = sum(stones)
target = total // 2
# dp stores all achievable subset sums
dp = {0}
for stone in stones:
# For each existing sum, we can add this stone
# Create new set to avoid modifying during iteration
new_sums = set()
for s in dp:
if s + stone <= target:
new_sums.add(s + stone)
dp.update(new_sums)
# Find largest achievable sum <= target
best_sum = max(dp)
# Difference between two groups: (total - best_sum) - best_sum
return total - 2 * best_sum
explanation: |
**Time Complexity:** O(n × S) — For each of n stones, we iterate through sums up to S/2.
**Space Complexity:** O(S) — Set stores achievable sums up to S/2.
We track all achievable subset sums using a set. For each stone, we compute new achievable sums by adding it to existing sums. The largest sum we can achieve that doesn't exceed half the total gives us the closest partition to equal, minimising the leftover stone weight.
- approach_name: Boolean Array DP
is_optimal: true
code: |
def last_stone_weight_ii(stones: list[int]) -> int:
total = sum(stones)
target = total // 2
# dp[i] = True if sum i is achievable
dp = [False] * (target + 1)
dp[0] = True # Empty subset has sum 0
for stone in stones:
# Iterate backwards to avoid using same stone twice
for s in range(target, stone - 1, -1):
if dp[s - stone]:
dp[s] = True
# Find largest achievable sum
for s in range(target, -1, -1):
if dp[s]:
return total - 2 * s
return total # Fallback (shouldn't reach here)
explanation: |
**Time Complexity:** O(n × S) — Same as set-based approach.
**Space Complexity:** O(S) — Boolean array of size S/2 + 1.
This uses a classic 0/1 knapsack boolean array. The key is iterating **backwards** when updating — this ensures each stone is only counted once per subset. If we iterated forwards, we'd potentially add the same stone multiple times.
- approach_name: Brute Force (Exponential)
is_optimal: false
code: |
def last_stone_weight_ii(stones: list[int]) -> int:
def find_min_diff(index: int, sum1: int, sum2: int) -> int:
# Base case: all stones assigned
if index == len(stones):
return abs(sum1 - sum2)
# Try putting current stone in group 1 or group 2
put_in_group1 = find_min_diff(index + 1, sum1 + stones[index], sum2)
put_in_group2 = find_min_diff(index + 1, sum1, sum2 + stones[index])
return min(put_in_group1, put_in_group2)
return find_min_diff(0, 0, 0)
explanation: |
**Time Complexity:** O(2^n) — Each stone has 2 choices, giving 2^n subsets.
**Space Complexity:** O(n) — Recursion stack depth.
This brute force tries all possible partitions by assigning each stone to either group. While correct, it's far too slow for n=30 (over a billion combinations). Included to illustrate the problem structure and why DP is necessary.


@@ -0,0 +1,165 @@
title: Last Stone Weight
slug: last-stone-weight
difficulty: easy
leetcode_id: 1046
leetcode_url: https://leetcode.com/problems/last-stone-weight/
categories:
- arrays
- heap
patterns:
- heap
description: |
You are given an array of integers `stones` where `stones[i]` is the weight of the i<sup>th</sup> stone.
We are playing a game with the stones. On each turn, we choose the **heaviest two stones** and smash them together. Suppose the heaviest two stones have weights `x` and `y` with `x <= y`. The result of this smash is:
- If `x == y`, both stones are destroyed, and
- If `x != y`, the stone of weight `x` is destroyed, and the stone of weight `y` has new weight `y - x`.
At the end of the game, there is **at most one** stone left.
Return *the weight of the last remaining stone*. If there are no stones left, return `0`.
constraints: |
- `1 <= stones.length <= 30`
- `1 <= stones[i] <= 1000`
examples:
- input: "stones = [2,7,4,1,8,1]"
output: "1"
explanation: "We combine 7 and 8 to get 1 so the array converts to [2,4,1,1,1], then we combine 2 and 4 to get 2 so the array converts to [2,1,1,1], then we combine 2 and 1 to get 1 so the array converts to [1,1,1], then we combine 1 and 1 to get 0 so the array converts to [1]. That's the value of the last stone."
- input: "stones = [1]"
output: "1"
explanation: "There is only one stone, so we return its weight directly."
explanation:
intuition: |
Imagine you have a collection of rocks and you keep smashing the two largest ones together. After each collision, either both rocks disappear (if equal weight) or you're left with a smaller rock (the difference in weights).
The key insight is that we always need quick access to the **two heaviest stones**. After smashing, we might need to put a new stone back and find the next two heaviest. This screams for a **max heap** (priority queue) — a data structure designed for exactly this: efficiently finding and removing the maximum element.
Think of it like this: the heap is a "smart pile" that always keeps the biggest stone on top. When you grab the top two stones, smash them, and toss the result back in, the pile automatically rearranges to put the new biggest stone on top.
Without a heap, you'd have to re-sort the array after each smash, which is inefficient. The heap lets you do the same thing in logarithmic time per operation.
approach: |
We solve this using a **Max Heap** approach:
**Step 1: Build a max heap from the stones**
- Python's `heapq` module implements a *min* heap, so we negate all values to simulate a max heap
- Use `heapify()` to convert the list into a heap in O(n) time
&nbsp;
**Step 2: Simulate the smashing process**
- While the heap has more than one stone:
- Pop the two largest stones (negate to get actual values)
- If they're not equal, push the difference back (negated)
- If they're equal, both are destroyed (don't push anything back)
&nbsp;
**Step 3: Return the result**
- If the heap is empty, return `0` (all stones destroyed each other)
- Otherwise, return the remaining stone's weight (negated back to positive)
&nbsp;
This simulation directly follows the problem rules, and the heap ensures we always grab the two heaviest stones efficiently.
common_pitfalls:
- title: Using Sort Instead of Heap
description: |
A tempting approach is to sort the array, take the two largest, and re-sort after each operation. While this works, it's inefficient:
- Sorting takes O(n log n) per smash
- With up to n smashes, total time becomes O(n^2 log n)
The heap approach does each operation in O(log n), giving O(n log n) total.
For this problem's small constraints (n <= 30), sorting works fine, but heaps are the right tool for this pattern.
wrong_approach: "Re-sorting after each smash"
correct_approach: "Use a max heap for O(log n) insert/extract"
- title: Forgetting Python Uses Min Heap
description: |
Python's `heapq` is a min heap, not a max heap. If you push positive values, `heappop()` gives you the *smallest* element, not the largest.
The fix is to negate values: push `-stone` and negate again when popping. This "flips" the ordering so the largest original value becomes the smallest negated value (and thus pops first).
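A quick sketch of the difference, using only the standard `heapq` functions. The sample values are arbitrary.

```python
import heapq

values = [3, 9, 4]

# Raw values: heappop returns the smallest element.
min_heap = list(values)
heapq.heapify(min_heap)
smallest = heapq.heappop(min_heap)

# Negated values: the largest original value becomes the smallest stored
# value, so it pops first. Negate again to recover it.
max_heap = [-v for v in values]
heapq.heapify(max_heap)
largest = -heapq.heappop(max_heap)

print(smallest, largest)  # 3 9
```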
wrong_approach: "Using heapq with positive values"
correct_approach: "Negate values to simulate max heap"
- title: Not Handling the Empty Heap Case
description: |
When all stones perfectly cancel out (e.g., `[2, 2]`), the heap becomes empty. You must check if the heap is empty before trying to return the last element.
Returning `0` when the heap is empty is specified in the problem: "If there are no stones left, return `0`."
key_takeaways:
- "**Max heap pattern**: When you need repeated access to the maximum (or minimum) element with insertions, use a heap"
- "**Python heap trick**: Negate values to convert `heapq` (min heap) into a max heap"
- "**Simulation problems**: Sometimes the solution is just carefully implementing the rules with the right data structure"
- "**Foundation for harder problems**: This pattern extends to problems like merging stones with costs, scheduling, or any greedy selection of extremes"
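time_complexity: "O(n log n). There are at most `n - 1` smashes, since each one removes at least one stone, and each smash performs two pops and at most one push at O(log n) apiece. Building the heap with `heapify` adds O(n)."
space_complexity: "O(n). The negated copy of the stones that becomes the heap holds `n` values; `heapify` rearranges that list in place and needs no further storage."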
solutions:
- approach_name: Max Heap
is_optimal: true
code: |
import heapq

def last_stone_weight(stones: list[int]) -> int:
    # Negate values to simulate max heap (Python's heapq is min heap)
    heap = [-s for s in stones]
    heapq.heapify(heap)
    # Smash stones until one or none remain
    while len(heap) > 1:
        # Pop two heaviest stones (negate to get actual values)
        first = -heapq.heappop(heap)
        second = -heapq.heappop(heap)
        # If they're not equal, push the difference back
        if first != second:
            heapq.heappush(heap, -(first - second))
    # Return last stone or 0 if none left
    return -heap[0] if heap else 0
explanation: |
**Time Complexity:** O(n log n) — Each of the up to n-1 smash operations involves two pops and at most one push, each O(log n).
**Space Complexity:** O(n) — The heap stores all n stones initially.
We use a max heap to efficiently find and remove the two heaviest stones. After each smash, if there's a remainder, we push it back. The process continues until at most one stone remains.
- approach_name: Sorting (Simulation)
is_optimal: false
code: |
def last_stone_weight(stones: list[int]) -> int:
    # Keep smashing until one or none left
    while len(stones) > 1:
        # Sort to get heaviest at the end
        stones.sort()
        # Pop two heaviest
        first = stones.pop()
        second = stones.pop()
        # If not equal, push remainder back
        if first != second:
            stones.append(first - second)
    # Return last stone or 0 if none
    return stones[0] if stones else 0
explanation: |
**Time Complexity:** O(n^2 log n) — We sort (O(n log n)) up to n times.
**Space Complexity:** O(1) extra — the input list is modified in place (or O(n) if you count the temporary buffer that `list.sort()` may allocate internally).
This approach sorts the array each iteration to find the two heaviest stones. It's simpler to understand but less efficient. Works fine for the small constraints (n <= 30) but doesn't scale well. The heap approach is preferred for interview settings to demonstrate knowledge of efficient data structures.

View File

@@ -0,0 +1,177 @@
title: Lemonade Change
slug: lemonade-change
difficulty: easy
leetcode_id: 860
leetcode_url: https://leetcode.com/problems/lemonade-change/
categories:
- arrays
patterns:
- greedy
description: |
At a lemonade stand, each lemonade costs `$5`. Customers are standing in a queue to buy from you and order one at a time (in the order specified by `bills`). Each customer will only buy one lemonade and pay with either a `$5`, `$10`, or `$20` bill. You must provide the correct change to each customer so that the net transaction is that the customer pays `$5`.
Note that you do not have any change in hand at first.
Given an integer array `bills` where `bills[i]` is the bill the i<sup>th</sup> customer pays, return `true` *if you can provide every customer with the correct change*, or `false` *otherwise*.
constraints: |
- `1 <= bills.length <= 10^5`
- `bills[i]` is either `5`, `10`, or `20`
examples:
- input: "bills = [5,5,5,10,20]"
output: "true"
explanation: "From the first 3 customers, we collect three $5 bills. From the fourth customer, we collect a $10 bill and give back a $5. From the fifth customer, we give a $10 bill and a $5 bill. Since all customers got correct change, we output true."
- input: "bills = [5,5,10,10,20]"
output: "false"
explanation: "From the first two customers, we collect two $5 bills. For the next two customers, we collect a $10 bill and give back a $5 bill each. For the last customer, we cannot give the change of $15 back because we only have two $10 bills (no $5 bills left)."
explanation:
intuition: |
Imagine you're actually running a lemonade stand with a cash register. You start with an empty register and customers line up to pay.
The key insight is that **$5 bills are the most valuable** — not because of their face value, but because they're the most *versatile* for making change. A $5 bill can be used to:
- Give change for a $10 bill (need one $5)
- Give change for a $20 bill (need one $10 + one $5, OR three $5s)
Think of it like this: when a customer pays with $20, you have two options for the $15 change:
1. One $10 + one $5 (preferred — uses the less versatile $10)
2. Three $5 bills (backup — depletes your precious $5s)
The greedy choice is always to **preserve your $5 bills** when possible. Use $10 bills first when giving change for $20, because $10 bills can only be used for one purpose (change for $20), while $5 bills can be used for both $10 and $20 transactions.
approach: |
We solve this using a **Greedy Simulation**:
**Step 1: Initialise counters**
- `fives`: Count of $5 bills in hand, starts at `0`
- `tens`: Count of $10 bills in hand, starts at `0`
- We don't need to track $20 bills — they can never be used as change
&nbsp;
**Step 2: Process each customer in order**
- **If customer pays $5**: No change needed. Increment `fives` by 1
- **If customer pays $10**: Need to give $5 change. If `fives == 0`, return `false`. Otherwise, decrement `fives`, increment `tens`
- **If customer pays $20**: Need to give $15 change. Try the greedy choice first:
  - If we have at least one $10 and one $5: use them (decrement both)
  - Else if we have at least three $5s: use them (decrement `fives` by 3)
  - Else: return `false`, since we can't make change
&nbsp;
**Step 3: Return the result**
- If we process all customers successfully, return `true`
&nbsp;
The greedy strategy works because using a $10 bill when available always leaves us in a better (or equal) position than using three $5 bills.
common_pitfalls:
- title: Not Prioritising $10 Bills for $20 Change
description: |
When giving change for $20, some might randomly choose between using three $5s or one $10 + one $5. This can lead to failure.
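For example, with `bills = [5,5,5,5,10,20,10]`, you hold three $5s and one $10 when the $20 customer arrives:
- If you pay the $15 change with three $5s, you have no $5 left and cannot serve the final $10 customer
- If you pay with $10 + $5 instead, two $5s remain and the final $10 customer gets their change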
Always prefer using $10 bills for $20 change — $5 bills are more versatile.
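The effect of the ordering can be checked with a small sketch. The function `can_serve` and its `prefer_ten` flag are hypothetical names introduced here for illustration, and the input is an assumed example chosen so that the two strategies diverge.

```python
def can_serve(bills: list[int], prefer_ten: bool) -> bool:
    """Simulate the stand; for a $20, try $10 + $5 first only if prefer_ten is True."""
    fives = tens = 0
    for bill in bills:
        if bill == 5:
            fives += 1
        elif bill == 10:
            if fives == 0:
                return False
            fives -= 1
            tens += 1
        else:  # bill == 20, change due is $15
            can_use_ten = tens > 0 and fives > 0
            can_use_fives = fives >= 3
            if prefer_ten and can_use_ten:
                tens -= 1
                fives -= 1
            elif can_use_fives:
                fives -= 3
            elif can_use_ten:
                tens -= 1
                fives -= 1
            else:
                return False
    return True


bills = [5, 5, 5, 5, 10, 20, 10]
print(can_serve(bills, prefer_ten=True))   # True
print(can_serve(bills, prefer_ten=False))  # False
```

With `prefer_ten=False` the three $5 bills are spent on the $20, so the last $10 customer cannot be given change.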
wrong_approach: "Random or first-available bill selection"
correct_approach: "Greedy: prefer $10 + $5 over three $5s"
- title: Tracking $20 Bills
description: |
It's tempting to track all bill types, but $20 bills are useless for making change. You can never give a $20 bill back to a customer (since the most change needed is $15).
Tracking $20 bills wastes space and adds unnecessary complexity.
wrong_approach: "Maintaining a counter for $20 bills"
correct_approach: "Only track $5 and $10 bills"
- title: Forgetting the Empty Register Start
description: |
The problem states you start with no change. If the first customer pays with anything other than $5, you immediately fail.
For input `bills = [10,5,5,5,5]`, the answer is `false` because you can't give change to the very first customer.
wrong_approach: "Assuming some initial change is available"
correct_approach: "Start with fives = 0 and tens = 0"
key_takeaways:
- "**Greedy pattern**: When multiple valid choices exist, prefer the one that keeps more options open (preserve $5 bills)"
- "**Simulation**: Sometimes the best approach is to simulate the process step by step"
- "**Track only what matters**: $20 bills are never used as change, so don't track them"
- "**Order matters**: The greedy choice ($10 + $5 over three $5s) ensures we handle future customers optimally"
time_complexity: "O(n). We process each customer exactly once with O(1) operations per customer."
space_complexity: "O(1). We only use two integer counters (`fives` and `tens`) regardless of input size."
solutions:
- approach_name: Greedy Simulation
is_optimal: true
code: |
def lemonade_change(bills: list[int]) -> bool:
    # Track only $5 and $10 bills (we never use $20 for change)
    fives = 0
    tens = 0
    for bill in bills:
        if bill == 5:
            # No change needed, just collect the $5
            fives += 1
        elif bill == 10:
            # Need to give $5 change
            if fives == 0:
                return False  # Can't make change
            fives -= 1
            tens += 1
        else:  # bill == 20
            # Need to give $15 change
            # Greedy: prefer using $10 + $5 to preserve $5 bills
            if tens > 0 and fives > 0:
                tens -= 1
                fives -= 1
            elif fives >= 3:
                fives -= 3
            else:
                return False  # Can't make change
    return True  # Successfully served all customers
explanation: |
**Time Complexity:** O(n) — Single pass through the bills array.
**Space Complexity:** O(1) — Only two counters used.
We simulate serving each customer, making the greedy choice to preserve $5 bills when possible. The key insight is that $10 + $5 is always preferred over three $5s for $20 change, because $5 bills are needed for both $10 and $20 transactions while $10 bills are only useful for $20 transactions.
- approach_name: Brute Force Simulation
is_optimal: false
code: |
def lemonade_change(bills: list[int]) -> bool:
    # Track all bills (including $20, though unnecessary)
    cash = {5: 0, 10: 0, 20: 0}
    for bill in bills:
        cash[bill] += 1  # Receive payment
        change_needed = bill - 5
        # Try to make change using largest bills first
        for denomination in [20, 10, 5]:
            while change_needed >= denomination and cash[denomination] > 0:
                change_needed -= denomination
                cash[denomination] -= 1
        if change_needed > 0:
            return False  # Couldn't make exact change
    return True
explanation: |
**Time Complexity:** O(n) — Still linear, but with more operations per customer.
**Space Complexity:** O(1) — Fixed-size dictionary.
This approach uses a general change-making loop that tries the largest bills first, so it makes the same greedy choice as the solution above ($10 before $5). While it works, it's unnecessarily complex for this specific problem: it tracks $20 bills (which are never used as change) and uses nested loops where direct conditionals suffice. The greedy simulation above is cleaner and does less work per customer.

View File

@@ -0,0 +1,211 @@
title: Letter Combinations of a Phone Number
slug: letter-combinations-of-a-phone-number
difficulty: medium
leetcode_id: 17
leetcode_url: https://leetcode.com/problems/letter-combinations-of-a-phone-number/
categories:
- strings
- hash-tables
- recursion
patterns:
- backtracking
description: |
Given a string containing digits from `2-9` inclusive, return all possible letter combinations that the number could represent. Return the answer in **any order**.
A mapping of digits to letters (just like on the telephone buttons) is given below. Note that `1` does not map to any letters.
| Digit | Letters |
|-------|---------|
| 2 | a, b, c |
| 3 | d, e, f |
| 4 | g, h, i |
| 5 | j, k, l |
| 6 | m, n, o |
| 7 | p, q, r, s |
| 8 | t, u, v |
| 9 | w, x, y, z |
constraints: |
- `0 <= digits.length <= 4`
- `digits[i]` is a digit in the range `['2', '9']`
examples:
- input: 'digits = "23"'
output: '["ad","ae","af","bd","be","bf","cd","ce","cf"]'
explanation: "Digit 2 maps to 'abc' and digit 3 maps to 'def'. Combining each letter from 2 with each letter from 3 gives 9 combinations."
- input: 'digits = ""'
output: "[]"
explanation: "Empty input returns an empty list."
- input: 'digits = "2"'
output: '["a","b","c"]'
explanation: "Digit 2 maps to 'abc', so we return all three letters."
explanation:
intuition: |
Think of this problem like an old-school phone where you had to press buttons multiple times to type letters. Each digit opens up a **set of choices**, and we need to explore every possible combination of those choices.
Imagine you're at a crossroads where each path branches into multiple smaller paths. For the input `"23"`:
- First, you see three paths: `a`, `b`, `c` (from digit `2`)
- From each of those paths, you see three more paths: `d`, `e`, `f` (from digit `3`)
- You must walk down every possible route to collect all combinations
This is a classic **backtracking** scenario: build a solution character by character, explore all possibilities at each step, and backtrack to try other options.
The key insight is that we're essentially computing a **Cartesian product** of letter sets. For `n` digits where each digit maps to `k` letters on average, we'll generate roughly `k^n` combinations — and we need to visit them all.
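The Cartesian-product view can be sketched directly with the standard library's `itertools.product`. This illustrates the idea rather than the backtracking solution itself; `PHONE_MAP` and `combos_via_product` are names assumed here.

```python
from itertools import product

PHONE_MAP = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def combos_via_product(digits: str) -> list[str]:
    """Join every tuple in the Cartesian product of the digits' letter sets."""
    if not digits:
        return []  # product() of no iterables yields one empty tuple, so guard first
    letter_sets = [PHONE_MAP[d] for d in digits]
    return ["".join(letters) for letters in product(*letter_sets)]


print(combos_via_product("23"))
# ['ad', 'ae', 'af', 'bd', 'be', 'bf', 'cd', 'ce', 'cf']
print(len(combos_via_product("79")))  # 16
```

The count for `"79"` is 4 × 4 = 16, which matches the `k^n` estimate with `k = 4` and `n = 2`.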
approach: |
We solve this using **Backtracking (DFS)**:
**Step 1: Handle the edge case**
- If the input `digits` is empty, return an empty list immediately
- This avoids unnecessary processing and edge case bugs
&nbsp;
**Step 2: Create the digit-to-letter mapping**
- Build a dictionary mapping each digit (`'2'` through `'9'`) to its corresponding letters
- Example: `'2'` → `'abc'`, `'7'` → `'pqrs'`
&nbsp;
**Step 3: Define a recursive backtracking function**
- `backtrack(index, current_combination)`:
  - `index`: which digit we're currently processing
  - `current_combination`: the string built so far
&nbsp;
**Step 4: Base case — combination complete**
- If `index == len(digits)`, we've processed all digits
- Add `current_combination` to our results list
&nbsp;
**Step 5: Recursive case — explore all letters for current digit**
- Get the letters corresponding to `digits[index]`
- For each letter:
  - Append it to `current_combination`
  - Recursively call `backtrack(index + 1, ...)`
- The recursion naturally "backtracks" when it returns
&nbsp;
**Step 6: Return all collected combinations**
- After the recursion completes, return the results list
common_pitfalls:
- title: Forgetting the Empty Input Case
description: |
If `digits = ""`, you should return `[]`, not `[""]`.
A common mistake is initialising the result with an empty string and building from there, which would incorrectly return `[""]` for empty input.
Always check for empty input at the start and return an empty list.
wrong_approach: "Returning [''] for empty input"
correct_approach: "Check if digits is empty and return [] immediately"
- title: Using Iteration Instead of Backtracking
description: |
While you can solve this iteratively by building combinations level by level, it's harder to visualise and more error-prone.
The iterative approach works but misses the opportunity to practice the fundamental backtracking pattern that's essential for harder problems like N-Queens, permutations, and subsets.
wrong_approach: "Complex iterative logic with nested loops"
correct_approach: "Clean recursive backtracking with clear base case"
- title: String Concatenation in Loops
description: |
In Python, repeatedly concatenating strings with `+` in a loop creates new string objects each time, leading to O(n^2) behaviour.
For this problem with `digits.length <= 4`, it's not a performance issue. But for larger inputs, use a list and `''.join()` at the end.
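A minimal sketch of the list-based variant, which also makes the "unmake the choice" step explicit with `pop()`. The function name `letter_combinations_list` is assumed for illustration.

```python
def letter_combinations_list(digits: str) -> list[str]:
    """Backtracking that keeps the partial combination in a mutable list."""
    if not digits:
        return []
    phone_map = {
        "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
        "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
    }
    result: list[str] = []
    path: list[str] = []  # characters chosen so far

    def backtrack(index: int) -> None:
        if index == len(digits):
            result.append("".join(path))  # build the string once, at the leaf
            return
        for letter in phone_map[digits[index]]:
            path.append(letter)   # make the choice
            backtrack(index + 1)  # explore
            path.pop()            # unmake the choice

    backtrack(0)
    return result


print(letter_combinations_list("23"))
# ['ad', 'ae', 'af', 'bd', 'be', 'bf', 'cd', 'ce', 'cf']
```

The trade-off is one shared mutable `path` instead of a fresh string per call, at the cost of having to remember the `pop()`.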
wrong_approach: "current = current + letter in tight loops"
correct_approach: "Use list append and join, or accept small overhead for clarity"
key_takeaways:
- "**Backtracking template**: This problem demonstrates the core backtracking pattern — make a choice, explore, unmake the choice (implicitly via recursion)"
- "**Cartesian product**: Combining elements from multiple sets is a fundamental operation that backtracking handles elegantly"
- "**Hash map for mapping**: Using a dictionary to map digits to letters keeps the code clean and extensible"
- "**Foundation for harder problems**: This exact pattern scales to permutations, combinations, subsets, and constraint satisfaction problems"
time_complexity: "O(4^n * n) where `n` is the length of `digits`. In the worst case (all 7s or 9s), each digit maps to 4 letters. We generate up to 4^n combinations, and each combination takes O(n) time to build."
space_complexity: "O(n) for the recursion stack depth, not counting the output. The maximum recursion depth equals the number of digits."
solutions:
- approach_name: Backtracking (DFS)
is_optimal: true
code: |
def letter_combinations(digits: str) -> list[str]:
    # Edge case: empty input
    if not digits:
        return []
    # Mapping of digits to letters (like a phone keypad)
    phone_map = {
        '2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
        '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'
    }
    result = []

    def backtrack(index: int, current: str) -> None:
        # Base case: we've processed all digits
        if index == len(digits):
            result.append(current)
            return
        # Get letters for current digit
        letters = phone_map[digits[index]]
        # Try each letter and recurse
        for letter in letters:
            backtrack(index + 1, current + letter)

    # Start backtracking from index 0 with empty string
    backtrack(0, "")
    return result
explanation: |
**Time Complexity:** O(4^n * n) — We generate up to 4^n combinations (when digits are 7 or 9), and building each string takes O(n).
**Space Complexity:** O(n) — Recursion stack depth equals the number of digits.
The backtracking approach naturally explores all paths in the decision tree. Each recursive call handles one digit, trying all its letters before returning. This pattern is fundamental to many combinatorial problems.
- approach_name: Iterative (BFS-like)
is_optimal: false
code: |
def letter_combinations(digits: str) -> list[str]:
    # Edge case: empty input
    if not digits:
        return []
    phone_map = {
        '2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
        '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'
    }
    # Start with empty combination
    result = [""]
    # Process each digit
    for digit in digits:
        letters = phone_map[digit]
        # Build new combinations by appending each letter
        new_result = []
        for combination in result:
            for letter in letters:
                new_result.append(combination + letter)
        result = new_result
    return result
explanation: |
**Time Complexity:** O(4^n * n) — Same as backtracking, we still generate all combinations.
**Space Complexity:** O(4^n) — We store all intermediate combinations at each level.
This iterative approach builds combinations level by level. While it works, it uses more space than backtracking and doesn't teach the fundamental recursive pattern. It's included to show an alternative perspective.

View File

@@ -0,0 +1,342 @@
title: LFU Cache
slug: lfu-cache
difficulty: hard
leetcode_id: 460
leetcode_url: https://leetcode.com/problems/lfu-cache/
categories:
- hash-tables
- linked-lists
patterns:
- heap
description: |
Design and implement a data structure for a **Least Frequently Used (LFU)** cache.
Implement the `LFUCache` class:
- `LFUCache(int capacity)` Initialises the object with the `capacity` of the data structure.
- `int get(int key)` Gets the value of the `key` if the `key` exists in the cache. Otherwise, returns `-1`.
- `void put(int key, int value)` Update the value of the `key` if present, or inserts the `key` if not already present. When the cache reaches its `capacity`, it should invalidate and remove the **least frequently used** key before inserting a new item. For this problem, when there is a **tie** (i.e., two or more keys with the same frequency), the **least recently used** key would be invalidated.
To determine the least frequently used key, a **use counter** is maintained for each key in the cache. The key with the smallest **use counter** is the least frequently used key.
When a key is first inserted into the cache, its **use counter** is set to `1` (due to the `put` operation). The **use counter** for a key in the cache is incremented when either a `get` or `put` operation is called on it.
The functions `get` and `put` must each run in **O(1)** average time complexity.
constraints: |
- `1 <= capacity <= 10^4`
- `0 <= key <= 10^5`
- `0 <= value <= 10^9`
- At most `2 * 10^5` calls will be made to `get` and `put`.
examples:
- input: |
["LFUCache", "put", "put", "get", "put", "get", "get", "put", "get", "get", "get"]
[[2], [1, 1], [2, 2], [1], [3, 3], [2], [3], [4, 4], [1], [3], [4]]
output: "[null, null, null, 1, null, -1, 3, null, -1, 3, 4]"
explanation: |
LFUCache lfu = new LFUCache(2);
lfu.put(1, 1); // cache=[1,_], cnt(1)=1
lfu.put(2, 2); // cache=[2,1], cnt(2)=1, cnt(1)=1
lfu.get(1); // return 1, cache=[1,2], cnt(1)=2
lfu.put(3, 3); // 2 is the LFU key because cnt(2)=1 is smallest, invalidate 2
lfu.get(2); // return -1 (not found)
lfu.get(3); // return 3, cache=[3,1], cnt(3)=2
lfu.put(4, 4); // Both 1 and 3 have cnt=2, but 1 is LRU, invalidate 1
lfu.get(1); // return -1 (not found)
lfu.get(3); // return 3, cnt(3)=3
lfu.get(4); // return 4, cnt(4)=2
explanation:
intuition: |
Think of the LFU cache like a **library with limited shelf space**. Each book has two properties: how many times it's been checked out (frequency) and when it was last touched (recency). When the shelves are full and a new book arrives, you remove the book with the fewest checkouts. If two books tie on checkout count, you remove the one that was touched longer ago.
The challenge is achieving O(1) operations. A naive approach might scan all items to find the minimum frequency, but that's O(n). The key insight is to **group items by their frequency** using a clever data structure combination:
1. **Hash map for key lookup** — Instant access to any item's data and frequency
2. **Frequency buckets** — Group all items with the same frequency together
3. **Ordered list within each bucket** — Track recency order for tie-breaking
When an item's frequency increases (from access), we simply move it from one bucket to the next. When we need to evict, we go to the lowest frequency bucket and remove the oldest item (the tail of that bucket's list).
The trick to maintaining O(1) is tracking the `min_freq` variable. It only ever increases by 1 (when items are accessed) or resets to 1 (when new items are inserted). We never need to search for the minimum.
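For contrast, the naive scan mentioned above can be sketched as follows. `NaiveLFUCache` is a hypothetical reference implementation with O(n) eviction, useful mainly for cross-checking a faster version on small inputs. The `tick` counter used for recency is an assumption of this sketch.

```python
class NaiveLFUCache:
    """Reference LFU cache: O(1) lookup, O(n) eviction by scanning all keys."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.tick = 0  # logical clock used to order accesses by recency
        # key -> [value, use_count, last_used_tick]
        self.entries: dict[int, list[int]] = {}

    def _touch(self, key: int) -> None:
        self.tick += 1
        entry = self.entries[key]
        entry[1] += 1
        entry[2] = self.tick

    def get(self, key: int) -> int:
        if key not in self.entries:
            return -1
        self._touch(key)
        return self.entries[key][0]

    def put(self, key: int, value: int) -> None:
        if self.capacity == 0:
            return
        if key in self.entries:
            self.entries[key][0] = value
            self._touch(key)
            return
        if len(self.entries) >= self.capacity:
            # Scan every key: smallest use count first, oldest access breaks ties.
            victim = min(
                self.entries,
                key=lambda k: (self.entries[k][1], self.entries[k][2]),
            )
            del self.entries[victim]
        self.tick += 1
        self.entries[key] = [value, 1, self.tick]


cache = NaiveLFUCache(2)
cache.put(1, 1)
cache.put(2, 2)
print(cache.get(1))  # 1
cache.put(3, 3)      # evicts key 2 (use count 1 is the smallest)
print(cache.get(2))  # -1
print(cache.get(3))  # 3
cache.put(4, 4)      # keys 1 and 3 tie on use count 2; key 1 is older, so it goes
print(cache.get(1))  # -1
print(cache.get(3))  # 3
print(cache.get(4))  # 4
```

The `min(...)` call is the O(n) step that the frequency buckets and `min_freq` are designed to eliminate.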
approach: |
We solve this using **Two Hash Maps + Doubly-Linked Lists**:
**Step 1: Define the data structures**
- `key_to_node`: Hash map from key to node (stores value, frequency, and list position)
- `freq_to_list`: Hash map from frequency to a doubly-linked list of nodes with that frequency
- `min_freq`: Integer tracking the current minimum frequency in the cache
- `capacity`: Maximum number of items the cache can hold
- `size`: Current number of items in the cache
&nbsp;
**Step 2: Implement the `get` operation**
- If key doesn't exist, return `-1`
- If key exists:
  - Remove the node from its current frequency list
  - Increment the node's frequency
  - Add the node to the new frequency list (at the head, marking it as most recently used)
  - Update `min_freq` if the old frequency list is now empty and was the minimum
  - Return the value
&nbsp;
**Step 3: Implement the `put` operation**
- If capacity is 0, do nothing
- If key exists:
  - Update the value
  - Call the same "touch" logic as `get` to update frequency
- If key doesn't exist:
  - If cache is at capacity, evict the LFU item (tail of `freq_to_list[min_freq]`)
  - Create a new node with frequency 1
  - Add to `key_to_node` and `freq_to_list[1]`
  - Reset `min_freq` to 1 (new items always have the lowest possible frequency)
&nbsp;
**Step 4: Implement helper for moving nodes between frequency lists**
- Remove node from old frequency's list
- If that list becomes empty and it was `min_freq`, increment `min_freq`
- Add node to new frequency's list at the head (most recently used position)
&nbsp;
Using doubly-linked lists allows O(1) removal from anywhere and O(1) insertion at head. The hash maps provide O(1) key lookup and O(1) access to any frequency bucket.
common_pitfalls:
- title: Using a Min-Heap for Frequency Tracking
description: |
A natural instinct is to use a min-heap to always know the minimum frequency. However, heaps have O(log n) operations for insertion and deletion.
The problem requires O(1) average time. Instead, track `min_freq` as a simple integer that only changes in predictable ways: it resets to 1 on insert, and may increment by 1 when we access items and empty a frequency bucket.
wrong_approach: "Min-heap to find lowest frequency"
correct_approach: "Track min_freq integer, only increments or resets to 1"
- title: Not Handling the Tie-Breaker Correctly
description: |
When multiple keys have the same frequency, the **least recently used** among them should be evicted. Within each frequency bucket, you need an ordered structure.
Using a set or unordered collection loses recency information. A doubly-linked list with newest items at the head and oldest at the tail provides O(1) access to the LRU item for eviction.
wrong_approach: "Set or unordered collection for frequency groups"
correct_approach: "Doubly-linked list with head=MRU, tail=LRU"
- title: Forgetting to Update min_freq on Access
description: |
When you access an item and increment its frequency, you might empty its old frequency bucket. If that bucket was the minimum, you need to update `min_freq`.
For example, if `min_freq=2` and the only item with frequency 2 gets accessed (now frequency 3), `min_freq` should become 3. Forgetting this leads to evicting items from empty buckets.
wrong_approach: "Only update min_freq on eviction"
correct_approach: "Check if old frequency bucket is empty after access"
- title: Zero Capacity Edge Case
description: |
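The constraints above guarantee `capacity >= 1`, so this case cannot occur in conforming input, but a defensive guard is cheap. With capacity 0, every `put` should be a no-op and every `get` should return -1. Without the guard, the first `put` would try to evict from a frequency bucket that is empty or does not exist, and raise an error.
Check `if capacity == 0` at the start of `put`.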
key_takeaways:
- "**Compound data structures**: Complex cache problems often require combining multiple data structures (hash maps + linked lists) to achieve O(1) for different operations"
- "**Frequency bucketing**: Grouping items by frequency and tracking the minimum avoids expensive searches"
- "**Doubly-linked lists for O(1) removal**: When you need to remove items from the middle of a sequence in O(1), doubly-linked lists are the answer"
- "**LFU vs LRU**: LRU only tracks recency; LFU tracks frequency with recency as tie-breaker. LFU is more complex but can be more cache-efficient for certain access patterns"
time_complexity: "O(1) for both `get` and `put` operations. Hash map lookups, linked list insertions/deletions, and frequency updates are all constant time."
space_complexity: "O(capacity). We store at most `capacity` items, each with constant overhead for hash map entries and list nodes."
solutions:
- approach_name: Two Hash Maps with Doubly-Linked Lists
is_optimal: true
code: |
class Node:
    """Doubly-linked list node storing key, value, and frequency."""
    def __init__(self, key: int, value: int):
        self.key = key
        self.value = value
        self.freq = 1  # New items start with frequency 1
        self.prev = None
        self.next = None


class DoublyLinkedList:
    """Doubly-linked list with sentinel nodes for O(1) operations."""
    def __init__(self):
        # Sentinel nodes simplify edge cases
        self.head = Node(0, 0)  # Dummy head (MRU side)
        self.tail = Node(0, 0)  # Dummy tail (LRU side)
        self.head.next = self.tail
        self.tail.prev = self.head
        self.size = 0

    def add_first(self, node: Node) -> None:
        """Add node right after head (most recently used position)."""
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node
        self.size += 1

    def remove(self, node: Node) -> None:
        """Remove a node from anywhere in the list in O(1)."""
        node.prev.next = node.next
        node.next.prev = node.prev
        self.size -= 1

    def remove_last(self) -> Node:
        """Remove and return the tail node (least recently used)."""
        if self.size == 0:
            return None
        last = self.tail.prev
        self.remove(last)
        return last

    def is_empty(self) -> bool:
        return self.size == 0


class LFUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.size = 0
        self.min_freq = 0
        # Maps key -> Node
        self.key_to_node: dict[int, Node] = {}
        # Maps frequency -> DoublyLinkedList of nodes with that frequency
        self.freq_to_list: dict[int, DoublyLinkedList] = {}

    def _update_freq(self, node: Node) -> None:
        """Move node from current frequency bucket to next frequency bucket."""
        freq = node.freq
        # Remove from current frequency list
        self.freq_to_list[freq].remove(node)
        # If this was the min frequency list and it's now empty, increment min_freq
        if freq == self.min_freq and self.freq_to_list[freq].is_empty():
            self.min_freq += 1
        # Increment frequency and add to new list
        node.freq += 1
        if node.freq not in self.freq_to_list:
            self.freq_to_list[node.freq] = DoublyLinkedList()
        self.freq_to_list[node.freq].add_first(node)

    def get(self, key: int) -> int:
        if key not in self.key_to_node:
            return -1
        node = self.key_to_node[key]
        # Update frequency (this also marks it as most recently used)
        self._update_freq(node)
        return node.value

    def put(self, key: int, value: int) -> None:
        if self.capacity == 0:
            return
        if key in self.key_to_node:
            # Key exists: update value and frequency
            node = self.key_to_node[key]
            node.value = value
            self._update_freq(node)
        else:
            # New key: check if we need to evict
            if self.size >= self.capacity:
                # Evict LFU (and LRU among ties)
                lfu_list = self.freq_to_list[self.min_freq]
                evicted = lfu_list.remove_last()
                del self.key_to_node[evicted.key]
                self.size -= 1
            # Insert new node with frequency 1
            new_node = Node(key, value)
            self.key_to_node[key] = new_node
            if 1 not in self.freq_to_list:
                self.freq_to_list[1] = DoublyLinkedList()
            self.freq_to_list[1].add_first(new_node)
            self.min_freq = 1  # New items always have the minimum frequency
            self.size += 1
explanation: |
**Time Complexity:** O(1) for both `get` and `put`.
- Hash map lookups: O(1)
- Doubly-linked list add/remove: O(1)
- Frequency bucket access: O(1)
**Space Complexity:** O(capacity).
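We maintain at most `capacity` nodes, each referenced once from `key_to_node` and once from a frequency list. Note that this code never deletes an emptied frequency bucket, so `freq_to_list` can accumulate empty lists as frequencies climb; deleting a bucket when it becomes empty keeps the total strictly at O(capacity).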
- approach_name: OrderedDict per Frequency (Python-Specific)
is_optimal: true
code: |
from collections import OrderedDict, defaultdict


class LFUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.min_freq = 0
        # Maps key -> (value, frequency)
        self.key_to_val_freq: dict[int, tuple[int, int]] = {}
        # Maps frequency -> OrderedDict of keys (maintains insertion order)
        # OrderedDict gives us O(1) move_to_end and popitem
        self.freq_to_keys: dict[int, OrderedDict] = defaultdict(OrderedDict)

    def _update_freq(self, key: int) -> None:
        """Increment frequency of key and move to appropriate bucket."""
        value, freq = self.key_to_val_freq[key]
        # Remove from current frequency bucket
        del self.freq_to_keys[freq][key]
        # Update min_freq if we emptied the minimum bucket
        if not self.freq_to_keys[freq] and freq == self.min_freq:
            self.min_freq += 1
        # Add to next frequency bucket
        new_freq = freq + 1
        self.freq_to_keys[new_freq][key] = None  # Value doesn't matter
        self.key_to_val_freq[key] = (value, new_freq)

    def get(self, key: int) -> int:
        if key not in self.key_to_val_freq:
            return -1
        self._update_freq(key)
        return self.key_to_val_freq[key][0]

    def put(self, key: int, value: int) -> None:
        if self.capacity == 0:
            return
        if key in self.key_to_val_freq:
            # Update existing key
            _, freq = self.key_to_val_freq[key]
            self.key_to_val_freq[key] = (value, freq)
            self._update_freq(key)
        else:
            # Evict if at capacity
            if len(self.key_to_val_freq) >= self.capacity:
                # popitem(last=False) removes oldest (LRU) from min freq bucket
                evicted_key, _ = self.freq_to_keys[self.min_freq].popitem(last=False)
                del self.key_to_val_freq[evicted_key]
            # Insert new key with frequency 1
            self.key_to_val_freq[key] = (value, 1)
            self.freq_to_keys[1][key] = None
            self.min_freq = 1
explanation: |
**Time Complexity:** O(1) average for both `get` and `put`.
Python's `OrderedDict` maintains insertion order and provides O(1) `popitem(last=False)` and key deletion, which are the two operations this solution relies on. We use it as a pseudo-linked-list where order represents recency.
**Space Complexity:** O(capacity).
This approach is more Pythonic and concise, leveraging built-in data structures. The trade-off is that it's language-specific and relies on Python's `OrderedDict` implementation details.
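As a minimal sketch of the bucket behaviour described above (the names `bucket`, `oldest_key`, and `remaining` are illustrative and not part of the solution), an `OrderedDict` used as a per-frequency bucket keeps keys in insertion order, so the least recently used key sits at the front:

```python
from collections import OrderedDict

# One frequency bucket: keys are cache keys, values are unused placeholders.
bucket = OrderedDict()
for key in (10, 20, 30):
    bucket[key] = None  # Insertion order doubles as recency order

# Evict the oldest (least recently used) key in this bucket.
oldest_key, _ = bucket.popitem(last=False)

# Remove an arbitrary key, as happens when its frequency is bumped.
del bucket[30]

remaining = list(bucket)
```

After these steps `oldest_key` is `10` and `remaining` is `[20]`.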

@@ -0,0 +1,190 @@
title: Linked List Cycle
slug: linked-list-cycle
difficulty: easy
leetcode_id: 141
leetcode_url: https://leetcode.com/problems/linked-list-cycle/
categories:
- linked-lists
- two-pointers
- hash-tables
patterns:
- fast-slow-pointers
description: |
Given `head`, the head of a linked list, determine if the linked list has a cycle in it.
There is a cycle in a linked list if there is some node in the list that can be reached again by continuously following the `next` pointer. Internally, `pos` is used to denote the index of the node that tail's `next` pointer is connected to. **Note that `pos` is not passed as a parameter**.
Return `true` *if there is a cycle in the linked list*. Otherwise, return `false`.
constraints: |
- `0 <= number of nodes <= 10^4`
- `-10^5 <= Node.val <= 10^5`
- `pos` is `-1` or a valid index in the linked list
examples:
- input: "head = [3,2,0,-4], pos = 1"
output: "true"
explanation: "There is a cycle in the linked list, where the tail connects to the 1st node (0-indexed)."
- input: "head = [1,2], pos = 0"
output: "true"
explanation: "There is a cycle in the linked list, where the tail connects to the 0th node."
- input: "head = [1], pos = -1"
output: "false"
explanation: "There is no cycle in the linked list."
explanation:
intuition: |
Imagine two runners on a circular track: one runs twice as fast as the other.
If the track is truly circular (has a cycle), the fast runner will eventually "lap" the slow runner and they'll meet. If the track has an end (no cycle), the fast runner will simply reach the finish line without ever meeting the slow runner again.
This is the core insight behind **Floyd's Cycle Detection Algorithm** (also called the "tortoise and hare" algorithm). We use two pointers moving at different speeds:
- The **slow pointer** moves one step at a time
- The **fast pointer** moves two steps at a time
If there's a cycle, the fast pointer will eventually catch up to the slow pointer from behind (they'll meet inside the cycle). If there's no cycle, the fast pointer will reach `null` and we know the list terminates.
Why does this work? Once both pointers enter the cycle, the fast pointer gains one node on the slow pointer with each iteration. Since the cycle has finite length, they're guaranteed to meet.
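The closing-gap argument can be checked with a small simulation. This is only a sketch: the helper name `iterations_until_meeting` and the idea of modelling the cycle as positions `0..cycle_length-1` are assumptions made for illustration, not part of the solution.

```python
def iterations_until_meeting(cycle_length: int, slow_pos: int, fast_pos: int) -> int:
    """Count iterations until two pointers already inside a cycle coincide."""
    iterations = 0
    while slow_pos != fast_pos:
        slow_pos = (slow_pos + 1) % cycle_length  # Slow moves one node
        fast_pos = (fast_pos + 2) % cycle_length  # Fast moves two nodes
        iterations += 1
    return iterations


# Fast is 3 nodes behind slow (going forwards) in a cycle of length 5,
# so it needs exactly 3 iterations to close the gap.
meeting_iterations = iterations_until_meeting(5, slow_pos=0, fast_pos=2)
```

The number of iterations always equals the initial forward distance from `fast` to `slow`, which is strictly less than the cycle length.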
approach: |
We solve this using **Floyd's Cycle Detection (Fast-Slow Pointers)**:
**Step 1: Handle edge cases**
- If the list is empty (`head` is `null`) or has only one node with no cycle, return `false`
&nbsp;
**Step 2: Initialise two pointers**
- `slow`: Starts at `head`, moves one node at a time
- `fast`: Starts at `head`, moves two nodes at a time
&nbsp;
**Step 3: Traverse the list**
- While `fast` and `fast.next` are not `null`:
- Move `slow` forward by one: `slow = slow.next`
- Move `fast` forward by two: `fast = fast.next.next`
- If `slow == fast`, we've found a cycle — return `true`
&nbsp;
**Step 4: Return the result**
- If the loop exits (fast reached `null`), there's no cycle — return `false`
&nbsp;
The beauty of this approach is that it uses constant space while guaranteeing detection if a cycle exists.
common_pitfalls:
- title: Using Extra Space with Hash Set
description: |
A straightforward approach is to use a hash set to track visited nodes:
- Traverse the list, adding each node to a set
- If you encounter a node already in the set, there's a cycle
This works correctly with **O(n) time**, but uses **O(n) space** for the hash set. The follow-up asks for O(1) space, which the fast-slow pointer approach achieves.
wrong_approach: "Hash set to track visited nodes (O(n) space)"
correct_approach: "Fast-slow pointers (O(1) space)"
- title: Checking Node Values Instead of References
description: |
A common mistake is comparing `slow.val == fast.val` instead of `slow == fast`.
Node *values* can be duplicated (the constraint allows values from `-10^5` to `10^5`), but node *references* (memory addresses) are unique. Two different nodes might have the same value, so comparing values could give false positives.
Always compare the node references themselves, not their values.
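A short illustration of the difference, using hypothetical variable names (`first`, `second`, `alias`). In Python, `is` tests identity explicitly; `==` on a class that defines no `__eq__` also falls back to identity.

```python
class ListNode:
    def __init__(self, val: int = 0, next: "ListNode | None" = None):
        self.val = val
        self.next = next


first = ListNode(7)
second = ListNode(7)  # A different node that happens to hold the same value
alias = first         # A second name for the very same node

same_value = first.val == second.val   # True: the values are equal
same_node = first is second            # False: they are distinct nodes
alias_is_same_node = alias is first    # True: both names refer to one node
```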
wrong_approach: "Comparing slow.val == fast.val"
correct_approach: "Comparing slow == fast (reference equality)"
- title: Null Pointer Exceptions
description: |
When moving the fast pointer two steps, you must check both `fast` and `fast.next` before accessing `fast.next.next`.
If `fast` is `null`, accessing `fast.next` throws an error. If `fast.next` is `null`, accessing `fast.next.next` throws an error.
The loop condition `while fast and fast.next` ensures both checks are satisfied before moving.
wrong_approach: "Moving fast without null checks"
correct_approach: "Check fast and fast.next before moving"
key_takeaways:
- "**Floyd's algorithm**: The fast-slow pointer technique detects cycles in O(n) time and O(1) space — a fundamental pattern for linked list problems"
- "**Why they meet**: In a cycle, the fast pointer gains one position per iteration on the slow pointer, guaranteeing they meet within one cycle length"
- "**Reference vs value**: Always compare node references, not values, when checking for the same node"
- "**Foundation for harder problems**: This same technique extends to finding the cycle start point (LeetCode 142) and finding the middle of a linked list"
time_complexity: "O(n). In the worst case, both pointers traverse the entire list. If there's a cycle, they meet within O(n) steps."
space_complexity: "O(1). We only use two pointer variables (`slow` and `fast`), regardless of the list size."
solutions:
- approach_name: Floyd's Cycle Detection (Fast-Slow Pointers)
is_optimal: true
code: |
class ListNode:
def __init__(self, val: int = 0, next: 'ListNode | None' = None):
self.val = val
self.next = next
def has_cycle(head: ListNode | None) -> bool:
# Handle empty list
if not head:
return False
# Initialise slow and fast pointers
slow = head
fast = head
# Traverse until fast reaches the end
while fast and fast.next:
# Move slow one step
slow = slow.next
# Move fast two steps
fast = fast.next.next
# If they meet, there's a cycle
if slow == fast:
return True
# Fast reached the end — no cycle
return False
explanation: |
**Time Complexity:** O(n). Without a cycle, `fast` reaches the end after about n/2 iterations. With a cycle, `slow` takes at most n steps before the pointers meet, so `fast` takes at most 2n steps.
**Space Complexity:** O(1) — Only two pointer variables are used.
The fast pointer moves twice as fast as the slow pointer. If there's a cycle, the fast pointer will eventually catch up to the slow pointer inside the cycle. If there's no cycle, the fast pointer reaches the end.
- approach_name: Hash Set
is_optimal: false
code: |
class ListNode:
def __init__(self, val: int = 0, next: 'ListNode | None' = None):
self.val = val
self.next = next
def has_cycle(head: ListNode | None) -> bool:
# Track visited nodes
visited: set[ListNode] = set()
current = head
while current:
# If we've seen this node before, there's a cycle
if current in visited:
return True
# Mark this node as visited
visited.add(current)
current = current.next
# Reached the end — no cycle
return False
explanation: |
**Time Complexity:** O(n) — We traverse each node once.
**Space Complexity:** O(n) — We store up to n node references in the hash set.
This approach is intuitive: track every node you visit, and if you see the same node twice, there's a cycle. While correct, it uses extra space that the fast-slow pointer approach avoids.

@@ -0,0 +1,226 @@
title: Longest Common Prefix
slug: longest-common-prefix
difficulty: easy
leetcode_id: 14
leetcode_url: https://leetcode.com/problems/longest-common-prefix/
categories:
- strings
- arrays
patterns:
- two-pointers
function_signature: "def longest_common_prefix(strs: list[str]) -> str:"
test_cases:
visible:
- input: { strs: ["flower", "flow", "flight"] }
expected: "fl"
- input: { strs: ["dog", "racecar", "car"] }
expected: ""
hidden:
- input: { strs: ["a"] }
expected: "a"
- input: { strs: ["", "b"] }
expected: ""
- input: { strs: ["abc", "abc", "abc"] }
expected: "abc"
- input: { strs: ["ab", "a"] }
expected: "a"
- input: { strs: ["cir", "car"] }
expected: "c"
description: |
Write a function to find the longest common prefix string amongst an array of strings.
If there is no common prefix, return an empty string `""`.
constraints: |
- `1 <= strs.length <= 200`
- `0 <= strs[i].length <= 200`
- `strs[i]` consists of only lowercase English letters if it is non-empty.
examples:
- input: 'strs = ["flower","flow","flight"]'
output: '"fl"'
explanation: "The first two characters 'f' and 'l' are common to all three strings."
- input: 'strs = ["dog","racecar","car"]'
output: '""'
explanation: "There is no common prefix among the input strings."
explanation:
intuition: |
Imagine you have a stack of papers, each with a word written on it. You want to find how many letters at the start of each word are exactly the same across all papers.
Think of it like aligning all the words vertically by their first character:
```
f l o w e r
f l o w
f l i g h t
```
You scan column by column from left to right. As long as every word has the same character in that column, you include it in your prefix. The moment you find a mismatch (like 'o' vs 'i' in column 3 above), you stop — everything before that point is your longest common prefix.
The key insight is that the common prefix can only be as long as the **shortest string** in the array, and we can stop as soon as any character differs.
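One possible way to express the column-by-column scan in Python is to let `zip` build the columns. This is a sketch under the assumption of a helper named `common_prefix_by_columns`, which is illustrative only; `zip` stops at the shortest string, which matches the observation above.

```python
def common_prefix_by_columns(strs: list[str]) -> str:
    """Collect leading columns in which every string has the same character."""
    prefix_chars = []
    # zip(*strs) yields one tuple per column and stops at the shortest string.
    for column in zip(*strs):
        if len(set(column)) != 1:
            break  # First column containing a mismatch
        prefix_chars.append(column[0])
    return "".join(prefix_chars)


flower_prefix = common_prefix_by_columns(["flower", "flow", "flight"])
no_prefix = common_prefix_by_columns(["dog", "racecar", "car"])
```

Here `flower_prefix` is `"fl"` and `no_prefix` is the empty string.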
approach: |
We solve this using a **Vertical Scanning** approach:
**Step 1: Handle edge case**
- If the input array is empty, return an empty string `""`
&nbsp;
**Step 2: Iterate character by character**
- Use the first string as a reference
- For each character position `i` in the first string, compare it against the character at position `i` in every other string
&nbsp;
**Step 3: Check for mismatches or end of string**
- If any string is shorter than position `i`, we've reached the end of that string — return the prefix found so far
- If any string has a different character at position `i`, we've found a mismatch — return the prefix found so far
&nbsp;
**Step 4: Build the prefix**
- If all strings match at position `i`, continue to the next position
- After checking all positions in the first string, return it entirely (it's the common prefix)
&nbsp;
This approach efficiently scans vertically through all strings simultaneously, stopping at the first point of divergence.
common_pitfalls:
- title: Forgetting the Empty Array Case
description: |
If the input array is empty, there are no strings to compare. Attempting to access `strs[0]` will cause an index error.
Always check for an empty array first and return `""` immediately.
wrong_approach: "Directly accessing strs[0] without checking array length"
correct_approach: "Check if strs is empty before processing"
- title: Index Out of Bounds on Shorter Strings
description: |
When comparing character by character, some strings may be shorter than others. For example, with `["ab", "a"]`, checking index 1 on the second string causes an error.
Always verify that the current index is within bounds for each string before accessing it: `if i >= len(strs[j])`.
wrong_approach: "Accessing strs[j][i] without checking length"
correct_approach: "Check i < len(strs[j]) before accessing the character"
- title: Using Horizontal Scanning Inefficiently
description: |
A horizontal approach compares strings pairwise: find the common prefix of strings 1 and 2, then compare that result with string 3, and so on.
While correct, this can be less efficient in practice. If the first two strings share a long prefix but string 3 is very different, you've done unnecessary work. Vertical scanning stops at the first column with a mismatch across all strings.
wrong_approach: "Pairwise comparison accumulating prefixes"
correct_approach: "Vertical scanning comparing all strings at each position"
key_takeaways:
- "**Vertical scanning pattern**: When comparing multiple sequences, scanning position-by-position across all sequences simultaneously can be more efficient than pairwise comparison"
- "**Early termination**: Stop as soon as you find a mismatch or reach the end of any string — no need to process further"
- "**Use the shortest string**: The common prefix can never be longer than the shortest string, so checking bounds is essential"
- "**Foundation for string problems**: This pattern of character-by-character comparison appears in many string matching problems"
time_complexity: "O(S), where S is the total number of characters across all strings. In the worst case, all strings are identical and we compare every character."
space_complexity: "O(1). We only use a few variables for iteration, not counting the output string."
solutions:
- approach_name: Vertical Scanning
is_optimal: true
code: |
def longest_common_prefix(strs: list[str]) -> str:
# Handle empty input
if not strs:
return ""
# Use the first string as reference
for i in range(len(strs[0])):
char = strs[0][i]
# Compare this character with all other strings
for j in range(1, len(strs)):
# Check if we've reached the end of this string
# or if the characters don't match
if i >= len(strs[j]) or strs[j][i] != char:
# Return prefix up to (but not including) position i
return strs[0][:i]
# All characters in first string matched all other strings
return strs[0]
explanation: |
**Time Complexity:** O(S), where S is the total number of characters across all strings. We compare each character at most once.
**Space Complexity:** O(1) — only using index variables, not counting the output.
We scan vertically through all strings at each character position. The moment we find any mismatch or reach the end of any string, we return what we've found so far.
- approach_name: Horizontal Scanning
is_optimal: false
code: |
def longest_common_prefix(strs: list[str]) -> str:
# Handle empty input
if not strs:
return ""
# Start with the first string as the initial prefix
prefix = strs[0]
# Compare prefix with each subsequent string
for i in range(1, len(strs)):
# Shrink prefix until it matches the start of current string
while not strs[i].startswith(prefix):
# Remove last character from prefix
prefix = prefix[:-1]
# No common prefix exists
if not prefix:
return ""
return prefix
explanation: |
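**Time Complexity:** O(S) in the usual analysis, where S is the total number of characters. Note that this implementation shrinks the prefix one character at a time, and each shrink copies the prefix and re-runs `startswith`, so the worst case is closer to O(S + m²), where m is the length of the first string.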
**Space Complexity:** O(1) — only storing the prefix reference.
This approach starts with the first string as the candidate prefix and progressively shortens it until it matches the beginning of each subsequent string. While correct, it may do more work than vertical scanning when early strings share a long prefix but later strings diverge early.
- approach_name: Binary Search
is_optimal: false
code: |
def longest_common_prefix(strs: list[str]) -> str:
# Handle empty input
if not strs:
return ""
def is_common_prefix(length: int) -> bool:
"""Check if first 'length' chars of strs[0] is a prefix of all strings."""
prefix = strs[0][:length]
return all(s.startswith(prefix) for s in strs)
# Find the minimum string length
min_len = min(len(s) for s in strs)
# Binary search for the longest valid prefix length
low, high = 0, min_len
while low < high:
# Use upper middle to avoid infinite loop
mid = (low + high + 1) // 2
if is_common_prefix(mid):
# Prefix of this length works, try longer
low = mid
else:
# Prefix too long, try shorter
high = mid - 1
return strs[0][:low]
explanation: |
**Time Complexity:** O(S * log(m)) — where S is the sum of all characters and m is the minimum string length. Binary search runs log(m) iterations, each checking all strings.
**Space Complexity:** O(1) — only using variables for binary search.
This approach uses binary search on the length of the prefix. While theoretically interesting, it's generally slower in practice than vertical scanning because it may repeatedly check the same characters. Included to demonstrate how binary search can apply to string problems.

@@ -0,0 +1,208 @@
title: Longest Common Subsequence
slug: longest-common-subsequence
difficulty: medium
leetcode_id: 1143
leetcode_url: https://leetcode.com/problems/longest-common-subsequence/
categories:
- strings
- dynamic-programming
patterns:
- dynamic-programming
description: |
Given two strings `text1` and `text2`, return *the length of their longest **common subsequence***. If there is no **common subsequence**, return `0`.
A **subsequence** of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.
For example, `"ace"` is a subsequence of `"abcde"`.
A **common subsequence** of two strings is a subsequence that is common to both strings.
constraints: |
- `1 <= text1.length, text2.length <= 1000`
- `text1` and `text2` consist of only lowercase English characters.
examples:
- input: 'text1 = "abcde", text2 = "ace"'
output: "3"
explanation: 'The longest common subsequence is "ace" and its length is 3.'
- input: 'text1 = "abc", text2 = "abc"'
output: "3"
explanation: 'The longest common subsequence is "abc" and its length is 3.'
- input: 'text1 = "abc", text2 = "def"'
output: "0"
explanation: "There is no such common subsequence, so the result is 0."
explanation:
intuition: |
Imagine you're comparing two sequences of characters, trying to find the longest chain of letters that appears in both — not necessarily consecutively, but in the same relative order.
Think of it like comparing two playlists of songs. You want to find the longest sequence of songs that appears in both playlists, where the songs appear in the same order (though not necessarily back-to-back). You can't rearrange songs — you can only skip ones that don't match.
The **key insight** is that this problem has **optimal substructure**: if we know the LCS of smaller prefixes of both strings, we can build up to the answer for the full strings. When characters match, we extend our subsequence; when they don't, we take the better result from either excluding the last character of the first string or the second.
This is a classic **dynamic programming** problem because:
1. We can break it into overlapping subproblems (comparing prefixes of different lengths)
2. The solution to larger problems depends on solutions to smaller ones
3. We can store intermediate results to avoid redundant computation
approach: |
We solve this using a **2D Dynamic Programming** approach with a table where `dp[i][j]` represents the length of the LCS of `text1[0:i]` and `text2[0:j]`.
**Step 1: Create the DP table**
- Create a 2D array `dp` of size `(m+1) x (n+1)` where `m = len(text1)` and `n = len(text2)`
- The extra row and column handle the base case of empty prefixes
- Initialise all values to `0` (the LCS of any string with an empty string is `0`)
&nbsp;
**Step 2: Fill the table using the recurrence relation**
- Iterate through each cell `dp[i][j]` for `i` from `1` to `m` and `j` from `1` to `n`
- If `text1[i-1] == text2[j-1]`: the characters match, so `dp[i][j] = dp[i-1][j-1] + 1`
- Otherwise: take the maximum of excluding one character from either string: `dp[i][j] = max(dp[i-1][j], dp[i][j-1])`
&nbsp;
**Step 3: Return the result**
- The answer is in `dp[m][n]`, representing the LCS of the complete strings
&nbsp;
The recurrence works because when characters match, we've found a common element and extend the LCS of the previous prefixes. When they don't match, we take the best LCS we can get by ignoring one character from either string.
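To make the recurrence concrete, here is a sketch that returns the whole table instead of just the final cell. The helper name `lcs_table` is an assumption for illustration; the fill logic follows the steps above.

```python
def lcs_table(text1: str, text2: str) -> list[list[int]]:
    """Build the full (m+1) x (n+1) LCS table for inspection."""
    m, n = len(text1), len(text2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if text1[i - 1] == text2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1  # Extend the diagonal
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])  # Best of top/left
    return dp


table = lcs_table("abcde", "ace")
# Rows correspond to prefixes of "abcde", columns to prefixes of "ace":
#        ""  a  c  e
#   ""    0  0  0  0
#   a     0  1  1  1
#   b     0  1  1  1
#   c     0  1  2  2
#   d     0  1  2  2
#   e     0  1  2  3
```

The bottom-right cell, `table[5][3]`, holds the answer `3`.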
common_pitfalls:
- title: Confusing Subsequence with Substring
description: |
A **substring** must be contiguous (consecutive characters), while a **subsequence** allows gaps.
For `"abcde"` and `"ace"`:
- The longest common **substring** is `"a"` or `"c"` or `"e"` (length 1)
- The longest common **subsequence** is `"ace"` (length 3)
Using a substring algorithm (like checking all contiguous windows) will give the wrong answer. LCS requires dynamic programming because we need to track non-contiguous matches.
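For contrast, a minimal sketch of the contiguous variant follows. The function name `longest_common_substring_length` is hypothetical, and this is a standard substring DP rather than anything taken from the solutions: a mismatch resets the run to zero instead of carrying the best value forward.

```python
def longest_common_substring_length(a: str, b: str) -> int:
    """Length of the longest contiguous run shared by both strings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # A match extends the run ending at the previous characters.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
            # On a mismatch curr[j] stays 0: the contiguous run is broken.
        prev = curr
    return best


substring_length = longest_common_substring_length("abcde", "ace")
```

For `"abcde"` and `"ace"` this gives `1`, whereas the longest common subsequence has length 3.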
wrong_approach: "Sliding window for contiguous matches"
correct_approach: "2D DP tracking all prefix combinations"
- title: The Brute Force Exponential Trap
description: |
A naive approach might try all possible subsequences of one string and check if each exists in the other.
For a string of length `n`, there are `2^n` possible subsequences. With constraints up to `1000` characters, enumerating `2^1000` subsequences is computationally infeasible.
Plain recursion on prefixes avoids the enumeration, but without memoisation it recomputes the same subproblems many times and is still exponential in the worst case. The DP table ensures each subproblem is solved exactly once.
wrong_approach: "Generate all subsequences and check membership"
correct_approach: "Bottom-up DP with O(m*n) time"
- title: Off-by-One Index Errors
description: |
The DP table has dimensions `(m+1) x (n+1)` to include the empty prefix base case.
When comparing characters, use `text1[i-1]` and `text2[j-1]` (not `text1[i]` and `text2[j]`) because `dp[i][j]` represents prefixes of length `i` and `j`.
A common mistake is using `text1[i]` which causes an index out of bounds error or compares the wrong characters.
wrong_approach: "Compare text1[i] with text2[j] directly"
correct_approach: "Compare text1[i-1] with text2[j-1] when filling dp[i][j]"
key_takeaways:
- "**Classic DP problem**: LCS is a foundational dynamic programming problem that appears in many variations (edit distance, diff algorithms, DNA sequence alignment)"
- "**2D table pattern**: When comparing two sequences, a 2D DP table where `dp[i][j]` represents the answer for prefixes of length `i` and `j` is a common technique"
- "**Optimal substructure**: Match = extend previous result by 1; no match = take the best of two subproblems"
- "**Space optimisation possible**: Since each row only depends on the previous row, you can reduce space from O(m*n) to O(min(m,n)) using rolling arrays"
time_complexity: "O(m * n). We fill each cell of the `m x n` DP table exactly once, where `m` and `n` are the lengths of the two strings."
space_complexity: "O(m * n). We use a 2D array of size `(m+1) x (n+1)` to store intermediate results. This can be optimised to O(min(m, n)) using a rolling array since we only need the previous row."
solutions:
- approach_name: 2D Dynamic Programming
is_optimal: true
code: |
def longest_common_subsequence(text1: str, text2: str) -> int:
m, n = len(text1), len(text2)
# Create DP table with extra row/col for empty string base case
# dp[i][j] = LCS length of text1[0:i] and text2[0:j]
dp = [[0] * (n + 1) for _ in range(m + 1)]
# Fill the table row by row
for i in range(1, m + 1):
for j in range(1, n + 1):
if text1[i - 1] == text2[j - 1]:
# Characters match: extend LCS from diagonal
dp[i][j] = dp[i - 1][j - 1] + 1
else:
# No match: take best of excluding one char from either string
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
# Answer is LCS of complete strings
return dp[m][n]
explanation: |
**Time Complexity:** O(m * n) — We iterate through every cell in the DP table once.
**Space Complexity:** O(m * n) — We store the full 2D DP table.
This bottom-up approach builds the solution systematically. Each cell depends only on already-computed cells (top, left, and diagonal), so we fill row by row. The final cell contains the answer for the complete strings.
- approach_name: Space-Optimised DP (Rolling Array)
is_optimal: true
code: |
def longest_common_subsequence(text1: str, text2: str) -> int:
# Ensure text2 is the shorter string to minimise space
if len(text1) < len(text2):
text1, text2 = text2, text1
m, n = len(text1), len(text2)
# Only keep two rows: previous and current
prev = [0] * (n + 1)
curr = [0] * (n + 1)
for i in range(1, m + 1):
for j in range(1, n + 1):
if text1[i - 1] == text2[j - 1]:
# Match: extend from diagonal (prev row, prev column)
curr[j] = prev[j - 1] + 1
else:
# No match: best of top (prev[j]) or left (curr[j-1])
curr[j] = max(prev[j], curr[j - 1])
# Roll the arrays: current becomes previous for next iteration
prev, curr = curr, prev
# Answer is in prev (after the swap)
return prev[n]
explanation: |
**Time Complexity:** O(m * n) — Same iteration as the 2D approach.
**Space Complexity:** O(min(m, n)) — Only two arrays of length `n+1` are used.
Since each row only depends on the immediately previous row, we can discard older rows. By swapping `prev` and `curr` after each row, we maintain a "rolling window" of just two rows. We also swap strings if needed to ensure we use the shorter length for our arrays.
- approach_name: Recursive with Memoisation
is_optimal: false
code: |
def longest_common_subsequence(text1: str, text2: str) -> int:
from functools import lru_cache
@lru_cache(maxsize=None)
def lcs(i: int, j: int) -> int:
# Base case: empty prefix
if i == 0 or j == 0:
return 0
# Characters match: include in LCS
if text1[i - 1] == text2[j - 1]:
return lcs(i - 1, j - 1) + 1
# No match: try excluding from each string
return max(lcs(i - 1, j), lcs(i, j - 1))
return lcs(len(text1), len(text2))
explanation: |
**Time Complexity:** O(m * n) — Each unique `(i, j)` state is computed once due to memoisation.
**Space Complexity:** O(m * n) — For the memoisation cache, plus O(m + n) for the recursion stack.
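This top-down approach is more intuitive but uses more memory due to the recursion stack. In Python the recursion depth can reach `m + n` (up to 2000 under the constraints), which exceeds the default recursion limit of 1000, so the limit must be raised with `sys.setrecursionlimit` or a `RecursionError` may occur. The iterative DP solution is generally preferred in interviews for its predictable space usage and lack of stack-depth issues.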

@@ -0,0 +1,231 @@
title: Longest Consecutive Sequence
slug: longest-consecutive-sequence
difficulty: medium
leetcode_id: 128
leetcode_url: https://leetcode.com/problems/longest-consecutive-sequence/
categories:
- arrays
- hash-tables
patterns:
- union-find
description: |
Given an unsorted array of integers `nums`, return *the length of the longest consecutive elements sequence*.
You must write an algorithm that runs in `O(n)` time.
constraints: |
- `0 <= nums.length <= 10^5`
- `-10^9 <= nums[i] <= 10^9`
examples:
- input: "nums = [100, 4, 200, 1, 3, 2]"
output: "4"
explanation: "The longest consecutive elements sequence is [1, 2, 3, 4]. Therefore its length is 4."
- input: "nums = [0, 3, 7, 2, 5, 8, 4, 6, 0, 1]"
output: "9"
explanation: "The longest consecutive elements sequence is [0, 1, 2, 3, 4, 5, 6, 7, 8]. Therefore its length is 9."
- input: "nums = [1, 0, 1, 2]"
output: "3"
explanation: "The longest consecutive elements sequence is [0, 1, 2]. Therefore its length is 3."
explanation:
intuition: |
Imagine you have a collection of scattered puzzle pieces, each with a number on it. Your goal is to find the **longest chain** where each piece connects to the next (consecutive numbers). The naive approach would be to pick up each piece and search through all other pieces for its neighbour — but that's slow.
The key insight is this: **a consecutive sequence always has a starting point** — a number that has no predecessor (`num - 1` doesn't exist in the array). If we can identify these starting points efficiently, we can then count forward from each one to find the sequence length.
Think of it like this: instead of blindly searching, we first dump all the puzzle pieces into a bag (a hash set) for O(1) lookups. Then, for each piece, we ask: "Is there a piece with `num - 1`?" If not, this piece is the **start of a potential sequence**. We then count forward: does `num + 1` exist? Does `num + 2` exist? And so on.
By only counting forward from sequence starts, we ensure each number is visited at most twice (once when added to the set, once when counted in a sequence), giving us O(n) time.
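The idea of a sequence start can be sketched on its own. The helper name `sequence_starts` is an assumption for illustration, and the `sorted` call is only there to make the result easy to read; the actual solution does not need it.

```python
def sequence_starts(nums: list[int]) -> list[int]:
    """Return every number that has no predecessor in the input."""
    num_set = set(nums)
    # A number starts a sequence when num - 1 is absent from the set.
    return sorted(num for num in num_set if num - 1 not in num_set)


starts = sequence_starts([100, 4, 200, 1, 3, 2])
```

For this input `starts` is `[1, 100, 200]`: only three forward counts are needed, and the one beginning at `1` finds the longest run.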
approach: |
We solve this using a **Hash Set Approach**:
**Step 1: Handle edge cases**
- If the array is empty, return `0`
&nbsp;
**Step 2: Build a hash set**
- Convert the array to a set for O(1) lookups
- This also automatically handles duplicates
&nbsp;
**Step 3: Find sequence starting points**
- Iterate through each number in the set
- A number is a sequence start if `num - 1` is NOT in the set
- This ensures we only start counting from the beginning of each sequence
&nbsp;
**Step 4: Count consecutive elements**
- For each starting point, count how many consecutive numbers exist
- Keep checking if `num + 1`, `num + 2`, etc. are in the set
- Track the maximum sequence length found
&nbsp;
**Step 5: Return the result**
- Return the longest sequence length found
&nbsp;
This approach is efficient because each number is processed at most twice: once to check if it's a starting point, and once when counting a sequence.
common_pitfalls:
- title: The Sorting Trap
description: |
A natural first instinct is to sort the array and then scan for consecutive elements. While this works correctly, sorting takes **O(n log n)** time.
The problem explicitly requires O(n) time complexity, so a sorting-based solution does not meet this requirement. The hash set approach achieves O(n) expected time by trading space for time.
wrong_approach: "Sort then scan for consecutive elements"
correct_approach: "Use a hash set for O(1) lookups"
- title: Counting From Every Number
description: |
If you try to count the sequence length starting from every number in the array, you'll get O(n²) time complexity in the worst case.
For example, with `nums = [1, 2, 3, 4, 5]`, starting from `5` counts 1 element, from `4` counts 2, from `3` counts 3, and so on — leading to 1 + 2 + 3 + 4 + 5 = O(n²) total work.
The fix is to **only count from sequence starting points** (numbers where `num - 1` doesn't exist). This ensures each element is counted exactly once across all sequences.
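The difference in work can be measured directly. In this sketch the function `count_forward_steps` and its `only_from_starts` flag are hypothetical names introduced for the comparison; it counts how many times the inner loop advances.

```python
def count_forward_steps(nums: list[int], only_from_starts: bool) -> int:
    """Count inner-loop advances, with or without the sequence-start check."""
    num_set = set(nums)
    steps = 0
    for num in num_set:
        if only_from_starts and num - 1 in num_set:
            continue  # Not a sequence start, so skip counting from here
        current = num
        while current + 1 in num_set:
            current += 1
            steps += 1
    return steps


naive_steps = count_forward_steps([1, 2, 3, 4, 5], only_from_starts=False)
guarded_steps = count_forward_steps([1, 2, 3, 4, 5], only_from_starts=True)
```

With five consecutive numbers, `naive_steps` is `10` (4 + 3 + 2 + 1 + 0) while `guarded_steps` is `4`; the gap grows quadratically as the run gets longer.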
wrong_approach: "Count sequence length from every element"
correct_approach: "Only count from elements where num - 1 is not in the set"
- title: Not Handling Duplicates
description: |
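The array may contain duplicate values (e.g., `[1, 0, 1, 2]`). If you iterate over the original array instead of the set, you repeat the same counting work for every duplicate of a sequence start, which can degrade towards O(n²) when a start value is repeated many times. A sort-and-scan solution that does not skip duplicates will also report incorrect lengths.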
Using a set automatically deduplicates the input, ensuring each unique number is processed only once.
key_takeaways:
- "**Hash set for O(1) lookups**: When you need to check membership repeatedly, convert to a set first"
- "**Identify sequence boundaries**: Only start counting from elements that begin a sequence (`num - 1` not present)"
- "**Each element visited a constant number of times**: Counting only from sequence starts ensures O(n) despite the nested-looking loops"
- "**Space-time tradeoff**: We use O(n) space to achieve O(n) time instead of O(n log n)"
time_complexity: "O(n). Each number is visited at most twice — once when checking if it's a sequence start, and once when counting forward from a starting point."
space_complexity: "O(n). We store all unique elements in a hash set."
solutions:
- approach_name: Hash Set
is_optimal: true
code: |
def longest_consecutive(nums: list[int]) -> int:
    if not nums:
        return 0
    # Build a set for O(1) lookups
    num_set = set(nums)
    longest = 0
    for num in num_set:
        # Only start counting if this is the beginning of a sequence
        # (i.e., num - 1 is not in the set)
        if num - 1 not in num_set:
            current_num = num
            current_length = 1
            # Count consecutive numbers
            while current_num + 1 in num_set:
                current_num += 1
                current_length += 1
            # Update the longest sequence found
            longest = max(longest, current_length)
    return longest
explanation: |
**Time Complexity:** O(n) — Each number is processed at most twice.
**Space Complexity:** O(n) — Hash set stores all unique elements.
The key optimisation is only counting from sequence starting points. When we find a number where `num - 1` doesn't exist, we know it's the start of a new sequence and count forward from there.
- approach_name: Sorting
is_optimal: false
code: |
def longest_consecutive(nums: list[int]) -> int:
    if not nums:
        return 0
    # Sort the array in place (this mutates the caller's list)
    nums.sort()
    longest = 1
    current_length = 1
    for i in range(1, len(nums)):
        # Skip duplicates
        if nums[i] == nums[i - 1]:
            continue
        # Check if consecutive
        if nums[i] == nums[i - 1] + 1:
            current_length += 1
        else:
            # Sequence broken, start fresh
            longest = max(longest, current_length)
            current_length = 1
    return max(longest, current_length)
explanation: |
**Time Complexity:** O(n log n) — Dominated by the sorting step.
**Space Complexity:** O(1) or O(n) — Depends on the sorting algorithm used.
This approach sorts the array first, then scans linearly to find consecutive sequences. While simpler to understand, it doesn't meet the O(n) time requirement specified in the problem. Included here to illustrate the tradeoff between simplicity and optimal complexity.
- approach_name: Union-Find
is_optimal: false
code: |
def longest_consecutive(nums: list[int]) -> int:
    if not nums:
        return 0
    # Map each number to its parent (a union-find parent table, despite the name)
    num_to_idx = {}
    for i, num in enumerate(nums):
        if num not in num_to_idx:
            num_to_idx[num] = num  # Each number is its own parent initially
    # Union-Find with path compression
    def find(x):
        if num_to_idx[x] != x:
            num_to_idx[x] = find(num_to_idx[x])
        return num_to_idx[x]
    def union(x, y):
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            # Always point to the larger number
            if root_x < root_y:
                num_to_idx[root_x] = root_y
            else:
                num_to_idx[root_y] = root_x
    # Union consecutive numbers
    for num in num_to_idx:
        if num + 1 in num_to_idx:
            union(num, num + 1)
    # Count sequence lengths by finding the root of each number
    # and measuring distance to root
    longest = 0
    for num in num_to_idx:
        root = find(num)
        longest = max(longest, root - num + 1)
    return longest
explanation: |
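**Time Complexity:** O(n log n) amortised as written — path compression alone, without union by rank or size, does not give the O(n × α(n)) bound (α is the inverse Ackermann function).
**Space Complexity:** O(n) — Storage for the parent mapping.
Union-Find groups consecutive numbers into the same set. Because every union points the smaller root at the larger one, each set's root is the largest number in its run, so `root - num + 1` is the length of the run from `num` upward. This is a valid alternative, but it's more complex than the hash set solution, and the recursive `find` can exceed Python's default recursion limit on long runs. The hash set approach is preferred for its simplicity and clarity.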


@@ -0,0 +1,237 @@
title: Longest Happy String
slug: longest-happy-string
difficulty: medium
leetcode_id: 1405
leetcode_url: https://leetcode.com/problems/longest-happy-string/
categories:
- strings
- heap
patterns:
- greedy
- heap
description: |
A string `s` is called **happy** if it satisfies the following conditions:
- `s` only contains the letters `'a'`, `'b'`, and `'c'`.
- `s` does not contain any of `"aaa"`, `"bbb"`, or `"ccc"` as a substring.
- `s` contains **at most** `a` occurrences of the letter `'a'`.
- `s` contains **at most** `b` occurrences of the letter `'b'`.
- `s` contains **at most** `c` occurrences of the letter `'c'`.
Given three integers `a`, `b`, and `c`, return *the **longest possible happy** string*. If there are multiple longest happy strings, return *any of them*. If there is no such string, return *the empty string* `""`.
A **substring** is a contiguous sequence of characters within a string.
constraints: |
- `0 <= a, b, c <= 100`
- `a + b + c > 0`
examples:
- input: "a = 1, b = 1, c = 7"
output: '"ccaccbcc"'
explanation: '"ccbccacc" would also be a correct answer.'
- input: "a = 7, b = 1, c = 0"
output: '"aabaa"'
explanation: "It is the only correct answer in this case."
explanation:
intuition: |
Imagine you're filling a jar with coloured marbles (a, b, c), but you have a rule: **no more than two marbles of the same colour can sit adjacent to each other**.
The key insight is that we should always **prioritise using the most abundant character** — but with a critical constraint. If the last two characters in our result are the same, we must pick a *different* character next, even if it's not the most abundant.
Think of it like a balancing act: we want to "burn through" the character with the highest count as fast as possible (using it twice in a row when allowed), while using less frequent characters as "separators" to break up potential triplets.
This greedy strategy works because:
1. By always picking the most frequent valid character, we maximise the length of the result
2. Using a character twice when possible (aa, bb, cc) is optimal — it depletes the larger counts faster
3. When forced to use a less frequent character as a separator, we only use it once to minimise "waste"
A **max-heap** naturally gives us the character with the highest remaining count at each step, making this approach efficient.
approach: |
We solve this using a **Greedy approach with a Max-Heap**:
**Step 1: Build the max-heap**
- Create a max-heap containing tuples of `(count, character)` for each character with count > 0
- Use negative counts in Python's `heapq` since it's a min-heap by default
&nbsp;
**Step 2: Greedily build the string**
- While the heap is not empty:
- Pop the character with the highest count
- Check the last two characters of the result
- **If the last two characters are the same as the popped character**: we cannot use it (would create "aaa", "bbb", or "ccc")
- Pop the next most frequent character instead
- Use it once, then push both back
- If no alternative exists, we're done
- **Otherwise**: use the most frequent character
- Use it twice if count >= 2 and it won't create a triplet
- Use it once otherwise
- Push it back if count > 0
&nbsp;
**Step 3: Return the result**
- Return the built string
&nbsp;
The greedy choice of always using the most frequent valid character ensures we build the longest possible happy string.
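Because any longest happy string is accepted, it helps to have a checker for candidate outputs. The following is a minimal sketch of the rules from the problem statement; `is_happy` is a hypothetical helper name introduced here for illustration.

```python
from collections import Counter


def is_happy(s: str, a: int, b: int, c: int) -> bool:
    """Check the happy-string rules for a candidate string s."""
    # Only the letters 'a', 'b' and 'c' are allowed
    if any(ch not in "abc" for ch in s):
        return False
    # No run of three identical letters
    if any(triple in s for triple in ("aaa", "bbb", "ccc")):
        return False
    # Respect the per-letter budgets
    counts = Counter(s)
    return counts["a"] <= a and counts["b"] <= b and counts["c"] <= c


print(is_happy("ccaccbcc", 1, 1, 7))  # True
print(is_happy("ccbccacc", 1, 1, 7))  # True
print(is_happy("aabaa", 7, 1, 0))     # True
print(is_happy("cccc", 0, 0, 7))      # False (contains "ccc")
```

The checker only validates the rules; it does not prove that a string is the longest possible one.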
common_pitfalls:
- title: Always Using the Most Frequent Without Checking
description: |
A naive greedy approach might always pick the most frequent character without checking if it would create a triplet.
For example, with `a=5, b=2, c=1` and result so far `"aa"`, 'a' is still the most frequent character, but blindly picking it again would create `"aaa"`.
You must check the last two characters of the result before deciding which character to append.
wrong_approach: "Always append the most frequent character"
correct_approach: "Check last two characters first, switch to second-most-frequent if needed"
- title: Using Single Characters When Doubles Are Safe
description: |
When building the string, if the most frequent character doesn't match the last two, we can safely append it **twice** (if count >= 2).
Using only one character at a time when two are safe means we don't deplete the larger counts fast enough, potentially leaving characters unused.
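For example, with `c=7, a=1, b=1`: optimal is "ccaccbcc" (length 8). Placing a separator after every single 'c' gives "cacbcc" (length 6), because both separators are spent before the 'c's run out.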
wrong_approach: "Always append just one character at a time"
correct_approach: "Append two characters when it's safe and count allows"
- title: Not Handling the "No Valid Character" Case
description: |
When the last two characters are the same as the most frequent, and there's no second character available, the string is complete.
Failing to handle this edge case can cause infinite loops or index errors.
For example, with `a=3, b=0, c=0`, the answer is `"aa"` — we cannot use all three 'a's.
wrong_approach: "Assume there's always a valid character to append"
correct_approach: "Check if heap is empty after skipping the blocked character"
key_takeaways:
- "**Greedy with constraints**: Always pick the locally optimal choice (most frequent), but respect the constraint (no triplets)"
- "**Max-heap for dynamic priorities**: When the 'best' option changes as you consume resources, a heap keeps priorities efficiently updated"
- "**Double usage optimisation**: When allowed, use the most frequent character twice to deplete large counts faster"
- "**Pattern recognition**: This problem combines greedy character selection with the 'reorganise string' pattern seen in problems like Task Scheduler"
time_complexity: "O((a + b + c) * log 3) = O(n), where n = a + b + c. Each loop iteration appends at least one character, so there are at most n iterations, and each performs a constant number of heap operations on at most 3 elements, which cost O(log 3) = O(1)."
space_complexity: "O(a + b + c) = O(n). The result string stores up to `a + b + c` characters. The heap uses O(1) space since it contains at most 3 elements."
solutions:
- approach_name: Greedy with Max-Heap
is_optimal: true
code: |
import heapq
def longest_diverse_string(a: int, b: int, c: int) -> str:
    # Max-heap: use negative counts for max-heap behaviour
    heap = []
    if a > 0:
        heapq.heappush(heap, (-a, 'a'))
    if b > 0:
        heapq.heappush(heap, (-b, 'b'))
    if c > 0:
        heapq.heappush(heap, (-c, 'c'))
    result = []
    while heap:
        # Get the most frequent character
        count1, char1 = heapq.heappop(heap)
        # Check if last two chars are the same as char1
        if len(result) >= 2 and result[-1] == char1 and result[-2] == char1:
            # Can't use char1, try the second most frequent
            if not heap:
                break  # No alternative, we're done
            count2, char2 = heapq.heappop(heap)
            result.append(char2)  # Use only once as separator
            count2 += 1  # Decrement (negative, so add 1)
            if count2 < 0:
                heapq.heappush(heap, (count2, char2))
            # Push char1 back unchanged
            heapq.heappush(heap, (count1, char1))
        else:
            # Safe to use char1 — use twice if possible
            if -count1 >= 2:
                result.append(char1)
                result.append(char1)
                count1 += 2
            else:
                result.append(char1)
                count1 += 1
            if count1 < 0:
                heapq.heappush(heap, (count1, char1))
    return ''.join(result)
explanation: |
**Time Complexity:** O(n) where n = a + b + c — Each loop iteration appends at least one character, and heap operations on at most 3 elements are O(1).
**Space Complexity:** O(n) — The result string can be up to length n.
We greedily select the most frequent valid character at each step. When the most frequent would create a triplet, we use the second-most-frequent as a separator. Using characters twice when safe maximises output length.
- approach_name: Greedy Without Heap
is_optimal: false
code: |
def longest_diverse_string(a: int, b: int, c: int) -> str:
    result = []
    counts = [a, b, c]
    chars = ['a', 'b', 'c']
    while True:
        # Find the character with max count that won't create triplet
        # Sort indices by count descending
        order = sorted(range(3), key=lambda i: -counts[i])
        added = False
        for i in order:
            if counts[i] == 0:
                continue
            # Check if this char would create a triplet
            if (len(result) >= 2 and
                    result[-1] == chars[i] and
                    result[-2] == chars[i]):
                continue
            # Safe to add this character
            result.append(chars[i])
            counts[i] -= 1
            # Try to add a second one if safe
            if (counts[i] > 0 and
                    (len(result) < 2 or
                     result[-1] != chars[i] or
                     result[-2] != chars[i])):
                # Check if adding another would still be safe
                # (won't create triplet with what follows)
                # We only add two if this char has the max count
                # to deplete it faster
                if i == order[0]:
                    result.append(chars[i])
                    counts[i] -= 1
            added = True
            break
        if not added:
            break
    return ''.join(result)
explanation: |
**Time Complexity:** O(n) — Each iteration adds 1-2 characters, and sorting 3 elements is O(1).
**Space Complexity:** O(n) — The result string can be up to length n.
This approach manually tracks counts and sorts to find the most frequent valid character. While functionally equivalent, it's less elegant than the heap solution and slightly harder to generalise to more characters.


@@ -0,0 +1,271 @@
title: Longest Increasing Path in a Matrix
slug: longest-increasing-path-in-a-matrix
difficulty: hard
leetcode_id: 329
leetcode_url: https://leetcode.com/problems/longest-increasing-path-in-a-matrix/
categories:
- graphs
- dynamic-programming
- arrays
patterns:
- dfs
- dynamic-programming
- matrix-traversal
description: |
Given an `m x n` integers `matrix`, return *the length of the longest increasing path in* `matrix`.
From each cell, you can either move in four directions: left, right, up, or down. You **may not** move **diagonally** or move **outside the boundary** (i.e., wrap-around is not allowed).
constraints: |
- `m == matrix.length`
- `n == matrix[i].length`
- `1 <= m, n <= 200`
- `0 <= matrix[i][j] <= 2^31 - 1`
examples:
- input: "matrix = [[9,9,4],[6,6,8],[2,1,1]]"
output: "4"
explanation: "The longest increasing path is [1, 2, 6, 9]."
- input: "matrix = [[3,4,5],[3,2,6],[2,2,1]]"
output: "4"
explanation: "The longest increasing path is [3, 4, 5, 6]. Moving diagonally is not allowed."
- input: "matrix = [[1]]"
output: "1"
explanation: "A single cell forms a path of length 1."
explanation:
intuition: |
Imagine the matrix as a landscape where each cell's value represents its elevation. You're trying to find the longest route where you're always climbing uphill.
The key insight is that this problem has **optimal substructure**: the longest path starting from any cell equals 1 (the cell itself) plus the maximum of the longest paths from its valid neighbours (neighbours with strictly greater values).
Think of it like water flowing downhill. If you flip the perspective and consider paths going from higher to lower values, water from any cell can only flow to cells with smaller values. The longest path from a cell is determined by where its "downstream" neighbours can reach.
Here's why memoisation works so well: once you've computed the longest increasing path starting from cell `(i, j)`, that answer never changes. No matter which cell you're exploring later, if it can move to `(i, j)`, you already know the best path from there. This turns what would be exponential exploration into a linear traversal of the matrix.
The directed acyclic graph (DAG) structure is crucial. Since we can only move to strictly greater values, there are no cycles. This guarantees that our DFS will terminate and that dynamic programming is applicable.
approach: |
We solve this using **DFS with Memoisation**:
**Step 1: Set up the memoisation cache**
- Create a 2D array `memo` of the same dimensions as the matrix
- `memo[i][j]` will store the longest increasing path starting from cell `(i, j)`
- Initialise all values to `0` (or use a dictionary for sparse storage)
&nbsp;
**Step 2: Define the DFS function**
- For a cell `(i, j)`, if `memo[i][j]` is already computed (non-zero), return it immediately
- Otherwise, explore all four neighbours (up, down, left, right)
- For each neighbour `(ni, nj)` where `matrix[ni][nj] > matrix[i][j]`:
- Recursively compute the longest path from `(ni, nj)`
- Track the maximum path length among all valid neighbours
- Set `memo[i][j] = 1 + max_neighbour_path` (1 for the current cell plus the best continuation)
- Return `memo[i][j]`
&nbsp;
**Step 3: Iterate through all cells**
- For each cell in the matrix, call the DFS function
- Track the global maximum path length across all starting cells
- Cells with cached results will return immediately, ensuring each cell is fully computed only once
&nbsp;
**Step 4: Return the result**
- Return the maximum path length found
&nbsp;
The memoisation ensures that each cell is visited and computed exactly once, giving us optimal time complexity. The DFS naturally handles the dependency order since smaller-value cells depend on larger-value cells, and there are no cycles.
common_pitfalls:
- title: Brute Force Without Memoisation
description: |
A naive DFS that doesn't cache results will recompute paths from the same cell multiple times.
Consider a matrix where many paths converge to the same cell. Without memoisation, you'd compute the path from that cell once for every path that reaches it.
With a `200 x 200` matrix, this can lead to exponential time complexity, causing **Time Limit Exceeded** errors.
wrong_approach: "Plain DFS exploring all paths without caching"
correct_approach: "DFS with memoisation to cache computed path lengths"
- title: Forgetting Boundary Checks
description: |
When exploring neighbours, you must check that the neighbour indices are within bounds before accessing the matrix.
In Python, `matrix[-1][0]` does not raise an error: negative indices wrap around to the last row, silently giving an incorrect result. Accessing `matrix[m][0]` raises `IndexError`.
Always validate `0 <= ni < m` and `0 <= nj < n` before comparing values.
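One way to keep the bounds check and the value check together is a small neighbour generator. This is a sketch only; `increasing_neighbours` and `DIRECTIONS` are hypothetical names introduced here.

```python
from collections.abc import Iterator

# Up, down, left, right
DIRECTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def increasing_neighbours(
    matrix: list[list[int]], i: int, j: int
) -> Iterator[tuple[int, int]]:
    """Yield in-bounds neighbours of (i, j) with a strictly greater value."""
    m, n = len(matrix), len(matrix[0])
    for di, dj in DIRECTIONS:
        ni, nj = i + di, j + dj
        # Bounds are checked first, so matrix[ni][nj] is never read out of range
        if 0 <= ni < m and 0 <= nj < n and matrix[ni][nj] > matrix[i][j]:
            yield ni, nj


matrix = [[9, 9, 4], [6, 6, 8], [2, 1, 1]]
print(list(increasing_neighbours(matrix, 2, 1)))  # [(1, 1), (2, 0)]
print(list(increasing_neighbours(matrix, 0, 0)))  # []
print(matrix[-1][0])  # 2: a negative index wraps to the last row
```

Because `and` short-circuits, the value comparison only runs once both indices are known to be in range.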
wrong_approach: "Checking only the value condition without bounds"
correct_approach: "Check bounds first, then check if neighbour value is greater"
- title: Using Non-Strict Inequality
description: |
The path must be **strictly increasing**. Using `>=` instead of `>` when comparing neighbour values can create infinite loops (since equal adjacent values would let you bounce back and forth forever).
The problem specifies "increasing path", which means each step must go to a strictly larger value.
wrong_approach: "Using matrix[ni][nj] >= matrix[i][j]"
correct_approach: "Using matrix[ni][nj] > matrix[i][j]"
- title: Modifying the Matrix
description: |
Some solutions attempt to mark visited cells by modifying the matrix values. This breaks the algorithm because:
1. You might need to visit the same cell from different starting points
2. The memoised value depends on the original matrix values
Use a separate `memo` array instead of modifying the input.
wrong_approach: "Setting matrix[i][j] = -1 to mark as visited"
correct_approach: "Use a separate memo array for caching"
key_takeaways:
- "**DFS + Memoisation pattern**: When exploring paths in a DAG structure, memoisation converts exponential brute force into polynomial time"
- "**Recognising DAG structure**: The strictly increasing constraint ensures no cycles, making dynamic programming applicable"
- "**Top-down vs bottom-up**: This problem is naturally suited to top-down DP (DFS with memo) since we explore from arbitrary starting points"
- "**Matrix traversal foundation**: This pattern extends to many grid problems where you need to find optimal paths with constraints"
time_complexity: "O(m * n). Each cell is computed exactly once and cached. The DFS visits each cell at most once for computation, with O(1) lookups for cached results."
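space_complexity: "O(m * n). We use a 2D memo array of the same size as the input matrix. The recursion stack can also reach O(m * n) depth in the worst case (e.g., a strictly increasing snake path). For a 200 x 200 matrix that is up to 40,000 frames, which exceeds CPython's default recursion limit, so the limit may need raising or the BFS approach can be used instead."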
solutions:
- approach_name: DFS with Memoisation
is_optimal: true
code: |
def longest_increasing_path(matrix: list[list[int]]) -> int:
    if not matrix or not matrix[0]:
        return 0
    m, n = len(matrix), len(matrix[0])
    # Cache to store longest path starting from each cell
    memo = [[0] * n for _ in range(m)]
    # Four directions: up, down, left, right
    directions = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    def dfs(i: int, j: int) -> int:
        # Return cached result if already computed
        if memo[i][j] != 0:
            return memo[i][j]
        # At minimum, the path length is 1 (the cell itself)
        max_length = 1
        # Explore all four neighbours
        for di, dj in directions:
            ni, nj = i + di, j + dj
            # Check bounds and strictly increasing condition
            if 0 <= ni < m and 0 <= nj < n and matrix[ni][nj] > matrix[i][j]:
                # Recurse and track the maximum path
                max_length = max(max_length, 1 + dfs(ni, nj))
        # Cache the result before returning
        memo[i][j] = max_length
        return max_length
    # Try starting from every cell and track global maximum
    result = 0
    for i in range(m):
        for j in range(n):
            result = max(result, dfs(i, j))
    return result
explanation: |
**Time Complexity:** O(m * n) — Each cell is computed exactly once due to memoisation.
**Space Complexity:** O(m * n) — For the memo array and recursion stack.
The DFS explores paths starting from each cell, but memoisation ensures we never recompute. The strictly increasing constraint guarantees no cycles, making this a DAG traversal problem perfectly suited for dynamic programming.
- approach_name: Topological Sort (BFS)
is_optimal: true
code: |
from collections import deque
def longest_increasing_path(matrix: list[list[int]]) -> int:
    if not matrix or not matrix[0]:
        return 0
    m, n = len(matrix), len(matrix[0])
    directions = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    # outdegree[i][j] = count of neighbours with greater values
    outdegree = [[0] * n for _ in range(m)]
    # Calculate outdegree for each cell
    for i in range(m):
        for j in range(n):
            for di, dj in directions:
                ni, nj = i + di, j + dj
                if 0 <= ni < m and 0 <= nj < n and matrix[ni][nj] > matrix[i][j]:
                    outdegree[i][j] += 1
    # Start BFS from cells with outdegree 0 (local maxima)
    queue = deque()
    for i in range(m):
        for j in range(n):
            if outdegree[i][j] == 0:
                queue.append((i, j))
    # BFS layer by layer, counting the number of layers
    path_length = 0
    while queue:
        path_length += 1
        # Process all cells at current level
        for _ in range(len(queue)):
            i, j = queue.popleft()
            # Check all neighbours with smaller values
            for di, dj in directions:
                ni, nj = i + di, j + dj
                if 0 <= ni < m and 0 <= nj < n and matrix[ni][nj] < matrix[i][j]:
                    outdegree[ni][nj] -= 1
                    # If all larger neighbours processed, add to queue
                    if outdegree[ni][nj] == 0:
                        queue.append((ni, nj))
    return path_length
explanation: |
**Time Complexity:** O(m * n) — Each cell is processed exactly once.
**Space Complexity:** O(m * n) — For the outdegree array and queue.
This approach treats the matrix as a DAG where edges point from smaller to larger values. We use topological sort starting from "sink" nodes (local maxima with no outgoing edges). The number of BFS layers equals the longest path length. This is an elegant alternative that avoids recursion.
- approach_name: Brute Force DFS
is_optimal: false
code: |
def longest_increasing_path(matrix: list[list[int]]) -> int:
    if not matrix or not matrix[0]:
        return 0
    m, n = len(matrix), len(matrix[0])
    directions = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    def dfs(i: int, j: int) -> int:
        max_length = 1
        for di, dj in directions:
            ni, nj = i + di, j + dj
            if 0 <= ni < m and 0 <= nj < n and matrix[ni][nj] > matrix[i][j]:
                # No caching - recomputes every time
                max_length = max(max_length, 1 + dfs(ni, nj))
        return max_length
    result = 0
    for i in range(m):
        for j in range(n):
            result = max(result, dfs(i, j))
    return result
explanation: |
**Time Complexity:** O(4^(m*n)) worst case — Exponential due to repeated exploration.
**Space Complexity:** O(m * n) — Recursion stack depth.
This naive approach recomputes paths from the same cell multiple times. While correct, it's far too slow for the given constraints and will result in TLE. Included to illustrate why memoisation is essential.


@@ -0,0 +1,234 @@
title: Longest Increasing Subsequence
slug: longest-increasing-subsequence
difficulty: medium
leetcode_id: 300
leetcode_url: https://leetcode.com/problems/longest-increasing-subsequence/
categories:
- arrays
- dynamic-programming
- binary-search
patterns:
- dynamic-programming
- binary-search
description: |
Given an integer array `nums`, return *the length of the longest **strictly increasing subsequence***.
A **subsequence** is a sequence that can be derived from an array by deleting some or no elements without changing the order of the remaining elements.
constraints: |
- `1 <= nums.length <= 2500`
- `-10^4 <= nums[i] <= 10^4`
examples:
- input: "nums = [10,9,2,5,3,7,101,18]"
output: "4"
explanation: "The longest increasing subsequence is [2,3,7,101], therefore the length is 4."
- input: "nums = [0,1,0,3,2,3]"
output: "4"
explanation: "One possible longest increasing subsequence is [0,1,2,3]."
- input: "nums = [7,7,7,7,7,7,7]"
output: "1"
explanation: "The longest increasing subsequence is any single element, since all elements are equal and a strictly increasing sequence cannot have duplicates."
explanation:
intuition: |
Imagine you're building a tower of blocks where each block you add must be larger than the one below it. You have a sequence of blocks laid out in a row, and you must pick blocks from left to right (you can skip blocks, but you can't go backwards).
The question becomes: what's the tallest tower you can build?
The key insight is that for each position in the array, we want to know: **"What's the longest increasing subsequence that ends at this position?"** If we can answer this for every position, the answer to the original problem is simply the maximum of all these values.
Think of it like this: when you're at position `i`, you look back at all previous positions `j` where `nums[j] < nums[i]`. You can extend any increasing subsequence ending at `j` by adding `nums[i]` to it. So the longest subsequence ending at `i` is one more than the longest subsequence ending at any valid `j`.
For optimal O(n log n) time, we use a different mental model: maintain a "patience sorting" pile where we track the smallest ending element for subsequences of each length. This allows us to use binary search to efficiently find where each new element fits.
approach: |
We'll cover two approaches: the classic DP solution and the optimised binary search solution.
**Approach A: Dynamic Programming (O(n^2))**
**Step 1: Initialise the DP array**
- Create a `dp` array where `dp[i]` represents the length of the longest increasing subsequence ending at index `i`
- Initialise all values to `1` since every element is a subsequence of length 1 by itself
&nbsp;
**Step 2: Fill the DP array**
- For each index `i` from `1` to `n-1`, look at all previous indices `j` from `0` to `i-1`
- If `nums[j] < nums[i]`, we can extend the subsequence ending at `j` by adding `nums[i]`
- Update: `dp[i] = max(dp[i], dp[j] + 1)`
&nbsp;
**Step 3: Return the maximum**
- The answer is `max(dp)` since the longest subsequence might end at any position
&nbsp;
**Approach B: Binary Search with Patience Sorting (O(n log n))**
**Step 1: Initialise a "tails" array**
- `tails[i]` represents the smallest ending element of all increasing subsequences of length `i+1`
- Start with an empty array
&nbsp;
**Step 2: Process each element**
- For each number in `nums`, use binary search to find its position in `tails`
- If the number is larger than all elements in `tails`, append it (we found a longer subsequence)
- Otherwise, replace the first element in `tails` that is >= the current number
- This maintains the invariant that `tails` is always sorted
&nbsp;
**Step 3: Return the length**
- The length of `tails` is the answer
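To see how `tails` evolves in Approach B, the sketch below records a snapshot after each element. `tails_trace` is a hypothetical helper introduced here for illustration; it uses the standard library's `bisect.bisect_left`.

```python
import bisect


def tails_trace(nums: list[int]) -> list[list[int]]:
    """Return a copy of tails after processing each element of nums."""
    tails: list[int] = []
    snapshots = []
    for num in nums:
        # First position whose value is >= num
        pos = bisect.bisect_left(tails, num)
        if pos == len(tails):
            tails.append(num)   # num extends the longest subsequence so far
        else:
            tails[pos] = num    # num is a smaller tail for length pos + 1
        snapshots.append(list(tails))
    return snapshots


for step in tails_trace([10, 9, 2, 5, 3, 7, 101, 18]):
    print(step)
# [10]
# [9]
# [2]
# [2, 5]
# [2, 3]
# [2, 3, 7]
# [2, 3, 7, 101]
# [2, 3, 7, 18]
```

The final snapshot has length 4, matching the expected answer for this input.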
common_pitfalls:
- title: Confusing Subsequence with Subarray
description: |
A **subsequence** allows skipping elements while maintaining relative order. A **subarray** must be contiguous.
For `[10,9,2,5,3,7,101,18]`:
- `[2,5,7,101]` is a valid subsequence (not contiguous, but maintains order)
- `[9,2,5]` is both a subarray and subsequence
Using a sliding window or subarray approach will give wrong answers since you'd miss non-contiguous increasing sequences.
wrong_approach: "Sliding window for contiguous elements"
correct_approach: "DP considering all previous elements or binary search on tails"
- title: Forgetting Strictly Increasing
description: |
The problem asks for **strictly increasing**, meaning equal elements don't count.
For `[1,3,3,5]`, the LIS is `[1,3,5]` with length 3, NOT `[1,3,3,5]` with length 4.
In the DP approach, use `nums[j] < nums[i]` (strict inequality), not `<=`.
In the binary search approach, use `bisect_left` (not `bisect_right`) to handle duplicates correctly.
wrong_approach: "Using <= instead of < for comparison"
correct_approach: "Strict inequality: nums[j] < nums[i]"
- title: O(n^2) Time Limit on Large Inputs
description: |
The basic DP solution is O(n^2). With `n = 2500`, this means up to 6.25 million operations, which is acceptable for this problem.
However, if constraints were larger (e.g., `n = 10^5`), the DP approach would be too slow. The binary search approach scales to O(n log n) for such cases.
Always check constraints to decide which approach is needed.
wrong_approach: "Always using O(n^2) DP without checking constraints"
correct_approach: "Use binary search for larger inputs"
- title: Misunderstanding the Tails Array
description: |
The `tails` array in the binary search approach does **not** store an actual LIS. It stores the smallest possible ending element for subsequences of each length.
For `[10,9,2,5,3,7]`:
- After processing: `tails = [2,3,7]`, which happens to be a valid LIS here
- For `[2,5,1]`: `tails = [1,5]`, but `[1,5]` is not a subsequence of the input because `1` comes after `5`; an actual LIS is `[2,5]`
The length of `tails` is always correct, but its contents may not form a valid subsequence from the input.
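If the subsequence itself is needed, one common extension is to remember, for each element, the index of its predecessor. The sketch below is one possible way to do that, not part of the solution above; the names `recover_lis`, `tails_idx`, and `prev` are hypothetical.

```python
import bisect


def recover_lis(nums: list[int]) -> list[int]:
    """Return one longest strictly increasing subsequence of nums."""
    tails: list[int] = []      # tails[k] = smallest tail value for length k + 1
    tails_idx: list[int] = []  # index in nums of the element stored in tails[k]
    prev = [-1] * len(nums)    # prev[i] = index of the element before nums[i]
    for i, num in enumerate(nums):
        pos = bisect.bisect_left(tails, num)
        if pos > 0:
            prev[i] = tails_idx[pos - 1]
        if pos == len(tails):
            tails.append(num)
            tails_idx.append(i)
        else:
            tails[pos] = num
            tails_idx[pos] = i
    # Walk the predecessor links back from the tail of the longest length
    result = []
    k = tails_idx[-1] if tails_idx else -1
    while k != -1:
        result.append(nums[k])
        k = prev[k]
    return result[::-1]


print(recover_lis([2, 5, 1]))                      # [2, 5]
print(recover_lis([10, 9, 2, 5, 3, 7, 101, 18]))   # [2, 3, 7, 18]
```

For `[2, 5, 1]` the final `tails` is `[1, 5]`, which is not a subsequence of the input, while the predecessor links still recover `[2, 5]`.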
wrong_approach: "Thinking tails array contains the actual LIS"
correct_approach: "Understand tails gives length only, not the actual subsequence"
key_takeaways:
- "**Classic DP pattern**: When computing properties of subsequences, think about what information you need at each position and how previous positions contribute"
- "**Patience sorting insight**: Maintaining sorted auxiliary structures enables binary search optimisation, reducing O(n^2) to O(n log n)"
- "**Foundation for harder problems**: LIS appears in many variations (longest bitonic subsequence, Russian doll envelopes, box stacking) and understanding both approaches unlocks these"
- "**Subsequence vs subarray**: Always clarify whether the problem allows skipping elements \u2014 this fundamentally changes the approach"
time_complexity: "O(n^2) for dynamic programming, O(n log n) for binary search. The DP approach compares each element with all previous elements; the binary search approach performs a log n search for each of n elements."
space_complexity: "O(n). Both approaches use an auxiliary array of size n (`dp` array for DP, `tails` array for binary search)."
solutions:
- approach_name: Binary Search (Patience Sorting)
is_optimal: true
code: |
import bisect
def length_of_lis(nums: list[int]) -> int:
    # tails[i] = smallest ending element of all increasing
    # subsequences of length i+1
    tails = []
    for num in nums:
        # Find position where num should be inserted
        # bisect_left handles duplicates correctly (strict increase)
        pos = bisect.bisect_left(tails, num)
        if pos == len(tails):
            # num is larger than all tails - extend longest subsequence
            tails.append(num)
        else:
            # Replace to maintain smallest possible tail
            tails[pos] = num
    # Length of tails = length of longest increasing subsequence
    return len(tails)
explanation: |
**Time Complexity:** O(n log n) — For each of n elements, we perform a binary search in O(log n).
**Space Complexity:** O(n) — The tails array can grow up to size n.
This approach maintains an array where `tails[i]` is the smallest ending element of all increasing subsequences of length `i+1`. By keeping tails sorted and using binary search, we efficiently determine whether to extend the longest subsequence or update an existing length's tail.
- approach_name: Dynamic Programming
is_optimal: false
code: |
def length_of_lis(nums: list[int]) -> int:
    n = len(nums)
    # dp[i] = length of longest increasing subsequence ending at i
    dp = [1] * n  # Every element is a subsequence of length 1
    for i in range(1, n):
        # Check all previous elements
        for j in range(i):
            # If we can extend the subsequence ending at j
            if nums[j] < nums[i]:
                dp[i] = max(dp[i], dp[j] + 1)
    # LIS can end at any position
    return max(dp)
explanation: |
**Time Complexity:** O(n^2) — Nested loops comparing each element with all previous elements.
**Space Complexity:** O(n) — The dp array stores one value per element.
This classic DP solution builds up the answer by computing, for each position, the longest increasing subsequence that ends at that position. The final answer is the maximum across all positions. While intuitive, this approach is slower than the binary search method for large inputs.
- approach_name: Brute Force (Recursion with Memoisation)
is_optimal: false
code: |
def length_of_lis(nums: list[int]) -> int:
    from functools import lru_cache
    n = len(nums)
    @lru_cache(maxsize=None)
    def lis_ending_at(index: int) -> int:
        """Return length of LIS ending at index."""
        # Base case: subsequence of just this element
        max_length = 1
        # Try extending from any valid previous position
        for prev in range(index):
            if nums[prev] < nums[index]:
                max_length = max(max_length, 1 + lis_ending_at(prev))
        return max_length
    # LIS can end at any position
    return max(lis_ending_at(i) for i in range(n))
explanation: |
**Time Complexity:** O(n^2). Same as iterative DP due to memoisation.
**Space Complexity:** O(n). Memoisation cache and recursion stack.
This recursive approach with memoisation is equivalent to the iterative DP but may be more intuitive for some. The recurrence relation is clear: the LIS ending at index `i` is 1 plus the maximum LIS ending at any previous index `j` where `nums[j] < nums[i]`. Included to show the connection between recursion and DP.

@@ -0,0 +1,171 @@
title: Longest Palindromic Substring
slug: longest-palindromic-substring
difficulty: medium
leetcode_id: 5
leetcode_url: https://leetcode.com/problems/longest-palindromic-substring/
categories:
- strings
- dynamic-programming
patterns:
- two-pointers
- dynamic-programming
description: |
Given a string `s`, return *the longest palindromic substring* in `s`.
A **palindrome** is a string that reads the same forward and backward.
constraints: |
- `1 <= s.length <= 1000`
- `s` consists of only digits and English letters
examples:
- input: 's = "babad"'
output: '"bab"'
explanation: '"aba" is also a valid answer — both have length 3.'
- input: 's = "cbbd"'
output: '"bb"'
explanation: "The longest palindromic substring is \"bb\" with length 2."
explanation:
intuition: |
Every palindrome has a **center**. For odd-length palindromes like "aba", the center is the middle character 'b'. For even-length palindromes like "abba", the center is the gap between the two 'b's.
Think of it like this: if we know the center, we can find the full palindrome by **expanding outward** — checking if the characters on both sides match. We keep expanding until they don't match.
The strategy is simple: try every possible center (each character and each gap between characters), expand to find the longest palindrome for that center, and track the overall longest.
This "expand around center" approach is intuitive and uses O(1) extra space, making it ideal for interviews.
approach: |
We solve this using **Expand Around Center**:
**Step 1: Define the expand helper function**
- `expand(left, right)` returns the bounds of the longest palindrome centered at this position
- While `left >= 0` and `right < len(s)` and `s[left] == s[right]`:
- Expand: decrement `left`, increment `right`
- Return the bounds of the palindrome (after adjusting for the last failed expansion)
&nbsp;
**Step 2: Try every possible center**
- For each index `i`:
- Try **odd-length** palindrome: `expand(i, i)` — center is single character
- Try **even-length** palindrome: `expand(i, i+1)` — center is between characters
- Update the best result if either expansion found a longer palindrome
&nbsp;
**Step 3: Return the longest palindrome**
- Track `start` and `end` indices of the best palindrome found
- Return `s[start:end+1]`
&nbsp;
Why does this work? By checking both odd and even centers at every position, we're guaranteed to find the center of the longest palindrome somewhere.
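The steps above can be sketched as a small tracing helper that lists what each center yields. The function name `palindromes_by_center` and the `"char i"` / `"gap i|j"` labels are made up for illustration. A string of length n has n character centers and n - 1 gap centers, so 2n - 1 in total.

```python
def palindromes_by_center(s: str) -> list[tuple[str, str]]:
    """List (center label, longest palindrome at that center) for all 2n - 1 centers."""

    def expand(left: int, right: int) -> str:
        # Bounds are checked before the characters are compared
        while left >= 0 and right < len(s) and s[left] == s[right]:
            left -= 1
            right += 1
        # The loop overshoots by one on each side, so slice from left + 1 up to right
        return s[left + 1:right]

    results: list[tuple[str, str]] = []
    for i in range(len(s)):
        results.append((f"char {i}", expand(i, i)))               # odd-length center
        if i + 1 < len(s):
            results.append((f"gap {i}|{i + 1}", expand(i, i + 1)))  # even-length center
    return results


for label, palindrome in palindromes_by_center("cbbd"):
    print(label, repr(palindrome))
```

For `"cbbd"` the only center that produces anything longer than one character is the gap between the two `b`s, which yields `"bb"`. Gap centers whose neighbours differ yield the empty string.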
common_pitfalls:
- title: Forgetting Even-Length Palindromes
description: |
If you only expand around single characters, you'll miss even-length palindromes like "abba" or "bb".
Every position needs two expansion attempts: one for odd (center at `i`) and one for even (center between `i` and `i+1`).
wrong_approach: "Only calling expand(i, i)"
correct_approach: "Call both expand(i, i) and expand(i, i+1)"
- title: Index Out of Bounds During Expansion
description: |
The expansion loop must check bounds **before** accessing characters. A common mistake is checking equality first, which causes an index error.
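A short sketch of how this fails in Python specifically. The names `expand_unsafe` and `expand_safe` are illustrative. One detail worth knowing: `s[-1]` does not raise in Python (it reads the last character), so the crash only appears when `right` runs past the end of the string.

```python
def expand_unsafe(s: str, left: int, right: int) -> tuple[int, int]:
    """Buggy order: characters are compared before the bounds are checked."""
    while s[left] == s[right] and left >= 0 and right < len(s):
        left -= 1
        right += 1
    return left + 1, right - 1


def expand_safe(s: str, left: int, right: int) -> tuple[int, int]:
    """Correct order: short-circuiting stops before any out-of-range access."""
    while left >= 0 and right < len(s) and s[left] == s[right]:
        left -= 1
        right += 1
    return left + 1, right - 1


try:
    expand_unsafe("aa", 1, 1)
except IndexError as exc:
    print("unsafe version failed:", exc)

print(expand_safe("aa", 1, 1))
```

With the safe ordering, `and` short-circuits as soon as a bound is violated, so `s[right]` is never evaluated out of range.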
wrong_approach: "while s[left] == s[right] and left >= 0 and right < len(s)"
correct_approach: "while left >= 0 and right < len(s) and s[left] == s[right]"
- title: Returning Length Instead of Substring
description: |
The problem asks for the actual substring, not just its length. Track the start and end positions of the best palindrome so you can extract it.
wrong_approach: "return max_length"
correct_approach: "return s[start:end+1]"
key_takeaways:
- "**Expand around center**: O(n²) time, O(1) space — optimal for interviews"
- "**Handle both odd and even**: Check single-character centers AND gaps between characters"
- "**Track positions, not just length**: You need to return the actual substring"
- "**Manacher's algorithm**: Can solve in O(n) but is complex — not expected in interviews"
time_complexity: "O(n²). For each of n possible centers, expansion can take up to O(n) time in the worst case."
space_complexity: "O(1). Only a few variables for tracking positions — no additional data structures."
solutions:
- approach_name: Expand Around Center
is_optimal: true
code: |
def longest_palindrome(s: str) -> str:
def expand(left: int, right: int) -> tuple[int, int]:
"""Expand around center and return palindrome bounds."""
while left >= 0 and right < len(s) and s[left] == s[right]:
left -= 1
right += 1
# Return bounds of palindrome (undo last expansion)
return left + 1, right - 1
start, end = 0, 0
for i in range(len(s)):
# Try odd-length palindrome (single character center)
l1, r1 = expand(i, i)
if r1 - l1 > end - start:
start, end = l1, r1
# Try even-length palindrome (center between characters)
l2, r2 = expand(i, i + 1)
if r2 - l2 > end - start:
start, end = l2, r2
return s[start:end + 1]
explanation: |
**Time Complexity:** O(n²) — n centers, up to n expansion steps each.
**Space Complexity:** O(1) — Only tracking start/end positions.
For each position, we try both odd and even palindrome centers. The expand function returns the bounds of the longest palindrome for that center. We track the overall longest and return it at the end.
- approach_name: Dynamic Programming
is_optimal: false
code: |
def longest_palindrome(s: str) -> str:
n = len(s)
if n < 2:
return s
# dp[i][j] = True if s[i:j+1] is a palindrome
dp = [[False] * n for _ in range(n)]
start, max_len = 0, 1
# Single characters are palindromes
for i in range(n):
dp[i][i] = True
# Check substrings of increasing length
for length in range(2, n + 1):
for i in range(n - length + 1):
j = i + length - 1
if length == 2:
# Two characters: palindrome if they match
dp[i][j] = s[i] == s[j]
else:
# Longer: palindrome if ends match AND inner is palindrome
dp[i][j] = s[i] == s[j] and dp[i + 1][j - 1]
if dp[i][j] and length > max_len:
start, max_len = i, length
return s[start:start + max_len]
explanation: |
**Time Complexity:** O(n²) — Fill n×n table.
**Space Complexity:** O(n²) — DP table storage.
Build up palindrome information from smaller to larger substrings. `dp[i][j]` is True if substring from i to j is a palindrome. A substring is a palindrome if its ends match and its inner substring is also a palindrome. Track the longest one found.

@@ -0,0 +1,237 @@
title: Longest Turbulent Subarray
slug: longest-turbulent-subarray
difficulty: medium
leetcode_id: 978
leetcode_url: https://leetcode.com/problems/longest-turbulent-subarray/
categories:
- arrays
- dynamic-programming
patterns:
- sliding-window
- dynamic-programming
description: |
Given an integer array `arr`, return *the length of a maximum size turbulent subarray of* `arr`.
A subarray is **turbulent** if the comparison sign flips between each adjacent pair of elements in the subarray.
More formally, a subarray `[arr[i], arr[i + 1], ..., arr[j]]` of `arr` is said to be turbulent if and only if:
- For `i <= k < j`:
- `arr[k] > arr[k + 1]` when `k` is odd, and
- `arr[k] < arr[k + 1]` when `k` is even.
- Or, for `i <= k < j`:
- `arr[k] > arr[k + 1]` when `k` is even, and
- `arr[k] < arr[k + 1]` when `k` is odd.
constraints: |
- `1 <= arr.length <= 4 * 10^4`
- `0 <= arr[i] <= 10^9`
examples:
- input: "arr = [9,4,2,10,7,8,8,1,9]"
output: "5"
explanation: "arr[1] > arr[2] < arr[3] > arr[4] < arr[5], which gives the turbulent subarray [4,2,10,7,8] with length 5."
- input: "arr = [4,8,12,16]"
output: "2"
explanation: "All elements are strictly increasing, so the longest turbulent subarray is any pair of adjacent elements."
- input: "arr = [100]"
output: "1"
explanation: "A single element is trivially turbulent."
explanation:
intuition: |
Imagine a stock price chart that zigzags up and down. A **turbulent subarray** is like finding the longest stretch where the chart alternates direction at every point — up, then down, then up, then down (or vice versa).
The key insight is that we don't care about the absolute values or even the parity of indices. What matters is whether the **comparison sign flips** between consecutive pairs. If we have `a < b`, the next comparison must be `b > c` for the sequence to remain turbulent. If we ever see `a < b < c` (same direction twice) or `a == b` (no direction), the turbulent sequence breaks.
Think of it like walking on a wavy path: every step must change direction. The moment you take two steps in the same direction (or stand still), you've left the turbulent zone.
This naturally leads to a **sliding window** or **dynamic programming** approach: extend the current turbulent sequence while the alternation holds, and reset when it breaks.
approach: |
We solve this using a **Single Pass with Two Counters** approach:
**Step 1: Handle the edge case**
- If the array has only one element, return `1` immediately (a single element is trivially turbulent)
&nbsp;
**Step 2: Initialise tracking variables**
- `inc`: Length of the longest turbulent subarray ending at the current position where the last comparison was increasing (`arr[i-1] < arr[i]`)
- `dec`: Length of the longest turbulent subarray ending at the current position where the last comparison was decreasing (`arr[i-1] > arr[i]`)
- Both start at `1` since a single element has length 1
- `result`: Tracks the maximum length seen, initialised to `1`
&nbsp;
**Step 3: Iterate through the array starting from index 1**
- For each position `i`, compare `arr[i]` with `arr[i-1]`:
- If `arr[i-1] < arr[i]` (increasing): Set `inc = dec + 1` (extend the previous decreasing sequence) and reset `dec = 1`
- If `arr[i-1] > arr[i]` (decreasing): Set `dec = inc + 1` (extend the previous increasing sequence) and reset `inc = 1`
- If `arr[i-1] == arr[i]` (equal): Reset both `inc = 1` and `dec = 1` (turbulence broken)
- Update `result` with `max(result, inc, dec)`
&nbsp;
**Step 4: Return the result**
- Return `result` after processing all elements
&nbsp;
The key insight is that `inc` and `dec` track complementary states: to extend an increasing comparison, we need the previous comparison to have been decreasing (and vice versa). This is why we set `inc = dec + 1` when we see an increase.
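To see how the two counters feed into each other, the following sketch records `(inc, dec)` after each index. The helper name `trace_counters` is illustrative and not part of the solution.

```python
def trace_counters(arr: list[int]) -> list[tuple[int, int]]:
    """Return (inc, dec) after processing each index i = 1 .. n - 1."""
    inc = dec = 1
    states: list[tuple[int, int]] = []
    for i in range(1, len(arr)):
        if arr[i - 1] < arr[i]:
            # Increase: extend the run that ended with a decrease
            inc, dec = dec + 1, 1
        elif arr[i - 1] > arr[i]:
            # Decrease: extend the run that ended with an increase
            inc, dec = 1, inc + 1
        else:
            # Equal neighbours: both runs restart
            inc = dec = 1
        states.append((inc, dec))
    return states


states = trace_counters([9, 4, 2, 10, 7, 8, 8, 1, 9])
print(states)
print(max(max(pair) for pair in states))
```

For the first example this produces `(1, 2), (1, 2), (3, 1), (1, 4), (5, 1), (1, 1), (1, 2), (3, 1)`. The peak value 5 is reached at index 5, just before the `8, 8` pair resets both counters.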
common_pitfalls:
- title: Misunderstanding the Turbulence Definition
description: |
A common mistake is thinking turbulence requires a specific starting direction or depends on index parity in absolute terms. The definition is actually simpler: **comparisons must alternate**.
The two cases in the problem description (odd/even rules) just describe the two possible patterns:
- `< > < > ...` (starts with increase)
- `> < > < ...` (starts with decrease)
Both are valid turbulent sequences. Focus on whether consecutive comparisons flip, not on the index values.
wrong_approach: "Checking if index is odd/even to determine expected comparison"
correct_approach: "Track whether the last comparison was increasing or decreasing, and check if the current one flips"
- title: Forgetting Equal Elements Break Turbulence
description: |
When `arr[i-1] == arr[i]`, there's no comparison sign — it's neither increasing nor decreasing. This breaks any turbulent sequence.
For example, in `[9,4,2,10,7,8,8,1,9]`, the sequence `8,8` breaks the turbulence, so we can't connect `[4,2,10,7,8]` with `[8,1,9]`.
Always reset both counters when encountering equal adjacent elements.
wrong_approach: "Ignoring equal elements or treating them as continuing the pattern"
correct_approach: "Reset both inc and dec to 1 when arr[i-1] == arr[i]"
- title: Off-by-One in Sequence Length
description: |
A turbulent subarray with `k` elements has `k-1` comparisons. When extending, we add `1` to the previous counter (e.g., `inc = dec + 1`), not to the number of comparisons.
For `[4, 2, 10]`: After seeing `4 > 2`, `dec = 2` (two elements). After seeing `2 < 10`, `inc = dec + 1 = 3` (three elements). This correctly counts the elements, not comparisons.
wrong_approach: "Counting comparisons instead of elements, or incorrect initialization"
correct_approach: "Initialize counters to 1 (single element) and add 1 when extending"
key_takeaways:
- "**Dual-state DP**: When a sequence's validity depends on its ending condition, track multiple states (here, `inc` and `dec`) that feed into each other"
- "**Sliding window without explicit pointers**: The counters implicitly maintain a window — resetting to `1` is equivalent to starting a new window"
- "**Alternation patterns**: For problems requiring alternating conditions, track what the *last* state was and check if the *current* state differs"
- "**Pattern recognition**: This is similar to 'Wiggle Subsequence' (LeetCode 376), which asks for the longest *subsequence* (not subarray) with alternating differences"
time_complexity: "O(n). We traverse the array exactly once, performing constant-time operations at each step."
space_complexity: "O(1). We only use a fixed number of variables (`inc`, `dec`, `result`) regardless of input size."
solutions:
- approach_name: Single Pass with Two Counters
is_optimal: true
code: |
def max_turbulence_size(arr: list[int]) -> int:
n = len(arr)
if n == 1:
return 1
# inc: length of turbulent subarray ending here with last comparison increasing
# dec: length of turbulent subarray ending here with last comparison decreasing
inc = dec = 1
result = 1
for i in range(1, n):
if arr[i - 1] < arr[i]:
# Current is increasing, extend from previous decreasing
inc = dec + 1
dec = 1 # Reset decreasing counter
elif arr[i - 1] > arr[i]:
# Current is decreasing, extend from previous increasing
dec = inc + 1
inc = 1 # Reset increasing counter
else:
# Equal elements break turbulence
inc = dec = 1
result = max(result, inc, dec)
return result
explanation: |
**Time Complexity:** O(n) — Single pass through the array.
**Space Complexity:** O(1) — Only three variables used.
We maintain two counters that track the length of turbulent subarrays ending at the current position, distinguished by whether the last comparison was increasing or decreasing. When we see an increase, we can extend any previous sequence that ended with a decrease (and vice versa). Equal elements reset both counters since they break turbulence.
- approach_name: Explicit Sliding Window
is_optimal: true
code: |
def max_turbulence_size(arr: list[int]) -> int:
n = len(arr)
if n == 1:
return 1
# Helper to get comparison sign: -1, 0, or 1
def cmp(a: int, b: int) -> int:
if a < b:
return -1
elif a > b:
return 1
return 0
result = 1
left = 0 # Start of current turbulent window
for right in range(1, n):
c = cmp(arr[right - 1], arr[right])
if c == 0:
# Equal elements: start fresh window after this position
left = right
elif right == left + 1:
# Second element of window: any non-zero comparison is valid
result = max(result, 2)
else:
# Check if comparison alternates from previous
prev_c = cmp(arr[right - 2], arr[right - 1])
if c == prev_c:
# Same direction twice: start new window from previous position
left = right - 1
# Window size is right - left + 1
result = max(result, right - left + 1)
return result
explanation: |
**Time Complexity:** O(n) — Single pass through the array.
**Space Complexity:** O(1) — Only tracking window boundaries.
This version explicitly maintains a sliding window with `left` and `right` pointers. The window shrinks (by moving `left`) when turbulence breaks: either due to equal elements or two consecutive comparisons in the same direction. This approach is conceptually clearer for those familiar with sliding window patterns.
- approach_name: Dynamic Programming (Explicit)
is_optimal: false
code: |
def max_turbulence_size(arr: list[int]) -> int:
n = len(arr)
if n == 1:
return 1
# dp_inc[i]: length of turbulent subarray ending at i with arr[i-1] < arr[i]
# dp_dec[i]: length of turbulent subarray ending at i with arr[i-1] > arr[i]
dp_inc = [1] * n
dp_dec = [1] * n
for i in range(1, n):
if arr[i - 1] < arr[i]:
dp_inc[i] = dp_dec[i - 1] + 1
elif arr[i - 1] > arr[i]:
dp_dec[i] = dp_inc[i - 1] + 1
# If equal, both remain 1 (initialized value)
return max(max(dp_inc), max(dp_dec))
explanation: |
**Time Complexity:** O(n) — Single pass to fill DP arrays.
**Space Complexity:** O(n) — Two arrays of length n.
This explicit DP formulation makes the recurrence clear: `dp_inc[i]` depends on `dp_dec[i-1]` and vice versa. While correct, it uses O(n) space unnecessarily — since we only need the previous values, we can reduce to O(1) space as shown in the optimal solution. Included here to illustrate the DP structure before optimization.

@@ -0,0 +1,259 @@
title: Longest Valid Parentheses
slug: longest-valid-parentheses
difficulty: hard
leetcode_id: 32
leetcode_url: https://leetcode.com/problems/longest-valid-parentheses/
categories:
- strings
- stack
- dynamic-programming
patterns:
- dynamic-programming
- monotonic-stack
description: |
Given a string containing just the characters `'('` and `')'`, return *the length of the longest valid (well-formed) parentheses substring*.
A valid parentheses substring is one where every opening parenthesis `'('` has a corresponding closing parenthesis `')'` and they are properly nested.
constraints: |
- `0 <= s.length <= 3 * 10^4`
- `s[i]` is `'('` or `')'`
examples:
- input: 's = "(()"'
output: "2"
explanation: "The longest valid parentheses substring is \"()\"."
- input: 's = ")()())"'
output: "4"
explanation: "The longest valid parentheses substring is \"()()\"."
- input: 's = ""'
output: "0"
explanation: "An empty string has no valid parentheses."
explanation:
intuition: |
Imagine you're reading through a string of parentheses and trying to find the longest stretch where they're perfectly balanced.
The key insight is that a valid parentheses string can be **broken by unmatched characters**. An unmatched `)` at position `i` means any valid substring must either end before `i` or start after `i`. Similarly, an unmatched `(` at position `j` means no valid substring can contain `j`.
Think of it like this: unmatched parentheses act as **barriers** that divide the string into segments. Within each segment, we need to find how far the valid matching extends.
There are two elegant ways to approach this:
1. **Stack approach**: Use a stack to track indices of unmatched `(` characters. When we see a `)`, we either match it with a `(` (pop from stack) or mark it as a barrier. The stack always holds indices that "break" the valid sequence.
2. **Dynamic Programming**: For each position, calculate the length of the longest valid substring *ending* at that position. A `)` at position `i` can extend a valid substring if there's a matching `(` available.
The stack approach is more intuitive once you see it: we push indices as barriers, and the distance from the current index to the top of the stack gives us the length of the current valid segment.
approach: |
We'll use the **Stack Approach** as our optimal solution:
**Step 1: Initialise the stack with a base index**
- Push `-1` onto the stack as a "floor" or base index
- This handles the edge case where a valid substring starts from index `0`
- The stack will store indices of unmatched `(` characters and barrier positions
&nbsp;
**Step 2: Iterate through each character**
- For each character at index `i`:
- If it's `'('`: push `i` onto the stack (potential start of valid sequence)
- If it's `')'`: pop from the stack (try to match with a `(`)
&nbsp;
**Step 3: Calculate valid length after each `)`**
- After popping for a `)`:
- If the stack is **empty**: this `)` is unmatched, push `i` as a new barrier
- If the stack is **not empty**: calculate `i - stack.top()` as the length of the current valid substring
- Update `max_length` with the maximum value seen
&nbsp;
**Step 4: Return the result**
- Return `max_length` after processing all characters
&nbsp;
**Why this works**: The stack always contains indices that "break" valid sequences. The distance from the current index to the stack top represents how far back the current valid sequence extends.
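The steps above can be traced on the second example. This is a sketch only: `trace_stack` is a hypothetical helper that records the stack and the best length after every character.

```python
def trace_stack(s: str) -> list[tuple[int, str, list[int], int]]:
    """Return (index, char, stack after the step, best length so far) per character."""
    stack = [-1]  # base index acting as the initial barrier
    best = 0
    rows: list[tuple[int, str, list[int], int]] = []
    for i, char in enumerate(s):
        if char == '(':
            stack.append(i)
        else:
            stack.pop()
            if not stack:
                stack.append(i)  # unmatched ')': becomes the new barrier
            else:
                best = max(best, i - stack[-1])
        rows.append((i, char, stack.copy(), best))
    return rows


for row in trace_stack(")()())"):
    print(row)
```

For `")()())"` the first `)` replaces the base `-1` with barrier `0`. The matches at indices 2 and 4 then measure `2 - 0 = 2` and `4 - 0 = 4`, and the final `)` empties the stack and becomes the new barrier at index 5.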
common_pitfalls:
- title: Forgetting the Base Index
description: |
Without pushing `-1` initially, the first valid substring starting from index `0` won't be calculated correctly.
For example, with `s = "()"`:
- At index `0`, push `0`
- At index `1`, pop `0`, stack is now empty
- Without a base, we can't calculate `1 - (-1) = 2`
Always initialise with `-1` to handle edge cases cleanly.
wrong_approach: "Start with an empty stack"
correct_approach: "Push -1 as the base index before processing"
- title: Confusing Valid Substring vs Total Matches
description: |
This problem asks for the longest **contiguous** valid substring, not the total number of matched pairs.
For `s = "()(())"`:
- Total matched pairs: 3 (length 6)
- The whole string is one valid substring, so the answer is also 6
For `s = "())()"`:
- Total matched pairs: 2 (length 4)
- But the longest valid substring has length 2 (the `()` at the start or the `()` at the end)
The unmatched `)` at index 2 breaks the string into separate segments.
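A small sketch contrasting the two quantities. Both function names are illustrative: `total_matched_length` counts every matched pair regardless of position, while `longest_valid_length` uses the stack technique from the approach section.

```python
def total_matched_length(s: str) -> int:
    """Twice the number of matched pairs, ignoring contiguity (not what is asked)."""
    open_count = pairs = 0
    for char in s:
        if char == '(':
            open_count += 1
        elif open_count > 0:
            open_count -= 1
            pairs += 1
    return 2 * pairs


def longest_valid_length(s: str) -> int:
    """Length of the longest contiguous valid substring (what is asked)."""
    stack = [-1]
    best = 0
    for i, char in enumerate(s):
        if char == '(':
            stack.append(i)
        else:
            stack.pop()
            if not stack:
                stack.append(i)
            else:
                best = max(best, i - stack[-1])
    return best


for text in ["()(())", "())()"]:
    print(text, total_matched_length(text), longest_valid_length(text))
```

The two functions agree on `"()(())"` (both give 6) and disagree on `"())()"` (4 versus 2), which is exactly the case where an unmatched `)` splits the string.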
wrong_approach: "Count total matched pairs"
correct_approach: "Track longest contiguous valid segment"
- title: Using O(n) Space When O(1) is Possible
description: |
While the stack solution is intuitive and efficient at O(n) space, there's actually an O(1) space solution using two-pass counting.
For interviews, the stack approach is typically expected, but knowing the O(1) solution demonstrates deeper understanding.
wrong_approach: "Only knowing the stack approach"
correct_approach: "Understand both stack O(n) and two-pass O(1) approaches"
- title: Off-by-One Errors in Length Calculation
description: |
When calculating `i - stack.top()`, remember that `stack.top()` is the index just *before* the valid substring, so the difference is already the length and no `+ 1` is needed.
For example, if `i = 5` and `stack.top() = 1`:
- Length = `5 - 1 = 4`
- This covers indices 2 through 5 inclusive
Make sure your mental model matches: the barrier itself is excluded from the substring.
key_takeaways:
- "**Stack for matching problems**: Using a stack to track indices (not just characters) is a powerful technique for parentheses and bracket matching"
- "**Barrier concept**: Unmatched characters act as barriers that reset the valid substring count"
- "**Base index trick**: Pushing `-1` as a base handles edge cases elegantly without special-casing"
- "**Related problems**: Valid Parentheses (#20), Generate Parentheses (#22), and Minimum Add to Make Parentheses Valid (#921) use similar concepts"
time_complexity: "O(n). We traverse the string exactly once, and each index is pushed and popped from the stack at most once."
space_complexity: "O(n). In the worst case (all opening parentheses), the stack holds all n indices. The two-pass approach achieves O(1) space."
solutions:
- approach_name: Stack with Index Tracking
is_optimal: true
code: |
def longest_valid_parentheses(s: str) -> int:
# Stack stores indices of unmatched '(' and barrier positions
# Start with -1 as base to handle valid substring starting at index 0
stack = [-1]
max_length = 0
for i, char in enumerate(s):
if char == '(':
# Push index of '(' as potential start of valid sequence
stack.append(i)
else:
# Pop to match this ')' with a '('
stack.pop()
if not stack:
# Stack empty means this ')' is unmatched
# Push current index as new barrier
stack.append(i)
else:
# Calculate length of current valid substring
# Distance from current position to the last barrier
current_length = i - stack[-1]
max_length = max(max_length, current_length)
return max_length
explanation: |
**Time Complexity:** O(n) — Single pass through the string.
**Space Complexity:** O(n) — Stack can hold up to n indices in the worst case.
The stack maintains a "barrier" at its top, representing the rightmost position that breaks valid parentheses. When we find a valid match, the distance from the current index to this barrier gives us the valid substring length.
- approach_name: Dynamic Programming
is_optimal: false
code: |
def longest_valid_parentheses(s: str) -> int:
if not s:
return 0
n = len(s)
# dp[i] = length of longest valid substring ending at index i
dp = [0] * n
max_length = 0
for i in range(1, n):
if s[i] == ')':
if s[i - 1] == '(':
# Case 1: "()" pattern - extends previous valid substring
dp[i] = (dp[i - 2] if i >= 2 else 0) + 2
elif i - dp[i - 1] > 0 and s[i - dp[i - 1] - 1] == '(':
# Case 2: "))" pattern - check if there's matching '('
# before the valid substring ending at i-1
dp[i] = dp[i - 1] + 2
# Add any valid substring before the matching '('
if i - dp[i - 1] >= 2:
dp[i] += dp[i - dp[i - 1] - 2]
max_length = max(max_length, dp[i])
return max_length
explanation: |
**Time Complexity:** O(n) — Single pass through the string.
**Space Complexity:** O(n) — DP array of size n.
For each `)` at position `i`, we determine if it can extend a valid substring:
- If preceded by `(`, we have a `()` pair adding 2 to whatever came before
- If preceded by `)`, we look past the valid substring ending at `i-1` to find a matching `(`
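
To inspect the recurrence, the sketch below returns the whole `dp` array instead of only its maximum. The helper name `dp_table` and the local variable `j` are illustrative, and the branches are arranged slightly differently from the solution code while computing the same values.

```python
def dp_table(s: str) -> list[int]:
    """Return dp, where dp[i] is the length of the longest valid substring ending at i."""
    dp = [0] * len(s)
    for i in range(1, len(s)):
        if s[i] != ')':
            continue  # a valid substring can never end with '('
        if s[i - 1] == '(':
            # "...()" : the new pair extends whatever valid run ended at i - 2
            dp[i] = (dp[i - 2] if i >= 2 else 0) + 2
        else:
            # "...))" : j is the index that would have to hold the matching '('
            j = i - dp[i - 1] - 1
            if j >= 0 and s[j] == '(':
                dp[i] = dp[i - 1] + 2 + (dp[j - 1] if j >= 1 else 0)
    return dp


print(dp_table(")()())"))
print(dp_table("()(())"))
```

For `")()())"` the table is `[0, 0, 2, 0, 4, 0]`. For `"()(())"` it is `[0, 2, 0, 0, 2, 6]`, where the final 6 combines the inner pair, the enclosing pair, and the `()` that precedes them.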
- approach_name: Two-Pass Counting
is_optimal: false
code: |
def longest_valid_parentheses(s: str) -> int:
# O(1) space solution using two passes
max_length = 0
left = right = 0
# Left to right pass
for char in s:
if char == '(':
left += 1
else:
right += 1
if left == right:
# Balanced - this is a valid substring
max_length = max(max_length, 2 * right)
elif right > left:
# Too many ')' - reset counters
left = right = 0
# Right to left pass (handles excess '(' cases)
left = right = 0
for char in reversed(s):
if char == '(':
left += 1
else:
right += 1
if left == right:
max_length = max(max_length, 2 * left)
elif left > right:
# Too many '(' - reset counters
left = right = 0
return max_length
explanation: |
**Time Complexity:** O(n) — Two passes through the string.
**Space Complexity:** O(1) — Only uses counter variables.
This clever approach counts left and right parentheses. When counts match, we have a valid substring. We need two passes because a single pass can't handle both excess `(` and excess `)` cases. Left-to-right handles excess `)`, right-to-left handles excess `(`.
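The need for both directions can be checked with a one-direction helper. This is a sketch under assumptions: `scan` and its `open_char` parameter are illustrative names, and the backward pass is modelled by reversing the string and treating `)` as the opening character.

```python
def scan(s: str, open_char: str) -> int:
    """One counting pass; open_char is the bracket that opens in this direction."""
    best = opens = closes = 0
    for char in s:
        if char == open_char:
            opens += 1
        else:
            closes += 1
        if opens == closes:
            best = max(best, 2 * closes)  # balanced prefix since the last reset
        elif closes > opens:
            opens = closes = 0            # unmatched closer: start over
    return best


for text in ["(()", "())"]:
    forward = scan(text, "(")
    backward = scan(text[::-1], ")")
    print(text, forward, backward, max(forward, backward))
```

On `"(()"` the forward pass never sees the counts become equal and reports 0, while the backward pass finds the pair and reports 2. On `"())"` the roles are swapped, so taking the maximum of both passes is what makes the method correct.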

@@ -0,0 +1,172 @@
title: Lowest Common Ancestor of a Binary Search Tree
slug: lowest-common-ancestor-of-a-binary-search-tree
difficulty: medium
leetcode_id: 235
leetcode_url: https://leetcode.com/problems/lowest-common-ancestor-of-a-binary-search-tree/
categories:
- trees
patterns:
- tree-traversal
- binary-search
description: |
Given a binary search tree (BST), find the lowest common ancestor (LCA) node of two given nodes in the BST.
According to the definition of LCA on Wikipedia: "The lowest common ancestor is defined between two nodes `p` and `q` as the lowest node in `T` that has both `p` and `q` as descendants (where we allow **a node to be a descendant of itself**)."
constraints: |
- The number of nodes in the tree is in the range `[2, 10^5]`
- `-10^9 <= Node.val <= 10^9`
- All `Node.val` are **unique**
- `p != q`
- `p` and `q` will exist in the BST
examples:
- input: "root = [6,2,8,0,4,7,9,null,null,3,5], p = 2, q = 8"
output: "6"
explanation: "The LCA of nodes 2 and 8 is 6."
- input: "root = [6,2,8,0,4,7,9,null,null,3,5], p = 2, q = 4"
output: "2"
explanation: "The LCA of nodes 2 and 4 is 2, since a node can be a descendant of itself according to the LCA definition."
- input: "root = [2,1], p = 2, q = 1"
output: "2"
explanation: "The LCA of nodes 2 and 1 is 2."
explanation:
intuition: |
The key insight is to **leverage the BST property**: for any node, all values in its left subtree are smaller, and all values in its right subtree are larger.
Think of it like searching for a meeting point. Imagine you're standing at the root, and two people are trying to find each other — one at node `p` and one at node `q`. As you traverse down the tree, at some point you'll reach a node where the two people would need to go in **different directions** to reach their respective nodes. That splitting point is the LCA.
More concretely:
- If both `p` and `q` are **smaller** than the current node, the LCA must be in the left subtree
- If both `p` and `q` are **larger** than the current node, the LCA must be in the right subtree
- If `p` and `q` are on **opposite sides** (or one equals the current node), then the current node is the LCA
This is fundamentally different from finding the LCA in a general binary tree, where you'd need to search both subtrees. The BST ordering gives us a guaranteed direction at each step.
approach: |
We solve this using the **BST Property** to guide our traversal:
**Step 1: Start at the root**
- Begin traversal at the root node
- We'll move down the tree based on how `p` and `q` compare to the current node
&nbsp;
**Step 2: Compare values and decide direction**
- If both `p.val` and `q.val` are **less than** `current.val`, move to the left child
- If both `p.val` and `q.val` are **greater than** `current.val`, move to the right child
- Otherwise, we've found the split point — return the current node
&nbsp;
**Step 3: The split point is the LCA**
- When `p` and `q` lie on different sides of the current node (or one of them equals the current node), the current node is the lowest common ancestor
- Return this node as the answer
&nbsp;
This works because the BST property guarantees that once `p` and `q` "split" to different subtrees, they can never reunite at a lower node.
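As a usage sketch, the tree from the examples can be built and queried as follows. The `insert` helper and the value-based `lca_value` function are hypothetical conveniences for this illustration (the actual problem passes nodes, not values), and the sketch assumes both values are present in the tree.

```python
from typing import Optional


class TreeNode:
    def __init__(self, val: int):
        self.val = val
        self.left: Optional["TreeNode"] = None
        self.right: Optional["TreeNode"] = None


def insert(root: Optional[TreeNode], val: int) -> TreeNode:
    """Hypothetical helper: plain BST insertion, used only to build the example tree."""
    if root is None:
        return TreeNode(val)
    if val < root.val:
        root.left = insert(root.left, val)
    else:
        root.right = insert(root.right, val)
    return root


def lca_value(root: TreeNode, p_val: int, q_val: int) -> int:
    """Walk down from the root until p_val and q_val no longer lie on the same side."""
    current = root
    while True:
        if p_val < current.val and q_val < current.val:
            current = current.left
        elif p_val > current.val and q_val > current.val:
            current = current.right
        else:
            return current.val  # split point, or current equals one of the values


root = None
for value in [6, 2, 8, 0, 4, 7, 9, 3, 5]:
    root = insert(root, value)

print(lca_value(root, 2, 8))  # 6
print(lca_value(root, 2, 4))  # 2
print(lca_value(root, 3, 5))  # 4
```

Inserting the values in level order reproduces the tree `[6,2,8,0,4,7,9,null,null,3,5]` from the examples, so the first two calls match the expected outputs listed there.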
common_pitfalls:
- title: Ignoring the BST Property
description: |
A common mistake is treating this like a general binary tree LCA problem and recursing into both subtrees to find `p` and `q`.
In a general binary tree, you'd need to search both children and check which subtree contains which node. But in a BST, you can determine which direction to go with a simple value comparison — O(1) per node instead of potentially visiting both subtrees.
This makes the BST solution O(h) instead of O(n).
wrong_approach: "Search both subtrees like in a general binary tree"
correct_approach: "Use value comparisons to choose one direction at each step"
- title: Forgetting a Node Can Be Its Own Ancestor
description: |
The problem states that a node can be a descendant of itself. If `p = 2` and `q = 4`, and node 2 is an ancestor of node 4 in the BST, then the LCA is 2, not some parent of 2.
When checking the split condition, remember to handle the case where the current node equals `p` or `q`. In this case, the current node is the LCA because one node is an ancestor of the other.
wrong_approach: "Only return when p and q are on opposite sides"
correct_approach: "Return when p and q split OR when current equals p or q"
- title: Incorrect Comparison Logic
description: |
Be careful with the comparison operators. The condition for moving left is when **both** values are less than current. Similarly for moving right.
A common bug is using OR instead of AND:
- Wrong: `if p.val < current.val or q.val < current.val` (keeps descending left even after `p` and `q` have split, so it walks past the LCA)
- Correct: `if p.val < current.val and q.val < current.val`
wrong_approach: "Using OR logic for direction decisions"
correct_approach: "Using AND logic — both must be less/greater to continue"
key_takeaways:
- "**Exploit BST ordering**: The BST property lets you make O(1) direction decisions, avoiding the need to search both subtrees"
- "**Split point = LCA**: The moment two values would need to go different directions, you've found their common ancestor"
- "**Iterative vs recursive**: Both approaches work, but iterative uses O(1) space vs O(h) for the recursive call stack"
- "**Foundation for harder problems**: This pattern extends to problems like finding paths between nodes or validating BST structure"
time_complexity: "O(h) where h is the height of the tree. In a balanced BST, h = log(n). In the worst case (skewed tree), h = n."
space_complexity: "O(1) for the iterative solution. We only use a single pointer to traverse the tree, regardless of input size."
solutions:
- approach_name: Iterative Traversal
is_optimal: true
code: |
class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right
def lowest_common_ancestor(root: TreeNode, p: TreeNode, q: TreeNode) -> TreeNode:
# Start at the root and traverse down
current = root
while current:
# Both nodes are in the left subtree
if p.val < current.val and q.val < current.val:
current = current.left
# Both nodes are in the right subtree
elif p.val > current.val and q.val > current.val:
current = current.right
# Split point found — p and q are on different sides
# (or one of them equals current)
else:
return current
return None # Should never reach here if p and q exist in tree
explanation: |
**Time Complexity:** O(h) — We traverse at most the height of the tree, making one comparison per level.
**Space Complexity:** O(1) — Only a single pointer variable is used; no recursion stack.
We exploit the BST property to navigate directly to the LCA. At each node, we compare both `p` and `q` values to decide whether to go left, right, or stop. The moment they would diverge (or one matches the current node), we've found the LCA.
- approach_name: Recursive Traversal
is_optimal: false
code: |
class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right
def lowest_common_ancestor(root: TreeNode, p: TreeNode, q: TreeNode) -> TreeNode:
# Both nodes are in the left subtree
if p.val < root.val and q.val < root.val:
return lowest_common_ancestor(root.left, p, q)
# Both nodes are in the right subtree
if p.val > root.val and q.val > root.val:
return lowest_common_ancestor(root.right, p, q)
# Split point — this is the LCA
return root
explanation: |
**Time Complexity:** O(h) — Same traversal pattern as iterative, visiting at most h nodes.
**Space Complexity:** O(h) — Recursive call stack can grow up to the height of the tree.
This recursive version follows the same logic but uses the call stack instead of a loop. While elegant, it uses more space than the iterative approach. The recursive calls naturally unwind once we find the split point.
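To make the O(h) stack cost concrete, here is a sketch that runs both versions on a fully right-skewed BST. The names `lca_iterative` and `lca_recursive` are illustrative renames of the two solutions, and the chain is sized at twice the interpreter's current recursion limit, so the result assumes a standard CPython setup where deep recursion raises `RecursionError`:

```python
import sys


class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def lca_iterative(root, p, q):
    current = root
    while current:
        if p.val < current.val and q.val < current.val:
            current = current.left
        elif p.val > current.val and q.val > current.val:
            current = current.right
        else:
            return current
    return None


def lca_recursive(root, p, q):
    if p.val < root.val and q.val < root.val:
        return lca_recursive(root.left, p, q)
    if p.val > root.val and q.val > root.val:
        return lca_recursive(root.right, p, q)
    return root


# Right-skewed chain 0 -> 1 -> 2 -> ..., deliberately deeper than the recursion limit
n = sys.getrecursionlimit() * 2
nodes = [TreeNode(val) for val in range(n)]
for parent, child in zip(nodes, nodes[1:]):
    parent.right = child

p, q = nodes[-2], nodes[-1]  # the two deepest nodes; p is q's parent

print(lca_iterative(nodes[0], p, q) is p)  # True: the loop needs no extra stack

try:
    lca_recursive(nodes[0], p, q)
    recursive_ok = True
except RecursionError:
    recursive_ok = False
print(recursive_ok)  # False: one stack frame per level exceeds the limit
```

On balanced trees the recursive version is safe in practice; the difference only matters when the tree can degenerate into a long chain.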


@@ -0,0 +1,258 @@
title: LRU Cache
slug: lru-cache
difficulty: medium
leetcode_id: 146
leetcode_url: https://leetcode.com/problems/lru-cache/
categories:
- hash-tables
- linked-lists
patterns:
- linkedlist-reversal
description: |
Design a data structure that follows the constraints of a **Least Recently Used (LRU) cache**.
Implement the `LRUCache` class:
- `LRUCache(int capacity)` Initialise the LRU cache with **positive** size `capacity`.
- `int get(int key)` Return the value of the `key` if the key exists, otherwise return `-1`.
- `void put(int key, int value)` Update the value of the `key` if the `key` exists. Otherwise, add the `key-value` pair to the cache. If the number of keys exceeds the `capacity` from this operation, **evict** the least recently used key.
The functions `get` and `put` must each run in `O(1)` average time complexity.
constraints: |
- `1 <= capacity <= 3000`
- `0 <= key <= 10^4`
- `0 <= value <= 10^5`
- At most `2 * 10^5` calls will be made to `get` and `put`.
examples:
- input: |
["LRUCache", "put", "put", "get", "put", "get", "put", "get", "get", "get"]
[[2], [1, 1], [2, 2], [1], [3, 3], [2], [4, 4], [1], [3], [4]]
output: "[null, null, null, 1, null, -1, null, -1, 3, 4]"
explanation: |
LRUCache lRUCache = new LRUCache(2);
lRUCache.put(1, 1); // cache is {1=1}
lRUCache.put(2, 2); // cache is {1=1, 2=2}
lRUCache.get(1); // return 1
lRUCache.put(3, 3); // LRU key was 2, evicts key 2, cache is {1=1, 3=3}
lRUCache.get(2); // returns -1 (not found)
lRUCache.put(4, 4); // LRU key was 1, evicts key 1, cache is {4=4, 3=3}
lRUCache.get(1); // return -1 (not found)
lRUCache.get(3); // return 3
lRUCache.get(4); // return 4
explanation:
intuition: |
Imagine a stack of plates in a restaurant kitchen. Whenever a plate is used, wherever it was in the stack, it goes back on top. The plate at the **bottom** of the stack is therefore the one that hasn't been touched in the longest time: it's the "least recently used".
An LRU cache works the same way: we need to track which items were accessed most recently, and when we run out of space, we evict the item that hasn't been touched in the longest time.
The challenge is the **O(1) time requirement** for both `get` and `put`. A simple list would give us O(n) for finding elements. A hash map gives us O(1) lookup but doesn't track order. We need **both** capabilities simultaneously.
The key insight is to combine two data structures:
- A **hash map** for O(1) key lookups
- A **doubly linked list** for O(1) insertion, deletion, and reordering
The hash map points directly to nodes in the linked list, so we can find any element in O(1) time. The doubly linked list maintains the access order — most recently used at the head, least recently used at the tail. When we access an element, we can remove it from its current position and move it to the head in O(1) time because we have direct pointers to adjacent nodes.
approach: |
We solve this using a **Hash Map + Doubly Linked List** combination:
**Step 1: Define the node structure**
- Create a `Node` class with `key`, `value`, `prev`, and `next` pointers
- The key is stored in the node so we can remove entries from the hash map during eviction
&nbsp;
**Step 2: Initialise the data structures**
- `cache`: A hash map mapping keys to their corresponding nodes
- `capacity`: The maximum number of items allowed
- `head` and `tail`: Dummy sentinel nodes that simplify edge case handling
- Connect `head.next = tail` and `tail.prev = head` initially (empty list between sentinels)
&nbsp;
**Step 3: Implement helper methods**
- `_remove(node)`: Remove a node from its current position in the doubly linked list
- `_add_to_head(node)`: Insert a node right after the head sentinel (marks it as most recently used)
&nbsp;
**Step 4: Implement get(key)**
- If key not in cache, return `-1`
- Otherwise, move the node to the head (mark as recently used) and return its value
&nbsp;
**Step 5: Implement put(key, value)**
- If key exists, update its value and move to head
- If key is new:
- Create a new node and add to head
- Add to the hash map
- If over capacity, remove the node before `tail` (the LRU item) and delete from hash map
&nbsp;
Using sentinel nodes eliminates null checks when removing the first/last real node, making the code cleaner and less error-prone.
common_pitfalls:
- title: Using a List for Access Tracking
description: |
A common first instinct is to use a regular list or array to track access order. However, moving an element to the front of a list requires O(n) time to shift elements.
With up to 3000 cached keys (the maximum `capacity`) and up to `2 * 10^5` operations, O(n) per operation means up to 3000 * 2 * 10^5 = 6 * 10^8 element moves in the worst case, which risks **Time Limit Exceeded (TLE)**.
The doubly linked list with direct node references allows O(1) removal and insertion.
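For contrast, a minimal sketch of the list-based approach this pitfall warns against. The class name `NaiveLRUCache` is hypothetical. It returns correct answers, but `list.remove` and `list.pop(0)` both scan or shift the list, so each operation costs O(n) in the number of cached keys:

```python
class NaiveLRUCache:
    """Correct but slow: access order is tracked in a plain Python list."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.values: dict[int, int] = {}
        self.order: list[int] = []  # least recently used first

    def _touch(self, key: int) -> None:
        # list.remove is a linear scan plus a shift of later elements: O(n)
        self.order.remove(key)
        self.order.append(key)

    def get(self, key: int) -> int:
        if key not in self.values:
            return -1
        self._touch(key)
        return self.values[key]

    def put(self, key: int, value: int) -> None:
        if key in self.values:
            self._touch(key)
        else:
            self.order.append(key)
        self.values[key] = value
        if len(self.values) > self.capacity:
            oldest = self.order.pop(0)  # also O(n): shifts every remaining element
            del self.values[oldest]


cache = NaiveLRUCache(2)
cache.put(1, 1)
cache.put(2, 2)
print(cache.get(1))  # 1
cache.put(3, 3)      # evicts key 2, the least recently used
print(cache.get(2))  # -1
```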
wrong_approach: "Array or singly linked list for order tracking"
correct_approach: "Doubly linked list with hash map for O(1) node access"
- title: Forgetting to Store Key in Node
description: |
When evicting the LRU item, you need to remove it from both the linked list AND the hash map. If the node doesn't store its key, you can't efficiently find which hash map entry to delete.
Always store the key in the node so eviction can update the hash map in O(1) time.
wrong_approach: "Node only stores value"
correct_approach: "Node stores both key and value"
- title: Not Handling the Update Case
description: |
When `put` is called with an existing key, some implementations add a new node without removing the old one. This corrupts the data structure and leads to incorrect eviction behaviour.
Always check if the key exists first. If it does, update the existing node's value and move it to the head instead of creating a new node.
wrong_approach: "Always create new node on put"
correct_approach: "Check existence first, update if present"
- title: Handling Edge Cases Without Sentinel Nodes
description: |
Without sentinel (dummy) nodes, removing the first or last real node requires special handling of null pointers. This leads to complex, error-prone code.
Using dummy `head` and `tail` nodes means the first real node is always `head.next` and the last is always `tail.prev`. Removal logic becomes uniform for all nodes.
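A rough sketch of the difference, using a hypothetical `Node` class and two free functions that are not part of the solution. Without sentinels, unlinking has to branch on whether each neighbour exists; with sentinels, both neighbours are always present:

```python
class Node:
    def __init__(self, key=0, value=0):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


def remove_without_sentinels(head, tail, node):
    """Unlink node from a list with no dummy nodes; returns the new (head, tail)."""
    if node.prev is None:      # node is the first element
        head = node.next
    else:
        node.prev.next = node.next
    if node.next is None:      # node is the last element
        tail = node.prev
    else:
        node.next.prev = node.prev
    return head, tail


def remove_with_sentinels(node):
    """Unlink node when dummy head/tail guarantee both neighbours exist."""
    node.prev.next = node.next
    node.next.prev = node.prev


# Sentinel list holding a single real node: head <-> only <-> tail
head, tail, only = Node(), Node(), Node(1, 1)
head.next, only.prev = only, head
only.next, tail.prev = tail, only

remove_with_sentinels(only)
print(head.next is tail and tail.prev is head)  # True: the list is empty again
```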
key_takeaways:
- "**Combine data structures**: When one structure doesn't meet all requirements, combine two. Hash map + linked list gives O(1) lookup AND O(1) reordering."
- "**Sentinel nodes simplify edge cases**: Dummy head/tail nodes eliminate null checks and special cases for first/last elements."
- "**Store redundant data when needed**: Keeping the key in the node seems redundant but enables O(1) eviction from the hash map."
- "**Classic interview pattern**: This exact combination (hash map + doubly linked list) appears in many cache and ordering problems."
time_complexity: "O(1) for both `get` and `put`. Hash map lookup is O(1), and doubly linked list insertion/removal is O(1) with direct node references."
space_complexity: "O(capacity). We store at most `capacity` nodes in the linked list and `capacity` entries in the hash map."
solutions:
- approach_name: Hash Map + Doubly Linked List
is_optimal: true
code: |
class Node:
"""Doubly linked list node storing key-value pair."""
def __init__(self, key: int = 0, value: int = 0):
self.key = key
self.value = value
self.prev: Node | None = None
self.next: Node | None = None
class LRUCache:
def __init__(self, capacity: int):
self.capacity = capacity
self.cache: dict[int, Node] = {} # key -> node
# Sentinel nodes simplify edge cases
self.head = Node() # Most recently used after head
self.tail = Node() # Least recently used before tail
self.head.next = self.tail
self.tail.prev = self.head
def _remove(self, node: Node) -> None:
"""Remove node from its current position in the list."""
prev_node = node.prev
next_node = node.next
prev_node.next = next_node
next_node.prev = prev_node
def _add_to_head(self, node: Node) -> None:
"""Add node right after head (marks as most recently used)."""
node.prev = self.head
node.next = self.head.next
self.head.next.prev = node
self.head.next = node
def get(self, key: int) -> int:
if key not in self.cache:
return -1
# Move accessed node to head (most recently used)
node = self.cache[key]
self._remove(node)
self._add_to_head(node)
return node.value
def put(self, key: int, value: int) -> None:
if key in self.cache:
# Update existing node and move to head
node = self.cache[key]
node.value = value
self._remove(node)
self._add_to_head(node)
else:
# Create new node
new_node = Node(key, value)
self.cache[key] = new_node
self._add_to_head(new_node)
# Evict LRU if over capacity
if len(self.cache) > self.capacity:
lru_node = self.tail.prev # Node before tail is LRU
self._remove(lru_node)
del self.cache[lru_node.key] # Key stored in node!
explanation: |
**Time Complexity:** O(1) for both operations — hash map lookup and linked list manipulation are constant time.
**Space Complexity:** O(capacity) — storing up to `capacity` nodes plus the hash map entries.
The hash map provides instant key lookup, while the doubly linked list maintains access order. Sentinel nodes eliminate edge case handling. When evicting, we access the node before `tail` and use its stored key to clean up the hash map.
- approach_name: OrderedDict (Python Built-in)
is_optimal: true
code: |
from collections import OrderedDict
class LRUCache:
"""LRU Cache using Python's OrderedDict which maintains insertion order."""
def __init__(self, capacity: int):
self.capacity = capacity
# OrderedDict remembers insertion order
self.cache: OrderedDict[int, int] = OrderedDict()
def get(self, key: int) -> int:
if key not in self.cache:
return -1
# Move to end (most recently used)
self.cache.move_to_end(key)
return self.cache[key]
def put(self, key: int, value: int) -> None:
if key in self.cache:
# Update and move to end
self.cache.move_to_end(key)
self.cache[key] = value
# Evict oldest if over capacity
if len(self.cache) > self.capacity:
# popitem(last=False) removes first (oldest) item
self.cache.popitem(last=False)
explanation: |
**Time Complexity:** O(1) for both operations — `OrderedDict` uses a hash map + doubly linked list internally.
**Space Complexity:** O(capacity) — same as the manual implementation.
Python's `OrderedDict` is essentially the same data structure we built manually. Using `move_to_end()` marks an item as recently used, and `popitem(last=False)` removes the oldest item. This is the pragmatic choice in a real Python codebase, but understanding the manual implementation is valuable for interviews.