questions S-W

2025-05-30 19:18:33 +01:00
parent 68699f35ec
commit f7e491f1e8
46 changed files with 9696 additions and 0 deletions

@@ -0,0 +1,173 @@
title: Same Tree
slug: same-tree
difficulty: easy
leetcode_id: 100
leetcode_url: https://leetcode.com/problems/same-tree/
categories:
- trees
- recursion
patterns:
- dfs
- tree-traversal
description: |
Given the roots of two binary trees `p` and `q`, write a function to check if they are the same or not.
Two binary trees are considered **the same** if they are structurally identical, and the nodes have the same value.
constraints: |
- The number of nodes in both trees is in the range `[0, 100]`
- `-10^4 <= Node.val <= 10^4`
examples:
- input: "p = [1,2,3], q = [1,2,3]"
output: "true"
explanation: "Both trees have identical structure and values at every node."
- input: "p = [1,2], q = [1,null,2]"
output: "false"
explanation: "The left subtree of p has a node with value 2, but the left subtree of q is empty. The structures differ."
- input: "p = [1,2,1], q = [1,1,2]"
output: "false"
explanation: "Although both trees have the same structure, the values at corresponding positions differ (left child is 2 vs 1, right child is 1 vs 2)."
explanation:
intuition: |
Think of comparing two binary trees like comparing two family trees side by side.
You start at the roots (the grandparents) and ask: "Are these two people the same?" If yes, you then recursively ask the same question about their left children (first child's family) and their right children (second child's family).
The key insight is that **two trees are the same if and only if**:
1. Their root values match
2. Their left subtrees are the same
3. Their right subtrees are the same
This naturally leads to a recursive solution. At each step, we compare the current nodes, then delegate the comparison of subtrees to recursive calls. The recursion terminates when we reach `null` nodes — if both are `null`, that part matches; if only one is `null`, the trees differ.
approach: |
We solve this using **Recursive DFS**:
**Step 1: Handle base cases**
- If both `p` and `q` are `null`, return `True` — two empty trees are identical
- If only one of `p` or `q` is `null`, return `False` — one tree has a node where the other doesn't
&nbsp;
**Step 2: Compare current nodes**
- If `p.val != q.val`, return `False` — the values at this position differ
&nbsp;
**Step 3: Recurse on subtrees**
- Recursively check if the left subtrees are the same: `is_same_tree(p.left, q.left)`
- Recursively check if the right subtrees are the same: `is_same_tree(p.right, q.right)`
- Return `True` only if **both** recursive calls return `True`
&nbsp;
This approach visits every node exactly once, comparing corresponding positions in both trees simultaneously.
common_pitfalls:
- title: Forgetting Null Checks
description: |
A common mistake is to access `p.val` or `q.val` without first checking if `p` or `q` is `null`.
This raises a `NullPointerException` (Java) or an `AttributeError` (Python). Always handle the `null` cases before accessing node properties.
wrong_approach: "Comparing values before checking for null"
correct_approach: "Check if both nodes are null, then if one is null, then compare values"
- title: Checking Only Values
description: |
Simply checking if both trees contain the same set of values is not enough.
For example, `[1,2,3]` and `[1,3,2]` contain the same set of values, and even have the same shape, but the values sit at different positions. The trees must be **structurally identical** with values matching at corresponding positions.
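A minimal sketch of why a value-only comparison falls short. It assumes a bare-bones `TreeNode` class; the helper name `collect_values` is hypothetical and exists only for this illustration:

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def collect_values(root):
    """Hypothetical helper: gather every value in the tree into a set."""
    if root is None:
        return set()
    return {root.val} | collect_values(root.left) | collect_values(root.right)


def is_same_tree(p, q):
    """Compare structure and values position by position."""
    if p is None and q is None:
        return True
    if p is None or q is None:
        return False
    return (p.val == q.val
            and is_same_tree(p.left, q.left)
            and is_same_tree(p.right, q.right))


# [1,2,3] and [1,3,2] in level order
a = TreeNode(1, TreeNode(2), TreeNode(3))
b = TreeNode(1, TreeNode(3), TreeNode(2))

same_value_sets = collect_values(a) == collect_values(b)  # True: both are {1, 2, 3}
same_trees = is_same_tree(a, b)                           # False: children are swapped
```

The set comparison discards position entirely, which is exactly the information the problem asks about.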
wrong_approach: "Collecting all values and comparing sets"
correct_approach: "Compare structure and values simultaneously during traversal"
- title: Only Checking One Subtree
description: |
After confirming the root values match, you must check **both** the left and right subtrees.
Returning early after checking only the left subtree would miss structural differences on the right side. Use `and` to ensure both subtrees are validated.
wrong_approach: "Returning after checking only left subtree"
correct_approach: "Return True only when both left AND right subtree checks pass"
key_takeaways:
- "**Recursive tree comparison**: When comparing two trees, compare roots first, then recursively compare corresponding subtrees"
- "**Base case handling**: Null checks are essential — two nulls are equal, one null means unequal"
- "**DFS pattern**: This is a classic application of depth-first traversal where we explore each branch fully before moving to siblings"
- "**Foundation for tree problems**: This same pattern extends to problems like symmetric tree, subtree checking, and tree serialisation"
time_complexity: "O(n). We visit each node in both trees at most once, where n is the minimum number of nodes in the two trees."
space_complexity: "O(h). The recursion stack can grow up to the height of the tree. In the worst case (skewed tree), this is O(n). For a balanced tree, it's O(log n)."
solutions:
- approach_name: Recursive DFS
is_optimal: true
code: |
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def is_same_tree(p: TreeNode | None, q: TreeNode | None) -> bool:
    # Base case: both nodes are null - trees match at this position
    if p is None and q is None:
        return True
    # Base case: one is null, other isn't - structure differs
    if p is None or q is None:
        return False
    # Compare current node values
    if p.val != q.val:
        return False
    # Both subtrees must match, so check them recursively
    return is_same_tree(p.left, q.left) and is_same_tree(p.right, q.right)
explanation: |
**Time Complexity:** O(n) — We visit each node at most once.
**Space Complexity:** O(h) — Recursion stack depth equals tree height.
The recursive approach elegantly handles all cases: empty trees, single nodes, and complex structures. By checking null conditions first, we avoid null pointer errors and correctly identify structural differences.
- approach_name: Iterative BFS
is_optimal: false
code: |
from collections import deque

def is_same_tree(p: TreeNode | None, q: TreeNode | None) -> bool:
    # Queue holds pairs of nodes to compare
    queue = deque([(p, q)])
    while queue:
        node1, node2 = queue.popleft()
        # Both null - continue to next pair
        if node1 is None and node2 is None:
            continue
        # One null, other not - trees differ
        if node1 is None or node2 is None:
            return False
        # Values differ at this position
        if node1.val != node2.val:
            return False
        # Add children pairs to compare next
        queue.append((node1.left, node2.left))
        queue.append((node1.right, node2.right))
    # All pairs matched
    return True
explanation: |
**Time Complexity:** O(n) — We process each node pair once.
**Space Complexity:** O(w) — Queue size is bounded by the maximum width of the tree.
This iterative approach uses a queue to compare nodes level by level. It avoids recursion stack overflow for very deep trees, though in practice the constraint (max 100 nodes) makes this unnecessary. The logic mirrors the recursive version but uses explicit queue management.

@@ -0,0 +1,221 @@
title: Search a 2D Matrix
slug: search-a-2d-matrix
difficulty: medium
leetcode_id: 74
leetcode_url: https://leetcode.com/problems/search-a-2d-matrix/
categories:
- arrays
- binary-search
patterns:
- binary-search
- matrix-traversal
description: |
You are given an `m x n` integer matrix `matrix` with the following two properties:
- Each row is sorted in non-decreasing order.
- The first integer of each row is greater than the last integer of the previous row.
Given an integer `target`, return `true` *if* `target` *is in* `matrix` *or* `false` *otherwise*.
You must write a solution in `O(log(m * n))` time complexity.
constraints: |
- `m == matrix.length`
- `n == matrix[i].length`
- `1 <= m, n <= 100`
- `-10^4 <= matrix[i][j], target <= 10^4`
examples:
- input: "matrix = [[1,3,5,7],[10,11,16,20],[23,30,34,60]], target = 3"
output: "true"
explanation: "The target 3 is found in the first row at index 1."
- input: "matrix = [[1,3,5,7],[10,11,16,20],[23,30,34,60]], target = 13"
output: "false"
explanation: "The target 13 is not present in the matrix."
explanation:
intuition: |
Imagine flattening this 2D matrix into a single sorted 1D array. Because each row is sorted *and* the first element of each row is greater than the last element of the previous row, the entire matrix is essentially one long sorted sequence arranged in rows.
For example, the matrix:
```
[[1, 3, 5, 7],
[10, 11, 16, 20],
[23, 30, 34, 60]]
```
Is equivalent to the sorted array: `[1, 3, 5, 7, 10, 11, 16, 20, 23, 30, 34, 60]`
This insight is the key! Instead of treating it as a 2D search problem, we can treat it as a standard **binary search** on a virtual 1D array. The only trick is converting between a 1D index and 2D row/column coordinates.
Think of it like this: if you have a 1D index `mid` in a matrix with `n` columns, you can find the row with `mid // n` (integer division) and the column with `mid % n` (remainder). This mapping lets us perform binary search without actually flattening the array.
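The index mapping can be sketched on the example matrix above. The helper name `value_at` is an assumption made for illustration, not part of any reference solution:

```python
matrix = [
    [1, 3, 5, 7],
    [10, 11, 16, 20],
    [23, 30, 34, 60],
]
cols = len(matrix[0])


def value_at(flat_index):
    """Hypothetical helper: read the matrix as if it were a flat sorted list."""
    # divmod gives (flat_index // cols, flat_index % cols) in one call
    row, col = divmod(flat_index, cols)
    return matrix[row][col]


# Reading indices 0 .. m*n - 1 reproduces the sorted 1D sequence
flattened = [value_at(i) for i in range(len(matrix) * cols)]
```

Index `6`, for instance, maps to row `6 // 4 = 1` and column `6 % 4 = 2`, which holds `16`.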
approach: |
We solve this using **Binary Search on Virtual 1D Array**:
**Step 1: Initialise search boundaries**
- `rows`: Number of rows in the matrix (`m`)
- `cols`: Number of columns in the matrix (`n`)
- `left`: Set to `0` (start of the virtual 1D array)
- `right`: Set to `rows * cols - 1` (end of the virtual 1D array)
&nbsp;
**Step 2: Perform binary search**
- While `left <= right`:
  - Calculate `mid = left + (right - left) // 2` to avoid integer overflow
  - Convert `mid` to 2D coordinates: `row = mid // cols`, `col = mid % cols`
  - Get the value at `matrix[row][col]`
  - If the value equals `target`, return `true`
  - If the value is less than `target`, search the right half: `left = mid + 1`
  - If the value is greater than `target`, search the left half: `right = mid - 1`
&nbsp;
**Step 3: Return result**
- If the loop exits without finding the target, return `false`
&nbsp;
The key insight is that we never actually flatten the matrix. We perform binary search on indices `0` to `m*n - 1` and convert each index to row/column coordinates on the fly.
common_pitfalls:
- title: Linear Search Through Rows
description: |
A naive approach might iterate through each row and perform a linear search, resulting in **O(m * n)** time complexity.
With `m, n <= 100` this is at most 10,000 comparisons, which would still run quickly, but the problem explicitly requires `O(log(m * n))`, so this approach does not meet the stated requirement.
wrong_approach: "Nested loops checking each element"
correct_approach: "Binary search treating matrix as virtual 1D array"
- title: Two Separate Binary Searches
description: |
Some solutions first binary search to find the correct row, then binary search within that row. While this achieves `O(log m + log n) = O(log(m * n))` complexity, it's more complex to implement.
The single binary search approach is cleaner and directly treats the matrix as a sorted 1D array.
wrong_approach: "Binary search for row, then binary search within row"
correct_approach: "Single binary search with index-to-coordinate conversion"
- title: Integer Overflow in Mid Calculation
description: |
When calculating `mid`, using `(left + right) / 2` can cause integer overflow in languages with fixed-size integers if `left` and `right` are both large.
Always use `left + (right - left) // 2` to avoid this issue. In Python this isn't strictly necessary due to arbitrary precision integers, but it's a good habit for interviews.
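Python integers never overflow, so the failure can only be simulated. The sketch below assumes a language with wrapping 32-bit signed integers (as with Java's `int`) and uses `ctypes.c_int32` to mimic the wraparound; the index values are illustrative and far larger than this problem's constraints allow:

```python
import ctypes

left, right = 2_000_000_000, 2_100_000_000  # each fits in a signed 32-bit int

# Simulate what a wrapping 32-bit signed addition would store for left + right
wrapped_sum = ctypes.c_int32(left + right).value  # wraps around to a negative number
naive_mid = wrapped_sum // 2                      # a meaningless, negative "index"

# The safe form only ever adds a non-negative offset to left
safe_mid = left + (right - left) // 2
```

The true sum exceeds `2**31 - 1`, so the simulated 32-bit result goes negative, while `safe_mid` stays between `left` and `right`.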
wrong_approach: "(left + right) / 2"
correct_approach: "left + (right - left) // 2"
- title: Off-by-One Errors in Index Conversion
description: |
A common mistake is confusing rows and columns when converting from 1D to 2D indices. Remember:
- `row = mid // cols` (divide by number of columns)
- `col = mid % cols` (remainder after dividing by columns)
Using `rows` instead of `cols` will produce incorrect coordinates.
key_takeaways:
- "**Virtual flattening**: A matrix whose rows are sorted, and whose every row starts above the last element of the previous row, can be treated as a sorted 1D array without actually creating one"
- "**Index conversion**: For a matrix with `n` columns, index `i` maps to row `i // n` and column `i % n`"
- "**Binary search pattern**: This problem demonstrates that binary search applies to any sorted sequence with constant-time index access, not just literal 1D arrays"
- "**Related problems**: Search a 2D Matrix II (LeetCode 240) has different constraints requiring a different approach (staircase search)"
time_complexity: "O(log(m * n)). We perform binary search over `m * n` elements, halving the search space with each iteration."
space_complexity: "O(1). We only use a constant number of variables (`left`, `right`, `mid`, `row`, `col`) regardless of input size."
solutions:
- approach_name: Binary Search on Virtual 1D Array
is_optimal: true
code: |
def search_matrix(matrix: list[list[int]], target: int) -> bool:
    if not matrix or not matrix[0]:
        return False
    rows, cols = len(matrix), len(matrix[0])
    # Treat the matrix as a sorted 1D array of length rows * cols
    left, right = 0, rows * cols - 1
    while left <= right:
        # Calculate mid index (avoids overflow in other languages)
        mid = left + (right - left) // 2
        # Convert 1D index to 2D coordinates
        row = mid // cols
        col = mid % cols
        value = matrix[row][col]
        if value == target:
            return True
        elif value < target:
            # Target is in the right half
            left = mid + 1
        else:
            # Target is in the left half
            right = mid - 1
    return False
explanation: |
**Time Complexity:** O(log(m * n)) — Standard binary search over m * n elements.
**Space Complexity:** O(1) — Only uses constant extra space.
We treat the 2D matrix as a virtual 1D sorted array. By converting indices on the fly (`row = mid // cols`, `col = mid % cols`), we can perform standard binary search without creating an actual flattened array.
- approach_name: Two Binary Searches
is_optimal: false
code: |
def search_matrix(matrix: list[list[int]], target: int) -> bool:
    if not matrix or not matrix[0]:
        return False
    rows, cols = len(matrix), len(matrix[0])
    # First binary search: find the row where target could be
    top, bottom = 0, rows - 1
    while top <= bottom:
        mid_row = top + (bottom - top) // 2
        # Check if target is in this row's range
        if matrix[mid_row][0] <= target <= matrix[mid_row][cols - 1]:
            # Target could be in this row, search within it
            left, right = 0, cols - 1
            while left <= right:
                mid_col = left + (right - left) // 2
                if matrix[mid_row][mid_col] == target:
                    return True
                elif matrix[mid_row][mid_col] < target:
                    left = mid_col + 1
                else:
                    right = mid_col - 1
            return False
        elif matrix[mid_row][0] > target:
            bottom = mid_row - 1
        else:
            top = mid_row + 1
    return False
explanation: |
**Time Complexity:** O(log m + log n) = O(log(m * n)) — Binary search for row, then within row.
**Space Complexity:** O(1) — Only uses constant extra space.
This approach binary searches over the rows, comparing `target` against each candidate row's first and last elements, then binary searches within the row whose range contains it. While it has the same complexity, it's more verbose than the single binary search approach.
- approach_name: Linear Search
is_optimal: false
code: |
def search_matrix(matrix: list[list[int]], target: int) -> bool:
    # Simple but inefficient - O(m * n) time
    for row in matrix:
        for val in row:
            if val == target:
                return True
    return False
explanation: |
**Time Complexity:** O(m * n) — Checks every element in the worst case.
**Space Complexity:** O(1) — Only uses constant extra space.
This brute force approach checks every element. While simple, it doesn't meet the problem's requirement for O(log(m * n)) complexity. Included to illustrate why binary search is necessary.

@@ -0,0 +1,217 @@
title: Search in Rotated Sorted Array II
slug: search-in-rotated-sorted-array-ii
difficulty: medium
leetcode_id: 81
leetcode_url: https://leetcode.com/problems/search-in-rotated-sorted-array-ii/
categories:
- arrays
- binary-search
patterns:
- binary-search
description: |
There is an integer array `nums` sorted in non-decreasing order (not necessarily with **distinct** values).
Before being passed to your function, `nums` is **rotated** at an unknown pivot index `k` (`0 <= k < nums.length`) such that the resulting array is `[nums[k], nums[k+1], ..., nums[n-1], nums[0], nums[1], ..., nums[k-1]]` (**0-indexed**). For example, `[0,1,2,4,4,4,5,6,6,7]` might be rotated at pivot index `5` and become `[4,5,6,6,7,0,1,2,4,4]`.
Given the array `nums` **after** the rotation and an integer `target`, return `true` *if* `target` *is in* `nums`*, or* `false` *if it is not in* `nums`.
You must decrease the overall operation steps as much as possible.
**Follow up:** This problem is similar to Search in Rotated Sorted Array, but `nums` may contain **duplicates**. Would this affect the runtime complexity? How and why?
constraints: |
- `1 <= nums.length <= 5000`
- `-10^4 <= nums[i] <= 10^4`
- `nums` is guaranteed to be rotated at some pivot
- `-10^4 <= target <= 10^4`
examples:
- input: "nums = [2,5,6,0,0,1,2], target = 0"
output: "true"
explanation: "The target 0 is found in the array."
- input: "nums = [2,5,6,0,0,1,2], target = 3"
output: "false"
explanation: "The target 3 is not in the array."
explanation:
intuition: |
This problem extends Search in Rotated Sorted Array by allowing **duplicate values**. The core insight from the original problem still applies: in a rotated sorted array, at least one half is always sorted, and we can use this to guide our binary search.
However, duplicates introduce a tricky edge case. Consider `nums = [1, 0, 1, 1, 1]` with `left = 0`, `mid = 2`, `right = 4`:
- `nums[left] = 1`, `nums[mid] = 1`, `nums[right] = 1`
- All three values are equal! We can't determine which half is sorted.
Think of it like standing at a broken staircase where the step you're on, and the steps at both ends, are all at the same height. You can't tell which direction leads up or down — you have to take a small step to break the tie.
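The solution is simple: when `nums[left] == nums[mid]`, we can't tell which half is sorted, so we **shrink the search space by one** (`left++`). This is safe because `nums[mid]` has already been compared with `target` and did not match, so the equal value `nums[left]` cannot be the target either. This might degrade to O(n) in the worst case (all duplicates), but it's the best we can do without additional information.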
approach: |
We solve this using **Modified Binary Search with Duplicate Handling**:
**Step 1: Initialise pointers**
- `left = 0`, `right = len(nums) - 1`
- We'll search within `[left, right]` inclusive
&nbsp;
**Step 2: Binary search with duplicate handling**
- While `left <= right`:
  - Calculate `mid = left + (right - left) // 2`
  - If `nums[mid] == target`: return `True`
  - **Handle duplicates**: If `nums[left] == nums[mid]`:
    - We can't determine which half is sorted
    - Increment `left` by 1 and continue
  - **Determine which half is sorted**:
    - If `nums[left] < nums[mid]`: **left half is sorted**
      - If `nums[left] <= target < nums[mid]`: search left half (`right = mid - 1`)
      - Else: search right half (`left = mid + 1`)
    - Else: **right half is sorted**
      - If `nums[mid] < target <= nums[right]`: search right half (`left = mid + 1`)
      - Else: search left half (`right = mid - 1`)
&nbsp;
**Step 3: Return False if not found**
- If the loop exits without finding the target, return `False`
&nbsp;
The key difference from the no-duplicates version is the duplicate handling step. When we can't determine which half is sorted, we fall back to linear elimination. This preserves correctness, and the search still runs in O(log n) whenever such ties are rare.
common_pitfalls:
- title: Ignoring the Duplicate Case
description: |
The most common mistake is using the exact same logic as Search in Rotated Sorted Array (without duplicates). That algorithm assumes `nums[left] <= nums[mid]` tells us the left half is sorted.
With duplicates, `nums[left] == nums[mid]` doesn't give us enough information. For example:
- `[1, 0, 1, 1, 1]`: left half contains the rotation point
- `[1, 1, 1, 0, 1]`: right half contains the rotation point
Both have `nums[left] == nums[mid]`, but the sorted half differs!
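A short sketch of the ambiguity, using the two arrays above. The helper name `probe` is hypothetical; it returns the only three values a single binary search step gets to inspect:

```python
def probe(nums):
    """Hypothetical helper: the three values one binary search step can see."""
    left, right = 0, len(nums) - 1
    mid = left + (right - left) // 2
    return nums[left], nums[mid], nums[right]


a = [1, 0, 1, 1, 1]  # the 0 sits to the left of mid (index 2)
b = [1, 1, 1, 0, 1]  # the 0 sits to the right of mid (index 2)

looks_identical = probe(a) == probe(b)  # both probes are (1, 1, 1)
zero_side_a = "left" if a.index(0) < 2 else "right"
zero_side_b = "left" if b.index(0) < 2 else "right"
```

From the probe alone the two arrays are indistinguishable, yet a search for `0` would need to go in opposite directions.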
wrong_approach: "Treating nums[left] == nums[mid] as 'left half sorted'"
correct_approach: "When nums[left] == nums[mid], shrink search space with left++"
- title: Shrinking Both Ends
description: |
Some solutions shrink both `left` and `right` when duplicates are found at the boundaries. While this can work, it's more complex and error-prone.
A simpler approach is to only increment `left` when `nums[left] == nums[mid]`. This is sufficient to make progress while keeping the logic clean.
wrong_approach: "Complex logic to shrink both ends simultaneously"
correct_approach: "Simple left++ when nums[left] == nums[mid]"
- title: Expecting O(log n) Guarantee
description: |
Unlike the no-duplicates version which guarantees O(log n), this problem has **O(n) worst case**. Consider `nums = [1, 1, 1, 1, 1]` with a target other than `1`: every step can discard only one element, so we end up checking all of them.
This is unavoidable. The problem statement hints at this with "decrease the overall operation steps as much as possible" rather than requiring O(log n).
Accept that O(n) worst case is inherent to the problem, not a flaw in your solution.
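To make the degradation concrete, here is a sketch that mirrors the left-shrinking search and adds a loop counter. The function name `search_with_steps` and the counter are additions for illustration only:

```python
def search_with_steps(nums, target):
    """Left-shrinking rotated search that also reports how many iterations ran."""
    left, right = 0, len(nums) - 1
    steps = 0
    while left <= right:
        steps += 1
        mid = left + (right - left) // 2
        if nums[mid] == target:
            return True, steps
        if nums[left] == nums[mid]:
            left += 1  # linear fallback: discard a single element
            continue
        if nums[left] < nums[mid]:
            # Left half is sorted
            if nums[left] <= target < nums[mid]:
                right = mid - 1
            else:
                left = mid + 1
        else:
            # Right half is sorted
            if nums[mid] < target <= nums[right]:
                left = mid + 1
            else:
                right = mid - 1
    return False, steps


# All duplicates, target absent: one element is discarded per iteration
found_dup, steps_dup = search_with_steps([1] * 16, 0)
# Distinct values, target absent: most iterations halve the range
found_distinct, steps_distinct = search_with_steps(list(range(16)), -1)
```

On the all-duplicates input the loop runs once per element, while the distinct input finishes in far fewer iterations.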
key_takeaways:
- "**Duplicates break binary search decisions**: When values at boundaries equal the middle value, you can't determine which half is sorted"
- "**Linear fallback is necessary**: Incrementing `left` when stuck ensures progress, even if it degrades to O(n)"
- "**Worst case is O(n)**: This is inherent to the problem. On an array of identical values that may hide a single different element, no algorithm can rule out the target without inspecting every position"
- "**Compare to the original**: Understanding how duplicates change the problem helps solidify your understanding of both problems"
time_complexity: "O(log n) average, O(n) worst case. When we can determine the sorted half, we eliminate half the search space. When duplicates prevent this, we fall back to linear elimination."
space_complexity: "O(1). Only a constant number of pointer variables are used."
solutions:
- approach_name: Binary Search with Duplicate Handling
is_optimal: true
code: |
def search(nums: list[int], target: int) -> bool:
    left, right = 0, len(nums) - 1
    while left <= right:
        mid = left + (right - left) // 2
        # Found the target
        if nums[mid] == target:
            return True
        # Handle duplicates: can't determine which half is sorted
        if nums[left] == nums[mid]:
            left += 1
            continue
        # Determine which half is sorted
        if nums[left] < nums[mid]:
            # Left half is sorted
            if nums[left] <= target < nums[mid]:
                # Target is in the sorted left half
                right = mid - 1
            else:
                # Target is in the right half
                left = mid + 1
        else:
            # Right half is sorted
            if nums[mid] < target <= nums[right]:
                # Target is in the sorted right half
                left = mid + 1
            else:
                # Target is in the left half
                right = mid - 1
    # Target not found
    return False
explanation: |
**Time Complexity:** O(log n) average, O(n) worst case.
**Space Complexity:** O(1) — Constant extra space.
This solution handles the duplicate case by incrementing `left` when `nums[left] == nums[mid]`. For arrays without many duplicates, it behaves like standard binary search. For arrays with all identical elements, it degrades to linear search — but this is unavoidable.
- approach_name: Binary Search with Both-End Shrinking
is_optimal: false
code: |
def search(nums: list[int], target: int) -> bool:
    left, right = 0, len(nums) - 1
    while left <= right:
        mid = left + (right - left) // 2
        if nums[mid] == target:
            return True
        # Handle duplicates at both ends
        if nums[left] == nums[mid] == nums[right]:
            left += 1
            right -= 1
        elif nums[left] <= nums[mid]:
            # Left half is sorted
            if nums[left] <= target < nums[mid]:
                right = mid - 1
            else:
                left = mid + 1
        else:
            # Right half is sorted
            if nums[mid] < target <= nums[right]:
                left = mid + 1
            else:
                right = mid - 1
    return False
explanation: |
**Time Complexity:** O(log n) average, O(n) worst case.
**Space Complexity:** O(1) — Constant extra space.
This variant shrinks both ends when `nums[left] == nums[mid] == nums[right]`. It can be slightly faster in practice for arrays with duplicates at both ends, but the logic is more complex. The simpler single-end shrinking is usually preferred.
- approach_name: Linear Search
is_optimal: false
code: |
def search(nums: list[int], target: int) -> bool:
    # Simple membership check
    return target in nums
explanation: |
**Time Complexity:** O(n) — Scans the entire array.
**Space Complexity:** O(1) — No extra space used.
A straightforward linear scan using Python's `in` operator. While this has the same worst-case complexity as the binary search solution, it doesn't take advantage of the sorted structure when duplicates are sparse. Useful as a baseline or for very small arrays.

@@ -0,0 +1,223 @@
title: Search in Rotated Sorted Array
slug: search-in-rotated-sorted-array
difficulty: medium
leetcode_id: 33
leetcode_url: https://leetcode.com/problems/search-in-rotated-sorted-array/
categories:
- arrays
- binary-search
patterns:
- binary-search
description: |
There is an integer array `nums` sorted in ascending order (with **distinct** values).
Prior to being passed to your function, `nums` is **possibly rotated** at an unknown pivot index `k` (`1 <= k < nums.length`) such that the resulting array is `[nums[k], nums[k+1], ..., nums[n-1], nums[0], nums[1], ..., nums[k-1]]` (0-indexed). For example, `[0,1,2,4,5,6,7]` might be rotated at pivot index `3` and become `[4,5,6,7,0,1,2]`.
Given the array `nums` **after** the possible rotation and an integer `target`, return *the index of* `target` *if it is in* `nums`*, or* `-1` *if it is not in* `nums`.
You must write an algorithm with **O(log n)** runtime complexity.
constraints: |
- `1 <= nums.length <= 5000`
- `-10^4 <= nums[i] <= 10^4`
- All values of `nums` are **unique**
- `nums` is an ascending array that is possibly rotated
- `-10^4 <= target <= 10^4`
examples:
- input: "nums = [4,5,6,7,0,1,2], target = 0"
output: "4"
explanation: "The target 0 is found at index 4."
- input: "nums = [4,5,6,7,0,1,2], target = 3"
output: "-1"
explanation: "The target 3 is not in the array, so return -1."
- input: "nums = [1], target = 0"
output: "-1"
explanation: "Single element array doesn't contain the target."
explanation:
intuition: |
Imagine a sorted array as a staircase going upward. When you rotate it, you're cutting the staircase at some point and moving the upper portion to the end — creating a "broken" staircase with two ascending sections.
For example, `[0,1,2,4,5,6,7]` becomes `[4,5,6,7,0,1,2]`. We now have two sorted "halves": `[4,5,6,7]` (the higher half) and `[0,1,2]` (the lower half).
The key insight is: **at least one half of the array is always properly sorted**. When you pick a middle element, you can determine which half is sorted by comparing the endpoints. Then you check if your target falls within that sorted range — if yes, search there; if not, search the other half.
Think of it like this: you're at the middle of a broken staircase. You look left and right. One side will be a clean, unbroken staircase (sorted). You can quickly tell if your target is on that sorted side. If not, it must be on the "broken" side, so you continue searching there.
approach: |
We solve this using **Modified Binary Search**:
**Step 1: Initialise pointers**
- `left = 0`, `right = len(nums) - 1`
- We'll search within `[left, right]` inclusive
&nbsp;
**Step 2: Binary search with sorted-half detection**
- While `left <= right`:
  - Calculate `mid = left + (right - left) // 2`
  - If `nums[mid] == target`: return `mid`
  - **Determine which half is sorted**:
    - If `nums[left] <= nums[mid]`: **left half is sorted**
      - If `nums[left] <= target < nums[mid]`: target is in left half, set `right = mid - 1`
      - Else: target is in right half, set `left = mid + 1`
    - Else: **right half is sorted**
      - If `nums[mid] < target <= nums[right]`: target is in right half, set `left = mid + 1`
      - Else: target is in left half, set `right = mid - 1`
&nbsp;
**Step 3: Return -1 if not found**
- If the loop exits without finding the target, return `-1`
&nbsp;
The algorithm works because we can always identify one sorted half and determine if the target lies within that range. If it does, we search there; otherwise, we search the other (potentially rotated) half.
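The steps above can be traced on the first example. This sketch follows the same logic but records each iteration; the function name `trace_search` and the trace tuples are assumptions added for illustration:

```python
def trace_search(nums, target):
    """Return (index, trace); trace holds (left, mid, right, sorted_half) per iteration."""
    left, right = 0, len(nums) - 1
    trace = []
    while left <= right:
        mid = left + (right - left) // 2
        if nums[mid] == target:
            trace.append((left, mid, right, "found"))
            return mid, trace
        if nums[left] <= nums[mid]:
            # Left half nums[left..mid] is sorted
            trace.append((left, mid, right, "left"))
            if nums[left] <= target < nums[mid]:
                right = mid - 1
            else:
                left = mid + 1
        else:
            # Right half nums[mid..right] is sorted
            trace.append((left, mid, right, "right"))
            if nums[mid] < target <= nums[right]:
                left = mid + 1
            else:
                right = mid - 1
    return -1, trace


index, trace = trace_search([4, 5, 6, 7, 0, 1, 2], 0)
# trace: (0, 3, 6, "left") -> (4, 5, 6, "left") -> (4, 4, 4, "found")
```

In the first iteration the sorted left half `[4, 5, 6, 7]` cannot contain `0`, so the search moves right; the second iteration narrows to index `4`, where the target is found.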
common_pitfalls:
- title: Confusing Which Half is Sorted
description: |
The condition `nums[left] <= nums[mid]` (with `<=`, not `<`) correctly identifies when the left half is sorted. The `=` is important for cases where `left == mid` (small subarrays).
For example, with `[3, 1]` and `left = 0, mid = 0`:
- `nums[left] == nums[mid] == 3`
- The left "half" (just element 3) is trivially sorted
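Using `<` instead of `<=` would send the search into the "right half is sorted" branch, even though `nums[mid..right] = [3, 1]` is not sorted, so a target of `1` would be missed.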
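A small sketch of this failure on `[3, 1]`. The `strict` flag is a hypothetical switch, added only so the buggy and correct comparisons can be run side by side:

```python
def search(nums, target, strict=False):
    """Rotated-array search; strict=True swaps in the buggy `<` comparison."""
    left, right = 0, len(nums) - 1
    while left <= right:
        mid = left + (right - left) // 2
        if nums[mid] == target:
            return mid
        if strict:
            left_sorted = nums[left] < nums[mid]   # buggy: fails when left == mid
        else:
            left_sorted = nums[left] <= nums[mid]  # correct
        if left_sorted:
            if nums[left] <= target < nums[mid]:
                right = mid - 1
            else:
                left = mid + 1
        else:
            if nums[mid] < target <= nums[right]:
                left = mid + 1
            else:
                right = mid - 1
    return -1


correct_result = search([3, 1], 1)             # finds the target at index 1
buggy_result = search([3, 1], 1, strict=True)  # discards index 1 and returns -1
```

With the strict comparison, the first iteration takes the `else` branch, fails the range check `3 < 1 <= 1`, and sets `right = -1`, ending the loop before index `1` is ever examined.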
wrong_approach: "if nums[left] < nums[mid] (missing equality)"
correct_approach: "if nums[left] <= nums[mid]"
- title: Incorrect Target Range Checks
description: |
When checking if the target is in the sorted half, be precise about the boundaries:
- For left half: `nums[left] <= target < nums[mid]` (exclude mid since we already checked it)
- For right half: `nums[mid] < target <= nums[right]` (exclude mid)
Missing the equality check with the endpoints can cause you to search the wrong half.
wrong_approach: "nums[left] < target < nums[mid]"
correct_approach: "nums[left] <= target < nums[mid]"
- title: Using Two-Pass Approach
description: |
A tempting approach is to first find the pivot (minimum element) using "Find Minimum in Rotated Sorted Array", then do a standard binary search on the correct half. While this works, it requires **two passes** (O(log n) + O(log n)).
The single-pass approach is cleaner and more elegant — you determine the sorted half and search direction simultaneously in one loop.
wrong_approach: "Find pivot first, then binary search"
correct_approach: "Single binary search with sorted-half detection"
key_takeaways:
- "**One sorted half**: In a rotated sorted array, at least one half is always properly sorted — use this property to guide your search"
- "**Compare with endpoints**: Comparing `nums[left]` with `nums[mid]` tells you which half is sorted"
- "**Range membership**: Once you identify the sorted half, checking if target is in that range is straightforward"
- "**Foundation problem**: This technique is the basis for many rotated array problems (find minimum, search with duplicates, etc.)"
time_complexity: "O(log n). Each iteration eliminates half the search space."
space_complexity: "O(1). Only a constant number of pointer variables are used."
solutions:
- approach_name: Binary Search with Sorted-Half Detection
is_optimal: true
code: |
def search(nums: list[int], target: int) -> int:
    left, right = 0, len(nums) - 1
    while left <= right:
        mid = left + (right - left) // 2
        # Found the target
        if nums[mid] == target:
            return mid
        # Determine which half is sorted
        if nums[left] <= nums[mid]:
            # Left half is sorted
            if nums[left] <= target < nums[mid]:
                # Target is in the sorted left half
                right = mid - 1
            else:
                # Target is in the right half
                left = mid + 1
        else:
            # Right half is sorted
            if nums[mid] < target <= nums[right]:
                # Target is in the sorted right half
                left = mid + 1
            else:
                # Target is in the left half
                right = mid - 1
    # Target not found
    return -1
explanation: |
**Time Complexity:** O(log n) — Search space halves each iteration.
**Space Complexity:** O(1) — Constant extra space.
At each step, we identify which half is sorted (by comparing `nums[left]` with `nums[mid]`), then check if the target falls within that sorted range. If yes, we search that half; otherwise, we search the other half. This ensures we always make progress toward the target or confirm its absence.
- approach_name: Two-Pass (Find Pivot + Binary Search)
is_optimal: false
code: |
def search(nums: list[int], target: int) -> int:
    n = len(nums)
    # Step 1: Find the pivot (index of minimum element)
    left, right = 0, n - 1
    while left < right:
        mid = left + (right - left) // 2
        if nums[mid] > nums[right]:
            left = mid + 1
        else:
            right = mid
    pivot = left
    # Step 2: Determine which portion to search
    if pivot == 0:
        # Array not rotated or single element: search the whole array
        left, right = 0, n - 1
    elif target >= nums[0]:
        # Target can only be in the left portion (before pivot)
        left, right = 0, pivot - 1
    else:
        # Target can only be in the right portion (pivot to end)
        left, right = pivot, n - 1
    # Step 3: Standard binary search in the chosen range
    while left <= right:
        mid = left + (right - left) // 2
        if nums[mid] == target:
            return mid
        elif nums[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1
explanation: |
**Time Complexity:** O(log n) — Two binary searches, each O(log n).
**Space Complexity:** O(1) — Constant extra space.
This approach first finds the pivot (minimum element) to understand the array's structure, then performs a standard binary search on the appropriate half. While correct and having the same complexity, it's more verbose than the single-pass approach. Useful if you've already solved "Find Minimum in Rotated Sorted Array" and want to build on that.
- approach_name: Linear Search
is_optimal: false
code: |
def search(nums: list[int], target: int) -> int:
    # Simple scan through the array
    for i, num in enumerate(nums):
        if num == target:
            return i
    return -1
explanation: |
**Time Complexity:** O(n) — Scans every element.
**Space Complexity:** O(1) — No extra space used.
A straightforward linear scan. This doesn't meet the O(log n) requirement but demonstrates the baseline approach. With constraints up to `n = 5000`, linear search would still pass but isn't optimal.
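Although too slow to be the intended answer, the linear scan makes a convenient oracle for testing the binary search. The sketch below is a hypothetical test harness, not part of the original solutions: the names `search_binary`, `search_linear`, and `check_all_rotations` are assumptions, and the binary search is re-declared only so the block runs on its own.

```python
def search_binary(nums: list[int], target: int) -> int:
    """Sorted-half binary search on a rotated array of distinct values."""
    left, right = 0, len(nums) - 1
    while left <= right:
        mid = left + (right - left) // 2
        if nums[mid] == target:
            return mid
        if nums[left] <= nums[mid]:
            # Left half is sorted
            if nums[left] <= target < nums[mid]:
                right = mid - 1
            else:
                left = mid + 1
        else:
            # Right half is sorted
            if nums[mid] < target <= nums[right]:
                left = mid + 1
            else:
                right = mid - 1
    return -1


def search_linear(nums: list[int], target: int) -> int:
    """O(n) baseline used as the reference answer."""
    for i, num in enumerate(nums):
        if num == target:
            return i
    return -1


def check_all_rotations(sorted_nums: list[int], extra_targets: list[int]) -> bool:
    """Compare both searches on every rotation of a sorted, distinct array."""
    for k in range(len(sorted_nums)):
        rotated = sorted_nums[k:] + sorted_nums[:k]
        for target in sorted_nums + extra_targets:
            if search_binary(rotated, target) != search_linear(rotated, target):
                return False
    return True
```

Because the values are distinct, each target has at most one index, so the two functions must agree exactly on every rotation.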


@@ -0,0 +1,186 @@
title: Search Insert Position
slug: search-insert-position
difficulty: easy
leetcode_id: 35
leetcode_url: https://leetcode.com/problems/search-insert-position/
categories:
- arrays
- binary-search
patterns:
- binary-search
function_signature: "def search_insert(nums: list[int], target: int) -> int:"
test_cases:
visible:
- input: { nums: [1, 3, 5, 6], target: 5 }
expected: 2
- input: { nums: [1, 3, 5, 6], target: 2 }
expected: 1
- input: { nums: [1, 3, 5, 6], target: 7 }
expected: 4
hidden:
- input: { nums: [1, 3, 5, 6], target: 0 }
expected: 0
- input: { nums: [1], target: 0 }
expected: 0
- input: { nums: [1], target: 2 }
expected: 1
description: |
Given a sorted array of distinct integers and a target value, return the index if the target is found. If not, return the index where it would be if it were inserted in order.
You must write an algorithm with `O(log n)` runtime complexity.
constraints: |
- `1 <= nums.length <= 10^4`
- `-10^4 <= nums[i] <= 10^4`
- `nums` contains **distinct** values sorted in **ascending** order
- `-10^4 <= target <= 10^4`
examples:
- input: "nums = [1,3,5,6], target = 5"
output: "2"
explanation: "5 is found at index 2."
- input: "nums = [1,3,5,6], target = 2"
output: "1"
explanation: "2 is not found, but would be inserted at index 1 (between 1 and 3)."
- input: "nums = [1,3,5,6], target = 7"
output: "4"
explanation: "7 is not found and is greater than all elements, so it would be inserted at the end (index 4)."
explanation:
intuition: |
Imagine you have a bookshelf where books are arranged by page count from smallest to largest. You want to find where a book with a specific page count belongs — either its exact position if it's already there, or where you'd slide it in to maintain the order.
The key insight is that this is a **classic binary search problem with a twist**: instead of just returning `-1` when the target isn't found, we return the *insertion point*. This insertion point is exactly where our search "narrows down to" when the target doesn't exist.
Think of it like this: binary search works by maintaining a range `[left, right]` where the answer *could* be. Each step, we eliminate half the range. When we find the target, we return its index. When we don't find it, the `left` pointer ends up at exactly the position where the target would need to be inserted to maintain sorted order.
Why does `left` give us the insertion point? Because we're looking for the **first position** where `nums[i] >= target`. When the loop ends without finding the target, `left` points to the smallest element greater than target (or past the end if target is larger than all elements).
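One way to sanity-check the "first position where `nums[i] >= target`" claim is to compare against the standard library, which defines its insertion point the same way for sorted input. This is a small sketch; the helper name `agrees_with_bisect` is an assumption made for illustration.

```python
from bisect import bisect_left


def search_insert(nums: list[int], target: int) -> int:
    left, right = 0, len(nums) - 1
    while left <= right:
        mid = left + (right - left) // 2
        if nums[mid] == target:
            return mid
        elif nums[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    # No match: left is the first index whose value exceeds target
    return left


def agrees_with_bisect(nums: list[int], targets: range) -> bool:
    """True if search_insert matches bisect_left for every target."""
    return all(search_insert(nums, t) == bisect_left(nums, t) for t in targets)
```

The agreement relies on the values being distinct, as the constraints guarantee; with duplicates, the early `return mid` could land on any matching index rather than the leftmost one.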
approach: |
We solve this using **Binary Search**:
**Step 1: Initialise pointers**
- `left`: Set to `0` (start of array)
- `right`: Set to `len(nums) - 1` (end of array)
&nbsp;
**Step 2: Binary search loop**
- While `left <= right`:
- Calculate `mid = left + (right - left) // 2` to avoid integer overflow
- If `nums[mid] == target`: return `mid` (found the target)
- If `nums[mid] < target`: the target must be in the right half, so set `left = mid + 1`
- If `nums[mid] > target`: the target must be in the left half, so set `right = mid - 1`
&nbsp;
**Step 3: Return insertion point**
- If we exit the loop without finding the target, return `left`
- At this point, `left` is the index where target should be inserted
&nbsp;
The algorithm works because binary search naturally converges to the insertion point. When the target isn't found, `left` ends up at the position of the smallest element greater than target.
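To watch the convergence happen, the steps above can be instrumented to record the pointers on each iteration. This is an illustrative sketch only; `search_insert_trace` is a hypothetical helper that returns the answer together with the `(left, right, mid)` triples it visited.

```python
def search_insert_trace(nums: list[int], target: int) -> tuple[int, list[tuple[int, int, int]]]:
    """Return (answer, steps), where steps holds (left, right, mid) per iteration."""
    left, right = 0, len(nums) - 1
    steps = []
    while left <= right:
        mid = left + (right - left) // 2
        steps.append((left, right, mid))
        if nums[mid] == target:
            return mid, steps
        elif nums[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    # Loop ended without a match: left is the insertion point
    return left, steps
```

For `nums = [1, 3, 5, 6]` and `target = 2`, the recorded steps are `(0, 3, 1)` then `(0, 0, 0)`, after which `left` has moved to `1`, the insertion index.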
common_pitfalls:
- title: Using Linear Search
description: |
A naive approach is to iterate through the array and return the first index where `nums[i] >= target`:
```python
for i in range(len(nums)):
    if nums[i] >= target:
        return i
return len(nums)
```
While correct, this is **O(n)** time complexity. The problem explicitly requires **O(log n)**, so this approach would be considered incorrect even if it passes the test cases.
wrong_approach: "Linear scan through the array"
correct_approach: "Binary search halving the search space each step"
- title: Returning -1 When Not Found
description: |
Classic binary search returns `-1` when the element isn't found. But this problem asks for the *insertion position*, not whether the element exists.
If you return `-1` once the loop ends without a match, you'll fail every test case where the target is absent and an insertion index is expected.
wrong_approach: "Return -1 when target not found"
correct_approach: "Return left pointer as the insertion position"
- title: Integer Overflow in Mid Calculation
description: |
Calculating `mid = (left + right) // 2` can cause integer overflow in some languages when `left` and `right` are both large.
Use `mid = left + (right - left) // 2` instead. In Python this isn't strictly necessary due to arbitrary precision integers, but it's a good habit for interviews where you might code in Java or C++.
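Python integers never overflow, so the failure has to be simulated to be seen. The sketch below wraps results to a signed 32-bit value the way Java `int` arithmetic does (in C and C++, signed overflow is undefined behaviour rather than a guaranteed wrap). The helper `to_int32` and the chosen index values are assumptions for illustration.

```python
import ctypes


def to_int32(value: int) -> int:
    """Wrap a Python int to a signed 32-bit value (two's complement)."""
    return ctypes.c_int32(value).value


# Two large but valid 32-bit indices
left = 2**30
right = 2**31 - 1

# Naive formula: the intermediate sum no longer fits in 32 bits and wraps negative
naive_sum = to_int32(left + right)

# Safe formula: right - left always fits, so no intermediate overflow
safe_mid = left + to_int32(right - left) // 2
```

Here `naive_sum` wraps to a negative number, so any midpoint derived from it is an invalid index, while `safe_mid` stays between `left` and `right`.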
wrong_approach: "mid = (left + right) // 2"
correct_approach: "mid = left + (right - left) // 2"
- title: Off-by-One Errors
description: |
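A common mistake is mixing loop templates: pairing `while left <= right` with `left = mid` or `right = mid` (which can loop forever), or pairing `while left < right` with `right = len(nums) - 1` and `mid - 1` updates (which can return the wrong index). The `left < right` form is only valid when used consistently as a half-open search, with `right = len(nums)` and `right = mid`.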
With `left <= right` and proper pointer updates, the search space shrinks by at least one element each iteration, guaranteeing termination.
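For comparison, here is a sketch of the half-open template, where `left < right` is the correct loop condition because `right` is exclusive. The function name `search_insert_half_open` is an assumption; the point is that the loop condition, the initial `right`, and the pointer updates must all come from the same template.

```python
def search_insert_half_open(nums: list[int], target: int) -> int:
    # Search the half-open range [left, right); right starts one past the end
    left, right = 0, len(nums)
    while left < right:
        mid = left + (right - left) // 2
        if nums[mid] < target:
            # Everything up to and including mid is too small
            left = mid + 1
        else:
            # mid may itself be the answer, so keep it in range
            right = mid
    return left
```

This version has no early return: it always narrows down to the first index whose value is at least `target`, which is both the match position and the insertion point.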
wrong_approach: "Mixing templates, e.g. while left <= right with right = mid"
correct_approach: "while left <= right with mid +/- 1 assignments"
key_takeaways:
- "**Binary search template**: This problem demonstrates the standard binary search pattern that applies to many problems — sorted array, O(log n) requirement, halving search space"
- "**Insertion point insight**: When binary search doesn't find the target, the `left` pointer naturally lands at the insertion position"
- "**Foundation for harder problems**: The result is the same index that Python's `bisect.bisect_left` returns, and the technique underlies problems like finding first/last occurrence, search in rotated array, and more"
- "**Interview favourite**: This is a classic warm-up problem that tests whether you truly understand binary search beyond just finding an element"
time_complexity: "O(log n). Each iteration halves the search space, so the loop runs at most ⌊log<sub>2</sub>(n)⌋ + 1 times."
space_complexity: "O(1). We only use a constant number of index variables (`left`, `right`, and `mid`) regardless of input size."
solutions:
- approach_name: Binary Search
is_optimal: true
code: |
def search_insert(nums: list[int], target: int) -> int:
    left, right = 0, len(nums) - 1
    while left <= right:
        # Avoid potential overflow with this formula
        mid = left + (right - left) // 2
        if nums[mid] == target:
            # Found the target, return its index
            return mid
        elif nums[mid] < target:
            # Target is in the right half
            left = mid + 1
        else:
            # Target is in the left half
            right = mid - 1
    # Target not found, left is the insertion point
    return left
explanation: |
**Time Complexity:** O(log n) — Each iteration eliminates half the remaining elements.
**Space Complexity:** O(1) — Only two pointer variables used.
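The binary search maintains the invariant that every element before index `left` is smaller than the target and every element after index `right` is larger. When the loop ends without a match, `left == right + 1`, so `left` is the first index whose element exceeds the target (or `len(nums)` if no element does), which is exactly the insertion position.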
- approach_name: Linear Search
is_optimal: false
code: |
def search_insert(nums: list[int], target: int) -> int:
    # Find first element >= target
    for i in range(len(nums)):
        if nums[i] >= target:
            return i
    # Target is larger than all elements
    return len(nums)
explanation: |
**Time Complexity:** O(n) — May need to scan the entire array.
**Space Complexity:** O(1) — Only loop variable used.
This approach is simple and correct, but doesn't meet the O(log n) requirement. Included to illustrate why binary search is necessary. In an interview, using this solution would likely be marked incorrect despite passing test cases.


@@ -0,0 +1,247 @@
title: Serialize and Deserialize Binary Tree
slug: serialize-and-deserialize-binary-tree
difficulty: hard
leetcode_id: 297
leetcode_url: https://leetcode.com/problems/serialize-and-deserialize-binary-tree/
categories:
- trees
- strings
patterns:
- bfs
- dfs
- tree-traversal
description: |
Serialization is the process of converting a data structure or object into a sequence of bits so that it can be stored in a file or memory buffer, or transmitted across a network connection link to be reconstructed later in the same or another computer environment.
Design an algorithm to serialize and deserialize a binary tree. There is no restriction on how your serialization/deserialization algorithm should work. You just need to ensure that a binary tree can be serialized to a string and this string can be deserialized to the original tree structure.
**Clarification:** The input/output format is the same as how LeetCode serializes a binary tree. You do not necessarily need to follow this format, so please be creative and come up with different approaches yourself.
constraints: |
- The number of nodes in the tree is in the range `[0, 10^4]`
- `-1000 <= Node.val <= 1000`
examples:
- input: "root = [1,2,3,null,null,4,5]"
output: "[1,2,3,null,null,4,5]"
explanation: "The tree is serialized to a string representation and then deserialized back to the original tree structure."
- input: "root = []"
output: "[]"
explanation: "An empty tree serializes to an empty representation."
explanation:
intuition: |
Think of serialization like writing directions to recreate a sculpture. You need to capture enough information that someone else (or a computer) can rebuild the exact same structure from your description alone.
For a binary tree, the challenge is that we need to encode not just the *values* of nodes, but also the *structure* — which nodes are children of which, and where the tree has missing children (null positions). Without encoding the nulls, we couldn't distinguish between different tree shapes that have the same values.
Imagine you're describing a family tree over the phone. You might say: "The root is 1. Their left child is 2, right child is 3. Node 2 has no children. Node 3's left child is 4, right child is 5." This level-by-level description is essentially **BFS serialization**.
Alternatively, you could describe it depth-first: "Start at 1, go left to 2. Node 2 has no left child (null), no right child (null). Back to 1, go right to 3. Node 3's left is 4..." This is **preorder DFS serialization**.
Both approaches work because they capture the complete structure. The key insight is that by including null markers, we can uniquely reconstruct any binary tree — even ones that aren't complete or balanced.
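A tiny experiment makes the role of null markers concrete. The sketch below is illustrative only: the `preorder` helper and its `with_nulls` flag are assumptions. It serializes two differently shaped trees that hold the same values.

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def preorder(node, with_nulls: bool) -> list[str]:
    """Preorder token list, optionally recording missing children."""
    if node is None:
        return ["null"] if with_nulls else []
    return (
        [str(node.val)]
        + preorder(node.left, with_nulls)
        + preorder(node.right, with_nulls)
    )


# Same values, different shapes
tree_a = TreeNode(1, left=TreeNode(2))   # 2 is the left child
tree_b = TreeNode(1, right=TreeNode(2))  # 2 is the right child

without_a = ",".join(preorder(tree_a, with_nulls=False))
without_b = ",".join(preorder(tree_b, with_nulls=False))
with_a = ",".join(preorder(tree_a, with_nulls=True))
with_b = ",".join(preorder(tree_b, with_nulls=True))
```

Without markers both trees collapse to `"1,2"`; with markers they become `"1,2,null,null,null"` and `"1,null,2,null,null"`, so the shape survives the round trip.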
approach: |
We present two approaches: **Preorder DFS** (recursive and elegant) and **Level-Order BFS** (iterative and intuitive).
### Approach 1: Preorder DFS
**Step 1: Serialization — Preorder traversal with null markers**
- Visit the current node and append its value to the result
- Use a delimiter (comma) between values
- Use a special marker (e.g., `"null"` or `"#"`) for null nodes
- Recursively serialize left subtree, then right subtree
&nbsp;
**Step 2: Deserialization — Rebuild from preorder sequence**
- Split the serialized string by delimiter to get a list of values
- Use an iterator or index to track position in the list
- For each value: if it's null, return `None`; otherwise create a node
- Recursively build left subtree, then right subtree
- The preorder property ensures we process nodes in the correct order
&nbsp;
### Approach 2: Level-Order BFS
**Step 1: Serialization — Level-by-level traversal**
- Use a queue starting with the root
- For each node: append its value (or null marker) to result
- Add children to queue (even if null, we still record them)
&nbsp;
**Step 2: Deserialization — Rebuild level by level**
- Parse the first value as root
- Use a queue of parent nodes
- For each parent, the next two values in the list are its left and right children
- Add non-null children to the queue for processing their children
&nbsp;
Both approaches have the same complexity, but DFS is often more concise while BFS matches LeetCode's standard format.
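LeetCode's displayed format is level-order with trailing nulls dropped and square brackets around the list. As a sketch of how a BFS serializer could reproduce that exact output, the hypothetical function below (`serialize_leetcode_style` is an assumed name, not part of the problem's API) trims the trailing markers after the traversal.

```python
from collections import deque


class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def serialize_leetcode_style(root) -> str:
    """Level-order tokens with trailing 'null' markers removed."""
    tokens = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node is None:
            tokens.append("null")
            continue
        tokens.append(str(node.val))
        # Enqueue both children, even when missing, to record the shape
        queue.append(node.left)
        queue.append(node.right)
    # Trailing nulls carry no information once the list ends
    while tokens and tokens[-1] == "null":
        tokens.pop()
    return "[" + ",".join(tokens) + "]"


# Tree from the first example: [1,2,3,null,null,4,5]
example = TreeNode(1, TreeNode(2), TreeNode(3, TreeNode(4), TreeNode(5)))
```

Trimming is safe only if the matching deserializer treats a missing value as null, which means it must bounds-check its index before reading each child.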
common_pitfalls:
- title: Forgetting to Encode Null Children
description: |
Without null markers, you cannot distinguish between different tree structures. For example, consider these two trees:
- Tree A: root=1, left=2, right=null
- Tree B: root=1, left=null, right=2
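If you only serialize non-null values as "1,2", both trees produce the same string! By encoding the root's missing child, Tree A becomes "1,2,null" and Tree B becomes "1,null,2". (The solutions below also emit markers for the leaves' missing children, giving "1,2,null,null,null" and "1,null,2,null,null".)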
wrong_approach: "Only serialize non-null node values"
correct_approach: "Include null markers for missing children"
- title: Delimiter Conflicts with Node Values
description: |
If node values can contain your delimiter character, parsing breaks. For example, if values could be strings containing commas and you use comma as delimiter.
For this problem, node values are integers in `[-1000, 1000]`, so comma is safe. But in general, consider escaping or using a delimiter that can't appear in values.
wrong_approach: "Using a delimiter that could appear in node values"
correct_approach: "Choose a delimiter that's guaranteed not to appear in values"
- title: Off-by-One Errors in BFS Deserialization
description: |
In BFS deserialization, it's easy to lose track of which values correspond to which parent's children. The pattern is: for each parent node dequeued, consume the next two values for its left and right children.
A common mistake is processing children before their parent is dequeued, or consuming too many/few values per parent.
wrong_approach: "Inconsistent pairing of parents to children values"
correct_approach: "Strictly consume two values (left, right) per dequeued parent"
- title: Not Handling Empty Tree
description: |
The empty tree (null root) is a valid input. Your serialization should produce a recognizable empty representation, and deserialization should correctly return `None`.
Forgetting this edge case leads to errors like trying to access `root.val` when root is `None`.
wrong_approach: "Assuming the root is always non-null"
correct_approach: "Emit a null marker for an empty tree and return None when decoding it"
key_takeaways:
- "**Null markers are essential**: They encode the tree's *structure*, not just its values — this is what distinguishes serialization from simple traversal"
- "**Preorder + nulls = unique tree**: Preorder traversal with null markers uniquely identifies any binary tree, enabling perfect reconstruction"
- "**BFS matches intuition**: Level-order serialization is often easier to visualize and debug since it matches how we draw trees"
- "**This pattern appears everywhere**: Serialization concepts apply to JSON parsing, protocol buffers, database storage, and network transmission of complex data structures"
time_complexity: "O(n). Both serialization and deserialization visit each node exactly once, where `n` is the number of nodes in the tree."
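space_complexity: "O(n). The serialized string stores `n` node values plus `n + 1` null markers. The recursion stack (DFS) or queue (BFS) can hold up to O(n) nodes in the worst case (skewed tree). In CPython, a skewed tree near the 10^4-node limit is deeper than the default recursion limit of 1000, so the recursive DFS needs a raised limit (`sys.setrecursionlimit`) or an iterative traversal."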
solutions:
- approach_name: Preorder DFS
is_optimal: true
code: |
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right
class Codec:
    def serialize(self, root: TreeNode | None) -> str:
        """Encodes a tree to a single string using preorder traversal."""
        result = []
        def dfs(node: TreeNode | None) -> None:
            if node is None:
                result.append("null")
                return
            # Preorder: process current node first
            result.append(str(node.val))
            # Then recursively serialize left and right subtrees
            dfs(node.left)
            dfs(node.right)
        dfs(root)
        return ",".join(result)
    def deserialize(self, data: str) -> TreeNode | None:
        """Decodes your encoded data to tree."""
        values = iter(data.split(","))
        def build() -> TreeNode | None:
            val = next(values)
            if val == "null":
                return None
            # Create node and recursively build its subtrees
            node = TreeNode(int(val))
            node.left = build()   # Next values are left subtree
            node.right = build()  # Followed by right subtree
            return node
        return build()
explanation: |
**Time Complexity:** O(n) — Each node is visited once during both serialization and deserialization.
**Space Complexity:** O(n) — The serialized string has O(n) elements. Recursion depth is O(h) where h is tree height, which is O(n) in the worst case (skewed tree).
The preorder approach is elegant because the serialization order naturally matches the deserialization order. When we deserialize, the first value is always the root, followed by its complete left subtree, then its complete right subtree. This recursive structure makes the code concise and easy to reason about.
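Because the recursion depth equals the tree height, a heavily skewed tree can exceed Python's default recursion limit. As one possible workaround, the sketch below produces and consumes the same preorder-with-nulls format using explicit stacks. The function names `serialize_iterative` and `deserialize_iterative` are assumptions, and this is an alternative sketch rather than the solution above.

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def serialize_iterative(root) -> str:
    """Preorder with null markers, using an explicit stack."""
    tokens = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node is None:
            tokens.append("null")
            continue
        tokens.append(str(node.val))
        # Push right first so the left subtree is emitted first
        stack.append(node.right)
        stack.append(node.left)
    return ",".join(tokens)


def deserialize_iterative(data: str):
    """Rebuild a tree from the preorder-with-nulls string."""
    tokens = data.split(",")
    if tokens[0] == "null":
        return None
    root = TreeNode(int(tokens[0]))
    stack = [root]       # Nodes still waiting for their right child
    fill_left = True     # Whether the next token fills a left slot
    for token in tokens[1:]:
        node = None if token == "null" else TreeNode(int(token))
        if fill_left:
            # Parent stays on the stack until its right child arrives
            stack[-1].left = node
        else:
            stack.pop().right = node
        if node is not None:
            stack.append(node)
            fill_left = True
        else:
            fill_left = False
    return root
```

The `fill_left` flag replaces the call stack's implicit bookkeeping: after a real node the next token is its left child, and after a null the next token is the right child of the most recent node still on the stack.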
- approach_name: Level-Order BFS
is_optimal: true
code: |
from collections import deque
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right
class Codec:
    def serialize(self, root: TreeNode | None) -> str:
        """Encodes a tree to a single string using level-order traversal."""
        if root is None:
            return "null"
        result = []
        queue = deque([root])
        while queue:
            node = queue.popleft()
            if node is None:
                result.append("null")
            else:
                result.append(str(node.val))
                # Add children to queue (including None for null markers)
                queue.append(node.left)
                queue.append(node.right)
        return ",".join(result)
    def deserialize(self, data: str) -> TreeNode | None:
        """Decodes your encoded data to tree."""
        values = data.split(",")
        if values[0] == "null":
            return None
        # Create root from first value
        root = TreeNode(int(values[0]))
        queue = deque([root])
        i = 1  # Index for remaining values
        while queue and i < len(values):
            parent = queue.popleft()
            # Left child is next value
            if values[i] != "null":
                parent.left = TreeNode(int(values[i]))
                queue.append(parent.left)
            i += 1
            # Right child is the value after that
            if i < len(values) and values[i] != "null":
                parent.right = TreeNode(int(values[i]))
                queue.append(parent.right)
            i += 1
        return root
explanation: |
**Time Complexity:** O(n) — Each node is visited once during both operations.
**Space Complexity:** O(n) — The queue can hold up to O(n/2) ≈ O(n) nodes at the widest level. The serialized string is O(n).
The BFS approach processes the tree level by level, which produces output in the same level-order layout as LeetCode's tree format, except that trailing `null` markers are kept rather than trimmed. During deserialization, we pair each parent with its two children in order, using a queue to track which parents still need children assigned. This approach is more intuitive for those who think in terms of tree levels.


@@ -0,0 +1,209 @@
title: Set Matrix Zeroes
slug: set-matrix-zeroes
difficulty: medium
leetcode_id: 73
leetcode_url: https://leetcode.com/problems/set-matrix-zeroes/
categories:
- arrays
- hash-tables
patterns:
- matrix-traversal
description: |
Given an `m x n` integer matrix `matrix`, if an element is `0`, set its entire row and column to `0`'s.
You must do it **in place**.
constraints: |
- `m == matrix.length`
- `n == matrix[0].length`
- `1 <= m, n <= 200`
- `-2^31 <= matrix[i][j] <= 2^31 - 1`
examples:
- input: "matrix = [[1,1,1],[1,0,1],[1,1,1]]"
output: "[[1,0,1],[0,0,0],[1,0,1]]"
explanation: "The element at position (1,1) is 0, so its entire row (row 1) and column (column 1) are set to 0."
- input: "matrix = [[0,1,2,0],[3,4,5,2],[1,3,1,5]]"
output: "[[0,0,0,0],[0,4,5,0],[0,3,1,0]]"
explanation: "Zeroes at positions (0,0) and (0,3) cause row 0 and columns 0 and 3 to be zeroed. Only the cells outside that row and those columns (4, 5, 3 and 1) keep their values."
explanation:
intuition: |
Imagine you're playing a game where every zero in a grid is a "bomb" that explodes horizontally and vertically, turning everything in its row and column to zero.
The challenge is that you can't simply iterate through and set values to zero as you find bombs — if you do, you'll create new zeroes that weren't originally there, causing a chain reaction that zeros out the entire matrix.
Think of it like this: you need to first **survey the battlefield** to find all the original bombs, then **detonate them all at once**. The question is: how do we remember where all the bombs are without using extra space proportional to the matrix size?
The key insight is that we can use the **matrix itself as a notepad**. Specifically, we can use the first row and first column as markers to record which rows and columns need to be zeroed. Any row or column that contains a zero ends up entirely zero, so its marker cell in the first column or first row would be zeroed anyway, and writing the marker early loses no information. We just need to handle the first row and column specially, because their own original zeroes must be remembered separately.
approach: |
We solve this using the **In-Place Marker Approach** with O(1) extra space:
**Step 1: Check if first row and first column need zeroing**
- `first_row_has_zero`: Scan the first row for any zero
- `first_col_has_zero`: Scan the first column for any zero
- We need these flags because we'll overwrite the first row/column with markers
&nbsp;
**Step 2: Use first row and column as markers**
- Iterate through the matrix (excluding first row/column)
- If `matrix[i][j] == 0`, set `matrix[i][0] = 0` and `matrix[0][j] = 0`
- This marks row `i` and column `j` for zeroing
&nbsp;
**Step 3: Zero out cells based on markers**
- Iterate through the matrix (excluding first row/column)
- If `matrix[i][0] == 0` or `matrix[0][j] == 0`, set `matrix[i][j] = 0`
- This propagates the zeroes based on our markers
&nbsp;
**Step 4: Handle first row and column**
- If `first_row_has_zero` was true, zero out the entire first row
- If `first_col_has_zero` was true, zero out the entire first column
- This must be done **last** to avoid corrupting our markers
&nbsp;
This approach cleverly reuses space we were going to modify anyway, achieving constant extra space.
common_pitfalls:
- title: Modifying While Iterating
description: |
The most common mistake is setting cells to zero as you find zeroes:
```python
for i in range(m):
    for j in range(n):
        if matrix[i][j] == 0:
            # Set row and column to zero immediately
            for k in range(n): matrix[i][k] = 0
            for k in range(m): matrix[k][j] = 0
```
This creates **new zeroes** that weren't originally there. When you later encounter these new zeroes, you'll zero out even more rows and columns, potentially turning the entire matrix to zeroes.
The fix is to first **record** all original zero positions, then **apply** the changes in a second pass.
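The chain reaction is easy to reproduce on the first example. In the sketch below, the names `zero_immediately` and `zero_two_pass` are assumptions for illustration, and both functions work on a copy so the input stays intact for comparison.

```python
def zero_immediately(matrix: list[list[int]]) -> list[list[int]]:
    """Buggy: zeroes rows/columns while still scanning for zeroes."""
    grid = [row[:] for row in matrix]
    m, n = len(grid), len(grid[0])
    for i in range(m):
        for j in range(n):
            if grid[i][j] == 0:
                for k in range(n):
                    grid[i][k] = 0
                for k in range(m):
                    grid[k][j] = 0
    return grid


def zero_two_pass(matrix: list[list[int]]) -> list[list[int]]:
    """Correct: record original zero positions first, then apply."""
    grid = [row[:] for row in matrix]
    m, n = len(grid), len(grid[0])
    zero_rows = {i for i in range(m) for j in range(n) if grid[i][j] == 0}
    zero_cols = {j for i in range(m) for j in range(n) if grid[i][j] == 0}
    for i in range(m):
        for j in range(n):
            if i in zero_rows or j in zero_cols:
                grid[i][j] = 0
    return grid


example = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
buggy = zero_immediately(example)
fixed = zero_two_pass(example)
```

On this input the buggy version wipes out the whole last column and last row as well, because the freshly written zeroes at `(1, 2)` and `(2, 1)` are later mistaken for original ones.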
wrong_approach: "Setting cells to zero immediately when a zero is found"
correct_approach: "First pass to mark, second pass to zero"
- title: Using O(m*n) Space
description: |
A straightforward but wasteful approach is to create a copy of the matrix:
```python
copy = [[matrix[i][j] for j in range(n)] for i in range(m)]
```
This uses O(m*n) extra space. The problem specifically asks for an in-place solution, and the follow-up challenges you to use O(1) space.
Even using two sets to store which rows and columns need zeroing uses O(m + n) space — better, but not optimal.
wrong_approach: "Copying the entire matrix or storing all zero positions"
correct_approach: "Use the first row and column as markers"
- title: Corrupting Markers Before Using Them
description: |
When using the first row/column as markers, a subtle bug is handling them in the wrong order:
```python
# WRONG: Zeroing first row/column early corrupts markers
if first_row_has_zero:
    for j in range(n): matrix[0][j] = 0  # Destroys column markers!
```
If you zero the first row before using its values as markers, you lose the information about which columns need zeroing.
The fix is to **always handle the first row and column last**, after all other cells have been processed.
wrong_approach: "Zeroing first row/column before processing other cells"
correct_approach: "Process interior cells first, then handle first row/column last"
key_takeaways:
- "**In-place markers**: When you need to mark items for later processing, consider reusing space that will be overwritten anyway"
- "**Two-pass pattern**: Many matrix problems benefit from separating 'marking' and 'applying' into distinct passes to avoid corrupting data"
- "**Order matters**: When using part of the input as auxiliary storage, process it last to avoid destroying information you still need"
- "**Space optimisation progression**: This problem illustrates a common pattern — O(mn) -> O(m+n) -> O(1) — each step requiring more clever use of existing space"
time_complexity: "O(m * n). We traverse the entire matrix a constant number of times (marking pass + zeroing pass)."
space_complexity: "O(1). We only use two boolean variables (`first_row_has_zero` and `first_col_has_zero`), regardless of matrix size."
solutions:
- approach_name: In-Place Markers
is_optimal: true
code: |
def set_zeroes(matrix: list[list[int]]) -> None:
    """
    Modify matrix in-place to zero out rows and columns containing zeroes.
    """
    m, n = len(matrix), len(matrix[0])
    # Step 1: Check if first row/column originally have zeroes
    first_row_has_zero = any(matrix[0][j] == 0 for j in range(n))
    first_col_has_zero = any(matrix[i][0] == 0 for i in range(m))
    # Step 2: Use first row/column as markers for the rest of the matrix
    for i in range(1, m):
        for j in range(1, n):
            if matrix[i][j] == 0:
                matrix[i][0] = 0  # Mark this row
                matrix[0][j] = 0  # Mark this column
    # Step 3: Zero out cells based on markers (excluding first row/column)
    for i in range(1, m):
        for j in range(1, n):
            if matrix[i][0] == 0 or matrix[0][j] == 0:
                matrix[i][j] = 0
    # Step 4: Handle first row and column last
    if first_row_has_zero:
        for j in range(n):
            matrix[0][j] = 0
    if first_col_has_zero:
        for i in range(m):
            matrix[i][0] = 0
explanation: |
**Time Complexity:** O(m * n) — We make two passes through the matrix.
**Space Complexity:** O(1) — Only two boolean flags used.
This solution cleverly uses the first row and column as storage for markers, achieving constant space. The key is processing the first row/column last so we don't corrupt our markers.
- approach_name: Hash Sets
is_optimal: false
code: |
def set_zeroes(matrix: list[list[int]]) -> None:
    """
    Uses O(m + n) space to track which rows and columns to zero.
    """
    m, n = len(matrix), len(matrix[0])
    # Track which rows and columns contain zeroes
    zero_rows = set()
    zero_cols = set()
    # First pass: find all zeroes
    for i in range(m):
        for j in range(n):
            if matrix[i][j] == 0:
                zero_rows.add(i)
                zero_cols.add(j)
    # Second pass: zero out marked rows and columns
    for i in range(m):
        for j in range(n):
            if i in zero_rows or j in zero_cols:
                matrix[i][j] = 0
explanation: |
**Time Complexity:** O(m * n) — Two passes through the matrix.
**Space Complexity:** O(m + n) — Sets can contain at most m rows and n columns.
This approach is more intuitive but uses extra space. It's a good stepping stone to understanding the optimal solution — the key insight is that we can encode the same information in the matrix itself.


@@ -0,0 +1,188 @@
title: Simplify Path
slug: simplify-path
difficulty: medium
leetcode_id: 71
leetcode_url: https://leetcode.com/problems/simplify-path/
categories:
- strings
- stack
patterns:
- stack
description: |
You are given an *absolute* path for a Unix-style file system, which always begins with a slash `'/'`. Your task is to transform this absolute path into its **simplified canonical path**.
The *rules* of a Unix-style file system are as follows:
- A single period `'.'` represents the current directory.
- A double period `'..'` represents the previous/parent directory.
- Multiple consecutive slashes such as `'//'` and `'///'` are treated as a single slash `'/'`.
- Any sequence of periods that does **not match** the rules above should be treated as a **valid directory or file name**. For example, `'...'` and `'....'` are valid directory or file names.
The simplified canonical path should follow these *rules*:
- The path must start with a single slash `'/'`.
- Directories within the path must be separated by exactly one slash `'/'`.
- The path must not end with a slash `'/'`, unless it is the root directory.
- The path must not have any single or double periods (`'.'` and `'..'`) used to denote current or parent directories.
Return the **simplified canonical path**.
constraints: |
- `1 <= path.length <= 3000`
- `path` consists of English letters, digits, period `'.'`, slash `'/'` or `'_'`.
- `path` is a valid absolute Unix path.
examples:
- input: 'path = "/home/"'
output: '"/home"'
explanation: "The trailing slash should be removed."
- input: 'path = "/home//foo/"'
output: '"/home/foo"'
explanation: "Multiple consecutive slashes are replaced by a single one."
- input: 'path = "/home/user/Documents/../Pictures"'
output: '"/home/user/Pictures"'
explanation: 'A double period ".." refers to the directory up a level (the parent directory).'
- input: 'path = "/../"'
output: '"/"'
explanation: "Going one level up from the root directory is not possible."
- input: 'path = "/.../a/../b/c/../d/./"'
output: '"/.../b/d"'
explanation: '"..." is a valid name for a directory in this problem.'
explanation:
intuition: |
Think of navigating a file system in a terminal. When you type `cd ..`, you go up one directory. When you type `cd .`, you stay in the same directory. And when you type `cd foldername`, you enter that folder.
The key insight is that a **stack** perfectly models directory navigation:
- Entering a directory = **push** the directory name onto the stack
- Going up a level (`..`) = **pop** from the stack (if not empty)
- Staying in place (`.`) = do nothing
Imagine the stack as your "breadcrumb trail" of directories from root to your current location. Each valid directory name adds a breadcrumb; `..` removes the last breadcrumb; `.` keeps things unchanged.
After processing all parts of the path, the stack contains exactly the directories in the canonical path, from root to deepest level. Join them with `'/'` and prepend a `'/'` to get the answer.
approach: |
We solve this using a **Stack-Based Path Simplification**:
**Step 1: Split the path into components**
- Split the input path by `'/'` to get individual components
- This automatically handles multiple consecutive slashes — they produce empty strings which we'll filter out
&nbsp;
**Step 2: Process each component with a stack**
- Initialise an empty stack to track valid directory names
- For each component:
- If it's `'..'`: pop from the stack (if not empty) — go up one level
- If it's `'.'` or empty string: skip — stay in current directory
- Otherwise: push onto the stack — it's a valid directory name
&nbsp;
**Step 3: Reconstruct the canonical path**
- Join all elements in the stack with `'/'`
- Prepend `'/'` to ensure the path starts with root
- If the stack is empty, return `'/'` (the root directory)
&nbsp;
Example walkthrough with `"/home/user/Documents/../Pictures"`:
- Split: `['', 'home', 'user', 'Documents', '..', 'Pictures']`
- Process: `''` skip → `'home'` push → `'user'` push → `'Documents'` push → `'..'` pop → `'Pictures'` push
- Stack: `['home', 'user', 'Pictures']`
- Result: `'/home/user/Pictures'`
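The walkthrough above can be reproduced with a small tracing sketch. The helper name `trace_simplify` is hypothetical and exists only for illustration; it records a copy of the stack after each component so the push and pop behaviour is visible.

```python
def trace_simplify(path: str) -> list[list[str]]:
    """Return a snapshot of the stack after each component (illustrative helper)."""
    stack: list[str] = []
    snapshots: list[list[str]] = []
    for component in path.split('/'):
        if component == '..':
            # Go up one level, but never above the root
            if stack:
                stack.pop()
        elif component not in ('', '.'):
            # Any other non-empty component is a directory name
            stack.append(component)
        snapshots.append(list(stack))
    return snapshots


snapshots = trace_simplify("/home/user/Documents/../Pictures")
final_path = '/' + '/'.join(snapshots[-1])
```

The fourth snapshot still contains `'Documents'`; the fifth, taken after `'..'`, no longer does.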
common_pitfalls:
- title: Treating All Periods as Special
description: |
Only `'.'` (single period) and `'..'` (double period) have special meaning. Sequences like `'...'`, `'....'`, or `'.hidden'` are **valid directory names**.
Don't use regex like `/\.+/` to match all period sequences — check for exact matches.
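As a rough sketch of the difference, the two hypothetical helpers below classify a few component names. `re.fullmatch` is used here only to emulate the over-eager pattern; the exact-match version is the one the stack solution relies on.

```python
import re


def is_special_regex(component: str) -> bool:
    # Pitfall: treats ANY run of periods as a navigation component
    return re.fullmatch(r'\.+', component) is not None


def is_special_exact(component: str) -> bool:
    # Only '.' and '..' carry special meaning
    return component in ('.', '..')


names = ['.', '..', '...', '....', '.hidden']
regex_flags = [is_special_regex(name) for name in names]
exact_flags = [is_special_exact(name) for name in names]
```

The regex version wrongly flags `'...'` and `'....'`, which would silently drop valid directories from the result.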
wrong_approach: "Treating '...' or '....' as special navigation"
correct_approach: "Only '.' and '..' are special; others are valid names"
- title: Popping from Empty Stack
description: |
When encountering `'..'` at the root level, there's nothing to pop. Trying to go above root is a no-op in Unix.
For example, `'/../'` should return `'/'`, not cause an error.
wrong_approach: "stack.pop() without checking if stack is empty"
correct_approach: "if stack: stack.pop()"
- title: Forgetting the Leading Slash
description: |
The canonical path must start with `'/'`. After joining stack elements, don't forget to prepend it.
`['home', 'foo']` should become `'/home/foo'`, not `'home/foo'`.
wrong_approach: "'/'.join(stack)"
correct_approach: "'/' + '/'.join(stack)"
- title: Trailing Slash in Result
description: |
The canonical path must not end with a slash (unless it's just `'/'`).
Be careful not to add a trailing slash when reconstructing the path.
wrong_approach: "'/' + '/'.join(stack) + '/'"
correct_approach: "'/' + '/'.join(stack)"
key_takeaways:
- "**Stack for hierarchical navigation**: Stacks naturally model push/pop operations like directory traversal"
- "**Split to simplify**: Splitting by delimiter handles multiple consecutive delimiters elegantly"
- "**Only exact matches are special**: `'.'` and `'..'` are special; everything else (including `'...'`) is a valid name"
- "**Real-world application**: `os.path.normpath()` performs a very similar lexical normalisation, while `realpath` goes further and resolves symbolic links against the actual file system"
time_complexity: "O(n). We iterate through the path once to split it, then process each component once."
space_complexity: "O(n). In the worst case, the stack stores all directory names from the path."
solutions:
- approach_name: Stack-Based Simplification
is_optimal: true
code: |
def simplify_path(path: str) -> str:
    # Split by '/': consecutive slashes produce empty strings
    components = path.split('/')
    stack = []
    for component in components:
        if component == '..':
            # Go up one level (if possible)
            if stack:
                stack.pop()
        elif component == '.' or component == '':
            # Current directory or empty: skip
            continue
        else:
            # Valid directory name: add to path
            stack.append(component)
    # Reconstruct canonical path with leading slash
    return '/' + '/'.join(stack)
explanation: |
**Time Complexity:** O(n) — We process each character of the path a constant number of times (split + iteration).
**Space Complexity:** O(n) — The stack and split result can each hold up to O(n) elements in the worst case.
We split the path, filter components using a stack (push for valid names, pop for `..`, skip for `.` and empty), then join with slashes. The leading `'/'` ensures proper format.
- approach_name: Using Built-in (Python)
is_optimal: false
code: |
import posixpath
def simplify_path(path: str) -> str:
    # posixpath.normpath always applies Unix rules, whatever the host OS
    normalised = posixpath.normpath(path)
    # normpath keeps exactly two leading slashes (allowed by POSIX); collapse them
    if normalised.startswith('//'):
        normalised = normalised[1:]
    return normalised
explanation: |
**Time Complexity:** O(n) - `normpath` processes the path linearly.
**Space Complexity:** O(n) - Creates a new string for the result.
This leverages Python's built-in path normalisation. `posixpath` is used rather than `os.path` so the result does not depend on the host OS (on Windows, `os.path.normpath` follows Windows rules and returns backslashes). One difference from the problem statement remains: `normpath` preserves exactly two leading slashes, so that case is collapsed by hand. The manual stack approach demonstrates the algorithm itself; this version is included to show the real-world equivalent.


@@ -0,0 +1,192 @@
title: Single Number
slug: single-number
difficulty: easy
leetcode_id: 136
leetcode_url: https://leetcode.com/problems/single-number/
categories:
- arrays
- math
patterns:
- bit-manipulation
function_signature: "def single_number(nums: list[int]) -> int:"
test_cases:
visible:
- input: { nums: [2, 2, 1] }
expected: 1
- input: { nums: [4, 1, 2, 1, 2] }
expected: 4
- input: { nums: [1] }
expected: 1
hidden:
- input: { nums: [5, 3, 5] }
expected: 3
- input: { nums: [-1, -1, -2] }
expected: -2
- input: { nums: [0, 1, 0] }
expected: 1
description: |
Given a **non-empty** array of integers `nums`, every element appears *twice* except for one. Find that single one.
You must implement a solution with a linear runtime complexity and use only constant extra space.
constraints: |
- `1 <= nums.length <= 3 * 10^4`
- `-3 * 10^4 <= nums[i] <= 3 * 10^4`
- Each element in the array appears twice except for one element which appears only once.
examples:
- input: "nums = [2,2,1]"
output: "1"
explanation: "The element 1 appears only once, while 2 appears twice."
- input: "nums = [4,1,2,1,2]"
output: "4"
explanation: "The element 4 appears only once, while 1 and 2 each appear twice."
- input: "nums = [1]"
output: "1"
explanation: "There is only one element, so it must be the single number."
explanation:
intuition: |
At first glance, you might think of using a hash set or hash map to count occurrences. But the problem explicitly requires **constant space**, ruling out data structures that grow with input size.
The key insight lies in a special property of the **XOR (exclusive or)** operation:
- `a XOR a = 0` — any number XORed with itself equals zero
- `a XOR 0 = a` — any number XORed with zero equals itself
- XOR is **commutative** and **associative** — the order doesn't matter
Think of it like this: imagine each number is a light switch. When you flip a switch twice (the number appears twice), it returns to its original position (cancels out to zero). When you flip a switch only once (the single number), it stays flipped.
If we XOR all numbers together, every pair will cancel out (`a XOR a = 0`), leaving only the single number that has no pair to cancel with.
approach: |
We solve this using **XOR Bit Manipulation**:
**Step 1: Initialise the result**
- `result`: Set to `0` since `a XOR 0 = a` (XORing with zero doesn't change a value)
&nbsp;
**Step 2: Iterate through all numbers**
- For each number in the array, XOR it with `result`
- Pairs of identical numbers will cancel each other out: `result XOR a XOR a = result`
- The single number will remain: `0 XOR single = single`
&nbsp;
**Step 3: Return the result**
- After XORing all elements, `result` contains only the single number
- All pairs have cancelled to zero, leaving just the unpaired element
&nbsp;
This works because XOR is both commutative (order doesn't matter) and associative (grouping doesn't matter), so the pairs will always find each other and cancel out regardless of their positions in the array.
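To make the cancellation concrete, here is a small tracing sketch. The helper name `xor_trace` is made up for this illustration; it returns the running value of `result` after each element.

```python
def xor_trace(nums: list[int]) -> list[int]:
    """Return the running XOR after each element (illustrative helper)."""
    result = 0
    history = []
    for num in nums:
        result ^= num
        history.append(result)
    return history


history = xor_trace([4, 1, 2, 1, 2])
```

The intermediate values (`4, 5, 7, 6`) look unrelated to the input, but the final value is `4` because both `1`s and both `2`s have cancelled, even though the pairs are not adjacent.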
common_pitfalls:
- title: Using Hash Map for Counting
description: |
A natural instinct is to use a hash map to count occurrences of each number, then find the one with count 1.
While this works correctly with O(n) time complexity, it uses **O(n) space** to store the counts. The problem explicitly requires O(1) space, so this approach violates the constraints.
For interviews, always read the space complexity requirement carefully before choosing your approach.
wrong_approach: "Hash map to count occurrences"
correct_approach: "XOR all elements together"
- title: Using Math with Sum
description: |
Another approach: find all unique numbers, sum them and multiply by 2, then subtract the original sum. The difference is the single number.
For example, with `[4,1,2,1,2]`: unique sum = `4+1+2 = 7`, so `2*7 - 10 = 4`.
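However, this requires storing unique elements (O(n) space) or sorting (O(n log n) time), neither of which meets the constraints. In languages with fixed-width integers the intermediate sums can also overflow when values are large, although the constraints here are small enough to avoid that and Python integers do not overflow at all.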
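For reference, a minimal sketch of this sum-based idea is shown below. The function name `single_number_by_sum` is an assumption for illustration; the point is that the `set` is where the extra O(n) space goes.

```python
def single_number_by_sum(nums: list[int]) -> int:
    # The set holds every distinct value, so this needs O(n) extra space
    unique = set(nums)
    # Each paired value is counted twice on both sides and cancels;
    # the single value is counted twice on the left and once on the right
    return 2 * sum(unique) - sum(nums)


answer = single_number_by_sum([4, 1, 2, 1, 2])
```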
wrong_approach: "2 * sum(set) - sum(array)"
correct_approach: "XOR all elements together"
- title: Not Understanding XOR Properties
description: |
If you're unfamiliar with XOR, you might not realise that `a XOR a = 0` and `a XOR 0 = a`. These properties are essential to understand why the solution works.
XOR returns 1 when bits are different, 0 when same. So any number XORed with itself has all bits become 0.
Practice XOR operations: `5 XOR 5 = 0`, `5 XOR 0 = 5`, `5 XOR 3 XOR 5 = 3`.
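A bit-level view can help here. The sketch below uses Python's built-in `format` with the `'03b'` format spec to show three binary digits; the variable names are only illustrative.

```python
a, b = 5, 3

rows = {
    'a':     format(a, '03b'),      # 101
    'b':     format(b, '03b'),      # 011
    'a ^ b': format(a ^ b, '03b'),  # 110: the two high bits differ, the low bit matches
    'a ^ a': format(a ^ a, '03b'),  # 000: every bit matches itself
}

# The practice identities from the text
checks = (5 ^ 5, 5 ^ 0, 5 ^ 3 ^ 5)
```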
key_takeaways:
- "**XOR for pair cancellation**: When elements appear in pairs, XORing all elements leaves only the unpaired one"
- "**Bit manipulation for O(1) space**: XOR operates on the number itself without extra storage, meeting strict space constraints"
- "**Know your XOR properties**: `a XOR a = 0`, `a XOR 0 = a`, and XOR is commutative and associative"
- "**Foundation for harder problems**: This pattern extends to problems like finding two single numbers (using XOR with bit partitioning)"
time_complexity: "O(n). We traverse the array exactly once, performing a constant-time XOR operation for each element."
space_complexity: "O(1). We only use a single variable (`result`) regardless of the input size."
solutions:
- approach_name: XOR Bit Manipulation
is_optimal: true
code: |
def single_number(nums: list[int]) -> int:
    # Start with 0 since a XOR 0 = a
    result = 0
    for num in nums:
        # XOR each number with result
        # Pairs cancel out: a XOR a = 0
        # Single number remains: 0 XOR single = single
        result ^= num
    return result
explanation: |
**Time Complexity:** O(n) — Single pass through the array.
**Space Complexity:** O(1) — Only one variable used.
We XOR all numbers together. Since `a XOR a = 0` and `a XOR 0 = a`, all pairs cancel out, leaving only the single number. This is the most elegant solution that meets both time and space requirements.
- approach_name: Hash Set
is_optimal: false
code: |
def single_number(nums: list[int]) -> int:
    seen = set()
    for num in nums:
        if num in seen:
            # Second occurrence: remove from set
            seen.remove(num)
        else:
            # First occurrence: add to set
            seen.add(num)
    # The only remaining element is the single number
    return seen.pop()
explanation: |
**Time Complexity:** O(n) — Single pass with O(1) set operations.
**Space Complexity:** O(n) — Set can grow up to n/2 elements.
This approach adds numbers to a set on first occurrence and removes them on second occurrence. The single number stays in the set. While correct, it violates the O(1) space constraint and would not be accepted in an interview requiring constant space.
- approach_name: Hash Map Counting
is_optimal: false
code: |
from collections import Counter
def single_number(nums: list[int]) -> int:
    # Count occurrences of each number
    counts = Counter(nums)
    # Find the number with count 1
    for num, count in counts.items():
        if count == 1:
            return num
explanation: |
**Time Complexity:** O(n) — One pass to count, one pass to find.
**Space Complexity:** O(n) — Hash map stores all unique elements.
This is the most intuitive approach: count each number and find the one appearing once. While easy to understand, it uses O(n) space and doesn't meet the problem's constant space requirement. Included to show the progression from intuitive to optimal solutions.


@@ -0,0 +1,205 @@
title: Sliding Window Maximum
slug: sliding-window-maximum
difficulty: hard
leetcode_id: 239
leetcode_url: https://leetcode.com/problems/sliding-window-maximum/
categories:
- arrays
- queue
- heap
patterns:
- sliding-window
- monotonic-queue
description: |
You are given an array of integers `nums` and a sliding window of size `k` that moves from the very left of the array to the very right. You can only see the `k` numbers in the window. Each time, the sliding window moves right by one position.
Return *the max sliding window*, that is, the maximum of each window in order.
constraints: |
- `1 <= nums.length <= 10^5`
- `-10^4 <= nums[i] <= 10^4`
- `1 <= k <= nums.length`
examples:
- input: "nums = [1,3,-1,-3,5,3,6,7], k = 3"
output: "[3,3,5,5,6,7]"
explanation: |
Window position                Max
---------------               -----
[1  3  -1] -3  5  3  6  7       3
 1 [3  -1  -3] 5  3  6  7       3
 1  3 [-1  -3  5] 3  6  7       5
 1  3  -1 [-3  5  3] 6  7       5
 1  3  -1  -3 [5  3  6] 7       6
 1  3  -1  -3  5 [3  6  7]      7
- input: "nums = [1], k = 1"
output: "[1]"
explanation: "With a window size of 1, the maximum is always the single element in the window."
explanation:
intuition: |
Imagine you're looking through a window of fixed size `k` that slides across an array. At each position, you need to report the largest value visible through that window. The naive approach would be to scan all `k` elements for each window position, but with up to `10^5` elements, this O(n×k) approach is too slow.
The key insight is that **not all elements in the window matter**. If you see a large element, any smaller elements to its *left* within the window can never be the maximum — they'll leave the window before the larger element does.
Think of it like a queue where people line up by strength: when a stronger person arrives, weaker people already in line might as well leave — they'll never be the strongest while the new person is there. This is a **monotonic decreasing queue**: elements are ordered from largest to smallest, and smaller elements get removed when a larger one enters.
The front of this queue always holds the current maximum. When it slides out of the window (based on index), we remove it and the next largest becomes the answer.
approach: |
We solve this using a **Monotonic Decreasing Deque**:
**Step 1: Initialise the deque**
- `deque`: Stores *indices* (not values) of elements in decreasing order of their values
- `result`: Collects the maximum for each window position
&nbsp;
**Step 2: Process each element**
For each index `i`:
- **Remove expired indices**: If the index at the front of the deque is outside the current window (`deque[0] <= i - k`), pop it from the front
- **Maintain monotonic order**: While the deque is not empty and `nums[i]` is greater than or equal to the value at the back index, pop from the back — these elements can never be the maximum
- **Add current index**: Push `i` to the back of the deque
&nbsp;
**Step 3: Record the maximum**
- Once we've processed at least `k` elements (`i >= k - 1`), the front of the deque is the index of the maximum in the current window
- Append `nums[deque[0]]` to the result
&nbsp;
**Why store indices instead of values?**
We need to know when an element has left the window. By storing indices, we can check if `deque[0] <= i - k` to determine if the maximum has expired.
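The steps above can be traced on the first example. The sketch below uses a hypothetical helper, `trace_window`, that records the indices held in the deque after each element is processed.

```python
from collections import deque


def trace_window(nums: list[int], k: int) -> list[list[int]]:
    """Return the deque contents (indices) after processing each element."""
    dq: deque[int] = deque()
    steps = []
    for i, num in enumerate(nums):
        # Drop the front index once it falls out of the window
        if dq and dq[0] <= i - k:
            dq.popleft()
        # Drop back indices whose values cannot be a future maximum
        while dq and nums[dq[-1]] <= num:
            dq.pop()
        dq.append(i)
        steps.append(list(dq))
    return steps


nums = [1, 3, -1, -3, 5, 3, 6, 7]
steps = trace_window(nums, 3)
# From i = k - 1 onwards, the front index points at each window's maximum
maxima = [nums[step[0]] for step in steps[2:]]
```

At `i = 4` the front index `1` expires and the value `5` evicts everything behind it, so the deque collapses to `[4]`.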
common_pitfalls:
- title: Using a Max-Heap Without Index Tracking
description: |
A max-heap seems natural for finding maximums, but the heap may contain elements that have already left the window. You'd need to lazily remove them by checking indices, adding complexity.
The deque approach is cleaner because we proactively remove elements when they're no longer useful *or* when they leave the window.
wrong_approach: "Max-heap with values only"
correct_approach: "Monotonic deque with indices"
- title: O(n×k) Brute Force
description: |
Scanning all `k` elements for each window results in O(n×k) time. The worst case is `k` close to `n / 2`: with `n = 10^5` and `k = 5 × 10^4`, that is roughly 2.5 × 10^9 operations, which is far too slow.
Each element should enter and leave the deque at most once, giving amortised O(1) per element and O(n) overall.
wrong_approach: "Nested loop: for each window, scan all k elements"
correct_approach: "Monotonic deque: each element processed once"
- title: Forgetting to Remove Expired Elements
description: |
The deque front might hold an index from outside the current window. Always check if `deque[0] <= i - k` and pop it *before* recording the maximum.
Failing to do this returns maximums from elements no longer in the window.
wrong_approach: "Only removing from back, ignoring front expiration"
correct_approach: "Check and pop front if index is outside window"
- title: Storing Values Instead of Indices
description: |
If you store values, you can't tell when an element has left the window. Two elements might have the same value at different positions — you need indices to track window boundaries.
wrong_approach: "deque stores nums[i] directly"
correct_approach: "deque stores i (index), lookup nums[i] when needed"
key_takeaways:
- "**Monotonic deque pattern**: Maintain elements in sorted order; remove smaller elements when a larger one arrives"
- "**Store indices, not values**: Enables tracking whether elements have left the sliding window"
- "**Amortised O(1) per element**: Each element enters and exits the deque at most once"
- "**Classic hard problem**: Appears frequently in interviews; mastering this unlocks many similar problems"
time_complexity: "O(n). Each element is pushed and popped from the deque at most once, giving amortised O(1) per element."
space_complexity: "O(k). The deque holds at most `k` indices (the current window size)."
solutions:
- approach_name: Monotonic Deque
is_optimal: true
code: |
from collections import deque
def max_sliding_window(nums: list[int], k: int) -> list[int]:
    # Deque stores indices in decreasing order of their values
    dq = deque()
    result = []
    for i, num in enumerate(nums):
        # Remove indices outside the current window
        if dq and dq[0] <= i - k:
            dq.popleft()
        # Remove indices of smaller elements (they'll never be max)
        while dq and nums[dq[-1]] <= num:
            dq.pop()
        # Add current index
        dq.append(i)
        # Window is complete, record the maximum
        if i >= k - 1:
            result.append(nums[dq[0]])
    return result
explanation: |
**Time Complexity:** O(n) — Each element is pushed and popped at most once.
**Space Complexity:** O(k) — Deque holds at most k indices.
The deque maintains indices in decreasing order of their values. When a new element arrives, we remove all smaller elements from the back (they can never be the maximum while the new element is in the window). The front always holds the current maximum's index.
- approach_name: Heap with Lazy Deletion
is_optimal: false
code: |
import heapq
def max_sliding_window(nums: list[int], k: int) -> list[int]:
    # Max-heap stores (-value, index) for max extraction
    heap = []
    result = []
    for i in range(len(nums)):
        # Add current element (negate for max-heap behavior)
        heapq.heappush(heap, (-nums[i], i))
        # Remove elements outside the window (lazy deletion)
        while heap[0][1] <= i - k:
            heapq.heappop(heap)
        # Record maximum once window is complete
        if i >= k - 1:
            result.append(-heap[0][0])
    return result
explanation: |
**Time Complexity:** O(n log n) — Each push/pop is O(log n), and we may have up to n elements in heap.
**Space Complexity:** O(n) — Heap may accumulate elements with lazy deletion.
Uses a max-heap (simulated with negated values) to track the maximum. Elements outside the window are lazily removed only when they reach the top. While correct, this is less efficient than the deque approach due to logarithmic heap operations.
- approach_name: Brute Force
is_optimal: false
code: |
def max_sliding_window(nums: list[int], k: int) -> list[int]:
    result = []
    n = len(nums)
    # For each window position
    for i in range(n - k + 1):
        # Find max in current window
        window_max = max(nums[i:i + k])
        result.append(window_max)
    return result
explanation: |
**Time Complexity:** O(n × k) — For each of (n - k + 1) windows, we scan k elements.
**Space Complexity:** O(k) - Each slice `nums[i:i + k]` copies the window (not counting the output); scanning the window by index instead would need only O(1).
The straightforward approach: for each window position, scan all k elements to find the maximum. Simple to understand but too slow for large inputs. With n = 10^5 and k = 5 × 10^4, this would require roughly 2.5 × 10^9 operations.


@@ -0,0 +1,266 @@
title: Sort an Array
slug: sort-an-array
difficulty: medium
leetcode_id: 912
leetcode_url: https://leetcode.com/problems/sort-an-array/
categories:
- arrays
- sorting
- recursion
patterns:
- divide-and-conquer
description: |
Given an array of integers `nums`, sort the array in ascending order and return it.
You must solve the problem **without using any built-in** functions in `O(n log n)` time complexity and with the smallest space complexity possible.
constraints: |
- `1 <= nums.length <= 5 * 10^4`
- `-5 * 10^4 <= nums[i] <= 5 * 10^4`
examples:
- input: "nums = [5,2,3,1]"
output: "[1,2,3,5]"
explanation: "After sorting the array, the positions of some numbers are not changed (for example, 2 and 3), while the positions of other numbers are changed (for example, 1 and 5)."
- input: "nums = [5,1,1,2,0,0]"
output: "[0,0,1,1,2,5]"
explanation: "Note that the values of nums are not necessarily unique."
explanation:
intuition: |
Imagine you have a messy deck of cards and need to sort them. One natural approach is **divide and conquer**: split the deck in half, sort each half separately, and then merge them back together.
This is the essence of **Merge Sort**. The key insight is that merging two *already sorted* arrays is easy and efficient - you just compare the front elements of each array and pick the smaller one, repeating until both arrays are exhausted.
Think of it like this: sorting a huge pile of papers is overwhelming, but sorting two small piles and then combining them? That's manageable. By recursively applying this principle, we break the problem into trivially small pieces (single elements, which are inherently sorted) and build up the solution.
The constraint requiring `O(n log n)` time rules out simple algorithms like Bubble Sort or Insertion Sort (which are `O(n^2)`). Merge Sort guarantees `O(n log n)` in all cases - worst, average, and best - making it a reliable choice.
approach: |
We solve this using **Merge Sort**, a classic divide-and-conquer algorithm:
**Step 1: Base case**
- If the array has 0 or 1 elements, it's already sorted - return it as-is
- This is the termination condition for our recursion
&nbsp;
**Step 2: Divide the array**
- Find the middle index: `mid = len(nums) // 2`
- Split into two halves: `left = nums[:mid]` and `right = nums[mid:]`
&nbsp;
**Step 3: Recursively sort each half**
- Call merge sort on the left half
- Call merge sort on the right half
- Trust the recursion - each half will come back sorted
&nbsp;
**Step 4: Merge the sorted halves**
- Create a result array
- Use two pointers, one for each sorted half
- Compare elements at both pointers, add the smaller one to result
- Advance the pointer of the array from which we took the element
- When one array is exhausted, append all remaining elements from the other
&nbsp;
**Step 5: Return the merged result**
- The merged array is now fully sorted
- This bubbles up through the recursion to produce the final sorted array
common_pitfalls:
- title: Using O(n^2) Sorting Algorithms
description: |
Simple algorithms like Bubble Sort, Selection Sort, or Insertion Sort have `O(n^2)` time complexity.
With `n = 5 * 10^4`, this means up to 2.5 billion operations - far too slow and will result in **Time Limit Exceeded (TLE)**.
The problem explicitly requires `O(n log n)`, so you must use Merge Sort, Heap Sort, or Quick Sort (with proper pivot selection).
wrong_approach: "Nested loops comparing/swapping adjacent elements"
correct_approach: "Divide and conquer with O(n log n) guarantee"
- title: Quick Sort Worst Case
description: |
While Quick Sort is often fast in practice, its worst case is `O(n^2)`. This happens when the pivot is repeatedly the smallest or largest element, for example on an already sorted array when the first or last element is always chosen as the pivot.
LeetCode tests often include edge cases that trigger this worst case. Merge Sort avoids this entirely with guaranteed `O(n log n)` performance.
If using Quick Sort, randomize pivot selection or use median-of-three to avoid the worst case on sorted input, and make sure the partition scheme copes with arrays made up largely of duplicates.
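As a sketch of the median-of-three idea mentioned above, the hypothetical helper below picks the index of the median among the first, middle, and last elements of a range. It only chooses a pivot; the partitioning itself is left out.

```python
def median_of_three_index(nums: list[int], low: int, high: int) -> int:
    """Return whichever of low, mid, high holds the median of the three values."""
    mid = (low + high) // 2
    a, b, c = nums[low], nums[mid], nums[high]
    if a <= b <= c or c <= b <= a:
        return mid
    if b <= a <= c or c <= a <= b:
        return low
    return high


# On sorted input the middle element is chosen, which splits the range evenly
sorted_choice = median_of_three_index([1, 2, 3, 4, 5], 0, 4)
```

This is a heuristic rather than a guarantee: carefully constructed inputs can still defeat it, which is why random pivots are the more common recommendation.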
wrong_approach: "Quick Sort with first/last element as pivot"
correct_approach: "Merge Sort or Quick Sort with randomized pivot"
- title: Inefficient Merging
description: |
A common mistake is using inefficient operations during the merge step, like repeatedly inserting at the beginning of a list (`O(n)` per insert) or using `pop(0)` in Python.
Always append to the end of your result array and use index pointers to track position in the source arrays.
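The two styles can be compared side by side. In the sketch below, both helper names are made up for illustration; they return the same merged list, but the first pays O(n) for every `pop(0)` while the second only advances indices.

```python
def merge_with_pop(left: list[int], right: list[int]) -> list[int]:
    # Pitfall version: pop(0) shifts every remaining element, O(n) per call
    left, right = list(left), list(right)  # copy so the inputs are untouched
    result = []
    while left and right:
        if left[0] <= right[0]:
            result.append(left.pop(0))
        else:
            result.append(right.pop(0))
    return result + left + right


def merge_with_pointers(left: list[int], right: list[int]) -> list[int]:
    # Preferred version: indices advance in O(1), appends are amortised O(1)
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result


slow = merge_with_pop([1, 5, 8], [2, 3])
fast = merge_with_pointers([1, 5, 8], [2, 3])
```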
wrong_approach: "Using pop(0) or insert(0) during merge"
correct_approach: "Use index pointers and append to result"
- title: Not Handling Remaining Elements
description: |
After the main merge loop, one of the two halves may still have elements remaining.
Forgetting to append these leftover elements results in missing data in the output. Always extend the result with any remaining elements from both halves.
wrong_approach: "Stopping when one array is exhausted"
correct_approach: "Append remaining elements from both arrays after loop"
key_takeaways:
- "**Divide and conquer**: Break large problems into smaller subproblems, solve them, and combine results"
- "**Merge Sort guarantees**: Unlike Quick Sort, Merge Sort is `O(n log n)` in all cases - no worst-case degradation"
- "**Space tradeoff**: Merge Sort uses `O(n)` extra space for the merge step, but this is acceptable for guaranteed performance"
- "**Foundation for advanced algorithms**: The merge technique appears in external sorting, merge intervals, and many other problems"
time_complexity: "O(n log n). We divide the array log n times, and each level of recursion does O(n) work to merge."
space_complexity: "O(n). We need auxiliary space for the temporary arrays during merging. The recursion stack adds O(log n), but O(n) dominates."
solutions:
- approach_name: Merge Sort
is_optimal: true
code: |
def sortArray(nums: list[int]) -> list[int]:
    # Base case: single element or empty array is already sorted
    if len(nums) <= 1:
        return nums
    # Divide: find the middle and split
    mid = len(nums) // 2
    left = nums[:mid]
    right = nums[mid:]
    # Conquer: recursively sort both halves
    sorted_left = sortArray(left)
    sorted_right = sortArray(right)
    # Combine: merge the two sorted halves
    return merge(sorted_left, sorted_right)
def merge(left: list[int], right: list[int]) -> list[int]:
    result = []
    i = j = 0
    # Compare elements from both arrays and add smaller one
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # Add any remaining elements from left array
    while i < len(left):
        result.append(left[i])
        i += 1
    # Add any remaining elements from right array
    while j < len(right):
        result.append(right[j])
        j += 1
    return result
explanation: |
**Time Complexity:** O(n log n) - We divide the array into halves log n times, and merging at each level takes O(n) total.
**Space Complexity:** O(n) - We need extra space for the temporary arrays created during slicing and merging.
This is the classic Merge Sort implementation. It recursively divides the array until we have single elements, then merges them back in sorted order. The `<=` comparison in merge ensures stability (equal elements maintain their relative order).
- approach_name: Heap Sort
is_optimal: true
code: |
def sortArray(nums: list[int]) -> list[int]:
    n = len(nums)
    # Build max heap - start from last non-leaf node
    for i in range(n // 2 - 1, -1, -1):
        heapify(nums, n, i)
    # Extract elements from heap one by one
    for i in range(n - 1, 0, -1):
        # Move current root (max) to end
        nums[0], nums[i] = nums[i], nums[0]
        # Heapify the reduced heap
        heapify(nums, i, 0)
    return nums
def heapify(nums: list[int], n: int, i: int) -> None:
    largest = i  # Initialize largest as root
    left = 2 * i + 1  # Left child index
    right = 2 * i + 2  # Right child index
    # Check if left child exists and is greater than root
    if left < n and nums[left] > nums[largest]:
        largest = left
    # Check if right child exists and is greater than current largest
    if right < n and nums[right] > nums[largest]:
        largest = right
    # If largest is not root, swap and continue heapifying
    if largest != i:
        nums[i], nums[largest] = nums[largest], nums[i]
        heapify(nums, n, largest)
explanation: |
**Time Complexity:** O(n log n) - Building the heap takes O(n), and we perform n extractions each taking O(log n).
**Space Complexity:** O(1) - Heap Sort is in-place, using only the input array (ignoring recursion stack for heapify, which can be made iterative).
Heap Sort first builds a max heap from the array. Then it repeatedly extracts the maximum element (root), places it at the end, and restores the heap property. This achieves O(n log n) with O(1) extra space, though it's not stable.
- approach_name: Quick Sort (Randomized)
is_optimal: false
code: |
import random
def sortArray(nums: list[int]) -> list[int]:
    quick_sort(nums, 0, len(nums) - 1)
    return nums
def quick_sort(nums: list[int], low: int, high: int) -> None:
    if low < high:
        # Partition into three regions: < pivot, == pivot, > pivot
        lt, gt = partition(nums, low, high)
        # nums[lt..gt] all equal the pivot and are already in place
        quick_sort(nums, low, lt - 1)
        quick_sort(nums, gt + 1, high)
def partition(nums: list[int], low: int, high: int) -> tuple[int, int]:
    # Randomize pivot to avoid worst case on sorted input
    pivot = nums[random.randint(low, high)]
    lt, i, gt = low, low, high
    while i <= gt:
        if nums[i] < pivot:
            # Smaller than pivot: move into the left region
            nums[lt], nums[i] = nums[i], nums[lt]
            lt += 1
            i += 1
        elif nums[i] > pivot:
            # Larger than pivot: move into the right region, re-check nums[i]
            nums[i], nums[gt] = nums[gt], nums[i]
            gt -= 1
        else:
            # Equal to pivot: leave it in the middle region
            i += 1
    return lt, gt
explanation: |
**Time Complexity:** O(n log n) expected, O(n^2) worst case - Randomization makes the worst case extremely unlikely, and three-way partitioning keeps arrays with many duplicates fast.
**Space Complexity:** O(log n) expected for recursion stack, O(n) worst case.
Quick Sort partitions the array around a pivot, placing smaller elements before it, equal elements in the middle, and larger elements after. Randomizing the pivot selection prevents adversarial inputs from triggering the O(n^2) worst case. Grouping equal elements matters too: a two-way partition that sends every element `<= pivot` to one side degrades to O(n^2) time and very deep recursion when the array consists of one repeated value. While fast in practice, Merge Sort is preferred when guaranteed O(n log n) is required.


@@ -0,0 +1,188 @@
title: Sort Colors
slug: sort-colors
difficulty: medium
leetcode_id: 75
leetcode_url: https://leetcode.com/problems/sort-colors/
categories:
- arrays
- two-pointers
- sorting
patterns:
- two-pointers
description: |
Given an array `nums` with `n` objects colored red, white, or blue, sort them **in-place** so that objects of the same color are adjacent, with the colors in the order red, white, and blue.
We will use the integers `0`, `1`, and `2` to represent the color red, white, and blue, respectively.
You must solve this problem without using the library's sort function.
constraints: |
- `n == nums.length`
- `1 <= n <= 300`
- `nums[i]` is either `0`, `1`, or `2`
examples:
- input: "nums = [2,0,2,1,1,0]"
output: "[0,0,1,1,2,2]"
explanation: "The array is sorted in-place so all 0s come first, then all 1s, then all 2s."
- input: "nums = [2,0,1]"
output: "[0,1,2]"
explanation: "Each color appears exactly once, sorted in the correct order."
explanation:
intuition: |
Imagine you have a deck of cards in three colours: red, white, and blue, all shuffled together. Your task is to arrange them so all reds are on the left, all whites in the middle, and all blues on the right — without using any extra piles.
The key insight is that we can **partition the array into three regions** using just two boundaries. Think of it like having two "sweepers" that push elements to their correct regions:
- Everything before the left boundary is **red (0)**
- Everything after the right boundary is **blue (2)**
- Everything in between is either **white (1)** or **unsorted**
As we scan through the array with a current pointer, we make decisions:
- See a red? Swap it to the left region and expand that region
- See a blue? Swap it to the right region and shrink that region
- See a white? Leave it and move on
This is known as the **Dutch National Flag** algorithm, named by Edsger Dijkstra because the Dutch flag has three horizontal stripes (red, white, blue).
approach: |
We solve this using the **Dutch National Flag (Three-Way Partitioning)** algorithm:
**Step 1: Initialise three pointers**
- `low`: Points to the boundary of the red region (starts at index `0`)
- `mid`: The current element being examined (starts at index `0`)
- `high`: Points to the boundary of the blue region (starts at index `n - 1`)
&nbsp;
**Step 2: Process elements while `mid <= high`**
- If `nums[mid] == 0` (red): Swap `nums[low]` and `nums[mid]`, increment both `low` and `mid`
- If `nums[mid] == 1` (white): Just increment `mid` — it's already in the correct region
- If `nums[mid] == 2` (blue): Swap `nums[mid]` and `nums[high]`, decrement `high` only (don't increment `mid` because the swapped element needs to be checked)
&nbsp;
**Step 3: Termination**
- When `mid > high`, all elements have been processed
- The array is now partitioned: `[0...0, 1...1, 2...2]`
&nbsp;
The reason we don't increment `mid` after swapping with `high` is that we don't know what value came from the right side — it could be `0`, `1`, or `2`, so we need to examine it on the next iteration.
common_pitfalls:
- title: Incrementing mid After Swapping with high
description: |
When you swap `nums[mid]` with `nums[high]`, the element that comes from position `high` is unknown — it hasn't been examined yet.
If you increment `mid` after this swap, you might skip over a `0` that should be moved to the left region.
For example, with `[1, 2, 0]`: once `mid` reaches index 1, swapping it with index 2 gives `[1, 0, 2]` and `high` drops to 1. If you also increment `mid`, it becomes 2, the loop ends, and the `0` that was swapped in stays stranded at index 1. Leaving `mid` in place lets the next iteration see that `0` and swap it to the front.
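A minimal sketch of this failure on `[1, 2, 0]`. The `advance_after_high_swap` flag is hypothetical and exists only to switch the bug on and off:

```python
def sort_colors(nums: list[int], advance_after_high_swap: bool = False) -> None:
    # advance_after_high_swap is a hypothetical switch used only to reproduce the bug
    low, mid, high = 0, 0, len(nums) - 1
    while mid <= high:
        if nums[mid] == 0:
            nums[low], nums[mid] = nums[mid], nums[low]
            low += 1
            mid += 1
        elif nums[mid] == 1:
            mid += 1
        else:
            nums[mid], nums[high] = nums[high], nums[mid]
            high -= 1
            if advance_after_high_swap:
                mid += 1  # the bug: skips the unexamined element


buggy = [1, 2, 0]
sort_colors(buggy, advance_after_high_swap=True)
print(buggy)  # [1, 0, 2]

fixed = [1, 2, 0]
sort_colors(fixed)
print(fixed)  # [0, 1, 2]
```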
wrong_approach: "Always increment mid after every swap"
correct_approach: "Only increment mid when swapping with low or when current element is 1"
- title: Settling for Two Passes
description: |
A common first approach is to count the occurrences of each colour and then overwrite the array, which takes two passes:
1. Count: `count0`, `count1`, `count2`
2. Fill: write `count0` zeros, then `count1` ones, then `count2` twos
This works in O(n) time and O(1) extra space, but the LeetCode follow-up asks for a **one-pass** solution. The Dutch National Flag algorithm achieves this in a single traversal.
wrong_approach: "Two-pass counting approach"
correct_approach: "Single-pass three-way partitioning"
- title: Off-by-One in Loop Condition
description: |
The loop should continue while `mid <= high`, not `mid < high`.
When `mid == high`, there's still one element that hasn't been classified. If you use `mid < high`, you might leave the last element in the wrong position.
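For example, with `[0, 1, 2]` the pointers reach `low = 1`, `mid = 2`, `high = 2`, and the element at index 2 still needs to be checked (here it happens to be correct). With `[1, 0]` the bug is visible: after the `1` is skipped, `mid == high == 1`, so `mid < high` stops the loop and leaves the array as `[1, 0]`.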
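The following sketch reproduces the problem with a deliberately broken variant. The name `sort_colors_strict` is hypothetical and not part of the solution:

```python
def sort_colors_strict(nums: list[int]) -> None:
    # Hypothetical buggy variant: loops while mid < high instead of mid <= high
    low, mid, high = 0, 0, len(nums) - 1
    while mid < high:
        if nums[mid] == 0:
            nums[low], nums[mid] = nums[mid], nums[low]
            low += 1
            mid += 1
        elif nums[mid] == 1:
            mid += 1
        else:
            nums[mid], nums[high] = nums[high], nums[mid]
            high -= 1


nums = [1, 0]
sort_colors_strict(nums)
print(nums)  # [1, 0]: the element at mid == high was never classified
```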
wrong_approach: "while mid < high"
correct_approach: "while mid <= high"
key_takeaways:
- "**Three-way partitioning**: Use two boundaries to divide an array into three regions — a powerful technique for problems with three distinct values"
- "**Dutch National Flag**: Named by Dijkstra, this algorithm is the basis of three-way quicksort partitioning, which handles arrays with many duplicate keys efficiently"
- "**Pointer movement asymmetry**: When swapping with the left boundary, the element that arrives at `mid` is either a `1` that was already examined or, when `low == mid`, the same `0`. When swapping with the right, the incoming element is unexamined, hence the different increment behaviour"
- "**In-place sorting**: Achieves O(1) space by using swap operations rather than auxiliary arrays"
time_complexity: "O(n). Every iteration either advances `mid` or retreats `high`, so the gap between them closes by one each time and the loop runs exactly `n` iterations."
space_complexity: "O(1). We only use three pointer variables (`low`, `mid`, `high`) regardless of input size."
solutions:
- approach_name: Dutch National Flag
is_optimal: true
code: |
def sort_colors(nums: list[int]) -> None:
    """
    Sorts the array in-place using three-way partitioning.
    Do not return anything, modify nums in-place instead.
    """
    # Pointers for the three regions
    low = 0  # Boundary for red (0s)
    mid = 0  # Current element being examined
    high = len(nums) - 1  # Boundary for blue (2s)
    while mid <= high:
        if nums[mid] == 0:
            # Red: swap to the left region
            nums[low], nums[mid] = nums[mid], nums[low]
            low += 1
            mid += 1
        elif nums[mid] == 1:
            # White: already in correct region, just move on
            mid += 1
        else:
            # Blue (nums[mid] == 2): swap to the right region
            nums[mid], nums[high] = nums[high], nums[mid]
            high -= 1
            # Don't increment mid; the swapped-in element still needs examining
explanation: |
**Time Complexity:** O(n). Each iteration shrinks the unsorted region `[mid, high]` by one element.
**Space Complexity:** O(1) — Only three pointer variables used.
The algorithm maintains three regions: `[0, low)` contains 0s, `[low, mid)` contains 1s, and `(high, n-1]` contains 2s. The region `[mid, high]` is unsorted. We process until `mid` crosses `high`.
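As a rough check of these regions, the sketch below asserts the invariant at the top of every iteration. The function name `sort_colors_checked` is hypothetical, and the assertions are for illustration only (they make each iteration O(n)):

```python
def sort_colors_checked(nums: list[int]) -> None:
    low, mid, high = 0, 0, len(nums) - 1
    while mid <= high:
        # Invariant: [0, low) holds 0s, [low, mid) holds 1s, (high, n-1] holds 2s
        assert all(v == 0 for v in nums[:low])
        assert all(v == 1 for v in nums[low:mid])
        assert all(v == 2 for v in nums[high + 1:])
        if nums[mid] == 0:
            nums[low], nums[mid] = nums[mid], nums[low]
            low += 1
            mid += 1
        elif nums[mid] == 1:
            mid += 1
        else:
            nums[mid], nums[high] = nums[high], nums[mid]
            high -= 1


nums = [2, 0, 2, 1, 1, 0]
sort_colors_checked(nums)
print(nums)  # [0, 0, 1, 1, 2, 2]
```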
- approach_name: Counting Sort
is_optimal: false
code: |
def sort_colors(nums: list[int]) -> None:
    """
    Two-pass approach: count then fill.
    Do not return anything, modify nums in-place instead.
    """
    # First pass: count occurrences
    count0 = count1 = count2 = 0
    for num in nums:
        if num == 0:
            count0 += 1
        elif num == 1:
            count1 += 1
        else:
            count2 += 1
    # Second pass: overwrite array
    idx = 0
    for _ in range(count0):
        nums[idx] = 0
        idx += 1
    for _ in range(count1):
        nums[idx] = 1
        idx += 1
    for _ in range(count2):
        nums[idx] = 2
        idx += 1
explanation: |
**Time Complexity:** O(n) — Two passes through the array.
**Space Complexity:** O(1) — Only three counter variables.
This approach counts each colour in the first pass, then overwrites the array in the second pass. While it achieves the same time and space complexity as the Dutch National Flag, it requires two passes. The follow-up explicitly asks for a one-pass solution.

View File

@@ -0,0 +1,189 @@
title: Spiral Matrix II
slug: spiral-matrix-ii
difficulty: medium
leetcode_id: 59
leetcode_url: https://leetcode.com/problems/spiral-matrix-ii/
categories:
- arrays
patterns:
- matrix-traversal
description: |
Given a positive integer `n`, generate an `n x n` matrix filled with elements from `1` to `n^2` in spiral order.
The spiral order starts from the top-left corner and moves **right**, then **down**, then **left**, then **up**, repeating this pattern layer by layer until all cells are filled.
constraints: |
- `1 <= n <= 20`
examples:
- input: "n = 3"
output: "[[1,2,3],[8,9,4],[7,6,5]]"
explanation: "Starting from top-left, fill 1-3 going right, 4-5 going down, 6-7 going left, 8 going up, and 9 in the center."
- input: "n = 1"
output: "[[1]]"
explanation: "A 1x1 matrix contains only the element 1."
explanation:
intuition: |
Imagine peeling an onion layer by layer, but in reverse — you're *building* layers from the outside in.
Each "layer" of the spiral is like a rectangular frame around the matrix. For a 3x3 matrix, the outer layer contains positions for numbers 1-8, and the center (a 1x1 "layer") holds just the number 9.
Think of it like walking around a track: you walk along the **top edge** from left to right, then down the **right edge**, then across the **bottom edge** from right to left, and finally up the **left edge**. After completing one lap, you step inward to the next track (layer) and repeat.
The key insight is that you can define each layer by its **boundaries**: top row, bottom row, left column, and right column. After completing a layer, you shrink these boundaries inward and continue until no space remains.
approach: |
We solve this using a **Layer-by-Layer Simulation** approach:
**Step 1: Initialise the matrix and boundaries**
- Create an `n x n` matrix filled with zeros
- `num`: Counter starting at `1`, incrementing as we fill each cell
- `top`, `bottom`: Row boundaries, starting at `0` and `n-1`
- `left`, `right`: Column boundaries, starting at `0` and `n-1`
&nbsp;
**Step 2: Fill the matrix layer by layer**
While `top <= bottom` and `left <= right`, fill one complete layer:
- **Fill top row**: Traverse from `left` to `right` along row `top`, then increment `top`
- **Fill right column**: Traverse from `top` to `bottom` along column `right`, then decrement `right`
- **Fill bottom row**: Traverse from `right` to `left` along row `bottom`, then decrement `bottom`
- **Fill left column**: Traverse from `bottom` to `top` along column `left`, then increment `left`
After each cell is filled, increment `num` to place the next number.
&nbsp;
**Step 3: Return the result**
- Once the boundaries cross (`top > bottom` or `left > right`), all cells are filled
- Return the completed matrix
&nbsp;
This simulation approach systematically handles each layer, shrinking the working area until the entire matrix is filled.
common_pitfalls:
- title: Off-by-One Errors in Boundaries
description: |
A frequent mistake is incorrectly updating or checking boundaries. For example, forgetting to increment `top` after filling the top row, or using `<` instead of `<=` in the while condition.
With `n = 1`, `top == bottom == 0` and `left == right == 0`. If you use `top < bottom`, you'd skip the only cell entirely! Always use `<=` for the boundary checks.
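A small sketch of what the strict comparison does. The `strict` parameter is hypothetical and only toggles the faulty loop condition:

```python
def generate_matrix(n: int, strict: bool = False) -> list[list[int]]:
    # strict=True is a hypothetical switch that reproduces the `<` bug
    matrix = [[0] * n for _ in range(n)]
    num = 1
    top, bottom, left, right = 0, n - 1, 0, n - 1

    def keep_going() -> bool:
        if strict:
            return top < bottom and left < right
        return top <= bottom and left <= right

    while keep_going():
        for col in range(left, right + 1):
            matrix[top][col] = num
            num += 1
        top += 1
        for row in range(top, bottom + 1):
            matrix[row][right] = num
            num += 1
        right -= 1
        for col in range(right, left - 1, -1):
            matrix[bottom][col] = num
            num += 1
        bottom -= 1
        for row in range(bottom, top - 1, -1):
            matrix[row][left] = num
            num += 1
        left += 1
    return matrix


print(generate_matrix(1, strict=True))  # [[0]]
print(generate_matrix(3, strict=True))  # [[1, 2, 3], [8, 0, 4], [7, 6, 5]]
print(generate_matrix(3))               # [[1, 2, 3], [8, 9, 4], [7, 6, 5]]
```

With the strict condition the centre cell of any odd-sized matrix is left as `0`.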
wrong_approach: "Using strict inequality (< instead of <=)"
correct_approach: "Use <= to include single-row or single-column layers"
- title: Forgetting to Adjust Boundaries After Each Edge
description: |
After traversing the top row, you must increment `top` before moving to the right column. Similarly for other edges. Forgetting this causes the same row or column to be overwritten.
For example, without incrementing `top`, the right column traversal would start at the same row as the top row traversal ended, placing numbers in already-filled cells.
wrong_approach: "Filling edges without updating boundaries"
correct_approach: "Update the corresponding boundary immediately after each edge"
- title: Not Handling the Degenerate Centre Layer
description: |
In a square matrix every inner layer is also square, so the only degenerate layer is the 1x1 centre that appears when `n` is odd. For a 3x3 matrix, after the first layer you are left with a single cell.
The condition `top <= bottom and left <= right` handles this correctly: the top-row loop fills the centre cell, and the remaining three loops run over empty ranges, so nothing is overwritten.
wrong_approach: "Assuming every layer has four non-empty edges"
correct_approach: "Let boundary conditions naturally handle degenerate cases"
key_takeaways:
- "**Boundary tracking**: Define layers by their four boundaries (`top`, `bottom`, `left`, `right`) and shrink inward after each complete layer"
- "**Simulation pattern**: Some problems are best solved by directly simulating the described process rather than finding mathematical patterns"
- "**Consistent direction order**: Always traverse in the same order (right → down → left → up) to maintain spiral consistency"
- "**Foundation for related problems**: This technique directly applies to *Spiral Matrix* (reading values) and other matrix traversal problems"
time_complexity: "O(n^2). We visit each of the `n^2` cells exactly once to place a number."
space_complexity: "O(n^2). We create an `n x n` matrix to store the result. If we exclude the output, the algorithm uses O(1) auxiliary space."
solutions:
- approach_name: Layer-by-Layer Simulation
is_optimal: true
code: |
def generate_matrix(n: int) -> list[list[int]]:
    # Create n x n matrix filled with zeros
    matrix = [[0] * n for _ in range(n)]
    # Number to place, starting from 1
    num = 1
    # Define the boundaries of the current layer
    top, bottom = 0, n - 1
    left, right = 0, n - 1
    while top <= bottom and left <= right:
        # Fill top row: left to right
        for col in range(left, right + 1):
            matrix[top][col] = num
            num += 1
        top += 1  # Shrink top boundary
        # Fill right column: top to bottom
        for row in range(top, bottom + 1):
            matrix[row][right] = num
            num += 1
        right -= 1  # Shrink right boundary
        # Fill bottom row: right to left
        for col in range(right, left - 1, -1):
            matrix[bottom][col] = num
            num += 1
        bottom -= 1  # Shrink bottom boundary
        # Fill left column: bottom to top
        for row in range(bottom, top - 1, -1):
            matrix[row][left] = num
            num += 1
        left += 1  # Shrink left boundary
    return matrix
explanation: |
**Time Complexity:** O(n^2) — We place exactly `n^2` numbers, visiting each cell once.
**Space Complexity:** O(n^2) — The output matrix requires `n^2` space. The algorithm itself uses O(1) auxiliary space (just boundary variables and the counter).
We simulate the spiral by maintaining four boundaries. For each layer, we traverse all four edges in order, updating boundaries after each edge to "peel off" that layer and prepare for the next inner layer.
- approach_name: Direction-Based Simulation
is_optimal: false
code: |
def generate_matrix(n: int) -> list[list[int]]:
    # Create n x n matrix filled with zeros
    matrix = [[0] * n for _ in range(n)]
    # Direction vectors: right, down, left, up
    directions = [(0, 1), (1, 0), (0, -1), (-1, 0)]
    dir_idx = 0  # Start moving right
    row, col = 0, 0
    for num in range(1, n * n + 1):
        matrix[row][col] = num
        # Calculate next position
        next_row = row + directions[dir_idx][0]
        next_col = col + directions[dir_idx][1]
        # Check if we need to turn: out of bounds or cell already filled
        if (next_row < 0 or next_row >= n or
                next_col < 0 or next_col >= n or
                matrix[next_row][next_col] != 0):
            # Turn clockwise: right -> down -> left -> up -> right
            dir_idx = (dir_idx + 1) % 4
            next_row = row + directions[dir_idx][0]
            next_col = col + directions[dir_idx][1]
        row, col = next_row, next_col
    return matrix
explanation: |
**Time Complexity:** O(n^2) — We iterate through all `n^2` numbers.
**Space Complexity:** O(n^2) — Same output matrix, plus O(1) for direction tracking.
Instead of explicitly tracking layers, we move in the current direction until we hit a boundary or an already-filled cell, then turn clockwise. This is more intuitive for some but slightly less efficient due to boundary checks on every step.

View File

@@ -0,0 +1,200 @@
title: Spiral Matrix
slug: spiral-matrix
difficulty: medium
leetcode_id: 54
leetcode_url: https://leetcode.com/problems/spiral-matrix/
categories:
- arrays
patterns:
- matrix-traversal
description: |
Given an `m x n` `matrix`, return *all elements of the matrix in spiral order*.
A **spiral order** traversal starts from the top-left corner and moves right, then down, then left, then up, and repeats this pattern, moving inward with each cycle until all elements are visited.
constraints: |
- `m == matrix.length`
- `n == matrix[i].length`
- `1 <= m, n <= 10`
- `-100 <= matrix[i][j] <= 100`
examples:
- input: "matrix = [[1,2,3],[4,5,6],[7,8,9]]"
output: "[1,2,3,6,9,8,7,4,5]"
explanation: "Start at top-left, go right (1,2,3), down (6,9), left (8,7), up (4), then right to centre (5)."
- input: "matrix = [[1,2,3,4],[5,6,7,8],[9,10,11,12]]"
output: "[1,2,3,4,8,12,11,10,9,5,6,7]"
explanation: "Traverse the outer ring first, then the inner row."
explanation:
intuition: |
Imagine peeling an onion layer by layer. The spiral traversal works the same way — we traverse the **outermost ring** of the matrix first, then move inward to the next ring, and repeat until all elements are visited.
For each ring, we follow four directions in order:
1. **Right**: across the top row
2. **Down**: along the right column
3. **Left**: across the bottom row (in reverse)
4. **Up**: along the left column (in reverse)
Think of it like this: we maintain four boundaries — `top`, `bottom`, `left`, and `right` — that define the current ring. After traversing each edge, we shrink the boundary inward. When the boundaries cross each other, we've visited everything.
The key insight is that by tracking these four boundaries, we always know exactly where to go next without needing to mark cells as visited.
approach: |
We solve this using **Boundary Simulation**:
**Step 1: Initialise the boundaries**
- `top`: Set to `0` — the first row to process
- `bottom`: Set to `m - 1` — the last row to process
- `left`: Set to `0` — the first column to process
- `right`: Set to `n - 1` — the last column to process
- `result`: Empty list to store the spiral order
&nbsp;
**Step 2: Traverse while boundaries are valid**
- Continue while `top <= bottom` and `left <= right`
- For each iteration, traverse one complete ring:
&nbsp;
**Step 3: Traverse the four edges of the current ring**
- **Right**: Traverse from `left` to `right` along row `top`, then increment `top`
- **Down**: Traverse from `top` to `bottom` along column `right`, then decrement `right`
- **Left**: If `top <= bottom`, traverse from `right` to `left` along row `bottom`, then decrement `bottom`
- **Up**: If `left <= right`, traverse from `bottom` to `top` along column `left`, then increment `left`
&nbsp;
**Step 4: Return the result**
- Return the `result` list containing all elements in spiral order
&nbsp;
The boundary checks before the left and up traversals handle matrices that aren't square — when one dimension runs out before the other.
common_pitfalls:
- title: Forgetting Boundary Checks on Inner Traversals
description: |
After traversing right and down, you must check if the boundaries are still valid before traversing left and up.
For example, in a single-row matrix `[[1,2,3]]`, going right collects `1, 2, 3` and moves `top` past `bottom`; going down adds nothing. Without checking `top <= bottom` before the left traversal, you would walk back along the same row and append `2` and `1` a second time.
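To see the duplicates concretely, here is a sketch with a hypothetical `guard` parameter that switches the two boundary checks off:

```python
def spiral_order(matrix: list[list[int]], guard: bool = True) -> list[int]:
    # guard=False is a hypothetical switch that removes the boundary checks
    result = []
    top, bottom = 0, len(matrix) - 1
    left, right = 0, len(matrix[0]) - 1
    while top <= bottom and left <= right:
        for col in range(left, right + 1):
            result.append(matrix[top][col])
        top += 1
        for row in range(top, bottom + 1):
            result.append(matrix[row][right])
        right -= 1
        if not guard or top <= bottom:
            for col in range(right, left - 1, -1):
                result.append(matrix[bottom][col])
            bottom -= 1
        if not guard or left <= right:
            for row in range(bottom, top - 1, -1):
                result.append(matrix[row][left])
            left += 1
    return result


print(spiral_order([[1, 2, 3]], guard=False))      # [1, 2, 3, 2, 1]
print(spiral_order([[1, 2, 3]]))                   # [1, 2, 3]
print(spiral_order([[1], [2], [3]], guard=False))  # [1, 2, 3, 2]
print(spiral_order([[1], [2], [3]]))               # [1, 2, 3]
```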
wrong_approach: "Always traverse all four directions"
correct_approach: "Check top <= bottom before left, and left <= right before up"
- title: Off-by-One Errors in Range
description: |
Python's `range()` is exclusive on the end. For leftward and upward traversals, you need `range(right, left - 1, -1)` and `range(bottom, top - 1, -1)` to include the boundary positions.
A common mistake is using `range(right, left, -1)` which misses the leftmost column.
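A quick illustration of the difference, using example values `right = 3` and `left = 0`:

```python
right, left = 3, 0

missing_left = list(range(right, left, -1))
includes_left = list(range(right, left - 1, -1))

print(missing_left)   # [3, 2, 1]
print(includes_left)  # [3, 2, 1, 0]
```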
wrong_approach: "range(right, left, -1) — misses left boundary"
correct_approach: "range(right, left - 1, -1) — includes left boundary"
- title: Handling Single Row or Column
description: |
When the matrix is a single row or single column, the spiral simplifies to a straight line. Your algorithm should handle this naturally without special cases if the boundary checks are correct.
Test with `[[1,2,3,4]]` (single row) and `[[1],[2],[3]]` (single column) to verify.
wrong_approach: "Assuming matrix is always multi-row and multi-column"
correct_approach: "Boundary checks handle edge cases automatically"
key_takeaways:
- "**Boundary tracking**: Use four variables (`top`, `bottom`, `left`, `right`) to define the current ring being traversed"
- "**Shrink inward**: After each edge traversal, move the corresponding boundary inward"
- "**Conditional checks**: Always check if boundaries are still valid before the left and up traversals"
- "**Pattern recognition**: This technique applies to many matrix problems — spiral printing, rotating matrices, and layer-by-layer processing"
time_complexity: "O(m × n). We visit each cell exactly once."
space_complexity: "O(1). We only use a few boundary variables. The output list doesn't count toward auxiliary space."
solutions:
- approach_name: Boundary Simulation
is_optimal: true
code: |
def spiral_order(matrix: list[list[int]]) -> list[int]:
    if not matrix or not matrix[0]:
        return []
    result = []
    top, bottom = 0, len(matrix) - 1
    left, right = 0, len(matrix[0]) - 1
    while top <= bottom and left <= right:
        # Traverse right along the top row
        for col in range(left, right + 1):
            result.append(matrix[top][col])
        top += 1
        # Traverse down along the right column
        for row in range(top, bottom + 1):
            result.append(matrix[row][right])
        right -= 1
        # Traverse left along the bottom row (if rows remain)
        if top <= bottom:
            for col in range(right, left - 1, -1):
                result.append(matrix[bottom][col])
            bottom -= 1
        # Traverse up along the left column (if columns remain)
        if left <= right:
            for row in range(bottom, top - 1, -1):
                result.append(matrix[row][left])
            left += 1
    return result
explanation: |
**Time Complexity:** O(m × n) — Each cell is visited exactly once.
**Space Complexity:** O(1) — Only boundary variables are used.
We simulate the spiral by maintaining four boundaries. For each ring, we traverse right, down, left, and up, shrinking the boundaries after each direction. The key is checking if boundaries are still valid before the reverse traversals.
- approach_name: Direction Vectors
is_optimal: false
code: |
def spiral_order(matrix: list[list[int]]) -> list[int]:
    if not matrix or not matrix[0]:
        return []
    m, n = len(matrix), len(matrix[0])
    result = []
    visited = [[False] * n for _ in range(m)]
    # Direction vectors: right, down, left, up
    dr = [0, 1, 0, -1]
    dc = [1, 0, -1, 0]
    direction = 0  # Start going right
    row, col = 0, 0
    for _ in range(m * n):
        result.append(matrix[row][col])
        visited[row][col] = True
        # Calculate next position
        next_row = row + dr[direction]
        next_col = col + dc[direction]
        # Check if we need to turn
        if (next_row < 0 or next_row >= m or
                next_col < 0 or next_col >= n or
                visited[next_row][next_col]):
            # Turn clockwise
            direction = (direction + 1) % 4
            next_row = row + dr[direction]
            next_col = col + dc[direction]
        row, col = next_row, next_col
    return result
explanation: |
**Time Complexity:** O(m × n) — Each cell is visited once.
**Space Complexity:** O(m × n) — The visited matrix.
This approach uses direction vectors to simulate movement. We track visited cells and change direction (clockwise) when we hit a boundary or visited cell. While correct, it uses more space than the boundary approach.

View File

@@ -0,0 +1,200 @@
title: Split Array Largest Sum
slug: split-array-largest-sum
difficulty: hard
leetcode_id: 410
leetcode_url: https://leetcode.com/problems/split-array-largest-sum/
categories:
- arrays
- binary-search
- dynamic-programming
patterns:
- binary-search
- greedy
description: |
Given an integer array `nums` and an integer `k`, split `nums` into `k` non-empty subarrays such that the largest sum of any subarray is **minimized**.
Return *the minimized largest sum of the split*.
A **subarray** is a contiguous part of the array.
constraints: |
- `1 <= nums.length <= 1000`
- `0 <= nums[i] <= 10^6`
- `1 <= k <= min(50, nums.length)`
examples:
- input: "nums = [7,2,5,10,8], k = 2"
output: "18"
explanation: "There are four ways to split nums into two subarrays. The best way is to split it into [7,2,5] and [10,8], where the largest sum among the two subarrays is only 18."
- input: "nums = [1,2,3,4,5], k = 2"
output: "9"
explanation: "There are four ways to split nums into two subarrays. The best way is to split it into [1,2,3] and [4,5], where the largest sum among the two subarrays is only 9."
explanation:
intuition: |
This problem asks us to split an array into `k` parts to minimize the maximum subarray sum. At first glance, it seems like we need to try all possible ways to partition the array — but that's exponentially complex.
Here's the key insight: **instead of searching for where to split, search for what the answer could be**.
Think of it like this: imagine you're a manager assigning work to `k` workers. Each worker must handle a contiguous segment of tasks, and you want to minimize the maximum workload any single worker receives. You could ask: "Is it possible to distribute the work so that no worker handles more than X units?"
If you can answer that question efficiently, you can use **binary search** to find the smallest valid X. The answer lies somewhere between:
- **Lower bound**: the largest single element (one subarray must contain at least the max element)
- **Upper bound**: the sum of all elements (one subarray contains everything)
For any candidate answer `mid`, we greedily check: can we split the array into at most `k` subarrays where each has sum ≤ `mid`? If yes, we might be able to do better (search lower). If no, we need a larger limit (search higher).
approach: |
We solve this using **Binary Search on the Answer**:
**Step 1: Define the search space**
- `left`: Set to `max(nums)` — the answer can't be smaller than the largest element
- `right`: Set to `sum(nums)` — the answer can't exceed putting everything in one subarray
&nbsp;
**Step 2: Binary search for the minimum valid answer**
- Calculate `mid = (left + right) // 2`
- Check if we can split the array into at most `k` subarrays with each sum ≤ `mid`
- If yes: this `mid` works, but maybe we can do better — set `right = mid`
- If no: we need a larger limit — set `left = mid + 1`
&nbsp;
**Step 3: Greedy feasibility check (can_split function)**
- Iterate through the array, accumulating a running sum
- When adding the next element would exceed `mid`, start a new subarray
- Count how many subarrays we need
- Return `True` if we need ≤ `k` subarrays
&nbsp;
**Step 4: Return the result**
- When `left == right`, we've found the minimum valid maximum subarray sum
- Return `left`
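&nbsp;
To make the feasibility check concrete, the sketch below runs it on the first example for a few candidate limits. The standalone `can_split(nums, k, max_sum)` signature is an assumption made for illustration:

```python
def can_split(nums: list[int], k: int, max_sum: int) -> bool:
    # Greedily pack elements; open a new subarray only when the limit would be exceeded
    subarrays, current_sum = 1, 0
    for num in nums:
        if current_sum + num > max_sum:
            subarrays += 1
            current_sum = num
        else:
            current_sum += num
    return subarrays <= k


nums, k = [7, 2, 5, 10, 8], 2
for limit in (17, 18, 19):
    print(limit, can_split(nums, k, limit))
# 17 False
# 18 True
# 19 True
```

The answer `18` is the smallest limit for which the check flips from `False` to `True`, which is exactly the boundary the binary search looks for.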
common_pitfalls:
- title: Trying All Partitions (Exponential Blowup)
description: |
A natural first thought is to enumerate all ways to split the array into `k` parts. For an array of length `n`, there are `C(n-1, k-1)` ways to place `k-1` dividers among `n-1` gaps.
With `n = 1000` and `k = 50`, this is astronomically large — far too many combinations to check. This approach won't pass time limits.
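As a rough sense of scale, the count can be computed directly with Python's `math.comb`. The helper name `partition_count` is hypothetical:

```python
import math


def partition_count(n: int, k: int) -> int:
    # Ways to place k - 1 dividers among the n - 1 gaps between elements
    return math.comb(n - 1, k - 1)


print(partition_count(5, 2))                 # 4, matching the examples above
print(len(str(partition_count(1000, 50))))   # number of decimal digits in the count
```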
wrong_approach: "Enumerate all partition combinations"
correct_approach: "Binary search on the answer with greedy validation"
- title: Wrong Search Space Bounds
description: |
Setting `left = 0` is incorrect because the answer must be at least `max(nums)` — a subarray containing only the largest element has that sum.
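For example, with `nums = [10, 1, 1, 1]` and `k = 4`, the answer is `10` (each element in its own subarray). Starting from `left = 0` lets the search probe limits below `10`, where the greedy check still places the `10` in a subarray of its own and can wrongly report a limit such as `2` as feasible.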
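A sketch of this effect, where the hypothetical `lower_bound` parameter exists only to compare the two starting points:

```python
def split_array(nums: list[int], k: int, lower_bound: int) -> int:
    # lower_bound is a hypothetical parameter used only for this comparison
    def can_split(max_sum: int) -> bool:
        subarrays, current_sum = 1, 0
        for num in nums:
            if current_sum + num > max_sum:
                subarrays += 1
                current_sum = num
            else:
                current_sum += num
        return subarrays <= k

    left, right = lower_bound, sum(nums)
    while left < right:
        mid = (left + right) // 2
        if can_split(mid):
            right = mid
        else:
            left = mid + 1
    return left


nums, k = [10, 1, 1, 1], 4
print(split_array(nums, k, lower_bound=0))          # 2 (wrong)
print(split_array(nums, k, lower_bound=max(nums)))  # 10
```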
wrong_approach: "left = 0, right = sum(nums)"
correct_approach: "left = max(nums), right = sum(nums)"
- title: Off-by-One in Subarray Counting
description: |
When checking feasibility, remember that you start with one subarray. Each time you "cut" to start a new subarray, increment the count.
A common bug is initializing `count = 0` instead of `count = 1`, which underestimates the number of subarrays needed.
wrong_approach: "Initialize subarray count to 0"
correct_approach: "Initialize subarray count to 1 (first subarray)"
- title: Shrinking the Upper Bound Past a Valid Answer
description: |
This is a "minimize the maximum" binary search. When `can_split(mid)` returns `True`, set `right = mid` (not `mid - 1`) because `mid` itself might be the answer.
Using `right = mid - 1` could skip the optimal answer.
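The first example shows the skip in action. In this sketch the `shrink_past_mid` flag is hypothetical and only toggles the faulty update:

```python
def split_array(nums: list[int], k: int, shrink_past_mid: bool = False) -> int:
    # shrink_past_mid=True is a hypothetical switch that reproduces the bug
    def can_split(max_sum: int) -> bool:
        subarrays, current_sum = 1, 0
        for num in nums:
            if current_sum + num > max_sum:
                subarrays += 1
                current_sum = num
            else:
                current_sum += num
        return subarrays <= k

    left, right = max(nums), sum(nums)
    while left < right:
        mid = (left + right) // 2
        if can_split(mid):
            right = mid - 1 if shrink_past_mid else mid
        else:
            left = mid + 1
    return left


nums, k = [7, 2, 5, 10, 8], 2
print(split_array(nums, k, shrink_past_mid=True))  # 17 (wrong: 18 was discarded)
print(split_array(nums, k))                        # 18
```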
wrong_approach: "right = mid - 1 when feasible"
correct_approach: "right = mid when feasible (mid might be optimal)"
key_takeaways:
- "**Binary search on the answer**: When searching for an optimal value in a range, binary search the answer space and validate with a greedy check"
- "**Greedy feasibility**: The `can_split` function greedily packs elements into subarrays — this works because we're checking a fixed limit, not optimizing"
- "**Minimize-the-maximum pattern**: This problem structure (minimize the max of subarray sums) appears in many variants: allocating pages to students, shipping packages within D days, etc."
- "**Search space bounds matter**: The lower bound is `max(nums)`, not `0` — understand why the bounds are what they are"
time_complexity: "O(n × log(sum(nums) - max(nums))). We perform binary search over a range of size `sum - max`, and each feasibility check takes O(n) time."
space_complexity: "O(1). We only use a constant number of variables for tracking bounds and subarray sums."
solutions:
- approach_name: Binary Search on Answer
is_optimal: true
code: |
def splitArray(nums: list[int], k: int) -> int:
    def can_split(max_sum: int) -> bool:
        """Check if we can split into <= k subarrays with each sum <= max_sum."""
        subarrays = 1  # Start with one subarray
        current_sum = 0
        for num in nums:
            # Would adding this element exceed our limit?
            if current_sum + num > max_sum:
                # Start a new subarray with this element
                subarrays += 1
                current_sum = num
            else:
                # Add to current subarray
                current_sum += num
        return subarrays <= k

    # Search space: [max element, total sum]
    left = max(nums)
    right = sum(nums)
    # Binary search for minimum valid maximum
    while left < right:
        mid = (left + right) // 2
        if can_split(mid):
            # mid works, try to find something smaller
            right = mid
        else:
            # mid is too small, need larger limit
            left = mid + 1
    return left
explanation: |
**Time Complexity:** O(n × log(S)) where S = sum(nums) - max(nums) — binary search with O(n) validation per iteration.
**Space Complexity:** O(1) — only constant extra space used.
We binary search over possible answers. For each candidate `mid`, we greedily check if the array can be split into at most `k` subarrays where each has sum ≤ `mid`. The greedy approach works because if we can fit more elements in the current subarray without exceeding `mid`, we should — this minimizes the number of subarrays needed.
- approach_name: Dynamic Programming
is_optimal: false
code: |
def splitArray(nums: list[int], k: int) -> int:
    n = len(nums)
    # Precompute prefix sums for O(1) range sum queries
    prefix = [0] * (n + 1)
    for i in range(n):
        prefix[i + 1] = prefix[i] + nums[i]
    # dp[i][j] = min largest sum to split nums[0:i] into j parts
    dp = [[float('inf')] * (k + 1) for _ in range(n + 1)]
    dp[0][0] = 0  # Base case: empty array, 0 parts
    for i in range(1, n + 1):
        for j in range(1, min(i, k) + 1):
            # Try all possible last subarray starting points
            for m in range(j - 1, i):
                # Sum of nums[m:i] = prefix[i] - prefix[m]
                last_sum = prefix[i] - prefix[m]
                # Max of (best for first m elements in j-1 parts) and last subarray
                dp[i][j] = min(dp[i][j], max(dp[m][j - 1], last_sum))
    return dp[n][k]
explanation: |
**Time Complexity:** O(n² × k) — three nested loops over positions and partitions.
**Space Complexity:** O(n × k) — the DP table.
This DP approach defines `dp[i][j]` as the minimum largest subarray sum when splitting the first `i` elements into exactly `j` parts. For each state, we try all possible positions for the last cut.
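While correct, this is slower than binary search for large inputs. It's useful for understanding the problem structure, and it can be optimized further: because `nums` is non-negative, `dp[m][j - 1]` never decreases as `m` grows while the last subarray's sum never increases, so the best split point can be found by binary search or a two-pointer sweep instead of a full scan.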

View File

@@ -0,0 +1,218 @@
title: Sqrt(x)
slug: sqrtx
difficulty: easy
leetcode_id: 69
leetcode_url: https://leetcode.com/problems/sqrtx/
categories:
- binary-search
- math
patterns:
- binary-search
function_signature: "def my_sqrt(x: int) -> int:"
test_cases:
visible:
- input: { x: 4 }
expected: 2
- input: { x: 8 }
expected: 2
- input: { x: 1 }
expected: 1
hidden:
- input: { x: 0 }
expected: 0
- input: { x: 16 }
expected: 4
- input: { x: 15 }
expected: 3
description: |
Given a non-negative integer `x`, return *the square root of* `x` *rounded down to the nearest integer*. The returned integer should be **non-negative** as well.
You **must not use** any built-in exponent function or operator.
- For example, do not use `pow(x, 0.5)` in C++ or `x ** 0.5` in Python.
constraints: |
- `0 <= x <= 2^31 - 1`
examples:
- input: "x = 4"
output: "2"
explanation: "The square root of 4 is 2, so we return 2."
- input: "x = 8"
output: "2"
explanation: "The square root of 8 is 2.82842..., and since we round it down to the nearest integer, 2 is returned."
explanation:
intuition: |
Imagine you're trying to guess a number between 1 and `x`. Someone will tell you if your guess squared is too high, too low, or exactly right. How would you find the answer efficiently?
The key insight is that squaring is **monotonically increasing** for non-negative integers: if `a < b`, then `a * a < b * b`. This means the squares of `1, 2, 3, ..., x` form a sorted sequence. When you have a sorted search space and need to find a specific value, **binary search** is the perfect tool.
Think of it like this: we're searching for the largest integer `k` such that `k * k <= x`. Instead of checking every number from 1 to x (which could be up to 2 billion!), we repeatedly halve our search space. Start in the middle — if `mid * mid` is too big, search the left half; if it's too small or just right, search the right half while remembering this valid candidate.
The "rounded down" requirement means we want the **floor** of the square root. So even if the exact square root is 2.83, we return 2. This translates to finding the largest integer whose square doesn't exceed `x`.
approach: |
We solve this using **Binary Search** on the answer space:
**Step 1: Handle the edge case**
- If `x` is `0` or `1`, return `x` directly (the square root of 0 is 0, and the square root of 1 is 1)
&nbsp;
**Step 2: Initialise binary search bounds**
- `left`: Set to `1` (minimum possible answer for x >= 1)
- `right`: Set to `x // 2` (for x >= 2, the floor of the square root is always <= x // 2)
- `result`: Set to `0` to store our answer
&nbsp;
**Step 3: Perform binary search**
- While `left <= right`:
- Calculate `mid = left + (right - left) // 2` to avoid integer overflow
- Calculate `square = mid * mid`
- If `square == x`: we found an exact match, return `mid`
- If `square < x`: `mid` is a valid candidate (could be our answer), store it in `result` and search right half (`left = mid + 1`)
- If `square > x`: `mid` is too large, search left half (`right = mid - 1`)
&nbsp;
**Step 4: Return the result**
- Return `result`, which holds the largest integer whose square is <= x
&nbsp;
The binary search efficiently narrows down from potentially billions of candidates to the exact answer in at most 31 iterations (log base 2 of 2^31).
common_pitfalls:
- title: Integer Overflow When Squaring
description: |
When computing `mid * mid`, the result can overflow if `mid` is large. For example, if `mid = 50000` and we're using 32-bit integers, `mid * mid = 2,500,000,000` exceeds the 32-bit signed integer maximum of ~2.1 billion.
In Python, integers have arbitrary precision, so this isn't an issue. But in languages like C++ or Java, you need to either:
- Use `long long` / `long` for the square calculation
- Compare using division: `mid <= x / mid` instead of `mid * mid <= x`
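The division-based comparison can be sketched as follows. Python itself does not overflow, so this only illustrates the pattern you would port to a fixed-width language; the function name `floor_sqrt_no_overflow` is an assumption made for this example.

```python
def floor_sqrt_no_overflow(x: int) -> int:
    """Floor square root that never forms the product mid * mid."""
    if x < 2:
        return x
    left, right = 1, x // 2
    result = 0
    while left <= right:
        mid = left + (right - left) // 2
        # For positive mid, mid <= x // mid is equivalent to mid * mid <= x
        if mid <= x // mid:
            result = mid
            left = mid + 1
        else:
            right = mid - 1
    return result


print(floor_sqrt_no_overflow(2**31 - 1))  # 46340
```

Every intermediate value stays at or below `x`, so a 32-bit integer type is enough.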
wrong_approach: "Using 32-bit integers for mid * mid"
correct_approach: "Use 64-bit integers or division-based comparison"
- title: Linear Search is Too Slow
description: |
A naive approach might iterate from 1 to x, checking each number:
```python
for i in range(1, x + 1):
    if i * i > x:
        return i - 1
```
With `x` up to 2^31 - 1, this means up to 46,340 iterations in the worst case (since sqrt(2^31) ~ 46340). While this might pass, it's inefficient. Binary search over `[1, x // 2]` needs only about 30 iterations.
wrong_approach: "Linear scan from 1 to sqrt(x)"
correct_approach: "Binary search reducing search space by half each iteration"
- title: Off-by-One Errors in Binary Search
description: |
Binary search is notoriously tricky with boundary conditions. Common mistakes include:
- Using `left < right` instead of `left <= right`, missing the case where they're equal
- Not storing valid candidates when `mid * mid < x`
- Returning `mid` instead of `result` at the end
The key is understanding we want the **largest** valid answer. When `mid * mid < x`, `mid` is valid but there might be a larger valid number, so we store it and keep searching right.
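To see the first mistake in action, here is a sketch with a switch that reproduces the `left < right` bug. The `inclusive` parameter is a hypothetical flag added only for this demonstration.

```python
def floor_sqrt(x: int, inclusive: bool = True) -> int:
    """Binary search for the floor square root; inclusive=False uses the buggy loop test."""
    if x < 2:
        return x
    left, right = 1, x // 2
    result = 0
    while (left <= right) if inclusive else (left < right):
        mid = left + (right - left) // 2
        if mid * mid <= x:
            result = mid  # valid candidate, keep looking for a larger one
            left = mid + 1
        else:
            right = mid - 1
    return result


print(floor_sqrt(4))                   # 2
print(floor_sqrt(4, inclusive=False))  # 1
```

With the strict comparison the loop stops while `left == right == 2`, so the candidate 2 is never examined.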
wrong_approach: "Incorrect loop condition or forgetting to track valid candidates"
correct_approach: "Use left <= right, store valid candidates, return the stored result"
key_takeaways:
- "**Binary search on answer space**: When the answer lies in a sorted range and you can verify if a candidate is valid, binary search works beautifully"
- "**Monotonic property**: The square function is monotonically increasing, which is the prerequisite for binary search"
- "**Floor vs exact match**: This problem asks for the floor, so track the largest valid candidate rather than only returning exact matches"
- "**Foundation for harder problems**: This technique extends to problems like finding cube roots, nth roots, or any monotonic function inversion"
time_complexity: "O(log x). Binary search halves the search space each iteration, giving us at most log2(x) iterations."
space_complexity: "O(1). We only use a constant number of variables (`left`, `right`, `mid`, `result`) regardless of input size."
solutions:
- approach_name: Binary Search
is_optimal: true
code: |
def my_sqrt(x: int) -> int:
    # Edge cases: sqrt(0) = 0, sqrt(1) = 1
    if x < 2:
        return x
    # Search space: answer is between 1 and x//2
    left, right = 1, x // 2
    result = 0
    while left <= right:
        # Calculate mid avoiding overflow (not needed in Python, but good habit)
        mid = left + (right - left) // 2
        square = mid * mid
        if square == x:
            # Found exact square root
            return mid
        elif square < x:
            # mid is valid candidate, but there might be a larger one
            result = mid
            left = mid + 1
        else:
            # mid is too large, search smaller numbers
            right = mid - 1
    return result
explanation: |
**Time Complexity:** O(log x) — Binary search halves the search space each iteration.
**Space Complexity:** O(1) — Only uses a few integer variables.
This solution uses binary search to find the largest integer whose square is at most `x`. By tracking valid candidates in `result` and continuing to search for potentially larger valid answers, we correctly handle the "round down" requirement.
- approach_name: Newton's Method
is_optimal: false
code: |
def my_sqrt(x: int) -> int:
    # Newton's method for finding roots
    # For f(r) = r^2 - x, we iterate: r = r - f(r)/f'(r) = (r + x/r) / 2
    if x < 2:
        return x
    # Start with initial guess
    r = x
    while r * r > x:
        # Newton's iteration: average of r and x/r
        r = (r + x // r) // 2
    return r
explanation: |
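**Time Complexity:** O(log x). Starting from `r = x`, each early iteration roughly halves `r`, so reaching the neighbourhood of the root takes O(log x) steps; from there Newton's method converges quadratically.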
**Space Complexity:** O(1) — Only uses a single variable for the approximation.
Newton's method (also known as Heron's method for square roots) uses calculus-based iteration. Starting from an initial guess, each iteration gets closer to the true square root. While mathematically elegant, the binary search approach is often preferred in interviews as it's more straightforward to reason about and debug.
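As a rough illustration of how the iteration closes in on the answer, the sketch below records every approximation it visits. The helper `newton_trace` is a hypothetical name used only here.

```python
def newton_trace(x: int) -> list[int]:
    """Return every approximation r visited by the integer Newton iteration."""
    r = x
    trace = [r]
    while r * r > x:
        r = (r + x // r) // 2  # average of r and x / r, using integer division
        trace.append(r)
    return trace


print(newton_trace(8))  # [8, 4, 3, 2]
```

The first step halves the guess, and the later steps shrink it more slowly until `r * r <= x` holds.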
- approach_name: Linear Search
is_optimal: false
code: |
def my_sqrt(x: int) -> int:
    # Brute force: check each number from 1 upward
    if x < 2:
        return x
    i = 1
    while i * i <= x:
        i += 1
    # i * i > x, so answer is i - 1
    return i - 1
explanation: |
**Time Complexity:** O(sqrt(x)) — We iterate up to the square root of x.
**Space Complexity:** O(1) — Only uses a counter variable.
This brute force approach checks each integer starting from 1 until we find one whose square exceeds x. While correct and simple to understand, it's less efficient than binary search. For x = 2^31 - 1, this takes ~46,340 iterations versus ~31 for binary search.

@@ -0,0 +1,219 @@
title: Stone Game II
slug: stone-game-ii
difficulty: medium
leetcode_id: 1140
leetcode_url: https://leetcode.com/problems/stone-game-ii/
categories:
- arrays
- dynamic-programming
patterns:
- dynamic-programming
- prefix-sum
description: |
Alice and Bob continue their games with piles of stones. There are a number of piles **arranged in a row**, and each pile has a positive integer number of stones `piles[i]`. The objective of the game is to end with the most stones.
Alice and Bob take turns, with Alice starting first.
On each player's turn, that player can take **all the stones** in the **first** `X` remaining piles, where `1 <= X <= 2M`. Then, we set `M = max(M, X)`.
Initially, `M = 1`.
The game continues until all the stones have been taken.
Assuming Alice and Bob play **optimally**, return *the maximum number of stones Alice can get*.
constraints: |
- `1 <= piles.length <= 100`
- `1 <= piles[i] <= 10^4`
examples:
- input: "piles = [2,7,9,4,4]"
output: "10"
explanation: "If Alice takes one pile at the beginning, Bob takes two piles, then Alice takes 2 piles again. Alice can get 2 + 4 + 4 = 10 stones in total. If Alice takes two piles at the beginning, then Bob can take all three piles left. In this case, Alice gets 2 + 7 = 9 stones. So we return 10 since it's larger."
- input: "piles = [1,2,3,4,5,100]"
output: "104"
explanation: "Alice needs to play optimally to secure the pile worth 100 stones."
explanation:
intuition: |
Imagine you and a friend are dividing a row of treasure chests, taking turns from the left side. Each turn, you can take between 1 and `2M` chests, where `M` grows based on how greedy either player gets.
The twist is that **both players play optimally** — they both make the best possible decision at every step. This creates a **minimax** situation: when it's your turn, you want to maximise your score, but you must account for the fact that your opponent will then try to maximise *their* score from the remaining piles.
Think of it like this: at any position in the game, defined by which pile we're starting from (`i`) and the current value of `M`, there's some total number of stones remaining. If you take `X` piles, your opponent then plays optimally from the new state. The key insight is:
**Your score = (Total remaining stones) - (Opponent's optimal score from the next state)**
This works because every remaining stone ends up with one of the two players, so whatever the opponent collects from the next state is exactly what you do not get. By using a **suffix sum** (sum of all stones from index `i` to the end), we can quickly calculate the total remaining stones at any point.
We use **memoisation** because the same game state `(i, M)` can be reached through different sequences of moves, and recomputing it each time would be wasteful.
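A minimal sketch of the suffix sum for the first example, built here with `itertools.accumulate` rather than an explicit backwards loop. The extra trailing `0` is a convenience for the state where no piles are left.

```python
from itertools import accumulate

piles = [2, 7, 9, 4, 4]
# Accumulate from the right, then flip back; the trailing 0 covers "no piles left"
suffix_sum = list(accumulate(reversed(piles)))[::-1] + [0]
print(suffix_sum)  # [26, 24, 17, 8, 4, 0]
```

Here `suffix_sum[0]` is the total of all stones and `suffix_sum[3]` is what remains once the first three piles are gone.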
approach: |
We solve this using **Top-Down Dynamic Programming with Memoisation**:
**Step 1: Precompute suffix sums**
- Create a `suffix_sum` array where `suffix_sum[i]` = sum of all stones from index `i` to the end
- This allows O(1) lookup of total remaining stones at any position
- Build it by iterating backwards: `suffix_sum[i] = piles[i] + suffix_sum[i+1]`
&nbsp;
**Step 2: Define the recursive state**
- State: `(i, M)` where `i` is the current pile index and `M` is the current M value
- `dp(i, M)` returns the **maximum stones the current player can get** starting from pile `i` with the given `M`
&nbsp;
**Step 3: Handle base cases**
- If `i >= n` (no piles left), return `0`
- If `i + 2*M >= n` (can take all remaining piles), return `suffix_sum[i]`
&nbsp;
**Step 4: Recursive choice**
- Try taking `X` piles for each valid `X` from `1` to `2*M`
- If we take `X` piles, opponent plays optimally from state `(i + X, max(M, X))`
- Our score = `suffix_sum[i] - dp(i + X, max(M, X))`
- Take the maximum across all choices of `X`
&nbsp;
**Step 5: Return the answer**
- Call `dp(0, 1)` — Alice starts at index 0 with M = 1
common_pitfalls:
- title: Forgetting the Suffix Sum Optimisation
description: |
Without suffix sums, you'd need to sum the remaining piles repeatedly inside the recursion, adding an O(n) factor to each state evaluation.
With `n = 100` and up to `n * n = 10,000` states, this could push the solution towards TLE. Precomputing suffix sums keeps each state evaluation at O(M), which is bounded by O(n).
wrong_approach: "Summing piles[i:] inside each recursive call"
correct_approach: "Precompute suffix_sum array for O(1) remaining sum lookups"
- title: Confusing Whose Score to Maximise
description: |
A common mistake is trying to track Alice's and Bob's scores separately or using different logic for each player's turn.
The elegant insight is that **both players use the same logic**: maximise the current player's score. The relationship `my_score = remaining - opponent_score` handles the adversarial nature automatically.
wrong_approach: "Separate recursion branches for Alice vs Bob"
correct_approach: "Single recursive function that maximises current player's score"
- title: Missing the Greedy Shortcut
description: |
When `i + 2*M >= n`, the current player can take *all* remaining piles. Some implementations miss this base case and continue recursing unnecessarily.
This optimisation also helps with memoisation efficiency — fewer states need to be explored.
wrong_approach: "Always iterating through all X choices even when you can take everything"
correct_approach: "Return suffix_sum[i] immediately when 2*M covers all remaining piles"
- title: Incorrect M Update
description: |
Remember that `M` updates to `max(M, X)`, not just `X`. If the current `M` is already larger than `X`, it stays the same.
For example, if M = 3 and you take X = 2 piles, the new M remains 3, not 2.
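The update rule can be sketched with a small helper that replays a sequence of moves. The name `trace_m` and its signature are assumptions made for this illustration.

```python
def trace_m(moves: list[int], m: int = 1) -> list[int]:
    """Return the value of M after each move, checking 1 <= X <= 2M on the way."""
    history = []
    for x in moves:
        if not 1 <= x <= 2 * m:
            raise ValueError(f"illegal move X={x} when M={m}")
        m = max(m, x)  # M never decreases
        history.append(m)
    return history


print(trace_m([1, 2, 2]))  # [1, 2, 2]
print(trace_m([2], m=3))   # [3]
```

The first call replays the moves from the first example; the second shows `M` staying at 3 after a smaller take.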
wrong_approach: "Setting new_M = X"
correct_approach: "Setting new_M = max(M, X)"
key_takeaways:
- "**Game theory pattern**: In two-player zero-sum games, use `my_score = total - opponent_score` to simplify the logic"
- "**Suffix sum technique**: Precompute cumulative sums from the end when you frequently need 'remaining total' calculations"
- "**State design**: The state `(i, M)` captures everything needed — whose turn it is doesn't matter because both players use identical optimal logic"
- "**Memoisation is essential**: Without caching, the same state would be recomputed exponentially many times"
time_complexity: "O(n^3). We have O(n^2) possible states (index `i` from 0 to n, M from 1 to n), and for each state we try up to O(n) choices of X."
space_complexity: "O(n^2). The memoisation cache stores up to O(n^2) states, plus O(n) for the suffix sum array and recursion stack."
solutions:
- approach_name: Top-Down DP with Memoisation
is_optimal: true
code: |
def stone_game_ii(piles: list[int]) -> int:
    n = len(piles)
    # Precompute suffix sums for O(1) "remaining stones" lookup
    suffix_sum = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix_sum[i] = piles[i] + suffix_sum[i + 1]
    # Memoisation cache: (index, M) -> max stones for current player
    memo = {}
    def dp(i: int, m: int) -> int:
        # Base case: no piles left
        if i >= n:
            return 0
        # Optimisation: can take all remaining piles
        if i + 2 * m >= n:
            return suffix_sum[i]
        # Check cache
        if (i, m) in memo:
            return memo[(i, m)]
        # Try all valid choices of X (1 to 2M piles)
        max_stones = 0
        for x in range(1, 2 * m + 1):
            # Total stones still available from pile i onward (before we take X piles)
            remaining = suffix_sum[i]
            # Opponent's optimal score from new state
            opponent_score = dp(i + x, max(m, x))
            # Our score = remaining - what opponent gets
            our_score = remaining - opponent_score
            max_stones = max(max_stones, our_score)
        memo[(i, m)] = max_stones
        return max_stones
    # Alice starts at index 0 with M = 1
    return dp(0, 1)
explanation: |
**Time Complexity:** O(n^3) — O(n^2) states, each examining up to O(n) choices.
**Space Complexity:** O(n^2) — Memoisation cache plus suffix sum array.
The key insight is using suffix sums to quickly calculate remaining stones, and the relationship `my_score = remaining - opponent_score` to handle the adversarial game elegantly. Memoisation ensures each state is computed only once.
- approach_name: Bottom-Up DP
is_optimal: true
code: |
def stone_game_ii(piles: list[int]) -> int:
    n = len(piles)
    # Precompute suffix sums
    suffix_sum = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix_sum[i] = piles[i] + suffix_sum[i + 1]
    # dp[i][m] = max stones current player can get starting at i with M = m
    # M can be at most n (taking all piles at once)
    dp = [[0] * (n + 1) for _ in range(n + 1)]
    # Fill table backwards (from end of piles to start)
    for i in range(n - 1, -1, -1):
        for m in range(1, n + 1):
            # Can take all remaining piles
            if i + 2 * m >= n:
                dp[i][m] = suffix_sum[i]
            else:
                # Try all valid X choices
                max_stones = 0
                for x in range(1, 2 * m + 1):
                    opponent_score = dp[i + x][max(m, x)]
                    our_score = suffix_sum[i] - opponent_score
                    max_stones = max(max_stones, our_score)
                dp[i][m] = max_stones
    return dp[0][1]
explanation: |
**Time Complexity:** O(n^3) — Same as top-down approach.
**Space Complexity:** O(n^2) — 2D DP table plus suffix sum array.
This iterative version fills the DP table from the end backwards. It's functionally equivalent to the memoised version but may have slightly better constant factors due to avoiding recursion overhead. The logic remains the same: maximise current player's stones using the suffix sum relationship.

@@ -0,0 +1,258 @@
title: Stone Game III
slug: stone-game-iii
difficulty: hard
leetcode_id: 1406
leetcode_url: https://leetcode.com/problems/stone-game-iii/
categories:
- arrays
- dynamic-programming
patterns:
- dynamic-programming
description: |
Alice and Bob continue their games with piles of stones. There are several stones **arranged in a row**, and each stone has an associated value which is an integer given in the array `stoneValue`.
Alice and Bob take turns, with Alice starting first. On each player's turn, that player can take `1`, `2`, or `3` stones from the **first** remaining stones in the row.
The score of each player is the sum of the values of the stones taken. The score of each player is `0` initially.
The objective of the game is to end with the highest score, and the winner is the player with the highest score and there could be a tie. The game continues until all the stones have been taken.
Assume Alice and Bob **play optimally**.
Return `"Alice"` *if Alice will win*, `"Bob"` *if Bob will win*, or `"Tie"` *if they will end the game with the same score*.
constraints: |
- `1 <= stoneValue.length <= 5 * 10^4`
- `-1000 <= stoneValue[i] <= 1000`
examples:
- input: "stoneValue = [1,2,3,7]"
output: '"Bob"'
explanation: "Alice will always lose. Her best move will be to take three piles and the score becomes 6. Now the score of Bob is 7 and Bob wins."
- input: "stoneValue = [1,2,3,-9]"
output: '"Alice"'
explanation: "Alice must choose all the three piles at the first move to win and leave Bob with negative score. If Alice chooses one pile her score will be 1 and the next move Bob's score becomes 5. In the next move, Alice will take the pile with value = -9 and lose."
- input: "stoneValue = [1,2,3,6]"
output: '"Tie"'
explanation: "Alice cannot win this game. She can end the game in a draw if she decided to choose all the first three piles, otherwise she will lose."
explanation:
intuition: |
Imagine you're at a buffet line where each dish has a "value" — some positive (delicious) and some negative (terrible). You and your opponent take turns, and each turn you must take 1, 2, or 3 consecutive dishes from the front of the line. Both of you want to maximize your own total value.
The key insight is the **zero-sum nature** of this game: whatever stones remain after you pick, your opponent will play optimally on those remaining stones. So instead of tracking both players' scores separately, we can think in terms of **relative advantage**.
Define `dp[i]` as the maximum **score difference** (current player's score minus opponent's score) that the current player can achieve starting from index `i`. When it's your turn at position `i`:
- If you take stones `i` to `i+k-1` (where `k` is 1, 2, or 3), you gain those values
- Then your opponent plays optimally from position `i+k`, achieving `dp[i+k]` for themselves
- Your relative advantage becomes: `sum of stones taken - dp[i+k]`
The subtraction of `dp[i+k]` captures the **minimax** principle — your opponent's best outcome becomes your deficit.
At the end, if `dp[0] > 0`, Alice (who starts) has a positive advantage and wins. If `dp[0] < 0`, Bob wins. If `dp[0] == 0`, it's a tie.
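A worked sketch of the recurrence on two of the examples. It returns the whole `dp` array so the intermediate values are visible; the function name `score_differences` is an assumption made for this illustration.

```python
def score_differences(stone_value: list[int]) -> list[int]:
    """Return dp where dp[i] is the best score difference for the player moving at index i."""
    n = len(stone_value)
    dp = [0] * (n + 1)  # dp[n] = 0: no stones left, no advantage
    for i in range(n - 1, -1, -1):
        taken = 0
        best = None
        for k in range(1, 4):
            if i + k > n:
                break  # fewer than k stones remain
            taken += stone_value[i + k - 1]
            candidate = taken - dp[i + k]
            if best is None or candidate > best:
                best = candidate
        dp[i] = best
    return dp


print(score_differences([1, 2, 3, 7]))   # [-1, 12, 10, 7, 0]
print(score_differences([1, 2, 3, -9]))  # [15, 14, 12, -9, 0]
```

In the first case `dp[0]` is -1, so Bob wins by one point; in the second it is 15, matching Alice's 6 against Bob's -9.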
approach: |
We solve this using **Dynamic Programming** with state representing the relative score difference:
**Step 1: Define the DP state**
- `dp[i]`: Maximum score difference (current player minus opponent) achievable starting from index `i`
- Base case: `dp[n] = 0` (no stones left means no advantage)
&nbsp;
**Step 2: Build the recurrence relation**
- At each position `i`, the current player can take 1, 2, or 3 stones
- For each choice `k` (1, 2, or 3):
- Player gains: `stoneValue[i] + stoneValue[i+1] + ... + stoneValue[i+k-1]`
- Opponent then achieves: `dp[i+k]` from the remaining stones
- Net advantage: `sum of k stones - dp[i+k]`
- Choose the maximum among all valid options
&nbsp;
**Step 3: Iterate backwards from the end**
- Process positions from `n-1` down to `0`
- Use a suffix sum to efficiently calculate the sum of stones taken
- Track `dp[i+1]`, `dp[i+2]`, `dp[i+3]` for the three possible moves
&nbsp;
**Step 4: Determine the winner**
- If `dp[0] > 0`: Alice wins (she has positive advantage)
- If `dp[0] < 0`: Bob wins (Alice has negative advantage, meaning Bob is ahead)
- If `dp[0] == 0`: Tie
common_pitfalls:
- title: Tracking Both Scores Separately
description: |
A natural first instinct is to track Alice's score and Bob's score as separate states. This leads to a multi-dimensional DP with states like `dp[i][aliceScore][bobScore]`, which has prohibitive complexity.
The insight is that we only care about the **difference** between scores, not the absolute values. This reduces the problem to a single dimension: the relative advantage of whoever is currently playing.
wrong_approach: "Track separate scores for Alice and Bob"
correct_approach: "Track score difference (current player - opponent)"
- title: Forgetting Negative Stone Values
description: |
Unlike simpler stone game variants, this problem has **negative values**. This means sometimes the optimal play is to take fewer stones to force your opponent to take negative ones.
For example, with `[1, 2, 3, -9]`, Alice's optimal move is to take all three positive stones (sum = 6) and leave Bob with just the -9, giving Bob a score of -9. Alice wins 6 to -9.
If Alice only took one stone, Bob could take `[2, 3]` (sum = 5) and leave Alice with -9. Bob would win.
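To compare Alice's opening options directly, the sketch below maps each first move to the final score difference it leads to under optimal play afterwards. The helper `first_move_outcomes` is hypothetical and uses plain recursion, so it is only meant for small inputs.

```python
from functools import cache


def first_move_outcomes(stone_value: list[int]) -> dict[int, int]:
    """Map each legal first move k to the final score difference it leads to."""
    n = len(stone_value)

    @cache
    def best(i: int) -> int:
        # Best score difference for the player to move at index i
        if i >= n:
            return 0
        return max(
            sum(stone_value[i:i + k]) - best(i + k)
            for k in range(1, 4)
            if i + k <= n
        )

    return {k: sum(stone_value[:k]) - best(k) for k in range(1, 4) if k <= n}


print(first_move_outcomes([1, 2, 3, -9]))  # {1: -13, 2: -9, 3: 15}
```

Only taking all three stones gives Alice a positive difference.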
wrong_approach: "Assume taking more stones is always better"
correct_approach: "Consider all 1, 2, 3 stone options and pick the best difference"
- title: Off-by-One Errors in Suffix Sums
description: |
When calculating the sum of the next `k` stones, be careful with indices. The sum from index `i` taking `k` stones is `suffixSum[i] - suffixSum[i+k]`, not `suffixSum[i+k]`.
Also handle the case where `i + k > n`: either skip that move, or clamp the index to `n` so the sum covers only the stones that remain and the out-of-bounds `dp` value is treated as 0.
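The indexing can be sketched with a small helper; `take_sum` is a hypothetical name, and the suffix array is written out by hand for the first example.

```python
def take_sum(suffix_sum: list[int], i: int, k: int) -> int:
    """Sum of the k stones starting at index i, clamped to the end of the row."""
    n = len(suffix_sum) - 1  # suffix_sum carries one extra trailing 0
    return suffix_sum[i] - suffix_sum[min(i + k, n)]


stone_value = [1, 2, 3, 7]
suffix_sum = [13, 12, 10, 7, 0]  # suffix_sum[i] = sum(stone_value[i:])
print(take_sum(suffix_sum, 1, 2))  # 5
print(take_sum(suffix_sum, 3, 3))  # 7
```

The second call asks for three stones when only one remains, and the clamp returns just that stone's value.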
wrong_approach: "Incorrect suffix sum indexing"
correct_approach: "Use suffixSum[i] - suffixSum[min(i+k, n)] for sum of k stones"
key_takeaways:
- "**Minimax principle**: In two-player zero-sum games, maximizing your advantage equals minimizing your opponent's advantage"
- "**Score difference DP**: Track relative advantage instead of absolute scores to reduce state complexity"
- "**Backward iteration**: Process from end to start, as each state depends on future states"
- "**Foundation for game theory**: This pattern applies to many competitive game problems (Stone Game variants, Nim, etc.)"
time_complexity: "O(n). We compute each `dp[i]` exactly once, and each computation considers at most 3 later states (`dp[i+1]`, `dp[i+2]`, `dp[i+3]`)."
space_complexity: "O(n) for the DP array. Can be optimized to O(1) since we only need the last 3 DP values."
solutions:
- approach_name: Dynamic Programming (Score Difference)
is_optimal: true
code: |
def stone_game_iii(stone_value: list[int]) -> str:
    n = len(stone_value)
    # dp[i] = max score difference (current player - opponent)
    # starting from index i
    # We need dp[i+1], dp[i+2], dp[i+3], so use array of size n+1
    dp = [0] * (n + 1)
    # Suffix sum for efficient range sum calculation
    suffix_sum = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix_sum[i] = suffix_sum[i + 1] + stone_value[i]
    # Fill DP from right to left
    for i in range(n - 1, -1, -1):
        # Try taking 1, 2, or 3 stones
        # Initialize to negative infinity to find maximum
        dp[i] = float('-inf')
        for k in range(1, 4):  # k = 1, 2, or 3 stones
            if i + k <= n:
                # Sum of stones taken: suffix_sum[i] - suffix_sum[i+k]
                stones_taken = suffix_sum[i] - suffix_sum[i + k]
                # Our advantage = stones we get - opponent's best outcome
                dp[i] = max(dp[i], stones_taken - dp[i + k])
    # Determine winner based on Alice's advantage (she starts at index 0)
    if dp[0] > 0:
        return "Alice"
    elif dp[0] < 0:
        return "Bob"
    else:
        return "Tie"
explanation: |
**Time Complexity:** O(n) — Single pass through the array from right to left, with O(1) work per position.
**Space Complexity:** O(n) — For the DP array and suffix sum array.
We define `dp[i]` as the maximum score difference achievable by the current player starting from index `i`. By iterating backwards and considering all three choices (take 1, 2, or 3 stones), we build up the optimal strategy. The minimax principle is captured by subtracting `dp[i+k]` — the opponent's best outcome becomes our deficit.
- approach_name: Space-Optimized DP
is_optimal: true
code: |
def stone_game_iii(stone_value: list[int]) -> str:
    n = len(stone_value)
    # Only need the next 3 DP values
    # dp_next[0] = dp[i+1], dp_next[1] = dp[i+2], dp_next[2] = dp[i+3]
    dp_next = [0, 0, 0]
    # Process from right to left
    for i in range(n - 1, -1, -1):
        # Try taking 1, 2, or 3 stones
        best = float('-inf')
        take_sum = 0  # Running sum of the stones taken so far
        for k in range(1, 4):
            if i + k - 1 < n:
                take_sum += stone_value[i + k - 1]
                # Opponent's best is dp_next[k-1] (which is dp[i+k])
                best = max(best, take_sum - dp_next[k - 1])
        # Shift the window: dp[i+3] <- dp[i+2] <- dp[i+1] <- dp[i]
        dp_next[2] = dp_next[1]
        dp_next[1] = dp_next[0]
        dp_next[0] = best
    # dp_next[0] now holds dp[0], Alice's maximum advantage
    if dp_next[0] > 0:
        return "Alice"
    elif dp_next[0] < 0:
        return "Bob"
    else:
        return "Tie"
explanation: |
**Time Complexity:** O(n) — Single pass through the array.
**Space Complexity:** O(1) — Only stores 3 previous DP values.
This optimized version recognizes that `dp[i]` only depends on `dp[i+1]`, `dp[i+2]`, and `dp[i+3]`. By maintaining a sliding window of just 3 values, we reduce space from O(n) to O(1) while preserving the same logic.
- approach_name: Recursive with Memoization
is_optimal: false
code: |
def stone_game_iii(stone_value: list[int]) -> str:
    n = len(stone_value)
    memo = {}
    def dp(i: int) -> int:
        """Return max score difference for current player starting at i."""
        if i >= n:
            return 0
        if i in memo:
            return memo[i]
        # Try taking 1, 2, or 3 stones
        best = float('-inf')
        take_sum = 0
        for k in range(1, 4):
            if i + k - 1 < n:
                take_sum += stone_value[i + k - 1]
                # Our advantage = stones taken - opponent's best
                best = max(best, take_sum - dp(i + k))
        memo[i] = best
        return best
    result = dp(0)
    if result > 0:
        return "Alice"
    elif result < 0:
        return "Bob"
    else:
        return "Tie"
explanation: |
**Time Complexity:** O(n) — Each state computed once due to memoization.
**Space Complexity:** O(n) — For memoization dictionary and recursion stack.
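This top-down approach may be more intuitive for some. We recursively compute the best score difference from each position, caching results to avoid recomputation. The logic is identical to the bottom-up DP but expressed recursively. Note that the recursion can go `n` levels deep, and with `n` up to 5 * 10^4 that exceeds Python's default recursion limit of 1000, so this version needs a raised limit (`sys.setrecursionlimit`) or should give way to the bottom-up version for large inputs.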

@@ -0,0 +1,206 @@
title: Stone Game
slug: stone-game
difficulty: medium
leetcode_id: 877
leetcode_url: https://leetcode.com/problems/stone-game/
categories:
- arrays
- dynamic-programming
- math
patterns:
- dynamic-programming
description: |
Alice and Bob play a game with piles of stones. There are an **even** number of piles arranged in a row, and each pile has a **positive** integer number of stones `piles[i]`.
The objective of the game is to end with the most stones. The **total** number of stones across all the piles is **odd**, so there are no ties.
Alice and Bob take turns, with **Alice starting first**. Each turn, a player takes the entire pile of stones either from the **beginning** or from the **end** of the row. This continues until there are no more piles left, at which point the person with the **most stones wins**.
Assuming Alice and Bob play optimally, return `true` *if Alice wins the game, or* `false` *if Bob wins*.
constraints: |
- `2 <= piles.length <= 500`
- `piles.length` is **even**
- `1 <= piles[i] <= 500`
- `sum(piles[i])` is **odd**
examples:
- input: "piles = [5,3,4,5]"
output: "true"
explanation: "Alice starts first, and can only take the first 5 or the last 5. Say she takes the first 5, so that the row becomes [3, 4, 5]. If Bob takes 3, then the board is [4, 5], and Alice takes 5 to win with 10 points. If Bob takes the last 5, then the board is [3, 4], and Alice takes 4 to win with 9 points. Taking the first 5 was a winning move for Alice."
- input: "piles = [3,7,2,3]"
output: "true"
explanation: "Alice can always win by playing optimally."
explanation:
intuition: |
This is a classic **two-player game theory** problem where both players play optimally. At first glance, it seems like we need complex dynamic programming to simulate all possible game states.
However, there's a beautiful mathematical insight: **Alice can always win**. Here's why:
Think of the piles as being at positions `0, 1, 2, ..., n-1`. Since `n` is even, we can separate them into:
- **Even-indexed piles**: positions `0, 2, 4, ...`
- **Odd-indexed piles**: positions `1, 3, 5, ...`
The key insight is that **Alice can always choose to take all even-indexed piles OR all odd-indexed piles**. Here's how:
- If Alice wants even-indexed piles, she starts by taking `piles[0]` (the first pile)
- This forces Bob to choose between positions `1` and `n-1` — both odd-indexed!
- Whatever Bob picks, Alice's next choice will again include an even-indexed pile
- Alice continues this pattern, always having access to even-indexed positions
Since the total sum is odd and there are no ties, either the even-indexed sum or the odd-indexed sum must be larger. Alice simply chooses the strategy that gives her the larger total. Therefore, **Alice always wins**.
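The parity strategy can be sketched as a small simulation in which Alice follows it and Bob picks ends at random. The function name `play_parity_strategy` and the `seed` parameter are assumptions made for this illustration, and the input is assumed to have even length as the problem guarantees.

```python
import random


def play_parity_strategy(piles: list[int], seed: int = 0) -> tuple[int, int]:
    """Alice follows the parity strategy; Bob picks a random end. Returns (alice, bob)."""
    rng = random.Random(seed)
    even_sum, odd_sum = sum(piles[0::2]), sum(piles[1::2])
    target = 0 if even_sum > odd_sum else 1  # parity of the indices Alice collects
    lo, hi = 0, len(piles) - 1
    alice = bob = 0
    while lo <= hi:
        # Alice's turn: exactly one end has the target parity, so she takes it
        if lo % 2 == target:
            alice += piles[lo]
            lo += 1
        else:
            alice += piles[hi]
            hi -= 1
        # Bob's turn: both ends now have the other parity, so his choice cannot hurt Alice
        if rng.random() < 0.5:
            bob += piles[lo]
            lo += 1
        else:
            bob += piles[hi]
            hi -= 1
    return alice, bob


print(play_parity_strategy([5, 3, 4, 5]))  # (9, 8)
```

Whatever Bob does, Alice ends with the sum of her chosen parity class. This guarantees a win but not necessarily the largest possible margin, which is what the DP approach computes.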
approach: |
We present two approaches: the elegant mathematical solution and the DP solution for educational purposes.
**Mathematical Approach (Optimal)**
**Step 1: Recognise the game structure**
- With an even number of piles, Alice can control whether she collects all even-indexed or all odd-indexed piles
- The sums of these two groups are different (total is odd, so no ties)
&nbsp;
**Step 2: Conclude Alice wins**
- Alice computes which group (even or odd indices) has more stones
- She plays the strategy to collect that group
- Return `True` unconditionally
&nbsp;
**Dynamic Programming Approach (Educational)**
**Step 1: Define the state**
- `dp[i][j]`: The maximum *advantage* (difference in stones) the current player can achieve over their opponent when choosing from `piles[i..j]`
&nbsp;
**Step 2: Base case**
- `dp[i][i] = piles[i]`: With one pile, the current player takes it all (advantage = pile value)
&nbsp;
**Step 3: Transition**
- Current player can take `piles[i]` (left) or `piles[j]` (right)
- If they take `piles[i]`, opponent then plays optimally on `piles[i+1..j]`
- The opponent's advantage becomes our disadvantage: `dp[i][j] = piles[i] - dp[i+1][j]`
- Similarly for taking right: `dp[i][j] = piles[j] - dp[i][j-1]`
- Take the maximum of both choices
&nbsp;
**Step 4: Return result**
- `dp[0][n-1] > 0` means Alice has positive advantage, so she wins
common_pitfalls:
- title: Overcomplicating with Full Simulation
description: |
Many people immediately jump to complex game-tree simulation or minimax with alpha-beta pruning. While these work, they're overkill for this problem.
The mathematical insight that Alice can always control parity makes the solution trivial. Always look for structural properties before implementing complex algorithms.
wrong_approach: "Full game tree exploration"
correct_approach: "Recognise Alice controls even/odd parity"
- title: Missing the Parity Control Insight
description: |
It's not obvious that the first player can force a specific subset of piles. The key is realising that after Alice takes from one end, both ends of the remaining row have the opposite parity (in the original indexing), so Bob is forced to take a pile from the other parity class, and this pattern continues.
With an even-length array, this control over parity is absolute and deterministic.
wrong_approach: "Assuming both players have equal opportunity"
correct_approach: "Alice dictates the parity pattern"
- title: Incorrect DP State Definition
description: |
A common mistake is defining `dp[i][j]` as "the maximum stones the current player can get" rather than "the maximum advantage over the opponent."
Using advantage simplifies the recurrence because taking a pile and then having the opponent play optimally means your advantage is `pile_value - opponent's_advantage`.
wrong_approach: "dp[i][j] = max stones for current player"
correct_approach: "dp[i][j] = max advantage (stone difference) for current player"
key_takeaways:
- "**Look for structural insights**: Before implementing complex algorithms, check if the problem has mathematical properties that simplify it drastically"
- "**Parity arguments in games**: When array length is even, the first player often has control over which subset of elements they can guarantee"
- "**Minimax with advantage**: In two-player zero-sum games, defining DP state as 'advantage over opponent' often simplifies the recurrence"
- "**Foundation for game theory**: This problem introduces concepts used in Stone Game II, III, and other game theory problems"
time_complexity: "O(1) for the mathematical solution (Alice always wins). O(n^2) for the DP solution where `n` is the number of piles."
space_complexity: "O(1) for the mathematical solution. O(n^2) for the DP solution to store the 2D table."
solutions:
- approach_name: Mathematical (Parity Argument)
is_optimal: true
code: |
def stone_game(piles: list[int]) -> bool:
    # Alice can always choose to take all even-indexed
    # or all odd-indexed piles. Since total is odd,
    # one group must be larger. Alice picks that strategy.
    # Therefore, Alice always wins.
    return True
explanation: |
**Time Complexity:** O(1) — No computation needed.
**Space Complexity:** O(1) — No extra space used.
This elegant solution leverages the structural property that Alice, moving first with an even number of piles, can always force a winning strategy by controlling which parity of indices she collects.
- approach_name: Dynamic Programming
is_optimal: false
code: |
def stone_game(piles: list[int]) -> bool:
    n = len(piles)
    # dp[i][j] = max advantage current player can achieve on piles[i..j]
    dp = [[0] * n for _ in range(n)]
    # Base case: single pile, take it all
    for i in range(n):
        dp[i][i] = piles[i]
    # Fill for increasing lengths
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            # Take left pile: gain piles[i], opponent plays on [i+1, j]
            take_left = piles[i] - dp[i + 1][j]
            # Take right pile: gain piles[j], opponent plays on [i, j-1]
            take_right = piles[j] - dp[i][j - 1]
            # Choose the better option
            dp[i][j] = max(take_left, take_right)
    # If Alice's advantage is positive, she wins
    return dp[0][n - 1] > 0
explanation: |
**Time Complexity:** O(n^2) — We fill an n x n DP table.
**Space Complexity:** O(n^2) — Storage for the 2D DP table.
This approach explicitly computes the optimal play using interval DP. While it correctly solves the problem, the mathematical solution is simpler. This DP approach is valuable for understanding game theory and extends to variants like Stone Game II and III where the mathematical shortcut doesn't apply.
- approach_name: Space-Optimised DP
is_optimal: false
code: |
def stone_game(piles: list[int]) -> bool:
    n = len(piles)
    # dp[i] holds the advantage for the interval of the current length starting at i
    # We only need the previous diagonal (intervals one shorter) to compute this one
    dp = piles[:]  # Base case: dp[i] = piles[i] for length 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            # dp[i + 1] still holds [i+1, j] and dp[i] still holds [i, j-1]
            # Take left or right, subtract opponent's optimal play
            dp[i] = max(piles[i] - dp[i + 1], piles[j] - dp[i])
    return dp[0] > 0
explanation: |
**Time Complexity:** O(n^2) — Same iteration as 2D DP.
**Space Complexity:** O(n) — Only a 1D array needed.
This optimises the 2D DP by observing that we only need the previous diagonal to compute the current one. We reuse a 1D array, updating it in-place. This is useful when `n` is large and O(n^2) space becomes a concern.

@@ -0,0 +1,182 @@
title: Subarray Product Less Than K
slug: subarray-product-less-than-k
difficulty: medium
leetcode_id: 713
leetcode_url: https://leetcode.com/problems/subarray-product-less-than-k/
categories:
- arrays
- two-pointers
patterns:
- sliding-window
description: |
Given an array of integers `nums` and an integer `k`, return *the number of contiguous subarrays where the product of all the elements in the subarray is strictly less than* `k`.
constraints: |
- `1 <= nums.length <= 3 * 10^4`
- `1 <= nums[i] <= 1000`
- `0 <= k <= 10^6`
examples:
- input: "nums = [10, 5, 2, 6], k = 100"
output: "8"
explanation: "The 8 subarrays that have product less than 100 are: [10], [5], [2], [6], [10, 5], [5, 2], [2, 6], [5, 2, 6]. Note that [10, 5, 2] is not included as the product of 100 is not strictly less than k."
- input: "nums = [1, 2, 3], k = 0"
output: "0"
explanation: "Since k = 0, no product of positive integers can be less than 0, so the answer is 0."
explanation:
intuition: |
Imagine you're sliding a flexible window across the array, and inside this window you're multiplying all the numbers together. The window can expand by including more elements from the right, but if the product becomes too large (>= k), you need to shrink the window from the left until the product is valid again.
The key insight is this: **when the window is valid (product < k), every subarray ending at the current right pointer and starting anywhere within the window is also valid**. If your window spans from index `left` to `right`, then the subarrays ending at `right` are: `[right]`, `[right-1, right]`, `[right-2, right-1, right]`, ... all the way to `[left, ..., right]`. That's exactly `right - left + 1` new valid subarrays.
Think of it like this: each time you extend the window to include a new element, you're asking "how many new valid subarrays does this create?" The answer is the current window size, because each starting position within the window gives you a unique subarray ending at the current position.
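To make the counting concrete, the sketch below records how many subarrays each `right` position contributes. `window_contributions` is a hypothetical helper used only for illustration and assumes `k > 1` and positive elements:
```python
def window_contributions(nums: list[int], k: int) -> list[int]:
    """Number of valid subarrays ending at each index (assumes k > 1)."""
    left = 0
    product = 1
    contributions = []
    for right, value in enumerate(nums):
        product *= value
        # Shrink until the window product is below k again
        while product >= k:
            product //= nums[left]
            left += 1
        contributions.append(right - left + 1)
    return contributions


print(window_contributions([10, 5, 2, 6], 100))  # [1, 2, 2, 3]
```
The contributions `1 + 2 + 2 + 3` add up to the expected answer of `8` for the first example.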
approach: |
We solve this using the **Sliding Window** technique:
**Step 1: Handle edge cases**
- If `k <= 1`, return `0` immediately since all elements are >= 1, so no product can be < 1 (or < 0)
&nbsp;
**Step 2: Initialise variables**
- `left`: Left boundary of window, starts at `0`
- `product`: Running product of elements in current window, starts at `1`
- `count`: Total count of valid subarrays, starts at `0`
&nbsp;
**Step 3: Expand the window**
- Iterate `right` from `0` to `n-1`
- Multiply `product` by `nums[right]` to include the new element
&nbsp;
**Step 4: Shrink if needed**
- While `product >= k` (no `left <= right` guard is needed: Step 1 ensures `k > 1`, and an empty window has product `1`):
  - Divide `product` by `nums[left]` to remove the leftmost element
  - Increment `left` to shrink the window
&nbsp;
**Step 5: Count valid subarrays**
- After shrinking, the window from `left` to `right` is valid
- Add `right - left + 1` to `count` (this counts all subarrays ending at `right`)
&nbsp;
**Step 6: Return result**
- Return `count` after processing all elements
common_pitfalls:
- title: Missing the Edge Case k <= 1
description: |
Since all elements in `nums` are >= 1, the product of any non-empty subarray is at least 1. If `k <= 1`, no subarray can have a product strictly less than `k`.
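Without this check the shrink loop never finds a valid window: `product >= k` stays true even once the window is empty, so `left` runs past `right` and, unless the loop also guards on `left <= right`, off the end of the array.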
wrong_approach: "Not handling k <= 1 specially"
correct_approach: "Return 0 immediately if k <= 1"
- title: Counting Subarrays Incorrectly
description: |
A common mistake is to count all subarrays within the window at each step, leading to overcounting. Or counting only 1 subarray per valid window position.
The correct insight is: **at each position `right`, we add exactly `right - left + 1` new subarrays** — these are all the subarrays that *end* at `right` and start anywhere from `left` to `right`.
wrong_approach: "Counting window_size * (window_size + 1) / 2 or just 1"
correct_approach: "Add right - left + 1 at each step"
- title: The Brute Force Trap
description: |
The naive approach checks every possible subarray and computes its product:
- Outer loop for start index
- Inner loop for end index
- Multiply all elements in each subarray
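This is O(n^2) when the product is carried along the inner loop, or O(n^3) when it is recomputed for every subarray. With `nums.length <= 3 * 10^4`, that is up to about 4.5 * 10^8 subarrays, which risks **Time Limit Exceeded**.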
wrong_approach: "Nested loops checking all subarrays"
correct_approach: "Sliding window with O(n) time"
- title: Integer Overflow Concerns
description: |
With `nums[i] <= 1000` and `nums.length <= 3 * 10^4`, the product of a whole array can become extremely large. However, the window product is below `k` before each multiplication, so it never exceeds `k * 1000 <= 10^9` before the window shrinks again.
That bound fits in a signed 32-bit integer, so overflow is not a concern under these constraints even in languages with fixed-width integers. Python's integers are arbitrary precision regardless.
key_takeaways:
- "**Sliding window for counting**: When counting valid subarrays with a monotonic property, sliding window often gives O(n) time"
- "**Count subarrays ending at each position**: Adding `right - left + 1` at each step counts all new subarrays introduced by expanding to `right`"
- "**Product vs sum windows**: For products, divide to shrink (vs subtract for sums). This works because all elements are positive"
- "**Related problems**: This pattern applies to subarray sum problems, longest substring problems, and other contiguous sequence counting tasks"
time_complexity: "O(n). Each element is added to the window once (when `right` advances) and removed at most once (when `left` advances)."
space_complexity: "O(1). We only use a fixed number of variables (`left`, `product`, `count`) regardless of input size."
solutions:
- approach_name: Sliding Window
is_optimal: true
code: |
def numSubarrayProductLessThanK(nums: list[int], k: int) -> int:
    # Edge case: no product of positive integers can be < 1
    if k <= 1:
        return 0
    left = 0
    product = 1
    count = 0
    for right in range(len(nums)):
        # Expand window by including nums[right]
        product *= nums[right]
        # Shrink window from left while product is too large
        while product >= k:
            product //= nums[left]
            left += 1
        # All subarrays ending at 'right' and starting from 'left' to 'right' are valid
        # That's (right - left + 1) subarrays
        count += right - left + 1
    return count
explanation: |
**Time Complexity:** O(n) — Each element enters and leaves the window at most once.
**Space Complexity:** O(1) — Only three variables used.
The sliding window maintains the invariant that the product of elements from `left` to `right` is always less than `k`. At each position, we count all valid subarrays ending at that position.
- approach_name: Brute Force
is_optimal: false
code: |
def numSubarrayProductLessThanK(nums: list[int], k: int) -> int:
    n = len(nums)
    count = 0
    # Try every starting position
    for i in range(n):
        product = 1
        # Extend to every ending position
        for j in range(i, n):
            product *= nums[j]
            # If product is still valid, count this subarray
            if product < k:
                count += 1
            else:
                # Product can only grow, so no point continuing
                break
    return count
explanation: |
**Time Complexity:** O(n^2) — Nested loops, though we break early when product exceeds k.
**Space Complexity:** O(1) — Only tracking product and count.
This approach tries every starting index and extends until the product becomes too large. While the early break helps in practice, worst case is still O(n^2). Too slow for large inputs but demonstrates the problem structure.

@@ -0,0 +1,170 @@
title: Subarray Sum Equals K
slug: subarray-sum-equals-k
difficulty: medium
leetcode_id: 560
leetcode_url: https://leetcode.com/problems/subarray-sum-equals-k/
categories:
- arrays
- hash-tables
patterns:
- prefix-sum
description: |
Given an array of integers `nums` and an integer `k`, return *the total number of subarrays whose sum equals to* `k`.
A subarray is a contiguous **non-empty** sequence of elements within an array.
constraints: |
- `1 <= nums.length <= 2 * 10^4`
- `-1000 <= nums[i] <= 1000`
- `-10^7 <= k <= 10^7`
examples:
- input: "nums = [1,1,1], k = 2"
output: "2"
explanation: "The subarrays [1,1] (indices 0-1) and [1,1] (indices 1-2) both sum to 2."
- input: "nums = [1,2,3], k = 3"
output: "2"
explanation: "The subarrays [1,2] and [3] both sum to 3."
explanation:
intuition: |
Imagine you're walking along a number line, keeping a running total of all the numbers you've seen so far. This running total is called a **prefix sum**.
Here's the key insight: if you've reached a certain prefix sum `current_sum`, and you want to find a subarray ending at this position that sums to `k`, you need to find a *previous* prefix sum that equals `current_sum - k`. Why? Because the difference between two prefix sums gives you the sum of the subarray between them.
Think of it like this: if your cumulative total at position `j` is 10, and your cumulative total at position `i` was 3, then the sum of elements from index `i+1` to `j` is `10 - 3 = 7`. So if you want a subarray sum of `k = 7`, you look for any previous prefix sum of `current_sum - k = 10 - 7 = 3`.
The trick is to use a **hash map** to count how many times each prefix sum has occurred. As you iterate through the array, you can instantly look up how many times `current_sum - k` has appeared before — each occurrence represents a valid subarray ending at the current index.
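A minimal sketch of the prefix-sum identity, using the second example. `prefix` and `range_sum` are illustrative names, and `prefix[0] = 0` stands for the empty prefix:
```python
from itertools import accumulate

nums = [1, 2, 3]
prefix = [0, *accumulate(nums)]  # prefix[t] = sum of the first t elements


def range_sum(i: int, j: int) -> int:
    # Sum of nums[i..j] inclusive as a difference of two prefix sums
    return prefix[j + 1] - prefix[i]


print(prefix)           # [0, 1, 3, 6]
print(range_sum(0, 1))  # 3, the subarray [1, 2]
print(range_sum(2, 2))  # 3, the subarray [3]
```
Both subarrays that sum to `k = 3` show up as pairs of prefix sums that differ by exactly `3`.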
approach: |
We solve this using the **Prefix Sum + Hash Map** technique:
**Step 1: Initialise tracking variables**
- `count`: Set to `0` to track the total number of valid subarrays
- `current_sum`: Set to `0` to track the running prefix sum
- `prefix_counts`: A hash map initialised with `{0: 1}` — this accounts for subarrays that start from index 0
&nbsp;
**Step 2: Iterate through the array**
- Add the current element to `current_sum`
- Calculate `complement = current_sum - k`
- If `complement` exists in `prefix_counts`, add its count to our result — each occurrence represents a valid subarray ending here
- Add `current_sum` to `prefix_counts` (increment its count if it already exists)
&nbsp;
**Step 3: Return the result**
- Return `count` after processing all elements
&nbsp;
The hash map initialisation `{0: 1}` is crucial: it handles the case where a prefix sum *exactly* equals `k`, meaning the subarray starts from index 0.
common_pitfalls:
- title: The Brute Force Trap
description: |
A natural first approach is to check every possible subarray with nested loops:
- Outer loop for start index
- Inner loop for end index
- Calculate sum for each subarray
This results in **O(n^2) time complexity** (or O(n^3) if you recalculate sums each time). With `n = 2 * 10^4`, there are about 2 * 10^8 subarrays to check, which is too slow.
wrong_approach: "Nested loops checking every subarray"
correct_approach: "Prefix sum with hash map for O(n) lookup"
- title: Forgetting to Initialise the Hash Map with Zero
description: |
If you start with an empty hash map, you'll miss subarrays that begin at index 0.
For example, with `nums = [3]` and `k = 3`: the prefix sum after the first element is 3, and we need `current_sum - k = 3 - 3 = 0` to exist in our map. Without `{0: 1}` initialisation, we'd return 0 instead of 1.
wrong_approach: "Starting with an empty hash map"
correct_approach: "Initialise with {0: 1} to handle subarrays starting at index 0"
- title: Using Sliding Window Instead
description: |
Sliding window doesn't work here because the array can contain **negative numbers**. With negative numbers, expanding the window doesn't guarantee the sum increases, and shrinking doesn't guarantee it decreases.
Sliding window is only valid when all elements are positive (or all negative), where the sum is monotonic with window size.
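A small counterexample makes this concrete. The sketch below is one possible sliding-window attempt (hypothetical, written only to show the failure) compared against a brute-force count:
```python
def sliding_window_count(nums: list[int], k: int) -> int:
    # Only sound when every element is positive
    left = 0
    window_sum = 0
    count = 0
    for right, num in enumerate(nums):
        window_sum += num
        while window_sum > k and left <= right:
            window_sum -= nums[left]
            left += 1
        if window_sum == k and left <= right:
            count += 1
    return count


def brute_force_count(nums: list[int], k: int) -> int:
    count = 0
    for i in range(len(nums)):
        total = 0
        for j in range(i, len(nums)):
            total += nums[j]
            if total == k:
                count += 1
    return count


print(sliding_window_count([1, -1, 1], 1))  # 2 (misses a subarray)
print(brute_force_count([1, -1, 1], 1))     # 3
```
With `[1, -1, 1]` and `k = 1`, the window never shrinks, so the final `[1]` at index 2 is never seen on its own.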
wrong_approach: "Sliding window technique"
correct_approach: "Prefix sum with hash map (handles negative numbers)"
- title: Updating the Hash Map Before Checking
description: |
The order of operations matters. You must check for `current_sum - k` in the hash map *before* adding `current_sum` to the map.
If you add first, then when `k = 0` the lookup for `current_sum - k = current_sum` finds the entry you just inserted and counts an empty (length-0) subarray.
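The effect of the ordering can be seen in a short experiment. The `check_first` flag is a hypothetical switch added purely for this comparison:
```python
def count_subarrays(nums: list[int], k: int, check_first: bool) -> int:
    count = 0
    current_sum = 0
    prefix_counts = {0: 1}
    for num in nums:
        current_sum += num
        if not check_first:
            # Buggy order: record the prefix sum before looking up the complement
            prefix_counts[current_sum] = prefix_counts.get(current_sum, 0) + 1
        count += prefix_counts.get(current_sum - k, 0)
        if check_first:
            # Correct order: record the prefix sum after the lookup
            prefix_counts[current_sum] = prefix_counts.get(current_sum, 0) + 1
    return count


print(count_subarrays([1], 0, check_first=True))   # 0
print(count_subarrays([1], 0, check_first=False))  # 1 (counts an empty subarray)
```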
wrong_approach: "Adding current_sum to map before checking complement"
correct_approach: "Check complement first, then add current_sum to map"
key_takeaways:
- "**Prefix sum pattern**: The difference between two prefix sums gives the sum of elements in between — a powerful technique for range sum queries"
- "**Hash map for O(1) lookups**: Instead of searching for complements with nested loops, store counts in a hash map"
- "**Initialisation matters**: Starting with `{0: 1}` handles edge cases where the subarray starts at index 0"
- "**When sliding window fails**: This problem demonstrates why sliding window requires monotonic relationships — negative numbers break that assumption"
time_complexity: "O(n). We traverse the array once, with O(1) hash map operations at each step."
space_complexity: "O(n). In the worst case, all prefix sums are unique, requiring O(n) space in the hash map."
solutions:
- approach_name: Prefix Sum with Hash Map
is_optimal: true
code: |
def subarray_sum(nums: list[int], k: int) -> int:
    # Count of valid subarrays found
    count = 0
    # Running prefix sum
    current_sum = 0
    # Map of prefix_sum -> count of occurrences
    # Initialise with 0:1 to handle subarrays starting at index 0
    prefix_counts = {0: 1}
    for num in nums:
        # Update running sum
        current_sum += num
        # Check if (current_sum - k) exists in our map
        # If so, there are that many subarrays ending here with sum k
        complement = current_sum - k
        if complement in prefix_counts:
            count += prefix_counts[complement]
        # Add current prefix sum to the map
        prefix_counts[current_sum] = prefix_counts.get(current_sum, 0) + 1
    return count
explanation: |
**Time Complexity:** O(n) — Single pass through the array with O(1) hash map operations.
**Space Complexity:** O(n) — Hash map stores up to n unique prefix sums.
We use the mathematical insight that `sum(i, j) = prefix_sum(j) - prefix_sum(i-1)`. By storing prefix sum counts in a hash map, we can instantly find how many previous positions would create a valid subarray ending at the current position.
- approach_name: Brute Force
is_optimal: false
code: |
def subarray_sum(nums: list[int], k: int) -> int:
    count = 0
    n = len(nums)
    # Try every possible starting index
    for i in range(n):
        current_sum = 0
        # Try every possible ending index
        for j in range(i, n):
            current_sum += nums[j]
            # Check if this subarray sums to k
            if current_sum == k:
                count += 1
    return count
explanation: |
**Time Complexity:** O(n^2) — Nested loops check all possible subarrays.
**Space Complexity:** O(1) — Only tracking count and current_sum.
This approach checks every contiguous subarray by trying all start/end combinations. While correct, it's too slow for large inputs. Included to show why the prefix sum optimisation is necessary.

@@ -0,0 +1,219 @@
title: Subsets II
slug: subsets-ii
difficulty: medium
leetcode_id: 90
leetcode_url: https://leetcode.com/problems/subsets-ii/
categories:
- arrays
- sorting
- recursion
patterns:
- backtracking
description: |
Given an integer array `nums` that may contain duplicates, return *all possible subsets* (the power set).
The solution set **must not** contain duplicate subsets. Return the solution in **any order**.
constraints: |
- `1 <= nums.length <= 10`
- `-10 <= nums[i] <= 10`
examples:
- input: "nums = [1,2,2]"
output: "[[],[1],[1,2],[1,2,2],[2],[2,2]]"
explanation: "The array contains a duplicate 2. We generate all unique subsets, avoiding duplicates like having two separate [2] subsets."
- input: "nums = [0]"
output: "[[],[0]]"
explanation: "With a single element, we have two subsets: the empty set and the set containing just that element."
explanation:
intuition: |
This problem extends the classic Subsets problem to handle **duplicate elements**. Think of it like selecting items from a box where some items are identical — you want to count each unique selection only once, even if there are multiple copies of the same item.
Imagine you have three marbles: one red and two blue. The possible selections are: nothing, red only, one blue, two blues, red with one blue, or red with both blues. Notice that "one blue" appears only once in our answer, even though there are two blue marbles we could pick. The *identity* of which blue marble we chose doesn't matter — only how many.
The key insight is that **sorting brings duplicates together**, making them easy to handle as a group. After sorting `[1, 2, 2]` stays as `[1, 2, 2]`, and we can see the two `2`s are adjacent. When we're at the second `2`, we know we've already explored all subsets that include "one `2`" — so we should only explore subsets that include "two `2`s" from this point.
By skipping duplicate elements that would start redundant branches, we prune the decision tree and generate only unique subsets.
approach: |
We solve this using **Backtracking with Duplicate Skipping**:
**Step 1: Sort the input array**
- Sorting groups duplicates together: `[2, 1, 2]` becomes `[1, 2, 2]`
- This is essential for our skip logic to work — we need duplicates to be adjacent
&nbsp;
**Step 2: Set up backtracking state**
- `result`: List to collect all unique subsets
- `current`: The subset being built
- `backtrack(index, current)`: Recursive function where `index` is the starting position for choosing next elements and `current` is the subset built so far
&nbsp;
**Step 3: Define the backtracking function**
- First, add a copy of `current` to `result` (every path represents a valid subset)
- Then, iterate through remaining elements from `index` to `len(nums) - 1`
- For each element at position `i`:
  - **Skip duplicates**: If `i > index` and `nums[i] == nums[i-1]`, skip this element
  - Otherwise, add `nums[i]` to `current`, recurse with `i + 1`, then backtrack (remove the element)
&nbsp;
**Step 4: The duplicate skipping logic**
- The condition `i > index and nums[i] == nums[i-1]` means: "this element equals the previous one, and we're past the starting point for this level"
- When `i == index`, we must consider the element (it's our first choice at this level)
- When `i > index` and it's a duplicate, we've already explored subsets starting with this value at position `index`
- Skipping prevents generating the same subset through different paths
&nbsp;
**Step 5: Return all collected subsets**
- Start with `backtrack(0, [])` and return `result`
common_pitfalls:
- title: Forgetting to Sort
description: |
The duplicate-skipping logic relies on duplicates being adjacent. Without sorting, the skip condition `nums[i] == nums[i-1]` won't catch all duplicates.
For example, `[2, 1, 2]` unsorted has duplicates separated. The condition would miss them, producing duplicate subsets `[2]` from index 0 and index 2.
Always sort first: `nums.sort()` before starting backtracking.
wrong_approach: "Skip duplicates without sorting"
correct_approach: "Sort first, then skip adjacent duplicates"
- title: Wrong Skip Condition Index Check
description: |
A common mistake is using `i > 0` instead of `i > index`:
```python
if i > 0 and nums[i] == nums[i-1]:  # Wrong
    continue
if i > index and nums[i] == nums[i-1]:  # Correct
    continue
```
The condition must be `i > index` because we're checking if we've already made this choice *at the current recursion level*. Using `i > 0` would incorrectly skip valid subsets that legitimately include duplicate elements.
wrong_approach: "Use i > 0 in the skip condition"
correct_approach: "Use i > index to check within the current level"
- title: Using a Set for Deduplication
description: |
You might think "just use a set to store results and remove duplicates". While this works, it's inefficient:
- Converting lists to tuples for set storage has overhead
- You generate duplicate subsets only to discard them
- With many duplicates, you waste significant computation
For input `[1,1,1,1,1,1,1,1,1,1]` (10 identical elements), the naive approach generates 2^10 = 1024 subsets but only 11 are unique. The pruning approach generates exactly 11.
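These counts can be reproduced with a small instrumented sketch. `count_generated` is a hypothetical helper that only counts how many subsets the backtracking visits, with and without the skip:
```python
def count_generated(nums: list[int], prune: bool) -> int:
    nums = sorted(nums)
    generated = 0

    def backtrack(index: int) -> None:
        nonlocal generated
        generated += 1  # one subset is emitted per call
        for i in range(index, len(nums)):
            if prune and i > index and nums[i] == nums[i - 1]:
                continue
            backtrack(i + 1)

    backtrack(0)
    return generated


print(count_generated([1] * 10, prune=False))  # 1024
print(count_generated([1] * 10, prune=True))   # 11
```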
wrong_approach: "Generate all subsets, deduplicate with a set"
correct_approach: "Prune duplicate branches during backtracking"
- title: Forgetting to Copy the Subset
description: |
When adding to results, use `result.append(current[:])` or `result.append(list(current))`, not `result.append(current)`.
The `current` list is mutated during backtracking. If you append the reference directly, all entries in `result` will point to the same (eventually empty) list.
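A short, standalone illustration of the aliasing problem (the variable names are only for this demo, and `clear()` stands in for the pops that happen as backtracking unwinds):
```python
current: list[int] = []
by_reference: list[list[int]] = []
by_copy: list[list[int]] = []

for value in (1, 2):
    current.append(value)
    by_reference.append(current)  # stores the same list object each time
    by_copy.append(current[:])    # stores an independent snapshot

current.clear()  # stand-in for backtracking removing every element

print(by_reference)  # [[], []]
print(by_copy)       # [[1], [1, 2]]
```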
key_takeaways:
- "**Sorting enables duplicate detection**: Bringing duplicates together lets you identify and skip them with a simple `nums[i] == nums[i-1]` check"
- "**The index matters in the skip condition**: Use `i > index` (not `i > 0`) to only skip duplicates at the *current recursion level*, not legitimate uses of duplicate values deeper in the tree"
- "**Prune early, not late**: Avoiding duplicate work during generation is far more efficient than deduplicating results afterward"
- "**Extends the Subsets pattern**: This is the same backtracking template as Subsets, with just one additional line for duplicate handling — a powerful reminder that small tweaks can adapt patterns to new constraints"
time_complexity: "O(n * 2^n). In the worst case (all unique elements), we generate 2^n subsets, each taking O(n) to copy. With duplicates, the actual count is lower, but the upper bound remains O(n * 2^n)."
space_complexity: "O(n). The recursion depth is at most n, and the `current` list holds at most n elements. The output space for storing subsets is not counted as auxiliary space."
solutions:
- approach_name: Backtracking with Duplicate Skipping
is_optimal: true
code: |
def subsets_with_dup(nums: list[int]) -> list[list[int]]:
    result = []
    nums.sort()  # Sort to bring duplicates together

    def backtrack(index: int, current: list[int]) -> None:
        # Every path is a valid subset, add a copy
        result.append(current[:])
        for i in range(index, len(nums)):
            # Skip duplicate values at the same recursion level
            if i > index and nums[i] == nums[i - 1]:
                continue
            # Choose: add nums[i] to current subset
            current.append(nums[i])
            # Explore: recurse to consider elements after i
            backtrack(i + 1, current)
            # Unchoose: backtrack to try other options
            current.pop()

    backtrack(0, [])
    return result
explanation: |
**Time Complexity:** O(n * 2^n) — We generate up to 2^n subsets (fewer with duplicates), each requiring O(n) to copy.
**Space Complexity:** O(n) — Recursion stack depth is at most n, plus the `current` list of size n.
The key optimisation is the duplicate skip (`if i > index and nums[i] == nums[i - 1]: continue`). After sorting, duplicates are adjacent. When we encounter a value that equals the previous one *at the same recursion level* (`i > index`), we skip it because all subsets starting with this value have already been explored when we processed its predecessor.
- approach_name: Iterative with Duplicate Handling
is_optimal: true
code: |
def subsets_with_dup(nums: list[int]) -> list[list[int]]:
    nums.sort()  # Sort to group duplicates
    result = [[]]  # Start with empty subset
    start = 0  # Track where new subsets begin
    for i in range(len(nums)):
        # If current element is a duplicate, only extend
        # subsets added in the previous iteration
        if i > 0 and nums[i] == nums[i - 1]:
            new_subsets = [subset + [nums[i]] for subset in result[start:]]
        else:
            # For new elements, extend all existing subsets
            new_subsets = [subset + [nums[i]] for subset in result]
        # Updated on every iteration, duplicates included
        start = len(result)  # Mark where these new subsets start
        result.extend(new_subsets)
    return result
explanation: |
**Time Complexity:** O(n * 2^n) — Same as backtracking approach.
**Space Complexity:** O(1) auxiliary — We only use the output list (no recursion stack).
This iterative approach builds subsets incrementally. For each new element, we extend existing subsets by adding the element. The key insight for duplicates: when we see a repeated value, we only extend subsets that were *just* created in the previous round (tracked by `start`). This prevents creating duplicate subsets like `[2]` from both the first and second occurrence of `2`.
- approach_name: Brute Force with Set Deduplication
is_optimal: false
code: |
def subsets_with_dup(nums: list[int]) -> list[list[int]]:
    result_set = set()

    def backtrack(index: int, current: tuple) -> None:
        # Add current subset as a sorted tuple (for set comparison)
        result_set.add(current)
        for i in range(index, len(nums)):
            # Extend current subset with nums[i]
            backtrack(i + 1, tuple(sorted(current + (nums[i],))))

    backtrack(0, ())
    # Convert tuples back to lists
    return [list(subset) for subset in result_set]
explanation: |
**Time Complexity:** O(n * 2^n * log(n)) — Generates 2^n subsets, sorting each for comparison adds log(n) factor.
**Space Complexity:** O(n * 2^n) — Stores all unique subsets in a set.
This naive approach generates all possible subsets and relies on a set to remove duplicates. While correct, it's inefficient because it generates duplicate subsets only to discard them. For input with many duplicates like `[1,1,1,1,1]`, it generates 32 subsets but only 6 are unique. The optimised approaches avoid this wasted work.

@@ -0,0 +1,198 @@
title: Subsets
slug: subsets
difficulty: medium
leetcode_id: 78
leetcode_url: https://leetcode.com/problems/subsets/
categories:
- arrays
- recursion
patterns:
- backtracking
description: |
Given an integer array `nums` of **unique** elements, return *all possible subsets* (the power set).
The solution set **must not** contain duplicate subsets. Return the solution in **any order**.
constraints: |
- `1 <= nums.length <= 10`
- `-10 <= nums[i] <= 10`
- All the numbers of `nums` are **unique**
examples:
- input: "nums = [1,2,3]"
output: "[[],[1],[2],[1,2],[3],[1,3],[2,3],[1,2,3]]"
explanation: "The power set contains all 2^3 = 8 subsets, from the empty set to the full set."
- input: "nums = [0]"
output: "[[],[0]]"
explanation: "With a single element, we have two subsets: the empty set and the set containing just that element."
explanation:
intuition: |
Think of building subsets as a series of **binary decisions**. For each element in the array, you have exactly two choices: either **include it** in the current subset or **exclude it**.
Imagine you're packing a bag and laying out items on a table. For each item, you decide "yes, take it" or "no, leave it." If you have 3 items, you make 3 independent yes/no decisions, giving you 2^3 = 8 possible combinations — from taking nothing (empty bag) to taking everything.
This decision tree structure naturally maps to **backtracking**: start with an empty subset, and at each step, branch into two paths — one where you add the current element, one where you don't. When you've made decisions for all elements, you've formed one complete subset.
The key insight is that every subset corresponds to a unique path through this decision tree. By exploring all paths systematically, we generate the complete power set.
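The include/exclude view can also be sketched without recursion by letting each bit of a counter stand for one decision. This is an illustrative alternative, and `subsets_by_bitmask` is a hypothetical name:
```python
def subsets_by_bitmask(nums: list[int]) -> list[list[int]]:
    n = len(nums)
    result = []
    for mask in range(1 << n):  # one mask per combination of yes/no decisions
        # Bit i set means nums[i] is included in this subset
        result.append([nums[i] for i in range(n) if (mask >> i) & 1])
    return result


print(subsets_by_bitmask([1, 2, 3]))
# [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
```
Each of the `2^3 = 8` masks corresponds to exactly one path through the decision tree.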
approach: |
We solve this using **Backtracking** to explore all include/exclude decisions:
**Step 1: Initialise the result and define the recursive function**
- `result`: An empty list that will collect all subsets
- `backtrack(index, current)`: A recursive function where `index` is the current position in `nums` and `current` is the subset being built
&nbsp;
**Step 2: Define the base case**
- When `index` equals `len(nums)`, we've made decisions for all elements
- Add a copy of `current` to `result` (copy is important since we'll modify `current` later)
&nbsp;
**Step 3: Explore both choices at each step**
- **Include the element**: Add `nums[index]` to `current`, recurse with `index + 1`, then remove the element (backtrack)
- **Exclude the element**: Simply recurse with `index + 1` without adding anything
&nbsp;
**Step 4: Start the recursion**
- Call `backtrack(0, [])` to begin from the first element with an empty subset
- Return `result` containing all 2^n subsets
&nbsp;
This systematic exploration guarantees we visit every possible combination exactly once.
common_pitfalls:
- title: Forgetting to Copy the Subset
description: |
When adding `current` to the result, you must add a **copy**, not a reference:
```python
result.append(current[:]) # Correct: adds a copy
result.append(current) # Wrong: adds a reference
```
If you add the reference, all entries in `result` will point to the same list, which gets modified during backtracking. You'll end up with a result full of empty lists or unexpected values.
wrong_approach: "Appending the list directly without copying"
correct_approach: "Use `current[:]` or `list(current)` to create a copy"
- title: Not Backtracking After Recursion
description: |
After recursing with an element included, you must **remove it** before exploring the "exclude" path:
```python
current.append(nums[index])
backtrack(index + 1, current)
current.pop() # Essential: undo the choice
backtrack(index + 1, current)
```
Without `current.pop()`, the element remains in `current` for subsequent branches, corrupting all future subsets.
wrong_approach: "Forgetting to pop after the recursive call"
correct_approach: "Always undo modifications after recursing"
- title: Generating Duplicates
description: |
If you accidentally allow revisiting earlier elements, you'll generate duplicates:
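```python
# Wrong: the loop restarts from 0 on every call, so earlier elements are revisited
for i in range(len(nums)):
    backtrack(i + 1, current + [nums[i]])
# Correct: the loop starts at the current index, so only later elements are considered
for i in range(index, len(nums)):
    backtrack(i + 1, current + [nums[i]])
```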
The key is to always move **forward** in the array. Once you've decided about `nums[i]`, never reconsider it.
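A loop-based variant makes the forward-only rule explicit. This is a sketch under assumptions rather than the solution used in this file: `subsets_loop` and `start` are illustrative names, and each call records `current` on entry instead of waiting for a base case.

```python
def subsets_loop(nums: list[int]) -> list[list[int]]:
    """Loop-style backtracking sketch: each call only looks at indices >= start."""
    result = []

    def backtrack(start: int, current: list[int]) -> None:
        result.append(current[:])  # Record a copy of the subset built so far
        for i in range(start, len(nums)):
            current.append(nums[i])
            backtrack(i + 1, current)  # Move forward only: never revisit index <= i
            current.pop()

    backtrack(0, [])
    return result


loop_result = subsets_loop([1, 2, 3])
```

Since the loop never looks backwards, each subset is produced once, in the order its elements appear in `nums`.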
wrong_approach: "Allowing the recursion to revisit earlier indices"
correct_approach: "Always pass index + 1 to move forward only"
key_takeaways:
- "**Backtracking pattern**: For problems asking for 'all combinations' or 'all subsets', think of a decision tree where each node branches based on include/exclude choices"
- "**Power set property**: An array of `n` unique elements has exactly `2^n` subsets, since each element is independently included or excluded"
- "**Foundation for harder problems**: This same backtracking template extends to Subsets II (with duplicates), Combinations, Permutations, and many constraint-satisfaction problems"
- "**Bit manipulation alternative**: Each subset maps to a binary number from `0` to `2^n - 1`, where bit `i` indicates whether `nums[i]` is included"
time_complexity: "O(n * 2^n). We generate 2^n subsets, and copying each subset takes O(n) time in the worst case."
space_complexity: "O(n). The recursion depth is at most `n`, and the `current` list holds at most `n` elements. Note: the output itself requires O(n * 2^n) space, but that's not counted as auxiliary space."
solutions:
- approach_name: Backtracking
is_optimal: true
code: |
def subsets(nums: list[int]) -> list[list[int]]:
    result = []

    def backtrack(index: int, current: list[int]) -> None:
        # Base case: made decisions for all elements
        if index == len(nums):
            result.append(current[:])  # Add a copy of current subset
            return
        # Choice 1: Include nums[index]
        current.append(nums[index])
        backtrack(index + 1, current)
        current.pop()  # Backtrack: undo the choice
        # Choice 2: Exclude nums[index]
        backtrack(index + 1, current)

    backtrack(0, [])
    return result
explanation: |
**Time Complexity:** O(n * 2^n) — We generate 2^n subsets, each requiring O(n) to copy.
**Space Complexity:** O(n) — Recursion depth and current subset storage.
This approach explicitly models the include/exclude decision tree. At each index, we branch into two recursive calls: one with the element added, one without. The backtracking (pop) ensures we can reuse the same list across branches.
- approach_name: Iterative (Cascading)
is_optimal: true
code: |
def subsets(nums: list[int]) -> list[list[int]]:
    result = [[]]  # Start with empty subset
    for num in nums:
        # For each existing subset, create a new one with num added
        result += [subset + [num] for subset in result]
    return result
explanation: |
**Time Complexity:** O(n * 2^n) — Same as backtracking.
**Space Complexity:** O(1) auxiliary — We only use the output list (no recursion stack).
This iterative approach builds subsets incrementally. Starting with `[[]]`, for each new element, we take every existing subset and create a copy with the new element added. After processing all elements, we have all 2^n subsets.
- approach_name: Bit Manipulation
is_optimal: true
code: |
def subsets(nums: list[int]) -> list[list[int]]:
    n = len(nums)
    result = []
    # Each number from 0 to 2^n - 1 represents a unique subset
    for mask in range(1 << n):  # 1 << n equals 2^n
        subset = []
        for i in range(n):
            # Check if bit i is set in mask
            if mask & (1 << i):
                subset.append(nums[i])
        result.append(subset)
    return result
explanation: |
**Time Complexity:** O(n * 2^n) — We iterate through 2^n masks, checking n bits each.
**Space Complexity:** O(1) auxiliary — No recursion, just loop variables.
This approach leverages the bijection between subsets and binary numbers. For an array of size n, integers from 0 to 2^n - 1 enumerate all possible subsets. If bit `i` is set in the integer, include `nums[i]` in the subset. This is elegant and avoids recursion entirely.

@@ -0,0 +1,182 @@
title: Subtree of Another Tree
slug: subtree-of-another-tree
difficulty: easy
leetcode_id: 572
leetcode_url: https://leetcode.com/problems/subtree-of-another-tree/
categories:
- trees
- recursion
patterns:
- dfs
- tree-traversal
description: |
Given the roots of two binary trees `root` and `subRoot`, return `true` if there is a subtree of `root` with the same structure and node values of `subRoot` and `false` otherwise.
A subtree of a binary tree `tree` is a tree that consists of a node in `tree` and all of this node's descendants. The tree `tree` could also be considered as a subtree of itself.
constraints: |
- The number of nodes in the `root` tree is in the range `[1, 2000]`
- The number of nodes in the `subRoot` tree is in the range `[1, 1000]`
- `-10^4 <= root.val <= 10^4`
- `-10^4 <= subRoot.val <= 10^4`
examples:
- input: "root = [3,4,5,1,2], subRoot = [4,1,2]"
output: "true"
explanation: "The subtree rooted at node 4 in root has the same structure and values as subRoot."
- input: "root = [3,4,5,1,2,null,null,null,null,0], subRoot = [4,1,2]"
output: "false"
explanation: "The subtree rooted at node 4 in root has an additional child node 0 that is not present in subRoot."
explanation:
intuition: |
Think of this problem as searching for a "matching snapshot" within a larger tree.
Imagine you have a big family tree hanging on your wall, and a smaller family tree on a card. You want to check if there's any branch in the big tree that looks *exactly* like the card — same people, same relationships, same structure.
The key insight is that this problem breaks down into two separate questions:
1. **Where to look?** We need to visit every node in the main tree as a potential "starting point" for a match
2. **How to compare?** At each node, we need to check if the tree rooted there is *identical* to `subRoot`
This naturally leads to a recursive solution: traverse the main tree (DFS), and at each node, run a separate tree comparison. If any comparison succeeds, we've found our subtree.
approach: |
We solve this using **recursive DFS with a helper function**:
**Step 1: Define a helper function to check tree equality**
- `is_same_tree(p, q)`: Returns `true` if trees rooted at `p` and `q` are identical
- Two trees are identical if:
- Both are `null` (empty trees match)
- Both have the same root value AND their left subtrees match AND their right subtrees match
- If one is `null` and the other isn't, they don't match
&nbsp;
**Step 2: Define the main subtree check**
- `is_subtree(root, subRoot)`: Returns `true` if `subRoot` is a subtree of `root`
- Base case: If `root` is `null`, return `false` (can't find a subtree in an empty tree)
- At each node, check three possibilities:
- The tree rooted at `root` is identical to `subRoot` (use `is_same_tree`)
- `subRoot` is a subtree of `root.left`
- `subRoot` is a subtree of `root.right`
- Return `true` if any of these conditions holds
&nbsp;
**Step 3: Return the result**
- The recursion naturally explores all nodes in the main tree
- As soon as we find a match, the `true` propagates up
common_pitfalls:
- title: Confusing Subtree with Same Tree
description: |
A subtree must include *all* descendants of a node, not just some of them.
For example, in `root = [3,4,5,1,2,null,null,null,null,0]` with `subRoot = [4,1,2]`, the node 4 in `root` has children 1 and 2, but node 2 also has a child 0. This extra node means the subtree rooted at 4 is `[4,1,2,null,null,0]`, which is NOT the same as `[4,1,2]`.
Your `is_same_tree` function must recursively check the entire structure, not just the first few levels.
wrong_approach: "Only comparing root values and immediate children"
correct_approach: "Recursively compare entire tree structures including all descendants"
- title: Forgetting to Check All Nodes
description: |
The match can be rooted at any node in the main tree, including one that sits below a node whose own comparison failed (even a node with the same value as `subRoot.val`).
Make sure your traversal visits every node as a potential starting point. A common mistake is to recurse into the children only when the values match, or to stop after the first failed comparison. You need to keep recursing regardless, because the match might be deeper in the tree.
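A small counterexample shows why the search must continue past a failed comparison. The sketch below is illustrative: `is_subtree_buggy` is a hypothetical function that commits to the first node whose value matches, and the compact `is_same_tree` here is a condensed stand-in for the usual helper.

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def is_same_tree(p, q):
    # If either side is empty, the trees match only when both are empty
    if p is None or q is None:
        return p is q
    return (p.val == q.val
            and is_same_tree(p.left, q.left)
            and is_same_tree(p.right, q.right))


def is_subtree_buggy(root, sub_root):
    # Hypothetical mistake: stop searching at the first node with a matching value
    if root is None:
        return False
    if root.val == sub_root.val:
        return is_same_tree(root, sub_root)
    return is_subtree_buggy(root.left, sub_root) or is_subtree_buggy(root.right, sub_root)


def is_subtree(root, sub_root):
    # Correct: try the comparison at this node, then keep searching both children
    if root is None:
        return False
    return (is_same_tree(root, sub_root)
            or is_subtree(root.left, sub_root)
            or is_subtree(root.right, sub_root))


root = TreeNode(1, TreeNode(1))  # root = [1,1]
sub_root = TreeNode(1)           # subRoot = [1]
buggy_answer = is_subtree_buggy(root, sub_root)
correct_answer = is_subtree(root, sub_root)
```

Here the root has the right value but an extra child, so the comparison at the root fails. The real match is its left child, which the buggy version never reaches.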
wrong_approach: "Only descending into children when root.val == subRoot.val"
correct_approach: "Check is_same_tree at every node, regardless of value"
- title: Incorrect Base Cases
description: |
Handle null cases carefully:
- `is_same_tree(null, null)` → `true` (two empty trees are identical)
- `is_same_tree(null, node)` or `is_same_tree(node, null)` → `false`
- `is_subtree(null, subRoot)` → `false` (can't find anything in an empty tree)
Getting these wrong leads to null pointer errors or incorrect results.
key_takeaways:
- "**Decomposition pattern**: Breaking a complex tree problem into two simpler recursive functions (traverse + compare) is a powerful technique"
- "**Subtree vs same tree**: Understanding the difference is crucial — a subtree must include all descendants, making it a stricter condition than partial matching"
- "**DFS for tree search**: When you need to find something anywhere in a tree, DFS traversal that checks each node is the standard approach"
- "**Foundation for advanced problems**: This pattern extends to problems like finding duplicate subtrees, tree serialisation, and tree hashing"
time_complexity: "O(m × n). For each of the `m` nodes in `root`, we potentially compare against all `n` nodes in `subRoot`."
space_complexity: "O(m + n). The recursion stack can go as deep as the height of each tree. In the worst case (skewed trees), this is O(m) for the outer traversal plus O(n) for the comparison."
solutions:
- approach_name: Recursive DFS
is_optimal: true
code: |
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def is_subtree(root: TreeNode | None, sub_root: TreeNode | None) -> bool:
    # If main tree is empty, can't find any subtree
    if root is None:
        return False
    # Check if tree rooted at current node matches subRoot
    if is_same_tree(root, sub_root):
        return True
    # Otherwise, search in left and right subtrees
    return is_subtree(root.left, sub_root) or is_subtree(root.right, sub_root)

def is_same_tree(p: TreeNode | None, q: TreeNode | None) -> bool:
    # Both empty? They're the same
    if p is None and q is None:
        return True
    # One empty, one not? Not the same
    if p is None or q is None:
        return False
    # Values must match, and both subtrees must be identical
    return (
        p.val == q.val
        and is_same_tree(p.left, q.left)
        and is_same_tree(p.right, q.right)
    )
explanation: |
**Time Complexity:** O(m × n) — For each node in root (m nodes), we may call is_same_tree which visits up to n nodes.
**Space Complexity:** O(m + n) — Recursion stack depth for nested calls.
We traverse every node in the main tree, and at each node, we check if the subtree rooted there is identical to subRoot. The is_same_tree helper does a parallel traversal of both trees, returning false as soon as a mismatch is found.
- approach_name: Tree Serialisation
is_optimal: false
code: |
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def is_subtree(root: TreeNode | None, sub_root: TreeNode | None) -> bool:
    # Serialise both trees to strings
    def serialise(node: TreeNode | None) -> str:
        if node is None:
            return "#"
        # Use delimiters to avoid false matches like "12" containing "2"
        return f"^{node.val},{serialise(node.left)},{serialise(node.right)}"

    root_str = serialise(root)
    sub_str = serialise(sub_root)
    # Check if subRoot's serialisation appears in root's serialisation
    return sub_str in root_str
explanation: |
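**Time Complexity:** O(m + n) - Serialisation visits each node once, and a linear-time substring search such as KMP is O(m + n). This assumes the strings are assembled in linear time: nested f-strings can re-copy characters on deep trees, and the worst case of Python's built-in `in` may depend on the interpreter version.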
**Space Complexity:** O(m + n) — Storing the serialised strings.
This approach converts both trees to unique string representations, then checks if the smaller string is a substring of the larger. The `^` prefix prevents false positives like value "2" matching inside "12". While asymptotically faster, the constant factors and space usage often make the recursive approach preferable for typical inputs.

@@ -0,0 +1,165 @@
title: Sum of Two Integers
slug: sum-of-two-integers
difficulty: medium
leetcode_id: 371
leetcode_url: https://leetcode.com/problems/sum-of-two-integers/
categories:
- math
patterns:
- greedy
description: |
Given two integers `a` and `b`, return *the sum of the two integers without using the operators* `+` *and* `-`.
constraints: |
- `-1000 <= a, b <= 1000`
examples:
- input: "a = 1, b = 2"
output: "3"
explanation: "1 + 2 = 3"
- input: "a = 2, b = 3"
output: "5"
explanation: "2 + 3 = 5"
explanation:
intuition: |
Think back to how you learned to add numbers by hand in primary school: you add each column, and if the result is 10 or more, you "carry" to the next column. Binary addition works the same way, but simpler since each digit is either 0 or 1.
The key insight is that we can decompose addition into two separate operations:
1. **Sum without carrying**: When you add two bits, ignoring any carry, the result follows the XOR pattern: `0+0=0`, `0+1=1`, `1+0=1`, `1+1=0` (the "1+1=0" is because we're ignoring the carry).
2. **The carry itself**: A carry occurs only when both bits are 1. This is exactly the AND operation, but the carry needs to be added to the *next* position, so we shift it left by one.
By repeatedly computing the "sum without carry" (XOR) and the "carry" (AND shifted left), we eventually get to a point where there's no more carry, and we have our answer.
Think of it like this: each iteration handles one "wave" of carries. For most numbers, this converges very quickly since carries can only propagate as far as the number of bits.
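To make the "waves of carries" concrete, here is a small trace of the XOR/AND loop. It is a sketch for non-negative inputs only (no 32-bit masking), and `trace_add` is an illustrative helper name rather than part of the solution.

```python
def trace_add(a: int, b: int) -> list[tuple[int, int]]:
    """Record (partial_sum, carry) after each round; non-negative inputs only."""
    states = [(a, b)]
    while b != 0:
        # XOR adds without carrying; AND shifted left is the carry for the next round
        a, b = a ^ b, (a & b) << 1
        states.append((a, b))
    return states


states = trace_add(5, 3)
# Rounds for 5 + 3: (5, 3) -> (6, 2) -> (4, 4) -> (0, 8) -> (8, 0)
```

Each pair is one round: the first value is the sum so far, the second is the carry still waiting to be added. The loop stops when the carry reaches 0, leaving 8.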
approach: |
We solve this using **Bit Manipulation** to simulate binary addition:
**Step 1: Understand the two key operations**
- `a XOR b`: Gives the sum of bits where at most one of them is 1 (no carry needed)
- `(a AND b) << 1`: Identifies positions where both bits are 1 (carry occurs) and shifts the carry to the next position
&nbsp;
**Step 2: Iterate until no carry remains**
- Compute `sum_without_carry = a XOR b`
- Compute `carry = (a AND b) << 1`
- Set `a = sum_without_carry` and `b = carry`
- Repeat while `b != 0` (while there's still a carry to process)
&nbsp;
**Step 3: Handle negative numbers (Python-specific)**
- Python integers have arbitrary precision, so negative numbers don't naturally overflow
- We use a 32-bit mask (`0xFFFFFFFF`) to simulate 32-bit integer arithmetic
- If the result has its sign bit set (bit 31), we convert it back to a negative Python integer
&nbsp;
**Step 4: Return the result**
- When `b` becomes 0, all carries have been processed
- Return `a` which now contains the final sum
common_pitfalls:
- title: Infinite Loop with Negative Numbers
description: |
In Python, integers have unlimited precision. When working with negative numbers, the carry can keep propagating indefinitely because Python doesn't have fixed-width integers that naturally overflow.
For example, adding `-1` and `1`: in two's complement 32-bit representation, `-1` is all 1s (`0xFFFFFFFF`). Without masking, the carry keeps growing beyond 32 bits, and the loop never terminates.
The fix is to mask all operations to 32 bits using `& 0xFFFFFFFF`.
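The runaway carry can be observed safely by capping the number of rounds. This is a demonstration sketch, not a solution: `unmasked_steps` and `limit` are hypothetical names, and the step counter uses `+=` purely for the demo.

```python
def unmasked_steps(a: int, b: int, limit: int = 100) -> tuple[int, int, int]:
    """Run the XOR/AND loop WITHOUT masking, stopping after `limit` rounds."""
    steps = 0
    while b != 0 and steps < limit:
        a, b = a ^ b, (a & b) << 1
        steps += 1  # Demo-only counter; a real solution must avoid + and -
    return a, b, steps


a_final, carry_final, rounds = unmasked_steps(-1, 1, limit=40)
# After 40 rounds the carry is still non-zero: it has grown to 2**40
```

With `-1` and `1`, the carry doubles every round and never reaches 0, so only the cap ends the loop.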
wrong_approach: "Direct XOR and AND without bit masking"
correct_approach: "Mask to 32 bits and handle sign conversion at the end"
- title: Forgetting to Convert Back to Signed
description: |
When you mask to 32 bits in Python, you get an unsigned result. If the actual sum is negative, you'll have a large positive number instead.
For example, `-1 + 0` should return `-1`, but with 32-bit masking you get `4294967295` (`0xFFFFFFFF`). You need to check if the sign bit (bit 31) is set and convert back to a negative number.
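The conversion can be sketched in isolation using only bitwise operators. `to_signed32` is an illustrative helper name; the constants match the mask values discussed above.

```python
MASK = 0xFFFFFFFF     # Lowest 32 bits set
MAX_INT = 0x7FFFFFFF  # Largest positive 32-bit signed value


def to_signed32(x: int) -> int:
    """Interpret an unsigned 32-bit value as a signed two's complement integer."""
    # Values above MAX_INT have bit 31 set, so they represent negatives.
    # x ^ MASK flips the low 32 bits; ~ then yields the matching negative Python int.
    return x if x <= MAX_INT else ~(x ^ MASK)


masked_minus_one = -1 & MASK             # 4294967295
restored = to_signed32(masked_minus_one)
```

Small positive values pass through unchanged, while `0xFFFFFFFF` maps back to `-1` and `0x80000000` maps to the most negative 32-bit value.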
wrong_approach: "Return masked result directly"
correct_approach: "Check sign bit and convert without arithmetic operators: if result > 0x7FFFFFFF, return ~(result ^ 0xFFFFFFFF)"
- title: Using + or - Operators
description: |
The entire point of this problem is to implement addition without using `+` or `-`. Some solutions accidentally use these operators in loop increments or other places.
Be careful that all arithmetic is done purely with bit operations: XOR, AND, OR, NOT, and shifts.
wrong_approach: "Using i += 1 in a loop or any arithmetic operators"
correct_approach: "Use only bitwise operations throughout"
key_takeaways:
- "**XOR for addition without carry**: `a ^ b` gives you the sum of each bit position where there's no carry"
- "**AND + shift for carry**: `(a & b) << 1` computes and positions the carry for the next iteration"
- "**Two's complement awareness**: When implementing bit manipulation in Python, remember to handle the difference between Python's arbitrary-precision integers and fixed-width machine integers"
- "**Foundation for hardware**: This mirrors how a ripple-carry adder works in hardware: a chain of half-adders and full-adders propagating carries from bit to bit"
time_complexity: "O(1). The number of iterations is bounded by the number of bits (at most 32 for 32-bit integers), which is constant."
space_complexity: "O(1). We only use a few variables regardless of input size."
solutions:
- approach_name: Bit Manipulation (Iterative)
is_optimal: true
code: |
def get_sum(a: int, b: int) -> int:
    # 32-bit mask to handle negative numbers in Python
    MASK = 0xFFFFFFFF
    # Maximum positive value in 32-bit signed integer
    MAX_INT = 0x7FFFFFFF
    while b != 0:
        # XOR gives sum without considering carry
        sum_without_carry = (a ^ b) & MASK
        # AND finds where both bits are 1 (carry positions)
        # Left shift moves carry to next bit position
        carry = ((a & b) << 1) & MASK
        # Prepare for next iteration
        a = sum_without_carry
        b = carry
    # If a is negative in 32-bit two's complement, convert to Python negative
    # Numbers > MAX_INT have their sign bit set (are negative)
    if a > MAX_INT:
        # Convert from unsigned 32-bit to signed Python int
        a = ~(a ^ MASK)
    return a
explanation: |
**Time Complexity:** O(1) - At most 32 iterations for 32-bit integers.
**Space Complexity:** O(1) - Only a constant number of variables used.
We simulate binary addition by separating the sum into "sum without carry" (XOR) and "carry" (AND shifted left). We repeat until there's no carry. The masking handles Python's arbitrary-precision integers, and we convert back to a signed integer at the end if needed.
- approach_name: Recursive Bit Manipulation
is_optimal: false
code: |
def get_sum(a: int, b: int) -> int:
    MASK = 0xFFFFFFFF
    MAX_INT = 0x7FFFFFFF
    # Mask inputs to 32 bits
    a = a & MASK
    b = b & MASK
    # Base case: no carry to add
    if b == 0:
        # Convert back to signed if needed
        return a if a <= MAX_INT else ~(a ^ MASK)
    # Recursive case: add the carry
    return get_sum(a ^ b, (a & b) << 1)
explanation: |
**Time Complexity:** O(1) - At most 32 recursive calls.
**Space Complexity:** O(1) - Call stack is bounded by 32 levels, which is constant.
This is the same algorithm expressed recursively. The base case is when there's no carry (`b == 0`), and each recursive call processes one round of carry propagation. While elegant, the iterative version is preferred in practice to avoid stack overhead.

@@ -0,0 +1,289 @@
title: Surrounded Regions
slug: surrounded-regions
difficulty: medium
leetcode_id: 130
leetcode_url: https://leetcode.com/problems/surrounded-regions/
categories:
- graphs
- arrays
patterns:
- dfs
- bfs
- matrix-traversal
description: |
You are given an `m x n` matrix `board` containing letters `'X'` and `'O'`. **Capture** all regions that are **surrounded**:
- **Connect**: A cell is connected to adjacent cells horizontally or vertically.
- **Region**: To form a region, connect every `'O'` cell.
- **Surround**: A region is surrounded with `'X'` cells if you can connect the region with `'X'` cells and **none** of the region cells are on the edge of the board.
To capture a surrounded region, replace all `'O'`s with `'X'`s **in-place** within the original board. You do not need to return anything.
constraints: |
- `m == board.length`
- `n == board[i].length`
- `1 <= m, n <= 200`
- `board[i][j]` is `'X'` or `'O'`
examples:
- input: 'board = [["X","X","X","X"],["X","O","O","X"],["X","X","O","X"],["X","O","X","X"]]'
output: '[["X","X","X","X"],["X","X","X","X"],["X","X","X","X"],["X","O","X","X"]]'
explanation: "The O's in the centre form a surrounded region and are captured. The O on the bottom edge is not surrounded (it touches the boundary), so it remains unchanged."
- input: 'board = [["X"]]'
output: '[["X"]]'
explanation: "A single X cell has no O's to capture."
explanation:
intuition: |
Imagine the board as an island map where `'O'` cells are land and `'X'` cells are water. A region of `'O'`s is "surrounded" if it has no connection to the boundary of the map — it's completely enclosed by water.
The **key insight** is to think about this problem *backwards*: instead of finding regions that ARE surrounded, find regions that are NOT surrounded (those connected to the boundary), and protect them. Everything else gets captured.
Think of it like this: any `'O'` on the edge of the board is automatically safe — it can never be fully surrounded. Furthermore, any `'O'` connected to an edge `'O'` is also safe, because the whole connected component touches the boundary.
So the strategy becomes:
1. Start from all `'O'`s on the boundary
2. Mark all `'O'`s connected to them as "safe"
3. Everything NOT marked safe is surrounded and should be captured
This "reverse thinking" transforms a complex region-finding problem into a simpler boundary-flood problem.
approach: |
We solve this using a **Boundary DFS/BFS Approach**:
**Step 1: Identify boundary O's**
- Iterate through all cells on the four edges of the board (first row, last row, first column, last column)
- When you find an `'O'` on the boundary, it cannot be captured
&nbsp;
**Step 2: Mark safe regions**
- From each boundary `'O'`, perform DFS (or BFS) to visit all connected `'O'` cells
- Mark these cells with a temporary marker (e.g., `'T'` for "temporary" or "safe")
- This marks the entire connected component as uncapturable
&nbsp;
**Step 3: Capture and restore**
- Iterate through the entire board
- Any `'O'` remaining is surrounded — convert it to `'X'`
- Any `'T'` (temporary marker) is safe — restore it to `'O'`
&nbsp;
This three-phase approach ensures we correctly identify which regions touch the boundary and which are truly surrounded.
common_pitfalls:
- title: Trying to Find Surrounded Regions Directly
description: |
A natural first instinct is to iterate through the board, find each `'O'` region, and check if it's surrounded. This is complex because you need to:
- Track all cells in a region
- Check if ANY cell touches the boundary
- Only then decide to capture or not
The boundary-first approach is simpler: mark safe cells first, then capture everything else in one pass.
wrong_approach: "Find each O region and check if surrounded"
correct_approach: "Mark boundary-connected O's first, then capture the rest"
- title: Forgetting to Check All Four Boundaries
description: |
The board has four edges: top row, bottom row, left column, and right column. Missing any edge means some safe `'O'`s won't be marked, leading to incorrect captures.
Make sure to iterate through:
- Row 0 and row `m-1` (top and bottom)
- Column 0 and column `n-1` (left and right)
wrong_approach: "Only checking top and left edges"
correct_approach: "Check all four edges of the board"
- title: Stack Overflow with Deep Recursion
description: |
With a 200x200 board, a region could contain up to 40,000 cells. Naive recursive DFS might cause stack overflow on such large connected components.
Solutions:
- Use iterative DFS with an explicit stack
- Use BFS with a queue
- Increase recursion limit (not recommended)
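The first option can be sketched as follows. This is one possible shape under assumptions, not a definitive implementation: `solve_iterative` is an illustrative name, and the `'T'` marker follows the convention described in the approach.

```python
def solve_iterative(board: list[list[str]]) -> None:
    """Boundary flood fill with an explicit stack, so no recursion limit applies."""
    if not board or not board[0]:
        return
    m, n = len(board), len(board[0])
    # Seed the stack with every cell on the four edges
    stack = [(r, c) for r in range(m) for c in (0, n - 1)]
    stack += [(r, c) for r in (0, m - 1) for c in range(n)]
    while stack:
        r, c = stack.pop()
        # Skip out-of-bounds cells and anything that is not an unvisited 'O'
        if 0 <= r < m and 0 <= c < n and board[r][c] == 'O':
            board[r][c] = 'T'  # Mark as safe
            stack.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
    # Capture surrounded O's and restore the safe cells
    for row in board:
        for j, cell in enumerate(row):
            if cell == 'O':
                row[j] = 'X'
            elif cell == 'T':
                row[j] = 'O'


board = [["X", "X", "X", "X"],
         ["X", "O", "O", "X"],
         ["X", "X", "O", "X"],
         ["X", "O", "X", "X"]]
solve_iterative(board)
```

The explicit stack lives on the heap, so a single 200x200 region of `'O'` cells is handled without touching Python's recursion limit.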
wrong_approach: "Deep recursive DFS on large boards"
correct_approach: "Iterative DFS or BFS for large inputs"
- title: Modifying While Searching
description: |
If you try to capture `'O'`s to `'X'`s while still searching, you might accidentally disconnect parts of a safe region before fully exploring it.
The temporary marker (`'T'`) prevents this — it distinguishes "visited safe" cells from both `'O'` (unvisited) and `'X'` (wall/captured).
key_takeaways:
- "**Reverse thinking**: Sometimes it's easier to find what you DON'T want and protect it, rather than directly finding what you want"
- "**Boundary-connected components**: Problems involving 'surrounded' often reduce to finding what's connected to the boundary"
- "**Temporary markers**: Using a third state (`'T'`) allows clean separation of visited-safe, unvisited, and captured cells"
- "**Pattern recognition**: This is similar to Number of Islands, but with the twist of boundary connectivity — recognise the DFS/BFS on grids pattern"
time_complexity: "O(m * n). We visit each cell at most twice: once during the boundary DFS/BFS marking phase, and once during the final capture/restore pass."
space_complexity: "O(m * n) in the worst case for the recursion stack or BFS queue, if almost all cells are `'O'` and connected. The modification is done in-place, so no additional board copy is needed."
solutions:
- approach_name: Boundary DFS
is_optimal: true
code: |
def solve(board: list[list[str]]) -> None:
    if not board or not board[0]:
        return
    m, n = len(board), len(board[0])

    def dfs(r: int, c: int) -> None:
        # Out of bounds or not an O: stop
        if r < 0 or r >= m or c < 0 or c >= n or board[r][c] != 'O':
            return
        # Mark as safe (temporary marker)
        board[r][c] = 'T'
        # Explore all four directions
        dfs(r + 1, c)  # down
        dfs(r - 1, c)  # up
        dfs(r, c + 1)  # right
        dfs(r, c - 1)  # left

    # Step 1 & 2: Mark all O's connected to boundary
    for i in range(m):
        dfs(i, 0)  # left edge
        dfs(i, n - 1)  # right edge
    for j in range(n):
        dfs(0, j)  # top edge
        dfs(m - 1, j)  # bottom edge
    # Step 3: Capture surrounded O's, restore safe T's
    for i in range(m):
        for j in range(n):
            if board[i][j] == 'O':
                board[i][j] = 'X'  # Capture surrounded
            elif board[i][j] == 'T':
                board[i][j] = 'O'  # Restore safe
explanation: |
**Time Complexity:** O(m * n) — Each cell is visited at most twice.
**Space Complexity:** O(m * n) — Recursion stack in worst case.
We start DFS from every boundary `'O'`, marking connected cells as `'T'` (safe). Then we sweep through the board: remaining `'O'`s are captured, `'T'`s are restored. This cleanly separates boundary-connected regions from surrounded ones.
- approach_name: Boundary BFS
is_optimal: true
code: |
from collections import deque

def solve(board: list[list[str]]) -> None:
    if not board or not board[0]:
        return
    m, n = len(board), len(board[0])
    queue = deque()
    # Step 1: Collect all boundary O's
    for i in range(m):
        if board[i][0] == 'O':
            queue.append((i, 0))
        if board[i][n - 1] == 'O':
            queue.append((i, n - 1))
    for j in range(n):
        if board[0][j] == 'O':
            queue.append((0, j))
        if board[m - 1][j] == 'O':
            queue.append((m - 1, j))
    # Step 2: BFS to mark all safe O's
    while queue:
        r, c = queue.popleft()
        if r < 0 or r >= m or c < 0 or c >= n:
            continue
        if board[r][c] != 'O':
            continue
        board[r][c] = 'T'  # Mark as safe
        # Add neighbours to explore
        queue.append((r + 1, c))
        queue.append((r - 1, c))
        queue.append((r, c + 1))
        queue.append((r, c - 1))
    # Step 3: Capture and restore
    for i in range(m):
        for j in range(n):
            if board[i][j] == 'O':
                board[i][j] = 'X'  # Capture
            elif board[i][j] == 'T':
                board[i][j] = 'O'  # Restore
explanation: |
**Time Complexity:** O(m * n) - Each cell is marked at most once and enqueued only a bounded number of times (boundary seeding plus once per marked neighbour).
**Space Complexity:** O(m * n) — Queue size in worst case.
BFS avoids recursion depth issues. We seed the queue with all boundary `'O'`s, then expand outward marking safe cells. The final pass captures and restores just like the DFS approach. BFS is often preferred for very large grids.
- approach_name: Union-Find
is_optimal: false
code: |
def solve(board: list[list[str]]) -> None:
    if not board or not board[0]:
        return
    m, n = len(board), len(board[0])
    # Union-Find with path compression
    parent = list(range(m * n + 1))
    rank = [0] * (m * n + 1)
    dummy = m * n  # Virtual node for boundary-connected cells

    def find(x: int) -> int:
        if parent[x] != x:
            parent[x] = find(parent[x])  # Path compression
        return parent[x]

    def union(x: int, y: int) -> None:
        px, py = find(x), find(y)
        if px == py:
            return
        # Union by rank
        if rank[px] < rank[py]:
            px, py = py, px
        parent[py] = px
        if rank[px] == rank[py]:
            rank[px] += 1

    def index(r: int, c: int) -> int:
        return r * n + c

    # Build unions
    for i in range(m):
        for j in range(n):
            if board[i][j] != 'O':
                continue
            idx = index(i, j)
            # Connect boundary O's to dummy node
            if i == 0 or i == m - 1 or j == 0 or j == n - 1:
                union(idx, dummy)
            # Connect to adjacent O's
            if i > 0 and board[i - 1][j] == 'O':
                union(idx, index(i - 1, j))
            if j > 0 and board[i][j - 1] == 'O':
                union(idx, index(i, j - 1))
    # Capture cells not connected to dummy
    for i in range(m):
        for j in range(n):
            if board[i][j] == 'O' and find(index(i, j)) != find(dummy):
                board[i][j] = 'X'
explanation: |
**Time Complexity:** O(m * n * α(m * n)) - Nearly linear thanks to path compression combined with union by rank, where α is the inverse Ackermann function.
**Space Complexity:** O(m * n) — Parent and rank arrays.
Union-Find groups all `'O'`s into connected components. Boundary `'O'`s are connected to a virtual "dummy" node. After processing, any `'O'` not in the dummy's component is surrounded and captured. This approach is more complex but demonstrates the Union-Find pattern for connectivity problems.

@@ -0,0 +1,273 @@
title: Swim in Rising Water
slug: swim-in-rising-water
difficulty: hard
leetcode_id: 778
leetcode_url: https://leetcode.com/problems/swim-in-rising-water/
categories:
- graphs
- binary-search
- heap
patterns:
- binary-search
- bfs
- heap
description: |
You are given an `n x n` integer matrix `grid` where each value `grid[i][j]` represents the elevation at that point `(i, j)`.
It starts raining, and water gradually rises over time. At time `t`, the water level is `t`, meaning **any** cell with elevation less than or equal to `t` is submerged or reachable.
You can swim from a square to another 4-directionally adjacent square if and only if the elevation of both squares individually are at most `t`. You can swim infinite distances in zero time. Of course, you must stay within the boundaries of the grid during your swim.
Return *the minimum time until you can reach the bottom right square* `(n - 1, n - 1)` *if you start at the top left square* `(0, 0)`.
constraints: |
- `n == grid.length`
- `n == grid[i].length`
- `1 <= n <= 50`
- `0 <= grid[i][j] < n^2`
- Each value `grid[i][j]` is **unique**
examples:
- input: "grid = [[0,2],[1,3]]"
output: "3"
explanation: "At time 0, you are at (0, 0). You cannot move anywhere because all adjacent cells have elevation > 0. At time 3, the water level allows you to swim through all cells to reach (1, 1)."
- input: "grid = [[0,1,2,3,4],[24,23,22,21,5],[12,13,14,15,16],[11,17,18,19,20],[10,9,8,7,6]]"
output: "16"
explanation: "The optimal path follows the outer edge of the grid. We need to wait until time 16 so that (0, 0) and (4, 4) are connected through cells with elevations ≤ 16."
explanation:
intuition: |
Imagine you're standing on a terrain map where each cell has a different elevation. Rain is falling steadily, and the water level rises by 1 unit each second. At time `t`, any cell with elevation ≤ `t` becomes a "lake" you can swim through.
The key insight is that this is a **path optimisation problem** where we want to minimise the **maximum elevation** along our path from start to end. We don't care about the sum of elevations or the number of steps — we only care about the single highest point we must traverse.
Think of it like this: if you find a path where the highest elevation is `16`, then at time `t = 16`, every cell on that path is underwater and you can swim the entire route. The answer is the **minimum possible maximum elevation** across all valid paths.
This problem can be solved in multiple ways:
- **Min-Heap (Dijkstra-like)**: Greedily expand to the lowest-elevation neighbour, tracking the maximum elevation seen
- **Binary Search + BFS/DFS**: Binary search on the answer `t`, and check if a path exists using only cells with elevation ≤ `t`
- **Union-Find**: Sort cells by elevation and union them until start and end are connected
approach: |
We'll use a **Min-Heap (Priority Queue)** approach, similar to Dijkstra's algorithm, except that a path's cost is the maximum elevation on it rather than a sum of weights:
**Step 1: Initialise the data structures**
- `heap`: A min-heap storing `(elevation, row, col)` tuples, starting with `(grid[0][0], 0, 0)`
- `visited`: A set to track cells we've already processed
- `result`: Track the maximum elevation encountered so far (initialised to `grid[0][0]`)
&nbsp;
**Step 2: Process cells in order of elevation**
- Pop the cell with the **smallest elevation** from the heap
- Update `result` to be the maximum of current result and this cell's elevation
- If we've reached `(n-1, n-1)`, return `result` — this is our answer
&nbsp;
**Step 3: Explore neighbours**
- For each of the 4 adjacent cells (up, down, left, right):
- If the neighbour is within bounds and not visited, add it to the heap
- Mark it as visited to avoid reprocessing
&nbsp;
**Step 4: Return the result**
- When we pop the destination cell from the heap, `result` contains the minimum time needed
&nbsp;
The min-heap ensures we always expand the lowest-elevation unvisited cell first. This greedy strategy guarantees that when we reach the destination, we've found the path with the minimum maximum elevation.
common_pitfalls:
- title: Confusing This With Shortest Path
description: |
This is NOT a standard shortest path problem where you sum edge weights. Here, the cost of a path is the **maximum** elevation along it, not the sum.
Using standard BFS (which finds the shortest path by number of edges) will give wrong answers. For example, a path of 10 cells whose elevations are all at most 5 is better than a path of 2 cells where one has elevation 20.
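As a minimal sketch of the difference, the snippet below scores two hand-picked routes through the second example grid by their highest elevation. The helper `path_time` and both route lists are illustrative names chosen for this example, not part of any reference solution.

```python
grid = [
    [0, 1, 2, 3, 4],
    [24, 23, 22, 21, 5],
    [12, 13, 14, 15, 16],
    [11, 17, 18, 19, 20],
    [10, 9, 8, 7, 6],
]


def path_time(path):
    """Earliest time the whole path is swimmable: its highest elevation."""
    return max(grid[r][c] for r, c in path)


# Fewest cells: along the top row, then straight down the right column.
short_path = [(0, c) for c in range(5)] + [(r, 4) for r in range(1, 5)]

# More cells: top row, down to row 2, back along row 2,
# down the left column, then along the bottom row.
long_path = (
    [(0, c) for c in range(5)]
    + [(1, 4)]
    + [(2, c) for c in range(4, -1, -1)]
    + [(3, 0)]
    + [(4, c) for c in range(5)]
)

print(len(short_path), path_time(short_path))  # 9 cells, ready at time 20
print(len(long_path), path_time(long_path))    # 17 cells, ready at time 16
```

The longer route wins because only its highest cell matters, which is why counting steps gives the wrong answer here.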
wrong_approach: "Standard BFS counting steps"
correct_approach: "Min-heap tracking maximum elevation"
- title: Not Handling the Starting Cell
description: |
The starting cell `(0, 0)` has an elevation too! If `grid[0][0] = 5`, you cannot start swimming until time `t = 5`.
Always initialise your result/answer with `grid[0][0]`, not `0`.
wrong_approach: "Initialising answer to 0"
correct_approach: "Initialising answer to grid[0][0]"
- title: Revisiting Cells
description: |
Without proper visited tracking, the same cell can be added to the heap multiple times, which wastes time and memory and can lead to TLE.
Mark cells as visited **when adding to the heap**, not when popping. This prevents duplicate entries and keeps the heap at no more than `n^2` items.
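The effect of the two marking strategies can be measured with a small sketch. The function `swim` and its `mark_on_push` flag are hypothetical, written only for this comparison; it returns the answer together with the number of heap pushes.

```python
import heapq


def swim(grid, mark_on_push):
    """Return (answer, number_of_heap_pushes) for the min-heap approach."""
    n = len(grid)
    heap = [(grid[0][0], 0, 0)]
    seen = {(0, 0)} if mark_on_push else set()
    pushes, result = 1, 0
    while heap:
        elevation, r, c = heapq.heappop(heap)
        if not mark_on_push:
            if (r, c) in seen:
                continue  # stale duplicate entry, skip it
            seen.add((r, c))
        result = max(result, elevation)
        if (r, c) == (n - 1, n - 1):
            return result, pushes
        for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < n and 0 <= nc < n and (nr, nc) not in seen:
                if mark_on_push:
                    seen.add((nr, nc))
                heapq.heappush(heap, (grid[nr][nc], nr, nc))
                pushes += 1
    return result, pushes


small = [[0, 2], [1, 3]]
print(swim(small, mark_on_push=True))   # (3, 4)
print(swim(small, mark_on_push=False))  # (3, 5)
```

Both variants agree on the answer; marking on pop simply pushes some cells more than once and has to discard the stale entries later.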
wrong_approach: "Marking visited only when popping from heap"
correct_approach: "Marking visited immediately when adding to heap"
- title: Binary Search Without Proper Bounds
description: |
If using binary search, the search space is `[max(grid[0][0], grid[n-1][n-1]), n^2 - 1]`. The lower bound must include both the start and end cell elevations since we must traverse both.
Using `[0, n^2 - 1]` works but is less efficient.
key_takeaways:
- "**Minimax path problem**: When optimising the maximum (or minimum) value along a path, consider Dijkstra-like approaches with a heap"
- "**Multiple valid approaches**: This problem can be solved with heap, binary search + BFS, or Union-Find — each offers different insights"
- "**Greedy expansion**: The min-heap ensures we always process the most promising cell first, similar to Dijkstra's algorithm"
- "**Related problems**: Path With Minimum Effort (LC 1631), Cheapest Flights Within K Stops — similar minimax/path optimisation patterns"
time_complexity: "O(n^2 log n). Each of the n^2 cells is added to the heap at most once, and heap operations are O(log n^2) = O(log n)."
space_complexity: "O(n^2). The heap and visited set can each hold up to n^2 elements."
solutions:
- approach_name: Min-Heap (Dijkstra-like)
is_optimal: true
code: |
import heapq
def swim_in_water(grid: list[list[int]]) -> int:
    n = len(grid)
    # Heap stores (elevation, row, col)
    heap = [(grid[0][0], 0, 0)]
    visited = {(0, 0)}
    # Track the maximum elevation we've had to traverse
    result = grid[0][0]
    # 4 directions: right, left, down, up
    directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    while heap:
        # Get the cell with the smallest elevation
        elevation, row, col = heapq.heappop(heap)
        # Update our answer with the maximum elevation seen
        result = max(result, elevation)
        # If we've reached the destination, return the answer
        if row == n - 1 and col == n - 1:
            return result
        # Explore all 4 neighbours
        for dr, dc in directions:
            new_row, new_col = row + dr, col + dc
            # Check bounds and if not visited
            if 0 <= new_row < n and 0 <= new_col < n and (new_row, new_col) not in visited:
                visited.add((new_row, new_col))
                heapq.heappush(heap, (grid[new_row][new_col], new_row, new_col))
    return result  # Should never reach here for valid input
explanation: |
**Time Complexity:** O(n^2 log n) — Each cell is pushed/popped from the heap once, with O(log n^2) per operation.
**Space Complexity:** O(n^2) — For the heap and visited set.
This approach is a modified Dijkstra's algorithm. Instead of tracking cumulative distances, we track the maximum elevation encountered. The min-heap ensures we always expand the lowest-elevation frontier cell, guaranteeing optimality.
- approach_name: Binary Search + BFS
is_optimal: false
code: |
from collections import deque
def swim_in_water(grid: list[list[int]]) -> int:
    n = len(grid)
    def can_reach(threshold: int) -> bool:
        """Check if we can reach (n-1, n-1) using only cells with elevation <= threshold."""
        if grid[0][0] > threshold:
            return False
        queue = deque([(0, 0)])
        visited = {(0, 0)}
        directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
        while queue:
            row, col = queue.popleft()
            if row == n - 1 and col == n - 1:
                return True
            for dr, dc in directions:
                new_row, new_col = row + dr, col + dc
                if (0 <= new_row < n and 0 <= new_col < n and
                        (new_row, new_col) not in visited and
                        grid[new_row][new_col] <= threshold):
                    visited.add((new_row, new_col))
                    queue.append((new_row, new_col))
        return False
    # Binary search on the answer
    left = max(grid[0][0], grid[n-1][n-1])
    right = n * n - 1
    while left < right:
        mid = (left + right) // 2
        if can_reach(mid):
            right = mid  # Try a smaller threshold
        else:
            left = mid + 1  # Need a larger threshold
    return left
explanation: |
**Time Complexity:** O(n^2 log(n^2)) = O(n^2 log n) — Binary search has O(log n^2) iterations, each running BFS in O(n^2).
**Space Complexity:** O(n^2) — For the BFS queue and visited set.
This approach reframes the problem: "Given a water level `t`, can we reach the destination?" We binary search on `t` to find the minimum value where the answer is "yes". The BFS checks connectivity using only cells with elevation ≤ `t`.
- approach_name: Union-Find
is_optimal: false
code: |
def swim_in_water(grid: list[list[int]]) -> int:
    n = len(grid)
    # Union-Find data structure
    parent = list(range(n * n))
    rank = [0] * (n * n)
    def find(x: int) -> int:
        if parent[x] != x:
            parent[x] = find(parent[x])  # Path compression
        return parent[x]
    def union(x: int, y: int) -> None:
        px, py = find(x), find(y)
        if px == py:
            return
        # Union by rank
        if rank[px] < rank[py]:
            px, py = py, px
        parent[py] = px
        if rank[px] == rank[py]:
            rank[px] += 1
    # Map elevation to position
    elevation_to_pos = {}
    for r in range(n):
        for c in range(n):
            elevation_to_pos[grid[r][c]] = (r, c)
    directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    # Process cells in order of elevation
    for t in range(n * n):
        r, c = elevation_to_pos[t]
        # Union with any adjacent cell that has elevation <= t
        for dr, dc in directions:
            nr, nc = r + dr, c + dc
            if 0 <= nr < n and 0 <= nc < n and grid[nr][nc] <= t:
                union(r * n + c, nr * n + nc)
        # Check if start and end are connected
        if find(0) == find(n * n - 1):
            return t
    return n * n - 1  # Should never reach here for valid input
explanation: |
**Time Complexity:** O(n^2 α(n^2)) ≈ O(n^2) — We process each cell once, with near-constant time Union-Find operations.
**Space Complexity:** O(n^2) — For the parent and rank arrays, plus the elevation map.
This approach processes cells in increasing order of elevation. At each step, we union the current cell with any adjacent cells that are already "underwater" (elevation ≤ current time). We stop when the start and end cells become connected. This is theoretically the fastest approach due to the near-constant time Union-Find operations.

@@ -0,0 +1,218 @@
title: Target Sum
slug: target-sum
difficulty: medium
leetcode_id: 494
leetcode_url: https://leetcode.com/problems/target-sum/
categories:
- arrays
- dynamic-programming
patterns:
- dynamic-programming
- backtracking
description: |
You are given an integer array `nums` and an integer `target`.
You want to build an **expression** out of nums by adding one of the symbols `'+'` and `'-'` before each integer in nums and then concatenate all the integers.
For example, if `nums = [2, 1]`, you can add a `'+'` before `2` and a `'-'` before `1` and concatenate them to build the expression `"+2-1"`.
Return *the number of different expressions that you can build, which evaluates to* `target`.
constraints: |
- `1 <= nums.length <= 20`
- `0 <= nums[i] <= 1000`
- `0 <= sum(nums[i]) <= 1000`
- `-1000 <= target <= 1000`
examples:
- input: "nums = [1,1,1,1,1], target = 3"
output: "5"
explanation: "There are 5 ways to assign symbols to make the sum of nums be target 3: -1+1+1+1+1, +1-1+1+1+1, +1+1-1+1+1, +1+1+1-1+1, +1+1+1+1-1."
- input: "nums = [1], target = 1"
output: "1"
explanation: "There is only one way: +1 = 1."
explanation:
intuition: |
Imagine you have a set of coins, and for each coin you must decide whether to put it in a "positive pile" or a "negative pile". The question becomes: how many ways can you split the coins so that the positive pile minus the negative pile equals the target?
This reframing reveals a powerful insight. Let `P` be the sum of numbers assigned `+` and `N` be the sum assigned `-`. We know:
- `P + N = total` (all numbers are used)
- `P - N = target` (the goal)
Adding these equations: `2P = total + target`, so `P = (total + target) / 2`.
**The problem transforms into a subset sum problem**: find how many subsets of `nums` sum to exactly `P`. This is a classic dynamic programming pattern!
Think of it like this: instead of tracking every running sum in `[-total, total]` at each index, we only count subsets that reach one specific sum. That shrinks the state from two dimensions (index and signed sum) to a single array of size `P + 1`.
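A brute-force check can make the equivalence concrete. The sketch below is only meant for tiny inputs; `count_sign_assignments` and `count_subsets_with_sum` are illustrative helpers introduced here, not part of the solutions that follow.

```python
from itertools import combinations, product


def count_sign_assignments(nums, target):
    """Count +/- assignments whose signed sum equals target."""
    return sum(
        1
        for signs in product((1, -1), repeat=len(nums))
        if sum(s * x for s, x in zip(signs, nums)) == target
    )


def count_subsets_with_sum(nums, wanted):
    """Count subsets (chosen by index, so equal values stay distinct) summing to wanted."""
    return sum(
        1
        for size in range(len(nums) + 1)
        for idx in combinations(range(len(nums)), size)
        if sum(nums[i] for i in idx) == wanted
    )


nums, target = [1, 1, 1, 1, 1], 3
p = (sum(nums) + target) // 2  # (5 + 3) / 2 = 4

print(count_sign_assignments(nums, target))  # 5
print(count_subsets_with_sum(nums, p))       # 5
```

Both counts agree because choosing which numbers receive `+` is the same as choosing the subset that sums to `P`.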
approach: |
We solve this using **Dynamic Programming (Subset Sum Count)**:
**Step 1: Transform the problem**
- Calculate `total = sum(nums)`
- Calculate `subset_sum = (total + target) / 2`
- If `(total + target)` is odd, return `0` — no valid split exists
- If `total + target < 0`, return `0` — target is unreachable
&nbsp;
**Step 2: Initialise the DP array**
- Create `dp` array of size `subset_sum + 1`
- `dp[s]` represents the number of ways to form sum `s`
- Set `dp[0] = 1` — there's exactly one way to form sum `0` (use no elements)
&nbsp;
**Step 3: Fill the DP table**
- For each number `num` in `nums`:
- Iterate `s` from `subset_sum` down to `num` (reverse order is crucial!)
- Update: `dp[s] += dp[s - num]`
- This adds the count of ways to reach `s - num` (if we include `num`)
&nbsp;
**Step 4: Return the result**
- Return `dp[subset_sum]` — the number of subsets that sum to our target
&nbsp;
The reverse iteration in Step 3 ensures each number is used at most once per subset. If we iterated forward, we'd count the same number multiple times.
common_pitfalls:
- title: Brute Force Exponential Blowup
description: |
The naive approach tries all `2^n` combinations of `+` and `-` signs using recursion or backtracking.
With `n = 20`, this means up to `2^20 = 1,048,576` combinations. While this might pass for small inputs, it's inefficient and the DP approach is much faster.
More importantly, the brute force approach doesn't reveal the elegant mathematical structure of the problem.
wrong_approach: "Recursively try all 2^n sign combinations"
correct_approach: "Transform to subset sum and use DP"
- title: Forward Iteration in DP
description: |
When filling the DP array, you might be tempted to iterate `s` from `0` to `subset_sum`. This is wrong!
Forward iteration allows the same element to be counted multiple times. For example, with `num = 2`, updating `dp[2]` first would affect `dp[4]` in the same iteration, treating `2` as if it could be used twice.
Always iterate in reverse (from `subset_sum` down to `num`) to ensure each element is considered only once.
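The difference is easy to reproduce with a single element. In the sketch below, `count_subsets` and its `reverse` flag are hypothetical names used only to contrast the two loop orders.

```python
def count_subsets(nums, subset_sum, reverse=True):
    """Count subsets of nums summing to subset_sum with a 1D DP array."""
    dp = [0] * (subset_sum + 1)
    dp[0] = 1
    for num in nums:
        if reverse:
            order = range(subset_sum, num - 1, -1)  # each num used at most once
        else:
            order = range(num, subset_sum + 1)      # buggy: num can be reused
        for s in order:
            dp[s] += dp[s - num]
    return dp[subset_sum]


print(count_subsets([2], 4, reverse=True))   # 0: a single 2 cannot make 4
print(count_subsets([2], 4, reverse=False))  # 1: forward order counts 2 + 2
```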
wrong_approach: "for s in range(num, subset_sum + 1)"
correct_approach: "for s in range(subset_sum, num - 1, -1)"
- title: Forgetting Edge Cases
description: |
Several edge cases can trip you up:
- If `(total + target)` is odd, no valid partition exists, because `P = (total + target) / 2` would not be an integer and every subset of integers has an integer sum
- If `target > total` or `target < -total`, the target is unreachable
- If `total + target < 0`, the subset sum would be negative, which is impossible
Handle these before starting the DP computation.
- title: Zeros in the Array
description: |
Zeros double the count of ways! A zero can be assigned either `+` or `-` without changing the sum.
With `k` zeros in the array, each valid subset has `2^k` variations. The DP approach handles this automatically since `dp[s] += dp[s - 0]` effectively doubles the count.
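A quick experiment shows the doubling. The function below is a compact restatement of the subset-sum DP under the name `count_ways`, chosen here for illustration.

```python
def count_ways(nums, target):
    """Subset-sum DP for Target Sum; zeros are handled by the same loop."""
    total = sum(nums)
    if (total + target) % 2 != 0 or total + target < 0:
        return 0
    subset_sum = (total + target) // 2
    dp = [0] * (subset_sum + 1)
    dp[0] = 1
    for num in nums:
        # For num == 0 this loop runs dp[s] += dp[s], doubling every entry
        for s in range(subset_sum, num - 1, -1):
            dp[s] += dp[s - num]
    return dp[subset_sum]


print(count_ways([1], 1))        # 1
print(count_ways([0, 1], 1))     # 2
print(count_ways([0, 0, 1], 1))  # 4
```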
key_takeaways:
- "**Problem transformation**: Recognising that `+/-` assignment is equivalent to subset partitioning unlocks an efficient solution"
- "**Subset sum pattern**: Counting subsets that sum to a target is a foundational DP pattern — memorise the `dp[s] += dp[s - num]` recurrence"
- "**Reverse iteration trick**: When each element can only be used once, iterate the DP array in reverse to avoid double-counting"
- "**Mathematical insight**: Always look for ways to simplify the state space — transforming from tracking sums in `[-total, total]` to just `[0, subset_sum]` is a huge optimisation"
time_complexity: "O(n * subset_sum). We process each of the `n` numbers once, and for each number we update up to `subset_sum` entries in the DP array."
space_complexity: "O(subset_sum). We use a 1D DP array of size `subset_sum + 1`. This can be at most `O(total)` where `total = sum(nums)`."
solutions:
- approach_name: Dynamic Programming (Subset Sum)
is_optimal: true
code: |
def find_target_sum_ways(nums: list[int], target: int) -> int:
    total = sum(nums)
    # Edge cases: impossible to reach target
    if (total + target) % 2 != 0:  # Can't split into equal parts
        return 0
    if total + target < 0:  # Target too negative
        return 0
    # Transform: find subsets that sum to this value
    subset_sum = (total + target) // 2
    # dp[s] = number of ways to form sum s
    dp = [0] * (subset_sum + 1)
    dp[0] = 1  # One way to form sum 0: use nothing
    for num in nums:
        # Iterate in reverse to avoid using same element twice
        for s in range(subset_sum, num - 1, -1):
            dp[s] += dp[s - num]
    return dp[subset_sum]
explanation: |
**Time Complexity:** O(n * subset_sum) — For each number, we update the DP array.
**Space Complexity:** O(subset_sum) — 1D DP array.
We transform the problem into counting subsets that sum to `(total + target) / 2`. The 1D DP array counts ways to form each possible sum, updated in reverse order to ensure each number is used once.
- approach_name: Recursion with Memoisation
is_optimal: false
code: |
def find_target_sum_ways(nums: list[int], target: int) -> int:
    from functools import lru_cache
    @lru_cache(maxsize=None)
    def count_ways(index: int, current_sum: int) -> int:
        # Base case: processed all numbers
        if index == len(nums):
            return 1 if current_sum == target else 0
        # Try adding current number with + or -
        add = count_ways(index + 1, current_sum + nums[index])
        subtract = count_ways(index + 1, current_sum - nums[index])
        return add + subtract
    return count_ways(0, 0)
explanation: |
**Time Complexity:** O(n * total_sum) — Each unique (index, sum) state is computed once.
**Space Complexity:** O(n * total_sum) — Memoisation cache for all states.
This approach directly models the problem: at each index, we can add or subtract the current number. Memoisation prevents recomputing the same (index, current_sum) pairs. While intuitive, it uses more space than the subset sum DP approach.
- approach_name: Brute Force (Backtracking)
is_optimal: false
code: |
def find_target_sum_ways(nums: list[int], target: int) -> int:
    count = 0
    def backtrack(index: int, current_sum: int) -> None:
        nonlocal count
        # Base case: used all numbers
        if index == len(nums):
            if current_sum == target:
                count += 1
            return
        # Try + and - for current number
        backtrack(index + 1, current_sum + nums[index])
        backtrack(index + 1, current_sum - nums[index])
    backtrack(0, 0)
    return count
explanation: |
**Time Complexity:** O(2^n) — Every number has two choices, leading to exponential combinations.
**Space Complexity:** O(n) — Recursion stack depth.
This brute force approach explicitly tries all `2^n` combinations of `+` and `-` signs. While correct and easy to understand, it's inefficient for larger inputs. Included to show the problem's natural recursive structure before optimisation.

@@ -0,0 +1,207 @@
title: Task Scheduler
slug: task-scheduler
difficulty: medium
leetcode_id: 621
leetcode_url: https://leetcode.com/problems/task-scheduler/
categories:
- arrays
- hash-tables
- heap
patterns:
- greedy
- heap
description: |
You are given an array of CPU `tasks`, each labelled with a letter from A to Z, and a number `n`. Each CPU interval can be idle or allow the completion of one task. Tasks can be completed in any order, but there's a constraint: there has to be a gap of **at least** `n` intervals between two tasks with the same label.
Return *the minimum number of CPU intervals required to complete all tasks*.
constraints: |
- `1 <= tasks.length <= 10^4`
- `tasks[i]` is an uppercase English letter
- `0 <= n <= 100`
examples:
- input: 'tasks = ["A","A","A","B","B","B"], n = 2'
output: "8"
explanation: "A possible sequence is: A -> B -> idle -> A -> B -> idle -> A -> B. After completing task A, you must wait two intervals before doing A again. The same applies to task B."
- input: 'tasks = ["A","C","A","B","D","B"], n = 1'
output: "6"
explanation: "A possible sequence is: A -> B -> C -> D -> A -> B. With a cooling interval of 1, you can repeat a task after just one other task."
- input: 'tasks = ["A","A","A","B","B","B"], n = 3'
output: "10"
explanation: "A possible sequence is: A -> B -> idle -> idle -> A -> B -> idle -> idle -> A -> B. There are only two types of tasks, A and B, which need to be separated by 3 intervals."
explanation:
intuition: |
Imagine you're a CPU scheduler trying to execute tasks with mandatory cooldown periods. The **most frequent task** is your bottleneck — it determines the minimum structure of your schedule.
Think of it like this: if task A appears 3 times and must have 2 intervals between repetitions, you need at least the slots: `A _ _ A _ _ A`. That's a frame of `(count_A - 1) * (n + 1) + 1` intervals just for A.
The key insight is that **idle slots only appear when you don't have enough other tasks to fill the gaps**. If you have many different tasks, they can fill the cooling gaps perfectly, and you never idle. But if your tasks are dominated by one or two high-frequency labels, you'll have idle slots.
Visualise the schedule as a grid:
- Each row represents a "cycle" of `n + 1` slots
- The most frequent tasks occupy the first column
- Other tasks fill in the remaining slots
- Empty slots become idle time
For `tasks = [A,A,A,B,B,B]` with `n = 2`:
```
| A | B | idle |
| A | B | idle |
| A | B | |
```
This gives us `(3-1) * 3 + 2 = 8` intervals.
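Reading the grid above row by row gives a concrete schedule, which can be checked mechanically. The sketch below is illustrative: `is_valid_schedule` is a hypothetical helper, and it assumes idle slots are written as the string `"idle"`.

```python
def is_valid_schedule(schedule, n):
    """True if every pair of equal labels has at least n intervals between them."""
    last_seen = {}
    for i, task in enumerate(schedule):
        if task == "idle":
            continue
        # Positions j < i with the same label need i - j - 1 >= n
        if task in last_seen and i - last_seen[task] <= n:
            return False
        last_seen[task] = i
    return True


rows = [["A", "B", "idle"], ["A", "B", "idle"], ["A", "B"]]
schedule = [slot for row in rows for slot in row]

print(len(schedule))                   # 8
print(is_valid_schedule(schedule, 2))  # True
print(is_valid_schedule(schedule, 3))  # False: the A's are only 2 apart
```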
approach: |
We solve this using a **Greedy (Math) Approach** based on the most frequent task:
**Step 1: Count task frequencies**
- Use a hash map or counter to count occurrences of each task
- Find `max_count`: the highest frequency among all tasks
- Find `num_max`: how many tasks have this maximum frequency
&nbsp;
**Step 2: Calculate the frame size**
- The most frequent task creates a "frame" of `(max_count - 1)` complete cycles
- Each cycle has `n + 1` slots (the task itself plus `n` cooling intervals)
- Frame size = `(max_count - 1) * (n + 1)`
&nbsp;
**Step 3: Add the final row**
- After the last cycle, we still need to execute the tasks with maximum frequency one more time
- Add `num_max` to account for all tasks that appear `max_count` times
&nbsp;
**Step 4: Handle the edge case**
- If we have many diverse tasks, they can fill all gaps with no idle time
- In this case, the answer is simply `len(tasks)` (no idle needed)
- Return `max(len(tasks), frame_size + num_max)`
&nbsp;
The formula works because of the greedy insight: place the most frequent tasks first, and idle time exists only when the remaining tasks cannot fill the frame.
common_pitfalls:
- title: Simulating the Actual Schedule
description: |
A common approach is to simulate task execution using a heap, popping the most frequent task, decrementing it, and tracking cooldowns. While this works, its running time is proportional to the length of the final schedule, idle slots included, which can approach `tasks.length * (n + 1)`.
For `tasks.length = 10^4` and `n = 100`, simulation can be slow. The math-based greedy approach runs in **O(tasks.length)** time with just counting.
wrong_approach: "Heap-based simulation with cooldown tracking"
correct_approach: "Mathematical formula based on max frequency"
- title: Forgetting Tasks with Same Max Frequency
description: |
If multiple tasks share the maximum frequency, they all need a slot in the final row.
For example, with `tasks = [A,A,A,B,B,B]` and `n = 2`:
- Both A and B appear 3 times
- The final row needs both A and B: `... A B`
- Formula: `(3-1) * 3 + 2 = 8`, not `(3-1) * 3 + 1 = 7`
Always count how many tasks have the maximum frequency (`num_max`).
wrong_approach: "Only adding 1 for the final row"
correct_approach: "Adding num_max (count of tasks with max frequency)"
- title: Ignoring the No-Idle Case
description: |
When `n` is small or tasks are highly diverse, you might not need any idle time at all.
For `tasks = [A,A,A,B,B,C,C,D,D]` with `n = 1`, the formula gives `(3-1) * 2 + 1 = 5`, which is less than `len(tasks) = 9`. Every task still needs its own interval, and a schedule such as `A B A C A D B C D` uses exactly 9 intervals with no idle time.
Always return `max(len(tasks), calculated_result)`.
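A small case where the frame formula undershoots can be checked directly. In this sketch, `frame_length` and `is_valid_schedule` are hypothetical helpers, and the task list and schedule are hand-built for illustration.

```python
from collections import Counter


def frame_length(tasks, n):
    """The frame formula on its own, without the max() against len(tasks)."""
    freq = Counter(tasks)
    max_count = max(freq.values())
    num_max = sum(1 for count in freq.values() if count == max_count)
    return (max_count - 1) * (n + 1) + num_max


def is_valid_schedule(schedule, n):
    """True if equal labels are always separated by at least n intervals."""
    last_seen = {}
    for i, task in enumerate(schedule):
        if task in last_seen and i - last_seen[task] <= n:
            return False
        last_seen[task] = i
    return True


tasks = ["A", "A", "A", "B", "B", "C", "C", "D", "D"]
n = 1
schedule = ["A", "B", "A", "C", "A", "D", "B", "C", "D"]

print(frame_length(tasks, n))                   # 5, fewer than the 9 tasks
print(max(len(tasks), frame_length(tasks, n)))  # 9
print(is_valid_schedule(schedule, n))           # True, and it has no idle slots
```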
wrong_approach: "Returning the formula result directly"
correct_approach: "Return max of formula result and total task count"
key_takeaways:
- "**Greedy insight**: The most frequent task determines the minimum schedule structure — arrange around its cooldown requirements"
- "**Math over simulation**: Many scheduling problems have closed-form solutions based on counting, avoiding expensive simulation"
- "**Frame visualisation**: Think of the schedule as a grid with `n+1` columns; idle slots only appear when you can't fill the frame"
- "**Related problems**: This pattern applies to Task Scheduler II, Reorganize String, and other cooldown/spacing problems"
time_complexity: "O(N), where N = len(tasks) (distinct from the cooldown n). We iterate through the tasks once to count frequencies, then compute the result in constant time."
space_complexity: "O(1). We use a fixed-size counter (at most 26 letters), which is constant regardless of input size."
solutions:
- approach_name: Greedy (Math Formula)
is_optimal: true
code: |
from collections import Counter
def least_interval(tasks: list[str], n: int) -> int:
    # Count frequency of each task
    freq = Counter(tasks)
    # Find the maximum frequency
    max_count = max(freq.values())
    # Count how many tasks have this maximum frequency
    num_max = sum(1 for count in freq.values() if count == max_count)
    # Calculate minimum intervals using the frame formula
    # (max_count - 1) complete cycles of (n + 1) slots each
    # Plus num_max tasks in the final partial row
    frame_size = (max_count - 1) * (n + 1) + num_max
    # If we have many diverse tasks, we might not need idle time
    # Return the maximum of frame size and total tasks
    return max(len(tasks), frame_size)
explanation: |
**Time Complexity:** O(N), where N = `len(tasks)`: a single pass to count frequencies, then O(1) math operations.
**Space Complexity:** O(1) — Counter uses at most 26 keys (uppercase letters).
The formula `(max_count - 1) * (n + 1) + num_max` calculates the minimum intervals by considering the most frequent task as the scheduling backbone. The `max()` with `len(tasks)` handles cases where tasks are diverse enough to avoid any idle time.
- approach_name: Max Heap Simulation
is_optimal: false
code: |
from collections import Counter
import heapq
def least_interval(tasks: list[str], n: int) -> int:
    # Count frequency of each task
    freq = Counter(tasks)
    # Max heap of remaining counts (negate for max heap behavior)
    heap = [-count for count in freq.values()]
    heapq.heapify(heap)
    time = 0
    while heap:
        temp = []  # Tasks that need to wait for cooldown
        # Process up to n+1 tasks in this cycle
        for _ in range(n + 1):
            if heap:
                # Pop most frequent task and decrement
                count = heapq.heappop(heap)
                if count + 1 < 0:  # Still has remaining executions
                    temp.append(count + 1)
            time += 1
            # If no more tasks in heap or temp, we're done
            if not heap and not temp:
                break
        # Push tasks back that completed their cooldown
        for count in temp:
            heapq.heappush(heap, count)
    return time
explanation: |
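**Time Complexity:** O(N * n), where N is the number of tasks and n is the cooldown period. The loop runs once per interval of the final schedule, which has at most N * (n + 1) intervals, and each heap operation is O(log 26) = O(1).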
**Space Complexity:** O(26) = O(1) — Heap contains at most 26 different task types.
This simulation approach uses a max heap to always process the most frequent remaining task. While intuitive and correct, it's slower than the math formula for large inputs. It's useful for understanding the problem mechanics or when you need to output the actual schedule.

@@ -0,0 +1,183 @@
title: 3Sum Closest
slug: three-sum-closest
difficulty: medium
leetcode_id: 16
leetcode_url: https://leetcode.com/problems/3sum-closest/
categories:
- arrays
- two-pointers
- sorting
patterns:
- two-pointers
description: |
Given an integer array `nums` of length `n` and an integer `target`, find three integers in `nums` such that the sum is **closest** to `target`.
Return *the sum of the three integers*.
You may assume that each input would have exactly one solution.
constraints: |
- `3 <= nums.length <= 500`
- `-1000 <= nums[i] <= 1000`
- `-10^4 <= target <= 10^4`
examples:
- input: "nums = [-1,2,1,-4], target = 1"
output: "2"
explanation: "The sum that is closest to the target is 2. (-1 + 2 + 1 = 2)."
- input: "nums = [0,0,0], target = 1"
output: "0"
explanation: "The sum that is closest to the target is 0. (0 + 0 + 0 = 0)."
explanation:
intuition: |
This problem is a close cousin of 3Sum, with one key difference: instead of finding triplets that sum to exactly zero, we want the triplet whose sum is **closest** to a target value.
Think of it like a number line. Imagine plotting all possible triplet sums on this line, with the target marked. Your job is to find which triplet sum lands nearest to the target — whether it lands exactly on it, slightly below, or slightly above.
The key insight is the same as 3Sum: **sorting unlocks the two-pointer technique**. Once sorted, if our current sum is too small, we can systematically increase it by moving the left pointer right. If too large, we move the right pointer left. At each step, we check: is this sum closer to the target than our best so far?
Unlike 3Sum (which collects all exact matches), we only care about the single closest sum, so we track a "best" answer and update it whenever we find something closer.
approach: |
We solve this using **Sort + Two Pointers**:
**Step 1: Sort the array**
- Sorting is essential for the two-pointer technique
- After sorting, we can adjust our sum by moving pointers left or right
&nbsp;
**Step 2: Initialise tracking variable**
- `closest_sum`: Store the sum of the first three elements as our initial best guess
- We'll update this whenever we find a sum closer to the target
&nbsp;
**Step 3: Fix the first element and use two pointers**
- For each index `i` from 0 to n-3:
- Set `left = i + 1`, `right = n - 1`
- While `left < right`:
- Calculate `current_sum = nums[i] + nums[left] + nums[right]`
- If `current_sum == target`: return immediately — we can't get closer than exact
- If `abs(current_sum - target) < abs(closest_sum - target)`: update `closest_sum`
- If `current_sum < target`: move `left` right to increase the sum
- If `current_sum > target`: move `right` left to decrease the sum
&nbsp;
**Step 4: Return the closest sum**
- After checking all triplets, return `closest_sum`
common_pitfalls:
- title: Not Handling Early Termination
description: |
If you find a sum that **exactly equals** the target, return immediately. The distance is zero — you can't possibly find anything closer.
Without this optimisation, you'll waste time checking remaining triplets when the answer is already known.
wrong_approach: "Continuing to search after finding an exact match"
correct_approach: "if current_sum == target: return target"
- title: Brute Force Approach
description: |
The naive approach checks all possible triplets with three nested loops:
- Outer loop `i` from 0 to n-3
- Middle loop `j` from i+1 to n-2
- Inner loop `k` from j+1 to n-1
This results in **O(n³) time complexity**. While the constraint `n <= 500` might allow this to pass, it's unnecessarily slow.
With sorting + two pointers, we reduce to O(n²). For the maximum input size that is roughly 124,000 pointer moves instead of about 20.7 million triplets, around 170x less work.
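The amount of work for each approach at the maximum input size can be counted exactly. This is a back-of-the-envelope sketch: it assumes no early exit on an exact match, and the variable names are illustrative.

```python
from math import comb

n = 500

# Brute force examines every unordered triplet once
triplets = comb(n, 3)

# Two pointers: for each fixed i, the window shrinks by one per step,
# giving (n - i - 2) steps; summed over i this is (n - 1)(n - 2) / 2
pointer_steps = sum(n - i - 2 for i in range(n - 2))

print(triplets)                   # 20708500
print(pointer_steps)              # 124251
print(triplets // pointer_steps)  # 166
```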
wrong_approach: "Three nested loops checking all triplets"
correct_approach: "Sort + two pointers for O(n²)"
- title: Incorrect Distance Comparison
description: |
When comparing which sum is closer, use **absolute difference**: `abs(current_sum - target)`.
A common mistake is comparing raw differences without absolute value, which fails when sums can be both above and below the target.
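One concrete pair of sums is enough to show the failure. The two comparison helpers below are hypothetical names written for this illustration.

```python
def is_closer_raw(current_sum, closest_sum, target):
    # Buggy: signed differences, so any sum far below target looks "closer"
    return current_sum - target < closest_sum - target


def is_closer_abs(current_sum, closest_sum, target):
    # Correct: compare distances to the target
    return abs(current_sum - target) < abs(closest_sum - target)


target = 10
closest_sum = 11  # distance 1 from the target
current_sum = 3   # distance 7 from the target

print(is_closer_raw(current_sum, closest_sum, target))  # True (wrongly replaces 11 with 3)
print(is_closer_abs(current_sum, closest_sum, target))  # False
```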
wrong_approach: "current_sum - target < closest_sum - target"
correct_approach: "abs(current_sum - target) < abs(closest_sum - target)"
key_takeaways:
- "**Two pointers on sorted arrays**: Sorting transforms the search from O(n²) per fixed element to O(n)"
- "**Track best-so-far for optimisation problems**: Unlike 3Sum (collect all matches), we maintain a single best answer"
- "**Early termination on exact match**: When distance is zero, no further search is needed"
- "**3Sum variant pattern**: Many problems (3Sum, 3Sum Closest, 3Sum Smaller) use the same sort + two-pointer framework with small variations"
time_complexity: "O(n²). Sorting is O(n log n), then for each of n elements, the two-pointer search is O(n). The dominant term is O(n × n) = O(n²)."
space_complexity: "O(log n) to O(n). Depends on the sorting algorithm used: O(log n) for in-place sorts like quicksort (recursion stack), O(n) for others like mergesort or Python's Timsort."
solutions:
- approach_name: Sort + Two Pointers
is_optimal: true
code: |
def three_sum_closest(nums: list[int], target: int) -> int:
    nums.sort()  # Enable two-pointer technique
    n = len(nums)
    # Start with sum of first three elements as initial guess
    closest_sum = nums[0] + nums[1] + nums[2]
    for i in range(n - 2):
        # Two pointers for the remaining array
        left, right = i + 1, n - 1
        while left < right:
            current_sum = nums[i] + nums[left] + nums[right]
            # Exact match: can't get closer than this
            if current_sum == target:
                return target
            # Update closest if this sum is nearer to target
            if abs(current_sum - target) < abs(closest_sum - target):
                closest_sum = current_sum
            # Adjust pointers based on comparison with target
            if current_sum < target:
                left += 1  # Need larger sum
            else:
                right -= 1  # Need smaller sum
    return closest_sum
explanation: |
**Time Complexity:** O(n²) — O(n log n) sort + O(n) two-pointer search for each of O(n) elements.
**Space Complexity:** O(log n) to O(n) — Sorting space only.
We sort the array, then for each element, use two pointers to explore sums. We track the closest sum seen, updating when we find one nearer to the target. Early termination on exact match provides a minor optimisation.
- approach_name: Brute Force
is_optimal: false
code: |
def three_sum_closest(nums: list[int], target: int) -> int:
    n = len(nums)
    # Start with sum of first three elements
    closest_sum = nums[0] + nums[1] + nums[2]
    # Check all possible triplets
    for i in range(n - 2):
        for j in range(i + 1, n - 1):
            for k in range(j + 1, n):
                current_sum = nums[i] + nums[j] + nums[k]
                # Update closest if this sum is nearer
                if abs(current_sum - target) < abs(closest_sum - target):
                    closest_sum = current_sum
                # Early exit on exact match
                if closest_sum == target:
                    return target
    return closest_sum
explanation: |
**Time Complexity:** O(n³) — Three nested loops checking all triplets.
**Space Complexity:** O(1) — Only tracking the closest sum.
This approach checks every possible triplet combination. While correct, the O(n³) complexity makes it significantly slower than the two-pointer approach. For n = 500, this means roughly 20.7 million triplets (C(500, 3)) versus on the order of 250,000 (n²) pointer steps with the optimal approach.
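The size of that gap can be checked with a quick count. The sketch below assumes n = 500 (the figure used above); the helper names are illustrative and not part of the solutions. It counts triplets exactly and gives an upper bound on two-pointer iterations, which comes out below the n² estimate.

```python
from math import comb


def brute_force_triplets(n: int) -> int:
    # Number of index triples (i, j, k) with i < j < k
    return comb(n, 3)


def two_pointer_steps_upper_bound(n: int) -> int:
    # For a fixed i, left and right start n - i - 2 positions apart
    # and each loop iteration moves them one position closer
    return sum(n - i - 2 for i in range(n - 2))


n = 500
print(brute_force_triplets(n))           # 20708500
print(two_pointer_steps_upper_bound(n))  # 124251
```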

@@ -0,0 +1,194 @@
title: Top K Frequent Elements
slug: top-k-frequent-elements
difficulty: medium
leetcode_id: 347
leetcode_url: https://leetcode.com/problems/top-k-frequent-elements/
categories:
- arrays
- hash-tables
- heap
- sorting
patterns:
- heap
description: |
Given an integer array `nums` and an integer `k`, return *the* `k` *most frequent elements*. You may return the answer in **any order**.
constraints: |
- `1 <= nums.length <= 10^5`
- `-10^4 <= nums[i] <= 10^4`
- `k` is in the range `[1, the number of unique elements in the array]`
- It is **guaranteed** that the answer is **unique**
**Follow up:** Your algorithm's time complexity must be better than `O(n log n)`, where `n` is the array's size.
examples:
- input: "nums = [1,1,1,2,2,3], k = 2"
output: "[1,2]"
explanation: "Element 1 appears 3 times, element 2 appears 2 times, and element 3 appears once. The two most frequent elements are 1 and 2."
- input: "nums = [1], k = 1"
output: "[1]"
explanation: "There is only one element, and we need the top 1 most frequent element."
explanation:
intuition: |
Imagine you're analysing survey results and need to find the most popular choices. Your first instinct might be to count how many times each option appears, then sort by popularity. But can we do better than sorting?
The key insight is that **frequency values are bounded**. If you have `n` elements, the maximum possible frequency is `n` (when all elements are the same). This means we can use the frequency itself as an index into an array — a technique called **bucket sort**.
Think of it like this: create `n + 1` buckets labelled by frequency (0 through n). After counting each element's frequency, drop each element into its corresponding bucket. Then, starting from the highest-frequency bucket, collect elements until you have `k` of them.
This approach cleverly avoids comparison-based sorting (which has an `O(n log n)` lower bound) by using the frequency as a direct index.
approach: |
We solve this using a **Bucket Sort** approach:
**Step 1: Count frequencies**
- Use a hash map to count how many times each element appears
- Key: the element value, Value: its frequency
- This takes `O(n)` time with a single pass through the array
&nbsp;
**Step 2: Create frequency buckets**
- Create an array of `n + 1` empty lists (buckets), where index `i` will hold elements that appear exactly `i` times
- The maximum possible frequency is `n` (if all elements are identical)
&nbsp;
**Step 3: Fill the buckets**
- For each element in our frequency map, add it to the bucket corresponding to its frequency
- If element `x` appears 3 times, put `x` in `bucket[3]`
&nbsp;
**Step 4: Collect top k elements**
- Starting from the highest frequency bucket (index `n`), work backwards
- Add elements from each bucket to the result until we have `k` elements
- Return the result
&nbsp;
This works because elements in higher-indexed buckets have higher frequencies, so by traversing from high to low, we naturally get the most frequent elements first.
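To make Steps 1 to 3 concrete, here is a small trace on the first example. It is only an illustrative sketch: `build_buckets` is a hypothetical helper name, and `Counter` stands in for the manual hash-map counting.

```python
from collections import Counter


def build_buckets(nums: list[int]) -> list[list[int]]:
    count = Counter(nums)  # value -> frequency
    # Index i holds the values that appear exactly i times
    buckets = [[] for _ in range(len(nums) + 1)]
    for value, freq in count.items():
        buckets[freq].append(value)
    return buckets


print(build_buckets([1, 1, 1, 2, 2, 3]))
# [[], [3], [2], [1], [], [], []]
```

Reading the buckets from the right, the first two non-empty ones yield `1` and then `2`, which matches the expected output for `k = 2`.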
common_pitfalls:
- title: Sorting the Entire Frequency Map
description: |
A common approach is to count frequencies, then sort all elements by their frequency. While correct, sorting takes `O(n log n)` time.
The follow-up explicitly asks for better than `O(n log n)`. Bucket sort achieves `O(n)` by exploiting the bounded range of frequencies.
wrong_approach: "Sort all elements by frequency"
correct_approach: "Use bucket sort with frequency as index"
- title: Using a Max Heap Without Size Limit
description: |
Building a max heap of all `m` unique elements and extracting `k` times works: heapify is `O(m)` and each extraction is `O(log m)`, for `O(m + k log m)` after the `O(n)` counting pass. The drawback is that the heap holds every unique element at once.
A **min heap of size k** avoids this: maintain only the current top `k` elements, evicting the minimum whenever the heap exceeds `k`. This costs `O(m log k)` after counting (at most `O(n log k)` overall) and only `O(k)` heap space, which matters when `k << m`.
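In Python, the bounded selection can also be delegated to the standard library's `heapq.nlargest`. The sketch below is an alternative phrasing, not one of the listed solutions; the function name is hypothetical, and how `nlargest` manages its heap internally is an implementation detail rather than something this example relies on.

```python
import heapq
from collections import Counter


def top_k_frequent_nlargest(nums: list[int], k: int) -> list[int]:
    count = Counter(nums)  # value -> frequency
    # Select the k keys with the highest frequency
    return heapq.nlargest(k, count.keys(), key=count.get)
```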
wrong_approach: "Max heap of all elements, extract k times"
correct_approach: "Min heap of size k, or bucket sort for O(n)"
- title: Off-by-One in Bucket Array Size
description: |
If you create only `n` buckets (indices 0 to n-1), you'll miss the case where an element appears `n` times (all elements are identical).
Create `n + 1` buckets to handle frequencies from 0 to `n` inclusive.
wrong_approach: "buckets = [[] for _ in range(n)]"
correct_approach: "buckets = [[] for _ in range(n + 1)]"
key_takeaways:
- "**Bucket sort** is powerful when values are bounded — use the value itself as an array index to avoid comparison-based sorting"
- "**Frequency counting + bucketing** is a common pattern for \"top k\" problems with bounded frequencies"
- "**Min heap of size k** is another useful technique: maintain only what you need, not everything"
- "This problem appears frequently in interviews and tests understanding of time complexity tradeoffs"
time_complexity: "O(n). We make one pass to count frequencies and one pass to fill and traverse buckets."
space_complexity: "O(n). We use a hash map for counts and an array of buckets, both proportional to the input size."
solutions:
- approach_name: Bucket Sort
is_optimal: true
code: |
def top_k_frequent(nums: list[int], k: int) -> list[int]:
    # Step 1: Count frequency of each element
    count = {}
    for num in nums:
        count[num] = count.get(num, 0) + 1
    # Step 2: Create buckets where index = frequency
    # Max frequency is n (all elements identical)
    n = len(nums)
    buckets = [[] for _ in range(n + 1)]
    # Step 3: Place each element in its frequency bucket
    for num, freq in count.items():
        buckets[freq].append(num)
    # Step 4: Collect k elements from highest frequency buckets
    result = []
    for freq in range(n, 0, -1):  # Start from highest frequency
        for num in buckets[freq]:
            result.append(num)
            if len(result) == k:
                return result
    return result
explanation: |
**Time Complexity:** O(n) — One pass to count, one pass to bucket, one pass to collect.
**Space Complexity:** O(n) — Hash map and bucket array.
Bucket sort exploits the fact that frequencies are bounded by `n`. By using frequency as an index, we avoid comparison-based sorting entirely and achieve linear time.
- approach_name: Min Heap
is_optimal: false
code: |
import heapq

def top_k_frequent(nums: list[int], k: int) -> list[int]:
    # Step 1: Count frequency of each element
    count = {}
    for num in nums:
        count[num] = count.get(num, 0) + 1
    # Step 2: Use min heap to keep only top k elements
    # Heap contains (frequency, element) tuples
    heap = []
    for num, freq in count.items():
        heapq.heappush(heap, (freq, num))
        # If heap exceeds size k, remove the minimum
        if len(heap) > k:
            heapq.heappop(heap)
    # Step 3: Extract elements from heap
    return [num for freq, num in heap]
explanation: |
**Time Complexity:** O(n log k) — We push each unique element onto a heap of size at most `k`.
**Space Complexity:** O(n) — Hash map for counts, plus O(k) for the heap.
The min heap approach maintains only the top `k` elements at any time. When we see a new element, if its frequency is higher than the minimum in our heap (and the heap is full), we replace it. This is more efficient than a max heap of all elements when `k` is small.
- approach_name: Sorting
is_optimal: false
code: |
def top_k_frequent(nums: list[int], k: int) -> list[int]:
    # Step 1: Count frequency of each element
    count = {}
    for num in nums:
        count[num] = count.get(num, 0) + 1
    # Step 2: Sort by frequency (descending) and take top k
    sorted_elements = sorted(count.keys(), key=lambda x: count[x], reverse=True)
    return sorted_elements[:k]
explanation: |
**Time Complexity:** O(n log n) — Sorting dominates.
**Space Complexity:** O(n) — Hash map and sorted list.
This straightforward approach counts frequencies then sorts. While simple to implement, it doesn't meet the follow-up requirement of better than `O(n log n)`. Included here to contrast with the optimal solutions.

@@ -0,0 +1,168 @@
title: Transpose Matrix
slug: transpose-matrix
difficulty: easy
leetcode_id: 867
leetcode_url: https://leetcode.com/problems/transpose-matrix/
categories:
- arrays
patterns:
- matrix-traversal
description: |
Given a 2D integer array `matrix`, return *the **transpose** of* `matrix`.
The **transpose** of a matrix is the matrix flipped over its main diagonal, switching the matrix's row and column indices.
In other words, for every element at position `matrix[i][j]`, its transposed position becomes `result[j][i]`.
constraints: |
- `m == matrix.length`
- `n == matrix[i].length`
- `1 <= m, n <= 1000`
- `1 <= m * n <= 10^5`
- `-10^9 <= matrix[i][j] <= 10^9`
examples:
- input: "matrix = [[1,2,3],[4,5,6],[7,8,9]]"
output: "[[1,4,7],[2,5,8],[3,6,9]]"
explanation: "The 3x3 matrix is flipped along its main diagonal. Row 0 becomes column 0, row 1 becomes column 1, etc."
- input: "matrix = [[1,2,3],[4,5,6]]"
output: "[[1,4],[2,5],[3,6]]"
explanation: "The 2x3 matrix becomes a 3x2 matrix. Each row of the original becomes a column in the result."
explanation:
intuition: |
Imagine physically rotating a matrix 90 degrees clockwise, then flipping it horizontally — that's essentially what transposing does, but there's an easier way to think about it.
Picture each element in the matrix as having a "home address" of `(row, column)`. When you transpose, every element simply **swaps its row and column indices**. The element at `(0, 2)` moves to `(2, 0)`. The element at `(1, 0)` moves to `(0, 1)`.
Think of it like this: if you were reading the original matrix row by row from left to right, the transposed matrix reads the original *column by column* from top to bottom.
The key insight is that a matrix with dimensions `m × n` becomes `n × m` after transposition. This means we need to create a **new matrix** with swapped dimensions — we can't just rearrange elements in place (unless the matrix is square).
approach: |
We solve this using a **Direct Mapping Approach**:
**Step 1: Determine the dimensions**
- Get `m` (number of rows) and `n` (number of columns) from the original matrix
- The result will have dimensions `n × m` (rows and columns swapped)
&nbsp;
**Step 2: Create the result matrix**
- Initialise a new matrix with `n` rows and `m` columns
- This can be done by creating `n` empty lists, each eventually containing `m` elements
&nbsp;
**Step 3: Copy elements with swapped indices**
- Iterate through each element in the original matrix at position `(i, j)`
- Place it in the result matrix at position `(j, i)`
- This naturally swaps rows and columns
&nbsp;
**Step 4: Return the transposed matrix**
- After processing all elements, return the result matrix
&nbsp;
This approach works because the mathematical definition of transpose is exactly this index swap: `result[j][i] = matrix[i][j]` for all valid `i` and `j`.
common_pitfalls:
- title: Confusing Transpose with Rotation
description: |
A common mistake is thinking transpose means rotating 90 degrees. They're related but different:
- **Transpose**: Swap row and column indices — `(i,j)` → `(j,i)`
- **Rotate 90° clockwise**: More complex — `(i,j)` → `(j, m-1-i)`
For a 3x3 matrix `[[1,2,3],[4,5,6],[7,8,9]]`:
- Transpose gives `[[1,4,7],[2,5,8],[3,6,9]]`
- 90° rotation gives `[[7,4,1],[8,5,2],[9,6,3]]`
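The relationship between the two operations can be sketched in a few lines: a 90° clockwise rotation is a transpose followed by reversing each row. The helper names here are illustrative only.

```python
def transpose(matrix: list[list[int]]) -> list[list[int]]:
    # (i, j) -> (j, i)
    return [list(row) for row in zip(*matrix)]


def rotate_clockwise(matrix: list[list[int]]) -> list[list[int]]:
    # Transpose, then reverse each row: (i, j) -> (j, m - 1 - i)
    return [row[::-1] for row in transpose(matrix)]


matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(transpose(matrix))         # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
print(rotate_clockwise(matrix))  # [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
```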
wrong_approach: "Applying rotation logic instead of index swap"
correct_approach: "Simply swap indices: result[j][i] = matrix[i][j]"
- title: In-Place Modification for Non-Square Matrices
description: |
With a square matrix (`n × n`), you might be tempted to swap elements in place. But for a non-square matrix (`m × n` where `m ≠ n`), the dimensions change!
A `2 × 3` matrix becomes `3 × 2`. You cannot do this in place because the result has a different shape than the input. Always create a new matrix with the transposed dimensions.
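For the square case only, swapping across the main diagonal in place is possible. A minimal sketch, assuming the caller wants the input mutated; `transpose_square_in_place` is a hypothetical helper and not one of the solutions in this file.

```python
def transpose_square_in_place(matrix: list[list[int]]) -> None:
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("in-place transpose requires a square matrix")
    for i in range(n):
        # Only visit cells above the diagonal so each pair is swapped once
        for j in range(i + 1, n):
            matrix[i][j], matrix[j][i] = matrix[j][i], matrix[i][j]


grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
transpose_square_in_place(grid)
print(grid)  # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```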
wrong_approach: "Trying to swap elements in place for m × n matrix"
correct_approach: "Create a new n × m matrix and map elements"
- title: Wrong Dimension Order
description: |
When creating the result matrix, remember the dimensions flip. If the original is `m` rows by `n` columns, the result is `n` rows by `m` columns.
Getting this wrong leads to index out of bounds errors or incorrectly shaped output.
wrong_approach: "Creating result with same dimensions as input"
correct_approach: "Result dimensions are (n, m) when input is (m, n)"
key_takeaways:
- "**Index swapping pattern**: Transposition is simply `result[j][i] = matrix[i][j]` — a fundamental matrix operation"
- "**Dimension awareness**: Always consider how matrix operations change dimensions — transpose swaps `m × n` to `n × m`"
- "**Foundation for linear algebra**: Transpose is essential for operations like computing dot products, matrix multiplication, and solving linear systems"
- "**Applicable to many problems**: Understanding matrix traversal patterns helps with rotation, spiral order, and diagonal traversal problems"
time_complexity: "O(m × n). We visit each element in the matrix exactly once to copy it to the new position."
space_complexity: "O(m × n). We create a new matrix to store the transposed result, which has the same total number of elements."
solutions:
- approach_name: Direct Mapping
is_optimal: true
code: |
def transpose(matrix: list[list[int]]) -> list[list[int]]:
    # Get original dimensions
    m, n = len(matrix), len(matrix[0])
    # Create result matrix with swapped dimensions (n rows, m columns)
    # Each position result[j][i] will hold matrix[i][j]
    result = [[0] * m for _ in range(n)]
    # Copy each element with swapped indices
    for i in range(m):
        for j in range(n):
            result[j][i] = matrix[i][j]
    return result
explanation: |
**Time Complexity:** O(m × n) — We iterate through every element once.
**Space Complexity:** O(m × n) — We store all elements in a new matrix.
This is optimal because we must at minimum read every element and write every element to produce the output, which requires O(m × n) operations.
- approach_name: List Comprehension (Pythonic)
is_optimal: true
code: |
def transpose(matrix: list[list[int]]) -> list[list[int]]:
    # zip(*matrix) unpacks rows and zips them column-wise
    # Each tuple from zip becomes a row in the transposed matrix
    return [list(row) for row in zip(*matrix)]
explanation: |
**Time Complexity:** O(m × n) — Same as direct mapping, just more concise.
**Space Complexity:** O(m × n) — Creates a new matrix for the result.
This Pythonic approach uses `zip(*matrix)` to unpack the matrix rows and re-group elements by column index. The `*` operator unpacks the rows, and `zip` collects elements at the same position from each row — exactly what transpose does. We convert each tuple to a list for the final result.
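A short look at the intermediate value may help. Using the 2x3 example from above, this sketch shows what `zip(*matrix)` produces before the tuples are converted to lists.

```python
matrix = [[1, 2, 3], [4, 5, 6]]

# zip(*matrix) is the same call as zip([1, 2, 3], [4, 5, 6])
columns = list(zip(*matrix))
print(columns)                         # [(1, 4), (2, 5), (3, 6)]
print([list(col) for col in columns])  # [[1, 4], [2, 5], [3, 6]]
```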
- approach_name: Using NumPy
is_optimal: true
code: |
import numpy as np

def transpose(matrix: list[list[int]]) -> list[list[int]]:
    # NumPy provides a built-in transpose operation
    # .T is shorthand for np.transpose()
    return np.array(matrix).T.tolist()
explanation: |
**Time Complexity:** O(m × n). The `.T` step itself is an O(1) view, but building the array with `np.array` and converting back with `tolist()` each touch every element.
**Space Complexity:** O(m × n) — Creates the result array.
While not typically allowed in coding interviews, this shows how common matrix transpose is — it's built into numerical computing libraries. NumPy's `.T` attribute returns a transposed view of the array, and `tolist()` converts it back to a Python list of lists.

@@ -0,0 +1,179 @@
title: Unique Number of Occurrences
slug: unique-number-of-occurrences
difficulty: easy
leetcode_id: 1207
leetcode_url: https://leetcode.com/problems/unique-number-of-occurrences/
categories:
- arrays
- hash-tables
patterns:
- prefix-sum
description: |
Given an array of integers `arr`, return `true` *if the number of occurrences of each value in the array is **unique***, or `false` *otherwise*.
In other words, no two different values in the array should appear the same number of times.
constraints: |
- `1 <= arr.length <= 1000`
- `-1000 <= arr[i] <= 1000`
examples:
- input: "arr = [1,2,2,1,1,3]"
output: "true"
explanation: "The value 1 has 3 occurrences, 2 has 2 occurrences, and 3 has 1 occurrence. No two values have the same number of occurrences."
- input: "arr = [1,2]"
output: "false"
explanation: "Both 1 and 2 appear exactly once, so their occurrence counts are not unique."
- input: "arr = [-3,0,1,-3,1,1,1,-3,10,0]"
output: "true"
explanation: "The value 1 appears 4 times, -3 appears 3 times, 0 appears 2 times, and 10 appears 1 time. All counts are unique."
explanation:
intuition: |
This problem asks us to verify a **uniqueness property about frequencies**, not about the values themselves.
Think of it like this: imagine you're organising a party and counting how many guests ordered each dish. You want to check if every dish has a *different* number of orders — no two dishes should be equally popular.
The key insight is that this is a **two-step counting problem**:
1. First, count how many times each value appears (value → frequency)
2. Then, check if all those frequencies are unique (no duplicates in the frequency list)
A hash map is perfect for the first step (counting), and a set is perfect for the second step (checking uniqueness). Since sets automatically reject duplicates, we can simply compare the number of unique frequencies to the total number of frequencies.
approach: |
We solve this using a **Hash Map + Set Approach**:
**Step 1: Count occurrences of each value**
- Create a hash map (dictionary) to store each value and its count
- Iterate through the array, incrementing the count for each value
- After this step, we have a mapping like `{1: 3, 2: 2, 3: 1}` for the first example
&nbsp;
**Step 2: Extract the frequency counts**
- Get all the count values from the hash map
- For the first example, this gives us `[3, 2, 1]`
&nbsp;
**Step 3: Check if all frequencies are unique**
- Convert the list of frequencies to a set (which removes duplicates)
- Compare the size of the set to the size of the original frequency list
- If they're equal, all frequencies were unique; if the set is smaller, there were duplicates
&nbsp;
**Step 4: Return the result**
- Return `True` if the lengths match, `False` otherwise
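The four steps above collapse to a few lines. A small variant, sketched here as an assumption rather than the reference solution, stops at the first repeated frequency instead of building the full set; the function name is hypothetical.

```python
from collections import Counter


def unique_occurrences_early_exit(arr: list[int]) -> bool:
    seen = set()  # frequencies encountered so far
    for freq in Counter(arr).values():
        if freq in seen:
            return False  # two values share this frequency
        seen.add(freq)
    return True


print(unique_occurrences_early_exit([1, 2, 2, 1, 1, 3]))  # True
print(unique_occurrences_early_exit([1, 2]))              # False
```

The asymptotic cost is the same as comparing lengths; the early return only saves work when a duplicate frequency shows up early.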
common_pitfalls:
- title: Confusing Value Uniqueness with Frequency Uniqueness
description: |
The problem asks if the **occurrence counts** are unique, not if the values themselves are unique.
For example, `[1, 2]` has all unique values, but both values appear exactly once. Since the frequencies `[1, 1]` are not unique, the answer is `false`.
Make sure you're checking uniqueness of the *counts*, not the original array elements.
wrong_approach: "Checking if array values are unique"
correct_approach: "Counting frequencies, then checking if those frequencies are unique"
- title: Using a List Instead of a Set
description: |
Some solutions check for duplicate frequencies by comparing every pair of counts in a list. This works, but it needs nested loops and `O(m²)` comparisons for `m` distinct values.
Using a set is the idiomatic way to check for uniqueness — just compare `len(frequencies)` with `len(set(frequencies))`.
wrong_approach: "Nested loops to find duplicate frequencies"
correct_approach: "Convert to set and compare lengths"
- title: Forgetting Edge Cases
description: |
Consider edge cases:
- Single element array: `[5]` → frequency is `{5: 1}` → only one count → return `true`
- All same elements: `[2, 2, 2]` → frequency is `{2: 3}` → only one count → return `true`
Both cases have trivially unique frequencies since there's only one distinct value.
key_takeaways:
- "**Two-phase counting pattern**: Many problems require counting elements first, then analysing the counts themselves"
- "**Set for uniqueness**: Converting a collection to a set and comparing lengths is the standard way to check for duplicates"
- "**Hash map for frequency**: `Counter` or dictionary counting is a fundamental pattern for frequency-based problems"
- "**Problem decomposition**: Breaking 'unique occurrences' into 'count frequencies' + 'check uniqueness' makes the solution clear"
time_complexity: "O(n). We iterate through the array once to count frequencies, then iterate through the frequency values (at most n unique values) to build the set."
space_complexity: "O(n). In the worst case, all elements are unique, so the hash map stores n key-value pairs, and the set stores n frequency values."
solutions:
- approach_name: Hash Map and Set
is_optimal: true
code: |
from collections import Counter

def unique_occurrences(arr: list[int]) -> bool:
    # Step 1: Count occurrences of each value
    frequency = Counter(arr)
    # Step 2: Get all the count values
    counts = frequency.values()
    # Step 3: Check if all counts are unique
    # If set size equals list size, no duplicates exist
    return len(counts) == len(set(counts))
explanation: |
**Time Complexity:** O(n) — One pass to count, one pass to build the set.
**Space Complexity:** O(n) — Hash map and set each store at most n elements.
We use Python's `Counter` for concise frequency counting, then leverage the set's uniqueness property. If converting counts to a set reduces the size, there were duplicate frequencies.
- approach_name: Manual Dictionary Counting
is_optimal: true
code: |
def unique_occurrences(arr: list[int]) -> bool:
    # Count occurrences using a dictionary
    frequency = {}
    for num in arr:
        # Increment count, defaulting to 0 if not seen
        frequency[num] = frequency.get(num, 0) + 1
    # Get the frequency counts
    counts = list(frequency.values())
    # Check uniqueness: set removes duplicates
    return len(counts) == len(set(counts))
explanation: |
**Time Complexity:** O(n) — Same as the Counter approach.
**Space Complexity:** O(n) — Same space usage.
This approach shows the manual counting logic without relying on `Counter`. It's useful for understanding what happens under the hood or when `Counter` isn't available.
- approach_name: Sorting Approach
is_optimal: false
code: |
def unique_occurrences(arr: list[int]) -> bool:
    # Count occurrences
    frequency = {}
    for num in arr:
        frequency[num] = frequency.get(num, 0) + 1
    # Sort the counts
    counts = sorted(frequency.values())
    # Check adjacent elements for duplicates
    for i in range(1, len(counts)):
        if counts[i] == counts[i - 1]:
            return False
    return True
explanation: |
**Time Complexity:** O(n log n) — Due to sorting the frequency counts.
**Space Complexity:** O(n) — Storing frequencies and sorted counts.
This approach sorts the counts and checks for adjacent duplicates. While correct, sorting adds unnecessary overhead compared to using a set. Included to show an alternative that doesn't use sets.

@@ -0,0 +1,230 @@
title: Unique Paths II
slug: unique-paths-ii
difficulty: medium
leetcode_id: 63
leetcode_url: https://leetcode.com/problems/unique-paths-ii/
categories:
- arrays
- dynamic-programming
patterns:
- dynamic-programming
- matrix-traversal
description: |
You are given an `m x n` integer array `grid`. There is a robot initially located at the **top-left corner** (i.e., `grid[0][0]`). The robot tries to move to the **bottom-right corner** (i.e., `grid[m - 1][n - 1]`). The robot can only move either down or right at any point in time.
An obstacle and space are marked as `1` or `0` respectively in `grid`. A path that the robot takes cannot include **any** square that is an obstacle.
Return *the number of possible unique paths that the robot can take to reach the bottom-right corner*.
The testcases are generated so that the answer will be less than or equal to `2 * 10^9`.
constraints: |
- `m == obstacleGrid.length`
- `n == obstacleGrid[i].length`
- `1 <= m, n <= 100`
- `obstacleGrid[i][j]` is `0` or `1`
examples:
- input: "obstacleGrid = [[0,0,0],[0,1,0],[0,0,0]]"
output: "2"
explanation: "There is one obstacle in the middle of the 3x3 grid. There are two ways to reach the bottom-right corner: Right -> Right -> Down -> Down, or Down -> Down -> Right -> Right."
- input: "obstacleGrid = [[0,1],[0,0]]"
output: "1"
explanation: "There is one obstacle blocking the top row. The only path is Down -> Right."
explanation:
intuition: |
Imagine you're navigating a city grid where some intersections are blocked by construction. You start at the top-left and need to reach the bottom-right, but you can only travel right or down (no backtracking).
This is a classic **counting problem** with a key insight: the number of ways to reach any cell equals the sum of ways to reach the cells directly above it and to its left. Why? Because those are the only two directions you could have come from.
Think of it like this: if there are 3 ways to reach the cell above you, and 2 ways to reach the cell to your left, then there are exactly 3 + 2 = 5 ways to reach your current cell. This is the **principle of addition** in combinatorics.
The obstacle twist adds one modification: if a cell contains an obstacle, there are **zero** ways to reach it (and zero ways to pass through it). This "blocks" all paths that would have gone through that cell.
approach: |
We solve this using **Dynamic Programming** with a 2D table:
**Step 1: Handle edge cases**
- If the starting cell `grid[0][0]` is an obstacle, return `0` immediately (no paths exist)
- If the destination cell `grid[m-1][n-1]` is an obstacle, return `0`
&nbsp;
**Step 2: Create a DP table**
- `dp[i][j]`: Number of unique paths to reach cell `(i, j)`
- Initialise `dp[0][0] = 1` (one way to "reach" the starting position: start there)
&nbsp;
**Step 3: Fill the first row**
- For each cell in the first row, you can only arrive from the left
- If there's no obstacle: `dp[0][j] = dp[0][j-1]`
- If there's an obstacle: `dp[0][j] = 0` (and all cells to the right become unreachable)
&nbsp;
**Step 4: Fill the first column**
- For each cell in the first column, you can only arrive from above
- If there's no obstacle: `dp[i][0] = dp[i-1][0]`
- If there's an obstacle: `dp[i][0] = 0` (and all cells below become unreachable)
&nbsp;
**Step 5: Fill the rest of the table**
- For each cell `(i, j)` where `i > 0` and `j > 0`:
- If it's an obstacle: `dp[i][j] = 0`
- Otherwise: `dp[i][j] = dp[i-1][j] + dp[i][j-1]` (sum of paths from above and left)
&nbsp;
**Step 6: Return the answer**
- Return `dp[m-1][n-1]`, the number of paths to the bottom-right corner
common_pitfalls:
- title: Forgetting to Check the Start and End Cells
description: |
If the starting cell `grid[0][0]` or ending cell `grid[m-1][n-1]` contains an obstacle, the answer is immediately `0`. Many solutions forget this check and proceed to fill the DP table incorrectly.
Always handle these edge cases upfront before any DP logic.
wrong_approach: "Start filling DP table without checking start/end obstacles"
correct_approach: "Check grid[0][0] and grid[m-1][n-1] first, return 0 if either is blocked"
- title: Not Propagating Zero Through First Row/Column
description: |
Once you encounter an obstacle in the first row, **all cells to its right** become unreachable (since you can only come from the left). The same applies to the first column going downward.
For example, with `[[0, 0, 1, 0]]`, after the obstacle at index 2, indices 3 and beyond have `0` paths, not `1`.
A common mistake is to reset the count only for the obstacle cell itself, allowing subsequent cells to incorrectly inherit paths.
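The single-row example above can be traced with a short sketch. `first_row_paths` is a hypothetical helper that fills only row 0; it is not part of the solutions.

```python
def first_row_paths(row: list[int]) -> list[int]:
    paths = [0] * len(row)
    paths[0] = 1 if row[0] == 0 else 0
    for j in range(1, len(row)):
        # An open cell inherits from its left neighbour, so a zero
        # left of it carries forward automatically
        paths[j] = 0 if row[j] == 1 else paths[j - 1]
    return paths


print(first_row_paths([0, 0, 1, 0]))  # [1, 1, 0, 0]
```

Because each open cell copies its left neighbour rather than being set to `1`, the zero introduced by the obstacle propagates to every cell after it.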
wrong_approach: "Only set obstacle cell to 0, continue counting for cells after it"
correct_approach: "Once an obstacle appears in row 0 or column 0, all subsequent cells in that row/column have 0 paths"
- title: Confusing Obstacle Value with Path Count
description: |
The grid uses `1` to mark obstacles and `0` for open spaces. The DP table uses numbers representing *path counts*. Don't confuse these two meanings.
A cell with `grid[i][j] = 0` (no obstacle) might have `dp[i][j] = 5` (five paths reach it).
wrong_approach: "Mixing up grid values with DP table values"
correct_approach: "Grid indicates obstacles (0/1); DP table counts paths (0 to 2*10^9)"
key_takeaways:
- "**DP for counting**: When asked to count paths/combinations, DP often applies. The number of ways to reach a state equals the sum of ways to reach predecessor states."
- "**Obstacle handling**: Obstacles act as 'zero propagators' — they block all paths through them, which can cascade through dependent cells."
- "**Space optimisation possible**: Since each row only depends on the current and previous row, you can reduce space from O(m*n) to O(n) using a 1D array."
- "**Foundation for harder problems**: This pattern extends to problems with different movement rules, multiple obstacles, or weighted paths."
time_complexity: "O(m * n). We visit each cell in the grid exactly once to compute its path count."
space_complexity: "O(m * n) for the standard DP solution. Can be optimised to O(n) by using a single row array, since each cell only depends on the cell above and to the left."
solutions:
- approach_name: Dynamic Programming (2D Table)
is_optimal: true
code: |
def unique_paths_with_obstacles(obstacle_grid: list[list[int]]) -> int:
    m, n = len(obstacle_grid), len(obstacle_grid[0])
    # If start or end is blocked, no paths exist
    if obstacle_grid[0][0] == 1 or obstacle_grid[m - 1][n - 1] == 1:
        return 0
    # dp[i][j] = number of unique paths to reach cell (i, j)
    dp = [[0] * n for _ in range(m)]
    # Starting position: one way to be here (start here)
    dp[0][0] = 1
    # Fill first column: can only come from above
    for i in range(1, m):
        if obstacle_grid[i][0] == 1:
            dp[i][0] = 0  # Blocked, and all below become 0
        else:
            dp[i][0] = dp[i - 1][0]
    # Fill first row: can only come from the left
    for j in range(1, n):
        if obstacle_grid[0][j] == 1:
            dp[0][j] = 0  # Blocked, and all to the right become 0
        else:
            dp[0][j] = dp[0][j - 1]
    # Fill the rest: sum of paths from above and left
    for i in range(1, m):
        for j in range(1, n):
            if obstacle_grid[i][j] == 1:
                dp[i][j] = 0  # Obstacle: no paths through here
            else:
                dp[i][j] = dp[i - 1][j] + dp[i][j - 1]
    return dp[m - 1][n - 1]
explanation: |
**Time Complexity:** O(m * n) — We iterate through every cell once.
**Space Complexity:** O(m * n) — We store a full 2D DP table.
This solution builds up the path counts from the starting cell to every reachable cell. Each cell's value represents the total number of unique paths that can reach it, considering obstacles that block certain routes.
- approach_name: Space-Optimised DP (1D Array)
is_optimal: true
code: |
def unique_paths_with_obstacles(obstacle_grid: list[list[int]]) -> int:
    m, n = len(obstacle_grid), len(obstacle_grid[0])
    # If start or end is blocked, no paths exist
    if obstacle_grid[0][0] == 1 or obstacle_grid[m - 1][n - 1] == 1:
        return 0
    # dp[j] = number of paths to reach column j in the current row
    dp = [0] * n
    dp[0] = 1  # Starting position
    for i in range(m):
        for j in range(n):
            if obstacle_grid[i][j] == 1:
                dp[j] = 0  # Blocked cell
            elif j > 0:
                # dp[j] already has value from row above (paths from top)
                # Add dp[j-1] for paths from the left
                dp[j] += dp[j - 1]
            # If j == 0 and no obstacle, dp[j] keeps its value from above
    return dp[n - 1]
explanation: |
**Time Complexity:** O(m * n) — Same iteration through all cells.
**Space Complexity:** O(n) — Only a single row is stored.
This optimisation works because when computing row `i`, we only need values from row `i-1` (stored in `dp[j]` before update) and the current row to the left (`dp[j-1]` after update). By processing left-to-right, `dp[j]` naturally transitions from "paths from above" to "total paths to this cell".
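To see the row being reused, the sketch below records a snapshot of the 1D array after each grid row for the first example. `dp_rows` is a hypothetical tracing helper that mirrors the update rule above; it is not a replacement for the solution.

```python
def dp_rows(obstacle_grid: list[list[int]]) -> list[list[int]]:
    n = len(obstacle_grid[0])
    dp = [0] * n
    dp[0] = 1  # one way to stand on the start cell
    snapshots = []
    for row in obstacle_grid:
        for j, cell in enumerate(row):
            if cell == 1:
                dp[j] = 0           # blocked cell
            elif j > 0:
                dp[j] += dp[j - 1]  # from above (old dp[j]) plus from the left
        snapshots.append(dp.copy())
    return snapshots


for snapshot in dp_rows([[0, 0, 0], [0, 1, 0], [0, 0, 0]]):
    print(snapshot)
# [1, 1, 1]
# [1, 0, 1]
# [1, 1, 2]
```

The last entry of the final snapshot is the answer, `2`, matching the first example.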
- approach_name: Brute Force (Recursion)
is_optimal: false
code: |
def unique_paths_with_obstacles(obstacle_grid: list[list[int]]) -> int:
    m, n = len(obstacle_grid), len(obstacle_grid[0])
    def count_paths(i: int, j: int) -> int:
        # Out of bounds or obstacle
        if i >= m or j >= n or obstacle_grid[i][j] == 1:
            return 0
        # Reached destination
        if i == m - 1 and j == n - 1:
            return 1
        # Sum paths going right and going down
        return count_paths(i, j + 1) + count_paths(i + 1, j)
    return count_paths(0, 0)
explanation: |
**Time Complexity:** O(2^(m+n)) — Exponential due to overlapping subproblems.
**Space Complexity:** O(m + n) — Recursion stack depth.
This naive recursive approach explores every possible path by trying both directions at each cell. Without memoisation, it recomputes the same subproblems many times, so its running time grows with the number of paths rather than the number of cells: an obstacle-free 18x18 grid already has C(34, 17), roughly 2.3 billion, paths. Included to show why DP is essential.

@@ -0,0 +1,188 @@
title: Unique Paths
slug: unique-paths
difficulty: medium
leetcode_id: 62
leetcode_url: https://leetcode.com/problems/unique-paths/
categories:
- arrays
- dynamic-programming
- math
patterns:
- dynamic-programming
description: |
There is a robot on an `m x n` grid. The robot is initially located at the **top-left corner** (i.e., `grid[0][0]`). The robot tries to move to the **bottom-right corner** (i.e., `grid[m - 1][n - 1]`). The robot can only move either down or right at any point in time.
Given the two integers `m` and `n`, return *the number of possible unique paths that the robot can take to reach the bottom-right corner*.
The test cases are generated so that the answer will be less than or equal to `2 * 10^9`.
constraints: |
- `1 <= m, n <= 100`
examples:
- input: "m = 3, n = 7"
output: "28"
explanation: "There are 28 unique paths from the top-left to the bottom-right corner of a 3x7 grid."
- input: "m = 3, n = 2"
output: "3"
explanation: "From the top-left corner, there are 3 ways to reach the bottom-right corner: Right -> Down -> Down, Down -> Down -> Right, and Down -> Right -> Down."
explanation:
intuition: |
Imagine a city grid where you can only walk east or south. You start at the northwest corner and want to reach the southeast corner. How many different routes can you take?
The key insight is that **every path has the same total length**: to go from `(0, 0)` to `(m-1, n-1)`, you must make exactly `m-1` moves down and `n-1` moves right, regardless of the order. The problem becomes: in how many different ways can you arrange these moves?
Think of it like this: at each cell, the number of ways to reach it equals the sum of ways to reach the cell above it plus the cell to the left of it. Why? Because you can only arrive from those two directions! This is the **optimal substructure** property that makes dynamic programming work.
For cells in the first row, there's only one way to reach them (keep going right). Similarly, for cells in the first column, there's only one way (keep going down). From there, we can build up the solution cell by cell.
approach: |
We solve this using a **2D Dynamic Programming** approach:
**Step 1: Create a DP table**
- Create a 2D array `dp` of size `m x n`
- `dp[i][j]` represents the number of unique paths to reach cell `(i, j)`
&nbsp;
**Step 2: Initialise the base cases**
- Fill the first row with `1`: there's only one way to reach any cell in the first row (all right moves)
- Fill the first column with `1`: there's only one way to reach any cell in the first column (all down moves)
&nbsp;
**Step 3: Fill the DP table**
- For each cell `(i, j)` where `i > 0` and `j > 0`:
  - `dp[i][j] = dp[i-1][j] + dp[i][j-1]`
  - This sums the paths from the cell above and the cell to the left
&nbsp;
**Step 4: Return the result**
- Return `dp[m-1][n-1]`, which contains the total number of unique paths to the destination
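&nbsp;
The steps above can be sketched as a small helper that returns the whole table, which makes the pattern visible for the `m = 3, n = 7` example. The name `build_path_table` is a hypothetical helper introduced only for this illustration.

```python
def build_path_table(m: int, n: int) -> list[list[int]]:
    """Build the full DP table; table[i][j] = number of paths to reach (i, j)."""
    # Steps 1 and 2: every cell starts at 1, which covers the first row and column
    table = [[1] * n for _ in range(m)]
    # Step 3: each inner cell sums the cell above and the cell to the left
    for i in range(1, m):
        for j in range(1, n):
            table[i][j] = table[i - 1][j] + table[i][j - 1]
    return table


for row in build_path_table(3, 7):
    print(row)
# [1, 1, 1, 1, 1, 1, 1]
# [1, 2, 3, 4, 5, 6, 7]
# [1, 3, 6, 10, 15, 21, 28]
```

The bottom-right entry, `28`, matches the expected output for the first example.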
common_pitfalls:
- title: Off-by-One Errors
description: |
A common mistake is confusing grid dimensions with indices. If the grid is `m x n`, the bottom-right corner is at index `(m-1, n-1)`, not `(m, n)`.
Similarly, when iterating, ensure your loops go from `0` to `m-1` and `0` to `n-1` respectively.
wrong_approach: "Using m and n directly as indices"
correct_approach: "Use m-1 and n-1 for the destination cell"
- title: Forgetting Base Case Initialisation
description: |
If you don't initialise the first row and first column to `1`, your DP will produce incorrect results: a cell such as `dp[0][j]` is then computed from a `dp[0][j-1]` that was never set, and every value derived from it is wrong.
The base case is fundamental: there's exactly one way to reach any cell along the edges (all moves in one direction).
wrong_approach: "Starting iteration from (0,0) without initialisation"
correct_approach: "Initialise first row and first column to 1 before the main DP loop"
- title: Using Recursion Without Memoisation
description: |
A naive recursive solution without memoisation has exponential time complexity O(2^(m+n)). For `m = n = 100`, this would never complete.
Each cell's value depends on previously computed values, so either use bottom-up DP or top-down recursion with memoisation.
wrong_approach: "Plain recursion recalculating the same cells"
correct_approach: "Bottom-up DP or memoised recursion"
key_takeaways:
- "**Classic 2D DP pattern**: When counting paths in a grid with restricted movement, think about how many ways you can arrive at each cell"
- "**Optimal substructure**: The solution to a cell depends only on solutions to smaller subproblems (cells above and to the left)"
- "**Space optimisation possible**: Since each row only depends on the previous row, you can reduce space from O(m*n) to O(n)"
- "**Mathematical alternative**: This is equivalent to choosing `m-1` down moves from `m+n-2` total moves, giving C(m+n-2, m-1)"
time_complexity: "O(m * n). We fill each cell in the `m x n` DP table exactly once."
space_complexity: "O(m * n). We store the number of paths for every cell in the grid. This can be optimised to O(n) by only keeping the previous row."
solutions:
- approach_name: 2D Dynamic Programming
is_optimal: true
code: |
def unique_paths(m: int, n: int) -> int:
    # Create DP table where dp[i][j] = paths to reach (i, j)
    dp = [[1] * n for _ in range(m)]
    # Fill the table - first row and column are already 1
    for i in range(1, m):
        for j in range(1, n):
            # Paths from above + paths from left
            dp[i][j] = dp[i - 1][j] + dp[i][j - 1]
    # Bottom-right corner has our answer
    return dp[m - 1][n - 1]
explanation: |
**Time Complexity:** O(m * n) — We visit each cell once.
**Space Complexity:** O(m * n) — We store the entire DP table.
We initialise the table with 1s (handling base cases implicitly), then iterate through each cell, summing the paths from above and from the left.
- approach_name: Space-Optimised DP
is_optimal: true
code: |
def unique_paths(m: int, n: int) -> int:
    # Only need to track the current row
    dp = [1] * n
    # Process each row
    for i in range(1, m):
        for j in range(1, n):
            # dp[j] currently holds value from row above
            # dp[j-1] holds value from left (already updated this row)
            dp[j] += dp[j - 1]
    return dp[n - 1]
explanation: |
**Time Complexity:** O(m * n) — Same iteration as 2D approach.
**Space Complexity:** O(n) — Only storing one row at a time.
Since each cell only depends on the cell above and the cell to the left, we can reuse a single row. When we update `dp[j]`, it already contains the value from the previous row (the "above" value), and `dp[j-1]` has already been updated for the current row (the "left" value).
- approach_name: Combinatorics (Mathematical)
is_optimal: true
code: |
from math import comb

def unique_paths(m: int, n: int) -> int:
    # Total moves needed: (m-1) down + (n-1) right = m+n-2 moves
    # Choose which (m-1) of those are down moves
    return comb(m + n - 2, m - 1)
explanation: |
**Time Complexity:** O(min(m, n)) — Computing the binomial coefficient.
**Space Complexity:** O(1) — Only storing the result.
Every path consists of exactly `m-1` down moves and `n-1` right moves. The number of unique paths equals the number of ways to arrange these moves, which is the binomial coefficient C(m+n-2, m-1). Python's `math.comb` computes this efficiently.
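To see where an O(min(m, n)) bound could come from, here is a hand-rolled sketch of the same binomial coefficient using a multiplicative loop. It is an illustration only: the function name `unique_paths_multiplicative` is hypothetical, and this is not a claim about how `math.comb` is implemented internally.

```python
from math import comb


def unique_paths_multiplicative(m: int, n: int) -> int:
    """Compute C(m+n-2, k) with k = min(m-1, n-1) using k multiply/divide steps."""
    total = m + n - 2
    k = min(m - 1, n - 1)  # C(total, k) == C(total, total - k), so pick the smaller side
    result = 1
    for i in range(1, k + 1):
        # After this step, result == C(total - k + i, i), so the division is exact
        result = result * (total - k + i) // i
    return result


print(unique_paths_multiplicative(3, 7))  # 28
print(unique_paths_multiplicative(3, 7) == comb(8, 2))  # True
```

The loop runs `min(m - 1, n - 1)` times, and each intermediate value is itself a binomial coefficient, which is why integer division never loses a remainder.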
- approach_name: Recursive with Memoisation
is_optimal: false
code: |
from functools import lru_cache

def unique_paths(m: int, n: int) -> int:
    @lru_cache(maxsize=None)
    def count_paths(i: int, j: int) -> int:
        # Base case: reached the destination
        if i == m - 1 and j == n - 1:
            return 1
        # Out of bounds
        if i >= m or j >= n:
            return 0
        # Sum paths going down and going right
        return count_paths(i + 1, j) + count_paths(i, j + 1)
    return count_paths(0, 0)
explanation: |
**Time Complexity:** O(m * n) — Each state computed once due to memoisation.
**Space Complexity:** O(m * n) — Cache stores all states, plus O(m + n) recursion stack.
This top-down approach starts from the origin and explores all paths recursively. Memoisation prevents redundant calculations. While correct, the bottom-up DP is generally preferred as it avoids recursion overhead.

@@ -0,0 +1,172 @@
title: Valid Anagram
slug: valid-anagram
difficulty: easy
leetcode_id: 242
leetcode_url: https://leetcode.com/problems/valid-anagram/
categories:
- strings
- hash-tables
- sorting
patterns:
- prefix-sum
function_signature: "def is_anagram(s: str, t: str) -> bool:"
test_cases:
visible:
- input: { s: "anagram", t: "nagaram" }
expected: true
- input: { s: "rat", t: "car" }
expected: false
hidden:
- input: { s: "a", t: "a" }
expected: true
- input: { s: "ab", t: "a" }
expected: false
- input: { s: "listen", t: "silent" }
expected: true
- input: { s: "hello", t: "world" }
expected: false
description: |
Given two strings `s` and `t`, return `true` if `t` is an *anagram* of `s`, and `false` otherwise.
An **anagram** is a word or phrase formed by rearranging the letters of a different word or phrase, using all the original letters exactly once.
constraints: |
- `1 <= s.length, t.length <= 5 * 10^4`
- `s` and `t` consist of lowercase English letters.
examples:
- input: 's = "anagram", t = "nagaram"'
output: "true"
explanation: "Both strings contain exactly the same characters: three 'a's, one 'n', one 'g', one 'r', and one 'm'."
- input: 's = "rat", t = "car"'
output: "false"
explanation: "The strings have different characters: 'rat' has a 't' while 'car' has a 'c'."
explanation:
intuition: |
Think of each string as a **bag of letters**. Two words are anagrams if and only if they contain the exact same letters with the exact same frequencies.
Imagine you have two piles of Scrabble tiles. To check if they're anagrams, you could sort both piles alphabetically and see if they match perfectly. Alternatively, you could count how many of each letter appears in the first pile, then verify the second pile has identical counts.
The key insight is that **order doesn't matter** — only the character frequencies. This transforms a string comparison problem into a counting problem, which we can solve efficiently with a hash map or a fixed-size array.
approach: |
We solve this using a **Character Frequency Count** approach:
**Step 1: Check lengths**
- If `s` and `t` have different lengths, they cannot be anagrams — return `false` immediately
- This is a quick optimisation that avoids unnecessary work
&nbsp;
**Step 2: Count character frequencies**
- Create a frequency array of size 26 (one slot for each lowercase letter)
- Iterate through both strings simultaneously
- For each character in `s`, increment its count
- For each character in `t`, decrement its count
- This "cancels out" matching characters
&nbsp;
**Step 3: Verify all counts are zero**
- After processing both strings, check if all counts are exactly `0`
- Any non-zero count means one string has more or fewer of that character
- Return `true` only if all 26 counts are zero
&nbsp;
This approach works because incrementing for `s` and decrementing for `t` means a perfect match leaves every count at zero.
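&nbsp;
As a small illustration of the cancelling idea, the sketch below returns only the letters whose counts did not cancel. The helper name `letter_balance` is hypothetical, and it assumes two equal-length strings of lowercase English letters (the length check from Step 1 is taken to have passed already).

```python
def letter_balance(s: str, t: str) -> dict[str, int]:
    """Return letters whose net count (occurrences in s minus occurrences in t) is non-zero."""
    count = [0] * 26
    # Assumes len(s) == len(t); zip would silently ignore any extra characters
    for a, b in zip(s, t):
        count[ord(a) - ord('a')] += 1
        count[ord(b) - ord('a')] -= 1
    return {chr(i + ord('a')): c for i, c in enumerate(count) if c != 0}


print(letter_balance("anagram", "nagaram"))  # {}
print(letter_balance("rat", "car"))          # {'c': -1, 't': 1}
```

An empty result means every letter cancelled out, so the strings are anagrams. For `"rat"` and `"car"`, the leftover `'t'` (only in `s`) and `'c'` (only in `t`) show exactly where they differ.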
common_pitfalls:
- title: Forgetting the Length Check
description: |
Strings of different lengths cannot be anagrams. While the counting approach will still produce the correct answer, checking lengths first is an O(1) optimisation that can exit early.
For example, comparing "abc" to "abcd" would require counting 7 characters before discovering the mismatch. The length check catches this instantly.
wrong_approach: "Skipping the length comparison"
correct_approach: "Check if len(s) != len(t) first"
- title: Using a Dictionary for Lowercase-Only Input
description: |
When the problem guarantees lowercase English letters only, using a Python dictionary (hash map) is overkill. A fixed-size array of 26 integers is faster and uses less memory.
The array approach has O(1) access per character and predictable memory usage. Use `ord(c) - ord('a')` to map characters to indices 0-25.
wrong_approach: "Using collections.Counter for simple lowercase strings"
correct_approach: "Use a fixed-size array when character set is known"
- title: Sorting When Counting Suffices
description: |
Sorting both strings and comparing them works correctly, but it's O(n log n) time complexity. The counting approach achieves O(n) time, which matters for large inputs.
With inputs of up to `5 * 10^4` characters the log factor is only about 16, so sorting still runs comfortably within limits, but counting avoids that extra factor entirely.
wrong_approach: "sorted(s) == sorted(t)"
correct_approach: "Count frequencies in O(n) time"
key_takeaways:
- "**Character frequency pattern**: Many string problems reduce to counting character occurrences"
- "**Fixed-size array optimisation**: When the character set is bounded (e.g., 26 lowercase letters), arrays beat hash maps"
- "**Early exit optimisation**: Simple checks like length comparison can avoid unnecessary computation"
- "**Related problems**: Group Anagrams, Find All Anagrams in a String, and Permutation in String all use similar counting techniques"
time_complexity: "O(n). We iterate through both strings once, where `n` is the length of the strings."
space_complexity: "O(1). We use a fixed-size array of 26 integers, regardless of input size. This is technically O(26) = O(1) constant space."
solutions:
- approach_name: Character Frequency Count
is_optimal: true
code: |
def is_anagram(s: str, t: str) -> bool:
    # Different lengths can't be anagrams
    if len(s) != len(t):
        return False
    # Fixed-size array for 26 lowercase letters
    count = [0] * 26
    # Increment for s, decrement for t
    for i in range(len(s)):
        count[ord(s[i]) - ord('a')] += 1
        count[ord(t[i]) - ord('a')] -= 1
    # All counts should be zero for anagrams
    return all(c == 0 for c in count)
explanation: |
**Time Complexity:** O(n) — Single pass through both strings.
**Space Complexity:** O(1) — Fixed 26-element array regardless of input size.
We use the increment/decrement trick: adding for characters in `s` and subtracting for characters in `t`. If they're anagrams, every character "cancels out" to zero.
- approach_name: Hash Map Counter
is_optimal: false
code: |
from collections import Counter

def is_anagram(s: str, t: str) -> bool:
    # Counter creates a dictionary of character frequencies
    return Counter(s) == Counter(t)
explanation: |
**Time Complexity:** O(n) — Building each Counter takes linear time.
**Space Complexity:** O(k) — Where k is the number of unique characters (at most 26 for lowercase).
This approach is more readable and works with any character set (including Unicode). Python's `Counter` class handles the frequency counting automatically. The trade-off is slightly higher memory overhead compared to the array approach.
- approach_name: Sorting
is_optimal: false
code: |
def is_anagram(s: str, t: str) -> bool:
    # Sort both strings and compare
    return sorted(s) == sorted(t)
explanation: |
**Time Complexity:** O(n log n) — Sorting dominates the runtime.
**Space Complexity:** O(n) — Python's sorted() creates new lists.
This is the most intuitive approach: if two strings are anagrams, their sorted versions must be identical. While correct, it's slower than counting for large inputs. Useful for understanding the problem but not optimal for interviews.

@@ -0,0 +1,203 @@
title: Valid Palindrome II
slug: valid-palindrome-ii
difficulty: easy
leetcode_id: 680
leetcode_url: https://leetcode.com/problems/valid-palindrome-ii/
categories:
- strings
- two-pointers
patterns:
- two-pointers
description: |
Given a string `s`, return `true` *if the* `s` *can be a palindrome after deleting **at most one** character from it*.
constraints: |
- `1 <= s.length <= 10^5`
- `s` consists of lowercase English letters.
examples:
- input: 's = "aba"'
output: "true"
explanation: "The string is already a palindrome, so no deletion is needed."
- input: 's = "abca"'
output: "true"
explanation: "You could delete the character 'c' to get 'aba', which is a palindrome."
- input: 's = "abc"'
output: "false"
explanation: "No matter which character you delete, you cannot form a palindrome."
explanation:
intuition: |
Imagine checking if a word reads the same forwards and backwards by placing two fingers at opposite ends and moving them towards the centre.
For a regular palindrome check, if the characters under your fingers ever mismatch, the string fails immediately. But here we have a **second chance**: we're allowed to remove *one* character and try again.
Think of it like this: when you encounter your first mismatch at positions `left` and `right`, you have two options to "fix" it:
- **Skip the left character**: Check if the substring from `left + 1` to `right` is a palindrome
- **Skip the right character**: Check if the substring from `left` to `right - 1` is a palindrome
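If either option produces a valid palindrome, the answer is `true`. This greedy approach works because the outer characters that already matched never need to be deleted: at the first mismatch, the one allowed deletion must remove either the left or the right mismatched character, so checking both options covers every possibility.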
approach: |
We solve this using a **Two Pointer Approach with One Chance**:
**Step 1: Set up two pointers**
- `left`: Start at index `0` (beginning of string)
- `right`: Start at index `len(s) - 1` (end of string)
&nbsp;
**Step 2: Move pointers inward while characters match**
- While `left < right`, compare `s[left]` and `s[right]`
- If they match, move both pointers inward (`left += 1`, `right -= 1`)
- If they don't match, we've found a problem — proceed to Step 3
&nbsp;
**Step 3: Handle the first mismatch**
- When `s[left] != s[right]`, try both deletion options:
  - Check if `s[left+1:right+1]` is a palindrome (skip left character)
  - Check if `s[left:right]` is a palindrome (skip right character)
- If either substring is a palindrome, return `true`
- If neither works, return `false`
&nbsp;
**Step 4: Return true if no mismatch found**
- If we complete the loop without finding a mismatch, the string is already a palindrome — return `true`
&nbsp;
This works because we greedily match characters from outside in, and only use our one deletion when absolutely necessary.
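&nbsp;
To make Step 3 concrete, the sketch below stops at the first mismatch and returns the two substrings that would then be checked. The helper name `deletion_candidates` is hypothetical and exists only for this illustration; it uses slicing for readability rather than efficiency.

```python
from typing import Optional


def deletion_candidates(s: str) -> Optional[tuple[str, str]]:
    """Return (skip_left, skip_right) at the first mismatch, or None if s is already a palindrome."""
    left, right = 0, len(s) - 1
    while left < right:
        if s[left] != s[right]:
            skip_left = s[left + 1:right + 1]  # drop s[left]
            skip_right = s[left:right]         # drop s[right]
            return skip_left, skip_right
        left += 1
        right -= 1
    return None


print(deletion_candidates("aba"))   # None
print(deletion_candidates("abca"))  # ('c', 'b')
print(deletion_candidates("abc"))   # ('bc', 'ab')
```

For `"abca"` both candidates are palindromes, so the answer is `true`. For `"abc"` neither `"bc"` nor `"ab"` is a palindrome, so the answer is `false`.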
common_pitfalls:
- title: Checking All Possible Deletions
description: |
A naive approach is to try deleting each character one by one and check if the result is a palindrome:
```
for i in range(n):
    if is_palindrome(s[:i] + s[i+1:]):
        return True
```
This results in **O(n^2) time complexity** because you create `n` substrings and check each one in O(n) time. With `s.length <= 10^5`, this approach will be too slow.
Instead, use two pointers to find the *exact* position where a deletion might help, then only check those two possibilities.
wrong_approach: "Try deleting each character one by one"
correct_approach: "Two pointers to find the mismatch, then check two substrings"
- title: Only Trying One Deletion Option
description: |
When you find a mismatch at positions `left` and `right`, you might be tempted to only try skipping one of them. For example, always skipping the left character.
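Consider `s = "eeeed"`. The first mismatch is at `left = 0` (`'e'`) and `right = 4` (`'d'`). Skipping the left character leaves `"eeed"`, which is not a palindrome, but skipping the right character leaves `"eeee"`, which is. A solution that only ever skips the left character returns `false` even though the string *can* become a palindrome.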
Always try **both** options: skip left OR skip right.
wrong_approach: "Only try skipping one character at mismatch"
correct_approach: "Try both skip-left and skip-right, return true if either works"
- title: Creating New Strings for Palindrome Check
description: |
Using string slicing like `s[left+1:right+1]` creates a new string, which uses O(n) space. While this works, you can optimise by passing indices to a helper function that checks palindrome in-place.
For this problem, the O(n) space from slicing is acceptable, but in interviews, mentioning the in-place optimisation shows depth of understanding.
key_takeaways:
- "**Two pointers for palindrome**: The classic technique of comparing from both ends works here with a twist — you get one 'undo' when characters don't match"
- "**Greedy decision point**: When you hit a mismatch, you must try both deletion options since you can't know in advance which will succeed"
- "**Building on fundamentals**: This problem extends the basic palindrome check — many interview problems are variations of simpler ones"
- "**Early termination**: If no mismatch is found, the string is already a palindrome — no deletion needed"
time_complexity: "O(n). We traverse the string at most twice — once with the main two pointers, and potentially once more to verify a substring after a mismatch."
space_complexity: "O(n) with string slicing for the substring palindrome check, or O(1) if using index-based helper function."
solutions:
- approach_name: Two Pointers with Helper Function
is_optimal: true
code: |
def valid_palindrome(s: str) -> bool:
    def is_palindrome(left: int, right: int) -> bool:
        """Check if s[left:right+1] is a palindrome using indices."""
        while left < right:
            if s[left] != s[right]:
                return False
            left += 1
            right -= 1
        return True
    left, right = 0, len(s) - 1
    while left < right:
        if s[left] != s[right]:
            # Mismatch found: try skipping left or right character
            return is_palindrome(left + 1, right) or is_palindrome(left, right - 1)
        left += 1
        right -= 1
    # No mismatch found: already a palindrome
    return True
explanation: |
**Time Complexity:** O(n) — We scan through the string at most twice.
**Space Complexity:** O(1) — We only use pointer variables; no extra space proportional to input size.
The helper function checks if a substring is a palindrome using indices, avoiding string slicing. When we find a mismatch, we try both options (skip left or skip right) and return true if either produces a palindrome.
- approach_name: Two Pointers with String Slicing
is_optimal: false
code: |
def valid_palindrome(s: str) -> bool:
    def is_palindrome(substring: str) -> bool:
        """Check if a string is a palindrome."""
        return substring == substring[::-1]
    left, right = 0, len(s) - 1
    while left < right:
        if s[left] != s[right]:
            # Try removing left character or right character
            skip_left = s[left + 1:right + 1]
            skip_right = s[left:right]
            return is_palindrome(skip_left) or is_palindrome(skip_right)
        left += 1
        right -= 1
    return True
explanation: |
**Time Complexity:** O(n) — String reversal and comparison are both O(n).
**Space Complexity:** O(n) — String slicing creates new strings.
This version is more readable but uses extra space for the substring copies. The logic is identical: find the first mismatch, then check if either deletion option creates a palindrome.
- approach_name: Brute Force
is_optimal: false
code: |
def valid_palindrome(s: str) -> bool:
    def is_palindrome(string: str) -> bool:
        return string == string[::-1]
    # Check if already a palindrome
    if is_palindrome(s):
        return True
    # Try deleting each character
    for i in range(len(s)):
        # Create string with character at index i removed
        modified = s[:i] + s[i + 1:]
        if is_palindrome(modified):
            return True
    return False
explanation: |
**Time Complexity:** O(n^2) — We try n deletions, each requiring O(n) palindrome check.
**Space Complexity:** O(n) — Creating modified strings.
This brute force approach is correct but too slow for large inputs. It's included to illustrate why the two-pointer optimisation is necessary. With `n = 10^5`, this would perform up to 10 billion operations.

@@ -0,0 +1,179 @@
title: Valid Palindrome
slug: valid-palindrome
difficulty: easy
leetcode_id: 125
leetcode_url: https://leetcode.com/problems/valid-palindrome/
categories:
- strings
- two-pointers
patterns:
- two-pointers
function_signature: "def is_palindrome(s: str) -> bool:"
test_cases:
visible:
- input: { s: "A man, a plan, a canal: Panama" }
expected: true
- input: { s: "race a car" }
expected: false
- input: { s: " " }
expected: true
hidden:
- input: { s: "a" }
expected: true
- input: { s: "ab" }
expected: false
- input: { s: "Aa" }
expected: true
description: |
A phrase is a **palindrome** if, after converting all uppercase letters into lowercase letters and removing all non-alphanumeric characters, it reads the same forward and backward. Alphanumeric characters include letters and numbers.
Given a string `s`, return `true` *if it is a **palindrome***, or `false` *otherwise*.
constraints: |
- `1 <= s.length <= 2 * 10^5`
- `s` consists only of printable ASCII characters.
examples:
- input: 's = "A man, a plan, a canal: Panama"'
output: "true"
explanation: '"amanaplanacanalpanama" is a palindrome.'
- input: 's = "race a car"'
output: "false"
explanation: '"raceacar" is not a palindrome.'
- input: 's = " "'
output: "true"
explanation: 's is an empty string "" after removing non-alphanumeric characters. Since an empty string reads the same forward and backward, it is a palindrome.'
explanation:
intuition: |
Imagine you have a phrase written on a piece of paper, and you want to check whether it reads the same from right to left as it does from left to right (ignoring case and punctuation).
The key insight is that a palindrome has **mirror symmetry** — the first character matches the last, the second matches the second-to-last, and so on. You only need to compare characters **from opposite ends** moving toward the middle.
Think of it like this: place one finger at the start of the string and another at the end. Move them toward each other, comparing alphanumeric characters as you go. If any pair doesn't match (after normalising case), it's not a palindrome. If they meet in the middle without a mismatch, it is.
The clever part is **skipping non-alphanumeric characters** on the fly. Instead of building a cleaned string first, we simply advance our pointers past any characters that aren't letters or digits.
approach: |
We solve this using the **Two Pointers** technique:
**Step 1: Initialise two pointers**
- `left`: Starts at index `0` (beginning of the string)
- `right`: Starts at index `len(s) - 1` (end of the string)
&nbsp;
**Step 2: Move pointers toward each other**
- While `left < right`:
  - Skip non-alphanumeric characters by advancing `left` while `s[left]` is not alphanumeric
  - Skip non-alphanumeric characters by decrementing `right` while `s[right]` is not alphanumeric
  - Compare `s[left].lower()` with `s[right].lower()`
  - If they don't match, return `false` immediately
  - If they match, move `left` forward and `right` backward
&nbsp;
**Step 3: Return the result**
- If the loop completes without finding a mismatch, return `true`
&nbsp;
This approach processes the string in-place without creating a cleaned copy, achieving O(1) space complexity while maintaining O(n) time.
common_pitfalls:
- title: Creating a Cleaned String First
description: |
A common approach is to first filter the string to keep only alphanumeric characters, convert to lowercase, then check if it equals its reverse:
```python
cleaned = ''.join(c.lower() for c in s if c.isalnum())
return cleaned == cleaned[::-1]
```
While this works and is easy to understand, it uses **O(n) extra space** for the cleaned string. The two-pointer approach achieves the same result with O(1) space by processing in-place.
wrong_approach: "Filter and reverse comparison"
correct_approach: "Two pointers comparing in-place"
- title: Forgetting to Skip Non-Alphanumeric Characters
description: |
If you simply compare `s[left]` and `s[right]` without skipping punctuation and spaces, you'll get wrong answers.
For example, `"A man, a plan, a canal: Panama"` would fail because `'A'` would be compared with `'a'` (correct), but then `' '` would be compared with `'m'` (incorrect).
Always advance the pointers past non-alphanumeric characters before comparing.
- title: Case Sensitivity
description: |
Forgetting to convert characters to the same case before comparison will cause failures.
`'A'` and `'a'` should be considered equal, but in ASCII they have different values (65 vs 97). Always use `.lower()` or `.upper()` when comparing.
- title: Boundary Conditions in Pointer Movement
description: |
When skipping non-alphanumeric characters, ensure you don't move the pointer out of bounds. Always check `left < right` during the skip loops, not just in the main loop.
For a string like `".,,"`, both pointers need to skip all characters without causing an index error.
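As a rough sketch of the guarded skip loops in isolation, the hypothetical helper `skip_to_alnum` below advances both pointers and returns where they stop. The name and the sample inputs are assumptions made for this illustration.

```python
def skip_to_alnum(s: str, left: int, right: int) -> tuple[int, int]:
    """Move left forward and right backward past non-alphanumeric characters."""
    # The left < right guard stops each pointer before it can run off the string
    while left < right and not s[left].isalnum():
        left += 1
    while left < right and not s[right].isalnum():
        right -= 1
    return left, right


print(skip_to_alnum(".,,", 0, 2))    # (2, 2)
print(skip_to_alnum(",a.b!", 0, 4))  # (1, 3)
```

For `".,,"` the pointers meet at index 2 without ever leaving the string. Without the `left < right` guard, the first loop would keep advancing to index 3 and raise an `IndexError`.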
key_takeaways:
- "**Two Pointers pattern**: Comparing elements from opposite ends is a fundamental technique for palindrome problems and many array/string problems"
- "**In-place processing**: Skipping unwanted characters during traversal is more space-efficient than building a filtered copy"
- "**Normalisation**: When comparing text, always consider case sensitivity and which characters should be included"
- "**Foundation for variants**: This technique extends to problems like Valid Palindrome II (remove at most one character) and checking palindromes in linked lists"
time_complexity: "O(n). Each character is visited at most once by either the left or right pointer."
space_complexity: "O(1). We only use two integer pointers regardless of input size."
solutions:
- approach_name: Two Pointers
is_optimal: true
code: |
def is_palindrome(s: str) -> bool:
    left = 0
    right = len(s) - 1
    while left < right:
        # Skip non-alphanumeric characters from the left
        while left < right and not s[left].isalnum():
            left += 1
        # Skip non-alphanumeric characters from the right
        while left < right and not s[right].isalnum():
            right -= 1
        # Compare characters (case-insensitive)
        if s[left].lower() != s[right].lower():
            return False
        # Move pointers toward the center
        left += 1
        right -= 1
    return True
explanation: |
**Time Complexity:** O(n) — Each character is visited at most once.
**Space Complexity:** O(1) — Only two integer pointers used.
We use two pointers starting from opposite ends. For each iteration, we skip any non-alphanumeric characters, then compare the alphanumeric characters (case-insensitively). If all pairs match, the string is a palindrome.
- approach_name: Filter and Reverse
is_optimal: false
code: |
def is_palindrome(s: str) -> bool:
    # Build a cleaned string with only lowercase alphanumeric chars
    cleaned = ''.join(char.lower() for char in s if char.isalnum())
    # Compare with its reverse
    return cleaned == cleaned[::-1]
explanation: |
**Time Complexity:** O(n) — One pass to filter, one pass to reverse, one pass to compare.
**Space Complexity:** O(n) — The cleaned string and its reverse each take O(n) space.
This approach is more intuitive but less space-efficient. We first create a cleaned version of the string containing only lowercase alphanumeric characters, then check if it equals its reverse. While correct, the two-pointer approach is preferred for better space efficiency.

@@ -0,0 +1,246 @@
title: Valid Parenthesis String
slug: valid-parenthesis-string
difficulty: medium
leetcode_id: 678
leetcode_url: https://leetcode.com/problems/valid-parenthesis-string/
categories:
- strings
- stack
- dynamic-programming
patterns:
- greedy
- dynamic-programming
description: |
Given a string `s` containing only three types of characters: `'('`, `')'` and `'*'`, return `true` *if* `s` *is **valid***.
The following rules define a **valid** string:
- Any left parenthesis `'('` must have a corresponding right parenthesis `')'`.
- Any right parenthesis `')'` must have a corresponding left parenthesis `'('`.
- Left parenthesis `'('` must go before the corresponding right parenthesis `')'`.
- `'*'` could be treated as a single right parenthesis `')'` or a single left parenthesis `'('` or an empty string `""`.
constraints: |
- `1 <= s.length <= 100`
- `s[i]` is `'('`, `')'` or `'*'`.
examples:
- input: 's = "()"'
output: "true"
explanation: "Standard valid parentheses with matching open and close."
- input: 's = "(*)"'
output: "true"
explanation: "The '*' can be treated as an empty string, leaving valid '()'."
- input: 's = "(*))"'
output: "true"
explanation: "The '*' can be treated as '(', making the string '(())' which is valid."
explanation:
intuition: |
Think of this problem as **tracking the possible number of unmatched open parentheses** at any point in the string.
Without wildcards, validating parentheses is straightforward: maintain a counter that increases for `'('` and decreases for `')'`. If it ever goes negative or doesn't end at zero, the string is invalid.
The `'*'` wildcard complicates things because it can be any of three things: `'('`, `')'`, or empty. Instead of tracking a single count, we need to track a **range of possibilities**.
Imagine you're walking through the string left to right. At each position, the number of unmatched `'('` could be anywhere within a range:
- **Minimum count** (`lo`): The fewest unmatched `'('` we could have (if we treat `'*'` as `)` or empty when helpful)
- **Maximum count** (`hi`): The most unmatched `'('` we could have (if we treat `'*'` as `(` when helpful)
As long as there exists *some* valid interpretation (i.e., the range includes zero at the end), the string is valid. The key insight is that we don't need to try all 3<sup>n</sup> combinations — we just need to track the bounds.
approach: |
We solve this using a **Greedy Range Tracking** approach:
**Step 1: Initialise two counters**
- `lo`: Set to `0`, representing the minimum possible unmatched `'('`
- `hi`: Set to `0`, representing the maximum possible unmatched `'('`
&nbsp;
**Step 2: Iterate through each character**
- For `'('`: Both `lo` and `hi` increase by 1 (we must have one more unmatched open)
- For `')'`: Both `lo` and `hi` decrease by 1 (we close one open parenthesis)
- For `'*'`: `lo` decreases by 1 (treat as `)` or empty), `hi` increases by 1 (treat as `(`)
&nbsp;
**Step 3: Maintain validity of the range**
- If `hi` goes negative, we have too many `)` that can't be matched — return `false`
- Keep `lo` at least 0 (we can't have negative unmatched opens in reality; this just means we'd treat some `'*'` differently)
&nbsp;
**Step 4: Check final state**
- If `lo == 0` at the end, there's a valid way to interpret the wildcards
- Return `lo == 0`
&nbsp;
This greedy approach works because we're tracking all possible valid states simultaneously through the range `[lo, hi]`. If zero falls within this range at the end, we can construct a valid interpretation.
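As a rough illustration of the steps above, the sketch below records the `[lo, hi]` range after each character. The helper name `trace_range` is hypothetical and exists only for this example.

```python
def trace_range(s: str) -> list[tuple[int, int]]:
    """Return the (lo, hi) range after each character, stopping once hi < 0."""
    lo = hi = 0
    history = []
    for char in s:
        if char == '(':
            lo, hi = lo + 1, hi + 1
        elif char == ')':
            lo, hi = lo - 1, hi - 1
        else:  # '*' widens the range in both directions
            lo, hi = lo - 1, hi + 1
        if hi < 0:
            # Even the most generous interpretation has an unmatched ')'
            history.append((lo, hi))
            break
        lo = max(lo, 0)  # clamp: a negative count is not a real state
        history.append((lo, hi))
    return history


print(trace_range("(*))"))  # [(1, 1), (0, 2), (0, 1), (0, 0)]
```

The final range is `(0, 0)`, so zero unmatched opens is reachable and the string is valid.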
common_pitfalls:
- title: Trying All Combinations
description: |
A naive approach might try all possible interpretations of each `'*'` character, leading to **O(3^n) time complexity** where `n` is the number of wildcards.
With up to 100 characters potentially being wildcards, this would be astronomically slow. The range-tracking approach reduces this to O(n) by recognising we only need to track bounds, not enumerate possibilities.
wrong_approach: "Recursively try all 3 options for each '*'"
correct_approach: "Track min/max range of possible open counts"
- title: Only Tracking One Counter
description: |
Using a single counter with one fixed interpretation of `'*'`, as in regular parenthesis validation, won't work. Consider `"*)(*"`:
- If we always treat `'*'` as empty: `")("` → invalid
- If we always treat `'*'` as `'('`: `"()(("` → invalid
- If we always treat `'*'` as `')'`: `"))()"` → invalid
Yet the string is valid: treating the first `'*'` as `'('` and the second as `')'` gives `"()()"`. Different `'*'` characters may need different interpretations.
wrong_approach: "Single counter with fixed '*' interpretation"
correct_approach: "Two counters tracking the range of possibilities"
- title: Forgetting to Clamp the Minimum
description: |
When `'*'` is treated as `')'`, `lo` might go negative. But a negative count of unmatched `'('` doesn't make sense in reality — it just means we'd treat fewer `'*'` as `)`.
If we don't clamp `lo` to at least 0, we'll get incorrect results. For example, `"*"` should be valid (treat as empty), but without clamping, `lo` would be -1.
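To make the effect concrete, here is a small sketch that runs the same scan with and without the clamp. The `clamp` flag and the name `check_with_optional_clamp` are hypothetical, added only for this comparison.

```python
def check_with_optional_clamp(s: str, clamp: bool) -> bool:
    lo = hi = 0
    for char in s:
        if char == '(':
            lo, hi = lo + 1, hi + 1
        elif char == ')':
            lo, hi = lo - 1, hi - 1
        else:  # '*'
            lo, hi = lo - 1, hi + 1
        if hi < 0:
            return False
        if clamp:
            lo = max(lo, 0)
    return lo == 0


# "*" is valid, but the unclamped version rejects it (lo ends at -1)
print(check_with_optional_clamp("*", clamp=True))    # True
print(check_with_optional_clamp("*", clamp=False))   # False
# "*(" is invalid, but the unclamped version accepts it (lo drifts from -1 back to 0)
print(check_with_optional_clamp("*(", clamp=True))   # False
print(check_with_optional_clamp("*(", clamp=False))  # True
```

Without the clamp, the result can be wrong in both directions, not only for the single-wildcard case.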
wrong_approach: "Let lo go negative without correction"
correct_approach: "Use lo = max(lo, 0) after each step"
key_takeaways:
- "**Range tracking**: When a problem has multiple valid states, track the bounds rather than enumerating all possibilities"
- "**Greedy with bounds**: This pattern of maintaining `[lo, hi]` range appears in other problems involving wildcards or uncertain values"
- "**Linear scan suffices**: Even with exponential possible interpretations, clever state tracking reduces complexity to O(n)"
- "**Extends classic pattern**: This builds on the basic parenthesis validation pattern by adding flexibility for wildcards"
time_complexity: "O(n). We traverse the string exactly once, performing constant-time operations at each character."
space_complexity: "O(1). We only use two integer variables (`lo` and `hi`), regardless of input size."
solutions:
- approach_name: Greedy Range Tracking
is_optimal: true
code: |
def check_valid_string(s: str) -> bool:
    # lo = minimum possible unmatched '('
    # hi = maximum possible unmatched '('
    lo = 0
    hi = 0
    for char in s:
        if char == '(':
            # Must have one more unmatched open
            lo += 1
            hi += 1
        elif char == ')':
            # Close one open parenthesis
            lo -= 1
            hi -= 1
        else:  # char == '*'
            # '*' as ')' or empty decreases lo
            # '*' as '(' increases hi
            lo -= 1
            hi += 1
        # Too many ')' that can't be matched
        if hi < 0:
            return False
        # Can't have negative unmatched '(' in reality
        lo = max(lo, 0)
    # Valid if we can end with zero unmatched '('
    return lo == 0
explanation: |
**Time Complexity:** O(n) — Single pass through the string.
**Space Complexity:** O(1) — Only two integer variables used.
We track the range of possible unmatched open parentheses. At each step, `lo` represents the minimum (assuming wildcards help close) and `hi` represents the maximum (assuming wildcards add opens). If zero is achievable at the end, the string is valid.
- approach_name: Two Stack
is_optimal: false
code: |
def check_valid_string(s: str) -> bool:
    # Store indices of unmatched '(' and '*'
    open_stack = []
    star_stack = []
    for i, char in enumerate(s):
        if char == '(':
            open_stack.append(i)
        elif char == '*':
            star_stack.append(i)
        else:  # char == ')'
            # Try to match with '(' first, then '*'
            if open_stack:
                open_stack.pop()
            elif star_stack:
                star_stack.pop()
            else:
                # No '(' or '*' to match
                return False
    # Match remaining '(' with '*' that comes after
    while open_stack and star_stack:
        # '(' must come before '*' for valid match
        if open_stack[-1] > star_stack[-1]:
            return False
        open_stack.pop()
        star_stack.pop()
    # Valid only if all '(' are matched
    return len(open_stack) == 0
explanation: |
**Time Complexity:** O(n) — Single pass plus stack cleanup.
**Space Complexity:** O(n) — Stacks can store up to n indices.
This approach uses two stacks to track positions of unmatched `'('` and `'*'`. When we see `')'`, we prefer matching with `'('`. After the scan, we try to match remaining `'('` with `'*'` that appear to their right. While correct, it uses more space than the greedy approach.
- approach_name: Dynamic Programming
is_optimal: false
code: |
def check_valid_string(s: str) -> bool:
    n = len(s)
    # dp[i][j] = True if s[i:] is valid with j unmatched '('
    # Use memoisation for efficiency
    memo = {}
    def dp(i: int, open_count: int) -> bool:
        # Base case: end of string
        if i == n:
            return open_count == 0
        # Too many unmatched ')' seen
        if open_count < 0:
            return False
        if (i, open_count) in memo:
            return memo[(i, open_count)]
        char = s[i]
        if char == '(':
            result = dp(i + 1, open_count + 1)
        elif char == ')':
            result = dp(i + 1, open_count - 1)
        else:  # char == '*'
            # Try all three options
            result = (dp(i + 1, open_count + 1) or  # as '('
                      dp(i + 1, open_count - 1) or  # as ')'
                      dp(i + 1, open_count))        # as empty
        memo[(i, open_count)] = result
        return result
    return dp(0, 0)
explanation: |
**Time Complexity:** O(n^2) — At most n positions times n possible open counts.
**Space Complexity:** O(n^2) — Memoisation table size.
This recursive approach with memoisation explores all possibilities but caches results. For each position and open count, we determine if the remaining string can be valid. While correct and more intuitive, it's less efficient than the greedy approach for this problem.

@@ -0,0 +1,238 @@
title: Valid Sudoku
slug: valid-sudoku
difficulty: medium
leetcode_id: 36
leetcode_url: https://leetcode.com/problems/valid-sudoku/
categories:
- arrays
- hash-tables
patterns:
- matrix-traversal
description: |
Determine if a `9 x 9` Sudoku board is valid. Only the filled cells need to be validated **according to the following rules**:
1. Each row must contain the digits `1-9` without repetition.
2. Each column must contain the digits `1-9` without repetition.
3. Each of the nine `3 x 3` sub-boxes of the grid must contain the digits `1-9` without repetition.
**Note:**
- A Sudoku board (partially filled) could be valid but is not necessarily solvable.
- Only the filled cells need to be validated according to the mentioned rules.
constraints: |
- `board.length == 9`
- `board[i].length == 9`
- `board[i][j]` is a digit `1-9` or `'.'`
examples:
- input: 'board = [["5","3",".",".","7",".",".",".","."],["6",".",".","1","9","5",".",".","."],[".","9","8",".",".",".",".","6","."],["8",".",".",".","6",".",".",".","3"],["4",".",".","8",".","3",".",".","1"],["7",".",".",".","2",".",".",".","6"],[".","6",".",".",".",".","2","8","."],[".",".",".","4","1","9",".",".","5"],[".",".",".",".","8",".",".","7","9"]]'
output: "true"
explanation: "The board satisfies all three rules: no row, column, or 3x3 sub-box contains duplicate digits."
- input: 'board = [["8","3",".",".","7",".",".",".","."],["6",".",".","1","9","5",".",".","."],[".","9","8",".",".",".",".","6","."],["8",".",".",".","6",".",".",".","3"],["4",".",".","8",".","3",".",".","1"],["7",".",".",".","2",".",".",".","6"],[".","6",".",".",".",".","2","8","."],[".",".",".","4","1","9",".",".","5"],[".",".",".",".","8",".",".","7","9"]]'
output: "false"
explanation: "The top-left 3x3 sub-box contains two 8's (at positions [0][0] and [2][2]), making it invalid. Column 0 also contains two 8's (at positions [0][0] and [3][0])."
explanation:
intuition: |
Think of this problem as a **triple bookkeeping challenge**. Imagine you're a Sudoku referee checking whether a partially filled board follows the rules — you need to keep track of which numbers have appeared in each row, each column, and each 3x3 box.
The key insight is that you don't need to solve the Sudoku — you only need to verify that no rule is currently broken. This means checking for **duplicates** in three different contexts simultaneously.
Picture walking through the board cell by cell. For each filled cell, you ask three questions:
- "Have I seen this number in this row before?"
- "Have I seen this number in this column before?"
- "Have I seen this number in this 3x3 box before?"
If the answer to any of these is "yes," the board is invalid. If you finish scanning all cells without finding a conflict, the board is valid.
The clever part is figuring out which 3x3 box a cell belongs to. For a cell at position `(row, col)`, its box index can be computed as `(row // 3, col // 3)`. This maps the 9x9 grid into a 3x3 grid of boxes.
approach: |
We solve this using **Hash Sets for Tracking**:
**Step 1: Initialise tracking structures**
- `rows`: A list of 9 sets, where `rows[i]` tracks digits seen in row `i`
- `cols`: A list of 9 sets, where `cols[j]` tracks digits seen in column `j`
- `boxes`: A list of 9 sets, where `boxes[k]` tracks digits seen in box `k`
&nbsp;
**Step 2: Iterate through every cell**
- For each cell at `(row, col)`, skip if it contains `'.'`
- Calculate the box index: `box_idx = (row // 3) * 3 + (col // 3)`
- This formula maps the 3x3 sub-grids to indices 0-8
&nbsp;
**Step 3: Check for duplicates**
- If the digit is already in `rows[row]`, `cols[col]`, or `boxes[box_idx]`, return `False`
- Otherwise, add the digit to all three sets
&nbsp;
**Step 4: Return the result**
- If we complete the iteration without finding duplicates, return `True`
common_pitfalls:
- title: Incorrect Box Index Calculation
description: |
The trickiest part is mapping `(row, col)` to the correct box index. A common mistake is using `row // 3 + col // 3`, which doesn't uniquely identify boxes.
For example, cell `(0, 3)` and cell `(3, 0)` would both map to box index `1` with the wrong formula, but they're actually in different boxes.
The correct formula is `(row // 3) * 3 + (col // 3)`:
- `row // 3` gives the box row (0, 1, or 2)
- `col // 3` gives the box column (0, 1, or 2)
- Combining them: `box_row * 3 + box_col` gives indices 0-8
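A quick way to see the collision is to enumerate the indices that each formula can produce. This is only an illustrative sketch; the function names `wrong_box` and `correct_box` are made up for the comparison.

```python
def wrong_box(row: int, col: int) -> int:
    # Flawed: different boxes can share the same sum
    return row // 3 + col // 3


def correct_box(row: int, col: int) -> int:
    # Box row selects a block of three indices, box column selects within it
    return (row // 3) * 3 + (col // 3)


cells = [(r, c) for r in range(9) for c in range(9)]
print(sorted({wrong_box(r, c) for r, c in cells}))    # [0, 1, 2, 3, 4]
print(sorted({correct_box(r, c) for r, c in cells}))  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
print(wrong_box(0, 3), wrong_box(3, 0))               # 1 1
print(correct_box(0, 3), correct_box(3, 0))           # 1 3
```

The flawed formula yields only five distinct indices for nine boxes, so some boxes must share a tracking set.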
wrong_approach: "box_idx = row // 3 + col // 3"
correct_approach: "box_idx = (row // 3) * 3 + (col // 3)"
- title: Checking Empty Cells
description: |
Empty cells (containing `'.'`) should be skipped entirely. A common error is forgetting to check for empty cells, which might cause issues if `'.'` gets added to your tracking sets.
Always check `if cell == '.'` and `continue` before processing.
wrong_approach: "Processing all cells including empty ones"
correct_approach: "Skip cells containing '.'"
- title: Using Lists Instead of Sets
description: |
Using lists with `in` checks results in O(n) lookup time per check. With sets, membership testing is O(1) on average.
While this doesn't change the overall O(81) = O(1) complexity for a fixed 9x9 board, using sets is the idiomatic and efficient approach for duplicate detection.
wrong_approach: "rows = [[] for _ in range(9)]"
correct_approach: "rows = [set() for _ in range(9)]"
key_takeaways:
- "**Hash sets for duplicate detection**: When checking for duplicates across multiple dimensions, use separate sets for each dimension"
- "**2D to 1D index mapping**: The formula `(row // 3) * 3 + (col // 3)` is a common pattern for mapping 2D sub-grids to unique indices"
- "**Validation vs solving**: This problem only validates current state — it doesn't require backtracking or solving the puzzle"
- "**Fixed-size optimisation**: Since the board is always 9x9, the complexity is technically O(1), but the algorithm generalises to larger grids"
time_complexity: "O(81) = O(1). We iterate through each of the 81 cells exactly once. For a general `n x n` board, this would be O(n^2)."
space_complexity: "O(81) = O(1). We use 27 sets (9 rows + 9 columns + 9 boxes), each storing at most 9 digits. For a general `n x n` board, this would be O(n^2)."
solutions:
- approach_name: Hash Set Tracking
is_optimal: true
code: |
def is_valid_sudoku(board: list[list[str]]) -> bool:
    # Initialise sets for each row, column, and 3x3 box
    rows = [set() for _ in range(9)]
    cols = [set() for _ in range(9)]
    boxes = [set() for _ in range(9)]
    for row in range(9):
        for col in range(9):
            digit = board[row][col]
            # Skip empty cells
            if digit == '.':
                continue
            # Calculate which 3x3 box this cell belongs to
            box_idx = (row // 3) * 3 + (col // 3)
            # Check if digit already exists in row, column, or box
            if digit in rows[row]:
                return False
            if digit in cols[col]:
                return False
            if digit in boxes[box_idx]:
                return False
            # Add digit to all three tracking sets
            rows[row].add(digit)
            cols[col].add(digit)
            boxes[box_idx].add(digit)
    # No duplicates found
    return True
explanation: |
**Time Complexity:** O(1) — We always iterate through exactly 81 cells.
**Space Complexity:** O(1) — We use a fixed number of sets (27 total) with at most 9 elements each.
This solution uses hash sets to efficiently track which digits have been seen in each row, column, and 3x3 box. The key insight is computing the box index from the cell coordinates using integer division.
- approach_name: Single Set with Encoded Keys
is_optimal: true
code: |
def is_valid_sudoku(board: list[list[str]]) -> bool:
    seen = set()
    for row in range(9):
        for col in range(9):
            digit = board[row][col]
            if digit == '.':
                continue
            # Create unique keys for row, column, and box
            row_key = (digit, 'row', row)
            col_key = (digit, 'col', col)
            box_key = (digit, 'box', row // 3, col // 3)
            # Check if any key already exists
            if row_key in seen or col_key in seen or box_key in seen:
                return False
            # Add all three keys
            seen.add(row_key)
            seen.add(col_key)
            seen.add(box_key)
    return True
explanation: |
**Time Complexity:** O(1) — Same iteration through 81 cells.
**Space Complexity:** O(1) — Single set with at most 243 entries (81 cells x 3 keys each).
This alternative uses a single set with encoded tuples to distinguish between row, column, and box constraints. The tuple structure ensures uniqueness: `('5', 'row', 0)` means "digit 5 in row 0". This approach is more compact in code but uses slightly more memory per entry due to tuple overhead.
- approach_name: Bitmask Tracking
is_optimal: true
code: |
def is_valid_sudoku(board: list[list[str]]) -> bool:
    # Use integers as bitmasks (bit i represents digit i)
    rows = [0] * 9
    cols = [0] * 9
    boxes = [0] * 9
    for row in range(9):
        for col in range(9):
            digit = board[row][col]
            if digit == '.':
                continue
            # Convert digit to bit position (1-9 -> bits 1-9)
            bit = 1 << int(digit)
            box_idx = (row // 3) * 3 + (col // 3)
            # Check if bit is already set in any mask
            if rows[row] & bit:
                return False
            if cols[col] & bit:
                return False
            if boxes[box_idx] & bit:
                return False
            # Set the bit in all three masks
            rows[row] |= bit
            cols[col] |= bit
            boxes[box_idx] |= bit
    return True
explanation: |
**Time Complexity:** O(1) — Same iteration through 81 cells.
**Space Complexity:** O(1) — Uses 27 integers instead of 27 sets.
This solution uses bitmasks for more memory-efficient tracking. Each integer represents a set of digits using bits: bit `i` being set means digit `i` has been seen. Bitwise AND (`&`) checks membership, and bitwise OR (`|=`) adds elements. This approach can be faster in practice, since each membership check reduces to a simple integer bit operation, though the gain depends on the language and runtime.
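For readers less familiar with bitmasks, the following standalone sketch shows the two operations on a single mask. The variable names are illustrative only.

```python
mask = 0                   # empty set: no digits seen yet
bit_5 = 1 << 5             # bit position for digit 5

print(bool(mask & bit_5))  # False: 5 has not been seen
mask |= bit_5              # record digit 5
mask |= 1 << 9             # record digit 9
print(bool(mask & bit_5))  # True: a second 5 would be a duplicate
print(bin(mask))           # 0b1000100000
```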

@@ -0,0 +1,179 @@
title: Validate Binary Search Tree
slug: validate-binary-search-tree
difficulty: medium
leetcode_id: 98
leetcode_url: https://leetcode.com/problems/validate-binary-search-tree/
categories:
- trees
- recursion
patterns:
- dfs
- tree-traversal
description: |
Given the `root` of a binary tree, determine if it is a valid **binary search tree (BST)**.
A valid BST is defined as follows:
- The left subtree of a node contains only nodes with keys **strictly less than** the node's key.
- The right subtree of a node contains only nodes with keys **strictly greater than** the node's key.
- Both the left and right subtrees must also be binary search trees.
constraints: |
- The number of nodes in the tree is in the range `[1, 10^4]`
- `-2^31 <= Node.val <= 2^31 - 1`
examples:
- input: "root = [2,1,3]"
output: "true"
explanation: "Left child 1 < root 2 < right child 3. Valid BST."
- input: "root = [5,1,4,null,null,3,6]"
output: "false"
explanation: "The root's right child is 4, which is less than root 5. Even though 4's children (3, 6) satisfy local constraints, 3 should be > 5 to be in the right subtree. Invalid BST."
explanation:
intuition: |
The naive approach is to check if each node satisfies `left.val < node.val < right.val`. But this only checks **local** constraints. BST requires **global** constraints!
Think of it like this: every node in the right subtree must be greater than the root — not just the immediate right child. In `[5,1,4,null,null,3,6]`, the value 3 is correctly less than its parent 4, but it's in the right subtree of 5, so it should be greater than 5!
The key insight is to pass down **valid ranges** as we traverse. When we go left, we tighten the upper bound. When we go right, we tighten the lower bound.
For example, starting at root 5:
- Left subtree must have values in `(-∞, 5)`
- Right subtree must have values in `(5, +∞)`
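- In the right subtree, node 4 must lie in `(5, +∞)`, but 4 < 5, so validation already fails there. Even if we continued, going left to node 3 would require a value in `(5, 4)`, which is an empty range, so 3 fails as well.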
approach: |
We solve this using **DFS with Range Validation**:
**Step 1: Define the recursive validation function**
- `validate(node, min_val, max_val)` returns True if the subtree rooted at `node` is a valid BST within the range `(min_val, max_val)`
- Start with the root and range `(-∞, +∞)`
&nbsp;
**Step 2: Base case**
- If `node` is None, return True (empty tree is valid)
&nbsp;
**Step 3: Check the current node**
- If `node.val <= min_val` or `node.val >= max_val`, return False
- The node violates its required range
&nbsp;
**Step 4: Recurse with updated ranges**
- Validate left subtree with range `(min_val, node.val)` — must be less than current node
- Validate right subtree with range `(node.val, max_val)` — must be greater than current node
- Return True only if both subtrees are valid
&nbsp;
This ensures every node satisfies constraints from all its ancestors, not just its parent.
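The difference between local and global checks can be sketched as follows. Both function names (`local_check_only`, `range_check`) and the sample tree `[5,4,6,null,null,3,7]` are assumptions chosen for illustration; the tree passes every parent-child comparison but still is not a BST.

```python
from typing import Optional


class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def local_check_only(node: Optional[TreeNode]) -> bool:
    """Flawed: compares each node only with its direct children."""
    if node is None:
        return True
    if node.left and node.left.val >= node.val:
        return False
    if node.right and node.right.val <= node.val:
        return False
    return local_check_only(node.left) and local_check_only(node.right)


def range_check(node: Optional[TreeNode],
                lo: float = float('-inf'),
                hi: float = float('inf')) -> bool:
    """Every node must lie strictly inside the range inherited from its ancestors."""
    if node is None:
        return True
    if not (lo < node.val < hi):
        return False
    return (range_check(node.left, lo, node.val) and
            range_check(node.right, node.val, hi))


# 3 sits in the right subtree of 5, so the tree is not a BST
root = TreeNode(5, TreeNode(4), TreeNode(6, TreeNode(3), TreeNode(7)))
print(local_check_only(root))  # True (incorrect)
print(range_check(root))       # False
```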
common_pitfalls:
- title: Only Checking Immediate Parent-Child Relationships
description: |
Checking `left.val < node.val` and `node.val < right.val` only validates local constraints. A node deep in a subtree might violate constraints from ancestors.
For example, in `[5,1,4,null,null,3,6]`, node 3 is correctly less than its parent 4, but it's in the right subtree of 5 and should be greater than 5.
wrong_approach: "Only checking node.left.val < node.val < node.right.val"
correct_approach: "Pass min/max bounds down through recursion"
- title: Using Integer Min/Max as Initial Bounds
description: |
Node values can be at integer boundaries (`-2^31` to `2^31-1`). Using `-2^31` as the initial lower bound would reject a valid node with that value.
Use `float('-inf')` and `float('inf')` or `None` with special handling.
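A minimal sketch of why the integer sentinel fails, assuming the strict range test used in the approach. The helper `in_range` is hypothetical.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def in_range(val: int, lo: float, hi: float) -> bool:
    # Same strict comparison the range validation relies on
    return lo < val < hi


# A single node holding INT_MIN is a valid BST
print(in_range(INT_MIN, INT_MIN, INT_MAX))                # False: wrongly rejected
print(in_range(INT_MIN, float('-inf'), float('inf')))     # True
```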
wrong_approach: "validate(root, -2**31, 2**31-1)"
correct_approach: "validate(root, float('-inf'), float('inf'))"
- title: Using Non-Strict Comparisons
description: |
The BST definition requires **strictly** less than and **strictly** greater than. Equal values are not allowed.
Use `<` and `>`, not `<=` and `>=`.
wrong_approach: "if node.val <= max_val and node.val >= min_val"
correct_approach: "if min_val < node.val < max_val"
key_takeaways:
- "**BST is a global property**: Every node must satisfy constraints from ALL ancestors, not just its parent"
- "**Range propagation**: Pass valid ranges down during recursion to enforce global constraints"
- "**Inorder traversal alternative**: BST's inorder traversal produces a strictly increasing sequence"
- "**Handle boundary values**: Use infinity or None for initial bounds to handle edge cases"
time_complexity: "O(n). We visit each node exactly once."
space_complexity: "O(h). Recursion stack depth equals tree height — O(log n) for balanced trees, O(n) for skewed trees."
solutions:
- approach_name: DFS with Range Validation
is_optimal: true
code: |
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right
def is_valid_bst(root: TreeNode | None) -> bool:
    def validate(node: TreeNode | None, min_val: float, max_val: float) -> bool:
        # Empty tree is valid
        if not node:
            return True
        # Check if current node violates its range
        if node.val <= min_val or node.val >= max_val:
            return False
        # Validate subtrees with tightened bounds
        # Left subtree: values must be < node.val
        # Right subtree: values must be > node.val
        return (validate(node.left, min_val, node.val) and
                validate(node.right, node.val, max_val))
    # Start with infinite bounds
    return validate(root, float('-inf'), float('inf'))
explanation: |
**Time Complexity:** O(n) — Visit each node once.
**Space Complexity:** O(h) — Recursion stack depth (tree height).
We pass valid ranges down the tree. Going left tightens the upper bound (must be less than parent). Going right tightens the lower bound (must be greater than parent). Each node is checked against accumulated constraints from all ancestors.
- approach_name: Inorder Traversal
is_optimal: true
code: |
def is_valid_bst(root: TreeNode | None) -> bool:
    prev = float('-inf')
    def inorder(node: TreeNode | None) -> bool:
        nonlocal prev
        if not node:
            return True
        # Traverse left subtree
        if not inorder(node.left):
            return False
        # Check current node against previous value
        if node.val <= prev:
            return False
        prev = node.val
        # Traverse right subtree
        return inorder(node.right)
    return inorder(root)
explanation: |
**Time Complexity:** O(n) — Visit each node once.
**Space Complexity:** O(h) — Recursion stack depth.
Inorder traversal of a valid BST produces a strictly increasing sequence. We track the previous value and ensure each node is greater than it. If any node fails this check, the tree is not a valid BST.

@@ -0,0 +1,183 @@
title: Verifying an Alien Dictionary
slug: verifying-an-alien-dictionary
difficulty: easy
leetcode_id: 953
leetcode_url: https://leetcode.com/problems/verifying-an-alien-dictionary/
categories:
- arrays
- strings
- hash-tables
patterns:
- two-pointers
description: |
In an alien language, surprisingly, they also use English lowercase letters, but possibly in a different `order`. The `order` of the alphabet is some permutation of lowercase letters.
Given a sequence of `words` written in the alien language, and the `order` of the alphabet, return `true` if and only if the given `words` are sorted lexicographically in this alien language.
constraints: |
- `1 <= words.length <= 100`
- `1 <= words[i].length <= 20`
- `order.length == 26`
- All characters in `words[i]` and `order` are English lowercase letters
examples:
- input: 'words = ["hello","leetcode"], order = "hlabcdefgijkmnopqrstuvwxyz"'
output: "true"
explanation: "As 'h' comes before 'l' in this language, the sequence is sorted."
- input: 'words = ["word","world","row"], order = "worldabcefghijkmnpqstuvxyz"'
output: "false"
explanation: "As 'd' comes after 'l' in this language, words[0] > words[1], hence the sequence is unsorted."
- input: 'words = ["apple","app"], order = "abcdefghijklmnopqrstuvwxyz"'
output: "false"
explanation: "The first three characters 'app' match, and the second string is shorter. According to lexicographical rules 'apple' > 'app', because 'l' > '∅' (the blank character is less than any other character)."
explanation:
intuition: |
Imagine you're a librarian in an alien library, tasked with checking if books are shelved in alphabetical order — but the alphabet itself is different!
In our familiar English, we know `a < b < c < ... < z`. But what if the order was `h < l < a < b < ...`? Then "hello" would come before "leetcode" because `h` precedes `l` in this alien alphabet.
The core insight is that **verifying sorted order is just comparing adjacent pairs**. If every consecutive pair of words is in the correct order, the entire list must be sorted. You don't need to compare every word with every other word — just check neighbours.
Think of it like dominos: if word 1 ≤ word 2, and word 2 ≤ word 3, and so on, then the whole sequence is sorted. One "out of order" pair breaks the chain.
The second key insight is that comparing two words follows the same logic as comparing strings in any language: compare character by character until you find a difference, then the word with the "smaller" character comes first. If one word is a prefix of another, the shorter one comes first.
approach: |
We solve this by **building a character priority map** and then **comparing adjacent word pairs**:
**Step 1: Build the character order map**
- Create a dictionary mapping each character to its position in the alien alphabet
- `order_map[char] = index` gives us O(1) lookups for character priority
- Example: if `order = "hlabcdefgijkmnopqrstuvwxyz"`, then `order_map['h'] = 0`, `order_map['l'] = 1`, etc.
&nbsp;
**Step 2: Create a helper function to compare two words**
- Compare characters at the same position in both words
- If characters differ, return whether `word1[i] < word2[i]` according to `order_map`
- If all compared characters match but `word1` is longer, return `False` (prefix rule)
- If we exhaust the comparison without issues, return `True`
&nbsp;
**Step 3: Check all adjacent pairs**
- Iterate through `words[0]` to `words[n-2]`
- For each pair `(words[i], words[i+1])`, verify they're in correct order
- If any pair is out of order, return `False` immediately
- If all pairs pass, return `True`
&nbsp;
This approach efficiently validates the entire list by leveraging the transitive property of ordering.
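As an alternative illustration (not the approach described above), each word can be translated into a list of ranks and compared with Python's built-in list ordering, which is lexicographic and already treats a proper prefix as smaller. The name `is_alien_sorted_by_rank` is hypothetical, and this sketch trades the O(1) extra space for O(m) to hold the translated words.

```python
def is_alien_sorted_by_rank(words: list[str], order: str) -> bool:
    # Position of each character in the alien alphabet
    rank = {char: i for i, char in enumerate(order)}
    # Translate every word into its list of ranks
    keys = [[rank[c] for c in word] for word in words]
    # Lists compare lexicographically; a shorter prefix sorts first
    return all(a <= b for a, b in zip(keys, keys[1:]))


print(is_alien_sorted_by_rank(["hello", "leetcode"], "hlabcdefgijkmnopqrstuvwxyz"))     # True
print(is_alien_sorted_by_rank(["word", "world", "row"], "worldabcefghijkmnpqstuvxyz"))  # False
print(is_alien_sorted_by_rank(["apple", "app"], "abcdefghijklmnopqrstuvwxyz"))          # False
```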
common_pitfalls:
- title: Forgetting the Prefix Rule
description: |
When one word is a prefix of another (e.g., "app" vs "apple"), the shorter word must come first.
Many solutions correctly handle character-by-character comparison but forget this edge case. If you reach the end of the shorter word without finding a difference, you must check: is the first word shorter or equal in length? If `word1` is longer than `word2` and `word2` is a prefix of `word1`, the order is wrong.
Example: `["apple", "app"]` should return `False` because "apple" is longer and "app" is its prefix.
wrong_approach: "Only comparing characters without checking lengths"
correct_approach: "After the loop, check if word1 is longer than word2"
- title: Comparing All Pairs Instead of Adjacent Pairs
description: |
Some solutions try to compare every word with every other word, resulting in O(n²) comparisons where n is the number of words.
This is unnecessary! Due to the transitive property of ordering (if a ≤ b and b ≤ c, then a ≤ c), you only need to check adjacent pairs. If `words[0] ≤ words[1]` and `words[1] ≤ words[2]`, then `words[0] ≤ words[2]` is guaranteed.
wrong_approach: "Nested loops comparing all word pairs"
correct_approach: "Single pass comparing consecutive pairs"
- title: Using Character ASCII Values
description: |
Don't compare characters using their built-in ASCII values (`ord(c)`). The alien alphabet has a custom order that may differ completely from ASCII.
For example, in `order = "hlabcdefgijkmnopqrstuvwxyz"`, the character `l` comes before `a`, even though `ord('l') > ord('a')` in ASCII.
Always use the custom order map for comparisons.
wrong_approach: "Using ord() or direct character comparison"
correct_approach: "Using order_map[char] for priority lookups"
key_takeaways:
- "**Adjacent pair checking**: To verify a list is sorted, you only need to check consecutive pairs — the transitive property handles the rest"
- "**Hash map for custom ordering**: When dealing with non-standard orderings, build a priority map for O(1) lookups"
- "**Prefix rule in lexicographic order**: The shorter word comes first when one is a prefix of the other"
- "**Foundation for sorting problems**: This comparison logic is the building block for implementing custom sort comparators"
time_complexity: "O(m) where m is the total number of characters across all words. Each word takes part in at most two pairwise comparisons (one with each neighbour), so each character is examined at most twice."
space_complexity: "O(1). The order map has exactly 26 entries (fixed size), regardless of input size."
solutions:
- approach_name: Hash Map with Adjacent Comparison
is_optimal: true
code: |
def is_alien_sorted(words: list[str], order: str) -> bool:
    # Build a map from character to its priority (position in alien alphabet)
    order_map = {char: i for i, char in enumerate(order)}
    def is_sorted_pair(word1: str, word2: str) -> bool:
        """Check if word1 comes before or equals word2 in alien order."""
        # Compare character by character
        for c1, c2 in zip(word1, word2):
            if order_map[c1] < order_map[c2]:
                # word1 is definitely smaller
                return True
            elif order_map[c1] > order_map[c2]:
                # word1 is definitely larger: out of order!
                return False
            # Characters are equal, continue to next position
        # All compared characters matched
        # word1 must not be longer than word2 (prefix rule)
        return len(word1) <= len(word2)
    # Check all adjacent pairs
    for i in range(len(words) - 1):
        if not is_sorted_pair(words[i], words[i + 1]):
            return False
    return True
explanation: |
**Time Complexity:** O(m) where m is the total number of characters in all words. Each character is compared at most twice, once for each neighbouring word.
**Space Complexity:** O(1) — The order map always contains exactly 26 entries.
We build a priority map for O(1) character lookups, then verify each adjacent pair follows the alien lexicographic order. The key insight is that checking adjacent pairs is sufficient due to transitivity.
- approach_name: Inline Comparison (No Helper Function)
is_optimal: true
code: |
def is_alien_sorted(words: list[str], order: str) -> bool:
# Map each character to its position in the alien alphabet
order_map = {c: i for i, c in enumerate(order)}
for i in range(len(words) - 1):
word1, word2 = words[i], words[i + 1]
# Compare characters at each position
for j in range(min(len(word1), len(word2))):
if order_map[word1[j]] < order_map[word2[j]]:
# word1 < word2, this pair is correctly ordered
break
elif order_map[word1[j]] > order_map[word2[j]]:
# word1 > word2, out of order!
return False
# Characters equal, continue checking
else:
# Exhausted comparison — check prefix rule
if len(word1) > len(word2):
return False
return True
explanation: |
**Time Complexity:** O(m) where m is the total number of characters.
**Space Complexity:** O(1) — Fixed-size order map.
This version inlines the comparison logic and uses Python's for-else construct. The `else` block executes only if the loop completes without `break`, meaning all compared characters were equal — at which point we apply the prefix rule.
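As a hedged illustration of the takeaway that this comparison logic underlies custom sort keys, the sketch below checks the same property by translating each word into a list of priorities and relying on Python's built-in list comparison, which already treats a strict prefix as smaller. The name `is_alien_sorted_by_key` is made up for this example; it is an alternative formulation, not one of the solutions above.

```python
def is_alien_sorted_by_key(words: list[str], order: str) -> bool:
    # Priority of each character in the alien alphabet
    order_map = {char: i for i, char in enumerate(order)}
    # Translate every word into its list of priorities
    keys = [[order_map[c] for c in word] for word in words]
    # Lists compare lexicographically, and a strict prefix compares as smaller,
    # so the prefix rule needs no special handling here
    return all(k1 <= k2 for k1, k2 in zip(keys, keys[1:]))
```

This variant stores a translated copy of every word, so it uses O(m) extra space. The pairwise comparison in the solutions above remains preferable when memory matters.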

@@ -0,0 +1,274 @@
title: Word Break II
slug: word-break-ii
difficulty: hard
leetcode_id: 140
leetcode_url: https://leetcode.com/problems/word-break-ii/
categories:
- strings
- dynamic-programming
- hash-tables
patterns:
- backtracking
- dynamic-programming
description: |
Given a string `s` and a dictionary of strings `wordDict`, add spaces in `s` to construct a sentence where each word is a valid dictionary word. Return all such possible sentences in **any order**.
**Note** that the same word in the dictionary may be reused multiple times in the segmentation.
constraints: |
- `1 <= s.length <= 20`
- `1 <= wordDict.length <= 1000`
- `1 <= wordDict[i].length <= 10`
- `s` and `wordDict[i]` consist of only lowercase English letters
- All the strings of `wordDict` are **unique**
- Input is generated in a way that the length of the answer doesn't exceed `10^5`
examples:
- input: 's = "catsanddog", wordDict = ["cat","cats","and","sand","dog"]'
output: '["cats and dog","cat sand dog"]'
explanation: "Both 'cats and dog' and 'cat sand dog' are valid segmentations using dictionary words."
- input: 's = "pineapplepenapple", wordDict = ["apple","pen","applepen","pine","pineapple"]'
output: '["pine apple pen apple","pineapple pen apple","pine applepen apple"]'
explanation: "Note that you are allowed to reuse dictionary words. The word 'apple' appears twice in one solution."
- input: 's = "catsandog", wordDict = ["cats","dog","sand","and","cat"]'
output: "[]"
explanation: "There is no way to segment the string into valid dictionary words (the 'og' at the end cannot be matched)."
explanation:
intuition: |
Imagine you're reading a string of concatenated words with no spaces, like a text message where someone forgot to add spaces. Your task is to figure out all the possible ways to insert spaces so that every segment is a real word.
Think of the string as a path you need to walk from start to end. At each position, you look ahead to see if any dictionary word starts there. If you find a match, you can "jump" forward by the length of that word and continue from the new position. Some positions might have multiple valid words starting there, creating branching paths.
The key insight is that this problem has **overlapping subproblems**: when exploring different paths, you might reach the same position multiple times. For instance, both "cat" and "cats" might lead you to positions where the remaining string is identical. Rather than recomputing all possible sentences from that position each time, we can **memoize** the results.
This naturally leads to a **backtracking with memoization** approach: explore all valid word choices at each position, cache the results for substrings you've already solved, and combine the cached results to build complete sentences.
approach: |
We solve this using **Backtracking with Memoization**:
**Step 1: Convert dictionary to a set**
- Store `wordDict` in a hash set for O(1) word lookups
- This transforms membership checking from O(n) to O(1)
&nbsp;
**Step 2: Create a memoization cache**
- Use a dictionary mapping starting indices to lists of valid sentences
- Key: starting index in the string
- Value: list of all valid sentences that can be formed from that index to the end
&nbsp;
**Step 3: Define the recursive backtracking function**
- Base case: if we've reached the end of the string, return a list containing an empty string (signals successful segmentation)
- If the current index is already in the memo, return the cached result
- For each possible ending position from current index:
- Extract the substring and check if it's in the dictionary
- If valid, recursively get all sentences for the remaining string
- Prepend the current word to each returned sentence
- Cache and return all valid sentences from this position
&nbsp;
**Step 4: Build the final sentences**
- Start the recursion from index 0
- Each recursive call returns sentences for the substring from that index
- Combine words with spaces to form complete sentences
&nbsp;
The memoization ensures we never recompute results for the same starting position, while backtracking explores all valid word combinations.
common_pitfalls:
- title: Pure Backtracking Without Memoization
description: |
A naive backtracking approach without caching will recompute the same subproblems many times.
Consider `s = "aaaaaaa"` with `wordDict = ["a", "aa", "aaa"]`. From each position, there are multiple ways to proceed, and many paths lead to the same remaining substrings. Without memoization, the time complexity becomes exponential in the worst case.
By caching results for each starting index, we ensure each suffix `s[start:]` is expanded only once, even though the number of sentences it yields can still be large.
wrong_approach: "Recursively explore without caching results"
correct_approach: "Use a memo dictionary to cache results by starting index"
- title: Forgetting to Handle the Empty Result Case
description: |
When the recursive call returns an empty list, it means no valid segmentation exists from that point. You should skip adding words in this case.
But when you reach the end of the string successfully, you should return `[""]` (a list with one empty string), not `[]`. This empty string serves as a base case that allows word concatenation to work correctly.
wrong_approach: "Return [] at end of string, confusing 'no solution' with 'found solution'"
correct_approach: "Return [''] at end of string as a successful termination signal"
- title: Inefficient String Concatenation
description: |
Building sentences by repeatedly concatenating strings can be inefficient in some languages due to string immutability.
Instead of `word + " " + sentence` in a loop, consider building a list of words and joining them at the end, or using the language's efficient string building mechanisms.
wrong_approach: "Repeated string concatenation in tight loops"
correct_approach: "Build word lists and join once, or use efficient string builders"
- title: Not Using a Hash Set for Dictionary
description: |
Checking if a word exists in a list takes O(n) time per check. With potentially many substring checks during backtracking, this adds up quickly.
Converting the dictionary to a hash set at the start gives O(1) lookups, significantly improving performance for large dictionaries.
wrong_approach: "Use list and check with 'in' operator on list"
correct_approach: "Convert wordDict to a set for O(1) membership testing"
key_takeaways:
- "**Backtracking + Memoization**: When a problem requires finding all solutions and has overlapping subproblems, combine backtracking (to explore all paths) with memoization (to avoid recomputation)"
- "**Index-based caching**: For string problems, cache by starting index rather than by substring to save memory and simplify the logic"
- "**Builds on Word Break I**: This problem extends LeetCode 139 (Word Break) from a boolean 'can it be segmented?' to 'return all segmentations' - understanding the simpler version helps with this one"
- "**Watch for exponential output**: The constraint that output length doesn't exceed `10^5` is crucial - without it, there could be exponentially many valid sentences"
time_complexity: "O(n * 2^n) in the worst case, where `n` is the length of the string. Each position can be a word boundary or not, leading to 2^n possible segmentations. However, memoization and the practical constraint on output size make this much faster for typical inputs."
space_complexity: "O(n * m) where `n` is string length and `m` is the number of valid sentences. The memo stores lists of sentences for each starting index, and the recursion stack can go up to depth `n`."
solutions:
- approach_name: Backtracking with Memoization
is_optimal: true
code: |
def word_break(s: str, word_dict: list[str]) -> list[str]:
# Convert to set for O(1) lookups
word_set = set(word_dict)
# Cache: starting index -> list of valid sentences from that index
memo = {}
def backtrack(start: int) -> list[str]:
# Already computed results for this starting position
if start in memo:
return memo[start]
# Reached end of string - successful segmentation
if start == len(s):
return [""]
sentences = []
# Try all possible end positions for current word
for end in range(start + 1, len(s) + 1):
word = s[start:end]
# If this substring is a valid word
if word in word_set:
# Get all valid sentences for the remaining string
rest_sentences = backtrack(end)
# Combine current word with each sentence from remaining string
for sentence in rest_sentences:
if sentence:
# Add space between word and rest of sentence
sentences.append(word + " " + sentence)
else:
# Last word, no trailing space needed
sentences.append(word)
# Cache results for this starting position
memo[start] = sentences
return sentences
return backtrack(0)
explanation: |
**Time Complexity:** O(n * 2^n) worst case, but typically much better due to memoization and input constraints.
**Space Complexity:** O(n * m) for the memoization cache, where m is the number of valid sentences.
We use backtracking to explore all valid word combinations starting from each position. The memo dictionary ensures we never recompute sentences for the same starting index. The hash set enables O(1) word lookups. By caching at the index level, we efficiently handle overlapping subproblems.
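The same idea can be sketched with the standard library's `functools.lru_cache` in place of a hand-written memo dictionary. This is only an illustrative variant: the names `word_break_cached` and `sentences_from` are invented here, sentences are kept as tuples of words and joined once at the end, and the scan is capped at the longest dictionary word (which assumes `word_dict` is non-empty, as the constraints state).

```python
from functools import lru_cache


def word_break_cached(s: str, word_dict: list[str]) -> list[str]:
    word_set = set(word_dict)
    # Assumption: word_dict is non-empty (guaranteed by the constraints)
    max_len = max(len(w) for w in word_set)

    @lru_cache(maxsize=None)
    def sentences_from(start: int) -> tuple[tuple[str, ...], ...]:
        # End of string: exactly one segmentation, the empty word sequence
        if start == len(s):
            return ((),)
        results = []
        # No dictionary word is longer than max_len, so stop scanning there
        for end in range(start + 1, min(len(s), start + max_len) + 1):
            word = s[start:end]
            if word in word_set:
                for rest in sentences_from(end):
                    results.append((word,) + rest)
        return tuple(results)

    # Join each word sequence into a sentence only once, at the very end
    return [" ".join(words) for words in sentences_from(0)]
```

Returning `((),)` at the end of the string plays the same role as `[""]` in the solution above: it marks one successful segmentation, whereas an empty tuple would mean none exists.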
- approach_name: Dynamic Programming (Bottom-Up)
is_optimal: false
code: |
def word_break(s: str, word_dict: list[str]) -> list[str]:
word_set = set(word_dict)
n = len(s)
# dp[i] = list of all valid sentences for s[i:]
dp = [[] for _ in range(n + 1)]
# Base case: empty string at position n
dp[n] = [""]
# Fill DP table from right to left
for start in range(n - 1, -1, -1):
sentences = []
for end in range(start + 1, n + 1):
word = s[start:end]
if word in word_set and dp[end]:
# Combine current word with sentences from dp[end]
for sentence in dp[end]:
if sentence:
sentences.append(word + " " + sentence)
else:
sentences.append(word)
dp[start] = sentences
return dp[0]
explanation: |
**Time Complexity:** O(n * 2^n) worst case, similar to the memoized approach.
**Space Complexity:** O(n * m) for the DP table storing all sentences.
This bottom-up approach builds the solution iteratively from the end of the string to the beginning. The `dp[i]` entry stores all valid sentences that can be formed from `s[i:]`. While conceptually similar to the memoized version, this explicitly shows the DP structure. The memoized version is often preferred as it only computes necessary subproblems.
- approach_name: Trie Optimization
is_optimal: false
code: |
class TrieNode:
def __init__(self):
self.children = {}
self.is_word = False
def word_break(s: str, word_dict: list[str]) -> list[str]:
# Build trie from dictionary
root = TrieNode()
for word in word_dict:
node = root
for char in word:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.is_word = True
memo = {}
def backtrack(start: int) -> list[str]:
if start in memo:
return memo[start]
if start == len(s):
return [""]
sentences = []
node = root
# Walk the trie while scanning the string
for end in range(start, len(s)):
char = s[end]
if char not in node.children:
break # No words with this prefix
node = node.children[char]
if node.is_word:
word = s[start:end + 1]
for sentence in backtrack(end + 1):
if sentence:
sentences.append(word + " " + sentence)
else:
sentences.append(word)
memo[start] = sentences
return sentences
return backtrack(0)
explanation: |
**Time Complexity:** O(n * 2^n) worst case, but with faster prefix matching.
**Space Complexity:** O(W * L + n * m) where W is dictionary size, L is average word length.
Using a trie allows early termination when no dictionary word starts with the current prefix. This is particularly beneficial when the dictionary is large but words share common prefixes. The trie walk can quickly determine that no words exist with a given prefix, pruning the search space. For small dictionaries, the hash set approach may be simpler and sufficient.

@@ -0,0 +1,225 @@
title: Word Break
slug: word-break
difficulty: medium
leetcode_id: 139
leetcode_url: https://leetcode.com/problems/word-break/
categories:
- dynamic-programming
- strings
- hash-tables
patterns:
- dynamic-programming
description: |
Given a string `s` and a dictionary of strings `wordDict`, return `true` if `s` can be segmented into a space-separated sequence of one or more dictionary words.
**Note** that the same word in the dictionary may be reused multiple times in the segmentation.
constraints: |
- `1 <= s.length <= 300`
- `1 <= wordDict.length <= 1000`
- `1 <= wordDict[i].length <= 20`
- `s` and `wordDict[i]` consist of only lowercase English letters
- All the strings of `wordDict` are **unique**
examples:
- input: 's = "leetcode", wordDict = ["leet","code"]'
output: "true"
explanation: 'Return true because "leetcode" can be segmented as "leet code".'
- input: 's = "applepenapple", wordDict = ["apple","pen"]'
output: "true"
explanation: 'Return true because "applepenapple" can be segmented as "apple pen apple". Note that you are allowed to reuse a dictionary word.'
- input: 's = "catsandog", wordDict = ["cats","dog","sand","and","cat"]'
output: "false"
explanation: "Cannot segment the string using only words from the dictionary."
explanation:
intuition: |
Imagine you're reading a string with all the spaces removed, like "ilovecoding", and you need to figure out if it can be split back into valid words using a given dictionary.
Think of it like this: you're walking through the string character by character, and at each position you ask: "Is there any dictionary word that ends right here, AND was the position just before that word the end of a valid segmentation?"
For "leetcode" with dictionary ["leet", "code"]:
- At position 4, we find "leet" — and position 0 (the start) is a valid starting point
- At position 8, we find "code" — and position 4 was already marked as valid
- Therefore, the entire string can be segmented
This is the **optimal substructure** that makes dynamic programming work: if we know which positions in the string can be reached by valid segmentations, we can determine if new positions are reachable by checking if any dictionary word "bridges" from a known-valid position.
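To make the idea of reachable positions concrete, here is a small sketch that returns every index that can be reached by a valid segmentation. The helper name `reachable_positions` is hypothetical and exists only for this illustration.

```python
def reachable_positions(s: str, word_dict: list[str]) -> list[int]:
    word_set = set(word_dict)
    n = len(s)
    # reachable[i] is True when s[:i] can be split into dictionary words
    reachable = [False] * (n + 1)
    reachable[0] = True  # the empty prefix is always reachable
    for i in range(1, n + 1):
        # Position i is reachable if some word bridges from a reachable j
        reachable[i] = any(
            reachable[j] and s[j:i] in word_set for j in range(i)
        )
    return [i for i, ok in enumerate(reachable) if ok]
```

For `"leetcode"` with `["leet", "code"]` the reachable positions are 0, 4 and 8, matching the walkthrough above. For `"catsandog"` position 9 is never reached, which is why the answer is false.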
approach: |
We solve this using **Bottom-Up Dynamic Programming**:
**Step 1: Set up for efficient lookups**
- Convert `wordDict` to a set for O(1) lookup time
- Create `dp` array of size `n + 1` where `dp[i]` = "can the first `i` characters be segmented?"
- Set `dp[0] = True` as the base case: an empty string is trivially "segmented"
&nbsp;
**Step 2: Build up solutions for each position**
- For each position `i` from 1 to n (where we're checking if `s[:i]` can be segmented):
- Try each possible starting position `j` from 0 to `i-1`
- If `dp[j]` is True (meaning `s[:j]` can be segmented), check if `s[j:i]` is in the dictionary
- If both conditions hold, set `dp[i] = True` and break (no need to check further)
&nbsp;
**Step 3: Return the answer**
- Return `dp[n]` — whether the entire string can be segmented
&nbsp;
This approach builds on previously computed results, avoiding redundant work by storing each subproblem's answer in the DP array (tabulation).
common_pitfalls:
- title: Exponential Backtracking
description: |
A naive recursive approach without memoisation leads to exponential time complexity. Consider the string "aaaaaaaaab" with dictionary ["a", "aa", "aaa", ...].
At each position, you branch into multiple recursive calls. Without caching, you'll recompute the same subproblems many times. With `n = 300`, this will exceed the time limit (**TLE**).
wrong_approach: "Pure recursion without memoisation"
correct_approach: "Use DP array or memoisation to cache subproblem results"
- title: Using List Instead of Set for Dictionary
description: |
Checking if a word exists in a list is O(w) where w is the number of dictionary words. With up to 1000 words, this adds significant overhead inside nested loops.
Converting to a set gives O(1) average lookup, which can be the difference between passing and failing time limits.
wrong_approach: "word in wordDict (list)"
correct_approach: "word in word_set (set)"
- title: Missing the Empty String Base Case
description: |
Forgetting to set `dp[0] = True` breaks the entire algorithm. The base case represents "the empty prefix is always valid" — it's the foundation from which we build all other solutions.
Without it, no position in the string can ever become True.
wrong_approach: "dp = [False] * (n + 1)"
correct_approach: "dp = [False] * (n + 1); dp[0] = True"
- title: Off-by-One Errors with String Slicing
description: |
The DP array is of size `n + 1` where `dp[i]` represents whether `s[:i]` (first i characters) can be segmented. Be careful that:
- `dp[0]` corresponds to the empty string
- `dp[n]` corresponds to the entire string `s[:n]` which is just `s`
- When checking substring `s[j:i]`, this includes characters from index `j` up to but not including `i`
wrong_approach: "Confusing 1-indexed vs 0-indexed positions"
correct_approach: "dp[i] means 'first i characters can be segmented'"
key_takeaways:
- "**Substring segmentation pattern**: This approach generalises to problems where you must partition a string into valid segments"
- "**Set for dictionary lookup**: Always convert word lists to sets for O(1) containment checks"
- "**Foundation for Word Break II**: The same DP logic extends to finding all valid segmentations, not just checking if one exists"
- "**BFS alternative**: This problem can also be modelled as a graph where each position is a node, with edges to positions reachable by dictionary words"
time_complexity: "O(n^2 * m) when substrings longer than the maximum word length m (up to 20) are skipped: for each of n end positions we check up to n start positions, and slicing plus hashing a candidate costs O(m). Without that cap, slicing `s[j:i]` costs O(i - j), so the solutions as written are O(n^3) in the worst case."
space_complexity: "O(n + k). The DP array uses O(n) space, and the word set uses O(k) where k is the total length of all dictionary words."
solutions:
- approach_name: Bottom-Up DP
is_optimal: true
code: |
def word_break(s: str, word_dict: list[str]) -> bool:
# Convert to set for O(1) lookup
word_set = set(word_dict)
n = len(s)
# dp[i] = True if s[:i] can be segmented
dp = [False] * (n + 1)
# Base case: empty string is always "segmented"
dp[0] = True
# Check each ending position
for i in range(1, n + 1):
# Try each possible starting position for the last word
for j in range(i):
# If s[:j] can be segmented AND s[j:i] is a valid word
if dp[j] and s[j:i] in word_set:
dp[i] = True
break # Found a valid segmentation, no need to check more
return dp[n]
explanation: |
**Time Complexity:** O(n^2 * m) if substrings longer than the maximum word length m are skipped. As written, slicing `s[j:i]` can cost O(n), so the worst case is O(n^3).
**Space Complexity:** O(n + k) — DP array plus word set storage.
We build solutions from left to right. For each position i, we check if any dictionary word ends there by trying all possible starting positions j. If position j was reachable and the substring s[j:i] is a valid word, then position i is also reachable.
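One common refinement, sketched here under the assumption that `word_dict` is non-empty (as the constraints guarantee), is to try only start positions within the longest word's length of `i`. The name `word_break_bounded` is invented for this example.

```python
def word_break_bounded(s: str, word_dict: list[str]) -> bool:
    word_set = set(word_dict)
    # Assumption: word_dict is non-empty (guaranteed by the constraints)
    max_len = max(len(w) for w in word_set)
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True  # the empty prefix is trivially segmented
    for i in range(1, n + 1):
        # A word ending at i cannot start before i - max_len
        for j in range(max(0, i - max_len), i):
            if dp[j] and s[j:i] in word_set:
                dp[i] = True
                break
    return dp[n]
```

With this cap the inner loop runs at most m times and each slice has length at most m, so the work is O(n * m^2) rather than growing with the square of the string length.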
- approach_name: Top-Down DP (Memoisation)
is_optimal: false
code: |
def word_break(s: str, word_dict: list[str]) -> bool:
word_set = set(word_dict)
memo = {}
def can_break(start: int) -> bool:
# Base case: reached end of string
if start == len(s):
return True
# Return cached result if available
if start in memo:
return memo[start]
# Try each possible word starting at 'start'
for end in range(start + 1, len(s) + 1):
word = s[start:end]
if word in word_set and can_break(end):
memo[start] = True
return True
# No valid segmentation found from this position
memo[start] = False
return False
return can_break(0)
explanation: |
**Time Complexity:** O(n^2 * m) — Same as bottom-up in the worst case.
**Space Complexity:** O(n) — Recursion stack plus memoisation cache.
This recursive approach with memoisation is conceptually similar to the iterative DP. We try to segment starting from index 0, and for each starting position, we try all possible first words. Memoisation prevents recomputation of subproblems. Some find this more intuitive than the bottom-up approach.
- approach_name: BFS
is_optimal: false
code: |
from collections import deque
def word_break(s: str, word_dict: list[str]) -> bool:
word_set = set(word_dict)
n = len(s)
# visited[i] = True if we've processed position i
visited = [False] * n
# BFS queue holds starting positions to explore
queue = deque([0])
while queue:
start = queue.popleft()
# Skip if already processed
if visited[start]:
continue
visited[start] = True
# Try all possible words starting at 'start'
for end in range(start + 1, n + 1):
if s[start:end] in word_set:
# Reached the end of string
if end == n:
return True
# Add next position to explore
queue.append(end)
return False
explanation: |
**Time Complexity:** O(n^2 * m) — Each position visited once, with O(n * m) work per position.
**Space Complexity:** O(n) — Visited array and queue.
BFS models the problem as a graph traversal. Each position in the string is a node. There's an edge from position i to position j if s[i:j] is a dictionary word. We perform BFS from position 0 and check if we can reach position n. This approach is particularly intuitive for those familiar with graph algorithms.

@@ -0,0 +1,273 @@
title: Word Ladder
slug: word-ladder
difficulty: hard
leetcode_id: 127
leetcode_url: https://leetcode.com/problems/word-ladder/
categories:
- strings
- graphs
- hash-tables
patterns:
- bfs
description: |
A **transformation sequence** from word `beginWord` to word `endWord` using a dictionary `wordList` is a sequence of words `beginWord -> s1 -> s2 -> ... -> sk` such that:
- Every adjacent pair of words differs by a single letter.
- Every `si` for `1 <= i <= k` is in `wordList`. Note that `beginWord` does not need to be in `wordList`.
- `sk == endWord`
Given two words, `beginWord` and `endWord`, and a dictionary `wordList`, return *the **number of words** in the **shortest transformation sequence** from `beginWord` to `endWord`, or `0` if no such sequence exists*.
constraints: |
- `1 <= beginWord.length <= 10`
- `endWord.length == beginWord.length`
- `1 <= wordList.length <= 5000`
- `wordList[i].length == beginWord.length`
- `beginWord`, `endWord`, and `wordList[i]` consist of lowercase English letters
- `beginWord != endWord`
- All the words in `wordList` are **unique**
examples:
- input: 'beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"]'
output: "5"
explanation: 'One shortest transformation sequence is "hit" -> "hot" -> "dot" -> "dog" -> "cog", which is 5 words long.'
- input: 'beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"]'
output: "0"
explanation: 'The endWord "cog" is not in wordList, therefore there is no valid transformation sequence.'
explanation:
intuition: |
Imagine each word as a node in a graph. Two nodes are connected by an edge if they differ by exactly one letter. The problem then becomes: **find the shortest path** from `beginWord` to `endWord` in this graph.
Why BFS? When searching for the shortest path in an unweighted graph (where every edge has the same "cost"), **Breadth-First Search** is the ideal algorithm. BFS explores all nodes at distance 1, then all nodes at distance 2, and so on. The first time we reach `endWord`, we are guaranteed to have found the shortest path.
Think of it like ripples spreading outward from a stone dropped in water. Starting from `beginWord`, we explore all words reachable by changing one letter. Then from each of those words, we explore their one-letter neighbours. The "ripple" that first touches `endWord` tells us the shortest transformation length.
The key insight is recognising this as a **graph shortest-path problem** disguised as a string manipulation problem. Once you see the graph structure, BFS becomes the natural choice.
approach: |
We solve this using **Breadth-First Search (BFS)** with a word set for O(1) lookups:
**Step 1: Handle early termination**
- If `endWord` is not in `wordList`, return `0` immediately since no valid transformation exists
- Convert `wordList` to a set for O(1) membership checks
&nbsp;
**Step 2: Initialise BFS data structures**
- `queue`: Contains tuples of `(current_word, transformation_length)`, starting with `(beginWord, 1)`
- `visited`: A set to track words we've already processed, preventing cycles
&nbsp;
**Step 3: Process the BFS queue**
- Dequeue the front word and its current transformation length
- If this word equals `endWord`, return the transformation length (shortest path found)
- Otherwise, generate all possible one-letter transformations
&nbsp;
**Step 4: Generate neighbour words efficiently**
- For each position in the word, try replacing it with every letter from `a` to `z`
- If the new word exists in `wordList` and hasn't been visited:
- Mark it as visited
- Add it to the queue with `length + 1`
&nbsp;
**Step 5: Return result**
- If the queue empties without finding `endWord`, return `0`
&nbsp;
This approach guarantees we find the shortest path because BFS explores all words at distance `d` before any word at distance `d+1`.
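The neighbour generation in Step 4 can be isolated into a small helper, shown here as a sketch. The name `one_letter_neighbours` is made up for this illustration, and it skips the unchanged letter so a word is never reported as its own neighbour.

```python
from string import ascii_lowercase


def one_letter_neighbours(word: str, word_set: set[str]) -> list[str]:
    found = []
    for i in range(len(word)):
        for c in ascii_lowercase:
            if c == word[i]:
                continue  # replacing a letter with itself yields the same word
            candidate = word[:i] + c + word[i + 1:]
            if candidate in word_set:
                found.append(candidate)
    return found
```

With the word list from the first example, `"hit"` has the single neighbour `"hot"`, and `"hot"` has the neighbours `"dot"` and `"lot"`, which are exactly the edges BFS follows on its first two levels.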
common_pitfalls:
- title: Using DFS Instead of BFS
description: |
DFS will find *a* path but not necessarily the *shortest* path. DFS explores one branch deeply before backtracking, so it might find a longer transformation sequence first.
For example, DFS might find `hit -> hot -> dot -> lot -> log -> cog` (6 words) and stop there, missing the 5-word sequence `hit -> hot -> dot -> dog -> cog`. On other inputs the gap can be much larger.
BFS guarantees shortest path in unweighted graphs because it explores level by level.
wrong_approach: "Use DFS with path tracking"
correct_approach: "Use BFS to guarantee shortest path"
- title: Comparing Every Word Pair (O(n^2) Neighbour Check)
description: |
A naive approach compares every word against every other word to find neighbours differing by one letter. With `n` words of length `m`, this is O(n^2 * m) just for building the graph.
Instead, for each word, generate all possible one-letter variations and check if they exist in the word set. This produces O(n * m * 26) candidates, each costing O(m) to build and hash, which is far less work than n^2 pairwise comparisons when `n` is large.
With `wordList.length <= 5000` and word length up to 10, the optimised approach does ~1.3M operations vs potentially 250M for the naive approach.
wrong_approach: "Compare every pair of words"
correct_approach: "Generate variations and check set membership"
- title: Forgetting to Check if endWord Exists
description: |
If `endWord` is not in `wordList`, no valid transformation can exist. Failing to check this upfront means BFS runs to exhaustion before returning `0`.
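Always validate inputs first: `if endWord not in word_set: return 0`.
wrong_approach: "Start BFS without checking that endWord is in the word list"
correct_approach: "Return 0 immediately when endWord is not in the word set"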
- title: Not Marking Words as Visited
description: |
Without tracking visited words, BFS can revisit the same word multiple times from different paths, leading to:
- Infinite loops in graphs with cycles
- Exponential time complexity as the same subgraphs are explored repeatedly
Mark words as visited **when adding to the queue**, not when dequeuing. This prevents adding duplicates to the queue.
wrong_approach: "Process words without tracking visited"
correct_approach: "Mark visited when enqueuing to prevent duplicates"
key_takeaways:
- "**Graph recognition**: Many string transformation problems are graph shortest-path problems in disguise. When you see 'minimum steps' or 'shortest sequence', think BFS"
- "**BFS for shortest path**: In unweighted graphs, BFS guarantees the shortest path. This is fundamental and appears in many problems"
- "**Optimise neighbour generation**: Instead of comparing all pairs, generate possible variations and check set membership. This changes O(n^2) to O(n * alphabet_size)"
- "**Foundation for Word Ladder II**: Word Ladder II (LeetCode 126) asks for all shortest transformation sequences, which requires tracking parent pointers during BFS"
time_complexity: "O(n * m * 26) set lookups, where `n` is the number of words and `m` is the word length. For each word, we generate `m * 26` variations. Building and hashing each variation costs O(m), so the character-level work is O(n * m^2 * 26)."
space_complexity: "O(n * m). The visited set and queue can each hold up to `n` words of length `m`."
solutions:
- approach_name: BFS with Set Lookup
is_optimal: true
code: |
from collections import deque
def ladder_length(begin_word: str, end_word: str, word_list: list[str]) -> int:
# Convert to set for O(1) lookups
word_set = set(word_list)
# Early termination: end_word must be reachable
if end_word not in word_set:
return 0
# BFS setup: (current_word, transformation_count)
queue = deque([(begin_word, 1)])
visited = {begin_word}
while queue:
current_word, length = queue.popleft()
# Try changing each character position
for i in range(len(current_word)):
# Try all 26 letters
for c in 'abcdefghijklmnopqrstuvwxyz':
# Build the new word with one character changed
next_word = current_word[:i] + c + current_word[i+1:]
# Found the target!
if next_word == end_word:
return length + 1
# Valid unvisited word? Add to queue
if next_word in word_set and next_word not in visited:
visited.add(next_word)
queue.append((next_word, length + 1))
# No path found
return 0
explanation: |
**Time Complexity:** O(n * m * 26) where n is the word list size and m is word length.
**Space Complexity:** O(n * m) for the visited set and queue.
BFS explores words level by level, guaranteeing the first path found to `endWord` is the shortest. We optimise neighbour finding by generating all single-character variations rather than comparing against all words.
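If the actual sequence is wanted rather than just its length, the BFS can record a parent for each discovered word and walk the parents back from `endWord`. The sketch below is an illustrative extension, not part of the original solution; the function name `shortest_ladder` and the `parent` dictionary are assumptions made for this example. It returns one shortest sequence, or an empty list if none exists.

```python
from collections import deque
from string import ascii_lowercase


def shortest_ladder(begin_word: str, end_word: str, word_list: list[str]) -> list[str]:
    word_set = set(word_list)
    if end_word not in word_set:
        return []
    # parent doubles as the visited set: a word is seen once it has a parent
    parent: dict[str, str | None] = {begin_word: None}
    queue = deque([begin_word])
    while queue:
        word = queue.popleft()
        if word == end_word:
            # Walk parent pointers back to begin_word, then reverse
            path = []
            current: str | None = word
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for i in range(len(word)):
            for c in ascii_lowercase:
                candidate = word[:i] + c + word[i + 1:]
                if candidate in word_set and candidate not in parent:
                    parent[candidate] = word
                    queue.append(candidate)
    return []
```

The length of the returned list equals the value `ladder_length` would report. Collecting every shortest sequence, as Word Ladder II requires, needs a list of parents per word instead of a single one.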
- approach_name: Bidirectional BFS
is_optimal: true
code: |
def ladder_length(begin_word: str, end_word: str, word_list: list[str]) -> int:
word_set = set(word_list)
if end_word not in word_set:
return 0
# Search from both ends simultaneously
front = {begin_word}
back = {end_word}
visited = set()
length = 1
while front and back:
# Always expand the smaller frontier for efficiency
if len(front) > len(back):
front, back = back, front
next_front = set()
for word in front:
for i in range(len(word)):
for c in 'abcdefghijklmnopqrstuvwxyz':
next_word = word[:i] + c + word[i+1:]
# Frontiers meet! Path found
if next_word in back:
return length + 1
if next_word in word_set and next_word not in visited:
visited.add(next_word)
next_front.add(next_word)
front = next_front
length += 1
return 0
explanation: |
**Time Complexity:** O(n * m * 26), but often faster in practice due to smaller search space.
**Space Complexity:** O(n * m) for the visited set and frontiers.
Bidirectional BFS searches from both `beginWord` and `endWord` simultaneously. When the two search frontiers meet, we've found the shortest path. This reduces the search space from O(b^d) to O(b^(d/2)) where b is branching factor and d is depth, providing significant speedup on large graphs.
- approach_name: BFS with Wildcard Preprocessing
is_optimal: false
code: |
from collections import deque, defaultdict
def ladder_length(begin_word: str, end_word: str, word_list: list[str]) -> int:
    if end_word not in word_list:
        return 0
    # Preprocess: group words by wildcard patterns
    # "hot" -> ["*ot", "h*t", "ho*"]
    word_len = len(begin_word)
    patterns = defaultdict(list)
    for word in word_list:
        for i in range(word_len):
            pattern = word[:i] + '*' + word[i+1:]
            patterns[pattern].append(word)
    # BFS using pattern lookup
    queue = deque([(begin_word, 1)])
    visited = {begin_word}
    while queue:
        current_word, length = queue.popleft()
        # Find neighbours through shared patterns
        for i in range(word_len):
            pattern = current_word[:i] + '*' + current_word[i+1:]
            for neighbour in patterns[pattern]:
                if neighbour == end_word:
                    return length + 1
                if neighbour not in visited:
                    visited.add(neighbour)
                    queue.append((neighbour, length + 1))
    return 0
explanation: |
**Time Complexity:** O(n * m^2) for preprocessing. The BFS also builds m patterns of length m per dequeued word, so it costs at least O(n * m^2), plus the time spent scanning each bucket.
**Space Complexity:** O(n * m^2) for the pattern dictionary.
This approach preprocesses words into "wildcard buckets" (e.g., `h*t` contains both `hot` and `hat`), so finding neighbours becomes a dictionary lookup. It trades extra memory for faster neighbour finding and works best when the word list is dense (many words share patterns).
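To make the wildcard buckets concrete, here is a minimal sketch of the preprocessing step on its own. The helper name `build_buckets` and the sample word list are assumptions for illustration, not part of the solution above.

```python
from collections import defaultdict


def build_buckets(words: list[str]) -> dict[str, list[str]]:
    """Group words by every pattern obtained by masking one character."""
    buckets = defaultdict(list)
    for word in words:
        for i in range(len(word)):
            buckets[word[:i] + "*" + word[i + 1:]].append(word)
    return buckets


buckets = build_buckets(["hot", "dot", "dog", "lot", "log", "cog"])
print(buckets["*ot"])  # ['hot', 'dot', 'lot']
print(buckets["do*"])  # ['dot', 'dog']
```

Two words are neighbours exactly when they share at least one bucket, which is why the BFS only needs to look up the `m` patterns of the current word.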

@@ -0,0 +1,274 @@
title: Word Search II
slug: word-search-ii
difficulty: hard
leetcode_id: 212
leetcode_url: https://leetcode.com/problems/word-search-ii/
categories:
- arrays
- strings
- recursion
patterns:
- trie
- backtracking
- matrix-traversal
description: |
Given an `m x n` `board` of characters and a list of strings `words`, return *all words on the board*.
Each word must be constructed from letters of sequentially adjacent cells, where **adjacent cells** are horizontally or vertically neighboring. The same letter cell may not be used more than once in a word.
constraints: |
- `m == board.length`
- `n == board[i].length`
- `1 <= m, n <= 12`
- `board[i][j]` is a lowercase English letter
- `1 <= words.length <= 3 * 10^4`
- `1 <= words[i].length <= 10`
- `words[i]` consists of lowercase English letters
- All the strings of `words` are unique
examples:
- input: 'board = [["o","a","a","n"],["e","t","a","e"],["i","h","k","r"],["i","f","l","v"]], words = ["oath","pea","eat","rain"]'
output: '["eat","oath"]'
explanation: "Both 'eat' and 'oath' can be constructed from adjacent cells on the board. 'pea' and 'rain' cannot be formed using the available paths."
- input: 'board = [["a","b"],["c","d"]], words = ["abcb"]'
output: "[]"
explanation: "The word 'abcb' would require revisiting the cell 'b', which is not allowed."
explanation:
intuition: |
Imagine you're solving a word search puzzle from a newspaper, but instead of finding one word, you need to find thousands.
The naive approach would be to run Word Search I (the single-word version) for each word in the list. But with up to 30,000 words and a board that allows paths of length 10, this becomes prohibitively slow — you'd repeat the same board traversals over and over.
The key insight is to **flip the problem around**: instead of searching for each word separately, we search the board once and check all words simultaneously. To do this efficiently, we use a **Trie (prefix tree)** to store all the words. As we explore paths on the board using DFS/backtracking, we traverse the Trie in parallel. If the current path isn't a valid prefix of any word, we can prune immediately.
Think of it like this: you're walking through a maze (the board), carrying a map of all possible destinations (the Trie). At each intersection, you check your map — if no destination lies along this path, turn back. If you reach a destination (a complete word), collect it and potentially continue (since "cat" being a word doesn't mean "cats" isn't also there).
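The "check your map" step can be sketched with a small dictionary-based Trie. The helper names `build_trie` and `is_prefix`, and the `"$"` end-of-word marker, are assumptions made for this illustration; the solution further down uses a `TrieNode` class instead.

```python
def build_trie(words: list[str]) -> dict:
    """Nested dicts keyed by character; "$" marks a complete word."""
    root: dict = {}
    for word in words:
        node = root
        for char in word:
            node = node.setdefault(char, {})
        node["$"] = word  # hypothetical end-of-word marker
    return root


def is_prefix(root: dict, path: str) -> bool:
    """Return True if some stored word starts with `path`."""
    node = root
    for char in path:
        if char not in node:
            return False  # dead end: no word continues this way
        node = node[char]
    return True


trie = build_trie(["oath", "pea", "eat", "rain"])
print(is_prefix(trie, "oa"))  # True
print(is_prefix(trie, "ob"))  # False
```

During the board search the same check happens one character at a time, so a path such as `o` then `b` is abandoned as soon as `b` is read.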
approach: |
We solve this using a **Trie + Backtracking** approach:
**Step 1: Build the Trie**
- Create a Trie data structure and insert all words from the input list
- Each node stores children (a dictionary mapping characters to nodes) and an optional word marker
- Store the complete word at terminal nodes for easy retrieval when found
&nbsp;
**Step 2: Set up the backtracking search**
- Iterate through each cell `(i, j)` on the board as a potential starting point
- Only start DFS if the cell's character exists in the Trie root's children
&nbsp;
**Step 3: DFS with Trie navigation**
- At each cell, check if the current character exists in the current Trie node's children
- If not, return immediately (pruning)
- If yes, move to that Trie child node and continue exploring
- Mark the cell as visited (temporarily replace with `#`) to prevent reuse in the same path
- Explore all four directions: up, down, left, right
- Restore the cell's original character when backtracking
&nbsp;
**Step 4: Collect found words**
- When a Trie node contains a complete word, add it to the result set
- Remove the word from the Trie (set the word marker to None) to avoid duplicates
- Continue exploring since longer words may share this prefix
&nbsp;
**Step 5: Optimisation — Trie pruning**
- After finding a word, if a Trie node has no children and no word, we can remove it
- This progressively shrinks the Trie as words are found, speeding up later searches
&nbsp;
**Step 6: Return results**
- Return the list of all found words
common_pitfalls:
- title: Running Word Search I for Each Word
description: |
The most intuitive approach is to reuse the Word Search I solution for each word:
```python
for word in words:
    if exists_on_board(board, word):
        result.append(word)
```
With `k` words of average length `L` and a board of size `m × n`, this gives **O(k × m × n × 4^L)** time complexity. For the maximum constraints (`k = 30,000`, `m = n = 12`, `L = 10`), that bound is 30,000 × 144 × 4^10 ≈ 4.5 × 10^12 operations, which is far too slow.
The Trie approach searches all words simultaneously, reducing this to roughly **O(m × n × 4^L)** with effective pruning.
wrong_approach: "Iterate through words and search each separately"
correct_approach: "Build a Trie and search all words in one board traversal"
- title: Forgetting to Handle Duplicates
description: |
The same word might be findable via multiple paths on the board. For example, "aba" might appear both horizontally and vertically (diagonal moves are not allowed).
Without proper handling, you'll add duplicates to your result. The solution is to either:
- Use a set for results
- Remove the word from the Trie after finding it (preferred, as it also improves performance)
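To see how easily duplicates arise, the sketch below counts the distinct paths that spell a word using only horizontal and vertical moves. The helper `count_paths` and the 2 × 2 board are assumptions for illustration.

```python
def count_paths(board: list[list[str]], word: str) -> int:
    """Count distinct paths on the board that spell `word`."""
    rows, cols = len(board), len(board[0])

    def dfs(row: int, col: int, idx: int) -> int:
        if board[row][col] != word[idx]:
            return 0
        if idx == len(word) - 1:
            return 1
        # Mark the cell as used, explore, then restore it
        saved, board[row][col] = board[row][col], "#"
        total = 0
        for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            r, c = row + dr, col + dc
            if 0 <= r < rows and 0 <= c < cols:
                total += dfs(r, c, idx + 1)
        board[row][col] = saved
        return total

    return sum(dfs(r, c, 0) for r in range(rows) for c in range(cols))


board = [["a", "b"], ["b", "a"]]
print(count_paths(board, "ab"))  # 4
```

A search that appends the word every time a path completes would report "ab" four times on this board, which is why the word is cleared from the Trie (or collected in a set) after the first hit.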
wrong_approach: "Append found words to a list without deduplication"
correct_approach: "Remove word from Trie after finding, or use a result set"
- title: Not Restoring Board State
description: |
When marking cells as visited during DFS, you must restore them when backtracking. A common bug is forgetting to restore, which corrupts the board for other paths.
```python
# Wrong: board[i][j] stays as '#' for other searches
board[i][j] = '#'
dfs(...)
# Missing: board[i][j] = original_char
```
Always restore the cell after exploring all directions from it.
wrong_approach: "Mark visited without restoration"
correct_approach: "Save original character, mark as '#', restore after DFS returns"
- title: Missing Trie Pruning Optimisation
description: |
Without pruning empty Trie branches, the Trie structure remains full even as words are found. This means the algorithm keeps checking paths that can no longer lead to any words.
For example, after finding "oath", if no other words start with "oat" or "oa" or "o", we should remove those nodes to avoid exploring "o..." prefixes again.
This optimisation can significantly improve average-case performance.
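The pruning idea can be sketched as a standalone helper. This is a simplified variant under stated assumptions: `insert` and `remove_and_prune` are hypothetical names, and the full solution performs the same deletion inline while backtracking rather than in a separate pass.

```python
class TrieNode:
    def __init__(self) -> None:
        self.children = {}  # char -> TrieNode
        self.word = None    # complete word stored at terminal nodes


def insert(root: TrieNode, word: str) -> None:
    node = root
    for char in word:
        node = node.children.setdefault(char, TrieNode())
    node.word = word


def remove_and_prune(root: TrieNode, word: str) -> None:
    """Unmark `word`, then delete nodes left with no children and no word."""
    path = [root]
    for char in word:
        path.append(path[-1].children[char])
    path[-1].word = None
    # Walk from the last node back towards the root
    for char, parent, node in zip(
        reversed(word), reversed(path[:-1]), reversed(path[1:])
    ):
        if node.children or node.word:
            break  # still needed by another word
        del parent.children[char]


root = TrieNode()
insert(root, "oath")
insert(root, "oak")
remove_and_prune(root, "oath")
print(list(root.children["o"].children["a"].children))  # ['k']
remove_and_prune(root, "oak")
print(root.children)  # {}
```

Once both words are found the Trie is empty, so no cell passes the `in root.children` check and the remaining board cells are skipped without any DFS.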
wrong_approach: "Keep the full Trie structure throughout"
correct_approach: "Remove childless, wordless nodes after finding words"
key_takeaways:
- "**Trie for multi-pattern search**: When searching for many patterns in the same data, a Trie lets you check all patterns simultaneously rather than iterating through each"
- "**Prune early, prune often**: The power of the Trie approach comes from pruning — rejecting paths as soon as they can't lead to any word"
- "**Backtracking template**: Mark visited → explore all directions → restore state. This pattern appears in many grid/graph problems"
- "**Optimise the Trie dynamically**: Removing found words and empty branches prevents redundant work and can dramatically improve performance"
time_complexity: "O(m × n × 4^L) where `m × n` is the board size and `L` is the maximum word length. Each cell can be a starting point, and from each cell we explore up to 4 directions for up to `L` steps. A tighter bound is O(m × n × 4 × 3^(L-1)), since after the first step a path cannot return to the cell it just left. The Trie pruning makes this much faster in practice."
space_complexity: "O(N) where `N` is the total number of characters across all words, for storing the Trie. The recursion stack adds O(L) for the maximum word length. The board modification for visited marking is O(1) extra space."
solutions:
- approach_name: Trie + Backtracking
is_optimal: true
code: |
class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode
        self.word = None  # Stores complete word at terminal nodes
class Solution:
    def findWords(self, board: list[list[str]], words: list[str]) -> list[str]:
        # Step 1: Build the Trie from all words
        root = TrieNode()
        for word in words:
            node = root
            for char in word:
                if char not in node.children:
                    node.children[char] = TrieNode()
                node = node.children[char]
            node.word = word  # Mark end of word
        result = []
        rows, cols = len(board), len(board[0])
        def backtrack(row: int, col: int, parent: TrieNode) -> None:
            char = board[row][col]
            node = parent.children[char]
            # Found a word: add to result and remove from Trie
            if node.word:
                result.append(node.word)
                node.word = None  # Prevent duplicates
            # Mark cell as visited
            board[row][col] = '#'
            # Explore all four directions
            for dr, dc in [(0, 1), (1, 0), (0, -1), (-1, 0)]:
                new_row, new_col = row + dr, col + dc
                # Check bounds and if next char exists in Trie
                if (0 <= new_row < rows and 0 <= new_col < cols
                        and board[new_row][new_col] in node.children):
                    backtrack(new_row, new_col, node)
            # Restore cell for other paths
            board[row][col] = char
            # Optimisation: prune empty branches from Trie
            if not node.children:
                del parent.children[char]
        # Step 2: Start DFS from each cell
        for i in range(rows):
            for j in range(cols):
                if board[i][j] in root.children:
                    backtrack(i, j, root)
        return result
explanation: |
**Time Complexity:** O(m × n × 4^L) — We potentially start from each cell and explore paths up to length L, with 4 directions at each step. Trie pruning significantly reduces this in practice.
**Space Complexity:** O(N) — The Trie stores all characters from all words. Recursion stack adds O(L).
This solution combines three techniques: a Trie for efficient prefix matching, backtracking for exploring all valid paths, and progressive pruning to eliminate dead branches. The key insight is that we traverse the Trie and board simultaneously, allowing us to prune paths that can't possibly lead to any word.
- approach_name: Brute Force (Word Search I per word)
is_optimal: false
code: |
class Solution:
    def findWords(self, board: list[list[str]], words: list[str]) -> list[str]:
        rows, cols = len(board), len(board[0])
        result = []
        def search_word(word: str) -> bool:
            """Search for a single word on the board."""
            def dfs(row: int, col: int, idx: int) -> bool:
                # Found complete word
                if idx == len(word):
                    return True
                # Check bounds and character match
                if (row < 0 or row >= rows or col < 0 or col >= cols
                        or board[row][col] != word[idx]):
                    return False
                # Mark visited
                temp = board[row][col]
                board[row][col] = '#'
                # Explore all directions
                found = (dfs(row + 1, col, idx + 1) or
                         dfs(row - 1, col, idx + 1) or
                         dfs(row, col + 1, idx + 1) or
                         dfs(row, col - 1, idx + 1))
                # Restore cell
                board[row][col] = temp
                return found
            # Try starting from each cell
            for i in range(rows):
                for j in range(cols):
                    if dfs(i, j, 0):
                        return True
            return False
        # Search for each word separately
        for word in words:
            if search_word(word):
                result.append(word)
        return result
explanation: |
**Time Complexity:** O(k × m × n × 4^L) — For each of k words, we potentially explore all cells and paths.
**Space Complexity:** O(L) — Recursion stack depth equals maximum word length.
This approach applies the Word Search I solution to each word independently. While correct, it's extremely slow for large word lists because it repeats board traversals and doesn't share work between words with common prefixes. Included to illustrate why the Trie optimisation is essential.