questions S-W

2025-05-30 19:18:33 +01:00
parent ddceeec07e
commit 041a877295
46 changed files with 9696 additions and 0 deletions

@@ -0,0 +1,219 @@
title: Subsets II
slug: subsets-ii
difficulty: medium
leetcode_id: 90
leetcode_url: https://leetcode.com/problems/subsets-ii/
categories:
- arrays
- sorting
- recursion
patterns:
- backtracking
description: |
Given an integer array `nums` that may contain duplicates, return *all possible subsets* (the power set).
The solution set **must not** contain duplicate subsets. Return the solution in **any order**.
constraints: |
- `1 <= nums.length <= 10`
- `-10 <= nums[i] <= 10`
examples:
- input: "nums = [1,2,2]"
output: "[[],[1],[1,2],[1,2,2],[2],[2,2]]"
explanation: "The array contains a duplicate 2. We generate all unique subsets, avoiding duplicates like having two separate [2] subsets."
- input: "nums = [0]"
output: "[[],[0]]"
explanation: "With a single element, we have two subsets: the empty set and the set containing just that element."
explanation:
intuition: |
This problem extends the classic Subsets problem to handle **duplicate elements**. Think of it like selecting items from a box where some items are identical — you want to count each unique selection only once, even if there are multiple copies of the same item.
Imagine you have three marbles: one red and two blue. The possible selections are: nothing, red only, one blue, two blues, red with one blue, or red with both blues. Notice that "one blue" appears only once in our answer, even though there are two blue marbles we could pick. The *identity* of which blue marble we chose doesn't matter — only how many.
The key insight is that **sorting brings duplicates together**, making them easy to handle as a group. The array `[1, 2, 2]` is already sorted, so the two `2`s sit next to each other. When a loop reaches the second `2` right after trying the first `2` in the same position, every subset that takes a `2` at that position has already been explored, so the second `2` is only worth adding on top of the first one (giving the subsets with two `2`s).
By skipping duplicate elements that would start redundant branches, we prune the decision tree and generate only unique subsets.
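The "only how many" idea can be sanity-checked with a quick count. This is an illustrative sketch rather than part of the solution, and `count_unique_subsets` is a hypothetical helper name: for each distinct value that appears `c` times we may take anywhere from 0 to `c` copies, so the number of unique subsets is the product of `c + 1` over the distinct values.

```python
from collections import Counter
from math import prod


def count_unique_subsets(nums: list[int]) -> int:
    """Count unique subsets: each distinct value contributes (copies + 1) choices."""
    return prod(count + 1 for count in Counter(nums).values())


# One red marble and two blue marbles: (1 + 1) * (2 + 1) selections
print(count_unique_subsets([1, 2, 2]))  # 6
```

This matches the six selections listed for the marbles above.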
approach: |
We solve this using **Backtracking with Duplicate Skipping**:
**Step 1: Sort the input array**
- Sorting groups duplicates together: `[2, 1, 2]` becomes `[1, 2, 2]`
- This is essential for our skip logic to work — we need duplicates to be adjacent
&nbsp;
**Step 2: Set up backtracking state**
- `result`: List to collect all unique subsets
- `current`: The subset being built
- `backtrack(index, current)`: Recursive function where `index` is the starting position for choosing next elements and `current` is the subset built so far
&nbsp;
**Step 3: Define the backtracking function**
- First, add a copy of `current` to `result` (every path represents a valid subset)
- Then, iterate through remaining elements from `index` to `len(nums) - 1`
- For each element at position `i`:
  - **Skip duplicates**: If `i > index` and `nums[i] == nums[i-1]`, skip this element
  - Otherwise, add `nums[i]` to `current`, recurse with `i + 1`, then backtrack (remove the element)
&nbsp;
**Step 4: The duplicate skipping logic**
- The condition `i > index and nums[i] == nums[i-1]` means: "this element equals the previous one, and we're past the starting point for this level"
- When `i == index`, we must consider the element (it's our first choice at this level)
- When `i > index` and `nums[i] == nums[i-1]`, the previous iteration at this level already explored every subset that places this value in the current slot
- Skipping prevents generating the same subset through different paths
&nbsp;
**Step 5: Return all collected subsets**
- Start with `backtrack(0, [])` and return `result`
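&nbsp;
To see where the skip condition actually fires, the steps above can be instrumented. The following is a sketch for illustration only: `subsets_with_trace` and the `skipped` list are hypothetical names, and the function sorts a copy instead of mutating the input.

```python
def subsets_with_trace(
    nums: list[int],
) -> tuple[list[list[int]], list[tuple[int, int]]]:
    """Return the unique subsets plus every (index, i) pair where a branch was skipped."""
    nums = sorted(nums)  # Step 1: group duplicates together
    result: list[list[int]] = []
    skipped: list[tuple[int, int]] = []

    def backtrack(index: int, current: list[int]) -> None:
        result.append(current[:])  # Step 3: every path is a valid subset
        for i in range(index, len(nums)):
            if i > index and nums[i] == nums[i - 1]:
                skipped.append((index, i))  # Step 4: duplicate at this level
                continue
            current.append(nums[i])
            backtrack(i + 1, current)
            current.pop()

    backtrack(0, [])
    return result, skipped


subsets, skipped = subsets_with_trace([1, 2, 2])
print(subsets)  # [[], [1], [1, 2], [1, 2, 2], [2], [2, 2]]
print(skipped)  # [(1, 2), (0, 2)]
```

Both skips happen at `i = 2` (the second `2`), once while extending `[1]` and once while extending `[]`. When the call starts at `index = 2`, the same element is taken, which is how `[1, 2, 2]` and `[2, 2]` are produced.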
common_pitfalls:
- title: Forgetting to Sort
description: |
The duplicate-skipping logic relies on duplicates being adjacent. Without sorting, the skip condition `nums[i] == nums[i-1]` won't catch all duplicates.
For example, `[2, 1, 2]` unsorted has duplicates separated. The condition would miss them, producing duplicate subsets `[2]` from index 0 and index 2.
Always sort first: `nums.sort()` before starting backtracking.
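A small experiment makes the failure visible. This is a sketch, and `subsets_skip_adjacent` with its `sort_first` flag is a hypothetical helper introduced only to compare the two behaviours:

```python
def subsets_skip_adjacent(nums: list[int], sort_first: bool) -> list[list[int]]:
    """Backtracking with the adjacent-duplicate skip, with sorting made optional."""
    values = sorted(nums) if sort_first else list(nums)
    result: list[list[int]] = []

    def backtrack(index: int, current: list[int]) -> None:
        result.append(current[:])
        for i in range(index, len(values)):
            if i > index and values[i] == values[i - 1]:
                continue
            current.append(values[i])
            backtrack(i + 1, current)
            current.pop()

    backtrack(0, [])
    return result


unsorted_result = subsets_skip_adjacent([2, 1, 2], sort_first=False)
sorted_result = subsets_skip_adjacent([2, 1, 2], sort_first=True)
print(len(unsorted_result), unsorted_result.count([2]))  # 8 2
print(len(sorted_result), sorted_result.count([2]))      # 6 1
```

Without sorting, the skip never fires for `[2, 1, 2]`, so all 8 index combinations are emitted and `[2]` appears twice.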
wrong_approach: "Skip duplicates without sorting"
correct_approach: "Sort first, then skip adjacent duplicates"
- title: Wrong Skip Condition Index Check
description: |
A common mistake is using `i > 0` instead of `i > index`:
```python
if i > 0 and nums[i] == nums[i-1]:  # Wrong
    continue
if i > index and nums[i] == nums[i-1]:  # Correct
    continue
```
The condition must be `i > index` because we're checking if we've already made this choice *at the current recursion level*. Using `i > 0` would incorrectly skip valid subsets that legitimately include duplicate elements.
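To see what gets lost, here is a deliberately incorrect sketch that uses `i > 0` (the name `subsets_wrong_skip` is hypothetical):

```python
def subsets_wrong_skip(nums: list[int]) -> list[list[int]]:
    """Intentionally buggy: skips every repeated value, at any recursion depth."""
    nums = sorted(nums)
    result: list[list[int]] = []

    def backtrack(index: int, current: list[int]) -> None:
        result.append(current[:])
        for i in range(index, len(nums)):
            if i > 0 and nums[i] == nums[i - 1]:  # Bug: should compare with index
                continue
            current.append(nums[i])
            backtrack(i + 1, current)
            current.pop()

    backtrack(0, [])
    return result


print(subsets_wrong_skip([1, 2, 2]))  # [[], [1], [1, 2], [2]]
```

The second `2` is never chosen, so `[1, 2, 2]` and `[2, 2]` are missing from the output.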
wrong_approach: "Use i > 0 in the skip condition"
correct_approach: "Use i > index to check within the current level"
- title: Using a Set for Deduplication
description: |
You might think "just use a set to store results and remove duplicates". While this works, it's inefficient:
- Converting lists to tuples for set storage has overhead
- You generate duplicate subsets only to discard them
- With many duplicates, you waste significant computation
For input `[1,1,1,1,1,1,1,1,1,1]` (10 identical elements), the naive approach generates 2^10 = 1024 subsets but only 11 are unique. The pruning approach generates exactly 11.
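The figures above can be reproduced by counting recursive calls, since each call emits exactly one subset. This is a sketch; `count_calls` and its `prune` flag are hypothetical names used only for the comparison:

```python
def count_calls(nums: list[int], prune: bool) -> int:
    """Count backtrack calls (one per generated subset), with or without pruning."""
    nums = sorted(nums)
    calls = 0

    def backtrack(index: int) -> None:
        nonlocal calls
        calls += 1
        for i in range(index, len(nums)):
            if prune and i > index and nums[i] == nums[i - 1]:
                continue
            backtrack(i + 1)

    backtrack(0)
    return calls


ten_ones = [1] * 10
print(count_calls(ten_ones, prune=False))  # 1024
print(count_calls(ten_ones, prune=True))   # 11
```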
wrong_approach: "Generate all subsets, deduplicate with a set"
correct_approach: "Prune duplicate branches during backtracking"
- title: Forgetting to Copy the Subset
description: |
When adding to results, use `result.append(current[:])` or `result.append(list(current))`, not `result.append(current)`.
The `current` list is mutated during backtracking. If you append the reference directly, all entries in `result` will point to the same (eventually empty) list.
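The aliasing problem is easy to reproduce. The sketch below is deliberately broken, and `subsets_without_copy` is a hypothetical name:

```python
def subsets_without_copy(nums: list[int]) -> list[list[int]]:
    """Intentionally buggy: appends the shared list instead of a copy."""
    nums = sorted(nums)
    result: list[list[int]] = []
    current: list[int] = []

    def backtrack(index: int) -> None:
        result.append(current)  # Bug: stores a reference, not a snapshot
        for i in range(index, len(nums)):
            if i > index and nums[i] == nums[i - 1]:
                continue
            current.append(nums[i])
            backtrack(i + 1)
            current.pop()

    backtrack(0)
    return result


broken = subsets_without_copy([1, 2, 2])
print(broken)  # [[], [], [], [], [], []]
print(all(entry is broken[0] for entry in broken))  # True
```

The count of entries is right (six), but every entry is the same list object, which is empty once backtracking has finished.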
key_takeaways:
- "**Sorting enables duplicate detection**: Bringing duplicates together lets you identify and skip them with a simple `nums[i] == nums[i-1]` check"
- "**The index matters in the skip condition**: Use `i > index` (not `i > 0`) to only skip duplicates at the *current recursion level*, not legitimate uses of duplicate values deeper in the tree"
- "**Prune early, not late**: Avoiding duplicate work during generation is far more efficient than deduplicating results afterward"
- "**Extends the Subsets pattern**: This is the same backtracking template as Subsets, with just one additional line for duplicate handling — a powerful reminder that small tweaks can adapt patterns to new constraints"
time_complexity: "O(n * 2^n). In the worst case (all unique elements), we generate 2^n subsets, each taking O(n) to copy. With duplicates, the actual count is lower, but the upper bound remains O(n * 2^n)."
space_complexity: "O(n). The recursion depth is at most n, and the `current` list holds at most n elements. The output space for storing subsets is not counted as auxiliary space."
solutions:
- approach_name: Backtracking with Duplicate Skipping
is_optimal: true
code: |
def subsets_with_dup(nums: list[int]) -> list[list[int]]:
    result = []
    nums.sort()  # Sort to bring duplicates together

    def backtrack(index: int, current: list[int]) -> None:
        # Every path is a valid subset, add a copy
        result.append(current[:])
        for i in range(index, len(nums)):
            # Skip duplicate values at the same recursion level
            if i > index and nums[i] == nums[i - 1]:
                continue
            # Choose: add nums[i] to current subset
            current.append(nums[i])
            # Explore: recurse to consider elements after i
            backtrack(i + 1, current)
            # Unchoose: backtrack to try other options
            current.pop()

    backtrack(0, [])
    return result
explanation: |
**Time Complexity:** O(n * 2^n) — We generate up to 2^n subsets (fewer with duplicates), each requiring O(n) to copy.
**Space Complexity:** O(n) — Recursion stack depth is at most n, plus the `current` list of size n.
The key optimisation is the duplicate skip, `if i > index and nums[i] == nums[i - 1]: continue`. After sorting, duplicates are adjacent. When we encounter a value that equals the previous one *at the same recursion level* (`i > index`), we skip it because all subsets starting with this value have already been explored when we processed its predecessor.
- approach_name: Iterative with Duplicate Handling
is_optimal: true
code: |
def subsets_with_dup(nums: list[int]) -> list[list[int]]:
    nums.sort()  # Sort to group duplicates
    result = [[]]  # Start with empty subset
    start = 0  # Track where new subsets begin
    for i in range(len(nums)):
        # If current element is a duplicate, only extend
        # subsets added in the previous iteration
        if i > 0 and nums[i] == nums[i - 1]:
            new_subsets = [subset + [nums[i]] for subset in result[start:]]
        else:
            # For new elements, extend all existing subsets
            new_subsets = [subset + [nums[i]] for subset in result]
        start = len(result)  # Mark where these new subsets start
        result.extend(new_subsets)
    return result
explanation: |
**Time Complexity:** O(n * 2^n) — Same as backtracking approach.
**Space Complexity:** O(1) auxiliary beyond the output and the sort's working space. There is no recursion stack, and the lists in the temporary `new_subsets` all end up in `result`.
This iterative approach builds subsets incrementally. For each new element, we extend existing subsets by adding the element. The key insight for duplicates: when we see a repeated value, we only extend subsets that were *just* created in the previous round (tracked by `start`). This prevents creating duplicate subsets like `[2]` from both the first and second occurrence of `2`.
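The role of `start` is easiest to follow round by round. The following sketch records a snapshot of `result` after each element; `iterative_rounds` is a hypothetical helper that mirrors the loop above but sorts a copy of the input:

```python
def iterative_rounds(nums: list[int]) -> list[list[list[int]]]:
    """Return a snapshot of the result list after processing each element."""
    nums = sorted(nums)
    result: list[list[int]] = [[]]
    start = 0
    snapshots: list[list[list[int]]] = []
    for i in range(len(nums)):
        is_duplicate = i > 0 and nums[i] == nums[i - 1]
        # Duplicates extend only the subsets created in the previous round
        base = result[start:] if is_duplicate else result
        new_subsets = [subset + [nums[i]] for subset in base]
        start = len(result)  # Updated every round, not only for new values
        result.extend(new_subsets)
        snapshots.append([subset[:] for subset in result])
    return snapshots


for round_number, snapshot in enumerate(iterative_rounds([1, 2, 2]), start=1):
    print(round_number, snapshot)
# 1 [[], [1]]
# 2 [[], [1], [2], [1, 2]]
# 3 [[], [1], [2], [1, 2], [2, 2], [1, 2, 2]]
```

In round 3 only `[2]` and `[1, 2]` (the subsets added in round 2) are extended, so `[2]` and `[1, 2]` are not generated a second time.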
- approach_name: Brute Force with Set Deduplication
is_optimal: false
code: |
def subsets_with_dup(nums: list[int]) -> list[list[int]]:
    result_set = set()

    def backtrack(index: int, current: tuple) -> None:
        # Add current subset as a sorted tuple (for set comparison)
        result_set.add(current)
        for i in range(index, len(nums)):
            # Extend current subset with nums[i]
            backtrack(i + 1, tuple(sorted(current + (nums[i],))))

    backtrack(0, ())
    # Convert tuples back to lists
    return [list(subset) for subset in result_set]
explanation: |
**Time Complexity:** O(n * 2^n * log(n)). It generates 2^n subsets, and sorting each one for comparison costs O(n log n).
**Space Complexity:** O(n * 2^n) — Stores all unique subsets in a set.
This naive approach generates all possible subsets and relies on a set to remove duplicates. While correct, it's inefficient because it generates duplicate subsets only to discard them. For input with many duplicates like `[1,1,1,1,1]`, it generates 32 subsets but only 6 are unique. The optimised approaches avoid this wasted work.