questions S-W

2025-05-30 19:18:33 +01:00
parent ddceeec07e
commit 041a877295
46 changed files with 9696 additions and 0 deletions
--- a/backend/data/questions/subsets-ii.yaml
+++ b/backend/data/questions/subsets-ii.yaml
@@ -0,0 +1,219 @@
+title: Subsets II
+slug: subsets-ii
+difficulty: medium
+leetcode_id: 90
+leetcode_url: https://leetcode.com/problems/subsets-ii/
+categories:
+  - arrays
+  - sorting
+  - recursion
+patterns:
+  - backtracking
+
+description: |
+  Given an integer array `nums` that may contain duplicates, return *all possible subsets* (the power set).
+
+  The solution set **must not** contain duplicate subsets. Return the solution in **any order**.
+
+constraints: |
+  - `1 <= nums.length <= 10`
+  - `-10 <= nums[i] <= 10`
+
+examples:
+  - input: "nums = [1,2,2]"
+    output: "[[],[1],[1,2],[1,2,2],[2],[2,2]]"
+    explanation: "The array contains a duplicate 2. We generate all unique subsets, avoiding duplicates like having two separate [2] subsets."
+  - input: "nums = [0]"
+    output: "[[],[0]]"
+    explanation: "With a single element, we have two subsets: the empty set and the set containing just that element."
+
+explanation:
+  intuition: |
+    This problem extends the classic Subsets problem to handle **duplicate elements**. Think of it like selecting items from a box where some items are identical — you want to count each unique selection only once, even if there are multiple copies of the same item.
+
+    Imagine you have three marbles: one red and two blue. The possible selections are: nothing, red only, one blue, two blues, red with one blue, or red with both blues. Notice that "one blue" appears only once in our answer, even though there are two blue marbles we could pick. The *identity* of which blue marble we chose doesn't matter — only how many.
+
+    The key insight is that **sorting brings duplicates together**, making them easy to handle as a group. After sorting, `[1, 2, 2]` stays as `[1, 2, 2]`, and we can see the two `2`s are adjacent. When we're at the second `2`, we know we've already explored all subsets that include "one `2`", so from this point we should only explore subsets that include "two `2`s".
+
+    By skipping duplicate elements that would start redundant branches, we prune the decision tree and generate only unique subsets.
+
+  approach: |
+    We solve this using **Backtracking with Duplicate Skipping**:
+
+    **Step 1: Sort the input array**
+
+    - Sorting groups duplicates together: `[2, 1, 2]` becomes `[1, 2, 2]`
+    - This is essential for our skip logic to work — we need duplicates to be adjacent
+
+    &nbsp;
+
+    **Step 2: Set up backtracking state**
+
+    - `result`: List to collect all unique subsets
+    - `current`: The subset being built
+    - `backtrack(index, current)`: Recursive function where `index` is the starting position for choosing next elements and `current` is the subset built so far
+
+    &nbsp;
+
+    **Step 3: Define the backtracking function**
+
+    - First, add a copy of `current` to `result` (every path represents a valid subset)
+    - Then, iterate through remaining elements from `index` to `len(nums) - 1`
+    - For each element at position `i`:
+      - **Skip duplicates**: If `i > index` and `nums[i] == nums[i-1]`, skip this element
+      - Otherwise, add `nums[i]` to `current`, recurse with `i + 1`, then backtrack (remove the element)
+
+    &nbsp;
+
+    **Step 4: The duplicate skipping logic**
+
+    - The condition `i > index and nums[i] == nums[i-1]` means: "this element equals the previous one, and we're past the starting point for this level"
+    - When `i == index`, we must consider the element (it's our first choice at this level)
+    - When `i > index` and it's a duplicate, the previous iteration at this level (position `i - 1`) has already explored every subset that adds this value here
+    - Skipping prevents generating the same subset through different paths
+
+    &nbsp;
+
+    **Step 5: Return all collected subsets**
+
+    - Start with `backtrack(0, [])` and return `result`
+
+  common_pitfalls:
+    - title: Forgetting to Sort
+      description: |
+        The duplicate-skipping logic relies on duplicates being adjacent. Without sorting, the skip condition `nums[i] == nums[i-1]` won't catch all duplicates.
+
+        For example, `[2, 1, 2]` unsorted has duplicates separated. The condition would miss them, producing duplicate subsets `[2]` from index 0 and index 2.
+
+        Always sort first: `nums.sort()` before starting backtracking.
+      wrong_approach: "Skip duplicates without sorting"
+      correct_approach: "Sort first, then skip adjacent duplicates"
+
+    - title: Wrong Skip Condition Index Check
+      description: |
+        A common mistake is using `i > 0` instead of `i > index`:
+
+        ```python
+        if i > 0 and nums[i] == nums[i-1]:  # Wrong
+            continue
+
+        if i > index and nums[i] == nums[i-1]:  # Correct
+            continue
+        ```
+
+        The condition must be `i > index` because we're checking if we've already made this choice *at the current recursion level*. Using `i > 0` would incorrectly skip valid subsets that legitimately include duplicate elements.
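+
+        As a rough check of that claim, the sketch below runs the same backtracking loop with either lower bound on `[1, 2, 2]`. The function name `subsets_with_bound` and its `use_index` flag are hypothetical, introduced only for this comparison:
+
+        ```python
+        def subsets_with_bound(nums: list[int], use_index: bool) -> list[list[int]]:
+            nums = sorted(nums)
+            result: list[list[int]] = []
+
+            def backtrack(index: int, current: list[int]) -> None:
+                result.append(current[:])
+                for i in range(index, len(nums)):
+                    # Lower bound of the skip check: `index` (correct) or 0 (wrong)
+                    bound = index if use_index else 0
+                    if i > bound and nums[i] == nums[i - 1]:
+                        continue
+                    current.append(nums[i])
+                    backtrack(i + 1, current)
+                    current.pop()
+
+            backtrack(0, [])
+            return result
+
+
+        correct = subsets_with_bound([1, 2, 2], use_index=True)
+        wrong = subsets_with_bound([1, 2, 2], use_index=False)
+        # Subsets that the `i > 0` variant fails to produce
+        missing = [s for s in correct if s not in wrong]
+        ```
+
+        With `i > 0`, the second `2` is never chosen, so `[1, 2, 2]` and `[2, 2]` end up in `missing`.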
+      wrong_approach: "Use i > 0 in the skip condition"
+      correct_approach: "Use i > index to check within the current level"
+
+    - title: Using a Set for Deduplication
+      description: |
+        You might think "just use a set to store results and remove duplicates". While this works, it's inefficient:
+        - Converting lists to tuples for set storage has overhead
+        - You generate duplicate subsets only to discard them
+        - With many duplicates, you waste significant computation
+
+        For input `[1,1,1,1,1,1,1,1,1,1]` (10 identical elements), the naive approach generates 2^10 = 1024 subsets but only 11 are unique. The pruning approach generates exactly 11.
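+
+        A small sketch of that count, assuming hypothetical helper names `count_naive` and `count_pruned` that are not part of the solutions in this file:
+
+        ```python
+        from itertools import combinations
+
+
+        def count_naive(nums: list[int]) -> int:
+            # Every index combination is generated, duplicates included
+            return sum(1 for r in range(len(nums) + 1) for _ in combinations(nums, r))
+
+
+        def count_pruned(nums: list[int]) -> int:
+            nums = sorted(nums)
+            count = 0
+
+            def backtrack(index: int) -> None:
+                nonlocal count
+                count += 1  # one subset is recorded per call
+                for i in range(index, len(nums)):
+                    if i > index and nums[i] == nums[i - 1]:
+                        continue  # pruned: this branch would repeat a subset
+                    backtrack(i + 1)
+
+            backtrack(0)
+            return count
+
+
+        ones = [1] * 10
+        naive_total, pruned_total = count_naive(ones), count_pruned(ones)
+        ```
+
+        Here `naive_total` is 1024 and `pruned_total` is 11, matching the figures above.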
+      wrong_approach: "Generate all subsets, deduplicate with a set"
+      correct_approach: "Prune duplicate branches during backtracking"
+
+    - title: Forgetting to Copy the Subset
+      description: |
+        When adding to results, use `result.append(current[:])` or `result.append(list(current))`, not `result.append(current)`.
+
+        The `current` list is mutated during backtracking. If you append the reference directly, all entries in `result` will point to the same (eventually empty) list.
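+
+        The aliasing effect can be seen in isolation with a minimal sketch (the variable names are illustrative only):
+
+        ```python
+        current: list[int] = []
+        aliased: list[list[int]] = []  # stores references to `current`
+        copied: list[list[int]] = []   # stores snapshots of `current`
+
+        for value in [1, 2]:
+            current.append(value)
+            aliased.append(current)     # same list object every time
+            copied.append(current[:])   # independent copy
+
+        # Backtracking eventually pops everything off `current`
+        current.clear()
+        ```
+
+        Afterwards `aliased` holds two references to the same now-empty list, while `copied` still holds `[1]` and `[1, 2]`.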
+
+  key_takeaways:
+    - "**Sorting enables duplicate detection**: Bringing duplicates together lets you identify and skip them with a simple `nums[i] == nums[i-1]` check"
+    - "**The index matters in the skip condition**: Use `i > index` (not `i > 0`) to only skip duplicates at the *current recursion level*, not legitimate uses of duplicate values deeper in the tree"
+    - "**Prune early, not late**: Avoiding duplicate work during generation is far more efficient than deduplicating results afterward"
+    - "**Extends the Subsets pattern**: This is the same backtracking template as Subsets, with just one additional line for duplicate handling — a powerful reminder that small tweaks can adapt patterns to new constraints"
+
+  time_complexity: "O(n * 2^n). In the worst case (all unique elements), we generate 2^n subsets, each taking O(n) to copy. With duplicates, the actual count is lower, but the upper bound remains O(n * 2^n)."
+  space_complexity: "O(n). The recursion depth is at most n, and the `current` list holds at most n elements. The output space for storing subsets is not counted as auxiliary space."
+
+solutions:
+  - approach_name: Backtracking with Duplicate Skipping
+    is_optimal: true
+    code: |
+      def subsets_with_dup(nums: list[int]) -> list[list[int]]:
+          result = []
+          nums.sort()  # Sort to bring duplicates together
+
+          def backtrack(index: int, current: list[int]) -> None:
+              # Every path is a valid subset, add a copy
+              result.append(current[:])
+
+              for i in range(index, len(nums)):
+                  # Skip duplicate values at the same recursion level
+                  if i > index and nums[i] == nums[i - 1]:
+                      continue
+
+                  # Choose: add nums[i] to current subset
+                  current.append(nums[i])
+
+                  # Explore: recurse to consider elements after i
+                  backtrack(i + 1, current)
+
+                  # Unchoose: backtrack to try other options
+                  current.pop()
+
+          backtrack(0, [])
+          return result
+    explanation: |
+      **Time Complexity:** O(n * 2^n) — We generate up to 2^n subsets (fewer with duplicates), each requiring O(n) to copy.
+
+      **Space Complexity:** O(n) — Recursion stack depth is at most n, plus the `current` list of size n.
+
+      The key optimisation is the duplicate skip (`if i > index and nums[i] == nums[i - 1]: continue`). After sorting, duplicates are adjacent. When we encounter a value that equals the previous one *at the same recursion level* (`i > index`), we skip it because all subsets starting with this value have already been explored when we processed its predecessor.
+
+  - approach_name: Iterative with Duplicate Handling
+    is_optimal: true
+    code: |
+      def subsets_with_dup(nums: list[int]) -> list[list[int]]:
+          nums.sort()  # Sort to group duplicates
+          result = [[]]  # Start with empty subset
+
+          start = 0  # Track where new subsets begin
+          for i in range(len(nums)):
+              # If current element is a duplicate, only extend
+              # subsets added in the previous iteration
+              if i > 0 and nums[i] == nums[i - 1]:
+                  new_subsets = [subset + [nums[i]] for subset in result[start:]]
+              else:
+                  # For new elements, extend all existing subsets
+                  new_subsets = [subset + [nums[i]] for subset in result]
+
+              start = len(result)  # Mark where these new subsets start
+              result.extend(new_subsets)
+
+          return result
+    explanation: |
+      **Time Complexity:** O(n * 2^n) — Same as backtracking approach.
+
+      **Space Complexity:** O(1) auxiliary, not counting the output. There is no recursion stack, and every subset held temporarily in `new_subsets` ends up in `result`.
+
+      This iterative approach builds subsets incrementally. For each new element, we extend existing subsets by adding the element. The key insight for duplicates: when we see a repeated value, we only extend subsets that were *just* created in the previous round (tracked by `start`). This prevents creating duplicate subsets like `[2]` from both the first and second occurrence of `2`.
+
+  - approach_name: Brute Force with Set Deduplication
+    is_optimal: false
+    code: |
+      def subsets_with_dup(nums: list[int]) -> list[list[int]]:
+          result_set = set()
+
+          def backtrack(index: int, current: tuple) -> None:
+              # Add current subset as a sorted tuple (for set comparison)
+              result_set.add(current)
+
+              for i in range(index, len(nums)):
+                  # Extend current subset with nums[i]
+                  backtrack(i + 1, tuple(sorted(current + (nums[i],))))
+
+          backtrack(0, ())
+          # Convert tuples back to lists
+          return [list(subset) for subset in result_set]
+    explanation: |
+      **Time Complexity:** O(n * 2^n * log(n)). The recursion generates 2^n subsets, and sorting each one (up to n elements) costs O(n log n).
+
+      **Space Complexity:** O(n * 2^n) — Stores all unique subsets in a set.
+
+      This naive approach generates all possible subsets and relies on a set to remove duplicates. While correct, it's inefficient because it generates duplicate subsets only to discard them. For input with many duplicates like `[1,1,1,1,1]`, it generates 32 subsets but only 6 are unique. The optimised approaches avoid this wasted work.