questions F-L

2025-05-25 11:47:04 +01:00
parent ecf95bd23d
commit 917c371529
54 changed files with 11235 additions and 0 deletions
--- a/backend/data/questions/first-missing-positive.yaml
+++ b/backend/data/questions/first-missing-positive.yaml
@@ -0,0 +1,212 @@
+title: First Missing Positive
+slug: first-missing-positive
+difficulty: hard
+leetcode_id: 41
+leetcode_url: https://leetcode.com/problems/first-missing-positive/
+categories:
+  - arrays
+  - hash-tables
+patterns:
+  - cyclic-sort
+
+description: |
+  Given an unsorted integer array `nums`, return the *smallest positive integer* that is *not present* in `nums`.
+
+  You must implement an algorithm that runs in `O(n)` time and uses `O(1)` auxiliary space.
+
+constraints: |
+  - `1 <= nums.length <= 10^5`
+  - `-2^31 <= nums[i] <= 2^31 - 1`
+
+examples:
+  - input: "nums = [1,2,0]"
+    output: "3"
+    explanation: "The numbers in the range [1,2] are all in the array."
+  - input: "nums = [3,4,-1,1]"
+    output: "2"
+    explanation: "1 is in the array but 2 is missing."
+  - input: "nums = [7,8,9,11,12]"
+    output: "1"
+    explanation: "The smallest positive integer 1 is missing."
+
+explanation:
+  intuition: |
+    At first glance, this problem seems straightforward — just find the smallest positive integer not in the array. But the real challenge lies in the **O(n) time and O(1) space** constraints. These constraints rule out sorting (O(n log n)) and hash sets (O(n) space).
+
+    The key insight is to **use the array itself as a hash table**. Think of it like assigning seats in a row: if you have `n` seats numbered 1 through `n`, you want each person with ticket number `i` to sit in seat `i`. After everyone is seated, you walk through the row and find the first empty seat — that's your answer.
+
+    Why does this work? The first missing positive must be in the range `[1, n+1]` where `n` is the array length. If all numbers 1 through `n` are present, the answer is `n+1`. Otherwise, some number in `[1, n]` is missing, and we want the smallest one.
+
+    By placing each value `x` at index `x-1` (so value `1` goes to index `0`, value `2` goes to index `1`, etc.), we transform the array into a lookup table. Then a single scan reveals the first position where the value doesn't match its expected index.
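+
+    As a rough illustration of the seating analogy, the sketch below seats each valid value in a separate helper list. The function name `seat_everyone` is hypothetical, and the helper list costs O(n) space, so this only visualizes the idea; it is not the in-place algorithm.
+
+    ```python
+    def seat_everyone(nums: list) -> list:
+        """Illustration only: seat value x at index x - 1 in a helper list."""
+        n = len(nums)
+        seats = [None] * n
+        for x in nums:
+            if 1 <= x <= n:  # values outside [1, n] have no seat
+                seats[x - 1] = x
+        return seats
+
+
+    seats = seat_everyone([3, 4, -1, 1])
+    print(seats)  # [1, None, 3, 4]
+    print(seats.index(None) + 1)  # 2, the first empty seat
+    ```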
+
+  approach: |
+    We solve this using **Cyclic Sort** (in-place rearrangement):
+
+    **Step 1: Rearrange the array**
+
+    - Iterate through each position in the array
+    - For each element `nums[i]`, if it's a positive integer in the range `[1, n]` and its target index `nums[i] - 1` does not already hold that value, swap it to where it belongs
+    - Continue swapping at the current position until the element there is either out of range or already correct
+    - This ensures each valid value ends up at index `value - 1`
+
+    &nbsp;
+
+    **Step 2: Find the first missing positive**
+
+    - Scan through the rearranged array
+    - The first index `i` where `nums[i] != i + 1` indicates that `i + 1` is missing
+    - Return `i + 1` as the answer
+
+    &nbsp;
+
+    **Step 3: Handle the all-present case**
+
+    - If all positions contain their expected values (1, 2, 3, ..., n), the answer is `n + 1`
+
+    &nbsp;
+
+    The cyclic sort approach works because we're essentially building a perfect hash function: value `x` maps to index `x - 1`. By rearranging in-place, we use constant extra space while achieving linear time.
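+
+    To make Step 1 concrete, here is a small tracing sketch that records the array after every swap. The helper name `trace_cyclic_sort` is an assumption made for this illustration, and the snapshot list exists only for display.
+
+    ```python
+    def trace_cyclic_sort(nums: list[int]) -> list[list[int]]:
+        """Run the Step 1 rearrangement, recording the array after each swap."""
+        n = len(nums)
+        snapshots = []
+        for i in range(n):
+            while 1 <= nums[i] <= n and nums[i] != nums[nums[i] - 1]:
+                j = nums[i] - 1  # index where nums[i] belongs
+                nums[i], nums[j] = nums[j], nums[i]
+                snapshots.append(nums.copy())
+        return snapshots
+
+
+    for state in trace_cyclic_sort([3, 4, -1, 1]):
+        print(state)
+    # [-1, 4, 3, 1]
+    # [-1, 1, 3, 4]
+    # [1, -1, 3, 4]
+    ```
+
+    In the final state, index `1` holds `-1` instead of `2`, so the Step 2 scan returns `2`.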
+
+  common_pitfalls:
+    - title: Using a Hash Set
+      description: |
+        The most natural approach is to store the array's values in a hash set, then iterate from 1 upward to find the first missing:
+
+        ```python
+        seen = set(nums)
+        for i in range(1, len(nums) + 2):
+            if i not in seen:
+                return i
+        ```
+
+        While this is O(n) time, it uses **O(n) space** for the hash set, violating the space constraint. The problem explicitly requires O(1) auxiliary space.
+      wrong_approach: "Hash set for O(1) lookup"
+      correct_approach: "Use the array itself as a hash table via cyclic sort"
+
+    - title: Sorting the Array
+      description: |
+        Another tempting approach is to sort the array first, then scan for the first missing positive:
+
+        ```python
+        nums.sort()
+        # Find first missing...
+        ```
+
+        Sorting takes **O(n log n)** time, which violates the O(n) time constraint. Even if you're okay with that, this approach still requires careful handling of duplicates and negatives.
+      wrong_approach: "Sort first, then scan"
+      correct_approach: "Cyclic sort achieves O(n) time"
+
+    - title: Infinite Loop During Swapping
+      description: |
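+        When implementing the swap logic, you must check whether the target position already contains the value you are about to place there:
+
+        ```python
+        # Wrong: loops forever once nums[i] equals the value at its target index
+        while 1 <= nums[i] <= n:
+            j = nums[i] - 1
+            nums[i], nums[j] = nums[j], nums[i]
+
+        # Correct: stop when the target index already holds this value
+        while 1 <= nums[i] <= n and nums[i] != nums[nums[i] - 1]:
+            j = nums[i] - 1
+            nums[i], nums[j] = nums[j], nums[i]
+        ```
+
+        Without the second condition, the swap becomes a no-op whenever the target index already holds the same value (a duplicate, or an element that is already in place), so the loop never terminates.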
+      wrong_approach: "Only check range bounds"
+      correct_approach: "Also check if target position already has the correct value"
+
+    - title: Forgetting the n+1 Case
+      description: |
+        If the array contains exactly [1, 2, 3, ..., n], then no number in the range `[1, n]` is missing, so the answer is `n + 1`. Make sure your final scan handles this edge case, typically by returning `n + 1` if the entire array is correctly positioned.
+      wrong_approach: "Only scan the array without a fallback"
+      correct_approach: "Return n + 1 if all positions are correct"
+
+  key_takeaways:
+    - "**Cyclic sort pattern**: When values have a natural position (like 1 to n mapping to indices 0 to n-1), consider rearranging the array in-place"
+    - "**Array as hash table**: The array itself can serve as a constant-space lookup structure when the value range is bounded"
+    - "**Constraint-driven design**: The O(1) space requirement is the key hint that we must modify the input array rather than use auxiliary data structures"
+    - "**Related problems**: This technique applies to finding duplicates, missing numbers, and other permutation-based problems"
+
+  time_complexity: "O(n). Each swap puts at least one value at its final index, and a value at its final index is never moved again, so there are at most n swaps in total, plus two linear passes through the array."
+  space_complexity: "O(1). We only use a constant number of variables; all rearrangement happens in-place."
+
+solutions:
+  - approach_name: Cyclic Sort
+    is_optimal: true
+    code: |
+      def first_missing_positive(nums: list[int]) -> int:
+          n = len(nums)
+
+          # Phase 1: Place each value at its correct index
+          # Value x should be at index x-1
+          for i in range(n):
+              # Keep swapping until current element is in place or invalid
+              while 1 <= nums[i] <= n and nums[i] != nums[nums[i] - 1]:
+                  # Swap nums[i] to its correct position
+                  correct_idx = nums[i] - 1
+                  nums[i], nums[correct_idx] = nums[correct_idx], nums[i]
+
+          # Phase 2: Find first position where value doesn't match index + 1
+          for i in range(n):
+              if nums[i] != i + 1:
+                  return i + 1
+
+          # All values 1 to n are present, so answer is n + 1
+          return n + 1
+    explanation: |
+      **Time Complexity:** O(n). Although there's a nested while loop, each swap places at least one value at its final index and that value is never moved again, so the loop performs at most n swaps in total.
+
+      **Space Complexity:** O(1) — Only a few variables are used; the array is modified in-place.
+
+      The algorithm works in two phases: first, we rearrange the array so that value `i` sits at index `i-1`. Then we scan to find the first mismatch. This clever use of the input array as a hash table satisfies both the time and space constraints.
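+
+      One way to sanity-check the swap bound is to count swaps directly. The sketch below uses a hypothetical helper, `count_swaps`, that repeats the Phase 1 loop with a counter added; it is meant as an illustration of the amortized argument, not as part of the solution.
+
+      ```python
+      def count_swaps(nums: list[int]) -> int:
+          """Run the Phase 1 rearrangement and return how many swaps it made."""
+          n = len(nums)
+          swaps = 0
+          for i in range(n):
+              while 1 <= nums[i] <= n and nums[i] != nums[nums[i] - 1]:
+                  j = nums[i] - 1
+                  nums[i], nums[j] = nums[j], nums[i]
+                  swaps += 1
+          return swaps
+
+
+      # All four swaps happen at i = 0; the remaining iterations do no work.
+      print(count_swaps([2, 3, 4, 5, 1]))  # 4
+      ```
+
+      The inner loop can run several times for a single `i`, but the total across all `i` stays at or below `n`.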
+
+  - approach_name: Hash Set
+    is_optimal: false
+    code: |
+      def first_missing_positive(nums: list[int]) -> int:
+          # Store every value in a set for O(1) membership checks
+          num_set = set(nums)
+
+          # Check each positive integer starting from 1
+          for i in range(1, len(nums) + 2):
+              if i not in num_set:
+                  return i
+
+          # This line is never reached given the loop bounds
+          return len(nums) + 1
+    explanation: |
+      **Time Complexity:** O(n) — Building the set and scanning are both linear.
+
+      **Space Complexity:** O(n) — The hash set stores up to n elements.
+
+      This approach is intuitive and correct, but uses O(n) extra space, violating the problem's constraints. It's included to illustrate the natural solution that the cyclic sort approach improves upon.
+
+  - approach_name: Index Marking
+    is_optimal: true
+    code: |
+      def first_missing_positive(nums: list[int]) -> int:
+          n = len(nums)
+
+          # Step 1: Replace non-positive and out-of-range values with n+1
+          for i in range(n):
+              if nums[i] <= 0 or nums[i] > n:
+                  nums[i] = n + 1
+
+          # Step 2: Mark presence by negating values at corresponding indices
+          for i in range(n):
+              val = abs(nums[i])
+              if val <= n:
+                  # Mark index val-1 as "seen" by making it negative
+                  nums[val - 1] = -abs(nums[val - 1])
+
+          # Step 3: Find first positive value (indicates missing number)
+          for i in range(n):
+              if nums[i] > 0:
+                  return i + 1
+
+          return n + 1
+    explanation: |
+      **Time Complexity:** O(n) — Three linear passes through the array.
+
+      **Space Complexity:** O(1) — Only modifies the array in-place.
+
+      This alternative approach uses the sign of each element as a flag. After replacing invalid values with `n+1`, we mark the presence of value `x` by negating the element at index `x-1`. Finally, the first index `i` whose element is still positive tells us that `i + 1` is missing. Both this and cyclic sort are optimal solutions.
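+
+      As a small worked example of the marking phase, the sketch below applies Steps 1 and 2 and returns the marked array. The helper name `mark_presence` is hypothetical, and it works on a copy purely so the result can be printed; the actual solution marks `nums` in place.
+
+      ```python
+      def mark_presence(nums: list[int]) -> list[int]:
+          """Apply Steps 1 and 2 of index marking and return the marked array."""
+          n = len(nums)
+          # Step 1: replace values outside [1, n] with the sentinel n + 1
+          marked = [x if 1 <= x <= n else n + 1 for x in nums]
+          # Step 2: negate the element at index val - 1 for every valid val
+          for i in range(n):
+              val = abs(marked[i])
+              if val <= n:
+                  marked[val - 1] = -abs(marked[val - 1])
+          return marked
+
+
+      print(mark_presence([3, 4, -1, 1]))  # [-3, 4, -5, -1]
+      ```
+
+      Index `1` is the only position left positive, so Step 3 returns `2`.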