title: First Missing Positive
slug: first-missing-positive
difficulty: hard
leetcode_id: 41
leetcode_url: https://leetcode.com/problems/first-missing-positive/
categories:
  - arrays
  - hash-tables
patterns:
  - matrix-traversal
description: |
  Given an unsorted integer array `nums`, return the *smallest positive integer* that is *not present* in `nums`.

  You must implement an algorithm that runs in `O(n)` time and uses `O(1)` auxiliary space.
constraints: |
  - `1 <= nums.length <= 10^5`
  - `-2^31 <= nums[i] <= 2^31 - 1`
examples:
  - input: "nums = [1,2,0]"
    output: "3"
    explanation: "The numbers in the range [1,2] are all in the array."
  - input: "nums = [3,4,-1,1]"
    output: "2"
    explanation: "1 is in the array but 2 is missing."
  - input: "nums = [7,8,9,11,12]"
    output: "1"
    explanation: "The smallest positive integer 1 is missing."
explanation:
  intuition: |
    At first glance, this problem seems straightforward: just find the smallest positive integer not in the array. But the real challenge lies in the **O(n) time and O(1) space** constraints. These constraints rule out sorting (O(n log n)) and hash sets (O(n) space).

    The key insight is to **use the array itself as a hash table**. Think of it like assigning seats in a row: if you have `n` seats numbered 1 through `n`, you want each person with ticket number `i` to sit in seat `i`. After everyone is seated, you walk through the row and find the first empty seat. That's your answer.

    Why does this work? The first missing positive must be in the range `[1, n+1]` where `n` is the array length. If all numbers 1 through `n` are present, the answer is `n+1`. Otherwise, some number in `[1, n]` is missing, and we want the smallest one.

    By placing each value `x` at index `x-1` (so value `1` goes to index `0`, value `2` goes to index `1`, etc.), we transform the array into a lookup table. Then a single scan reveals the first position where the value doesn't match its expected index.
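
    The seating idea can be sketched with a separate list of seats. This is only an illustration of the lookup-table view: the name `first_missing_positive_with_seats` is hypothetical, and the extra `seats` list uses O(n) space, so the sketch does not satisfy the space constraint.

    ```python
    def first_missing_positive_with_seats(nums: list[int]) -> int:
        n = len(nums)
        # seats[i] becomes True once the value i + 1 has been seen.
        seats = [False] * n
        for x in nums:
            # Only tickets 1..n have a seat; everything else is ignored.
            if 1 <= x <= n:
                seats[x - 1] = True
        # The first empty seat is the first missing positive.
        for i, taken in enumerate(seats):
            if not taken:
                return i + 1
        # Every seat is taken, so 1..n are all present.
        return n + 1


    print(first_missing_positive_with_seats([3, 4, -1, 1]))  # 2
    ```

    Values outside `[1, n]` never claim a seat, which is why the answer can never exceed `n + 1`.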
  approach: |
    We solve this using **Cyclic Sort** (in-place rearrangement):

    **Step 1: Rearrange the array**

    - Iterate through each position in the array
    - For each element `nums[i]`, if it's a positive integer in the range `[1, n]` and not already in its correct position, swap it to where it belongs
    - Continue swapping at the current position until the element there is either out of range or already correct
    - This ensures each valid value ends up at index `value - 1`

    **Step 2: Find the first missing positive**

    - Scan through the rearranged array
    - The first index `i` where `nums[i] != i + 1` indicates that `i + 1` is missing
    - Return `i + 1` as the answer

    **Step 3: Handle the all-present case**

    - If all positions contain their expected values (1, 2, 3, ..., n), the answer is `n + 1`

    The cyclic sort approach works because we're essentially building a perfect hash function: value `x` maps to index `x - 1`. By rearranging in-place, we use constant extra space while achieving linear time.
  common_pitfalls:
    - title: Using a Hash Set
      description: |
        The most natural approach is to use a hash set to store all the numbers, then iterate from 1 upward to find the first missing value:

        ```python
        def first_missing_positive(nums):
            seen = set(nums)
            for i in range(1, len(nums) + 2):
                if i not in seen:
                    return i
        ```

        While this is O(n) time, it uses **O(n) space** for the hash set, violating the space constraint. The problem explicitly requires O(1) auxiliary space.
      wrong_approach: "Hash set for O(1) lookup"
      correct_approach: "Use the array itself as a hash table via cyclic sort"
    - title: Sorting the Array
      description: |
        Another tempting approach is to sort the array first, then scan for the first missing positive:

        ```python
        nums.sort()
        # Find first missing...
        ```

        Sorting takes **O(n log n)** time, which violates the O(n) time constraint. Even if you're okay with that, this approach still requires careful handling of duplicates and negatives.
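
        The duplicate and negative handling mentioned above can be sketched as follows. This is an illustration of the sort-then-scan idea, not a valid solution: the name `first_missing_positive_sorted` is hypothetical, and `sorted` allocates a copy of the input, so the sketch misses both the time and the space constraint.

        ```python
        def first_missing_positive_sorted(nums: list[int]) -> int:
            expected = 1  # smallest positive not yet confirmed present
            for x in sorted(nums):
                if x == expected:
                    expected += 1
                elif x > expected:
                    # Sorted order guarantees expected cannot appear later.
                    break
                # Otherwise x < expected: a negative, zero, or duplicate. Skip it.
            return expected


        print(first_missing_positive_sorted([3, 4, -1, 1]))  # 2
        ```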
      wrong_approach: "Sort first, then scan"
      correct_approach: "Cyclic sort achieves O(n) time"
    - title: Infinite Loop During Swapping
      description: |
        When implementing the swap logic, you must check if the target position already contains the correct value:

        ```python
        # Wrong: loops forever when nums[i] is already in place
        # or its target slot holds a duplicate
        while 1 <= nums[i] <= n:
            j = nums[i] - 1
            nums[i], nums[j] = nums[j], nums[i]

        # Correct: stop if the target slot already holds this value
        while 1 <= nums[i] <= n and nums[i] != nums[nums[i] - 1]:
            j = nums[i] - 1
            nums[i], nums[j] = nums[j], nums[i]
        ```

        Without the second condition, the loop never exits once `nums[i]` is already in place or its target slot holds the same value, because the swap changes nothing.
      wrong_approach: "Only check range bounds"
      correct_approach: "Also check if target position already has the correct value"
    - title: Forgetting the n+1 Case
      description: |
        If the array contains exactly [1, 2, 3, ..., n], then no value in `[1, n]` is missing and the answer is `n + 1`. Make sure your final scan handles this edge case, typically by returning `n + 1` if the entire array is correctly positioned.
      wrong_approach: "Only scan the array without a fallback"
      correct_approach: "Return n + 1 if all positions are correct"
  key_takeaways:
    - "**Cyclic sort pattern**: When values have a natural position (like 1 to n mapping to indices 0 to n-1), consider rearranging the array in-place"
    - "**Array as hash table**: The array itself can serve as a constant-space lookup structure when the value range is bounded"
    - "**Constraint-driven design**: The O(1) space requirement is the key hint that we must modify the input array rather than use auxiliary data structures"
    - "**Related problems**: This technique applies to finding duplicates, missing numbers, and other permutation-based problems"
  time_complexity: "O(n). Every swap places at least one value at its final index, so there are at most n swaps in total, in addition to the two linear passes through the array."
  space_complexity: "O(1). We only use a constant number of variables; all rearrangement happens in-place."
solutions:
  - approach_name: Cyclic Sort
    is_optimal: true
    code: |
      def first_missing_positive(nums: list[int]) -> int:
          n = len(nums)

          # Phase 1: Place each value at its correct index
          # Value x should be at index x-1
          for i in range(n):
              # Keep swapping until current element is in place or invalid
              while 1 <= nums[i] <= n and nums[i] != nums[nums[i] - 1]:
                  # Swap nums[i] to its correct position
                  correct_idx = nums[i] - 1
                  nums[i], nums[correct_idx] = nums[correct_idx], nums[i]

          # Phase 2: Find first position where value doesn't match index + 1
          for i in range(n):
              if nums[i] != i + 1:
                  return i + 1

          # All values 1 to n are present, so answer is n + 1
          return n + 1
    explanation: |
      **Time Complexity:** O(n). Although there's a nested while loop, every swap puts at least one value in its final position, so there are at most n swaps in total.

      **Space Complexity:** O(1). Only a few variables are used; the array is modified in-place.

      The algorithm works in two phases: first, we rearrange the array so that value `i` sits at index `i-1`. Then we scan to find the first mismatch. Using the input array as a hash table satisfies both the time and space constraints.
  - approach_name: Hash Set
    is_optimal: false
    code: |
      def first_missing_positive(nums: list[int]) -> int:
          # Store all values in a set for O(1) membership checks
          num_set = set(nums)

          # Check each positive integer starting from 1
          for i in range(1, len(nums) + 2):
              if i not in num_set:
                  return i

          # This line is never reached given the loop bounds
          return len(nums) + 1
    explanation: |
      **Time Complexity:** O(n). Building the set and scanning are both linear.

      **Space Complexity:** O(n). The hash set stores up to n elements.

      This approach is intuitive and correct, but uses O(n) extra space, violating the problem's constraints. It's included to illustrate the natural solution that the cyclic sort approach improves upon.
  - approach_name: Index Marking
    is_optimal: true
    code: |
      def first_missing_positive(nums: list[int]) -> int:
          n = len(nums)

          # Step 1: Replace non-positive and out-of-range values with n+1
          for i in range(n):
              if nums[i] <= 0 or nums[i] > n:
                  nums[i] = n + 1

          # Step 2: Mark presence by negating values at corresponding indices
          for i in range(n):
              val = abs(nums[i])
              if val <= n:
                  # Mark index val-1 as "seen" by making it negative
                  nums[val - 1] = -abs(nums[val - 1])

          # Step 3: Find first positive value (indicates missing number)
          for i in range(n):
              if nums[i] > 0:
                  return i + 1

          return n + 1
    explanation: |
      **Time Complexity:** O(n). Three linear passes through the array.

      **Space Complexity:** O(1). Only modifies the array in-place.

      This alternative approach uses the sign of each element as a flag. After replacing invalid values with `n+1`, we mark the presence of value `x` by negating the element at index `x-1`. Finally, the first positive element indicates the missing number. Both this and cyclic sort are optimal solutions.
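
      The three steps can be traced on the example `[3,4,-1,1]` with a small instrumented sketch. The helper name `index_marking_trace` is hypothetical, and it works on a copy of the input purely so the intermediate states can be returned for inspection.

      ```python
      def index_marking_trace(nums: list[int]) -> tuple[list[int], list[int], int]:
          """Return (state after step 1, state after step 2, answer)."""
          nums = list(nums)  # copy, for illustration only
          n = len(nums)

          # Step 1: neutralize values that can never be the answer's witness.
          for i in range(n):
              if nums[i] <= 0 or nums[i] > n:
                  nums[i] = n + 1
          after_step1 = list(nums)

          # Step 2: a negative sign at index x - 1 means "x is present".
          for i in range(n):
              val = abs(nums[i])
              if val <= n:
                  nums[val - 1] = -abs(nums[val - 1])
          after_step2 = list(nums)

          # Step 3: the first index still holding a positive value is unmarked.
          answer = next((i + 1 for i, v in enumerate(nums) if v > 0), n + 1)
          return after_step1, after_step2, answer


      print(index_marking_trace([3, 4, -1, 1]))
      # ([3, 4, 5, 1], [-3, 4, -5, -1], 2)
      ```

      Index 1 is the only slot left positive after step 2, so the value `2` was never seen.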