title: First Missing Positive
slug: first-missing-positive
difficulty: hard
leetcode_id: 41
leetcode_url: https://leetcode.com/problems/first-missing-positive/
categories:
- arrays
- hash-tables
patterns:
- matrix-traversal

description: |
Given an unsorted integer array `nums`, return the *smallest positive integer* that is *not present* in `nums`.

You must implement an algorithm that runs in `O(n)` time and uses `O(1)` auxiliary space.

constraints: |
- `1 <= nums.length <= 10^5`
- `-2^31 <= nums[i] <= 2^31 - 1`

examples:
- input: "nums = [1,2,0]"
output: "3"
explanation: "The numbers in the range [1,2] are all in the array."
- input: "nums = [3,4,-1,1]"
output: "2"
explanation: "1 is in the array but 2 is missing."
- input: "nums = [7,8,9,11,12]"
output: "1"
explanation: "The smallest positive integer 1 is missing."

explanation:
intuition: |
At first glance, this problem seems straightforward: just find the smallest positive integer not in the array. The real challenge lies in the **O(n) time and O(1) space** constraints, which rule out sorting (O(n log n) time) and hash sets (O(n) space).

The key insight is to **use the array itself as a hash table**. Think of it like assigning seats in a row: if you have `n` seats numbered 1 through `n`, you want each person with ticket number `i` to sit in seat `i`. After everyone is seated, you walk through the row and find the first seat that is not occupied by its ticket holder; that seat number is your answer.

Why does this work? The first missing positive must be in the range `[1, n+1]`, where `n` is the array length, because `n` elements can cover at most `n` distinct positive integers. If all numbers 1 through `n` are present, the answer is `n+1`. Otherwise, some number in `[1, n]` is missing, and we want the smallest one. Values outside `[1, n]` can never change the answer, so they can be ignored.

By placing each value `x` in `[1, n]` at index `x-1` (so value `1` goes to index `0`, value `2` goes to index `1`, etc.), we transform the array into a lookup table. A single scan then finds the first index `i` whose value is not `i+1`.

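
The range argument above can be checked exhaustively on small inputs. The sketch below is only an illustration, not part of the solution: `smallest_missing_reference` and `answer_always_in_range` are hypothetical helper names, and the reference deliberately uses a set because it is meant for checking rather than as an O(1)-space answer.

```python
from itertools import product


def smallest_missing_reference(nums):
    """Reference answer using a set (O(n) extra space, for checking only)."""
    present = set(nums)
    candidate = 1
    while candidate in present:
        candidate += 1
    return candidate


def answer_always_in_range(max_len=4):
    """Check 1 <= answer <= n + 1 for every array of length n <= max_len
    whose values are drawn from -1 .. n + 2."""
    for n in range(1, max_len + 1):
        for nums in product(range(-1, n + 3), repeat=n):
            if not 1 <= smallest_missing_reference(nums) <= n + 1:
                return False
    return True


print(answer_always_in_range())  # True
```

The check passes because `n` slots cannot hold all of the `n + 1` values `1, 2, ..., n + 1` at once.
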
approach: |
We solve this using **Cyclic Sort** (in-place rearrangement):

**Step 1: Rearrange the array**

- Iterate through each position in the array
- For each element `nums[i]`, if it's a positive integer in the range `[1, n]` and not already in its correct position, swap it to where it belongs
- Continue swapping at the current position until the element there is out of range, already correct, or a duplicate of the value at its target index
- This ensures each value in `[1, n]` that occurs in the array ends up at index `value - 1`

**Step 2: Find the first missing positive**

- Scan through the rearranged array
- The first index `i` where `nums[i] != i + 1` indicates that `i + 1` is missing
- Return `i + 1` as the answer

**Step 3: Handle the all-present case**

- If all positions contain their expected values (1, 2, 3, ..., n), the answer is `n + 1`

The cyclic sort approach works because the mapping from value `x` to index `x - 1` acts as a collision-free hash function for the values `1` through `n`. Rearranging in place uses constant extra space while still running in linear time.

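
To make Step 1 concrete, the sketch below records the array after every swap on the second example input. `trace_cyclic_placement` is a hypothetical helper written for this illustration; it works on a copy so the caller's list is left untouched.

```python
def trace_cyclic_placement(nums):
    """Return the list of array states: the input, then one state per swap."""
    nums = list(nums)  # work on a copy
    n = len(nums)
    states = [list(nums)]
    for i in range(n):
        # Same loop condition as Step 1: in range and target not yet correct
        while 1 <= nums[i] <= n and nums[i] != nums[nums[i] - 1]:
            target = nums[i] - 1
            nums[i], nums[target] = nums[target], nums[i]
            states.append(list(nums))
    return states


for state in trace_cyclic_placement([3, 4, -1, 1]):
    print(state)
# [3, 4, -1, 1]
# [-1, 4, 3, 1]
# [-1, 1, 3, 4]
# [1, -1, 3, 4]
```

In the final state, index 1 holds `-1` rather than `2`, so the scan in Step 2 returns `2`.
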
common_pitfalls:
- title: Using a Hash Set
description: |
The most natural approach is to store every value in a hash set, then iterate from 1 upward to find the first missing number:

```python
def first_missing_positive(nums):
    seen = set(nums)  # O(n) extra space
    for i in range(1, len(nums) + 2):
        if i not in seen:
            return i
```

While this is O(n) time, it uses **O(n) space** for the hash set, violating the space constraint. The problem explicitly requires O(1) auxiliary space.
wrong_approach: "Hash set for O(1) lookup"
correct_approach: "Use the array itself as a hash table via cyclic sort"

- title: Sorting the Array
description: |
Another tempting approach is to sort the array first, then scan for the first missing positive:

```python
nums.sort()
# Find first missing...
```

Sorting takes **O(n log n)** time, which violates the O(n) time constraint. Even if you accept the slower running time, the scan after sorting must still skip negatives, zeros, and duplicates.
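
For comparison, here is one way the sort-then-scan idea could be completed. This is a hedged sketch of the rejected approach, not a recommended solution: `first_missing_positive_sorted` is a hypothetical name, and because `sorted` builds a new list, this version also uses O(n) extra space.

```python
def first_missing_positive_sorted(nums):
    """Sort-then-scan variant: O(n log n) time, so it misses the constraint."""
    expected = 1  # smallest positive integer not yet seen
    for value in sorted(nums):
        if value == expected:
            expected += 1
        elif value > expected:
            break  # gap found; later values are even larger
        # value < expected: a negative, zero, or duplicate, so skip it
    return expected


print(first_missing_positive_sorted([3, 4, -1, 1]))  # 2
```
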
wrong_approach: "Sort first, then scan"
correct_approach: "Cyclic sort achieves O(n) time"

- title: Infinite Loop During Swapping
description: |
When implementing the swap logic, you must check whether the target position already holds the value you are about to move there:

```python
# Wrong: loops forever once nums[i] equals the value at its target index
while 1 <= nums[i] <= n:
    j = nums[i] - 1
    nums[i], nums[j] = nums[j], nums[i]

# Correct: stop if the value is already in place or is a duplicate
while 1 <= nums[i] <= n and nums[i] != nums[nums[i] - 1]:
    j = nums[i] - 1
    nums[i], nums[j] = nums[j], nums[i]
```

Without the second condition, the loop keeps swapping identical values forever, either because `nums[i]` is already at its own index or because the target index holds a duplicate.
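
The difference between the two loop conditions can be observed safely with a swap budget. In the sketch below, `place_values`, `check_target`, and `max_swaps` are hypothetical names introduced for this illustration; hitting the budget stands in for "would loop forever".

```python
def place_values(nums, check_target, max_swaps=1000):
    """Run the placement loop on a copy of nums.

    Returns the rearranged list, or None if max_swaps is exceeded.
    """
    nums = list(nums)
    n = len(nums)
    swaps = 0
    for i in range(n):
        while 1 <= nums[i] <= n and (
            not check_target or nums[i] != nums[nums[i] - 1]
        ):
            if swaps == max_swaps:
                return None  # budget exhausted: the loop is not terminating
            j = nums[i] - 1
            nums[i], nums[j] = nums[j], nums[i]
            swaps += 1
    return nums


print(place_values([1, 1], check_target=False))  # None
print(place_values([1, 1], check_target=True))   # [1, 1]
```
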
wrong_approach: "Only check range bounds"
correct_approach: "Also check if target position already has the correct value"

- title: Forgetting the n+1 Case
description: |
If the array contains exactly the values 1, 2, 3, ..., n, then no value in `[1, n]` is missing and the answer is `n + 1`. Make sure your final scan handles this edge case, typically by returning `n + 1` after the scan finds every position correctly filled.
wrong_approach: "Only scan the array without a fallback"
correct_approach: "Return n + 1 if all positions are correct"

key_takeaways:
- "**Cyclic sort pattern**: When values have a natural position (like 1 to n mapping to indices 0 to n-1), consider rearranging the array in-place"
- "**Array as hash table**: The array itself can serve as a constant-space lookup structure when the value range is bounded"
- "**Constraint-driven design**: The O(1) space requirement is the key hint that we must modify the input array rather than use auxiliary data structures"
- "**Related problems**: This technique applies to finding duplicates, missing numbers, and other permutation-based problems"

time_complexity: "O(n). Each swap puts at least one value at its final index, and a correctly placed value is never moved again, so there are at most n swaps in total; the rest of the work is two linear passes through the array."
space_complexity: "O(1). We only use a constant number of variables; all rearrangement happens in-place."

solutions:
- approach_name: Cyclic Sort
is_optimal: true
code: |
def first_missing_positive(nums: list[int]) -> int:
    n = len(nums)

    # Phase 1: Place each value at its correct index
    # Value x should be at index x-1
    for i in range(n):
        # Keep swapping until current element is in place or invalid
        while 1 <= nums[i] <= n and nums[i] != nums[nums[i] - 1]:
            # Swap nums[i] to its correct position
            correct_idx = nums[i] - 1
            nums[i], nums[correct_idx] = nums[correct_idx], nums[i]

    # Phase 2: Find first position where value doesn't match index + 1
    for i in range(n):
        if nums[i] != i + 1:
            return i + 1

    # All values 1 to n are present, so answer is n + 1
    return n + 1
explanation: |
**Time Complexity:** O(n). Although there's a nested while loop, every swap puts at least one value at its final index and a correctly placed value is never moved again, so the total number of swaps is at most n.

**Space Complexity:** O(1). Only a few variables are used; the array is modified in-place.

The algorithm works in two phases: first, we rearrange the array so that each value `x` in `[1, n]` sits at index `x-1`. Then we scan to find the first mismatch. Using the input array as its own hash table satisfies both the time and space constraints.

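
The swap bound behind the O(n) claim can be spot-checked empirically. The sketch below is an illustration under stated assumptions: `count_swaps` is a hypothetical helper, and the random inputs (fixed seed, lengths up to 50) are arbitrary choices rather than anything taken from the problem statement.

```python
import random


def count_swaps(nums):
    """Run the placement phase on a copy and return the number of swaps."""
    nums = list(nums)
    n = len(nums)
    swaps = 0
    for i in range(n):
        while 1 <= nums[i] <= n and nums[i] != nums[nums[i] - 1]:
            j = nums[i] - 1
            nums[i], nums[j] = nums[j], nums[i]
            swaps += 1
    return swaps


random.seed(0)  # fixed seed so the run is repeatable
for _ in range(1000):
    n = random.randint(1, 50)
    nums = [random.randint(-5, n + 5) for _ in range(n)]
    assert count_swaps(nums) <= n  # never more swaps than elements
print("swap count never exceeded n")
```
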
- approach_name: Hash Set
is_optimal: false
code: |
def first_missing_positive(nums: list[int]) -> int:
    # Store every value in a set for O(1) membership checks
    num_set = set(nums)

    # Check each positive integer starting from 1
    for i in range(1, len(nums) + 2):
        if i not in num_set:
            return i

    # This line is never reached given the loop bounds
    return len(nums) + 1
explanation: |
**Time Complexity:** O(n). Building the set and scanning are both linear.

**Space Complexity:** O(n). The hash set stores up to n elements.

This approach is intuitive and correct, but uses O(n) extra space, violating the problem's constraints. It's included to illustrate the natural solution that the cyclic sort approach improves upon.

- approach_name: Index Marking
is_optimal: true
code: |
def first_missing_positive(nums: list[int]) -> int:
    n = len(nums)

    # Step 1: Replace non-positive and out-of-range values with n+1
    for i in range(n):
        if nums[i] <= 0 or nums[i] > n:
            nums[i] = n + 1

    # Step 2: Mark presence by negating values at corresponding indices
    for i in range(n):
        val = abs(nums[i])
        if val <= n:
            # Mark index val-1 as "seen" by making it negative
            nums[val - 1] = -abs(nums[val - 1])

    # Step 3: Find first positive value (indicates missing number)
    for i in range(n):
        if nums[i] > 0:
            return i + 1

    return n + 1
explanation: |
**Time Complexity:** O(n). Three linear passes through the array.

**Space Complexity:** O(1). Only modifies the array in-place.

This alternative approach uses the sign of each element as a flag. After replacing non-positive and out-of-range values with `n+1`, every element is positive, so we can mark the presence of value `x` by negating the element at index `x-1` and still recover the original value with `abs`. Finally, the index `i` of the first element that is still positive gives the missing number `i+1`. Both this and cyclic sort are optimal solutions.
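
The three passes can be followed on the second example input. The sketch below mirrors the Index Marking code but keeps a snapshot after each of the first two passes; `trace_index_marking` is a hypothetical helper for illustration and works on a copy of its input.

```python
def trace_index_marking(nums):
    """Return the array after passes 1 and 2, plus the final answer."""
    nums = list(nums)  # work on a copy
    n = len(nums)

    # Pass 1: neutralise values that cannot affect the answer
    for i in range(n):
        if nums[i] <= 0 or nums[i] > n:
            nums[i] = n + 1
    after_cleanup = list(nums)

    # Pass 2: negate the element at index x-1 for every value x in [1, n]
    for i in range(n):
        val = abs(nums[i])
        if val <= n:
            nums[val - 1] = -abs(nums[val - 1])
    after_marking = list(nums)

    # Pass 3: the first index still holding a positive value is missing
    answer = n + 1
    for i in range(n):
        if nums[i] > 0:
            answer = i + 1
            break
    return after_cleanup, after_marking, answer


print(trace_index_marking([3, 4, -1, 1]))
# ([3, 4, 5, 1], [-3, 4, -5, -1], 2)
```

Only index 1 stays positive after marking, which is why the answer is `2`.
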