codetutor/backend/data/questions/remove-duplicates-from-sorted-array.yaml
2025-05-25 12:43:25 +01:00

title: Remove Duplicates from Sorted Array
slug: remove-duplicates-from-sorted-array
difficulty: easy
leetcode_id: 26
leetcode_url: https://leetcode.com/problems/remove-duplicates-from-sorted-array/
categories:
- arrays
- two-pointers
patterns:
- two-pointers
description: |
Given an integer array `nums` sorted in **non-decreasing order**, remove the duplicates *in-place* such that each unique element appears only **once**. The **relative order** of the elements should be kept the **same**.
Consider the number of unique elements in `nums` to be `k`. After removing duplicates, return the number of unique elements `k`.
The first `k` elements of `nums` should contain the unique numbers in **sorted order**. The remaining elements beyond index `k - 1` can be ignored.
**Custom Judge:**
The judge will test your solution with the following code:
```java
int[] nums = [...]; // Input array
int[] expectedNums = [...]; // The expected answer with correct length
int k = removeDuplicates(nums); // Calls your implementation
assert k == expectedNums.length;
for (int i = 0; i < k; i++) {
    assert nums[i] == expectedNums[i];
}
```
If all assertions pass, then your solution will be **accepted**.
constraints: |
- `1 <= nums.length <= 3 * 10^4`
- `-100 <= nums[i] <= 100`
- `nums` is sorted in **non-decreasing** order
examples:
- input: "nums = [1,1,2]"
output: "2, nums = [1,2,_]"
explanation: "Your function should return k = 2, with the first two elements of nums being 1 and 2 respectively. It does not matter what you leave beyond the returned k (hence they are underscores)."
- input: "nums = [0,0,1,1,1,2,2,3,3,4]"
output: "5, nums = [0,1,2,3,4,_,_,_,_,_]"
explanation: "Your function should return k = 5, with the first five elements of nums being 0, 1, 2, 3, and 4 respectively. It does not matter what you leave beyond the returned k (hence they are underscores)."
explanation:
intuition: |
Imagine you're organising a bookshelf where books are already sorted alphabetically, but some titles appear multiple times. You want to keep only one copy of each book while maintaining the sorted order.
The key insight is that the array is **already sorted**. This means all duplicates of a value are grouped together consecutively. You don't need to search the entire array to find duplicates — you only need to compare adjacent elements.
Think of it like this: use two pointers working together. One pointer (`write_index`) marks where the next unique element should be written. The other pointer (`read_index`) scans through the array looking for new unique values. When you find a value different from the last unique one, you copy it to the write position and advance.
Since the array is sorted, if `nums[i] != nums[i-1]`, then `nums[i]` is definitely a new unique value that hasn't appeared before in our result.
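As a minimal sketch of why sortedness helps (the helper name `count_unique_sorted` is illustrative and not part of the solution), the number of unique values can be counted by looking only at adjacent pairs:

```python
def count_unique_sorted(nums: list[int]) -> int:
    """Count distinct values in a sorted list by comparing neighbours only."""
    if not nums:
        return 0
    # The first element always counts; every later element counts only
    # when it differs from the element immediately before it.
    return 1 + sum(1 for i in range(1, len(nums)) if nums[i] != nums[i - 1])


print(count_unique_sorted([0, 0, 1, 1, 1, 2, 2, 3, 3, 4]))  # 5
```

This only counts; it does not move anything. The write pointer described next is what places each new value at the front of the array.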
approach: |
We solve this using the **Two Pointers** technique:
**Step 1: Handle edge case**
- If the array has 0 or 1 elements, return the length directly — no duplicates possible
&nbsp;
**Step 2: Initialise write pointer**
- `write_index`: Set to `1` because the first element is always unique (nothing to compare it against)
&nbsp;
**Step 3: Iterate with read pointer**
- Start `read_index` at `1` and scan through the array
- For each element, compare `nums[read_index]` with `nums[write_index - 1]` (the last written unique value)
- If they differ, we found a new unique element:
  - Copy `nums[read_index]` to `nums[write_index]`
  - Increment `write_index`
- If they're the same, it's a duplicate — skip it by just incrementing `read_index`
&nbsp;
**Step 4: Return the count**
- Return `write_index` which equals the number of unique elements
&nbsp;
This works because we're essentially partitioning the array: the first `write_index` positions contain unique values, and everything after can be ignored.
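The partition idea can be checked directly. The sketch below is an instrumented variant of the steps above (the name `remove_duplicates_checked` and the assertion are added for illustration only). After every iteration it asserts that the first `write_index` slots are strictly increasing, meaning they hold unique values in sorted order:

```python
def remove_duplicates_checked(nums: list[int]) -> int:
    """Two-pointer pass that asserts the partition invariant at every step."""
    if len(nums) <= 1:
        return len(nums)
    write_index = 1
    for read_index in range(1, len(nums)):
        if nums[read_index] != nums[write_index - 1]:
            nums[write_index] = nums[read_index]
            write_index += 1
        # Invariant: nums[:write_index] is strictly increasing (all unique).
        prefix = nums[:write_index]
        assert all(a < b for a, b in zip(prefix, prefix[1:]))
    return write_index


nums = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
k = remove_duplicates_checked(nums)
print(k, nums[:k])  # 5 [0, 1, 2, 3, 4]
```

The slice-based check costs extra time and space on every step, so it is only suitable for verifying the idea, not for a submitted solution.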
common_pitfalls:
- title: Using Extra Space
description: |
A common instinct is to create a new array or use a set to track seen elements:
```python
seen = set()
result = []
for num in nums:
    if num not in seen:
        seen.add(num)
        result.append(num)
```
While this collects the unique values correctly, it uses **O(n) extra space** and never writes the result back into `nums`, so it violates the in-place requirement. The problem specifically asks you to modify the original array using only O(1) extra space.
wrong_approach: "Using a set or auxiliary array"
correct_approach: "Two pointers modifying array in-place"
- title: Comparing Wrong Elements
description: |
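When checking for duplicates, compare the current element with the **last written unique element**, `nums[write_index - 1]`, rather than with the previous element on the read side.
For example, with `[1, 1, 1, 2]`:
- At `read_index = 2`, `write_index` is still `1`, so you compare `nums[2]` with `nums[0]`; both are `1`, so you skip
- At `read_index = 3`, you compare `nums[3]` with `nums[0]` and get `2 != 1`, so `2` is written to `nums[1]`
In this particular problem, comparing with `nums[read_index - 1]` gives the same decisions, because the array is sorted and the write pointer never moves past the read pointer. That shortcut does not carry over to variants such as *Remove Duplicates from Sorted Array II*, where earlier slots may already have been overwritten.
The safest approach: always compare with `nums[write_index - 1]` to check against the last confirmed unique value.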
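To see which values are compared, here is a small tracing sketch for the `[1, 1, 1, 2]` example (the function name `trace_comparisons` and the tuple layout are illustrative assumptions). Each record holds the read index, the value read, the last written value it was compared with, and whether the value was kept:

```python
def trace_comparisons(nums: list[int]) -> list[tuple[int, int, int, bool]]:
    """Run the two-pointer pass on a copy and record every comparison."""
    nums = list(nums)  # work on a copy so the caller's list is untouched
    records = []
    write_index = 1
    for read_index in range(1, len(nums)):
        last_written = nums[write_index - 1]
        keep = nums[read_index] != last_written
        records.append((read_index, nums[read_index], last_written, keep))
        if keep:
            nums[write_index] = nums[read_index]
            write_index += 1
    return records


for record in trace_comparisons([1, 1, 1, 2]):
    print(record)
# (1, 1, 1, False)
# (2, 1, 1, False)
# (3, 2, 1, True)
```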
wrong_approach: "Comparing with nums[i-1] on the read side, which does not generalise to variants"
correct_approach: "Comparing with nums[write_index - 1]"
- title: Off-by-One Errors
description: |
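Starting `write_index` at `0` instead of `1` while still scanning from index `1` typically causes the first new value to overwrite `nums[0]`, and the returned count comes out one too small. In Python the comparison index `write_index - 1` is then `-1`, which silently reads the last element rather than raising an error, so the bug can be hard to spot.
Remember: the first element is automatically unique. Start writing from index `1`, and your final answer is the value of `write_index`, not `write_index - 1`.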
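As an illustration, here is a deliberately buggy sketch (the name `buggy_remove_duplicates` is hypothetical) that starts `write_index` at `0` while still scanning from index `1` and comparing neighbours. On `[1, 1, 2]` it overwrites the first element and reports one unique value too few:

```python
def buggy_remove_duplicates(nums: list[int]) -> int:
    """Intentionally wrong: write_index starts at 0 instead of 1."""
    write_index = 0  # BUG: the first element is never counted as kept
    for read_index in range(1, len(nums)):
        if nums[read_index] != nums[read_index - 1]:
            nums[write_index] = nums[read_index]
            write_index += 1
    return write_index


nums = [1, 1, 2]
k = buggy_remove_duplicates(nums)
print(k, nums)  # 1 [2, 1, 2]
```

The expected result is `k = 2` with `[1, 2]` at the front; initialising `write_index = 1` restores it.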
key_takeaways:
- "**Two pointers for in-place modification**: One pointer tracks where to write, the other scans for new values — a classic pattern for array manipulation without extra space"
- "**Sorted arrays simplify duplicate detection**: Duplicates are always adjacent, so a single comparison with the previous unique element is sufficient"
- "**Foundation for harder problems**: This technique extends to problems like *Remove Duplicates from Sorted Array II* (allow up to 2 duplicates) and *Remove Element*"
- "**Read-write pointer pattern**: This same pattern applies whenever you need to selectively keep elements while modifying an array in-place"
time_complexity: "O(n). We traverse the array exactly once with two pointers, performing constant-time operations at each step."
space_complexity: "O(1). We only use two integer variables (`write_index` and `read_index`) regardless of input size — the modification happens in-place."
solutions:
- approach_name: Two Pointers
is_optimal: true
code: |
def remove_duplicates(nums: list[int]) -> int:
    # Edge case: empty or single-element array
    if len(nums) <= 1:
        return len(nums)
    # First element is always unique, start writing from index 1
    write_index = 1
    # Scan through array starting from second element
    for read_index in range(1, len(nums)):
        # Found a new unique value (different from last written)
        if nums[read_index] != nums[write_index - 1]:
            # Copy it to the write position
            nums[write_index] = nums[read_index]
            # Move write pointer forward
            write_index += 1
    # write_index equals the count of unique elements
    return write_index
explanation: |
**Time Complexity:** O(n) — Single pass through the array.
**Space Complexity:** O(1) — Only two integer pointers used.
The read pointer scans every element once, while the write pointer only advances when we find a unique value. Since the array is sorted, we only need to compare with the most recently written element to detect duplicates.
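A Python analogue of the custom judge can be used to try the solution locally. This is a sketch only: the helper `judge` is hypothetical, it derives the expected answer with `sorted(set(nums))`, and a compact copy of the two-pointer function is repeated so the block runs on its own:

```python
def remove_duplicates(nums: list[int]) -> int:
    """Compact copy of the two-pointer solution, repeated for self-containment."""
    if len(nums) <= 1:
        return len(nums)
    write_index = 1
    for read_index in range(1, len(nums)):
        if nums[read_index] != nums[write_index - 1]:
            nums[write_index] = nums[read_index]
            write_index += 1
    return write_index


def judge(nums: list[int]) -> bool:
    """Mirror the judge: check k and the first k elements against the expected answer."""
    expected = sorted(set(nums))  # reference answer; extra space is fine in a test
    work = list(nums)  # keep the caller's input intact
    k = remove_duplicates(work)
    return k == len(expected) and work[:k] == expected


print(judge([1, 1, 2]))                       # True
print(judge([0, 0, 1, 1, 1, 2, 2, 3, 3, 4]))  # True
```

As in the original judge, elements beyond index `k - 1` are never inspected.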
- approach_name: Using Set (Extra Space)
is_optimal: false
code: |
def remove_duplicates(nums: list[int]) -> int:
    # Track unique elements we've seen
    seen = set()
    write_index = 0
    for num in nums:
        # Only keep elements we haven't seen before
        if num not in seen:
            seen.add(num)
            nums[write_index] = num
            write_index += 1
    return write_index
explanation: |
**Time Complexity:** O(n) — Single pass with average-case O(1) set operations.
**Space Complexity:** O(n) — The set stores up to n unique elements.
While this achieves the same result, it violates the O(1) space constraint. It's included to illustrate how the sorted property of the input allows us to eliminate the need for a set entirely. If the array were unsorted, we'd need this approach, or we would have to sort first and give up the original relative order.