title: Contains Duplicate II
slug: contains-duplicate-ii
difficulty: easy
leetcode_id: 219
leetcode_url: https://leetcode.com/problems/contains-duplicate-ii/
categories:
- arrays
- hash-tables
patterns:
- sliding-window

function_signature: "def contains_nearby_duplicate(nums: list[int], k: int) -> bool:"

test_cases:
visible:
- input: { nums: [1, 2, 3, 1], k: 3 }
expected: true
- input: { nums: [1, 0, 1, 1], k: 1 }
expected: true
- input: { nums: [1, 2, 3, 1, 2, 3], k: 2 }
expected: false
hidden:
- input: { nums: [1], k: 1 }
expected: false
- input: { nums: [1, 1], k: 0 }
expected: false
- input: { nums: [1, 2, 1], k: 2 }
expected: true
- input: { nums: [99, 99], k: 2 }
expected: true
- input: { nums: [1, 2, 3, 4, 5], k: 3 }
expected: false
- input: { nums: [0, 1, 2, 3, 4, 0, 0, 7, 8, 9, 10, 11, 12, 0], k: 1 }
expected: true
description: |
Given an integer array `nums` and an integer `k`, return `true` *if there are two **distinct indices*** `i` *and* `j` *in the array such that* `nums[i] == nums[j]` *and* `abs(i - j) <= k`.

constraints: |
- `1 <= nums.length <= 10^5`
- `-10^9 <= nums[i] <= 10^9`
- `0 <= k <= 10^5`

examples:
- input: "nums = [1,2,3,1], k = 3"
output: "true"
explanation: "The element 1 appears at index 0 and index 3. Since abs(0 - 3) = 3 <= k, we return true."
- input: "nums = [1,0,1,1], k = 1"
output: "true"
explanation: "The element 1 appears at index 2 and index 3. Since abs(2 - 3) = 1 <= k, we return true."
- input: "nums = [1,2,3,1,2,3], k = 2"
output: "false"
explanation: "Although the array contains duplicates, no pair of equal values is within k = 2 indices of each other. Every duplicate pair (for example, 1 at indices 0 and 3) is at distance 3 > k."
explanation:
intuition: |
Imagine you're walking down a hallway of numbered rooms, and you need to find out whether any room number repeats within the last `k` rooms you've passed.

The core insight is that we don't need to remember *every* room we've ever seen. We only care about rooms within our **sliding window** of the last `k` positions. If we encounter a room number that is already in this window, we've found our duplicate.

Think of it like this: as you move forward, you maintain a "memory" of the last `k` rooms. When you see a new room number, you check whether it's already in your memory. If yes, you found a nearby duplicate. If not, add it to your memory and forget the oldest room (the one that's now `k` steps behind, and so would be more than `k` steps away from the next room).

This naturally suggests using a **hash set** as our memory: it gives us average O(1) lookups to check for duplicates and average O(1) insertions and deletions to maintain our sliding window.
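
The "memory" idea can be written down literally with a list slice. The sketch below only illustrates the intuition and is not the efficient solution: the helper name `contains_nearby_duplicate_slice` is made up for this example, and each membership test scans up to `k` values, so it runs in O(n × k) time.

```python
def contains_nearby_duplicate_slice(nums: list[int], k: int) -> bool:
    for i, num in enumerate(nums):
        # The "memory": the (at most k) values just before index i.
        memory = nums[max(0, i - k):i]
        if num in memory:
            return True
    return False


print(contains_nearby_duplicate_slice([1, 2, 3, 1], 3))        # True
print(contains_nearby_duplicate_slice([1, 2, 3, 1, 2, 3], 2))  # False
```

Replacing the slice with a set that is updated incrementally keeps the same window contents while making each check O(1) on average.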
approach: |
We solve this using a **Sliding Window with Hash Set** approach:

**Step 1: Initialise a hash set**

- Create an empty set `window` to store the elements within our current window of size `k`
- Between iterations, the set contains at most `k` elements

**Step 2: Iterate through the array**

- For each element at index `i`, check whether it already exists in our `window` set
- If yes, we found a duplicate within distance `k`, so return `true`
- If no, add the current element to the window

**Step 3: Maintain window size**

- Once `i >= k`, the window holds `k + 1` elements after the insertion, so remove the oldest one (the element at index `i - k`)
- This ensures we only track elements within the valid distance of the next index

**Step 4: Return the result**

- If we complete the loop without finding duplicates, return `false`

This approach combines the sliding window pattern with a hash set for average O(1) operations, giving us an optimal O(n) solution.
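
To see the four steps in motion, the following sketch records the window contents after every iteration. The helper `trace_window` is hypothetical and exists only for this illustration; it follows the same check, add, evict order described above.

```python
def trace_window(nums: list[int], k: int) -> list[tuple[int, int, bool, list[int]]]:
    """Return (index, value, found, window contents after the step) per step."""
    window: set[int] = set()
    steps = []
    for i, num in enumerate(nums):
        found = num in window          # Step 2: membership check
        if not found:
            window.add(num)            # Step 2: add current element
            if i >= k:
                window.remove(nums[i - k])  # Step 3: evict the oldest element
        steps.append((i, num, found, sorted(window)))
        if found:
            break                      # a nearby duplicate ends the scan
    return steps


for step in trace_window([1, 2, 3, 1, 2, 3], k=2):
    print(step)
# (0, 1, False, [1])
# (1, 2, False, [1, 2])
# (2, 3, False, [2, 3])
# (3, 1, False, [1, 3])
# (4, 2, False, [1, 2])
# (5, 3, False, [2, 3])
```

In this trace the first `1` has already been evicted by the time the second `1` arrives at index 3, which is why the third visible test case returns `false`.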
common_pitfalls:
- title: The Brute Force Trap
description: |
A naive approach checks every pair of elements to see if they're equal and within distance `k`:
- Outer loop `i` from `0` to `n-1`
- Inner loop `j` from `i+1` up to (but not including) `min(i+k+1, n)`

While this limits the inner loop to at most `k` iterations, it's still **O(n × k)** in the worst case. When both `n` and `k` are at their maximum (`10^5`) and no nearby duplicate triggers an early exit, this results in roughly 5 billion comparisons (about `n^2 / 2`), causing a **Time Limit Exceeded (TLE)** error.
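
The worst-case figure can be checked by counting inner-loop iterations directly. This is a minimal sketch: `count_comparisons` is a hypothetical helper, and it assumes no early exit (for example, all values distinct).

```python
def count_comparisons(n: int, k: int) -> int:
    """Count inner-loop iterations of the brute force when no duplicate is found."""
    return sum(min(i + k + 1, n) - (i + 1) for i in range(n))


print(count_comparisons(6, 2))          # 9
print(count_comparisons(10**5, 10**5))  # 4999950000
```

Because the inner loop is clipped at the end of the array, the maximum is `n * (n - 1) / 2` rather than a full `n * k`.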
wrong_approach: "Nested loops checking pairs within distance k"
correct_approach: "Sliding window with hash set for O(n) time"

- title: Using a Hash Map Instead of a Set
description: |
While a hash map (storing value → index) works, it's more complex than necessary. You need to update indices as you go and compare distances explicitly.

A hash set is simpler: by keeping only the last `k` elements, we implicitly guarantee that any match is within the valid distance. If a value is in the set, it's within range.
wrong_approach: "Hash map with index tracking and distance calculation"
correct_approach: "Hash set with sliding window of size k"

- title: Off-by-One in Window Size
description: |
Be careful about when to remove elements from the window. The condition `abs(i - j) <= k` means indices can be up to `k` apart, so when you check index `i` the window should contain the `k` previous elements (not `k-1` or `k+1`).

After adding the element at index `i`, remove the element at index `i - k` whenever `i >= k`, so that the window never holds more than `k` elements from the past.
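
A small sketch makes the off-by-one concrete. Both function names below are invented for this comparison; the only difference between them is the eviction condition.

```python
def contains_nearby_duplicate_buggy(nums: list[int], k: int) -> bool:
    window = set()
    for i, num in enumerate(nums):
        if num in window:
            return True
        window.add(num)
        if i > k:  # BUG: eviction starts one step too late
            window.remove(nums[i - k])
    return False


def contains_nearby_duplicate_fixed(nums: list[int], k: int) -> bool:
    window = set()
    for i, num in enumerate(nums):
        if num in window:
            return True
        window.add(num)
        if i >= k:  # evict as soon as the window holds k + 1 elements
            window.remove(nums[i - k])
    return False


nums, k = [1, 2, 3, 1], 2
print(contains_nearby_duplicate_buggy(nums, k))  # True (wrong: the 1s are 3 apart)
print(contains_nearby_duplicate_fixed(nums, k))  # False
```

With `i > k`, the value at index 0 is never evicted, so the buggy version reports a match that lies outside the allowed distance.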
wrong_approach: "Removing when i > k or keeping k+1 elements"
correct_approach: "Remove element at index i - k when i >= k"

key_takeaways:
- "**Sliding window + hash set**: When you need to find duplicates within a range, combine a fixed-size window with a set for O(1) lookups"
- "**Implicit distance guarantee**: By keeping only the last `k` elements, any match is automatically within the valid distance, so there is no need to track indices"
- "**Set vs Map tradeoff**: Choose the simpler data structure when it suffices; a set is often cleaner than a map when you don't need the stored values"
- "**Related problems**: This pattern extends to 'Contains Duplicate III' (within range *and* value difference) and other sliding window problems"

time_complexity: "O(n). We traverse the array once, with average O(1) hash set operations (add, remove, lookup) at each step."
space_complexity: "O(min(n, k)). Between iterations the hash set stores at most `min(n, k)` elements."
solutions:
- approach_name: Sliding Window with Hash Set
is_optimal: true
code: |
def contains_nearby_duplicate(nums: list[int], k: int) -> bool:
    # Set to track elements in our current window of size k
    window = set()

    for i, num in enumerate(nums):
        # If we've seen this number in our window, we found a duplicate
        if num in window:
            return True

        # Add current element to the window
        window.add(num)

        # Maintain window size: remove element that's now too far behind
        if i >= k:
            window.remove(nums[i - k])

    # No nearby duplicates found
    return False
explanation: |
**Time Complexity:** O(n). Single pass through the array with average O(1) set operations.

**Space Complexity:** O(min(n, k)). Between iterations the set contains at most k elements.

We maintain a sliding window of the last k elements using a hash set. For each new element, we check whether it's already in the window (O(1) lookup). If found, we have a duplicate within distance k. Otherwise, we add it and, once the window has grown past k elements, remove the oldest element to restore the window size.
- approach_name: Hash Map with Index Tracking
is_optimal: false
code: |
def contains_nearby_duplicate(nums: list[int], k: int) -> bool:
    # Map each value to its most recent index
    last_seen = {}

    for i, num in enumerate(nums):
        # Check if we've seen this number before
        if num in last_seen:
            # Check if the previous occurrence is within distance k
            if i - last_seen[num] <= k:
                return True

        # Update the most recent index for this number
        last_seen[num] = i

    return False
explanation: |
**Time Complexity:** O(n). Single pass with average O(1) hash map operations.

**Space Complexity:** O(n). In the worst case, all elements are unique and stored in the map.

This approach stores the last seen index for each value. When we encounter a number we've seen before, we check whether the distance to its most recent occurrence is at most k. While correct and efficient, it uses more space than the sliding window approach when k is small relative to n.
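
The space difference can be observed by instrumenting both approaches on an input with no duplicates. This is an illustrative sketch only: `peak_set_size` and `final_map_size` are hypothetical helpers that mirror the two solutions above but return container sizes instead of a boolean.

```python
def peak_set_size(nums: list[int], k: int) -> int:
    """Largest window size observed at the end of any iteration (sliding window)."""
    window: set[int] = set()
    peak = 0
    for i, num in enumerate(nums):
        if num in window:
            break
        window.add(num)
        if i >= k:
            window.remove(nums[i - k])
        peak = max(peak, len(window))
    return peak


def final_map_size(nums: list[int], k: int) -> int:
    """Number of entries left in the value -> index map (hash map approach)."""
    last_seen: dict[int, int] = {}
    for i, num in enumerate(nums):
        if num in last_seen and i - last_seen[num] <= k:
            break
        last_seen[num] = i
    return len(last_seen)


distinct = list(range(1000))  # all values unique, so neither loop exits early
print(peak_set_size(distinct, k=3))   # 3
print(final_map_size(distinct, k=3))  # 1000
```

When `k` is at least as large as the array, both containers end up holding every element, so the gap only matters for small `k`.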
- approach_name: Brute Force
is_optimal: false
code: |
def contains_nearby_duplicate(nums: list[int], k: int) -> bool:
    n = len(nums)

    # Check each element against the next k elements
    for i in range(n):
        # Only check within the valid range
        for j in range(i + 1, min(i + k + 1, n)):
            if nums[i] == nums[j]:
                return True

    return False
explanation: |
**Time Complexity:** O(n × k). For each element, we check up to k subsequent elements.

**Space Complexity:** O(1). No additional data structures used.

This straightforward approach checks every valid pair. While it passes small test cases, it will TLE on large inputs where both n and k approach 10^5. It is included to illustrate why the hash-based approaches are necessary.
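
Although too slow for large inputs, the brute force is a convenient reference when testing the faster solutions. The sketch below is one possible way to do that, assuming small random arrays are representative enough; the names `brute_force` and `sliding_window` are local to this example.

```python
import random


def brute_force(nums: list[int], k: int) -> bool:
    n = len(nums)
    for i in range(n):
        for j in range(i + 1, min(i + k + 1, n)):
            if nums[i] == nums[j]:
                return True
    return False


def sliding_window(nums: list[int], k: int) -> bool:
    window = set()
    for i, num in enumerate(nums):
        if num in window:
            return True
        window.add(num)
        if i >= k:
            window.remove(nums[i - k])
    return False


rng = random.Random(0)  # fixed seed so the run is reproducible
for _ in range(1000):
    nums = [rng.randint(0, 5) for _ in range(rng.randint(1, 12))]
    k = rng.randint(0, 12)
    # The slow, obviously-correct version acts as the oracle.
    assert brute_force(nums, k) == sliding_window(nums, k), (nums, k)
print("all random cases agree")
```

A small value range (0 to 5 here) makes duplicates frequent, so both the `true` and `false` branches are exercised.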