title: Contains Duplicate II
slug: contains-duplicate-ii
difficulty: easy
leetcode_id: 219
leetcode_url: https://leetcode.com/problems/contains-duplicate-ii/
categories:
  - arrays
  - hash-tables
patterns:
  - sliding-window
function_signature: "def contains_nearby_duplicate(nums: list[int], k: int) -> bool:"
test_cases:
  visible:
    - input: { nums: [1, 2, 3, 1], k: 3 }
      expected: true
    - input: { nums: [1, 0, 1, 1], k: 1 }
      expected: true
    - input: { nums: [1, 2, 3, 1, 2, 3], k: 2 }
      expected: false
  hidden:
    - input: { nums: [1], k: 1 }
      expected: false
    - input: { nums: [1, 1], k: 0 }
      expected: false
    - input: { nums: [1, 2, 1], k: 2 }
      expected: true
    - input: { nums: [99, 99], k: 2 }
      expected: true
    - input: { nums: [1, 2, 3, 4, 5], k: 3 }
      expected: false
    - input: { nums: [0, 1, 2, 3, 4, 0, 0, 7, 8, 9, 10, 11, 12, 0], k: 1 }
      expected: true
description: |
  Given an integer array `nums` and an integer `k`, return `true` *if there are two **distinct indices*** `i` *and* `j` *in the array such that* `nums[i] == nums[j]` *and* `abs(i - j) <= k`.
constraints: |
  - `1 <= nums.length <= 10^5`
  - `-10^9 <= nums[i] <= 10^9`
  - `0 <= k <= 10^5`
examples:
  - input: "nums = [1,2,3,1], k = 3"
    output: "true"
    explanation: "The element 1 appears at index 0 and index 3. Since abs(0 - 3) = 3 <= k, we return true."
  - input: "nums = [1,0,1,1], k = 1"
    output: "true"
    explanation: "The element 1 appears at index 2 and index 3. Since abs(2 - 3) = 1 <= k, we return true."
  - input: "nums = [1,2,3,1,2,3], k = 2"
    output: "false"
    explanation: "While there are duplicates, no pair of equal values is within k = 2 indices of each other. The closest such pair (1 at indices 0 and 3) has distance 3 > k."
explanation:
  intuition: |
    Imagine you're walking through a hallway with numbered rooms, and you need to find if any room number repeats within the last `k` rooms you've passed.

    The core insight is that we don't need to remember *every* room we've ever seen — we only care about rooms within our **sliding window** of the last `k` positions. If we encounter a room number we've seen within this window, we've found our duplicate.

    Think of it like this: as you move forward, you maintain a "memory" of the last `k` rooms. When you see a new room number, you check if it's already in your memory. If yes, you found a nearby duplicate. If not, add it to your memory and forget the oldest room (the one that's now more than `k` steps behind).

    This naturally suggests using a **hash set** as our memory — it gives us O(1) lookups to check for duplicates and O(1) insertions/deletions to maintain our sliding window.
  approach: |
    We solve this using a **Sliding Window with Hash Set** approach:

    **Step 1: Initialise a hash set**

    - Create an empty set `window` to store elements within our current window of size `k`
    - The set will contain at most `k` elements at any time

    **Step 2: Iterate through the array**

    - For each element at index `i`, check if it already exists in our `window` set
    - If yes, we found a duplicate within distance `k` — return `true`
    - If no, add the current element to the window

    **Step 3: Maintain window size**

    - If the window size exceeds `k`, remove the oldest element (the one at index `i - k`)
    - This ensures we only track elements within the valid distance

    **Step 4: Return the result**

    - If we complete the loop without finding duplicates, return `false`

    This approach efficiently combines the sliding window pattern with a hash set for O(1) operations, giving us an optimal O(n) solution.
  common_pitfalls:
    - title: The Brute Force Trap
      description: |
        A naive approach checks every pair of elements to see if they're equal and within distance `k`:

        - Outer loop `i` from `0` to `n-1`
        - Inner loop `j` from `i+1` to `min(i+k+1, n)`

        While this limits the inner loop to `k` iterations, it's still **O(n × k)** in the worst case.
        When both `n` and `k` are at their maximum (`10^5`), this results in roughly 5 billion comparisons (about `n^2 / 2`, since the inner loop stops at the end of the array), causing a **Time Limit Exceeded (TLE)** error.
      wrong_approach: "Nested loops checking pairs within distance k"
      correct_approach: "Sliding window with hash set for O(n) time"
    - title: Using a Hash Map Instead of a Set
      description: |
        While a hash map (storing value → index) works, it's more complex than necessary. You'd need to update indices as you go and compare distances.

        A hash set is simpler: by keeping only the last `k` elements, we implicitly guarantee any match is within the valid distance. If it's in the set, it's within range.
      wrong_approach: "Hash map with index tracking and distance calculation"
      correct_approach: "Hash set with sliding window of size k"
    - title: Off-by-One in Window Size
      description: |
        Be careful about when to remove elements from the window. The condition `abs(i - j) <= k` means indices can be up to `k` apart, so your window should contain `k` previous elements (not `k-1` or `k+1`).

        Remove the element at index `i - k` only when `i >= k`, ensuring the window never exceeds `k` elements from the past.
      wrong_approach: "Removing when i > k or keeping k+1 elements"
      correct_approach: "Remove element at index i - k when i >= k"
  key_takeaways:
    - "**Sliding window + hash set**: When you need to find duplicates within a range, combine a fixed-size window with a set for O(1) lookups"
    - "**Implicit distance guarantee**: By keeping only the last `k` elements, any match is automatically within the valid distance — no need to track indices"
    - "**Set vs Map tradeoff**: Choose the simpler data structure when it suffices; a set is often cleaner than a map when you don't need the stored values"
    - "**Related problems**: This pattern extends to 'Contains Duplicate III' (within range *and* value difference) and other sliding window problems"
  time_complexity: "O(n). We traverse the array once, with O(1) hash set operations (add, remove, lookup) at each step."
  space_complexity: "O(min(n, k)). The hash set stores at most `min(n, k)` elements at any time."
solutions:
  - approach_name: Sliding Window with Hash Set
    is_optimal: true
    code: |
      def contains_nearby_duplicate(nums: list[int], k: int) -> bool:
          # Set to track elements in our current window of size k
          window = set()

          for i, num in enumerate(nums):
              # If we've seen this number in our window, we found a duplicate
              if num in window:
                  return True

              # Add current element to the window
              window.add(num)

              # Maintain window size: remove element that's now too far behind
              if i >= k:
                  window.remove(nums[i - k])

          # No nearby duplicates found
          return False
    explanation: |
      **Time Complexity:** O(n) — Single pass through the array with O(1) set operations.

      **Space Complexity:** O(min(n, k)) — The set contains at most k elements.

      We maintain a sliding window of the last k elements using a hash set. For each new element, we check if it's already in the window (O(1) lookup). If found, we have a duplicate within distance k. Otherwise, we add it and remove the oldest element to maintain the window size.
  - approach_name: Hash Map with Index Tracking
    is_optimal: false
    code: |
      def contains_nearby_duplicate(nums: list[int], k: int) -> bool:
          # Map each value to its most recent index
          last_seen = {}

          for i, num in enumerate(nums):
              # Check if we've seen this number before
              if num in last_seen:
                  # Check if the previous occurrence is within distance k
                  if i - last_seen[num] <= k:
                      return True

              # Update the most recent index for this number
              last_seen[num] = i

          return False
    explanation: |
      **Time Complexity:** O(n) — Single pass with O(1) hash map operations.

      **Space Complexity:** O(n) — In the worst case, all elements are unique and stored in the map.

      This approach stores the last seen index for each value. When we encounter a number we've seen before, we check if the distance is within k. While correct and efficient, it uses more space than the sliding window approach when k is small relative to n.
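
      To make the space comparison concrete, the sketch below counts how many entries each container holds on an input with no repeated values, so neither approach would return early. The helper name `peak_container_sizes` is hypothetical and exists only for this illustration.

      ```python
      def peak_container_sizes(nums: list[int], k: int) -> tuple[int, int]:
          """Return (largest set size after eviction, final map size).

          Illustrative helper, not part of either solution. It assumes all
          values in `nums` are distinct, so neither approach returns early.
          """
          window: set[int] = set()
          last_seen: dict[int, int] = {}
          peak_window = 0

          for i, num in enumerate(nums):
              # Sliding-window bookkeeping: add, then evict the element k steps back.
              window.add(num)
              if i >= k:
                  window.remove(nums[i - k])
              peak_window = max(peak_window, len(window))

              # Hash-map bookkeeping: one entry per distinct value, never evicted.
              last_seen[num] = i

          return peak_window, len(last_seen)


      print(peak_container_sizes(list(range(1000)), 10))  # (10, 1000)
      ```

      With `n = 1000` and `k = 10`, the set never holds more than 10 values once eviction has run, while the map ends with 1000 entries.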
  - approach_name: Brute Force
    is_optimal: false
    code: |
      def contains_nearby_duplicate(nums: list[int], k: int) -> bool:
          n = len(nums)

          # Check each element against the next k elements
          for i in range(n):
              # Only check within the valid range
              for j in range(i + 1, min(i + k + 1, n)):
                  if nums[i] == nums[j]:
                      return True

          return False
    explanation: |
      **Time Complexity:** O(n × k) — For each element, we check up to k subsequent elements.

      **Space Complexity:** O(1) — No additional data structures used.

      This straightforward approach checks every valid pair. While it passes small test cases, it will TLE on large inputs where both n and k approach 10^5. Included to illustrate why the hash-based approaches are necessary.
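
      Although too slow for large inputs, the brute force is still useful as a reference when testing the faster solution. The sketch below compares the two on small random inputs with a fixed seed. The names `brute_force`, `sliding_window`, and `cross_check` are hypothetical, chosen only so both versions can live in one file, and the input sizes are arbitrary assumptions.

      ```python
      import random


      def brute_force(nums: list[int], k: int) -> bool:
          # Reference answer: compare each element with the next k elements.
          n = len(nums)
          for i in range(n):
              for j in range(i + 1, min(i + k + 1, n)):
                  if nums[i] == nums[j]:
                      return True
          return False


      def sliding_window(nums: list[int], k: int) -> bool:
          # Same logic as the hash-set solution above.
          window: set[int] = set()
          for i, num in enumerate(nums):
              if num in window:
                  return True
              window.add(num)
              if i >= k:
                  window.remove(nums[i - k])
          return False


      def cross_check(trials: int = 500, seed: int = 0) -> int:
          """Compare both versions on small random arrays; return the trial count."""
          rng = random.Random(seed)
          for _ in range(trials):
              n = rng.randint(1, 12)
              # A small value range makes duplicates likely.
              nums = [rng.randint(0, 5) for _ in range(n)]
              k = rng.randint(0, n)
              assert brute_force(nums, k) == sliding_window(nums, k), (nums, k)
          return trials


      cross_check()
      ```

      Keeping the arrays short means a failing case, if one ever appears, is small enough to trace by hand.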