217 lines
8.6 KiB
YAML
title: Contains Duplicate
slug: contains-duplicate
difficulty: easy
leetcode_id: 217
leetcode_url: https://leetcode.com/problems/contains-duplicate/
categories:
- arrays
- hash-tables
patterns:
- heap

function_signature: "def contains_duplicate(nums: list[int]) -> bool:"

test_cases:
visible:
- input: { nums: [1, 2, 3, 1] }
expected: true
- input: { nums: [1, 2, 3, 4] }
expected: false
- input: { nums: [1, 1, 1, 3, 3, 4, 3, 2, 4, 2] }
expected: true
hidden:
- input: { nums: [1] }
expected: false
- input: { nums: [1, 1] }
expected: true
- input: { nums: [-1, -1, -2, -3] }
expected: true
- input: { nums: [0, 0] }
expected: true
- input: { nums: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] }
expected: false
- input: { nums: [1, 2, 3, 4, 5, 1] }
expected: true
- input: { nums: [1000000000, -1000000000] }
expected: false

description: |
Given an integer array `nums`, return `true` if any value appears **at least twice** in the array, and return `false` if every element is distinct.

constraints: |
- `1 <= nums.length <= 10^5`
- `-10^9 <= nums[i] <= 10^9`

examples:
- input: "nums = [1,2,3,1]"
output: "true"
explanation: "The element 1 occurs at the indices 0 and 3."
- input: "nums = [1,2,3,4]"
output: "false"
explanation: "All elements are distinct."
- input: "nums = [1,1,1,3,3,4,3,2,4,2]"
output: "true"
explanation: "Multiple elements appear more than once."

explanation:
intuition: |
Imagine you're checking coats at a party and need to ensure no two guests have the same ticket number. As each guest arrives, you could compare their ticket to every previous ticket — but that gets tedious as the party grows. Instead, what if you kept a quick-reference list of all ticket numbers you've seen?

This is the core insight: **use a data structure that allows instant lookups** to check if you've seen a number before. A *hash set* provides exactly this capability — adding an element and checking membership both take O(1) average time.

Think of it like this: as you iterate through the array, you maintain a "memory" of all numbers encountered so far. For each new number, you ask: "Have I seen this before?" If yes, you've found a duplicate. If no, add it to your memory and continue.

The key constraint guiding our solution is the array size (up to 10^5 elements). This rules out O(n^2) approaches and points us toward O(n) or O(n log n) solutions.

approach: |
We solve this using a **Hash Set Approach**:

**Step 1: Create an empty set**

- `seen`: An empty set to store numbers we've encountered
- Sets provide O(1) average time for both insertion and membership testing

**Step 2: Iterate through the array**

- For each number in `nums`, check if it already exists in `seen`
- If the number is in `seen`, we've found a duplicate — return `True` immediately
- If the number is not in `seen`, add it to the set and continue

**Step 3: Return the result**

- If we complete the loop without finding any duplicates, return `False`
- This means all elements were distinct

This approach works because hash sets give us constant-time lookups. We trade space (storing up to n elements) for time (avoiding nested comparisons).
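
To make the three steps concrete, here is a small trace sketch. The helper name `trace_contains_duplicate` and the shape of its return value are illustrative assumptions, not part of the required signature. It records what `seen` holds just before each membership check.

```python
def trace_contains_duplicate(
    nums: list[int],
) -> tuple[bool, list[tuple[int, list[int]]]]:
    """Hypothetical helper: run the hash set steps and record each check.

    Returns (result, trace). Each trace entry is
    (current number, sorted snapshot of `seen` before the check).
    """
    seen: set[int] = set()  # Step 1: start with an empty set
    trace: list[tuple[int, list[int]]] = []

    for num in nums:  # Step 2: single pass over the array
        trace.append((num, sorted(seen)))
        if num in seen:
            return True, trace  # duplicate found, stop early
        seen.add(num)

    return False, trace  # Step 3: every element was distinct


result, trace = trace_contains_duplicate([1, 2, 3, 1])
for num, snapshot in trace:
    print(f"check {num} against {snapshot}")
print(result)
```

On `[1, 2, 3, 1]` the fourth check finds `1` already in `seen`, so the function returns `True` after four checks.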

common_pitfalls:
- title: The Brute Force Trap
description: |
A natural first instinct is to compare every pair of elements:
- Outer loop `i` from `0` to `n-1`
- Inner loop `j` from `i+1` to `n-1`
- Check if `nums[i] == nums[j]`

This results in **O(n^2) time complexity**. With `nums.length <= 10^5`, the worst case (no duplicates) needs n(n-1)/2 comparisons, roughly 5 billion, which will almost certainly hit **Time Limit Exceeded (TLE)**.

The hash set approach reduces this to O(n) by eliminating the inner loop entirely.
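
As a quick check of the comparison count quoted above, the number of pairs with `i < j` is n(n-1)/2. The helper name `pair_count` is introduced here purely for illustration.

```python
import math


def pair_count(n: int) -> int:
    """Number of (i, j) pairs with i < j: worst-case brute force comparisons."""
    return n * (n - 1) // 2


n = 10**5  # upper bound on nums.length from the constraints
print(pair_count(n))  # 4999950000
print(pair_count(n) == math.comb(n, 2))  # True
```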
wrong_approach: "Nested loops comparing all pairs"
correct_approach: "Hash set for O(1) membership testing"

- title: Sorting Without Understanding the Trade-off
description: |
Sorting the array first (O(n log n)) and then checking adjacent elements works, but it has two downsides:
- Asymptotically slower than the hash set approach (O(n log n) versus O(n) on average)
- Modifies the original array (or requires O(n) extra space for a copy)

However, sorting can be preferable when memory is extremely constrained: an in-place algorithm such as heapsort needs only O(1) extra space. Note that Python's built-in `list.sort()` (Timsort) sorts in place but can still allocate up to O(n) temporary memory.
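
If mutating the caller's list is unacceptable, one option is to sort a copy. The sketch below is a minimal illustration under that assumption; the name `contains_duplicate_sorted_copy` is hypothetical, and the copy costs the O(n) extra space mentioned in the list above.

```python
def contains_duplicate_sorted_copy(nums: list[int]) -> bool:
    """Sorting variant that leaves the input list untouched (illustrative name)."""
    ordered = sorted(nums)  # new list: O(n) extra space, O(n log n) time

    # After sorting, equal values sit next to each other.
    for i in range(1, len(ordered)):
        if ordered[i] == ordered[i - 1]:
            return True
    return False


nums = [3, 1, 2, 3]
print(contains_duplicate_sorted_copy(nums))  # True
print(nums)  # [3, 1, 2, 3], original order preserved
```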
wrong_approach: "Always defaulting to sorting"
correct_approach: "Choose hash set for O(n) time when space permits"

- title: Using a List Instead of a Set
description: |
In Python, checking `x in some_list` is a linear scan, so it costs O(n), not O(1). Using a list instead of a set turns your "optimised" solution back into O(n^2).

```python
# Wrong - O(n^2) total
def contains_duplicate(nums: list[int]) -> bool:
    seen = []
    for num in nums:
        if num in seen:  # O(n) lookup!
            return True
        seen.append(num)
    return False
```

Always use a set (or dict) for membership testing.
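
The gap is easy to observe with `timeit`. The sketch below is illustrative only: the collection size and repeat count are assumptions chosen so that it finishes quickly, and absolute timings will vary by machine.

```python
import timeit

values = list(range(100_000))
as_list = values
as_set = set(values)
missing = -1  # not present, so the list scan must visit every element

# Time 100 membership tests against each container.
list_time = timeit.timeit(lambda: missing in as_list, number=100)
set_time = timeit.timeit(lambda: missing in as_set, number=100)

print(f"list membership: {list_time:.5f}s")
print(f"set membership:  {set_time:.7f}s")
```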

key_takeaways:
- "**Hash sets for membership testing**: When you need to check 'have I seen this before?', a set gives O(1) average-time lookups"
- "**Space-time trade-off**: Using O(n) extra space gives us O(n) time instead of O(n^2)"
- "**Early exit optimisation**: Return immediately when a duplicate is found — no need to check the rest"
- "**Foundation for harder problems**: This pattern appears in problems like Two Sum, finding pairs, and detecting cycles"

time_complexity: "O(n) on average. We traverse the array once, with O(1) average-time set operations at each step."
space_complexity: "O(n). In the worst case (all unique elements), we store all n elements in the set."

solutions:
- approach_name: Hash Set
is_optimal: true
code: |
def contains_duplicate(nums: list[int]) -> bool:
    # Set to track numbers we've seen
    seen = set()

    for num in nums:
        # Already seen this number? Duplicate found!
        if num in seen:
            return True
        # First time seeing this number, remember it
        seen.add(num)

    # No duplicates found after checking all elements
    return False
explanation: |
**Time Complexity:** O(n) — Single pass through the array with O(1) set operations.

**Space Complexity:** O(n) — Set stores up to n elements in the worst case.

We iterate once, checking each number against our set of seen values. The moment we find a number already in the set, we return `True`. If we finish without finding duplicates, we return `False`.

- approach_name: One-liner with Set Length
is_optimal: true
code: |
def contains_duplicate(nums: list[int]) -> bool:
    # If set has fewer elements than list, duplicates exist
    return len(nums) != len(set(nums))
explanation: |
**Time Complexity:** O(n) — Building a set from the list is O(n).

**Space Complexity:** O(n) — The set stores up to n elements.

This elegant one-liner exploits the fact that sets automatically remove duplicates. If the set has fewer elements than the original list, at least one duplicate existed. Note: this always processes all elements, unlike the early-exit version above.
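
To see how much input each version actually touches, the sketch below wraps the data in a counting generator. `counted` and `early_exit` are names introduced here for illustration; `early_exit` is the same loop as the Hash Set solution, generalised to accept any iterable.

```python
from collections.abc import Iterable, Iterator


def counted(nums: Iterable[int], counter: list[int]) -> Iterator[int]:
    """Yield items from nums, incrementing counter[0] for each one consumed."""
    for num in nums:
        counter[0] += 1
        yield num


def early_exit(nums: Iterable[int]) -> bool:
    """Hash set loop that stops at the first duplicate."""
    seen: set[int] = set()
    for num in nums:
        if num in seen:
            return True
        seen.add(num)
    return False


data = [7, 7] + list(range(1_000))  # duplicate appears at index 1

loop_reads = [0]
early_exit(counted(data, loop_reads))

set_reads = [0]
len(set(counted(data, set_reads)))  # building the set reads every element

print(loop_reads[0], set_reads[0])  # 2 1002
```

When duplicates tend to appear early, the explicit loop can finish after a handful of elements, while the one-liner still builds the full set.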

- approach_name: Sorting
is_optimal: false
code: |
def contains_duplicate(nums: list[int]) -> bool:
    # Sort in place so duplicates become adjacent (this mutates the caller's list)
    nums.sort()

    # Check adjacent pairs for duplicates
    for i in range(1, len(nums)):
        if nums[i] == nums[i - 1]:
            return True

    return False
explanation: |
**Time Complexity:** O(n log n) — Dominated by the sorting step.

**Space Complexity:** O(1) to O(n), depending on the sort. An in-place algorithm such as heapsort needs only constant extra space, but Python's `list.sort()` (Timsort) can allocate up to O(n) temporary memory while merging.

After sorting, any duplicates are adjacent, so a single scan over consecutive pairs finds them. This approach can help when memory is tight and an in-place sort is available, but it modifies the original array.

- approach_name: Brute Force
is_optimal: false
code: |
def contains_duplicate(nums: list[int]) -> bool:
    n = len(nums)

    # Compare every pair of elements
    for i in range(n):
        for j in range(i + 1, n):
            if nums[i] == nums[j]:
                return True

    return False
explanation: |
**Time Complexity:** O(n^2) — Nested loops comparing all pairs.

**Space Complexity:** O(1) — No extra data structures used.

This straightforward approach checks every possible pair. While correct, it's far too slow for large inputs (TLE on LeetCode). Included to illustrate why hash-based approaches are essential.
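
Although too slow to submit, the brute force version can still serve as a reference when testing a faster one on small inputs. The sketch below is one possible way to do that; the function names, the seed, and the input ranges are assumptions made for illustration.

```python
import random


def brute_force(nums: list[int]) -> bool:
    """Reference oracle: compare every pair of elements."""
    return any(
        nums[i] == nums[j]
        for i in range(len(nums))
        for j in range(i + 1, len(nums))
    )


def hash_set(nums: list[int]) -> bool:
    """Candidate under test: the set-length check."""
    return len(nums) != len(set(nums))


rng = random.Random(0)  # fixed seed so the run is repeatable
for _ in range(1_000):
    nums = [rng.randint(-5, 5) for _ in range(rng.randint(1, 8))]
    assert brute_force(nums) == hash_set(nums), nums
print("all random cases agree")
```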