questions C
210
backend/data/questions/contains-duplicate.yaml
Normal file
@@ -0,0 +1,210 @@
title: Contains Duplicate
slug: contains-duplicate
difficulty: easy
leetcode_id: 217
leetcode_url: https://leetcode.com/problems/contains-duplicate/
categories:
- arrays
- hash-tables
patterns:
- hash-set

function_signature: "def contains_duplicate(nums: list[int]) -> bool:"

test_cases:
visible:
- input: { nums: [1, 2, 3, 1] }
expected: true
- input: { nums: [1, 2, 3, 4] }
expected: false
- input: { nums: [1, 1, 1, 3, 3, 4, 3, 2, 4, 2] }
expected: true
hidden:
- input: { nums: [1] }
expected: false
- input: { nums: [1, 1] }
expected: true
- input: { nums: [-1, -1, -2, -3] }
expected: true
- input: { nums: [0, 0] }
expected: true

description: |
Given an integer array `nums`, return `true` if any value appears **at least twice** in the array, and return `false` if every element is distinct.

constraints: |
- `1 <= nums.length <= 10^5`
- `-10^9 <= nums[i] <= 10^9`

examples:
- input: "nums = [1,2,3,1]"
output: "true"
explanation: "The element 1 occurs at the indices 0 and 3."
- input: "nums = [1,2,3,4]"
output: "false"
explanation: "All elements are distinct."
- input: "nums = [1,1,1,3,3,4,3,2,4,2]"
output: "true"
explanation: "Multiple elements appear more than once."

explanation:
intuition: |
Imagine you're checking coats at a party and need to ensure no two guests have the same ticket number. As each guest arrives, you could compare their ticket to every previous ticket, but that gets tedious as the party grows. Instead, what if you kept a quick-reference list of all ticket numbers you've seen?

This is the core insight: **use a data structure that allows fast lookups** to check whether you've seen a number before. A *hash set* provides exactly this capability: adding an element and checking membership both take O(1) time on average.

Think of it like this: as you iterate through the array, you maintain a "memory" of all numbers encountered so far. For each new number, you ask: "Have I seen this before?" If yes, you've found a duplicate. If no, add it to your memory and continue.

The key constraint guiding our solution is the array size (up to 10^5 elements). This rules out O(n^2) approaches and points us toward O(n) or O(n log n) solutions.

approach: |
We solve this using a **Hash Set Approach**:

**Step 1: Create an empty set**

- `seen`: An empty set to store numbers we've encountered
- Sets provide O(1) average time for both insertion and membership testing

**Step 2: Iterate through the array**

- For each number in `nums`, check if it already exists in `seen`
- If the number is in `seen`, we've found a duplicate, so return `True` immediately
- If the number is not in `seen`, add it to the set and continue

**Step 3: Return the result**

- If we complete the loop without finding any duplicates, return `False`
- This means all elements were distinct

This approach works because hash sets give us constant-time lookups on average. We trade space (storing up to n elements) for time (avoiding nested comparisons).
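
The three steps can be traced with a small variant that also records where each value was first seen, which makes the duplicate's positions visible. This is only a sketch: the helper name `first_duplicate_indices` and the dict-based bookkeeping are assumptions made for illustration, and the actual solution only needs a set.

```python
from typing import Optional


def first_duplicate_indices(nums: list[int]) -> Optional[tuple[int, int]]:
    """Return (first_index, repeat_index) of the first duplicate found, or None."""
    # Step 1: empty "memory" mapping each value to the index where it was first seen
    first_seen: dict[int, int] = {}

    # Step 2: scan once, asking "have I seen this before?"
    for index, num in enumerate(nums):
        if num in first_seen:
            return first_seen[num], index
        first_seen[num] = index

    # Step 3: the loop finished, so every element was distinct
    return None


print(first_duplicate_indices([1, 2, 3, 1]))  # (0, 3)
print(first_duplicate_indices([1, 2, 3, 4]))  # None
```

For `[1, 2, 3, 1]` the scan stops at index 3, matching the first example's explanation.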

common_pitfalls:
- title: The Brute Force Trap
description: |
A natural first instinct is to compare every pair of elements:
- Outer loop `i` from `0` to `n-1`
- Inner loop `j` from `i+1` to `n-1`
- Check if `nums[i] == nums[j]`

This results in **O(n^2) time complexity**. With `nums.length <= 10^5`, the worst case is about 5 billion comparisons (n(n-1)/2 with n = 10^5), which will almost certainly hit **Time Limit Exceeded (TLE)**.

The hash set approach reduces this to O(n) by eliminating the inner loop entirely.
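
The worst-case comparison count for the nested loops can be checked with a few lines of arithmetic. The function name `pair_comparisons` is a hypothetical helper introduced only for this illustration.

```python
def pair_comparisons(n: int) -> int:
    """Number of (i, j) pairs with i < j among n elements."""
    return n * (n - 1) // 2


print(pair_comparisons(4))      # 6
print(pair_comparisons(10**5))  # 4999950000
```

At the maximum input size the nested loops would need just under 5 billion comparisons, while a single pass needs at most 10^5 set lookups.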
wrong_approach: "Nested loops comparing all pairs"
correct_approach: "Hash set for O(1) membership testing"

- title: Sorting Without Understanding the Trade-off
description: |
Sorting the array first (O(n log n)) and then checking adjacent elements works, but it has two downsides:
- Asymptotically slower than the hash set approach (O(n log n) versus O(n) on average)
- Modifies the original array (or requires O(n) extra space for a copy)

However, sorting can be preferable when memory is extremely constrained: an in-place algorithm such as heapsort needs only O(1) extra space. Note that Python's `list.sort()` (Timsort) can allocate O(n) temporary space in the worst case.
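
If the caller's list must stay untouched, the copy-based variant mentioned in the second downside might look like the sketch below. The name `contains_duplicate_sorted_copy` is an assumption for illustration; it pays the O(n) extra space in exchange for not mutating the input.

```python
def contains_duplicate_sorted_copy(nums: list[int]) -> bool:
    """Sorting-based check that leaves the caller's list unchanged."""
    ordered = sorted(nums)  # builds a new list: O(n) extra space
    # After sorting, equal values sit next to each other
    return any(a == b for a, b in zip(ordered, ordered[1:]))


nums = [3, 1, 2, 1]
print(contains_duplicate_sorted_copy(nums))  # True
print(nums)                                  # [3, 1, 2, 1]
```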
wrong_approach: "Always defaulting to sorting"
correct_approach: "Choose hash set for O(n) time when space permits"

- title: Using a List Instead of a Set
description: |
In Python, a membership check such as `x in some_list` is O(n), not O(1). Using a list instead of a set turns your "optimised" solution back into O(n^2).

```python
# Wrong - O(n^2) total
def contains_duplicate(nums: list[int]) -> bool:
    seen = []
    for num in nums:
        if num in seen:  # O(n) lookup!
            return True
        seen.append(num)
    return False
```

Always use a set (or dict) for membership testing.
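
The difference can be made visible by counting equality checks instead of timing them. The sketch below is illustrative only: `Counted`, `has_duplicate`, and `count_comparisons` are hypothetical names, and the wrapper exists purely so that every `==` evaluation can be tallied.

```python
class Counted:
    """Hypothetical wrapper that counts every == evaluation."""

    comparisons = 0

    def __init__(self, value: int) -> None:
        self.value = value

    def __eq__(self, other: object) -> bool:
        Counted.comparisons += 1
        return isinstance(other, Counted) and self.value == other.value

    def __hash__(self) -> int:
        return hash(self.value)


def has_duplicate(items: list, container_type: type) -> bool:
    """Same loop as above, with the container type (list or set) swapped in."""
    seen = container_type()
    for item in items:
        if item in seen:
            return True
        if isinstance(seen, list):
            seen.append(item)
        else:
            seen.add(item)
    return False


def count_comparisons(container_type: type, n: int = 100) -> int:
    """Count == evaluations for n distinct items (the worst case)."""
    items = [Counted(v) for v in range(n)]
    Counted.comparisons = 0
    has_duplicate(items, container_type)
    return Counted.comparisons


list_comparisons = count_comparisons(list)  # 0 + 1 + ... + 99 = 4950
set_comparisons = count_comparisons(set)
print(list_comparisons, set_comparisons)
```

With 100 distinct items the list version performs 4950 equality checks, while the set version needs far fewer because it compares hashes first.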

key_takeaways:
- "**Hash sets for membership testing**: When you need to check 'have I seen this before?', a set gives O(1) average-time lookups"
- "**Space-time trade-off**: Using O(n) extra space gives us O(n) time instead of O(n^2)"
- "**Early exit optimisation**: Return immediately when a duplicate is found; there is no need to check the rest"
- "**Foundation for harder problems**: This pattern appears in problems like Two Sum, finding pairs, and detecting cycles"

time_complexity: "O(n) on average. We traverse the array once, with O(1) average-time set operations at each step."
space_complexity: "O(n). In the worst case (all unique elements), we store all n elements in the set."

solutions:
- approach_name: Hash Set
is_optimal: true
code: |
def contains_duplicate(nums: list[int]) -> bool:
    # Set to track numbers we've seen
    seen = set()

    for num in nums:
        # Already seen this number? Duplicate found!
        if num in seen:
            return True
        # First time seeing this number, remember it
        seen.add(num)

    # No duplicates found after checking all elements
    return False
explanation: |
**Time Complexity:** O(n). Single pass through the array with O(1) average-time set operations.

**Space Complexity:** O(n). The set stores up to n elements in the worst case.

We iterate once, checking each number against our set of seen values. The moment we find a number already in the set, we return `True`. If we finish without finding duplicates, we return `False`.

- approach_name: One-liner with Set Length
is_optimal: true
code: |
def contains_duplicate(nums: list[int]) -> bool:
    # If set has fewer elements than list, duplicates exist
    return len(nums) != len(set(nums))
explanation: |
**Time Complexity:** O(n). Building a set from the list is O(n) on average.

**Space Complexity:** O(n). The set stores up to n elements.

This one-liner exploits the fact that sets automatically remove duplicates. If the set has fewer elements than the original list, at least one duplicate existed. Note: this always processes all elements, unlike the early-exit version above.

- approach_name: Sorting
is_optimal: false
code: |
def contains_duplicate(nums: list[int]) -> bool:
    # Sort the array so duplicates become adjacent (mutates the input)
    nums.sort()

    # Check adjacent pairs for duplicates
    for i in range(1, len(nums)):
        if nums[i] == nums[i - 1]:
            return True

    return False
explanation: |
**Time Complexity:** O(n log n). Dominated by the sorting step.

**Space Complexity:** O(1) to O(n), depending on the sort. An in-place algorithm such as heapsort needs O(1) extra space, while Python's `list.sort()` (Timsort) may allocate O(n) temporary space in the worst case.

After sorting, any duplicates will be adjacent. We scan through checking consecutive pairs. This approach is useful when memory is extremely limited and a truly in-place sort is available, but it modifies the original array.

- approach_name: Brute Force
is_optimal: false
code: |
def contains_duplicate(nums: list[int]) -> bool:
    n = len(nums)

    # Compare every pair of elements
    for i in range(n):
        for j in range(i + 1, n):
            if nums[i] == nums[j]:
                return True

    return False
explanation: |
**Time Complexity:** O(n^2). Nested loops comparing all pairs.

**Space Complexity:** O(1). No extra data structures used.

This straightforward approach checks every possible pair. While correct, it's far too slow for large inputs (TLE on LeetCode). It is included to illustrate why hash-based approaches are preferred.