questions C

2025-05-25 10:16:13 +01:00
parent c4662f5001
commit 615e3f1291
85 changed files with 16925 additions and 0 deletions
@@ -0,0 +1,210 @@
+title: Contains Duplicate
+slug: contains-duplicate
+difficulty: easy
+leetcode_id: 217
+leetcode_url: https://leetcode.com/problems/contains-duplicate/
+categories:
+  - arrays
+  - hash-tables
+patterns:
+  - hash-set
+
+function_signature: "def contains_duplicate(nums: list[int]) -> bool:"
+
+test_cases:
+  visible:
+    - input: { nums: [1, 2, 3, 1] }
+      expected: true
+    - input: { nums: [1, 2, 3, 4] }
+      expected: false
+    - input: { nums: [1, 1, 1, 3, 3, 4, 3, 2, 4, 2] }
+      expected: true
+  hidden:
+    - input: { nums: [1] }
+      expected: false
+    - input: { nums: [1, 1] }
+      expected: true
+    - input: { nums: [-1, -1, -2, -3] }
+      expected: true
+    - input: { nums: [0, 0] }
+      expected: true
+
+description: |
+  Given an integer array `nums`, return `true` if any value appears **at least twice** in the array, and return `false` if every element is distinct.
+
+constraints: |
+  - `1 <= nums.length <= 10^5`
+  - `-10^9 <= nums[i] <= 10^9`
+
+examples:
+  - input: "nums = [1,2,3,1]"
+    output: "true"
+    explanation: "The element 1 occurs at the indices 0 and 3."
+  - input: "nums = [1,2,3,4]"
+    output: "false"
+    explanation: "All elements are distinct."
+  - input: "nums = [1,1,1,3,3,4,3,2,4,2]"
+    output: "true"
+    explanation: "Multiple elements appear more than once."
+
+explanation:
+  intuition: |
+    Imagine you're checking coats at a party and need to ensure no two guests have the same ticket number. As each guest arrives, you could compare their ticket to every previous ticket — but that gets tedious as the party grows. Instead, what if you kept a quick-reference list of all ticket numbers you've seen?
+
+    This is the core insight: **use a data structure that allows instant lookups** to check if you've seen a number before. A *hash set* provides exactly this capability — adding an element and checking membership both take O(1) average time.
+
+    Think of it like this: as you iterate through the array, you maintain a "memory" of all numbers encountered so far. For each new number, you ask: "Have I seen this before?" If yes, you've found a duplicate. If no, add it to your memory and continue.
+
+    The key constraint guiding our solution is the array size (up to 10^5 elements). This rules out O(n^2) approaches and points us toward O(n) or O(n log n) solutions.
+
+  approach: |
+    We solve this using a **Hash Set Approach**:
+
+    **Step 1: Create an empty set**
+
+    - `seen`: An empty set to store numbers we've encountered
+    - Sets provide O(1) average time for both insertion and membership testing
+
+    &nbsp;
+
+    **Step 2: Iterate through the array**
+
+    - For each number in `nums`, check if it already exists in `seen`
+    - If the number is in `seen`, we've found a duplicate — return `True` immediately
+    - If the number is not in `seen`, add it to the set and continue
+
+    &nbsp;
+
+    **Step 3: Return the result**
+
+    - If we complete the loop without finding any duplicates, return `False`
+    - This means all elements were distinct
+
+    &nbsp;
+
+    This approach works because hash sets give us constant-time lookups on average. We trade space (storing up to n elements) for time (avoiding nested comparisons).
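+
+    The three steps above can be traced on a small input. The sketch below is illustrative only: the helper name `trace_seen` is hypothetical and not part of the reference solutions. It records, for each element visited, whether the value was already in `seen`, and stops at the first duplicate.
+
+    ```python
+    def trace_seen(nums: list[int]) -> list[tuple[int, bool]]:
+        # Hypothetical helper: returns (number, was_already_seen) per visited element
+        seen: set[int] = set()
+        steps: list[tuple[int, bool]] = []
+
+        for num in nums:
+            already_seen = num in seen  # Step 2: membership test
+            steps.append((num, already_seen))
+            if already_seen:
+                break  # Duplicate found, stop early
+            seen.add(num)
+
+        return steps
+
+
+    print(trace_seen([1, 2, 3, 1]))
+    # [(1, False), (2, False), (3, False), (1, True)]
+    ```
+
+    Only the final element reports `True`, which is the point at which the solution would return.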
+
+  common_pitfalls:
+    - title: The Brute Force Trap
+      description: |
+        A natural first instinct is to compare every pair of elements:
+        - Outer loop `i` from `0` to `n-1`
+        - Inner loop `j` from `i+1` to `n-1`
+        - Check if `nums[i] == nums[j]`
+
+        This results in **O(n^2) time complexity**. With `nums.length <= 10^5`, the worst case (no duplicates) performs n(n-1)/2 comparisons, roughly 5 billion, which will almost certainly result in **Time Limit Exceeded (TLE)**.
+
+        The hash set approach reduces this to O(n) by eliminating the inner loop entirely.
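+
+        The worst-case comparison count for the nested loops can be checked with a quick calculation. This is a small sketch; the helper name `pair_count` is hypothetical.
+
+        ```python
+        def pair_count(n: int) -> int:
+            # Number of (i, j) pairs with i < j, i.e. comparisons when no duplicate exists
+            return n * (n - 1) // 2
+
+
+        print(pair_count(10**5))  # 4999950000, roughly 5 billion
+        ```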
+      wrong_approach: "Nested loops comparing all pairs"
+      correct_approach: "Hash set for O(1) membership testing"
+
+    - title: Sorting Without Understanding the Trade-off
+      description: |
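+        Sorting the array first (O(n log n)) and then checking adjacent elements works, but it has two downsides:
+        - Asymptotically slower than the hash set approach (O(n log n) versus O(n) on average)
+        - Modifies the original array (or requires O(n) extra space for a copy)
+
+        However, sorting can be preferable when memory is extremely constrained, because an in-place sort such as heapsort needs only O(1) extra space. Be aware that Python's `list.sort()` (Timsort) may allocate up to O(n) temporary space in the worst case, so the saving is not guaranteed in Python.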
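+
+        The copy-versus-mutate trade-off in the list above can be sketched as follows. This is a minimal illustration rather than one of the reference solutions; the function name `contains_duplicate_sorted_copy` is hypothetical. It uses `sorted()` to leave the caller's list untouched, at the cost of an O(n) copy.
+
+        ```python
+        def contains_duplicate_sorted_copy(nums: list[int]) -> bool:
+            # sorted() returns a new list, so the caller's input is left untouched
+            ordered = sorted(nums)
+
+            # After sorting, equal values are adjacent
+            for i in range(1, len(ordered)):
+                if ordered[i] == ordered[i - 1]:
+                    return True
+
+            return False
+        ```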
+      wrong_approach: "Always defaulting to sorting"
+      correct_approach: "Choose hash set for O(n) time when space permits"
+
+    - title: Using a List Instead of a Set
+      description: |
+        In Python, checking `if x in list` is O(n), not O(1). Using a list instead of a set turns your "optimised" solution back into O(n^2).
+
+        ```python
+        # Wrong - O(n^2) total
+        seen = []
+        for num in nums:
+            if num in seen:  # O(n) lookup!
+                return True
+            seen.append(num)
+        ```
+
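+        Always use a set (or dict) for membership testing.
+      wrong_approach: "List for tracking seen values"
+      correct_approach: "Set for O(1) average membership testing"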
+
+  key_takeaways:
+    - "**Hash sets for membership testing**: When you need to check 'have I seen this before?', a set gives O(1) lookups"
+    - "**Space-time trade-off**: Using O(n) extra space gives us O(n) time instead of O(n^2)"
+    - "**Early exit optimisation**: Return immediately when a duplicate is found — no need to check the rest"
+    - "**Foundation for harder problems**: This pattern appears in problems like Two Sum, finding pairs, and detecting cycles"
+
+  time_complexity: "O(n) on average. We traverse the array once, with O(1) average-time set operations at each step."
+  space_complexity: "O(n). In the worst case (all unique elements), we store all n elements in the set."
+
+solutions:
+  - approach_name: Hash Set
+    is_optimal: true
+    code: |
+      def contains_duplicate(nums: list[int]) -> bool:
+          # Set to track numbers we've seen
+          seen = set()
+
+          for num in nums:
+              # Already seen this number? Duplicate found!
+              if num in seen:
+                  return True
+              # First time seeing this number, remember it
+              seen.add(num)
+
+          # No duplicates found after checking all elements
+          return False
+    explanation: |
+      **Time Complexity:** O(n) — Single pass through the array with O(1) set operations.
+
+      **Space Complexity:** O(n) — Set stores up to n elements in the worst case.
+
+      We iterate once, checking each number against our set of seen values. The moment we find a number already in the set, we return `True`. If we finish without finding duplicates, we return `False`.
+
+  - approach_name: One-liner with Set Length
+    is_optimal: true
+    code: |
+      def contains_duplicate(nums: list[int]) -> bool:
+          # If set has fewer elements than list, duplicates exist
+          return len(nums) != len(set(nums))
+    explanation: |
+      **Time Complexity:** O(n) — Building a set from the list is O(n).
+
+      **Space Complexity:** O(n) — The set stores up to n elements.
+
+      This elegant one-liner exploits the fact that sets automatically remove duplicates. If the set has fewer elements than the original list, at least one duplicate existed. Note: this always processes all elements, unlike the early-exit version above.
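+
+      To make the early-exit difference concrete, the sketch below counts how many elements the loop-based version visits before it returns. The helper name `elements_examined` is hypothetical and exists only for this illustration.
+
+      ```python
+      def elements_examined(nums: list[int]) -> int:
+          # Hypothetical helper: counts elements the early-exit loop visits before returning
+          seen: set[int] = set()
+          for count, num in enumerate(nums, start=1):
+              if num in seen:
+                  return count
+              seen.add(num)
+          return len(nums)
+
+
+      nums = [7, 7] + list(range(1_000))
+      print(elements_examined(nums))  # 2
+      ```
+
+      For the same input, `set(nums)` still hashes all 1,002 elements before the lengths are compared.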
+
+  - approach_name: Sorting
+    is_optimal: false
+    code: |
+      def contains_duplicate(nums: list[int]) -> bool:
+          # Sort in place so duplicates become adjacent (this mutates the caller's list)
+          nums.sort()
+
+          # Check adjacent pairs for duplicates
+          for i in range(1, len(nums)):
+              if nums[i] == nums[i - 1]:
+                  return True
+
+          return False
+    explanation: |
+      **Time Complexity:** O(n log n) — Dominated by the sorting step.
+
+      **Space Complexity:** O(1) for a truly in-place sort such as heapsort. Python's `list.sort()` (Timsort) may use up to O(n) temporary space in the worst case.
+
+      After sorting, any duplicates will be adjacent. We scan through checking consecutive pairs. This approach is useful when memory is extremely limited, but it modifies the original array.
+
+  - approach_name: Brute Force
+    is_optimal: false
+    code: |
+      def contains_duplicate(nums: list[int]) -> bool:
+          n = len(nums)
+
+          # Compare every pair of elements
+          for i in range(n):
+              for j in range(i + 1, n):
+                  if nums[i] == nums[j]:
+                      return True
+
+          return False
+    explanation: |
+      **Time Complexity:** O(n^2) — Nested loops comparing all pairs.
+
+      **Space Complexity:** O(1) — No extra data structures used.
+
+      This straightforward approach checks every possible pair. While correct, it's far too slow for large inputs (TLE on LeetCode). Included to illustrate why hash-based approaches are essential.