questions F-L

2025-05-25 11:47:04 +01:00
parent ecf95bd23d
commit 917c371529
54 changed files with 11235 additions and 0 deletions
--- a/backend/data/questions/longest-consecutive-sequence.yaml
+++ b/backend/data/questions/longest-consecutive-sequence.yaml
@@ -0,0 +1,231 @@
+title: Longest Consecutive Sequence
+slug: longest-consecutive-sequence
+difficulty: medium
+leetcode_id: 128
+leetcode_url: https://leetcode.com/problems/longest-consecutive-sequence/
+categories:
+  - arrays
+  - hash-tables
+patterns:
+  - union-find
+
+description: |
+  Given an unsorted array of integers `nums`, return *the length of the longest consecutive elements sequence*.
+
+  You must write an algorithm that runs in `O(n)` time.
+
+constraints: |
+  - `0 <= nums.length <= 10^5`
+  - `-10^9 <= nums[i] <= 10^9`
+
+examples:
+  - input: "nums = [100, 4, 200, 1, 3, 2]"
+    output: "4"
+    explanation: "The longest consecutive elements sequence is [1, 2, 3, 4]. Therefore its length is 4."
+  - input: "nums = [0, 3, 7, 2, 5, 8, 4, 6, 0, 1]"
+    output: "9"
+    explanation: "The longest consecutive elements sequence is [0, 1, 2, 3, 4, 5, 6, 7, 8]. Therefore its length is 9."
+  - input: "nums = [1, 0, 1, 2]"
+    output: "3"
+    explanation: "The longest consecutive elements sequence is [0, 1, 2]. Therefore its length is 3."
+
+explanation:
+  intuition: |
+    Imagine you have a collection of scattered puzzle pieces, each with a number on it. Your goal is to find the **longest chain** where each piece connects to the next (consecutive numbers). The naive approach would be to pick up each piece and search through all other pieces for its neighbour — but that's slow.
+
+    The key insight is this: **a consecutive sequence always has a starting point** — a number that has no predecessor (`num - 1` doesn't exist in the array). If we can identify these starting points efficiently, we can then count forward from each one to find the sequence length.
+
+    Think of it like this: instead of blindly searching, we first dump all the puzzle pieces into a bag (a hash set) for O(1) lookups. Then, for each piece, we ask: "Is there a piece with `num - 1`?" If not, this piece is the **start of a potential sequence**. We then count forward: does `num + 1` exist? Does `num + 2` exist? And so on.
+
+    Because we only count forward from sequence starts, each number is checked once as a possible start and reached at most once by a forward count, so the total work after building the set is O(n).
+
+  approach: |
+    We solve this using a **Hash Set Approach**:
+
+    **Step 1: Handle edge cases**
+
+    - If the array is empty, return `0`
+
+    &nbsp;
+
+    **Step 2: Build a hash set**
+
+    - Convert the array to a set for O(1) lookups
+    - This also automatically handles duplicates
+
+    &nbsp;
+
+    **Step 3: Find sequence starting points**
+
+    - Iterate through each number in the set
+    - A number is a sequence start if `num - 1` is NOT in the set
+    - This ensures we only start counting from the beginning of each sequence
+
+    &nbsp;
+
+    **Step 4: Count consecutive elements**
+
+    - For each starting point, count how many consecutive numbers exist
+    - Keep checking if `num + 1`, `num + 2`, etc. are in the set
+    - Track the maximum sequence length found
+
+    &nbsp;
+
+    **Step 5: Return the result**
+
+    - Return the longest sequence length found
+
+    &nbsp;
+
+    This approach is efficient because each number is processed at most twice: once to check if it's a starting point, and once when counting a sequence.
+
+  common_pitfalls:
+    - title: The Sorting Trap
+      description: |
+        A natural first instinct is to sort the array and then scan for consecutive elements. While this works correctly, sorting takes **O(n log n)** time.
+
+        The problem explicitly requires O(n) time, so a sorting-based solution does not meet the requirement. The hash set approach reaches O(n) expected time by trading space for time: it spends O(n) extra memory on the set.
+      wrong_approach: "Sort then scan for consecutive elements"
+      correct_approach: "Use a hash set for O(1) lookups"
+
+    - title: Counting From Every Number
+      description: |
+        If you try to count the sequence length starting from every number in the array, you'll get O(n²) time complexity in the worst case.
+
+        For example, with `nums = [1, 2, 3, 4, 5]`, starting from `5` counts 1 element, from `4` counts 2, from `3` counts 3, and so on, for a total of 1 + 2 + 3 + 4 + 5 = 15 steps. In general the total is 1 + 2 + ... + n = n(n + 1)/2, which is O(n²).
+
+        The fix is to **only count from sequence starting points** (numbers where `num - 1` doesn't exist). This ensures each element is counted exactly once across all sequences.
+      wrong_approach: "Count sequence length from every element"
+      correct_approach: "Only count from elements where num - 1 is not in the set"
+
+    - title: Not Handling Duplicates
+      description: |
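+        The array may contain duplicate values (e.g., `[1, 0, 1, 2]`). If you iterate over the original array instead of the set, every duplicate of a sequence start recounts the same sequence, which can degrade to O(n²) on inputs with many duplicates. In the sorting approach, forgetting to skip duplicates resets the count early and yields incorrect lengths.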
+
+        Using a set automatically deduplicates the input, ensuring each unique number is processed only once.
+
+  key_takeaways:
+    - "**Hash set for O(1) lookups**: When you need to check membership repeatedly, convert to a set first"
+    - "**Identify sequence boundaries**: Only start counting from elements that begin a sequence (`num - 1` not present)"
+    - "**Each element counted once**: The inner loop runs only from sequence starts, so total work stays O(n) despite the nested loops"
+    - "**Space-time tradeoff**: We use O(n) space to achieve O(n) time instead of O(n log n)"
+
+  time_complexity: "O(n). Each number is visited at most twice — once when checking if it's a sequence start, and once when counting forward from a starting point."
+  space_complexity: "O(n). We store all unique elements in a hash set."
+
+solutions:
+  - approach_name: Hash Set
+    is_optimal: true
+    code: |
+      def longest_consecutive(nums: list[int]) -> int:
+          if not nums:
+              return 0
+
+          # Build a set for O(1) lookups
+          num_set = set(nums)
+          longest = 0
+
+          for num in num_set:
+              # Only start counting if this is the beginning of a sequence
+              # (i.e., num - 1 is not in the set)
+              if num - 1 not in num_set:
+                  current_num = num
+                  current_length = 1
+
+                  # Count consecutive numbers
+                  while current_num + 1 in num_set:
+                      current_num += 1
+                      current_length += 1
+
+                  # Update the longest sequence found
+                  longest = max(longest, current_length)
+
+          return longest
+    explanation: |
+      **Time Complexity:** O(n) — Each number is processed at most twice.
+
+      **Space Complexity:** O(n) — Hash set stores all unique elements.
+
+      The key optimisation is only counting from sequence starting points. When we find a number where `num - 1` doesn't exist, we know it's the start of a new sequence and count forward from there.
+
+  - approach_name: Sorting
+    is_optimal: false
+    code: |
+      def longest_consecutive(nums: list[int]) -> int:
+          if not nums:
+              return 0
+
+          # Sort in place (note: this mutates the caller's list)
+          nums.sort()
+
+          longest = 1
+          current_length = 1
+
+          for i in range(1, len(nums)):
+              # Skip duplicates
+              if nums[i] == nums[i - 1]:
+                  continue
+
+              # Check if consecutive
+              if nums[i] == nums[i - 1] + 1:
+                  current_length += 1
+              else:
+                  # Sequence broken, start fresh
+                  longest = max(longest, current_length)
+                  current_length = 1
+
+          return max(longest, current_length)
+    explanation: |
+      **Time Complexity:** O(n log n) — Dominated by the sorting step.
+
+      **Space Complexity:** O(n) in the worst case for Python's `list.sort()` (Timsort); an in-place sort such as heapsort would need only O(1).
+
+      This approach sorts the array first, then scans linearly to find consecutive sequences. While simpler to understand, it doesn't meet the O(n) time requirement specified in the problem. Included here to illustrate the tradeoff between simplicity and optimal complexity.
+
+  - approach_name: Union-Find
+    is_optimal: false
+    code: |
+      def longest_consecutive(nums: list[int]) -> int:
+          if not nums:
+              return 0
+
+          # Map each distinct number to its parent; every number starts as its own root
+          parent = {num: num for num in nums}
+
+          # Iterative find with path compression (avoids Python's recursion limit)
+          def find(x):
+              root = x
+              while parent[root] != root:
+                  root = parent[root]
+              while parent[x] != root:
+                  parent[x], x = root, parent[x]
+              return root
+
+          def union(x, y):
+              root_x, root_y = find(x), find(y)
+              if root_x != root_y:
+                  # Always point to the larger number so the root is the sequence maximum
+                  if root_x < root_y:
+                      parent[root_x] = root_y
+                  else:
+                      parent[root_y] = root_x
+
+          # Union consecutive numbers
+          for num in parent:
+              if num + 1 in parent:
+                  union(num, num + 1)
+
+          # The root is the largest value in a sequence, so root - num + 1 is the
+          # length of the run from num to its end; the sequence start maximises it
+          longest = 0
+          for num in parent:
+              root = find(num)
+              longest = max(longest, root - num + 1)
+
+          return longest
+    explanation: |
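+      **Time Complexity:** Near-linear. Path compression alone gives an amortised O(log n) per operation; the O(n × α(n)) bound, where α is the inverse Ackermann function, additionally requires union by rank or size.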
+
+      **Space Complexity:** O(n) — Storage for the parent mapping.
+
+      Union-Find groups consecutive numbers into the same set. This is a valid near-linear approach, but it is more complex than the hash set solution, which is preferred for its simplicity and clarity.