questions M-R

2025-05-25 12:43:25 +01:00
parent 917c371529
commit 68699f35ec
62 changed files with 12841 additions and 0 deletions
--- a/backend/data/questions/redundant-connection.yaml
+++ b/backend/data/questions/redundant-connection.yaml
@@ -0,0 +1,196 @@
+title: Redundant Connection
+slug: redundant-connection
+difficulty: medium
+leetcode_id: 684
+leetcode_url: https://leetcode.com/problems/redundant-connection/
+categories:
+  - graphs
+patterns:
+  - union-find
+  - dfs
+
+description: |
+  In this problem, a tree is an **undirected graph** that is connected and has no cycles.
+
+  You are given a graph that started as a tree with `n` nodes labeled from `1` to `n`, with one additional edge added. The added edge has two **different** vertices chosen from `1` to `n`, and was not an edge that already existed. The graph is represented as an array `edges` of length `n` where `edges[i] = [a_i, b_i]` indicates that there is an edge between nodes `a_i` and `b_i` in the graph.
+
+  Return *an edge that can be removed so that the resulting graph is a tree of* `n` *nodes*. If there are multiple answers, return the answer that occurs last in the input.
+
+constraints: |
+  - `n == edges.length`
+  - `3 <= n <= 1000`
+  - `edges[i].length == 2`
+  - `1 <= a_i < b_i <= edges.length`
+  - `a_i != b_i`
+  - There are no repeated edges
+  - The given graph is connected
+
+examples:
+  - input: "edges = [[1,2],[1,3],[2,3]]"
+    output: "[2,3]"
+    explanation: "Adding edge [2,3] creates a cycle 1-2-3-1. Removing it restores the tree structure."
+  - input: "edges = [[1,2],[2,3],[3,4],[1,4],[1,5]]"
+    output: "[1,4]"
+    explanation: "Adding edge [1,4] creates a cycle 1-2-3-4-1. Any edge on that cycle could be removed, and [1,4] is the cycle edge that occurs last in the input, so removing it restores the tree."
+
+explanation:
+  intuition: |
+    Imagine building a forest from scratch by adding edges one by one. Each time you add an edge, you're either **connecting two separate trees** (valid) or **connecting two nodes that are already in the same tree** (creates a cycle).
+
+    Think of it like connecting islands with bridges. If two islands are already connected (possibly through other islands), building another bridge between them creates a loop — that's the redundant connection.
+
+    The key insight is: **the first edge that connects two already-connected nodes is the edge that closes the cycle**. The graph contains exactly one cycle, and every other edge on that cycle appears earlier in the input, so this edge is also the removable edge that occurs last. We process edges in order and return the first one whose endpoints are already connected.
+
+    This is exactly what **Union-Find** (Disjoint Set Union) excels at: efficiently tracking which nodes belong to the same connected component and detecting when an edge would connect nodes already in the same component.
+
+  approach: |
+    We solve this using **Union-Find (Disjoint Set Union)**:
+
+    **Step 1: Initialise the Union-Find structure**
+
+    - `parent`: Array where `parent[i]` initially equals `i` (each node is its own parent)
+    - `rank`: Array holding an upper bound on each root's tree height, used for the union by rank optimisation
+
+    &nbsp;
+
+    **Step 2: Define helper functions**
+
+    - `find(x)`: Returns the root of `x`'s component, with path compression
+    - `union(x, y)`: Merges components of `x` and `y`, returns `False` if already connected
+
+    &nbsp;
+
+    **Step 3: Process each edge in order**
+
+    - For each edge `[a, b]`, attempt to union nodes `a` and `b`
+    - If `find(a) == find(b)`, they're already in the same component — this edge is redundant
+    - Return that edge immediately: it is the first edge to close a cycle, and therefore the cycle edge that occurs last in the input
+
+    &nbsp;
+
+    **Step 4: Return the redundant edge**
+
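+    - The problem guarantees exactly one extra edge, so the graph contains exactly one cycle and any edge on it could be removed
+    - Since we process edges in order, the edge that closes the cycle is the cycle edge that occurs last in the input, which is the required answer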
+
+    &nbsp;
+
+    The Union-Find approach is ideal here because it efficiently handles dynamic connectivity queries. Path compression and union by rank ensure near-constant time operations.
+
+  common_pitfalls:
+    - title: Forgetting Path Compression
+      description: |
+        Without path compression or union by rank, `find()` can degrade to O(n) per call, making the overall solution O(n^2). Union by rank alone bounds each call at O(log n); adding path compression brings the amortised cost down to O(α(n)).
+
+        Path compression flattens the tree structure by making each node visited during a `find()` call point directly to the root. This keeps the tree shallow and ensures near-constant amortised lookups.
+      wrong_approach: "Simple find without path compression"
+      correct_approach: "find() with path compression: parent[x] = find(parent[x])"
+
+    - title: Using 0-indexed Arrays with 1-indexed Nodes
+      description: |
+        The problem uses nodes labeled `1` to `n`, but many implementations use 0-indexed arrays.
+
+        Either allocate arrays of size `n + 1` (indices 0 to n, ignoring index 0), or subtract 1 from each node value. Mixing conventions leads to off-by-one errors.
+      wrong_approach: "Array of size n with 1-indexed node access"
+      correct_approach: "Array of size n+1 to accommodate 1-indexed nodes"
+
+    - title: Using DFS/BFS for Each Edge
+      description: |
+        A naive approach runs DFS/BFS before each edge to check if the two nodes are already connected. This works but is O(n^2) in the worst case.
+
+        Union-Find provides amortised O(α(n)) per operation (nearly constant), making the total time complexity O(n × α(n)) ≈ O(n).
+      wrong_approach: "DFS/BFS connectivity check before each edge"
+      correct_approach: "Union-Find with path compression and union by rank"
+
+  key_takeaways:
+    - "**Union-Find pattern**: The go-to data structure for dynamic connectivity problems — tracking which elements belong to the same group"
+    - "**Cycle detection in graphs**: An edge creates a cycle if and only if it connects two nodes already in the same connected component"
+    - "**Path compression + union by rank**: These two optimisations together give nearly O(1) amortised time per operation"
+    - "**Foundation for harder problems**: This pattern extends to problems like accounts merge, number of provinces, and minimum spanning trees (Kruskal's algorithm)"
+
+  time_complexity: "O(n × α(n)), where α is the inverse Ackermann function. With path compression and union by rank, each union/find operation is nearly O(1), and we perform n operations."
+  space_complexity: "O(n). We store parent and rank arrays of size n+1 to represent the Union-Find structure."
+
+solutions:
+  - approach_name: Union-Find
+    is_optimal: true
+    code: |
+      def find_redundant_connection(edges: list[list[int]]) -> list[int]:
+          n = len(edges)
+          # Initialise parent array: each node is its own parent
+          parent = list(range(n + 1))
+          # Rank for union by rank optimisation
+          rank = [0] * (n + 1)
+
+          def find(x: int) -> int:
+              # Path compression: make each node point to root
+              if parent[x] != x:
+                  parent[x] = find(parent[x])
+              return parent[x]
+
+          def union(x: int, y: int) -> bool:
+              # Find roots of both nodes
+              root_x, root_y = find(x), find(y)
+
+              # Already in same component - this edge creates a cycle
+              if root_x == root_y:
+                  return False
+
+              # Union by rank: attach the lower-rank root under the higher-rank root
+              if rank[root_x] < rank[root_y]:
+                  parent[root_x] = root_y
+              elif rank[root_x] > rank[root_y]:
+                  parent[root_y] = root_x
+              else:
+                  parent[root_y] = root_x
+                  rank[root_x] += 1
+
+              return True
+
+          # Process each edge - first one that fails union is redundant
+          for a, b in edges:
+              if not union(a, b):
+                  return [a, b]
+
+          return []  # Problem guarantees a redundant edge exists
+    explanation: |
+      **Time Complexity:** O(n × α(n)) — Each union/find is nearly O(1) with path compression and union by rank.
+
+      **Space Complexity:** O(n) — Parent and rank arrays.
+
+      We process edges in order. For each edge, we try to union the two nodes. If they're already in the same component (same root), this edge would create a cycle, so it is the redundant connection. The first edge to fail closes the graph's only cycle, which means every other cycle edge precedes it in the input and it is the answer that occurs last.
+
+  - approach_name: DFS Cycle Detection
+    is_optimal: false
+    code: |
+      from collections import defaultdict
+
+      def find_redundant_connection(edges: list[list[int]]) -> list[int]:
+          graph = defaultdict(set)
+
+          def has_path(source: int, target: int) -> bool:
+              # Iterative DFS: an explicit stack avoids Python's recursion limit on long paths
+              stack = [source]
+              visited = {source}
+              while stack:
+                  node = stack.pop()
+                  if node == target:
+                      return True
+                  for neighbour in graph[node]:
+                      if neighbour not in visited:
+                          visited.add(neighbour)
+                          stack.append(neighbour)
+              return False
+
+          for a, b in edges:
+              # Before adding edge, check if nodes are already connected
+              if has_path(a, b):
+                  return [a, b]
+              # Add edge to graph
+              graph[a].add(b)
+              graph[b].add(a)
+
+          return []
+    explanation: |
+      **Time Complexity:** O(n^2) — For each of n edges, we potentially traverse up to n nodes.
+
+      **Space Complexity:** O(n) — Graph adjacency list, visited set, and DFS stack.
+
+      Before adding each edge, we use DFS to check if the two nodes are already connected. If they are, adding this edge would create a cycle. While correct, this approach is slower than Union-Find for large inputs.