questions M-R

2025-05-25 12:43:25 +01:00
parent ad320dc703
commit 0a0feb93b5
62 changed files with 12841 additions and 0 deletions
@@ -0,0 +1,218 @@
+title: Min Cost to Connect All Points
+slug: min-cost-to-connect-all-points
+difficulty: medium
+leetcode_id: 1584
+leetcode_url: https://leetcode.com/problems/min-cost-to-connect-all-points/
+categories:
+  - graphs
+  - heap
+patterns:
+  - heap
+  - union-find
+
+description: |
+  You are given an array `points` representing integer coordinates of some points on a 2D-plane, where `points[i] = [x_i, y_i]`.
+
+  The cost of connecting two points `[x_i, y_i]` and `[x_j, y_j]` is the **manhattan distance** between them: `|x_i - x_j| + |y_i - y_j|`, where `|val|` denotes the absolute value of `val`.
+
+  Return *the minimum cost to make all points connected*. All points are connected if there is **exactly one** simple path between any two points.
+
+constraints: |
+  - `1 <= points.length <= 1000`
+  - `-10^6 <= x_i, y_i <= 10^6`
+  - All pairs `(x_i, y_i)` are distinct
+
+examples:
+  - input: "points = [[0,0],[2,2],[3,10],[5,2],[7,0]]"
+    output: "20"
+    explanation: "We can connect points with edges of total cost 20. Notice that there is a unique path between every pair of points."
+  - input: "points = [[3,12],[-2,5],[-4,1]]"
+    output: "18"
+    explanation: "Connect the three points with edges to form a tree of minimum total cost."
+
+explanation:
+  intuition: |
+    This problem is asking us to connect all points with the **minimum total edge cost** such that any point can reach any other point. This is exactly the definition of a **Minimum Spanning Tree (MST)**.
+
+    Think of it like this: imagine you're a city planner laying down cables between houses. Each house is a point, and the cost of laying cable between two houses is the manhattan distance. You want to connect all houses while spending as little as possible on cable.
+
+    The key insight is that to connect `n` points, we need exactly `n - 1` edges (any more would create a cycle, any fewer would leave points disconnected). Among all possible ways to choose `n - 1` edges that connect everything, we want the one with minimum total weight.
+
+    Two classic algorithms solve this:
+    - **Prim's Algorithm**: Start from one point and greedily add the cheapest edge that connects a new point to our growing tree
+    - **Kruskal's Algorithm**: Sort all edges by cost and greedily add them if they don't create a cycle
+
+    Since the graph is **dense** (every point can connect to every other point, giving us `n(n-1)/2` edges), Prim's algorithm is a natural fit here: it can compute edge weights on demand, whereas Kruskal's has to materialise and sort every edge first.
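+
+    To make the edge cost and the edge count concrete, here is a minimal sketch. The helper names `manhattan` and `complete_graph_edge_count` are hypothetical and exist only for this illustration.
+
+    ```python
+    def manhattan(p: list[int], q: list[int]) -> int:
+        """Cost of connecting two points: |x_i - x_j| + |y_i - y_j|."""
+        return abs(p[0] - q[0]) + abs(p[1] - q[1])
+
+
+    def complete_graph_edge_count(n: int) -> int:
+        """Number of undirected edges when every point connects to every other."""
+        return n * (n - 1) // 2
+
+
+    points = [[0, 0], [2, 2], [3, 10], [5, 2], [7, 0]]
+    first_edge_cost = manhattan(points[0], points[1])
+    edges_at_limit = complete_graph_edge_count(1000)
+    ```
+
+    At the upper constraint of 1000 points the complete graph has 499,500 edges, which is why storing and sorting all of them is worth avoiding when possible.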
+
+  approach: |
+    We'll use **Prim's Algorithm** with a min-heap to build the MST efficiently.
+
+    **Step 1: Initialise data structures**
+
+    - `total_cost`: Set to `0` to accumulate the MST weight
+    - `visited`: A set to track which points are already in our MST
+    - `min_heap`: Priority queue storing `(cost, point_index)` tuples, initialised with `(0, 0)` to start from point 0
+
+    &nbsp;
+
+    **Step 2: Build the MST greedily**
+
+    - While we haven't connected all `n` points:
+      - Pop the minimum cost edge from the heap
+      - If this point is already visited, skip it (it was already added to the tree via a cheaper edge)
+      - Otherwise, add this point to the MST: mark as visited, add the edge cost to `total_cost`
+      - For each unvisited point, calculate the manhattan distance and push `(distance, point_index)` to the heap
+
+    &nbsp;
+
+    **Step 3: Return the result**
+
+    - Return `total_cost` once all `n` points are connected
+
+    &nbsp;
+
+    The min-heap ensures we always process the cheapest available edge first, guaranteeing we build an optimal MST.
+
+  common_pitfalls:
+    - title: Using Adjacency List for Dense Graph
+      description: |
+        A common instinct is to precompute all edges and store them in an adjacency list. With `n` points, this creates `n(n-1)/2` edges, using O(n^2) space.
+
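+        For this problem with `n <= 1000`, that's about 500,000 edges, which is manageable. However, Prim's algorithm can compute edge weights on the fly, so no edge list has to be built up front and the time complexity stays the same. Note that the heap itself can still accumulate many stale entries, so the saving is in the precomputed edge list, not necessarily in peak memory.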
+      wrong_approach: "Precompute and store all O(n^2) edges"
+      correct_approach: "Compute manhattan distance on-demand during Prim's traversal"
+
+    - title: Forgetting to Check Visited Before Processing
+      description: |
+        When popping from the heap, the same point might appear multiple times with different costs (we pushed it once for each neighbor that discovered it). Always check if a point is already in the MST before processing.
+
+        Processing a visited point would add an edge between two points that are already in the tree, creating a cycle and inflating the total cost.
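+
+        A minimal sketch of how often this happens in practice, assuming the same heap-based Prim's loop. The function name `prim_with_stats` and its extra return values are hypothetical, added only to count stale pops and track heap growth.
+
+        ```python
+        import heapq
+
+
+        def prim_with_stats(points: list[list[int]]) -> tuple[int, int, int]:
+            """Heap-based Prim's that also reports stale pops and peak heap size."""
+            n = len(points)
+            visited = set()
+            min_heap = [(0, 0)]  # (cost, point_index), start from point 0
+            total_cost = 0
+            stale_pops = 0  # entries popped for points already in the MST
+            peak_heap = 0   # largest heap size seen after a round of pushes
+
+            while len(visited) < n:
+                cost, curr = heapq.heappop(min_heap)
+                if curr in visited:
+                    # Without this check the cost would be added a second time
+                    stale_pops += 1
+                    continue
+
+                visited.add(curr)
+                total_cost += cost
+
+                for nxt in range(n):
+                    if nxt not in visited:
+                        dist = (abs(points[curr][0] - points[nxt][0]) +
+                                abs(points[curr][1] - points[nxt][1]))
+                        heapq.heappush(min_heap, (dist, nxt))
+                peak_heap = max(peak_heap, len(min_heap))
+
+            return total_cost, stale_pops, peak_heap
+
+
+        total, stale, peak = prim_with_stats([[0, 0], [2, 2], [3, 10], [5, 2], [7, 0]])
+        ```
+
+        Even on the five-point example some pops are stale, because a point is pushed again each time a newly added point sees it as unvisited.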
+      wrong_approach: "Process every heap entry without checking visited"
+      correct_approach: "Skip heap entries for already-visited points"
+
+    - title: Off-by-One in Edge Count
+      description: |
+        An MST connecting `n` nodes has exactly `n - 1` edges. Some implementations track edge count to know when to stop. If you're counting edges, ensure you stop at `n - 1`, not `n`.
+
+        Using a visited set with size check `len(visited) == n` avoids this issue entirely.
+      wrong_approach: "Stop when edge_count == n"
+      correct_approach: "Stop when len(visited) == n or edge_count == n - 1"
+
+  key_takeaways:
+    - "**Minimum Spanning Tree**: When connecting nodes with minimum cost and no cycles, think MST algorithms (Prim's or Kruskal's)"
+    - "**Dense vs Sparse graphs**: Prim's with a binary heap is O(E log V), which suits sparse graphs; when E approaches V^2, as it does here, array-based Prim's in O(V^2) is asymptotically faster"
+    - "**On-demand computation**: For fully connected graphs, compute edge weights as needed rather than storing them all"
+    - "**Heap for greedy selection**: Min-heaps efficiently find the next best edge in O(log n) time"
+
+  time_complexity: "O(n^2 log n). We potentially push O(n^2) edges to the heap, and each heap operation is O(log n). Alternatively, O(n^2) using Prim's with an array instead of a heap."
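+  space_complexity: "O(n^2) in the worst case for the heap version. The visited set is O(n), but stale heap entries are only discarded when they are popped, so the heap can accumulate O(n^2) entries. Prim's with an array instead of a heap needs only O(n)."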
+
+solutions:
+  - approach_name: Prim's Algorithm with Min-Heap
+    is_optimal: true
+    code: |
+      import heapq
+
+      def min_cost_connect_points(points: list[list[int]]) -> int:
+          n = len(points)
+          if n <= 1:
+              return 0
+
+          # Track which points are in our MST
+          visited = set()
+          # Min-heap: (cost, point_index)
+          # Start from point 0 with cost 0
+          min_heap = [(0, 0)]
+          total_cost = 0
+
+          while len(visited) < n:
+              # Pop the cheapest candidate edge (its endpoint may already be in the MST)
+              cost, curr = heapq.heappop(min_heap)
+
+              # Skip stale entries: this point was already added via a cheaper edge
+              if curr in visited:
+                  continue
+
+              # Add this point to MST
+              visited.add(curr)
+              total_cost += cost
+
+              # Explore edges to all unvisited points
+              for next_point in range(n):
+                  if next_point not in visited:
+                      # Calculate manhattan distance
+                      dist = (abs(points[curr][0] - points[next_point][0]) +
+                              abs(points[curr][1] - points[next_point][1]))
+                      heapq.heappush(min_heap, (dist, next_point))
+
+          return total_cost
+    explanation: |
+      **Time Complexity:** O(n^2 log n) — We push up to O(n^2) edges to the heap, each operation is O(log n).
+
+      **Space Complexity:** O(n^2) in the worst case. The visited set is O(n), but a point is pushed again every time a newly added point sees it as unvisited, and stale entries stay in the heap until popped, so the heap can hold O(n^2) entries.
+
+      This is the lazy, heap-based form of Prim's algorithm. We greedily select the minimum cost edge that adds a new point to our growing MST, continuing until all points are connected.
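+
+      The time complexity note mentions Prim's with an array instead of a heap. A minimal sketch of that variant follows, assuming the same input format; the function name `min_cost_connect_points_dense` and the `best` array are hypothetical names chosen for this example, not part of the solution above.
+
+      ```python
+      def min_cost_connect_points_dense(points: list[list[int]]) -> int:
+          """Array-based Prim's: O(n^2) time, O(n) extra space."""
+          n = len(points)
+          in_mst = [False] * n
+          # best[i] = cheapest known edge from the current tree to point i
+          best = [float("inf")] * n
+          if n > 0:
+              best[0] = 0  # start from point 0 at no cost
+          total_cost = 0
+
+          for _ in range(n):
+              # Linear scan for the closest point not yet in the tree
+              curr = -1
+              for i in range(n):
+                  if not in_mst[i] and (curr == -1 or best[i] < best[curr]):
+                      curr = i
+
+              in_mst[curr] = True
+              total_cost += best[curr]
+
+              # Relax the cheapest edge to every point still outside the tree
+              for nxt in range(n):
+                  if not in_mst[nxt]:
+                      dist = (abs(points[curr][0] - points[nxt][0]) +
+                              abs(points[curr][1] - points[nxt][1]))
+                      if dist < best[nxt]:
+                          best[nxt] = dist
+
+          return total_cost
+      ```
+
+      Each of the `n` rounds does two linear passes, so no heap is needed and there are no stale entries to skip. For a complete graph this trades the `log n` factor for a simple scan.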
+
+  - approach_name: Kruskal's Algorithm with Union-Find
+    is_optimal: false
+    code: |
+      class UnionFind:
+          def __init__(self, n: int):
+              self.parent = list(range(n))
+              self.rank = [0] * n
+
+          def find(self, x: int) -> int:
+              # Path compression
+              if self.parent[x] != x:
+                  self.parent[x] = self.find(self.parent[x])
+              return self.parent[x]
+
+          def union(self, x: int, y: int) -> bool:
+              # Union by rank, returns True if merged
+              px, py = self.find(x), self.find(y)
+              if px == py:
+                  return False
+              if self.rank[px] < self.rank[py]:
+                  px, py = py, px
+              self.parent[py] = px
+              if self.rank[px] == self.rank[py]:
+                  self.rank[px] += 1
+              return True
+
+      def min_cost_connect_points(points: list[list[int]]) -> int:
+          n = len(points)
+          if n <= 1:
+              return 0
+
+          # Generate all edges: (cost, point_i, point_j)
+          edges = []
+          for i in range(n):
+              for j in range(i + 1, n):
+                  dist = (abs(points[i][0] - points[j][0]) +
+                          abs(points[i][1] - points[j][1]))
+                  edges.append((dist, i, j))
+
+          # Sort edges by cost
+          edges.sort()
+
+          # Build MST using Union-Find
+          uf = UnionFind(n)
+          total_cost = 0
+          edges_used = 0
+
+          for cost, u, v in edges:
+              # Only add edge if it connects two components
+              if uf.union(u, v):
+                  total_cost += cost
+                  edges_used += 1
+                  # MST complete when we have n-1 edges
+                  if edges_used == n - 1:
+                      break
+
+          return total_cost
+    explanation: |
+      **Time Complexity:** O(n^2 log n) — Generating edges is O(n^2), sorting is O(n^2 log n), and Union-Find operations are nearly O(1) amortized.
+
+      **Space Complexity:** O(n^2) — We store all n(n-1)/2 edges before sorting.
+
+      Kruskal's algorithm sorts all edges and greedily adds them if they don't create a cycle. Union-Find efficiently detects cycles. This approach uses more memory due to storing all edges but is conceptually simpler.