questions M-R
218
backend/data/questions/min-cost-to-connect-all-points.yaml
Normal file
@@ -0,0 +1,218 @@
title: Min Cost to Connect All Points
slug: min-cost-to-connect-all-points
difficulty: medium
leetcode_id: 1584
leetcode_url: https://leetcode.com/problems/min-cost-to-connect-all-points/
categories:
  - graphs
  - heap
patterns:
  - heap
  - union-find

description: |
  You are given an array `points` representing integer coordinates of some points on a 2D-plane, where `points[i] = [x_i, y_i]`.

  The cost of connecting two points `[x_i, y_i]` and `[x_j, y_j]` is the **manhattan distance** between them: `|x_i - x_j| + |y_i - y_j|`, where `|val|` denotes the absolute value of `val`.

  Return *the minimum cost to make all points connected*. All points are connected if there is **exactly one** simple path between any two points.

constraints: |
  - `1 <= points.length <= 1000`
  - `-10^6 <= x_i, y_i <= 10^6`
  - All pairs `(x_i, y_i)` are distinct

examples:
  - input: "points = [[0,0],[2,2],[3,10],[5,2],[7,0]]"
    output: "20"
    explanation: "We can connect points with edges of total cost 20. Notice that there is a unique path between every pair of points."
  - input: "points = [[3,12],[-2,5],[-4,1]]"
    output: "18"
    explanation: "Connect the three points with edges to form a tree of minimum total cost."

explanation:
  intuition: |
    This problem asks us to connect all points with the **minimum total edge cost** such that any point can reach any other point. This is exactly the definition of a **Minimum Spanning Tree (MST)**.

    Think of it like this: imagine you're a city planner laying down cables between houses. Each house is a point, and the cost of laying cable between two houses is the manhattan distance. You want to connect all houses while spending as little as possible on cable.

    The key insight is that to connect `n` points, we need exactly `n - 1` edges (any more would create a cycle, any fewer would leave points disconnected). Among all possible ways to choose `n - 1` edges that connect everything, we want the one with minimum total weight.

    Two classic algorithms solve this:
    - **Prim's Algorithm**: Start from one point and greedily add the cheapest edge that connects a new point to our growing tree
    - **Kruskal's Algorithm**: Sort all edges by cost and greedily add them if they don't create a cycle

    Since the graph is **dense** (every point can connect to every other point, giving us `n(n-1)/2` edges), Prim's algorithm is a natural fit: it computes edge weights on the fly instead of building and sorting the full edge list that Kruskal's requires. With a min-heap, both algorithms run in O(n^2 log n) time on this graph.
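
    To make the edge count concrete, here is a minimal sketch that enumerates every candidate edge of the complete graph for the first example. The helper names `manhattan` and `all_edges` are illustrative assumptions and are not part of the reference solutions.

    ```python
    from itertools import combinations


    def manhattan(p, q):
        # Manhattan distance between two [x, y] points
        return abs(p[0] - q[0]) + abs(p[1] - q[1])


    def all_edges(points):
        # One (cost, i, j) tuple per unordered pair of points
        return [
            (manhattan(points[i], points[j]), i, j)
            for i, j in combinations(range(len(points)), 2)
        ]


    points = [[0, 0], [2, 2], [3, 10], [5, 2], [7, 0]]
    edges = all_edges(points)
    print(len(edges))  # n(n-1)/2 = 5 * 4 / 2 = 10 candidate edges
    ```

    An MST over these 5 points keeps only 4 of the 10 candidates.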

  approach: |
    We'll use **Prim's Algorithm** with a min-heap to build the MST efficiently.

    **Step 1: Initialise data structures**

    - `total_cost`: Set to `0` to accumulate the MST weight
    - `visited`: A set to track which points are already in our MST
    - `min_heap`: Priority queue storing `(cost, point_index)` tuples, initialised with `(0, 0)` to start from point 0

    **Step 2: Build the MST greedily**

    - While we haven't connected all `n` points:
      - Pop the minimum cost edge from the heap
      - If this point is already visited, skip it (a cheaper edge already added it to the tree)
      - Otherwise, add this point to the MST: mark it as visited and add the edge cost to `total_cost`
      - For each unvisited point, calculate the manhattan distance and push `(distance, point_index)` to the heap

    **Step 3: Return the result**

    - Return `total_cost` once all `n` points are connected

    The min-heap ensures we always process the cheapest available edge first, guaranteeing we build an optimal MST.
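
    To see which edges the steps above select, the following sketch extends the heap entries with the index of the point that pushed them. This is an illustrative variant, not the reference solution: the name `prim_with_edges` and the extra `parent` field are assumptions made for the example.

    ```python
    import heapq


    def prim_with_edges(points):
        """Return (total_cost, edges), where edges are (from_index, to_index) pairs."""
        n = len(points)
        visited = set()
        # Heap entries: (cost, point_index, parent_index); -1 marks the start point
        min_heap = [(0, 0, -1)]
        total_cost = 0
        edges = []

        while len(visited) < n and min_heap:
            cost, curr, parent = heapq.heappop(min_heap)
            if curr in visited:
                continue  # stale entry: this point joined the tree earlier
            visited.add(curr)
            total_cost += cost
            if parent != -1:
                edges.append((parent, curr))
            for nxt in range(n):
                if nxt not in visited:
                    dist = (abs(points[curr][0] - points[nxt][0]) +
                            abs(points[curr][1] - points[nxt][1]))
                    heapq.heappush(min_heap, (dist, nxt, curr))

        return total_cost, edges


    cost, edges = prim_with_edges([[0, 0], [2, 2], [3, 10], [5, 2], [7, 0]])
    print(cost, len(edges))  # total cost 20 using n - 1 = 4 edges
    ```

    For the first example the selected edge costs are 4, 3, 4 and 9, which sum to the expected answer of 20.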

  common_pitfalls:
    - title: Using Adjacency List for Dense Graph
      description: |
        A common instinct is to precompute all edges and store them in an adjacency list. With `n` points, this creates `n(n-1)/2` edges, using O(n^2) space.

        For this problem with `n <= 1000`, that's about 500,000 edges, which is manageable. However, Prim's algorithm can compute edge weights on the fly, avoiding the upfront edge list and the sort while achieving the same time complexity. Note that a lazy heap can still accumulate O(n^2) stale entries in the worst case; only the array-based variant of Prim's keeps memory at O(n).
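
        How large the heap actually gets can be checked empirically. The sketch below instruments a lazy heap-based Prim's loop and records the largest heap length it observes; `peak_heap_size` is a hypothetical helper written for this illustration.

        ```python
        import heapq


        def peak_heap_size(points):
            """Run lazy Prim's and return the largest heap length observed."""
            n = len(points)
            visited = set()
            min_heap = [(0, 0)]
            peak = 1

            while len(visited) < n and min_heap:
                _, curr = heapq.heappop(min_heap)
                if curr in visited:
                    continue
                visited.add(curr)
                for nxt in range(n):
                    if nxt not in visited:
                        dist = (abs(points[curr][0] - points[nxt][0]) +
                                abs(points[curr][1] - points[nxt][1]))
                        heapq.heappush(min_heap, (dist, nxt))
                peak = max(peak, len(min_heap))

            return peak


        # 50 collinear points: stale entries pile up faster than they are popped
        line = [[i, 0] for i in range(50)]
        print(peak_heap_size(line) > len(line))  # True
        ```

        On-demand distance computation saves the upfront edge list, but the heap itself is not bounded by `n` in this lazy formulation.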
      wrong_approach: "Precompute and store all O(n^2) edges"
      correct_approach: "Compute manhattan distance on-demand during Prim's traversal"

    - title: Forgetting to Check Visited Before Processing
      description: |
        When popping from the heap, the same point might appear multiple times with different costs (we pushed it once for each neighbor that discovered it). Always check if a point is already in the MST before processing.

        Processing a visited point would add duplicate edges and inflate the total cost.
      wrong_approach: "Process every heap entry without checking visited"
      correct_approach: "Skip heap entries for already-visited points"

    - title: Off-by-One in Edge Count
      description: |
        An MST connecting `n` nodes has exactly `n - 1` edges. Some implementations track edge count to know when to stop. If you're counting edges, ensure you stop at `n - 1`, not `n`.

        Using a visited set with size check `len(visited) == n` avoids this issue entirely.
      wrong_approach: "Stop when edge_count == n"
      correct_approach: "Stop when len(visited) == n or edge_count == n - 1"

  key_takeaways:
    - "**Minimum Spanning Tree**: When connecting nodes with minimum cost and no cycles, think MST algorithms (Prim's or Kruskal's)"
    - "**Dense vs Sparse graphs**: Prim's with a binary heap is O(E log V), which becomes O(V^2 log V) when E approaches V^2; for dense graphs the array-based Prim's at O(V^2) is asymptotically better"
    - "**On-demand computation**: For fully connected graphs, compute edge weights as needed rather than storing them all"
    - "**Heap for greedy selection**: Min-heaps efficiently find the next best edge in O(log n) time"

  time_complexity: "O(n^2 log n). We potentially push O(n^2) edges to the heap, and each heap operation is O(log n). Alternatively, O(n^2) using Prim's with an array instead of a heap."
  space_complexity: "O(n^2) in the worst case for the heap-based solution. The visited set is O(n), but each newly added point pushes an edge to every unvisited point while only one entry is popped per iteration, so stale entries can grow the heap to O(n^2). Prim's with an array instead of a heap needs only O(n)."

solutions:
  - approach_name: Prim's Algorithm with Min-Heap
    is_optimal: true
    code: |
      import heapq

      def min_cost_connect_points(points: list[list[int]]) -> int:
          n = len(points)
          if n <= 1:
              return 0

          # Track which points are in our MST
          visited = set()
          # Min-heap: (cost, point_index)
          # Start from point 0 with cost 0
          min_heap = [(0, 0)]
          total_cost = 0

          while len(visited) < n:
              # Get the cheapest edge to an unvisited point
              cost, curr = heapq.heappop(min_heap)

              # Skip if already in MST (found a cheaper path earlier)
              if curr in visited:
                  continue

              # Add this point to MST
              visited.add(curr)
              total_cost += cost

              # Explore edges to all unvisited points
              for next_point in range(n):
                  if next_point not in visited:
                      # Calculate manhattan distance
                      dist = (abs(points[curr][0] - points[next_point][0]) +
                              abs(points[curr][1] - points[next_point][1]))
                      heapq.heappush(min_heap, (dist, next_point))

          return total_cost
    explanation: |
      **Time Complexity:** O(n^2 log n). We push up to O(n^2) edges to the heap, and each operation is O(log n).

      **Space Complexity:** O(n^2) in the worst case. The visited set is O(n), but the heap keeps stale entries for points that were later reached by a cheaper edge, so it can hold O(n^2) entries.

      This is the standard lazy-deletion implementation of Prim's algorithm. We greedily select the minimum cost edge that adds a new point to our growing MST, continuing until all points are connected.
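
      For comparison, here is a minimal sketch of the array-based variant of Prim's algorithm, which replaces the heap with a per-point "cheapest known edge" array. It is offered as an illustration under the assumption that O(n^2) time and O(n) extra space are the goal; the function name `min_cost_connect_points_dense` is hypothetical.

      ```python
      def min_cost_connect_points_dense(points: list[list[int]]) -> int:
          n = len(points)
          if n <= 1:
              return 0

          # min_dist[i]: cheapest known edge from the current tree to point i
          min_dist = [float("inf")] * n
          in_tree = [False] * n
          min_dist[0] = 0
          total_cost = 0

          for _ in range(n):
              # Linear scan for the closest point not yet in the tree
              curr = -1
              for i in range(n):
                  if not in_tree[i] and (curr == -1 or min_dist[i] < min_dist[curr]):
                      curr = i

              in_tree[curr] = True
              total_cost += min_dist[curr]

              # Relax the edge from curr to every point still outside the tree
              cx, cy = points[curr]
              for j in range(n):
                  if not in_tree[j]:
                      dist = abs(cx - points[j][0]) + abs(cy - points[j][1])
                      if dist < min_dist[j]:
                          min_dist[j] = dist

          return total_cost


      print(min_cost_connect_points_dense([[0, 0], [2, 2], [3, 10], [5, 2], [7, 0]]))  # 20
      ```

      Each of the `n` rounds does two linear scans, so there are no heap operations and no stale entries to store.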

  - approach_name: Kruskal's Algorithm with Union-Find
    is_optimal: false
    code: |
      class UnionFind:
          def __init__(self, n: int):
              self.parent = list(range(n))
              self.rank = [0] * n

          def find(self, x: int) -> int:
              # Path compression
              if self.parent[x] != x:
                  self.parent[x] = self.find(self.parent[x])
              return self.parent[x]

          def union(self, x: int, y: int) -> bool:
              # Union by rank, returns True if merged
              px, py = self.find(x), self.find(y)
              if px == py:
                  return False
              if self.rank[px] < self.rank[py]:
                  px, py = py, px
              self.parent[py] = px
              if self.rank[px] == self.rank[py]:
                  self.rank[px] += 1
              return True

      def min_cost_connect_points(points: list[list[int]]) -> int:
          n = len(points)
          if n <= 1:
              return 0

          # Generate all edges: (cost, point_i, point_j)
          edges = []
          for i in range(n):
              for j in range(i + 1, n):
                  dist = (abs(points[i][0] - points[j][0]) +
                          abs(points[i][1] - points[j][1]))
                  edges.append((dist, i, j))

          # Sort edges by cost
          edges.sort()

          # Build MST using Union-Find
          uf = UnionFind(n)
          total_cost = 0
          edges_used = 0

          for cost, u, v in edges:
              # Only add edge if it connects two components
              if uf.union(u, v):
                  total_cost += cost
                  edges_used += 1
                  # MST complete when we have n-1 edges
                  if edges_used == n - 1:
                      break

          return total_cost
    explanation: |
      **Time Complexity:** O(n^2 log n). Generating edges is O(n^2), sorting is O(n^2 log n), and Union-Find operations are nearly O(1) amortized.

      **Space Complexity:** O(n^2). We store all n(n-1)/2 edges before sorting.

      Kruskal's algorithm sorts all edges and greedily adds them if they don't create a cycle. Union-Find efficiently detects cycles. This approach uses more memory due to storing all edges but is conceptually simpler.