274 lines
13 KiB
YAML
title: Allocate Mailboxes
slug: allocate-mailboxes
difficulty: hard
leetcode_id: 1478
leetcode_url: https://leetcode.com/problems/allocate-mailboxes/
categories:
- arrays
- dynamic-programming
- sorting
- math
patterns:
- slug: dynamic-programming
is_optimal: true

function_signature: "def min_distance(houses: list[int], k: int) -> int:"

test_cases:
visible:
- input: { houses: [1, 4, 8, 10, 20], k: 3 }
expected: 5
- input: { houses: [2, 3, 5, 12, 18], k: 2 }
expected: 9
hidden:
- input: { houses: [1, 2, 3], k: 1 }
expected: 2
- input: { houses: [1, 2, 3], k: 3 }
expected: 0
- input: { houses: [7, 4, 6, 1], k: 1 }
expected: 8
- input: { houses: [3, 6, 14, 10], k: 4 }
expected: 0
- input: { houses: [1, 10, 100], k: 2 }
expected: 9

description: |
Given the array `houses` where `houses[i]` is the location of the i<sup>th</sup> house along a street and an integer `k`, allocate `k` mailboxes in the street.

Return *the **minimum** total distance between each house and its nearest mailbox*.

The test cases are generated so that the answer fits in a 32-bit integer.

constraints: |
- `1 <= k <= houses.length <= 100`
- `1 <= houses[i] <= 10^4`
- All the integers of `houses` are **unique**

examples:
- input: "houses = [1,4,8,10,20], k = 3"
output: "5"
explanation: "Allocate mailboxes in position 3, 9 and 20. Minimum total distance from each house to nearest mailbox is |3-1| + |4-3| + |9-8| + |10-9| + |20-20| = 5."
- input: "houses = [2,3,5,12,18], k = 2"
output: "9"
explanation: "Allocate mailboxes in position 3 and 14. Minimum total distance from each house to nearest mailbox is |2-3| + |3-3| + |5-3| + |12-14| + |18-14| = 9."

explanation:
intuition: |
Imagine you're a postal service manager trying to place mailboxes along a street to minimise the total walking distance for all residents. Each house will use the nearest mailbox, so you need to strategically partition houses into groups and place one mailbox optimally for each group.

The **core insight** is recognising two key mathematical facts:

**Fact 1: Optimal placement for one mailbox serving multiple houses is at the median.**

If you have a single mailbox serving houses at positions `[2, 5, 8]`, where should you place it? The answer is the **median** position (5 in this case). The median minimises the sum of absolute deviations, a well-known result from statistics. Placing it at position 5 gives total distance `|2-5| + |5-5| + |8-5| = 3 + 0 + 3 = 6`, which is optimal. With an even number of houses, any position between the two middle houses gives the same total, so either middle house works.

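As a quick sanity check of Fact 1, the sketch below tries every integer position between the outermost houses and reports the cheapest one. The helper name `total_distance` is made up for this illustration and is not part of the reference solutions.

```python
def total_distance(houses: list[int], mailbox: int) -> int:
    """Sum of distances from every house to a single mailbox position."""
    return sum(abs(h - mailbox) for h in houses)


houses = [2, 5, 8]

# Try every integer position between the leftmost and rightmost house.
totals = {
    pos: total_distance(houses, pos)
    for pos in range(min(houses), max(houses) + 1)
}
best_pos = min(totals, key=totals.get)

print(best_pos, totals[best_pos])  # 5 6
```

The cheapest position is the median house, matching the hand calculation above.
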
**Fact 2: Houses served by the same mailbox must be contiguous (after sorting).**

Think about it: each house walks to its nearest mailbox, and the set of points closest to a given mailbox is a single unbroken stretch of the street. If houses A and C both use mailbox M, every house B between them lies in that same stretch, so M is a nearest mailbox for B as well. So we can **sort the houses first** and then partition them into `k` contiguous groups.

With these insights, the problem transforms into: *"Partition `n` sorted houses into `k` contiguous groups to minimise the total cost, where the cost of a group is the sum of distances to the median."*

This is a classic **interval DP** problem where we try all ways to split houses into groups.

approach: |
We solve this using **Dynamic Programming with Precomputed Costs**:

**Step 1: Sort the houses**

- Sorting ensures that houses served by the same mailbox are contiguous
- This is crucial for the DP to work correctly

**Step 2: Precompute the cost matrix**

- `cost[i][j]`: The minimum total distance when one mailbox serves houses from index `i` to index `j`
- For each pair `(i, j)`, the optimal mailbox position is at the median house
- Calculate the sum of distances from all houses in `[i, j]` to the median

**Step 3: Define the DP state**

- `dp[i][m]`: The minimum total distance to serve houses `0` to `i-1` using exactly `m` mailboxes
- Base case: `dp[0][0] = 0` (no houses, no mailboxes, zero cost)
- Goal: `dp[n][k]` where `n` is the number of houses

**Step 4: Fill the DP table**

- For each number of houses `i` from `1` to `n`:
  - For each number of mailboxes `m` from `1` to `min(i, k)`:
    - Try all ways to assign the last group: houses `j` to `i-1` served by mailbox `m`
    - `dp[i][m] = min(dp[j][m-1] + cost[j][i-1])` over all `j` from `m-1` to `i-1`, so that the first `j` houses can still fill `m-1` non-empty groups

**Step 5: Return the result**

- Return `dp[n][k]`, the minimum cost to serve all `n` houses with `k` mailboxes

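The five steps return only the total distance. If the actual grouping is wanted, one option is to remember which split point `j` produced each `dp[i][m]` and walk those choices backwards. The sketch below assumes that extension; the function name `allocate` and the `split` table are hypothetical additions, not part of the reference solutions.

```python
def allocate(houses: list[int], k: int) -> tuple[int, list[list[int]]]:
    """Return (minimum total distance, groups of houses sharing a mailbox)."""
    houses = sorted(houses)
    n = len(houses)

    # cost[i][j]: one mailbox at the median of houses[i..j] (inclusive)
    cost = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            cost[i][j] = cost[i][j - 1] + houses[j] - houses[(i + j) // 2]

    INF = float("inf")
    dp = [[INF] * (k + 1) for _ in range(n + 1)]
    # split[i][m]: start index of the last group in the best answer (hypothetical helper)
    split = [[-1] * (k + 1) for _ in range(n + 1)]
    dp[0][0] = 0
    for i in range(1, n + 1):
        for m in range(1, min(i, k) + 1):
            for j in range(m - 1, i):
                candidate = dp[j][m - 1] + cost[j][i - 1]
                if candidate < dp[i][m]:
                    dp[i][m] = candidate
                    split[i][m] = j

    # Walk the split table backwards to recover the groups.
    groups = []
    i, m = n, k
    while m > 0:
        j = split[i][m]
        groups.append(houses[j:i])
        i, m = j, m - 1
    groups.reverse()
    return dp[n][k], groups


total, groups = allocate([1, 4, 8, 10, 20], 3)
print(total, groups)  # 5 [[1, 4], [8, 10], [20]]
```

Each recovered group gets its mailbox at the group's median house (or anywhere between the two middle houses for an even-sized group).
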
common_pitfalls:
- title: Forgetting to Sort
description: |
The houses are not necessarily given in sorted order. Without sorting, the assumption that each mailbox serves a contiguous segment breaks down.

For example, with `houses = [10, 1, 5]` and `k = 2`, if we don't sort, we might incorrectly partition as `[10, 1]` and `[5]`, but after sorting it becomes `[1, 5, 10]` where valid partitions are `[1]` and `[5, 10]` or `[1, 5]` and `[10]`.
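
To see how much the missing sort can cost, the sketch below runs the same partition DP on an input as given and on its sorted copy. The helper `segment_dp` and the input `[1, 10, 2, 11]` are chosen only for this illustration.

```python
def segment_dp(order: list[int], k: int) -> int:
    """Partition DP over the list exactly as given (no sorting inside)."""
    n = len(order)

    def cost(i: int, j: int) -> int:
        # Treats order[(i + j) // 2] as the median, which only holds if sorted.
        mid = order[(i + j) // 2]
        return sum(abs(order[h] - mid) for h in range(i, j + 1))

    INF = float("inf")
    dp = [[INF] * (k + 1) for _ in range(n + 1)]
    dp[0][0] = 0
    for i in range(1, n + 1):
        for m in range(1, min(i, k) + 1):
            for j in range(m - 1, i):
                dp[i][m] = min(dp[i][m], dp[j][m - 1] + cost(j, i - 1))
    return dp[n][k]


houses = [1, 10, 2, 11]
unsorted_answer = segment_dp(houses, 2)
sorted_answer = segment_dp(sorted(houses), 2)
print(unsorted_answer, sorted_answer)  # 17 2
```

Without sorting, the DP can only group neighbours in input order, so it never considers `[1, 2]` and `[10, 11]`.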
wrong_approach: "Process houses in given order"
correct_approach: "Sort houses first, then apply DP"

- title: Using Mean Instead of Median
description: |
A common mathematical error is placing the mailbox at the **mean** (average) position instead of the **median**.

The mean minimises the sum of *squared* distances, but we need to minimise the sum of *absolute* distances. For `houses = [1, 2, 10]`:
- Mean = 4.33: Total distance = `|1-4.33| + |2-4.33| + |10-4.33|` ≈ 11.33
- Median = 2: Total distance = `|1-2| + |2-2| + |10-2|` = 9

Always use the median for minimising absolute deviations.
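
The same comparison can be reproduced with Python's standard `statistics` module. This is only a small check of the arithmetic above; the helper name `total_distance` is an assumption for the example.

```python
from statistics import mean, median

houses = [1, 2, 10]


def total_distance(position: float) -> float:
    """Sum of distances from every house to one mailbox position."""
    return sum(abs(h - position) for h in houses)


at_mean = total_distance(mean(houses))
at_median = total_distance(median(houses))
print(round(at_mean, 2), at_median)  # 11.33 9
```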
wrong_approach: "Place mailbox at average position"
correct_approach: "Place mailbox at median position"

- title: Inefficient Cost Calculation
description: |
Recalculating the cost for each interval `[i, j]` inside the DP loops adds an O(n) factor to every transition, raising the DP from O(n^2 * k) to O(n^3 * k).

**Precompute all costs** in a matrix first. Summing distances directly takes O(j - i) time per interval, so O(n^3) across all O(n^2) intervals; extending each interval one house at a time brings this down to O(n^2) total. Then DP lookups are O(1).
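
A minimal sketch of the two precomputation strategies, assuming the houses are already sorted. The function names `cost_naive` and `cost_incremental` are illustrative; the incremental recurrence is the one used in the "Optimised Cost Calculation" solution.

```python
def cost_naive(houses: list[int]) -> list[list[int]]:
    """O(n^3): sum distances to the median for every interval."""
    n = len(houses)
    cost = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            med = houses[(i + j) // 2]
            cost[i][j] = sum(abs(houses[h] - med) for h in range(i, j + 1))
    return cost


def cost_incremental(houses: list[int]) -> list[list[int]]:
    """O(n^2): extend each interval by one house on the right."""
    n = len(houses)
    cost = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            cost[i][j] = cost[i][j - 1] + houses[j] - houses[(i + j) // 2]
    return cost


houses = sorted([2, 3, 5, 12, 18])
fast = cost_incremental(houses)
assert cost_naive(houses) == fast
print(fast[0][2], fast[3][4])  # 3 6
```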
wrong_approach: "Calculate interval cost inside DP loops"
correct_approach: "Precompute cost[i][j] matrix before DP"

- title: Off-by-One Errors in DP Indices
description: |
The DP has multiple indices (`i` for houses, `m` for mailboxes, `j` for partition points). It's easy to confuse 0-indexed vs 1-indexed or inclusive vs exclusive bounds.

Be consistent: if `dp[i][m]` represents the first `i` houses with `m` mailboxes, then `dp[0][0] = 0` is the base case, and `cost[j][i-1]` covers houses from index `j` to `i-1` inclusive.

key_takeaways:
- "**Median minimises absolute distance**: When placing one point to minimise sum of absolute distances to multiple points, use the median"
- "**Sorting enables contiguity**: After sorting, optimal groups are always contiguous; this transforms the problem into interval DP"
- "**Precomputation optimisation**: Compute all `cost[i][j]` values upfront to avoid redundant calculations in DP"
- "**Interval DP pattern**: Problems asking to partition an array into `k` groups with a cost function often use this `dp[i][m]` formulation"

time_complexity: "O(n^2 * k). The DP table has O(n * k) states, and each state considers O(n) possible partitions. Precomputing the costs adds O(n^3) with direct summation or O(n^2) with the incremental recurrence."
space_complexity: "O(n^2 + n * k). We use O(n^2) for the precomputed cost matrix and O(n * k) for the DP table."

solutions:
- approach_name: Dynamic Programming with Precomputed Costs
is_optimal: true
code: |
def min_distance(houses: list[int], k: int) -> int:
    # Sort houses so each mailbox serves a contiguous segment
    houses.sort()
    n = len(houses)

    # Precompute cost[i][j]: min distance for one mailbox serving houses[i:j+1]
    # Optimal position is at the median house
    cost = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Median is at index (i + j) // 2
            median = houses[(i + j) // 2]
            # Sum distances from all houses in range to the median
            for h in range(i, j + 1):
                cost[i][j] += abs(houses[h] - median)

    # dp[i][m] = min cost to serve first i houses with m mailboxes
    # Initialize with infinity
    INF = float('inf')
    dp = [[INF] * (k + 1) for _ in range(n + 1)]
    dp[0][0] = 0  # Base case: 0 houses, 0 mailboxes, 0 cost

    # Fill DP table
    for i in range(1, n + 1):  # Number of houses to serve
        for m in range(1, min(i, k) + 1):  # Number of mailboxes used
            # Try all ways to assign last group
            # Houses j to i-1 (0-indexed) served by mailbox m
            for j in range(m - 1, i):
                dp[i][m] = min(dp[i][m], dp[j][m - 1] + cost[j][i - 1])

    return dp[n][k]
explanation: |
**Time Complexity:** O(n^3 + n^2 * k): O(n^3) to precompute costs with the triple loop, O(n^2 * k) for the DP.

**Space Complexity:** O(n^2 + n * k): cost matrix plus DP table.

We first sort the houses, then precompute the cost of serving any contiguous segment with one optimally-placed mailbox. The DP finds the optimal way to partition houses into `k` groups, minimising total cost.

- approach_name: Optimised Cost Calculation
is_optimal: true
code: |
def min_distance(houses: list[int], k: int) -> int:
    houses.sort()
    n = len(houses)

    # Optimised cost calculation using the recurrence:
    # cost[i][j] = cost[i][j-1] + houses[j] - houses[(i+j)//2]
    cost = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Extending [i, j-1] by houses[j] adds the distance from the
            # new house to the median of [i, j]. When the median index
            # shifts, the old cost is unchanged, because any point between
            # the two middle houses of an even-length range gives the
            # same total.
            cost[i][j] = cost[i][j - 1] + houses[j] - houses[(i + j) // 2]

    # DP with space optimization: only need previous row
    INF = float('inf')
    dp = [INF] * (n + 1)
    dp[0] = 0

    for m in range(1, k + 1):
        # Build the row for m mailboxes from the row for m - 1
        new_dp = [INF] * (n + 1)
        for i in range(m, n + 1):
            for j in range(m - 1, i):
                new_dp[i] = min(new_dp[i], dp[j] + cost[j][i - 1])
        dp = new_dp

    return dp[n]
explanation: |
**Time Complexity:** O(n^2 * k): the DP dominates, and the cost matrix is built in O(n^2).

**Space Complexity:** O(n^2): cost matrix dominates; DP uses O(n) with space optimisation.

This version uses the recurrence relation for cost calculation: when extending a range by one house, the new cost equals the old cost plus the distance from the new house to the (possibly shifted) median. The DP is space-optimised to use only O(n) for the current and previous rows.

- approach_name: Brute Force (Exponential)
is_optimal: false
code: |
def min_distance(houses: list[int], k: int) -> int:
    houses.sort()
    n = len(houses)

    def cost(i: int, j: int) -> int:
        """Cost for one mailbox serving houses[i:j+1]"""
        median = houses[(i + j) // 2]
        return sum(abs(houses[h] - median) for h in range(i, j + 1))

    def solve(start: int, remaining: int) -> int:
        """Min cost to serve houses[start:] with remaining mailboxes"""
        # Base case: no more houses
        if start == n:
            return 0 if remaining == 0 else float('inf')
        # Base case: no more mailboxes but houses remain
        if remaining == 0:
            return float('inf')

        min_cost = float('inf')
        # Try assigning houses[start:end+1] to one mailbox
        for end in range(start, n - remaining + 1):
            current = cost(start, end) + solve(end + 1, remaining - 1)
            min_cost = min(min_cost, current)

        return min_cost

    return solve(0, k)
explanation: |
**Time Complexity:** O(n^k * n) as a loose upper bound: exponential due to trying all partitions without memoisation.

**Space Complexity:** O(k) for the recursion depth.

This brute force approach tries all ways to partition houses into `k` groups. Without memoisation, it explores many overlapping subproblems redundantly. Included to illustrate the recursive structure before optimisation. Adding memoisation on `(start, remaining)`, together with cached segment costs, brings the recursion down to the same O(n^2 * k) transitions as the DP solution.
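
One way to add that memoisation is `functools.lru_cache` from the standard library. The sketch below is an assumed variant of the brute force, not one of the listed solutions; the name `min_distance_memo` is made up for the example, and the cached `cost` still sums each segment directly.

```python
from functools import lru_cache


def min_distance_memo(houses: list[int], k: int) -> int:
    houses = sorted(houses)
    n = len(houses)

    @lru_cache(maxsize=None)
    def cost(i: int, j: int) -> int:
        """Cost for one mailbox serving houses[i..j], computed once per pair."""
        median = houses[(i + j) // 2]
        return sum(abs(houses[h] - median) for h in range(i, j + 1))

    @lru_cache(maxsize=None)
    def solve(start: int, remaining: int) -> float:
        """Min cost to serve houses[start:] with `remaining` mailboxes."""
        if start == n:
            return 0 if remaining == 0 else float("inf")
        if remaining == 0:
            return float("inf")
        # Leave at least one house for each of the other mailboxes.
        return min(
            cost(start, end) + solve(end + 1, remaining - 1)
            for end in range(start, n - remaining + 1)
        )

    return solve(0, k)


print(min_distance_memo([1, 4, 8, 10, 20], 3))  # 5
print(min_distance_memo([2, 3, 5, 12, 18], 2))  # 9
```

Each `(start, remaining)` pair is now solved once, which is the top-down counterpart of filling the `dp` table.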