title: Median of Two Sorted Arrays
slug: median-of-two-sorted-arrays
difficulty: hard
leetcode_id: 4
leetcode_url: https://leetcode.com/problems/median-of-two-sorted-arrays/
categories:
  - arrays
  - binary-search
patterns:
  - binary-search

function_signature: "def find_median_sorted_arrays(nums1: list[int], nums2: list[int]) -> float:"

test_cases:
  visible:
    - input: { nums1: [1, 3], nums2: [2] }
      expected: 2.0
    - input: { nums1: [1, 2], nums2: [3, 4] }
      expected: 2.5
  hidden:
    - input: { nums1: [], nums2: [1] }
      expected: 1.0
    - input: { nums1: [2], nums2: [] }
      expected: 2.0
    - input: { nums1: [1, 2, 3], nums2: [4, 5, 6] }
      expected: 3.5
    - input: { nums1: [1, 3], nums2: [2, 4] }
      expected: 2.5

description: |
  Given two sorted arrays `nums1` and `nums2` of size `m` and `n` respectively, return **the median** of the two sorted arrays.

  The overall run time complexity should be **O(log(m+n))**.

constraints: |
  - `nums1.length == m`
  - `nums2.length == n`
  - `0 <= m <= 1000`
  - `0 <= n <= 1000`
  - `1 <= m + n <= 2000`
  - `-10^6 <= nums1[i], nums2[i] <= 10^6`

examples:
  - input: "nums1 = [1,3], nums2 = [2]"
    output: "2.0"
    explanation: "Merged array is [1,2,3]. Median is 2."
  - input: "nums1 = [1,2], nums2 = [3,4]"
    output: "2.5"
    explanation: "Merged array is [1,2,3,4]. Median is (2+3)/2 = 2.5."

explanation:
  intuition: |
    The median divides a sorted array into two halves of equal size. For two sorted arrays, we need to find a **partition** that puts half of the total elements on the left (one extra when the total is odd) and the rest on the right.

    Think of it like this: imagine cutting both arrays with vertical lines. If we take `i` elements from `nums1` and `j` elements from `nums2` for the "left half", we need `i + j = (m + n + 1) // 2`. For this partition to be valid:
    - Everything in the left half ≤ everything in the right half

    The key insight: once we choose `i` (how many from `nums1`), `j` is determined. So we **binary search on `i`**!

    Because each array is already sorted, a partition is valid as soon as the two cross-array comparisons hold:
    - `nums1[i-1] <= nums2[j]` (left of nums1 ≤ right of nums2)
    - `nums2[j-1] <= nums1[i]` (left of nums2 ≤ right of nums1)

    If not valid, adjust `i`: if `nums1[i-1] > nums2[j]`, we took too many from nums1, so decrease `i`. Otherwise `nums2[j-1] > nums1[i]`, meaning we took too few, so increase `i`.
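
    The validity conditions can be sketched as a small check. This is only an illustration: `is_valid_partition` is a hypothetical helper name, not part of the reference solution, and it assumes both inputs are already sorted.

    ```python
    import math

    def is_valid_partition(nums1: list[int], nums2: list[int], i: int) -> bool:
        """Hypothetical helper: does taking i elements from nums1 give a valid split?"""
        m, n = len(nums1), len(nums2)
        j = (m + n + 1) // 2 - i  # j is forced once i is chosen
        if not 0 <= j <= n:
            return False
        # Missing neighbours are treated as -inf / +inf so comparisons still work
        left1 = nums1[i - 1] if i > 0 else -math.inf
        right1 = nums1[i] if i < m else math.inf
        left2 = nums2[j - 1] if j > 0 else -math.inf
        right2 = nums2[j] if j < n else math.inf
        return left1 <= right2 and left2 <= right1

    # For nums1 = [1, 3] and nums2 = [2, 4], only i = 1 passes the check
    print([i for i in range(3) if is_valid_partition([1, 3], [2, 4], i)])  # [1]
    ```

    Trying every `i` like this is linear in `m`; the binary search reaches the same `i` in logarithmic time.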

  approach: |
    We solve this using **Binary Search on Partition**:

    **Step 1: Ensure nums1 is the smaller array**

    - If `m > n`, swap the arrays
    - This guarantees `0 <= j <= n` for every `i` in `[0, m]` and keeps the search range as small as possible

    **Step 2: Binary search for the correct partition**

    - Search for `i` in range `[0, m]` (elements taken from nums1)
    - Calculate `j = half_len - i` where `half_len = (m + n + 1) // 2`
    - For each `i`, check if the partition is valid

    **Step 3: Handle boundary cases with infinity**

    - If `i = 0`, there's no left element in nums1 → use `-infinity`
    - If `i = m`, there's no right element in nums1 → use `+infinity`
    - Same for `j = 0` and `j = n` in nums2

    **Step 4: Compute the median**

    - If the partition is valid:
      - **Odd total**: median = `max(left1, left2)`
      - **Even total**: median = `(max(left1, left2) + min(right1, right2)) / 2`
    - If not valid, adjust the binary search bounds

    The median is formed by the boundary elements at the valid partition.
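
    Steps 3 and 4 can be sketched in isolation as follows. The names `boundary_values` and `median_from_cut` are hypothetical, chosen for illustration, and the sketch assumes the cut `(i, j)` it receives has already been found to be valid.

    ```python
    import math

    def boundary_values(nums: list[int], k: int) -> tuple[float, float]:
        """Hypothetical helper: values just left and right of a cut after k elements."""
        left = nums[k - 1] if k > 0 else -math.inf
        right = nums[k] if k < len(nums) else math.inf
        return left, right

    def median_from_cut(nums1: list[int], nums2: list[int], i: int, j: int) -> float:
        """Median for a cut (i, j) that is assumed to be valid already."""
        left1, right1 = boundary_values(nums1, i)
        left2, right2 = boundary_values(nums2, j)
        if (len(nums1) + len(nums2)) % 2 == 1:
            return float(max(left1, left2))  # odd total: largest value on the left
        return (max(left1, left2) + min(right1, right2)) / 2  # even total

    print(median_from_cut([1, 3], [2], 1, 1))     # 2.0
    print(median_from_cut([1, 2], [3, 4], 2, 0))  # 2.5
    ```

    In the second call `j = 0`, so nums2 contributes `-infinity` on the left and the maximum falls back to `nums1[1] = 2`.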

  common_pitfalls:
    - title: Not Handling Boundary Cases
      description: |
        When `i = 0` or `i = m`, there's no left or right element in nums1. Accessing `nums1[i]` when `i = m` is out of bounds, and in Python `nums1[i-1]` with `i = 0` does not raise at all: it silently reads the last element and corrupts the comparison.

        Use `float('-inf')` for missing left elements and `float('inf')` for missing right elements. This ensures comparisons always work correctly.
      wrong_approach: "Accessing nums1[i-1] when i = 0"
      correct_approach: "nums1_left = float('-inf') if i == 0 else nums1[i-1]"

    - title: Binary Searching on the Longer Array
      description: |
        Always search on the shorter array. If `m > n` and we search on nums1, `j = half_len - i` might become negative or exceed `n` (both invalid).

        Swapping ensures `j` is always valid: `0 <= j <= n`.
      wrong_approach: "Binary searching on the longer array"
      correct_approach: "if m > n: swap arrays, then binary search on the shorter one"

    - title: Odd vs Even Total Length
      description: |
        For **odd** total `(m + n)`: the median is a single value, `max(left1, left2)`.
        For **even** total: the median is the average of the two middle values, `max(left1, left2)` and `min(right1, right2)`.

        Treating both cases the same way gives wrong answers for every input of the other parity.
      wrong_approach: "Always averaging two values"
      correct_approach: "Check (m + n) % 2 and handle odd/even separately"
key_takeaways:
|
|
- "**Binary search on partition, not values**: Search for how many elements to take from nums1"
|
|
- "**Partition both arrays to split total elements in half**: Once we choose `i`, `j` is determined"
|
|
- "**Handle boundaries with infinity**: Prevents index errors at array edges"
|
|
- "**O(log min(m,n))**: Binary search on the smaller array is sufficient"
|
|
|
|
time_complexity: "O(log min(m, n)). Binary search on the smaller array."
|
|
space_complexity: "O(1). Only constant extra variables for pointers and boundary values."
|
|
|
|
solutions:
  - approach_name: Binary Search on Partition
    is_optimal: true
    code: |
      def find_median_sorted_arrays(nums1: list[int], nums2: list[int]) -> float:
          # Ensure nums1 is the smaller array for valid j values
          if len(nums1) > len(nums2):
              nums1, nums2 = nums2, nums1

          m, n = len(nums1), len(nums2)
          half_len = (m + n + 1) // 2  # Size of left half (ceiling for odd total)

          left, right = 0, m  # Binary search bounds for i

          while left <= right:
              i = (left + right) // 2  # Elements from nums1 in left half
              j = half_len - i  # Elements from nums2 in left half

              # Handle boundary cases with infinity
              nums1_left = float('-inf') if i == 0 else nums1[i - 1]
              nums1_right = float('inf') if i == m else nums1[i]
              nums2_left = float('-inf') if j == 0 else nums2[j - 1]
              nums2_right = float('inf') if j == n else nums2[j]

              # Check if partition is valid
              if nums1_left <= nums2_right and nums2_left <= nums1_right:
                  # Valid partition found: compute median
                  if (m + n) % 2 == 1:
                      # Odd total: median is max of left half (cast to match the float return type)
                      return float(max(nums1_left, nums2_left))
                  else:
                      # Even total: median is average of middle two
                      return (max(nums1_left, nums2_left) +
                              min(nums1_right, nums2_right)) / 2

              elif nums1_left > nums2_right:
                  # Too many from nums1, decrease i
                  right = i - 1
              else:
                  # Too few from nums1, increase i
                  left = i + 1

          return 0.0  # Should never reach here with valid input
    explanation: |
      **Time Complexity:** O(log min(m, n)). Binary search on the smaller array.

      **Space Complexity:** O(1). Only constant extra variables.

      We binary search for the correct partition point in the smaller array. A valid partition has all left elements ≤ all right elements. Once found, the median is computed from the four boundary elements: max of the left side for odd totals, average of max-left and min-right for even totals.
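
      One way to sanity-check an implementation is to compare it against a slower reference that combines the two inputs and takes the median directly. The sketch below is such a reference, not the optimal method: `median_by_merge` is a hypothetical name, it relies on the standard library's `statistics.median`, and it runs in O((m + n) log(m + n)) time.

      ```python
      import statistics

      def median_by_merge(nums1: list[int], nums2: list[int]) -> float:
          """Hypothetical reference: median of the combined list via the standard library."""
          return float(statistics.median(nums1 + nums2))

      # Expected values are the test cases listed in this file
      cases = [
          ([1, 3], [2], 2.0),
          ([1, 2], [3, 4], 2.5),
          ([], [1], 1.0),
          ([2], [], 2.0),
          ([1, 2, 3], [4, 5, 6], 3.5),
          ([1, 3], [2, 4], 2.5),
      ]
      for a, b, expected in cases:
          assert median_by_merge(a, b) == expected
      ```

      Running both functions over randomly generated sorted inputs and comparing results is a quick way to catch off-by-one errors in the partition logic.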