190 lines
8.7 KiB
YAML
title: Find K Closest Elements
slug: find-k-closest-elements
difficulty: medium
leetcode_id: 658
leetcode_url: https://leetcode.com/problems/find-k-closest-elements/
categories:
- arrays
- binary-search
- two-pointers
patterns:
- binary-search
- two-pointers

description: |
Given a **sorted** integer array `arr` and two integers `k` and `x`, return the `k` closest integers to `x` in the array. The result should also be sorted in ascending order.

An integer `a` is closer to `x` than an integer `b` if:

- `|a - x| < |b - x|`, or
- `|a - x| == |b - x|` and `a < b`

constraints: |
- `1 <= k <= arr.length`
- `1 <= arr.length <= 10^4`
- `arr` is sorted in **ascending** order
- `-10^4 <= arr[i], x <= 10^4`

examples:
- input: "arr = [1,2,3,4,5], k = 4, x = 3"
output: "[1,2,3,4]"
explanation: "Elements 2, 3, and 4 are within distance 1 of x = 3. Elements 1 and 5 tie at distance 2, so the tie-breaking rule picks the smaller value, 1, and 5 is left out."
- input: "arr = [1,1,2,3,4,5], k = 4, x = -1"
output: "[1,1,2,3]"
explanation: "Every element is greater than x = -1, so the closest elements are the k smallest values. Both 1s (distance 2) are included, followed by 2 and 3."

explanation:
intuition: |
Imagine you're standing at position `x` on a number line, and the sorted array represents points along that line. You need to find the `k` points closest to where you're standing.

The key insight is that the answer is always a **contiguous subarray** of length `k`. Why? Because the array is sorted! If the elements at indices `i` and `j` (with `j > i + 1`) are both in your answer, then every element between them lies between `arr[i]` and `arr[j]` on the number line, so it is at least as close to `x` as the farther of those two. Skipping it in favor of an element outside the range can never improve the answer.

Think of it like this: you're looking for a **sliding window** of size `k` that captures the `k` closest elements. The question becomes: where should this window start?

Instead of searching for elements, we can **binary search for the left boundary** of this window. For any starting position `mid`, we check whether the left edge `arr[mid]` or the element just past the right edge `arr[mid + k]` is further from `x`. This tells us whether to move the window left or right.

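As a rough illustration of that comparison, here is a minimal sketch for a single candidate start index. The helper name `should_shift_right` is hypothetical and exists only for this example; it assumes `start + k` is a valid index into `arr`.

```python
def should_shift_right(arr: list[int], k: int, x: int, start: int) -> bool:
    """Return True if the window arr[start:start + k] should move right.

    Hypothetical helper for illustration; assumes start + k < len(arr).
    """
    left_edge = arr[start]            # element the window would drop
    past_right_edge = arr[start + k]  # element the window would gain
    return x - left_edge > past_right_edge - x


arr = [1, 2, 3, 4, 5]
# Window [1, 2] could gain 3; with x = 4 the left edge 1 is further away.
print(should_shift_right(arr, k=2, x=4, start=0))  # True
# Window [3, 4] could gain 5; with x = 3 the left edge 3 is closer.
print(should_shift_right(arr, k=2, x=3, start=2))  # False
```

The binary search simply asks this question at each midpoint and discards the half of the search space that cannot contain the best start.
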
approach: |
We solve this using **Binary Search for Window Start**:

**Step 1: Define the search space**

- We're searching for the starting index of a window of size `k`
- The starting index can range from `0` to `len(arr) - k`
- Set `left = 0`, `right = len(arr) - k`

**Step 2: Binary search for optimal start position**

- While `left < right`:
  - Calculate `mid = left + (right - left) // 2`
  - Compare `x - arr[mid]` with `arr[mid + k] - x`
  - If `x - arr[mid] > arr[mid + k] - x`:
    - The left edge is further from `x` than the element just past the right edge
    - Move the window right: `left = mid + 1`
  - Else:
    - The left edge is closer (or equal), keep it as a candidate
    - `right = mid`

**Step 3: Return the window**

- Return `arr[left:left + k]`

Why compare `x - arr[mid]` with `arr[mid + k] - x` instead of using absolute values? Because the array is sorted, `arr[mid] <= arr[mid + k]`. When `x` lies between the two, both expressions are the actual distances. When `x` is left of `arr[mid]`, the left side is negative and the right side is positive, so the window stays or moves left. When `x` is right of `arr[mid + k]`, the left side is positive and the right side is negative, so the window moves right. In every case the comparison tells us which side should be excluded.

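To make the three steps concrete, the sketch below runs the same loop while recording each decision. The function name `trace_window_search` and the decision labels are hypothetical, chosen only for this illustration.

```python
def trace_window_search(arr: list[int], k: int, x: int):
    """Run the window binary search and record each step.

    Hypothetical helper for illustration: returns the final window and a
    list of (left, mid, right, decision) tuples, one per loop iteration.
    """
    left, right = 0, len(arr) - k
    steps = []
    while left < right:
        mid = left + (right - left) // 2
        if x - arr[mid] > arr[mid + k] - x:
            steps.append((left, mid, right, "move right"))
            left = mid + 1
        else:
            steps.append((left, mid, right, "keep or move left"))
            right = mid
    return arr[left:left + k], steps


window, steps = trace_window_search([1, 2, 3, 4, 5], k=4, x=3)
for step in steps:
    print(step)  # (0, 0, 1, 'keep or move left')
print(window)    # [1, 2, 3, 4]
```

With `k = 4` there are only two candidate starts, so a single comparison (distance 2 on both sides, a tie that keeps the left edge) settles the answer.
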
common_pitfalls:
- title: Sorting with Custom Key
description: |
A common first approach is to sort the array by distance to `x`:
```python
sorted(arr, key=lambda a: (abs(a - x), a))[:k]
```

This selects the right elements but takes **O(n log n)** time, and the slice comes back ordered by distance, so it still has to be sorted by value before returning. Since the array is already sorted, binary search does better at O(log(n - k) + k).
wrong_approach: "Sort by distance, take first k"
correct_approach: "Binary search for window start position"

- title: Using Absolute Values in Comparison
description: |
When comparing distances during binary search, using `abs(arr[mid] - x)` vs `abs(arr[mid + k] - x)` can lead to subtle bugs.

The comparison `x - arr[mid] > arr[mid + k] - x` works because:
- If `arr[mid]` and `arr[mid + k]` are on the same side of `x`, the signs of the two differences show which direction `x` lies in
- If they straddle `x`, both sides are the true distances

Absolute values throw that sign information away. The failure shows up with duplicates: when `arr[mid] == arr[mid + k]`, the two absolute distances are always equal, so the search cannot tell which side `x` is on and always moves left, even when `x` is to the right of the window.
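
One way to see the difference is to run both comparisons on an input with many duplicates. The sketch below uses a hypothetical helper, `window_start`, with a `use_abs` flag added purely for this comparison.

```python
def window_start(arr: list[int], k: int, x: int, use_abs: bool) -> int:
    """Binary search for the window start with either comparison.

    Hypothetical helper for illustration; use_abs=True reproduces the pitfall.
    """
    left, right = 0, len(arr) - k
    while left < right:
        mid = left + (right - left) // 2
        if use_abs:
            left_is_further = abs(arr[mid] - x) > abs(arr[mid + k] - x)
        else:
            left_is_further = x - arr[mid] > arr[mid + k] - x
        if left_is_further:
            left = mid + 1
        else:
            right = mid
    return left


arr, k, x = [1, 1, 2, 2, 2, 2, 2, 3, 3], 3, 3

start = window_start(arr, k, x, use_abs=False)
print(arr[start:start + k])  # [2, 3, 3]

start = window_start(arr, k, x, use_abs=True)
print(arr[start:start + k])  # [2, 2, 2], which misses both 3s
```

At the first midpoint both versions see `arr[3] == arr[6] == 2`. The signed comparison gives `1 > -1` and moves right; the `abs` comparison gives `1 > 1`, which is false, and discards the half that contains the answer.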
wrong_approach: "abs(arr[mid] - x) vs abs(arr[mid + k] - x)"
correct_approach: "x - arr[mid] vs arr[mid + k] - x"

- title: Wrong Search Space Bounds
description: |
The right bound must be `len(arr) - k`, not `len(arr) - 1`. We're searching for the *start* of a window of size `k`, so the maximum valid start index is `n - k`. With a larger bound, `arr[mid + k]` can index past the end of the array.

If `arr = [1,2,3,4,5]` and `k = 3`, valid start indices are 0, 1, 2 (giving windows [1,2,3], [2,3,4], [3,4,5]).
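
The bound can be checked with a few lines of arithmetic. This is only an illustrative sketch; the variable names `largest_mid` and `bad_mid` are made up for the example.

```python
arr, k = [1, 2, 3, 4, 5], 3
n = len(arr)

# Every valid start index and the window it produces.
for start in range(n - k + 1):
    print(start, arr[start:start + k])

# With right = n - k, the loop only runs while left < right, so the
# largest possible mid is n - k - 1 and arr[mid + k] stays in bounds.
largest_mid = (n - k) - 1
print(largest_mid + k == n - 1)  # True

# With right = n - 1 instead, the very first midpoint is already too far.
bad_mid = (0 + (n - 1)) // 2
print(bad_mid + k >= n)  # True, so arr[bad_mid + k] would raise IndexError
```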
wrong_approach: "right = len(arr) - 1"
correct_approach: "right = len(arr) - k"

key_takeaways:
- "**Contiguous subarray insight**: In a sorted array, the k closest elements form a contiguous window"
- "**Binary search for boundaries**: Instead of searching for elements, search for the optimal window position"
- "**Comparison without abs()**: The signed comparison `x - arr[mid] > arr[mid + k] - x` stays correct whether the two elements straddle `x` or sit on the same side of it"
- "**Foundation for window problems**: This technique extends to other problems about finding optimal subarrays in sorted data"

time_complexity: "O(log(n - k) + k). Binary search takes O(log(n - k)), and returning the slice takes O(k)."
space_complexity: "O(k). The returned list contains k elements. The binary search itself uses O(1) extra space."

solutions:
- approach_name: Binary Search for Window Start
is_optimal: true
code: |
def find_closest_elements(arr: list[int], k: int, x: int) -> list[int]:
    # Search for the starting index of the k-element window
    left, right = 0, len(arr) - k

    while left < right:
        mid = left + (right - left) // 2

        # Compare left edge distance vs element just past right edge
        if x - arr[mid] > arr[mid + k] - x:
            # Left edge is further, move window right
            left = mid + 1
        else:
            # Left edge is closer (or equal), keep as candidate
            right = mid

    # Return the k-element window starting at left
    return arr[left:left + k]
explanation: |
**Time Complexity:** O(log(n - k) + k). Binary search over n - k + 1 positions, plus slicing k elements.

**Space Complexity:** O(k). Output array of k elements.

We binary search for the optimal starting position of a window of size k. The comparison `x - arr[mid] > arr[mid + k] - x` determines if the left boundary or the element just past the right boundary is further from x. This guides us toward the optimal window.

- approach_name: Two Pointers (Shrinking Window)
is_optimal: false
code: |
def find_closest_elements(arr: list[int], k: int, x: int) -> list[int]:
    left, right = 0, len(arr) - 1

    # Shrink window until it has exactly k elements
    while right - left >= k:
        # Compare distances of left and right edges to x
        if abs(arr[left] - x) > abs(arr[right] - x):
            # Left edge is further, exclude it
            left += 1
        else:
            # Right edge is further (or equal), exclude it
            # Equal case: prefer smaller value (left), so exclude right
            right -= 1

    return arr[left:right + 1]
explanation: |
**Time Complexity:** O(n). The loop shrinks the window n - k times, and copying the slice takes O(k).

**Space Complexity:** O(k). Output array of k elements.

Start with the full array and repeatedly remove whichever edge element is further from x until k elements remain. When distances are equal, remove the larger (right) element to satisfy the tie-breaking rule. Simpler to understand than binary search, but slower when k is small relative to n.

- approach_name: Sort by Distance
is_optimal: false
code: |
def find_closest_elements(arr: list[int], k: int, x: int) -> list[int]:
    # Sort by distance to x, then by value for tie-breaking
    sorted_arr = sorted(arr, key=lambda a: (abs(a - x), a))

    # Take k closest and sort by value for output
    result = sorted(sorted_arr[:k])

    return result
explanation: |
**Time Complexity:** O(n log n). Sorting dominates.

**Space Complexity:** O(n). Sorted copy of the array.

Sort all elements by their distance to x (with value as tie-breaker), take the first k, then sort again by value for the output. This ignores the fact that the input is already sorted, making it less efficient than the binary search approach.
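
Although it is slower, the sort-based version follows the problem statement almost word for word, which makes it a convenient reference when testing the binary search. Below is a minimal sketch of such a cross-check; the function names `closest_by_binary_search` and `closest_by_sorting` and the small random ranges are arbitrary choices for this illustration.

```python
import random


def closest_by_binary_search(arr: list[int], k: int, x: int) -> list[int]:
    # Binary search for the start of the k-element window.
    left, right = 0, len(arr) - k
    while left < right:
        mid = left + (right - left) // 2
        if x - arr[mid] > arr[mid + k] - x:
            left = mid + 1
        else:
            right = mid
    return arr[left:left + k]


def closest_by_sorting(arr: list[int], k: int, x: int) -> list[int]:
    # Reference version: rank by (distance, value), then restore value order.
    return sorted(sorted(arr, key=lambda a: (abs(a - x), a))[:k])


random.seed(0)  # fixed seed so the run is repeatable
for _ in range(1000):
    n = random.randint(1, 12)
    arr = sorted(random.randint(-10, 10) for _ in range(n))
    k = random.randint(1, n)
    x = random.randint(-12, 12)
    expected = closest_by_sorting(arr, k, x)
    assert closest_by_binary_search(arr, k, x) == expected, (arr, k, x)

print("all random cases agree")
```

Small value ranges are deliberate: they produce many duplicates and ties, which are the cases most likely to expose a wrong comparison.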