codetutor/backend/data/questions/find-k-closest-elements.yaml
2025-05-25 11:47:04 +01:00

title: Find K Closest Elements
slug: find-k-closest-elements
difficulty: medium
leetcode_id: 658
leetcode_url: https://leetcode.com/problems/find-k-closest-elements/
categories:
- arrays
- binary-search
- two-pointers
patterns:
- binary-search
- two-pointers
description: |
Given a **sorted** integer array `arr` and two integers `k` and `x`, return the `k` closest integers to `x` in the array. The result should also be sorted in ascending order.
An integer `a` is closer to `x` than an integer `b` if:
- `|a - x| < |b - x|`, or
- `|a - x| == |b - x|` and `a < b`
constraints: |
- `1 <= k <= arr.length`
- `1 <= arr.length <= 10^4`
- `arr` is sorted in **ascending** order
- `-10^4 <= arr[i], x <= 10^4`
examples:
- input: "arr = [1,2,3,4,5], k = 4, x = 3"
output: "[1,2,3,4]"
explanation: "Elements 2, 3, and 4 are within distance 1 of x=3. Elements 1 and 5 are tied at distance 2, so the smaller value 1 is chosen."
- input: "arr = [1,1,2,3,4,5], k = 4, x = -1"
output: "[1,1,2,3]"
explanation: "Since x = -1 is smaller than every element, the closest elements are simply the 4 smallest values. Both 1s (distance 2) are included, followed by 2 and 3."
explanation:
intuition: |
Imagine you're standing at position `x` on a number line, and the sorted array represents points along that line. You need to find the `k` points closest to where you're standing.
The key insight is that the answer is always a **contiguous subarray** of length `k`. Why? Because the array is sorted! If the elements at index `i` and index `j` (where `j > i + 1`) are both in your answer, then every element between them is at least as close to `x` as the farther of the two, so it belongs in the answer as well.
Think of it like this: you're looking for a **sliding window** of size `k` that captures the `k` closest elements. The question becomes: where should this window start?
Instead of searching for elements, we can **binary search for the left boundary** of this window. For any starting position, we compare whether the left edge or the element just past the right edge is further from `x`. This tells us whether to move the window left or right.
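To make the contiguity claim concrete, here is a minimal sketch that picks the `k` closest values by brute force and then checks that the result matches some window of the sorted array. The helper names `brute_force_closest` and `is_some_window` are hypothetical and used only for illustration.

```python
def brute_force_closest(arr: list[int], k: int, x: int) -> list[int]:
    # Rank by (distance, value), keep the best k, then restore ascending order
    return sorted(sorted(arr, key=lambda a: (abs(a - x), a))[:k])


def is_some_window(arr: list[int], k: int, picked: list[int]) -> bool:
    # True if `picked` equals arr[s:s + k] for at least one start index s
    return any(arr[s:s + k] == picked for s in range(len(arr) - k + 1))


picked = brute_force_closest([1, 2, 3, 4, 5], k=4, x=3)
found = is_some_window([1, 2, 3, 4, 5], 4, picked)
```

Here `picked` is `[1, 2, 3, 4]`, which is exactly the window starting at index 0, so `found` is `True`.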
approach: |
We solve this using **Binary Search for Window Start**:
**Step 1: Define the search space**
- We're searching for the starting index of a window of size `k`
- The starting index can range from `0` to `len(arr) - k`
- Set `left = 0`, `right = len(arr) - k`
&nbsp;
**Step 2: Binary search for optimal start position**
- While `left < right`:
- Calculate `mid = left + (right - left) // 2`
- Compare `x - arr[mid]` with `arr[mid + k] - x`
- If `x - arr[mid] > arr[mid + k] - x`:
- The left edge is further from `x` than the element just past the right edge
- Move the window right: `left = mid + 1`
- Else:
- The left edge is closer (or equal), keep it as a candidate
- `right = mid`
&nbsp;
**Step 3: Return the window**
- Return `arr[left:left + k]`
&nbsp;
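Why compare `x - arr[mid]` with `arr[mid + k] - x` instead of using absolute values? Because `arr` is sorted, `arr[mid] <= arr[mid + k]`. When `x` lies between the two, both differences are the true distances. When `x` lies to the left of both, `x - arr[mid]` is non-positive, so the test fails and the window stays put or moves left. When `x` lies to the right of both, `arr[mid + k] - x` is non-positive, so the window moves right unless both values equal `x`. The signed comparison therefore picks the correct direction even when duplicates make the two absolute distances equal.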
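The steps above can be followed on a small input. The sketch below uses the same loop but records `(left, mid, right)` at each iteration; the function name `trace_window_search` and the sample input are assumptions made for illustration.

```python
def trace_window_search(
    arr: list[int], k: int, x: int
) -> tuple[list[tuple[int, int, int]], int]:
    steps = []
    left, right = 0, len(arr) - k
    while left < right:
        mid = left + (right - left) // 2
        steps.append((left, mid, right))  # state before the comparison
        if x - arr[mid] > arr[mid + k] - x:
            left = mid + 1  # left edge is further: slide the window right
        else:
            right = mid  # left edge is closer or tied: keep mid as a candidate
    return steps, left


data = [1, 2, 3, 4, 5, 6, 7, 8]
steps, start = trace_window_search(data, k=3, x=6)
window = data[start:start + 3]
```

For this input the recorded steps are `(0, 2, 5)`, `(3, 4, 5)`, `(3, 3, 4)`, the search ends at `start = 4`, and `window` is `[5, 6, 7]`.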
common_pitfalls:
- title: Sorting with Custom Key
description: |
A common first approach is to sort the array by distance to `x`:
```python
sorted(arr, key=lambda a: (abs(a - x), a))[:k]
```
This works but has **O(n log n)** time complexity. Since the array is already sorted, we can do better with O(log n + k) using binary search.
wrong_approach: "Sort by distance, take first k"
correct_approach: "Binary search for window start position"
- title: Using Absolute Values in Comparison
description: |
When comparing distances during binary search, using `abs(arr[mid] - x)` vs `abs(arr[mid + k] - x)` can lead to subtle bugs.
The comparison `x - arr[mid] > arr[mid + k] - x` works because:
- If both elements are on the same side of `x`, one difference is non-positive, so its sign tells us which way to move
- If they straddle `x`, both differences are non-negative and equal the true distances
With absolute values, duplicates break this: when `arr[mid] == arr[mid + k]` and both are smaller than `x`, the distances tie and the window stops moving right even though closer elements lie further right.
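A small input with duplicates shows the difference. This is a sketch only: `window_search` and its `use_abs` flag are hypothetical names that switch between the two comparisons so they can be run side by side.

```python
def window_search(arr: list[int], k: int, x: int, use_abs: bool) -> list[int]:
    left, right = 0, len(arr) - k
    while left < right:
        mid = left + (right - left) // 2
        if use_abs:
            # Buggy variant: the sign information is thrown away
            move_right = abs(arr[mid] - x) > abs(arr[mid + k] - x)
        else:
            move_right = x - arr[mid] > arr[mid + k] - x
        if move_right:
            left = mid + 1
        else:
            right = mid
    return arr[left:left + k]


arr = [1, 1, 2, 2, 2, 2, 2, 3, 3]
signed_result = window_search(arr, k=3, x=3, use_abs=False)
abs_result = window_search(arr, k=3, x=3, use_abs=True)
```

With `x = 3`, the signed comparison returns `[2, 3, 3]`, while the `abs()` variant gets stuck on the run of 2s and returns `[2, 2, 2]`.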
wrong_approach: "abs(arr[mid] - x) vs abs(arr[mid + k] - x)"
correct_approach: "x - arr[mid] vs arr[mid + k] - x"
- title: Wrong Search Space Bounds
description: |
The right bound must be `len(arr) - k`, not `len(arr) - 1`. We're searching for the *start* of a window of size `k`, so the maximum valid start index is `n - k`.
If `arr = [1,2,3,4,5]` and `k = 3`, valid start indices are 0, 1, 2 (giving windows [1,2,3], [2,3,4], [3,4,5]).
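The bound also protects the `arr[mid + k]` lookup. The sketch below lists the valid windows for the example above and computes the first index probed under each choice of `right`; the helper name `first_probe_index` is hypothetical.

```python
def first_probe_index(n: int, k: int, right: int) -> int:
    # Index read by arr[mid + k] on the first iteration of the search
    left = 0
    mid = left + (right - left) // 2
    return mid + k


arr, k = [1, 2, 3, 4, 5], 3
windows = [arr[s:s + k] for s in range(len(arr) - k + 1)]

good_probe = first_probe_index(len(arr), k, right=len(arr) - k)  # stays in range
bad_probe = first_probe_index(len(arr), k, right=len(arr) - 1)   # equals len(arr)
```

With `right = len(arr) - k` the first probe reads index 4, the last valid index. With `right = len(arr) - 1` it reads index 5, which raises `IndexError` on a 5-element list.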
wrong_approach: "right = len(arr) - 1"
correct_approach: "right = len(arr) - k"
key_takeaways:
- "**Contiguous subarray insight**: In a sorted array, the k closest elements form a contiguous window"
- "**Binary search for boundaries**: Instead of searching for elements, search for the optimal window position"
- "**Comparison without abs()**: The signed differences `x - arr[mid]` and `arr[mid + k] - x` pick the right direction even when both elements lie on the same side of `x`"
- "**Foundation for window problems**: This technique extends to other problems about finding optimal subarrays in sorted data"
time_complexity: "O(log(n - k) + k). Binary search takes O(log(n - k)), and returning the slice takes O(k)."
space_complexity: "O(k). The returned list contains k elements. The binary search itself uses O(1) extra space."
solutions:
- approach_name: Binary Search for Window Start
is_optimal: true
code: |
def find_closest_elements(arr: list[int], k: int, x: int) -> list[int]:
    # Search for the starting index of the k-element window
    left, right = 0, len(arr) - k
    while left < right:
        mid = left + (right - left) // 2
        # Compare left edge distance vs element just past right edge
        if x - arr[mid] > arr[mid + k] - x:
            # Left edge is further, move window right
            left = mid + 1
        else:
            # Left edge is closer (or equal), keep as candidate
            right = mid
    # Return the k-element window starting at left
    return arr[left:left + k]
explanation: |
**Time Complexity:** O(log(n - k) + k) — Binary search over n - k + 1 positions, plus slicing k elements.
**Space Complexity:** O(k) — Output array of k elements.
We binary search for the optimal starting position of a window of size k. The comparison `x - arr[mid] > arr[mid + k] - x` determines if the left boundary or the element just past the right boundary is further from x. This guides us toward the optimal window.
- approach_name: Two Pointers (Shrinking Window)
is_optimal: false
code: |
def find_closest_elements(arr: list[int], k: int, x: int) -> list[int]:
    left, right = 0, len(arr) - 1
    # Shrink window until it has exactly k elements
    while right - left >= k:
        # Compare distances of left and right edges to x
        if abs(arr[left] - x) > abs(arr[right] - x):
            # Left edge is further, exclude it
            left += 1
        else:
            # Right edge is further (or equal), exclude it
            # Equal case: prefer smaller value (left), so exclude right
            right -= 1
    return arr[left:right + 1]
explanation: |
**Time Complexity:** O(n). We shrink the window n - k times, then copy k elements into the result.
**Space Complexity:** O(k) — Output array of k elements.
Start with the full array and repeatedly remove the element furthest from x until k elements remain. When distances are equal, remove the larger (right) element to satisfy the tie-breaking rule. Simpler to understand than binary search but slower for small k.
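The tie-breaking rule hinges on the strict `>` in the comparison. As a rough illustration, the sketch below adds a hypothetical `strict` flag to the same loop so both variants can be compared on the first example.

```python
def shrink_window(arr: list[int], k: int, x: int, strict: bool = True) -> list[int]:
    left, right = 0, len(arr) - 1
    while right - left >= k:
        left_dist, right_dist = abs(arr[left] - x), abs(arr[right] - x)
        # strict=True drops the right edge on ties; strict=False drops the left
        drop_left = left_dist > right_dist if strict else left_dist >= right_dist
        if drop_left:
            left += 1
        else:
            right -= 1
    return arr[left:right + 1]


correct = shrink_window([1, 2, 3, 4, 5], k=4, x=3)
wrong = shrink_window([1, 2, 3, 4, 5], k=4, x=3, strict=False)
```

Elements 1 and 5 are both at distance 2 from `x = 3`. The strict comparison keeps the smaller value and returns `[1, 2, 3, 4]`; the non-strict one returns `[2, 3, 4, 5]`, which violates the tie-breaking rule.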
- approach_name: Sort by Distance
is_optimal: false
code: |
def find_closest_elements(arr: list[int], k: int, x: int) -> list[int]:
    # Sort by distance to x, then by value for tie-breaking
    sorted_arr = sorted(arr, key=lambda a: (abs(a - x), a))
    # Take k closest and sort by value for output
    result = sorted(sorted_arr[:k])
    return result
explanation: |
**Time Complexity:** O(n log n) — Sorting dominates.
**Space Complexity:** O(n) — Sorted copy of the array.
Sort all elements by their distance to x (with value as tie-breaker), take the first k, then sort again by value for the output. This ignores the fact that the input is already sorted, making it less efficient than the binary search approach.