questions F-L
211
backend/data/questions/kth-largest-element-in-an-array.yaml
Normal file
@@ -0,0 +1,211 @@
title: Kth Largest Element in an Array
slug: kth-largest-element-in-an-array
difficulty: medium
leetcode_id: 215
leetcode_url: https://leetcode.com/problems/kth-largest-element-in-an-array/
categories:
- arrays
- sorting
- heap
patterns:
- heap
- binary-search

description: |
Given an integer array `nums` and an integer `k`, return *the* `k`<sup>th</sup> *largest element in the array*.

Note that it is the `k`<sup>th</sup> largest element in the sorted order, not the `k`<sup>th</sup> distinct element.

Can you solve it without sorting?

constraints: |
- `1 <= k <= nums.length <= 10^5`
- `-10^4 <= nums[i] <= 10^4`

examples:
- input: "nums = [3,2,1,5,6,4], k = 2"
output: "5"
explanation: "The sorted array is [1,2,3,4,5,6]. The 2nd largest element is 5."
- input: "nums = [3,2,3,1,2,4,5,5,6], k = 4"
output: "4"
explanation: "The sorted array is [1,2,2,3,3,4,5,5,6]. The 4th largest element is 4."

explanation:
intuition: |
Imagine you have a collection of exam scores and you want to find the student who ranked `k`<sup>th</sup> from the top. The most straightforward approach is to sort all scores and pick the `k`<sup>th</sup> one from the end, but can we do better?

Think of it like this: if you only need to find *one* specific ranking, do you really need to sort *everything*? This is similar to finding the tallest person in a room versus sorting everyone by height: the first task is much simpler.

The key insight is that we don't need a fully sorted array. We only need the element that would sit at index `n - k` (0-indexed) if the array were sorted in ascending order. This opens the door to more efficient approaches:

1. **Heap approach**: Maintain a "top k" collection using a min-heap of size `k`. Any element smaller than our current `k`<sup>th</sup> largest can be discarded.

2. **Quickselect approach**: Use the partitioning logic from quicksort, but only recurse into the side that contains our target position.

Both avoid the full `O(n log n)` cost of sorting (quickselect on average) when we only need partial ordering.
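
As a quick sanity check of the index arithmetic above, the sketch below compares a full sort against a partial ordering obtained with the standard library's `heapq.nlargest`. The two function names are illustrative only and do not appear in the question file.

```python
import heapq


def kth_largest_by_sorting(nums: list[int], k: int) -> int:
    # Baseline: full ascending sort, then read index n - k (0-indexed).
    return sorted(nums)[len(nums) - k]


def kth_largest_by_partial_order(nums: list[int], k: int) -> int:
    # heapq.nlargest returns the k largest values in descending order,
    # so the last one is the kth largest.
    return heapq.nlargest(k, nums)[-1]


nums = [3, 2, 3, 1, 2, 4, 5, 5, 6]
print(kth_largest_by_sorting(nums, 4), kth_largest_by_partial_order(nums, 4))
```

Both calls agree on the second example from this question, which is the point: only the `k` largest values ever need to be ordered.
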

approach: |
We'll focus on the **Min-Heap approach** as the primary solution due to its consistent performance and clarity:

**Step 1: Understand the heap strategy**

- We maintain a min-heap of size `k`
- Once at least `k` elements have been seen, the min-heap contains the `k` largest elements seen so far
- The root of the heap (minimum of these `k` elements) is our answer

**Step 2: Initialise the heap**

- Create an empty min-heap
- We'll use Python's `heapq`, which implements a min-heap

**Step 3: Process each element**

- For each number in the array:
  - If the heap has fewer than `k` elements, push the number
  - Otherwise, if the number is larger than the heap's minimum (root), replace the root with this number
  - This ensures we always keep the `k` largest elements

**Step 4: Return the result**

- The root of the heap is the `k`<sup>th</sup> largest element
- Return `heap[0]`

**Why this works**: By keeping exactly `k` elements and always removing the smallest when we exceed capacity, we guarantee that every discarded element is no larger than the smallest element left in the heap. The root is therefore exactly the `k`<sup>th</sup> largest overall.
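
To make the four steps concrete, here is a minimal sketch that records the heap contents after each element of the first example. The helper name `trace_top_k` is hypothetical, and it uses `heapq.heapreplace` for the "replace the root" step; the solution listed later in this file uses push-then-pop instead.

```python
import heapq


def trace_top_k(nums: list[int], k: int) -> list[list[int]]:
    """Return a sorted snapshot of the heap after each element is processed."""
    heap: list[int] = []
    snapshots = []
    for num in nums:
        if len(heap) < k:
            # Heap not full yet, so just push.
            heapq.heappush(heap, num)
        elif num > heap[0]:
            # num beats the current kth largest, so it replaces the root.
            heapq.heapreplace(heap, num)
        snapshots.append(sorted(heap))
    return snapshots


for step in trace_top_k([3, 2, 1, 5, 6, 4], k=2):
    print(step)
```

For `nums = [3,2,1,5,6,4]` and `k = 2`, the final snapshot is `[5, 6]`, so the root `5` is the answer.
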

common_pitfalls:
- title: Off-by-One with Heap Size
description: |
A common mistake is confusion about when to push vs. replace in the heap.

If you always push and then pop when the size exceeds `k`, the element you just added is popped straight away whenever it is the smallest. That is the intended outcome, not a bug: the heap still holds the `k` largest elements. The actual off-by-one risk is the size check: popping when the size reaches `k`, rather than when it exceeds `k`, leaves only `k - 1` elements, so the root has the wrong rank.

Checking whether the new element is larger than the heap's minimum *before* adding it avoids some heap operations, but the extra branching is easy to get wrong. Pushing unconditionally and popping when the size exceeds `k` is simpler and works correctly, though slightly less efficient.
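
One way to sidestep the size check entirely is sketched below: seed the heap with the first `k` elements, then let `heapq.heappushpop` handle every remaining element. The function name `find_kth_largest_pushpop` is an illustrative choice, and the sketch assumes `1 <= k <= len(nums)` as the constraints state.

```python
import heapq


def find_kth_largest_pushpop(nums: list[int], k: int) -> int:
    # Seed the heap with the first k elements (heapify is O(k)).
    heap = nums[:k]
    heapq.heapify(heap)

    for num in nums[k:]:
        # heappushpop pushes num, then pops and returns the smallest item.
        # If num is not larger than the root, the heap is left unchanged.
        heapq.heappushpop(heap, num)

    # Exactly k elements remain; the root is the kth largest.
    return heap[0]


print(find_kth_largest_pushpop([3, 2, 3, 1, 2, 4, 5, 5, 6], 4))
```

Because the heap size never changes after seeding, there is no `>` versus `>=` comparison to get wrong.
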
wrong_approach: "Complex conditional logic that's easy to get wrong"
correct_approach: "Push then pop if size > k, or use heappushpop for efficiency"

- title: Using Max-Heap Incorrectly
description: |
Some attempt to use a max-heap of the entire array and pop `k - 1` times, then read the root. This is correct, and its running time is competitive:

- Building a max-heap: `O(n)`
- Popping `k - 1` times: `O(k log n)`
- Total: `O(n + k log n)`

The cost is memory: the heap holds all `n` elements, so the whole input must be available up front. A min-heap of size `k` runs in `O(n log k)` but needs only `O(k)` space and can process elements one at a time, which matters when `k` is small relative to `n` or the data arrives as a stream.
wrong_approach: "Max-heap of all elements, pop k-1 times"
correct_approach: "Min-heap of size k, maintaining the k largest"

- title: Forgetting Python's heapq is Min-Heap Only
description: |
The standard `heapq` functions (`heappush`, `heappop`, `heapify`) operate on a min-heap. To simulate a max-heap with them, you must negate values when pushing and negate again when popping.

For this problem, a min-heap is actually what we want: we keep the `k` largest elements by discarding elements smaller than our current `k`<sup>th</sup> largest.
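
For comparison, the negation trick can be sketched as follows. This is the max-heap variant described in the previous pitfall, shown only to illustrate the double negation; the function name `find_kth_largest_max_heap` is hypothetical.

```python
import heapq


def find_kth_largest_max_heap(nums: list[int], k: int) -> int:
    # Negate every value so the min-heap root is the largest original value.
    max_heap = [-num for num in nums]
    heapq.heapify(max_heap)  # O(n)

    # Discard the k - 1 largest values.
    for _ in range(k - 1):
        heapq.heappop(max_heap)  # O(log n) each

    # Negate again to recover the original value.
    return -max_heap[0]


print(find_kth_largest_max_heap([3, 2, 1, 5, 6, 4], 2))
```

Note that the negated copy holds all `n` elements, whereas the min-heap solution never stores more than about `k`.
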
wrong_approach: "Assuming heapq has a max-heap option"
correct_approach: "Use min-heap directly for finding kth largest"

key_takeaways:
- "**Partial ordering insight**: When you only need one specific rank, you don't need to sort everything; use a heap or quickselect instead"
- "**Min-heap for top-k**: A min-heap of size `k` naturally maintains the `k` largest elements, with the `k`<sup>th</sup> largest at the root"
- "**Trade-off awareness**: Heap gives `O(n log k)` guaranteed; Quickselect gives `O(n)` average but `O(n^2)` worst case"
- "**Foundation pattern**: This technique applies to streaming data, top-k frequent elements, and many ranking problems"

time_complexity: "O(n log k). We iterate through all `n` elements, and each heap operation (push/pop) takes `O(log k)` time since the heap size is bounded by `k`."
space_complexity: "O(k). We maintain a heap containing at most `k` elements."

solutions:
- approach_name: Min-Heap
is_optimal: true
code: |
import heapq

def find_kth_largest(nums: list[int], k: int) -> int:
    # Min-heap to store the k largest elements
    heap = []

    for num in nums:
        # Add current number to heap
        heapq.heappush(heap, num)

        # If heap exceeds size k, remove the smallest
        # This ensures we keep only the k largest elements
        if len(heap) > k:
            heapq.heappop(heap)

    # The root of min-heap is the kth largest
    return heap[0]
explanation: |
**Time Complexity:** O(n log k). We process each of `n` elements with heap operations costing `O(log k)`.

**Space Complexity:** O(k). The heap holds at most `k` elements after each iteration (briefly `k + 1` between the push and the pop).

This approach maintains a min-heap of the `k` largest elements seen so far. By keeping the heap size at `k` and using a min-heap, the smallest element in our collection (the root) is the `k`<sup>th</sup> largest of everything processed so far, and of the whole array once the loop ends.

- approach_name: Quickselect
is_optimal: true
code: |
import random

def find_kth_largest(nums: list[int], k: int) -> int:
    # Convert kth largest to index in sorted array
    # kth largest = element at index (n - k) in ascending order
    target_index = len(nums) - k

    def quickselect(left: int, right: int) -> int:
        # Random pivot to avoid worst-case on sorted input
        pivot_idx = random.randint(left, right)
        pivot = nums[pivot_idx]

        # Move pivot to end
        nums[pivot_idx], nums[right] = nums[right], nums[pivot_idx]

        # Partition: elements < pivot go to the left
        store_idx = left
        for i in range(left, right):
            if nums[i] < pivot:
                nums[store_idx], nums[i] = nums[i], nums[store_idx]
                store_idx += 1

        # Move pivot to its final sorted position
        nums[store_idx], nums[right] = nums[right], nums[store_idx]

        # Check if we found the target
        if store_idx == target_index:
            return nums[store_idx]
        elif store_idx < target_index:
            # Target is in the right partition
            return quickselect(store_idx + 1, right)
        else:
            # Target is in the left partition
            return quickselect(left, store_idx - 1)

    return quickselect(0, len(nums) - 1)
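explanation: |
**Time Complexity:** O(n) average, O(n^2) worst case. The average case is linear because we only recurse into one side of the partition. Random pivot selection makes the worst case very unlikely when the values are mostly distinct. With many duplicates, however, this two-way partition (`nums[i] < pivot`) places every copy of the pivot on one side, so the range can shrink by as little as one element per call and the running time approaches `O(n^2)`.

**Space Complexity:** O(log n) average for the recursion stack, O(n) worst case. On large inputs with many equal values, the worst case can exceed Python's default recursion limit.

Quickselect uses the partitioning logic from quicksort but only recurses into the partition containing our target index. This reduces the expected work from `O(n log n)` to `O(n)`.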
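
Inputs with many equal values are a known weak spot for a two-way partition: every copy of the pivot lands on one side, so each call may shrink the range by only one element. The sketch below is one possible variant, not the solution stored in this file. It swaps in a three-way (Dutch national flag) partition and a loop in place of recursion. The name `find_kth_largest_three_way` is hypothetical, and like the original it reorders `nums` in place.

```python
import random


def find_kth_largest_three_way(nums: list[int], k: int) -> int:
    # kth largest = element at index (n - k) in ascending order
    target_index = len(nums) - k
    left, right = 0, len(nums) - 1

    while True:
        pivot = nums[random.randint(left, right)]

        # Three-way partition of nums[left..right]:
        # [ < pivot | == pivot | > pivot ]
        lt, i, gt = left, left, right
        while i <= gt:
            if nums[i] < pivot:
                nums[lt], nums[i] = nums[i], nums[lt]
                lt += 1
                i += 1
            elif nums[i] > pivot:
                nums[gt], nums[i] = nums[i], nums[gt]
                gt -= 1
            else:
                i += 1

        # nums[lt..gt] now holds every copy of the pivot.
        if target_index < lt:
            right = lt - 1
        elif target_index > gt:
            left = gt + 1
        else:
            return pivot


print(find_kth_largest_three_way([7] * 100_000, 50_000))
```

With all elements equal, the first partition already covers the target index, so the loop exits after a single pass and no recursion depth is consumed.
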

- approach_name: Sorting
is_optimal: false
code: |
def find_kth_largest(nums: list[int], k: int) -> int:
    # Sort in descending order (in place, so the caller's list is reordered)
    nums.sort(reverse=True)

    # Return the kth element (0-indexed, so k-1)
    return nums[k - 1]
explanation: |
**Time Complexity:** O(n log n). Dominated by the sorting step.

**Space Complexity:** O(1) to O(n), depending on the sorting algorithm. Python's `list.sort()` sorts in place but may allocate up to `O(n)` temporary memory for merging.

The simplest approach: sort and index. While not optimal for this specific problem, it's worth knowing as a baseline. For small arrays or when `k` is close to `n`, the practical difference may be negligible.