title: Minimum Interval to Include Each Query
slug: minimum-interval-to-include-each-query
difficulty: hard
leetcode_id: 1851
leetcode_url: https://leetcode.com/problems/minimum-interval-to-include-each-query/
categories:
  - arrays
  - sorting
  - heap
patterns:
  - heap
description: |
  You are given a 2D integer array `intervals`, where `intervals[i] = [left_i, right_i]` describes the i^th interval starting at `left_i` and ending at `right_i` **(inclusive)**. The **size** of an interval is defined as the number of integers it contains, or more formally `right_i - left_i + 1`.

  You are also given an integer array `queries`. The answer to the j^th query is the **size of the smallest interval** `i` such that `left_i <= queries[j] <= right_i`. If no such interval exists, the answer is `-1`.

  Return *an array containing the answers to the queries*.
constraints: |
  - `1 <= intervals.length <= 10^5`
  - `1 <= queries.length <= 10^5`
  - `intervals[i].length == 2`
  - `1 <= left_i <= right_i <= 10^7`
  - `1 <= queries[j] <= 10^7`
examples:
  - input: "intervals = [[1,4],[2,4],[3,6],[4,4]], queries = [2,3,4,5]"
    output: "[3,3,1,4]"
    explanation: "Query = 2: The interval [2,4] is the smallest containing 2 (size = 3). Query = 3: [2,4] is smallest (size = 3). Query = 4: [4,4] is smallest (size = 1). Query = 5: [3,6] is smallest (size = 4)."
  - input: "intervals = [[2,3],[2,5],[1,8],[20,25]], queries = [2,19,5,22]"
    output: "[2,-1,4,6]"
    explanation: "Query = 2: [2,3] is smallest (size = 2). Query = 19: No interval contains 19, answer is -1. Query = 5: [2,5] is smallest (size = 4). Query = 22: [20,25] is smallest (size = 6)."
explanation:
  intuition: |
    Imagine you have a collection of intervals laid out on a number line, and for each query point, you need to find which interval "wraps" around it most tightly. The brute force approach would check every interval for every query, resulting in O(n * m) complexity. With up to 10^5 intervals and 10^5 queries, this means 10^10 operations, which is far too slow.

    The key insight is to **process queries in sorted order**. If we sort the intervals by left endpoint and the queries by value, we can efficiently manage which intervals are "active" (could potentially contain the current query) using a **min-heap**.

    Think of it like a sweep line moving left to right across the number line:
    - As we reach each query point, we "activate" all intervals that start at or before this point
    - We remove intervals whose right endpoint is before the query (they can't contain it, nor any later query)
    - Among the remaining active intervals, the smallest one wins

    The min-heap keeps intervals ordered by size, so once expired intervals have been popped from the top, the top of the heap is our answer.
  approach: |
    We solve this using a **Sorted Queries + Min-Heap** approach:

    **Step 1: Prepare the data**
    - Sort the intervals by their left endpoint (starting position)
    - Create a list of `(original_index, query_value)` pairs and sort by query value
    - This allows us to process queries left-to-right while preserving original order for the result

    **Step 2: Initialise tracking variables**
    - `result`: Array of size `len(queries)` to store answers, pre-filled with `-1`
    - `min_heap`: Priority queue storing `(interval_size, right_endpoint)` tuples
    - `interval_idx`: Pointer to track which intervals we've processed

    **Step 3: Process each query in sorted order**
    For each query (from smallest to largest):
    - **Add intervals**: While there are intervals with `left <= query`, push `(size, right)` onto the heap and advance `interval_idx`
    - **Remove expired intervals**: While the heap is non-empty and the top interval's `right < query`, pop it (it can't contain this query, and queries only increase from here)
    - **Record answer**: If the heap is non-empty, the top element's size is our answer; otherwise, answer is `-1`

    **Step 4: Return the result**
    - Since we processed queries in sorted order but stored answers at original indices, `result` is already correctly ordered

    This approach works because sorting allows us to push each interval at most once and pop it at most once,
    which bounds the total heap work at O(n log n).
  common_pitfalls:
    - title: The Brute Force Trap
      description: |
        A naive approach checks every interval for every query:

        ```python
        for query in queries:
            min_size = float("inf")
            for left, right in intervals:
                if left <= query <= right:
                    # track minimum size
                    min_size = min(min_size, right - left + 1)
        ```

        This is **O(n * m)** where n = number of intervals and m = number of queries. With constraints of 10^5 for both, this means 10^10 operations, which is guaranteed **Time Limit Exceeded**.
      wrong_approach: "Nested loops checking all interval-query pairs"
      correct_approach: "Sort and sweep with a min-heap for O(n log n + m log m)"
    - title: Forgetting to Preserve Query Order
      description: |
        Since we process queries in sorted order for efficiency, we must remember their original positions. If you just sort `queries` directly and build the result in that order, you'll return answers in the wrong order.

        Always pair each query with its original index: `sorted_queries = sorted(enumerate(queries), key=lambda x: x[1])`, then use the index to place answers correctly.
      wrong_approach: "Sort queries and return answers in sorted order"
      correct_approach: "Track original indices and place answers accordingly"
    - title: Not Removing Expired Intervals
      description: |
        After adding intervals to the heap, you must check if the top interval has expired (its `right < query`). Without this cleanup step, you might return the size of an interval that doesn't actually contain the query point.

        The key is that the heap is ordered by size, not by validity. Always pop expired intervals before reading the answer.
      wrong_approach: "Assume all intervals in heap are valid"
      correct_approach: "Pop intervals where right < query before reading answer"
    - title: Wrong Heap Priority
      description: |
        The heap should be ordered by interval **size** (smallest first), not by left or right endpoint. The problem asks for the smallest interval containing the query, so size must be the primary sort key.

        Store tuples as `(size, right)` where `size = right - left + 1`.
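A small sketch of why the first tuple element matters, assuming Python's `heapq`, which compares tuples element by element. It uses the three intervals from the first example that start at or before query 3; the names `by_size` and `by_left` are illustrative only.

```python
import heapq

# Intervals from the first example with left <= 3; all of them contain 3.
candidates = [[1, 4], [2, 4], [3, 6]]

by_size = []  # (size, right): the smallest interval surfaces first
by_left = []  # (left, size): the earliest-starting interval surfaces first
for left, right in candidates:
    size = right - left + 1
    heapq.heappush(by_size, (size, right))
    heapq.heappush(by_left, (left, size))

size_first_answer = by_size[0][0]  # size of [2, 4], the expected answer
left_first_answer = by_left[0][1]  # size of [1, 4], which is not the smallest
```

Keying the heap on the left endpoint surfaces `[1, 4]` (size 4) even though `[2, 4]` (size 3) also contains the query.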
      wrong_approach: "Heap ordered by left endpoint or right endpoint"
      correct_approach: "Heap ordered by interval size (right - left + 1)"
  key_takeaways:
    - "**Sweep line pattern**: Sorting queries enables efficient left-to-right processing where intervals are added once and removed at most once"
    - "**Min-heap for tracking minimums**: When you need the minimum among a dynamic set of candidates, a min-heap provides O(log n) operations"
    - "**Lazy deletion**: Instead of eagerly removing intervals, we check validity when we need the answer, a common optimisation with heaps"
    - "**Preserve original order**: When processing data in sorted order for efficiency, track original indices to reconstruct the expected output order"
  time_complexity: "O(n log n + m log m). Sorting intervals takes O(n log n), sorting queries takes O(m log m). Each interval is pushed and popped from the heap at most once, giving O(n log n) heap operations. Total: O(n log n + m log m), which is bounded by O((n + m) log(n + m))."
  space_complexity: "O(n + m). The heap can hold up to n intervals, and we store m query-index pairs. The result array uses O(m) space."
solutions:
  - approach_name: Sorted Queries with Min-Heap
    is_optimal: true
    code: |
      import heapq

      def min_interval(intervals: list[list[int]], queries: list[int]) -> list[int]:
          # Sort intervals by left endpoint (note: sorts the caller's list in place)
          intervals.sort(key=lambda x: x[0])

          # Pair each query with its original index, then sort by query value
          sorted_queries = sorted(enumerate(queries), key=lambda x: x[1])

          result = [-1] * len(queries)
          min_heap = []  # (interval_size, right_endpoint)
          interval_idx = 0

          for original_idx, query in sorted_queries:
              # Add all intervals that start at or before this query
              while interval_idx < len(intervals) and intervals[interval_idx][0] <= query:
                  left, right = intervals[interval_idx]
                  size = right - left + 1
                  heapq.heappush(min_heap, (size, right))
                  interval_idx += 1

              # Remove intervals that end before this query (can't contain it)
              while min_heap and min_heap[0][1] < query:
                  heapq.heappop(min_heap)

              # If any valid intervals remain, the smallest is at the top
              if min_heap:
                  result[original_idx] = min_heap[0][0]

          return result
    explanation: |
      **Time Complexity:** O(n log n + m log m). Sorting both arrays dominates, and each interval enters and leaves the heap at most once.
      **Space Complexity:** O(n + m). Heap holds up to n intervals, result array holds m answers.

      By sorting queries and processing them left-to-right, we ensure each interval is pushed onto the heap at most once. The min-heap maintains candidate intervals ordered by size, and lazy deletion removes expired intervals only when they reach the top.
  - approach_name: Brute Force
    is_optimal: false
    code: |
      def min_interval(intervals: list[list[int]], queries: list[int]) -> list[int]:
          result = []

          for query in queries:
              min_size = float('inf')

              # Check every interval for this query
              for left, right in intervals:
                  # Does this interval contain the query?
                  if left <= query <= right:
                      size = right - left + 1
                      min_size = min(min_size, size)

              # -1 if no interval contains this query
              result.append(min_size if min_size != float('inf') else -1)

          return result
    explanation: |
      **Time Complexity:** O(n * m). For each of m queries, we check all n intervals.
      **Space Complexity:** O(m). Only the result array.

      This straightforward approach checks every interval for every query. While correct, it's far too slow for the given constraints (up to 10^10 operations). Included to illustrate why the optimised heap-based approach is necessary.
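One way to gain confidence in the heap-based solution is to compare it with the brute force on the two examples and on small random inputs. The sketch below restates both approaches under the hypothetical names `min_interval_heap` and `min_interval_brute` so that it runs on its own. Those names, the `random_case` generator, and its size limits are assumptions made for illustration, not part of the original solutions.

```python
import heapq
import random


def min_interval_heap(intervals, queries):
    # Sweep over queries in increasing order, keeping a min-heap of (size, right).
    ordered = sorted(intervals)  # sorted copy, so the caller's list is untouched
    result = [-1] * len(queries)
    heap = []
    i = 0
    for idx, q in sorted(enumerate(queries), key=lambda pair: pair[1]):
        # Activate every interval that starts at or before q.
        while i < len(ordered) and ordered[i][0] <= q:
            left, right = ordered[i]
            heapq.heappush(heap, (right - left + 1, right))
            i += 1
        # Lazily drop intervals at the top that end before q.
        while heap and heap[0][1] < q:
            heapq.heappop(heap)
        if heap:
            result[idx] = heap[0][0]
    return result


def min_interval_brute(intervals, queries):
    # Reference answer: check every interval for every query.
    result = []
    for q in queries:
        sizes = [r - l + 1 for l, r in intervals if l <= q <= r]
        result.append(min(sizes) if sizes else -1)
    return result


def random_case(rng, max_coord=30, max_len=8):
    # Hypothetical generator for small test cases.
    intervals = []
    for _ in range(rng.randint(1, max_len)):
        left = rng.randint(1, max_coord)
        intervals.append([left, rng.randint(left, max_coord)])
    queries = [rng.randint(1, max_coord) for _ in range(rng.randint(1, max_len))]
    return intervals, queries


def cross_check(trials=200, seed=0):
    # Returns True when both implementations agree on every random trial.
    rng = random.Random(seed)
    for _ in range(trials):
        intervals, queries = random_case(rng)
        if min_interval_heap(intervals, queries) != min_interval_brute(intervals, queries):
            return False
    return True
```

Keeping the coordinates small makes overlapping intervals and uncovered queries likely, which exercises both the lazy-deletion path and the `-1` case.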