title: Task Scheduler
slug: task-scheduler
difficulty: medium
leetcode_id: 621
leetcode_url: https://leetcode.com/problems/task-scheduler/
categories:
  - arrays
  - hash-tables
  - heap
patterns:
  - slug: greedy
    is_optimal: true
  - slug: heap
    is_optimal: false
function_signature: "def least_interval(tasks: list[str], n: int) -> int:"
test_cases:
  visible:
    - input: { tasks: ["A", "A", "A", "B", "B", "B"], n: 2 }
      expected: 8
    - input: { tasks: ["A", "C", "A", "B", "D", "B"], n: 1 }
      expected: 6
    - input: { tasks: ["A", "A", "A", "B", "B", "B"], n: 3 }
      expected: 10
  hidden:
    - input: { tasks: ["A", "A", "A", "B", "B", "B"], n: 0 }
      expected: 6
    - input: { tasks: ["A", "A", "A", "A", "A", "A", "B", "C", "D", "E", "F", "G"], n: 2 }
      expected: 16
    - input: { tasks: ["A"], n: 5 }
      expected: 1
    - input: { tasks: ["A", "B", "C", "D", "E", "A", "B", "C", "D", "E"], n: 4 }
      expected: 10
    - input: { tasks: ["A", "A", "B", "B", "C", "C", "D", "D"], n: 3 }
      expected: 8
description: |
  You are given an array of CPU `tasks`, each labelled with a letter from A to Z, and a number `n`. Each CPU interval can be idle or allow the completion of one task. Tasks can be completed in any order, but there's a constraint: there has to be a gap of **at least** `n` intervals between two tasks with the same label.

  Return *the minimum number of CPU intervals required to complete all tasks*.
constraints: |
  - `1 <= tasks.length <= 10^4`
  - `tasks[i]` is an uppercase English letter
  - `0 <= n <= 100`
examples:
  - input: 'tasks = ["A","A","A","B","B","B"], n = 2'
    output: "8"
    explanation: "A possible sequence is: A -> B -> idle -> A -> B -> idle -> A -> B. After completing task A, you must wait two intervals before doing A again. The same applies to task B."
  - input: 'tasks = ["A","C","A","B","D","B"], n = 1'
    output: "6"
    explanation: "A possible sequence is: A -> B -> C -> D -> A -> B. With a cooling interval of 1, you can repeat a task after just one other task."
  - input: 'tasks = ["A","A","A","B","B","B"], n = 3'
    output: "10"
    explanation: "A possible sequence is: A -> B -> idle -> idle -> A -> B -> idle -> idle -> A -> B. There are only two types of tasks, A and B, which need to be separated by 3 intervals."
explanation:
  intuition: |
    Imagine you're a CPU scheduler trying to execute tasks with mandatory cooldown periods. The **most frequent task** is your bottleneck: it determines the minimum structure of your schedule.

    Think of it like this: if task A appears 3 times and must have 2 intervals between repetitions, you need at least the slots: `A _ _ A _ _ A`. That's a frame of `(count_A - 1) * (n + 1) + 1` intervals just for A.

    The key insight is that **idle slots only appear when you don't have enough other tasks to fill the gaps**. If you have many different tasks, they can fill the cooling gaps perfectly, and you never idle. But if your tasks are dominated by one or two high-frequency labels, you'll have idle slots.

    Visualise the schedule as a grid:
    - Each row represents a "cycle" of `n + 1` slots
    - The most frequent tasks occupy the first column
    - Other tasks fill in the remaining slots
    - Empty slots become idle time

    For `tasks = [A,A,A,B,B,B]` with `n = 2`:
    ```
    | A | B | idle |
    | A | B | idle |
    | A | B |      |
    ```
    This gives us `(3-1) * 3 + 2 = 8` intervals.
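
    The grid picture can be turned into a quick count of idle cells. The sketch below is illustrative only: `idle_slots` is a hypothetical helper, not part of the required solution, and it assumes the frame has `(max_count - 1)` full rows of `n + 1` cells plus a last row with one cell per most-frequent label.

    ```python
    from collections import Counter


    def idle_slots(tasks: list[str], n: int) -> int:
        """Count the grid cells that no task can fill (hypothetical helper)."""
        freq = Counter(tasks)
        max_count = max(freq.values())
        num_max = sum(1 for c in freq.values() if c == max_count)
        # (max_count - 1) full rows of n + 1 cells, plus a last row of num_max cells.
        frame = (max_count - 1) * (n + 1) + num_max
        # Cells not covered by a task stay idle; a surplus of tasks means no idling.
        return max(0, frame - len(tasks))


    print(idle_slots(["A", "A", "A", "B", "B", "B"], 2))  # 2
    print(idle_slots(["A", "C", "A", "B", "D", "B"], 1))  # 0
    ```

    The first call matches the two `idle` cells in the grid above; the second has enough distinct tasks to fill every gap.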
  approach: |
    We solve this using a **Greedy (Math) Approach** based on the most frequent task:

    **Step 1: Count task frequencies**
    - Use a hash map or counter to count occurrences of each task
    - Find `max_count`: the highest frequency among all tasks
    - Find `num_max`: how many tasks have this maximum frequency

    **Step 2: Calculate the frame size**
    - The most frequent task creates a "frame" of `(max_count - 1)` complete cycles
    - Each cycle has `n + 1` slots (the task itself plus `n` cooling intervals)
    - Frame size = `(max_count - 1) * (n + 1)`

    **Step 3: Add the final row**
    - After the last cycle, we still need to execute the tasks with maximum frequency one more time
    - Add `num_max` to account for all tasks that appear `max_count` times

    **Step 4: Handle the edge case**
    - If we have many diverse tasks, they can fill all gaps with no idle time
    - In this case, the answer is simply `len(tasks)` (no idle needed)
    - Return `max(len(tasks), frame_size + num_max)`

    This formula works because of the greedy insight: place the most frequent tasks first and fill the gaps with the rest, so idle time only exists when the frame isn't filled. When there are more tasks than frame slots, the surplus tasks widen the rows without breaking any cooldown, so every interval does useful work.
  common_pitfalls:
    - title: Simulating the Actual Schedule
      description: |
        A common approach is to simulate task execution using a heap, popping the most frequent task, decrementing it, and tracking cooldowns. While this works, it's **O(n * total_tasks)** in the worst case.

        For `tasks.length = 10^4` and `n = 100`, simulation can take on the order of 10^6 steps. The math-based greedy approach runs in **O(tasks.length)** time with just counting.
      wrong_approach: "Heap-based simulation with cooldown tracking"
      correct_approach: "Mathematical formula based on max frequency"
    - title: Forgetting Tasks with Same Max Frequency
      description: |
        If multiple tasks share the maximum frequency, they all need a slot in the final row.

        For example, with `tasks = [A,A,A,B,B,B]` and `n = 2`:
        - Both A and B appear 3 times
        - The final row needs both A and B: `... A B`
        - Formula: `(3-1) * 3 + 2 = 8`, not `(3-1) * 3 + 1 = 7`

        Always count how many tasks have the maximum frequency (`num_max`).
      wrong_approach: "Only adding 1 for the final row"
      correct_approach: "Adding num_max (count of tasks with max frequency)"
    - title: Ignoring the No-Idle Case
      description: |
        When `n` is small or tasks are highly diverse, you might not need any idle time at all.

        For `tasks = [A,B,C,D,E,F,G,H,I,J]` with `n = 1`, each task appears once. The formula gives `(1-1) * 2 + 10 = 10`, which equals `len(tasks)`. The formula can also fall below `len(tasks)`: for `tasks = [A,A,B,C,D]` with `n = 1` it gives `(2-1) * 2 + 1 = 3`, yet all 5 tasks must still run.

        Always return `max(len(tasks), calculated_result)`.
      wrong_approach: "Returning the formula result directly"
      correct_approach: "Return max of formula result and total task count"
  key_takeaways:
    - "**Greedy insight**: The most frequent task determines the minimum schedule structure, so arrange the schedule around its cooldown requirements"
    - "**Math over simulation**: Many scheduling problems have closed-form solutions based on counting, avoiding expensive simulation"
    - "**Frame visualisation**: Think of the schedule as a grid with `n+1` columns; idle slots only appear when you can't fill the frame"
    - "**Related problems**: This pattern applies to Task Scheduler II, Reorganize String, and other cooldown/spacing problems"
  time_complexity: "O(n), where n here is the number of tasks (not the cooldown parameter). We iterate through the tasks once to count frequencies, then compute the result in constant time."
  space_complexity: "O(1). We use a fixed-size counter (at most 26 letters), which is constant regardless of input size."
  pattern_comparison: |
    **Math Formula vs Heap Simulation: Why Greedy Wins**

    Both approaches produce correct answers, but the formula does far less work:

    | Approach | Time | Space | Notes |
    |----------|------|-------|-------|
    | **Greedy (Math)** | O(n) | O(1) | Simple counting + formula |
    | **Heap Simulation** | O(n × m) | O(26) = O(1) | Simulates actual schedule |

    Where `n` = number of tasks and `m` = cooldown period (the `n` parameter of `least_interval`).

    **Why the math formula is superior:**
    - **Insight over simulation**: The formula captures the *essence* of the problem: the most frequent task dictates the schedule structure
    - **Constant-time calculation**: After counting (O(n)), the formula runs in O(1)
    - **No edge cases in execution**: The heap simulation has tricky termination conditions

    **When Heap Simulation is useful:**
    - When you need the **actual schedule** (not just the count)
    - For variations where the greedy insight doesn't apply (e.g., task dependencies)
    - As a verification tool to validate your math solution

    **Interview strategy**: Mention both approaches. The math solution shows algorithmic insight; the heap shows you can handle complex state. Start with heap if unsure, optimise to math if time permits.
solutions:
  - approach_name: Greedy (Math Formula)
    is_optimal: true
    code: |
      from collections import Counter

      def least_interval(tasks: list[str], n: int) -> int:
          # Count frequency of each task
          freq = Counter(tasks)

          # Find the maximum frequency
          max_count = max(freq.values())

          # Count how many tasks have this maximum frequency
          num_max = sum(1 for count in freq.values() if count == max_count)

          # Calculate minimum intervals using the frame formula
          # (max_count - 1) complete cycles of (n + 1) slots each
          # Plus num_max tasks in the final partial row
          frame_size = (max_count - 1) * (n + 1) + num_max

          # If we have many diverse tasks, we might not need idle time
          # Return the maximum of frame size and total tasks
          return max(len(tasks), frame_size)
    explanation: |
      **Time Complexity:** O(n), where n is the number of tasks (not the cooldown parameter). A single pass counts frequencies, followed by O(1) arithmetic.

      **Space Complexity:** O(1). The counter uses at most 26 keys (uppercase letters).

      The formula `(max_count - 1) * (n + 1) + num_max` calculates the minimum intervals by considering the most frequent task as the scheduling backbone. The `max()` with `len(tasks)` handles cases where tasks are diverse enough to avoid any idle time.
  - approach_name: Max Heap Simulation
    is_optimal: false
    code: |
      from collections import Counter
      import heapq

      def least_interval(tasks: list[str], n: int) -> int:
          # Count frequency of each task
          freq = Counter(tasks)

          # Max heap of remaining counts (negate for max heap behavior)
          heap = [-count for count in freq.values()]
          heapq.heapify(heap)

          time = 0

          while heap:
              temp = []  # Tasks run in this cycle that still have executions left

              # Process up to n+1 tasks in this cycle
              for _ in range(n + 1):
                  if heap:
                      # Pop most frequent task and decrement
                      count = heapq.heappop(heap)
                      if count + 1 < 0:  # Still has remaining executions
                          temp.append(count + 1)

                  # Every slot in the cycle costs one interval, busy or idle
                  time += 1

                  # If no more tasks in heap or temp, we're done
                  if not heap and not temp:
                      break

              # Return unfinished tasks to the heap for the next cycle
              for count in temp:
                  heapq.heappush(heap, count)

          return time
    explanation: |
      **Time Complexity:** O(n * m), where n is the number of tasks and m is the cooldown period (the function's `n` parameter). Each pass of the outer loop covers up to m + 1 intervals, and there are at most as many passes as the highest task frequency.

      **Space Complexity:** O(26) = O(1). The heap contains at most 26 different task types.

      This simulation approach uses a max heap to always process the most frequent remaining task. While intuitive and correct, it's slower than the math formula for large inputs. It's useful for understanding the problem mechanics or when you need to output the actual schedule.
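
      When the actual schedule is needed, one possible variant pairs the heap with a cooldown queue and records a label per interval. This is a minimal sketch under stated assumptions: `build_schedule` and `is_valid` are hypothetical names introduced here, and empty intervals are marked with the string `"idle"`.

      ```python
      from collections import Counter, deque
      import heapq


      def build_schedule(tasks: list[str], n: int) -> list[str]:
          """Return one schedule, one entry per interval ("idle" when nothing runs)."""
          # Max heap of (-remaining, label); the label breaks ties deterministically.
          heap = [(-count, label) for label, count in Counter(tasks).items()]
          heapq.heapify(heap)
          cooling = deque()  # (interval when the task may run again, -remaining, label)
          schedule = []
          while heap or cooling:
              now = len(schedule)
              # At most one task starts cooling per interval, so at most one returns.
              if cooling and cooling[0][0] <= now:
                  _, neg, label = cooling.popleft()
                  heapq.heappush(heap, (neg, label))
              if heap:
                  neg, label = heapq.heappop(heap)
                  schedule.append(label)
                  if neg + 1 < 0:  # still has executions left
                      cooling.append((now + n + 1, neg + 1, label))
              else:
                  schedule.append("idle")
          return schedule


      def is_valid(schedule: list[str], n: int) -> bool:
          """Check that equal labels are separated by at least n intervals."""
          last_seen = {}
          for t, label in enumerate(schedule):
              if label == "idle":
                  continue
              if label in last_seen and t - last_seen[label] <= n:
                  return False
              last_seen[label] = t
          return True


      print(" -> ".join(build_schedule(list("AAABBB"), 2)))
      # A -> B -> idle -> A -> B -> idle -> A -> B
      ```

      The length of the returned list can be compared against the math formula's result, which makes this sketch usable as a cross-check as well as a schedule printer.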