title: Course Schedule
slug: course-schedule
difficulty: medium
leetcode_id: 207
leetcode_url: https://leetcode.com/problems/course-schedule/
categories:
- graphs
patterns:
- dfs
- bfs
function_signature: "def can_finish(num_courses: int, prerequisites: list[list[int]]) -> bool:"
test_cases:
visible:
- input: { num_courses: 2, prerequisites: [[1, 0]] }
expected: true
- input: { num_courses: 2, prerequisites: [[1, 0], [0, 1]] }
expected: false
- input: { num_courses: 3, prerequisites: [[1, 0], [2, 1]] }
expected: true
hidden:
- input: { num_courses: 1, prerequisites: [] }
expected: true
- input: { num_courses: 4, prerequisites: [[1, 0], [2, 1], [3, 2], [1, 3]] }
expected: false
- input: { num_courses: 5, prerequisites: [[1, 0], [2, 0], [3, 1], [4, 2]] }
expected: true
description: |
There are a total of `numCourses` courses you have to take, labeled from `0` to `numCourses - 1`. You are given an array `prerequisites` where `prerequisites[i] = [a`<sub>`i`</sub>`, b`<sub>`i`</sub>`]` indicates that you **must** take course `b`<sub>`i`</sub> first if you want to take course `a`<sub>`i`</sub>.
For example, the pair `[0, 1]` indicates that to take course `0` you have to first take course `1`.
Return `true` *if you can finish all courses*. Otherwise, return `false`.
constraints: |
- `1 <= numCourses <= 2000`
- `0 <= prerequisites.length <= 5000`
- `prerequisites[i].length == 2`
- `0 <= a`<sub>`i`</sub>`, b`<sub>`i`</sub>` < numCourses`
- All the pairs `prerequisites[i]` are **unique**
examples:
- input: "numCourses = 2, prerequisites = [[1,0]]"
output: "true"
explanation: "There are a total of 2 courses to take. To take course 1 you should have finished course 0. So it is possible."
- input: "numCourses = 2, prerequisites = [[1,0],[0,1]]"
output: "false"
explanation: "There are a total of 2 courses to take. To take course 1 you should have finished course 0, and to take course 0 you should also have finished course 1. So it is impossible."
explanation:
intuition: |
Imagine you're planning which courses to take in university. Some courses have prerequisites — you can't take Advanced Calculus without first completing Calculus 101. The question is: given all these dependency rules, is there a valid order to take all courses?
This is fundamentally a **cycle detection problem** in a directed graph. Each course is a node, and each prerequisite creates a directed edge from the required course to the dependent course. If you can find an ordering where all prerequisites are satisfied, that ordering is called a **topological sort**.
The key insight is: **a valid ordering exists if and only if the graph has no cycles**. Why? With edges pointing from prerequisite to dependent, a cycle like A → B → C → A means B requires A, C requires B, and A requires C: a circular dependency in which no course can be taken first.
Think of it like this: you're trying to complete tasks where some tasks depend on others. If task A depends on B, and B depends on A, you're stuck in an infinite waiting loop — neither can start.
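As a rough illustration of this graph model, the sketch below turns each pair `[a, b]` into an edge from `b` to `a`. The helper name `build_graph` is made up for this example.

```python
def build_graph(num_courses: int, prerequisites: list[list[int]]) -> list[list[int]]:
    """Return an adjacency list where graph[b] lists the courses that require b."""
    graph: list[list[int]] = [[] for _ in range(num_courses)]
    for course, prereq in prerequisites:
        graph[prereq].append(course)  # edge: prereq -> course
    return graph


# Course 1 requires course 0, and course 2 requires course 1.
chain = build_graph(3, [[1, 0], [2, 1]])
# Courses 0 and 1 require each other: a two-node cycle.
cycle = build_graph(2, [[1, 0], [0, 1]])
```

Here `chain` is `[[1], [2], []]` (a straight line with no way back), while `cycle` is `[[1], [0]]` (each node points at the other).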
approach: |
We can solve this using **DFS with cycle detection** (also known as detecting back edges). The idea is to track the state of each node during traversal:
**Step 1: Build the adjacency list**
- Create a graph where `graph[course]` contains all courses that depend on `course`
- For each prerequisite `[a, b]`, add an edge from `b` to `a` (course `b` must be taken before course `a`)
&nbsp;
**Step 2: Set up state tracking**
- `WHITE (0)`: Not visited yet
- `GRAY (1)`: Currently being processed (in the current DFS path)
- `BLACK (2)`: Completely processed (all descendants visited)
&nbsp;
**Step 3: DFS from each unvisited node**
- Mark the current node as `GRAY` (we're exploring it)
- Visit all its neighbours recursively
- If we encounter a `GRAY` node, we've found a cycle — return `False`
- After exploring all neighbours, mark the node as `BLACK`
- If the DFS from this node finishes without reaching a `GRAY` node, no cycle is reachable from it, so move on to the next unvisited node
&nbsp;
**Step 4: Return the result**
- If DFS completes for all nodes without detecting a cycle, return `True`
- Otherwise, return `False`
&nbsp;
The three-colour approach is crucial: `GRAY` nodes indicate "we're currently on this path", so encountering a `GRAY` node means we've looped back — a cycle.
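To see the colours change during a traversal, here is a small tracing sketch of the steps above. The function name `trace_dfs` and the log format are invented for illustration and are not part of the reference solution.

```python
WHITE, GRAY, BLACK = 0, 1, 2


def trace_dfs(num_courses: int, prerequisites: list[list[int]]) -> tuple[bool, list[str]]:
    """Run the three-colour DFS and log every colour change.

    Returns (can_finish, log).
    """
    graph = [[] for _ in range(num_courses)]
    for course, prereq in prerequisites:
        graph[prereq].append(course)

    colour = [WHITE] * num_courses
    log: list[str] = []

    def has_cycle(node: int) -> bool:
        if colour[node] == GRAY:
            log.append(f"{node} is already GRAY: cycle")
            return True
        if colour[node] == BLACK:
            return False
        colour[node] = GRAY
        log.append(f"{node} -> GRAY")
        for neighbour in graph[node]:
            if has_cycle(neighbour):
                return True
        colour[node] = BLACK
        log.append(f"{node} -> BLACK")
        return False

    # any() stops at the first start node from which a cycle is found.
    ok = not any(has_cycle(course) for course in range(num_courses))
    return ok, log
```

For `trace_dfs(2, [[1, 0], [0, 1]])` the log reads `0 -> GRAY`, `1 -> GRAY`, `0 is already GRAY: cycle`. The traversal walks from 0 to 1 and then back onto a node that is still on the current path.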
common_pitfalls:
- title: Confusing "Visited" with "In Current Path"
description: |
A simple visited set isn't enough for cycle detection in directed graphs. Consider:
```
A → B → C
A → C
```
When exploring A → C directly, node C might already be visited from the A → B → C path. But that's not a cycle!
The key distinction is:
- **Visited (BLACK)**: We've fully explored this node and all its descendants — safe to skip
- **In current path (GRAY)**: We're currently exploring this node — encountering it again means a cycle
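A single visited set cannot tell these two cases apart: treat every revisit as a cycle and you get false positives on graphs like the one above; skip every revisited node and you miss real cycles.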
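The false-positive case can be reproduced on the small graph above. Both function names below are hypothetical, and the naive version is deliberately flawed to show the failure mode.

```python
def naive_has_cycle(graph: dict[str, list[str]]) -> bool:
    """Flawed: treats any revisit of a visited node as a cycle."""
    visited: set[str] = set()

    def dfs(node: str) -> bool:
        if node in visited:
            return True  # wrong: the node may simply be reachable by two paths
        visited.add(node)
        return any(dfs(nxt) for nxt in graph[node])

    return any(dfs(node) for node in graph if node not in visited)


def three_state_has_cycle(graph: dict[str, list[str]]) -> bool:
    """Separates 'on the current path' from 'fully explored'."""
    on_path: set[str] = set()
    done: set[str] = set()

    def dfs(node: str) -> bool:
        if node in on_path:
            return True
        if node in done:
            return False
        on_path.add(node)
        found = any(dfs(nxt) for nxt in graph[node])
        on_path.remove(node)
        done.add(node)
        return found

    return any(dfs(node) for node in graph)


# A -> B -> C and A -> C: two paths reach C, but there is no cycle.
diamond = {"A": ["B", "C"], "B": ["C"], "C": []}
```

On `diamond`, `naive_has_cycle` reports a cycle because C is reached a second time via A → C, while `three_state_has_cycle` correctly reports none.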
wrong_approach: "Single visited set for all nodes"
correct_approach: "Three-state tracking (WHITE/GRAY/BLACK)"
- title: Wrong Edge Direction
description: |
The prerequisite format `[a, b]` means "to take `a`, you must first take `b`". This creates a dependency edge from `b` to `a`.
If you build the graph with the edges reversed, cycle detection still works, because reversing every edge preserves every cycle. What changes is the meaning of a traversal: it now walks from a course back to its prerequisites instead of forward to its dependents.
For this problem that makes no difference, but getting the direction right matters for follow-up problems like Course Schedule II, where you need the actual ordering.
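A minimal sketch of that difference, using an in-degree based ordering. The helper `kahn_order` and the `forward`/`backward` names are assumptions made for this example.

```python
from collections import deque


def kahn_order(num_nodes: int, edges: list[tuple[int, int]]) -> list[int]:
    """Return the nodes in the order Kahn's algorithm removes them.

    The result is shorter than num_nodes exactly when the graph has a cycle.
    """
    graph = [[] for _ in range(num_nodes)]
    in_degree = [0] * num_nodes
    for src, dst in edges:
        graph[src].append(dst)
        in_degree[dst] += 1

    queue = deque(n for n in range(num_nodes) if in_degree[n] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                queue.append(nxt)
    return order


prerequisites = [[1, 0], [2, 1]]  # 1 needs 0, 2 needs 1
forward = [(b, a) for a, b in prerequisites]   # prerequisite -> dependent
backward = [(a, b) for a, b in prerequisites]  # dependent -> prerequisite

forward_order = kahn_order(3, forward)
backward_order = kahn_order(3, backward)
```

Both results contain all three courses, so both directions agree that there is no cycle. Only `forward_order` (`[0, 1, 2]`) is a valid order in which to take the courses; `backward_order` (`[2, 1, 0]`) is its reverse.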
wrong_approach: "Adding edge from a to b"
correct_approach: "Adding edge from b to a (prerequisite points to dependent)"
- title: Not Checking All Components
description: |
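The graph might be disconnected: some courses may have no prerequisites and no dependents. And because the edges are directed, even a connected graph can contain nodes that a DFS from one starting node never reaches. If you only start DFS from one node, you might miss cycles elsewhere.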
Always iterate through all nodes and run DFS on any unvisited node.
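The sketch below makes the starting nodes an explicit parameter so the two behaviours can be compared side by side. The function name `finds_cycle` and its `starts` parameter are hypothetical.

```python
def finds_cycle(num_courses: int, prerequisites: list[list[int]], starts: list[int]) -> bool:
    """Three-state DFS that only starts from the given nodes."""
    graph = [[] for _ in range(num_courses)]
    for course, prereq in prerequisites:
        graph[prereq].append(course)

    state = [0] * num_courses  # 0 = unvisited, 1 = on current path, 2 = done

    def has_cycle(node: int) -> bool:
        if state[node] == 1:
            return True
        if state[node] == 2:
            return False
        state[node] = 1
        if any(has_cycle(nxt) for nxt in graph[node]):
            return True
        state[node] = 2
        return False

    return any(has_cycle(start) for start in starts)


# Courses 0 and 1 form a harmless chain; courses 2 and 3 require each other.
prereqs = [[1, 0], [2, 3], [3, 2]]
only_from_zero = finds_cycle(4, prereqs, starts=[0])
from_every_node = finds_cycle(4, prereqs, starts=list(range(4)))
```

Starting only from node 0 visits courses 0 and 1 and never sees the 2 ↔ 3 cycle, so `only_from_zero` is `False`, while `from_every_node` is `True`.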
wrong_approach: "DFS from only node 0"
correct_approach: "DFS from every unvisited node"
key_takeaways:
- "**Cycle detection pattern**: Use three states (unvisited/processing/done) to detect back edges in directed graphs"
- "**Graph modelling skill**: Recognising that dependency problems map to directed graph cycle detection is a key insight"
- "**Topological sort foundation**: No cycles means a topological ordering exists — this is the basis for Course Schedule II"
- "**DFS vs BFS**: Both work here. DFS with colouring is elegant; BFS with in-degree counting (Kahn's algorithm) is an alternative approach"
time_complexity: "O(V + E). We visit each node once and traverse each edge once, where V is `numCourses` and E is the number of prerequisites."
space_complexity: "O(V + E). We store the adjacency list (O(E)) and the state array (O(V)), plus recursion stack space (O(V) in worst case)."
solutions:
- approach_name: DFS with Cycle Detection
is_optimal: true
code: |
def can_finish(num_courses: int, prerequisites: list[list[int]]) -> bool:
    # Build adjacency list: graph[b] = [courses that require b]
    graph = [[] for _ in range(num_courses)]
    for course, prereq in prerequisites:
        graph[prereq].append(course)

    # States: 0 = unvisited, 1 = visiting (in current path), 2 = visited
    state = [0] * num_courses

    def has_cycle(node: int) -> bool:
        if state[node] == 1:  # Found a back edge - cycle detected!
            return True
        if state[node] == 2:  # Already fully processed - no cycle here
            return False

        state[node] = 1  # Mark as currently visiting
        # Check all dependent courses
        for neighbour in graph[node]:
            if has_cycle(neighbour):
                return True
        state[node] = 2  # Mark as fully processed
        return False

    # Check for cycles starting from each course
    for course in range(num_courses):
        if has_cycle(course):
            return False
    return True
explanation: |
**Time Complexity:** O(V + E) — Each node and edge is visited once.
**Space Complexity:** O(V + E) — Adjacency list storage plus recursion stack.
We use DFS with three-state colouring to detect cycles. A node in state `1` (visiting) that we encounter again indicates a back edge, meaning we've found a cycle. If we complete DFS on all nodes without finding a cycle, a valid course ordering exists.
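One practical caveat: with up to 2000 courses, a long prerequisite chain can push a recursive DFS past CPython's default recursion limit of 1000 frames unless that limit is raised. Below is a minimal iterative sketch of the same three-state idea, assuming an explicit stack of `(node, neighbour iterator)` pairs. The name `can_finish_iterative` is made up here.

```python
def can_finish_iterative(num_courses: int, prerequisites: list[list[int]]) -> bool:
    graph = [[] for _ in range(num_courses)]
    for course, prereq in prerequisites:
        graph[prereq].append(course)

    state = [0] * num_courses  # 0 = unvisited, 1 = on current path, 2 = done

    for start in range(num_courses):
        if state[start] != 0:
            continue
        state[start] = 1
        # Each stack entry pairs a node with an iterator over its neighbours,
        # so the scan resumes where it left off when we come back to the node.
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            advanced = False
            for nxt in neighbours:
                if state[nxt] == 1:  # back edge to the current path
                    return False
                if state[nxt] == 0:
                    state[nxt] = 1
                    stack.append((nxt, iter(graph[nxt])))
                    advanced = True
                    break
            if not advanced:
                state[node] = 2  # all neighbours explored
                stack.pop()
    return True
```

The stack holds exactly the nodes in state `1`, mirroring the call stack of the recursive version, so a 2000-course chain such as `[[i + 1, i] for i in range(1999)]` is handled without deep recursion.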
- approach_name: BFS with In-degree (Kahn's Algorithm)
is_optimal: true
code: |
from collections import deque


def can_finish(num_courses: int, prerequisites: list[list[int]]) -> bool:
    # Build adjacency list and count in-degrees
    graph = [[] for _ in range(num_courses)]
    in_degree = [0] * num_courses
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_degree[course] += 1

    # Start with courses that have no prerequisites
    queue = deque()
    for course in range(num_courses):
        if in_degree[course] == 0:
            queue.append(course)

    courses_taken = 0
    while queue:
        course = queue.popleft()
        courses_taken += 1
        # "Complete" this course - reduce in-degree of dependent courses
        for dependent in graph[course]:
            in_degree[dependent] -= 1
            # If all prerequisites met, this course is now available
            if in_degree[dependent] == 0:
                queue.append(dependent)

    # If we took all courses, no cycle exists
    return courses_taken == num_courses
explanation: |
**Time Complexity:** O(V + E) — Process each node and edge once.
**Space Complexity:** O(V + E) — Adjacency list, in-degree array, and queue.
Kahn's algorithm takes a different approach: start with courses that have no prerequisites (in-degree 0), "complete" them, and see which courses become available. If we can complete all courses, no cycle exists. If some courses remain with non-zero in-degree, each of them is either on a cycle or depends, directly or indirectly, on a course that is.
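To inspect which courses are left over, a variant can return them instead of a boolean. This is only a sketch; the name `blocked_courses` is hypothetical.

```python
from collections import deque


def blocked_courses(num_courses: int, prerequisites: list[list[int]]) -> list[int]:
    """Return the courses Kahn's algorithm never manages to take."""
    graph = [[] for _ in range(num_courses)]
    in_degree = [0] * num_courses
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_degree[course] += 1

    queue = deque(c for c in range(num_courses) if in_degree[c] == 0)
    while queue:
        course = queue.popleft()
        for dependent in graph[course]:
            in_degree[dependent] -= 1
            if in_degree[dependent] == 0:
                queue.append(dependent)

    # Anything still waiting on a prerequisite was never taken.
    return [c for c in range(num_courses) if in_degree[c] > 0]


# 0 and 1 require each other; 2 requires 1; 3 has no prerequisites.
stuck = blocked_courses(4, [[1, 0], [0, 1], [2, 1]])
```

Here `stuck` is `[0, 1, 2]`: courses 0 and 1 are on the cycle, and course 2 is blocked only because it waits on course 1.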
- approach_name: DFS with Visited Set (Simplified)
is_optimal: false
code: |
def can_finish(num_courses: int, prerequisites: list[list[int]]) -> bool:
    # Build adjacency list
    graph = [[] for _ in range(num_courses)]
    for course, prereq in prerequisites:
        graph[prereq].append(course)

    # Track globally visited and current path
    visited = set()
    path = set()

    def dfs(node: int) -> bool:
        if node in path:  # Cycle detected
            return False
        if node in visited:  # Already processed
            return True

        path.add(node)  # Add to current path
        for neighbour in graph[node]:
            if not dfs(neighbour):
                return False
        path.remove(node)  # Remove from current path
        visited.add(node)  # Mark as fully visited
        return True

    # Check all courses
    for course in range(num_courses):
        if not dfs(course):
            return False
    return True
explanation: |
**Time Complexity:** O(V + E) — Same as the array-based approach.
**Space Complexity:** O(V + E). The adjacency list still takes O(V + E); the two sets and the recursion stack add O(V).
This uses sets instead of a state array, which some find more intuitive. The `path` set tracks the current DFS path (equivalent to state `1`), and `visited` tracks fully processed nodes (equivalent to state `2`). Functionally identical to the optimal solution.
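For a quick cross-check of any of these solutions, Python 3.9+ ships `graphlib.TopologicalSorter` in the standard library. It is a different implementation from the ones above and is shown only as a sketch of a test oracle; `can_finish_graphlib` is a made-up name.

```python
from graphlib import CycleError, TopologicalSorter


def can_finish_graphlib(num_courses: int, prerequisites: list[list[int]]) -> bool:
    sorter = TopologicalSorter()
    for course in range(num_courses):
        sorter.add(course)  # register courses that appear in no pair
    for course, prereq in prerequisites:
        sorter.add(course, prereq)  # prereq must come before course
    try:
        sorter.prepare()  # raises CycleError if the graph has a cycle
    except CycleError:
        return False
    return True
```

Comparing its result with a hand-written `can_finish` on the visible and hidden test cases is a cheap way to catch mistakes in edge direction or state handling.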