codetutor/backend/data/questions/course-schedule.yaml

title: Course Schedule
slug: course-schedule
difficulty: medium
leetcode_id: 207
leetcode_url: https://leetcode.com/problems/course-schedule/
categories:
  - graphs
patterns:
  - dfs
  - bfs

function_signature: "def can_finish(num_courses: int, prerequisites: list[list[int]]) -> bool:"

test_cases:
  visible:
    - input: { num_courses: 2, prerequisites: [[1, 0]] }
      expected: true
    - input: { num_courses: 2, prerequisites: [[1, 0], [0, 1]] }
      expected: false
    - input: { num_courses: 3, prerequisites: [[1, 0], [2, 1]] }
      expected: true
  hidden:
    - input: { num_courses: 1, prerequisites: [] }
      expected: true
    - input: { num_courses: 4, prerequisites: [[1, 0], [2, 1], [3, 2], [1, 3]] }
      expected: false
    - input: { num_courses: 5, prerequisites: [[1, 0], [2, 0], [3, 1], [4, 2]] }
      expected: true

description: |
  There are a total of `numCourses` courses you have to take, labeled from `0` to `numCourses - 1`. You are given an array `prerequisites` where `prerequisites[i] = [a`<sub>`i`</sub>`, b`<sub>`i`</sub>`]` indicates that you **must** take course `b`<sub>`i`</sub> first if you want to take course `a`<sub>`i`</sub>.

  For example, the pair `[0, 1]` indicates that to take course `0` you have to first take course `1`.

  Return `true` *if you can finish all courses*. Otherwise, return `false`.

constraints: |
  - `1 <= numCourses <= 2000`
  - `0 <= prerequisites.length <= 5000`
  - `prerequisites[i].length == 2`
  - `0 <= a`<sub>`i`</sub>`, b`<sub>`i`</sub>` < numCourses`
  - All the pairs `prerequisites[i]` are **unique**

examples:
  - input: "numCourses = 2, prerequisites = [[1,0]]"
    output: "true"
    explanation: "There are a total of 2 courses to take. To take course 1 you should have finished course 0. So it is possible."
  - input: "numCourses = 2, prerequisites = [[1,0],[0,1]]"
    output: "false"
    explanation: "There are a total of 2 courses to take. To take course 1 you should have finished course 0, and to take course 0 you should also have finished course 1. So it is impossible."

explanation:
  intuition: |
    Imagine you're planning which courses to take in university. Some courses have prerequisites — you can't take Advanced Calculus without first completing Calculus 101. The question is: given all these dependency rules, is there a valid order to take all courses?

    This is fundamentally a **cycle detection problem** in a directed graph. Each course is a node, and each prerequisite creates a directed edge from the required course to the dependent course. If you can find an ordering where all prerequisites are satisfied, that ordering is called a **topological sort**.

    The key insight is: **a valid ordering exists if and only if the graph has no cycles**. Why? If there's a cycle like A → B → C → A, then B requires A, C requires B, and A requires C: an impossible circular dependency in which none of the three courses can be taken first.

    Think of it like this: you're trying to complete tasks where some tasks depend on others. If task A depends on B, and B depends on A, you're stuck in an infinite waiting loop — neither can start.

  approach: |
    We can solve this using **DFS with cycle detection** (also known as detecting back edges). The idea is to track the state of each node during traversal:

    **Step 1: Build the adjacency list**

    - Create a graph where `graph[course]` contains all courses that depend on `course`
    - For each prerequisite `[a, b]`, add an edge from `b` to `a` (course `b` must be taken before course `a`)

    &nbsp;

    **Step 2: Set up state tracking**

    - `WHITE (0)`: Not visited yet
    - `GRAY (1)`: Currently being processed (in the current DFS path)
    - `BLACK (2)`: Completely processed (all descendants visited)

    &nbsp;

    **Step 3: DFS from each unvisited node**

    - Mark the current node as `GRAY` (we're exploring it)
    - Visit each of its neighbours recursively, skipping any that are already `BLACK`
    - If we encounter a `GRAY` neighbour, we've found a cycle, so the overall answer is `False`
    - After exploring all neighbours, mark the node as `BLACK`

    &nbsp;

    **Step 4: Return the result**

    - If DFS completes for all nodes without detecting a cycle, return `True`
    - Otherwise, return `False`

    &nbsp;

    The three-colour approach is crucial: `GRAY` nodes indicate "we're currently on this path", so encountering a `GRAY` node means we've looped back — a cycle.

  common_pitfalls:
    - title: Confusing "Visited" with "In Current Path"
      description: |
        A simple visited set isn't enough for cycle detection in directed graphs. Consider:
        ```
        A → B → C
        A → C
        ```
        When exploring A → C directly, node C might already be visited from the A → B → C path. But that's not a cycle!

        The key distinction is:
        - **Visited (BLACK)**: We've fully explored this node and all its descendants — safe to skip
        - **In current path (GRAY)**: We're currently exploring this node — encountering it again means a cycle

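        With only a visited set you have to pick one of two wrong behaviours: treat every revisit as a cycle, which raises a false alarm on the graph above, or skip every visited node, which misses real cycles.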
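
        A minimal sketch of the false alarm, assuming the graph above is encoded as courses `0` (A), `1` (B) and `2` (C). The helper `naive_has_cycle` is a hypothetical name and is deliberately wrong: it keeps one visited set and treats any revisit as a cycle.

        ```python
        def naive_has_cycle(num_courses: int, edges: list[tuple[int, int]]) -> bool:
            """Deliberately flawed: one visited set, any revisit counts as a cycle."""
            graph = [[] for _ in range(num_courses)]
            for src, dst in edges:
                graph[src].append(dst)

            visited = set()

            def dfs(node: int) -> bool:
                if node in visited:  # Wrong: a revisit is not necessarily a back edge
                    return True
                visited.add(node)
                return any(dfs(nxt) for nxt in graph[node])

            # Only start from nodes no earlier DFS has reached
            return any(dfs(node) for node in range(num_courses) if node not in visited)


        # A(0) -> B(1) -> C(2) and A(0) -> C(2). No cycle exists.
        diamond = [(0, 1), (1, 2), (0, 2)]
        false_alarm = naive_has_cycle(3, diamond)  # True, although the graph is acyclic
        ```

        C is reached first through B and then again directly from A, so the second visit is misread as a cycle.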
      wrong_approach: "Single visited set for all nodes"
      correct_approach: "Three-state tracking (WHITE/GRAY/BLACK)"

    - title: Wrong Edge Direction
      description: |
        The prerequisite format `[a, b]` means "to take `a`, you must first take `b`". This creates a dependency edge from `b` to `a`.

        Reversing every edge neither creates nor removes cycles, so for this problem cycle detection gives the same answer with either direction.

        The direction does matter for follow-up problems like Course Schedule II, where you need the actual ordering: with reversed edges, the topological order comes out backwards.
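
        A small check that both orientations agree on whether a cycle exists. The helper `has_cycle` and its `reverse` flag are illustrative names, not part of the reference solutions; in-degree counting is used only to keep the sketch short.

        ```python
        from collections import deque


        def has_cycle(num_courses: int, prerequisites: list[list[int]], reverse: bool = False) -> bool:
            """In-degree cycle check; reverse=True adds a -> b instead of b -> a."""
            graph = [[] for _ in range(num_courses)]
            in_degree = [0] * num_courses
            for a, b in prerequisites:
                src, dst = (a, b) if reverse else (b, a)
                graph[src].append(dst)
                in_degree[dst] += 1

            queue = deque(n for n in range(num_courses) if in_degree[n] == 0)
            processed = 0
            while queue:
                node = queue.popleft()
                processed += 1
                for nxt in graph[node]:
                    in_degree[nxt] -= 1
                    if in_degree[nxt] == 0:
                        queue.append(nxt)

            # Nodes that were never processed are blocked by a cycle
            return processed != num_courses


        acyclic = [[1, 0], [2, 1]]
        cyclic = [[1, 0], [0, 1]]
        # Both orientations give the same verdict on each input
        same_answer = all(
            has_cycle(n, p) == has_cycle(n, p, reverse=True)
            for n, p in [(3, acyclic), (2, cyclic)]
        )
        ```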
      wrong_approach: "Adding edge from a to b"
      correct_approach: "Adding edge from b to a (prerequisite points to dependent)"

    - title: Not Checking All Components
      description: |
        The graph might be disconnected, and some courses may have no prerequisites and no dependents at all. Even within one component, a directed DFS only reaches nodes downstream of its starting point. If you only start DFS from one node, you might miss cycles elsewhere.

        Always iterate through all nodes and run DFS on any unvisited node.
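
        A sketch of the miss, assuming a three-course input where course `0` is isolated and courses `1` and `2` require each other. The helper `cycle_reachable_from` is a hypothetical name; it rebuilds its state on every call, which is fine for illustration but not how the real solution should be structured.

        ```python
        def cycle_reachable_from(start: int, num_courses: int, prerequisites: list[list[int]]) -> bool:
            """Three-state DFS that only explores nodes reachable from `start`."""
            graph = [[] for _ in range(num_courses)]
            for course, prereq in prerequisites:
                graph[prereq].append(course)

            state = [0] * num_courses  # 0 = unvisited, 1 = in current path, 2 = done

            def dfs(node: int) -> bool:
                if state[node] == 1:
                    return True
                if state[node] == 2:
                    return False
                state[node] = 1
                found = any(dfs(nxt) for nxt in graph[node])
                state[node] = 2
                return found

            return dfs(start)


        # Course 0 is isolated; courses 1 and 2 require each other.
        prerequisites = [[1, 2], [2, 1]]
        from_zero_only = cycle_reachable_from(0, 3, prerequisites)  # False: the cycle is missed
        from_every_node = any(cycle_reachable_from(s, 3, prerequisites) for s in range(3))  # True
        ```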
      wrong_approach: "DFS from only node 0"
      correct_approach: "DFS from every unvisited node"

  key_takeaways:
    - "**Cycle detection pattern**: Use three states (unvisited/processing/done) to detect back edges in directed graphs"
    - "**Graph modelling skill**: Recognising that dependency problems map to directed graph cycle detection is a key insight"
    - "**Topological sort foundation**: No cycles means a topological ordering exists — this is the basis for Course Schedule II"
    - "**DFS vs BFS**: Both work here. DFS with colouring is elegant; BFS with in-degree counting (Kahn's algorithm) is an alternative approach"

  time_complexity: "O(V + E). We visit each node once and traverse each edge once, where V is `numCourses` and E is the number of prerequisites."
  space_complexity: "O(V + E). We store the adjacency list (O(E)) and the state array (O(V)), plus recursion stack space (O(V) in worst case)."

solutions:
  - approach_name: DFS with Cycle Detection
    is_optimal: true
    code: |
      def can_finish(num_courses: int, prerequisites: list[list[int]]) -> bool:
          # Build adjacency list: graph[b] = [courses that require b]
          graph = [[] for _ in range(num_courses)]
          for course, prereq in prerequisites:
              graph[prereq].append(course)

          # States: 0 = unvisited, 1 = visiting (in current path), 2 = visited
          state = [0] * num_courses

          def has_cycle(node: int) -> bool:
              if state[node] == 1:  # Found a back edge - cycle detected!
                  return True
              if state[node] == 2:  # Already fully processed - no cycle here
                  return False

              state[node] = 1  # Mark as currently visiting

              # Check all dependent courses
              for neighbour in graph[node]:
                  if has_cycle(neighbour):
                      return True

              state[node] = 2  # Mark as fully processed
              return False

          # Check for cycles starting from each course
          for course in range(num_courses):
              if has_cycle(course):
                  return False

          return True
    explanation: |
      **Time Complexity:** O(V + E) — Each node and edge is visited once.

      **Space Complexity:** O(V + E) — Adjacency list storage plus recursion stack.

      We use DFS with three-state colouring to detect cycles. A node in state `1` (visiting) that we encounter again indicates a back edge, meaning we've found a cycle. If we complete DFS on all nodes without finding a cycle, a valid course ordering exists.
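
      One practical caveat: the recursion depth can reach `num_courses` (up to 2000 under the constraints), which can exceed CPython's default recursion limit (typically 1000) on a long prerequisite chain. Below is a sketch of the same three-state idea with an explicit stack; `can_finish_iterative` is an illustrative name, not part of the reference solutions.

      ```python
      def can_finish_iterative(num_courses: int, prerequisites: list[list[int]]) -> bool:
          graph = [[] for _ in range(num_courses)]
          for course, prereq in prerequisites:
              graph[prereq].append(course)

          # States: 0 = unvisited, 1 = visiting (in current path), 2 = visited
          state = [0] * num_courses

          for start in range(num_courses):
              if state[start] != 0:
                  continue
              state[start] = 1
              # Each stack entry pairs a node with an iterator over its neighbours
              stack = [(start, iter(graph[start]))]
              while stack:
                  node, neighbours = stack[-1]
                  advanced = False
                  for nxt in neighbours:
                      if state[nxt] == 1:  # Back edge: nxt is on the current path
                          return False
                      if state[nxt] == 0:
                          state[nxt] = 1
                          stack.append((nxt, iter(graph[nxt])))
                          advanced = True
                          break
                  if not advanced:  # All neighbours explored
                      state[node] = 2
                      stack.pop()

          return True
      ```

      The stack holds exactly the nodes in state `1`, so it plays the role of the recursion stack without being bounded by the interpreter's limit.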

  - approach_name: BFS with In-degree (Kahn's Algorithm)
    is_optimal: true
    code: |
      from collections import deque

      def can_finish(num_courses: int, prerequisites: list[list[int]]) -> bool:
          # Build adjacency list and count in-degrees
          graph = [[] for _ in range(num_courses)]
          in_degree = [0] * num_courses

          for course, prereq in prerequisites:
              graph[prereq].append(course)
              in_degree[course] += 1

          # Start with courses that have no prerequisites
          queue = deque()
          for course in range(num_courses):
              if in_degree[course] == 0:
                  queue.append(course)

          courses_taken = 0

          while queue:
              course = queue.popleft()
              courses_taken += 1

              # "Complete" this course - reduce in-degree of dependent courses
              for dependent in graph[course]:
                  in_degree[dependent] -= 1
                  # If all prerequisites met, this course is now available
                  if in_degree[dependent] == 0:
                      queue.append(dependent)

          # If we took all courses, no cycle exists
          return courses_taken == num_courses
    explanation: |
      **Time Complexity:** O(V + E) — Process each node and edge once.

      **Space Complexity:** O(V + E) — Adjacency list, in-degree array, and queue.

      Kahn's algorithm takes a different approach: start with courses that have no prerequisites (in-degree 0), "complete" them, and see which courses become available. If we can complete all courses, no cycle exists. If some courses remain with non-zero in-degree, each of them is either on a cycle or depends, directly or indirectly, on a course that is.
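
      To see which courses are left over, here is a sketch that returns them instead of a boolean. The helper `blocked_courses` is a hypothetical name; the example input is chosen so that one blocked course is not itself on the cycle.

      ```python
      from collections import deque


      def blocked_courses(num_courses: int, prerequisites: list[list[int]]) -> list[int]:
          """Return the courses that never reach in-degree 0."""
          graph = [[] for _ in range(num_courses)]
          in_degree = [0] * num_courses
          for course, prereq in prerequisites:
              graph[prereq].append(course)
              in_degree[course] += 1

          queue = deque(c for c in range(num_courses) if in_degree[c] == 0)
          while queue:
              course = queue.popleft()
              for dependent in graph[course]:
                  in_degree[dependent] -= 1
                  if in_degree[dependent] == 0:
                      queue.append(dependent)

          # Anything still waiting on a prerequisite was never enqueued
          return [c for c in range(num_courses) if in_degree[c] > 0]


      # Courses 1 and 2 require each other; course 3 requires 2 but is not on the cycle.
      stuck = blocked_courses(4, [[1, 2], [2, 1], [3, 2]])  # [1, 2, 3]
      ```

      Course `3` is blocked only because its prerequisite sits on the cycle, which is why a non-zero in-degree alone does not prove cycle membership.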

  - approach_name: DFS with Visited Set (Simplified)
    is_optimal: false
    code: |
      def can_finish(num_courses: int, prerequisites: list[list[int]]) -> bool:
          # Build adjacency list
          graph = [[] for _ in range(num_courses)]
          for course, prereq in prerequisites:
              graph[prereq].append(course)

          # Track globally visited and current path
          visited = set()
          path = set()

          def dfs(node: int) -> bool:
              if node in path:  # Cycle detected
                  return False
              if node in visited:  # Already processed
                  return True

              path.add(node)  # Add to current path

              for neighbour in graph[node]:
                  if not dfs(neighbour):
                      return False

              path.remove(node)  # Remove from current path
              visited.add(node)  # Mark as fully visited

              return True

          # Check all courses
          for course in range(num_courses):
              if not dfs(course):
                  return False

          return True
    explanation: |
      **Time Complexity:** O(V + E) — Same as the array-based approach.

      **Space Complexity:** O(V + E). The two sets and the recursion stack take O(V); the adjacency list takes O(V + E).

      This uses sets instead of a state array, which some find more intuitive. The `path` set tracks the current DFS path (equivalent to state `1`), and `visited` tracks fully processed nodes (equivalent to state `2`). It is functionally equivalent to the three-state DFS solution.