# codetutor/backend/data/questions/evaluate-division.yaml

title: Evaluate Division
slug: evaluate-division
difficulty: medium
leetcode_id: 399
leetcode_url: https://leetcode.com/problems/evaluate-division/
categories:
  - graphs
  - hash-tables
patterns:
  - bfs
  - dfs
  - union-find

description: |
  You are given an array of variable pairs `equations` and an array of real numbers `values`, where `equations[i] = [A_i, B_i]` and `values[i]` represent the equation `A_i / B_i = values[i]`. Each `A_i` or `B_i` is a string that represents a single variable.

  You are also given some `queries`, where `queries[j] = [C_j, D_j]` represents the j<sup>th</sup> query where you must find the answer for `C_j / D_j = ?`.

  Return *the answers to all queries*. If the answer to a query cannot be determined, return `-1.0` for that query.

  **Note:** The input is always valid. You may assume that evaluating the queries will not result in division by zero and that there is no contradiction.

  **Note:** The variables that do not occur in the list of equations are undefined, so the answer cannot be determined for them.

constraints: |
  - `1 <= equations.length <= 20`
  - `equations[i].length == 2`
  - `1 <= A_i.length, B_i.length <= 5`
  - `values.length == equations.length`
  - `0.0 < values[i] <= 20.0`
  - `1 <= queries.length <= 20`
  - `queries[i].length == 2`
  - `1 <= C_j.length, D_j.length <= 5`
  - `A_i, B_i, C_j, D_j` consist of lowercase English letters and digits

examples:
  - input: 'equations = [["a","b"],["b","c"]], values = [2.0,3.0], queries = [["a","c"],["b","a"],["a","e"],["a","a"],["x","x"]]'
    output: "[6.00000,0.50000,-1.00000,1.00000,-1.00000]"
    explanation: "Given: a / b = 2.0, b / c = 3.0. Queries are: a / c = 6.0 (via a/b * b/c), b / a = 0.5 (reciprocal), a / e = -1.0 (e undefined), a / a = 1.0 (same variable), x / x = -1.0 (x undefined)."
  - input: 'equations = [["a","b"],["b","c"],["bc","cd"]], values = [1.5,2.5,5.0], queries = [["a","c"],["c","b"],["bc","cd"],["cd","bc"]]'
    output: "[3.75000,0.40000,5.00000,0.20000]"
    explanation: "a / c = 1.5 * 2.5 = 3.75, c / b = 1 / 2.5 = 0.4, bc / cd = 5.0 (given), cd / bc = 1 / 5.0 = 0.2."
  - input: 'equations = [["a","b"]], values = [0.5], queries = [["a","b"],["b","a"],["a","c"],["x","y"]]'
    output: "[0.50000,2.00000,-1.00000,-1.00000]"
    explanation: "a / b = 0.5 (given), b / a = 2.0 (reciprocal), a / c = -1.0 (c undefined), x / y = -1.0 (both undefined)."

explanation:
  intuition: |
    The key insight is to **model this problem as a graph**. Think of each variable as a node, and each equation as a weighted edge connecting two nodes.

    If `a / b = 2.0`, we can draw an edge from `a` to `b` with weight `2.0`. We also draw the reverse edge from `b` to `a` with weight `1/2.0 = 0.5` (the reciprocal).

    Now, answering a query like `a / c = ?` becomes a **path-finding problem**: can we find a path from node `a` to node `c`? If we can, the answer is the **product of all edge weights along the path**.

    Think of it like currency exchange: if 1 USD = 2 EUR and 1 EUR = 3 GBP, then 1 USD = 6 GBP. We're "chaining" the ratios together by multiplication.

    This graph-based thinking transforms a seemingly algebraic problem into a classic graph traversal — we can use BFS or DFS to find paths and accumulate products along the way.
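
    As a rough illustration of the chaining idea, the first example's equations can be written out as weighted edges and the path `a → b → c` multiplied by hand. The dictionary layout and the names `edges`, `a_over_c`, and `c_over_a` are assumptions made for this sketch only:

    ```python
    # Edges for a / b = 2.0 and b / c = 3.0, with reciprocals for the reverse direction.
    edges = {
        "a": {"b": 2.0},
        "b": {"a": 1 / 2.0, "c": 3.0},
        "c": {"b": 1 / 3.0},
    }

    # a / c follows the path a -> b -> c: multiply the weights along it.
    a_over_c = edges["a"]["b"] * edges["b"]["c"]

    # c / a follows the reverse path c -> b -> a.
    c_over_a = edges["c"]["b"] * edges["b"]["a"]

    print(a_over_c)  # 6.0
    ```

    The two products are reciprocals of each other (up to floating-point rounding), which is what storing both edge directions guarantees.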

  approach: |
    We solve this using **Graph Construction + BFS/DFS**:

    **Step 1: Build the graph**

    - Create an adjacency list (dictionary of dictionaries) to store the graph
    - For each equation `[A, B]` with value `v`:
      - Add edge `A → B` with weight `v`
      - Add edge `B → A` with weight `1/v` (the reciprocal)
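
    Step 1 might be sketched as follows. The helper name `build_graph` is hypothetical and chosen only for this illustration:

    ```python
    from collections import defaultdict

    def build_graph(equations, values):
        """Return {numerator: {denominator: ratio}} with both directions stored."""
        graph = defaultdict(dict)
        for (a, b), v in zip(equations, values):
            graph[a][b] = v        # a / b = v
            graph[b][a] = 1.0 / v  # b / a = 1 / v (v > 0 per the constraints)
        return graph

    graph = build_graph([["a", "b"], ["b", "c"]], [2.0, 3.0])
    print(graph["b"])  # {'a': 0.5, 'c': 3.0}
    ```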

    &nbsp;

    **Step 2: Process each query**

    - For query `[C, D]`:
      - If either `C` or `D` is not in the graph, return `-1.0`
      - If `C == D`, return `1.0` (any variable divided by itself is 1)
      - Otherwise, use BFS/DFS to find a path from `C` to `D`
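
    The order of these checks matters, so a small sketch may help. Both `answer_query` and the `search` parameter are hypothetical names; `search` stands in for whichever BFS/DFS routine is used, and `direct_only` is a deliberately simplified stand-in that only looks at direct edges:

    ```python
    def answer_query(graph, c, d, search):
        """Apply the Step 2 checks in order, then defer to a path search."""
        if c not in graph or d not in graph:
            return -1.0  # undefined variable: checked before anything else
        if c == d:
            return 1.0   # a defined variable divided by itself
        return search(graph, c, d)

    def direct_only(graph, c, d):
        """Simplified stand-in for BFS/DFS: only follows a direct edge."""
        return graph[c].get(d, -1.0)

    graph = {"a": {"b": 2.0}, "b": {"a": 0.5}}
    print(answer_query(graph, "x", "x", direct_only))  # -1.0
    print(answer_query(graph, "a", "a", direct_only))  # 1.0
    ```

    Swapping the first two checks would make `x / x` return `1.0` even though `x` never appears in any equation.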

    &nbsp;

    **Step 3: BFS to find the path and compute the result**

    - Start BFS from node `C` with initial product `1.0`
    - Use a queue storing `(current_node, accumulated_product)`
    - Track visited nodes to avoid cycles
    - For each neighbor, multiply the current product by the edge weight
    - If we reach node `D`, return the accumulated product
    - If BFS completes without finding `D`, return `-1.0`
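
    The same `(node, accumulated_product)` state also works with an explicit stack instead of a queue, which gives an iterative depth-first variant rather than BFS. A minimal sketch, assuming the graph is a dictionary of dictionaries and using the hypothetical name `path_product`:

    ```python
    def path_product(graph, start, end):
        """Iterative DFS carrying (node, accumulated_product) on a stack."""
        if start not in graph or end not in graph:
            return -1.0
        stack = [(start, 1.0)]
        visited = {start}
        while stack:
            node, product = stack.pop()
            if node == end:
                return product
            for neighbor, weight in graph[node].items():
                if neighbor not in visited:
                    visited.add(neighbor)
                    stack.append((neighbor, product * weight))
        return -1.0  # stack exhausted: end is in a different component

    graph = {
        "a": {"b": 2.0},
        "b": {"a": 0.5, "c": 3.0},
        "c": {"b": 1 / 3.0},
        "x": {"y": 4.0},
        "y": {"x": 0.25},
    }
    print(path_product(graph, "a", "c"))  # 6.0
    print(path_product(graph, "a", "x"))  # -1.0
    ```

    Because the equations are guaranteed to be consistent, every path between two nodes yields the same product, so the traversal order does not affect the answer.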

    &nbsp;

    **Step 4: Collect and return all results**

    - Apply the BFS query function to each query
    - Return the list of results

  common_pitfalls:
    - title: Forgetting the Reciprocal Edge
      description: |
        Each equation gives us two pieces of information: if `a / b = 2`, then `b / a = 0.5`. You must add **both edges** to the graph.

        Without the reverse edge, you can only traverse in one direction, missing valid paths. For example, with just `a → b`, you couldn't answer the query `b / a`.
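
        A small comparison, with the illustrative names `one_way` and `two_way` (not part of any solution below), shows the difference for the query `b / a`:

        ```python
        equations, values = [["a", "b"]], [2.0]

        one_way = {}
        two_way = {}
        for (a, b), v in zip(equations, values):
            one_way.setdefault(a, {})[b] = v        # forward edge only
            two_way.setdefault(a, {})[b] = v
            two_way.setdefault(b, {})[a] = 1.0 / v  # reciprocal edge

        # Query b / a: look for an edge leaving b.
        missing = one_way.get("b", {}).get("a")  # None: no edge leaves b
        found = two_way.get("b", {}).get("a")    # 0.5
        print(missing, found)  # None 0.5
        ```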
      wrong_approach: "Only adding edge A → B"
      correct_approach: "Adding both A → B (weight v) and B → A (weight 1/v)"

    - title: Not Handling Undefined Variables
      description: |
        Variables that don't appear in any equation are undefined. The query `x / y` where neither `x` nor `y` exists should return `-1.0`.

        Even `x / x` returns `-1.0` if `x` is not in the graph — we can't assume undefined variables equal themselves because they're not defined at all.
      wrong_approach: "Assuming any variable divided by itself equals 1"
      correct_approach: "Check if variables exist in graph before assuming x/x = 1"

    - title: Missing Cycle Detection
      description: |
        Without tracking visited nodes during BFS/DFS, you could get stuck in infinite loops. For example, with edges `a ↔ b ↔ c`, a naive traversal could bounce back and forth forever.

        Always maintain a visited set and skip already-visited nodes.
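
        To make the effect visible without actually looping forever, the sketch below counts how many nodes a breadth-first search dequeues while looking for an unreachable target. The function `count_expansions` and its safety limit `cap` are assumptions introduced purely for this demonstration:

        ```python
        from collections import deque

        def count_expansions(graph, start, end, use_visited, cap=1000):
            """Count dequeued nodes while searching for `end`; stop at `cap`."""
            queue = deque([start])
            visited = {start}
            expansions = 0
            while queue and expansions < cap:
                node = queue.popleft()
                expansions += 1
                if node == end:
                    break
                for neighbor in graph[node]:
                    if use_visited:
                        if neighbor in visited:
                            continue
                        visited.add(neighbor)
                    queue.append(neighbor)
            return expansions

        # a <-> b <-> c, and the target "z" is unreachable.
        graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "z": []}

        print(count_expansions(graph, "a", "z", use_visited=True))   # 3
        print(count_expansions(graph, "a", "z", use_visited=False))  # 1000
        ```

        With the visited set, the search stops after touching each reachable node once. Without it, only the artificial cap ends the loop.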
      wrong_approach: "BFS/DFS without visited tracking"
      correct_approach: "Use a visited set to avoid revisiting nodes"

    - title: Incorrect Path Product Accumulation
      description: |
        When traversing the path, you must **multiply** edge weights together, not add them. Division chains work multiplicatively: `a/b * b/c = a/c`.

        Each BFS state needs to carry its accumulated product, not just the node.
      wrong_approach: "Adding edge weights along the path"
      correct_approach: "Multiplying edge weights along the path"

  key_takeaways:
    - "**Graph modeling**: Many ratio/relationship problems can be modeled as weighted graphs where finding answers means finding paths"
    - "**Bidirectional edges**: Division relationships are reciprocal, not symmetric. `a/b = v` implies `b/a = 1/v`, so always add both edges"
    - "**BFS for path finding**: When you need to find any path between two nodes, BFS (or DFS) is the standard approach"
    - "**Related problems**: This pattern applies to currency exchange, unit conversion, and any transitive relationship problems"

  time_complexity: "O(Q × (V + E)) where Q is the number of queries, V is the number of unique variables, and E is the number of equations. Each BFS traverses at most all nodes and edges."
  space_complexity: "O(V + E) for storing the graph. The BFS queue and visited set use O(V) additional space per query."

solutions:
  - approach_name: BFS Graph Traversal
    is_optimal: true
    code: |
      from collections import defaultdict, deque

      def calc_equation(
          equations: list[list[str]],
          values: list[float],
          queries: list[list[str]]
      ) -> list[float]:
          # Build the weighted graph
          graph = defaultdict(dict)
          for (a, b), value in zip(equations, values):
              graph[a][b] = value      # a / b = value
              graph[b][a] = 1 / value  # b / a = 1/value

          def bfs(start: str, end: str) -> float:
              # Check if variables exist in graph
              if start not in graph or end not in graph:
                  return -1.0
              # Same variable divides to 1
              if start == end:
                  return 1.0

              # BFS: queue stores (node, accumulated_product)
              queue = deque([(start, 1.0)])
              visited = {start}

              while queue:
                  node, product = queue.popleft()

                  # Check all neighbors
                  for neighbor, weight in graph[node].items():
                      if neighbor == end:
                          # Found the target - return accumulated product
                          return product * weight

                      if neighbor not in visited:
                          visited.add(neighbor)
                          queue.append((neighbor, product * weight))

              # No path found
              return -1.0

          # Process all queries
          return [bfs(c, d) for c, d in queries]
    explanation: |
      **Time Complexity:** O(Q × (V + E)) — For each query, BFS may visit all vertices and edges.

      **Space Complexity:** O(V + E) — Graph storage dominates; BFS uses O(V) per query.

      We build a bidirectional weighted graph where each equation creates two edges. For each query, BFS finds a path while accumulating the product of edge weights. This handles all cases: direct edges, multi-hop paths, undefined variables, and self-division.

  - approach_name: DFS Graph Traversal
    is_optimal: true
    code: |
      from collections import defaultdict

      def calc_equation(
          equations: list[list[str]],
          values: list[float],
          queries: list[list[str]]
      ) -> list[float]:
          # Build the weighted graph
          graph = defaultdict(dict)
          for (a, b), value in zip(equations, values):
              graph[a][b] = value
              graph[b][a] = 1 / value

          def dfs(start: str, end: str, visited: set) -> float:
              # Variable not in graph
              if start not in graph:
                  return -1.0
              # Found the target
              if start == end:
                  return 1.0

              visited.add(start)

              # Explore all neighbors
              for neighbor, weight in graph[start].items():
                  if neighbor not in visited:
                      result = dfs(neighbor, end, visited)
                      # If path found, multiply weights
                      if result != -1.0:
                          return weight * result

              # No path found from this node
              return -1.0

          results = []
          for c, d in queries:
              if c not in graph or d not in graph:
                  results.append(-1.0)
              else:
                  results.append(dfs(c, d, set()))

          return results
    explanation: |
      **Time Complexity:** O(Q × (V + E)) — Same as BFS; each query may explore all nodes/edges.

      **Space Complexity:** O(V + E) — Graph storage plus O(V) recursion stack depth.

      DFS achieves the same result as BFS with a recursive approach. We explore paths depth-first, multiplying edge weights as we backtrack. Both approaches are optimal for this problem size.

  - approach_name: Union-Find with Weights
    is_optimal: true
    code: |
      class UnionFind:
          def __init__(self):
              # parent[x] = (root, weight) where x / root = weight
              self.parent = {}

          def find(self, x: str) -> tuple[str, float]:
              if x not in self.parent:
                  self.parent[x] = (x, 1.0)
                  return (x, 1.0)

              if self.parent[x][0] == x:
                  return self.parent[x]

              # Path compression with weight update
              root, weight = self.find(self.parent[x][0])
              self.parent[x] = (root, self.parent[x][1] * weight)
              return self.parent[x]

          def union(self, x: str, y: str, value: float) -> None:
              # x / y = value
              root_x, weight_x = self.find(x)  # x / root_x = weight_x
              root_y, weight_y = self.find(y)  # y / root_y = weight_y

              if root_x != root_y:
                  # Connect root_x to root_y
                  # root_x / root_y = (x / root_x)^-1 * (x / y) * (y / root_y)
                  #                 = weight_y * value / weight_x
                  self.parent[root_x] = (root_y, weight_y * value / weight_x)

          def query(self, x: str, y: str) -> float:
              if x not in self.parent or y not in self.parent:
                  return -1.0

              root_x, weight_x = self.find(x)
              root_y, weight_y = self.find(y)

              if root_x != root_y:
                  return -1.0  # Different components

              # x / y = (x / root) / (y / root) = weight_x / weight_y
              return weight_x / weight_y

      def calc_equation(
          equations: list[list[str]],
          values: list[float],
          queries: list[list[str]]
      ) -> list[float]:
          uf = UnionFind()

          # Build union-find structure
          for (a, b), value in zip(equations, values):
              uf.union(a, b, value)

          # Answer queries
          return [uf.query(c, d) for c, d in queries]
    explanation: |
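      **Time Complexity:** O((E + Q) × log V) amortized. The code above uses path compression without union by rank or size, which gives an amortized O(log V) bound per operation. Adding union by rank or size would tighten this to O(α(V)), where α is the inverse Ackermann function.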

      **Space Complexity:** O(V) — Storage for parent pointers and weights.

      Union-Find tracks connected components with weighted edges. Each node stores its ratio to its root. When querying `x / y`, we find both roots — if they match, the answer is `weight_x / weight_y`. This approach excels when there are many queries on the same graph.
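
      The `UnionFind` class above always attaches `root_x` under `root_y`. A common refinement is union by size, which keeps trees shallow and, together with path compression, yields the classic O(α(V)) amortized bound per operation. The following is a sketch of that variant, not a drop-in replacement; the class name `SizedUnionFind` and its three-dictionary layout are assumptions for illustration:

      ```python
      class SizedUnionFind:
          """Weighted union-find with path compression and union by size."""

          def __init__(self):
              self.parent = {}  # node -> parent node
              self.ratio = {}   # node -> node / parent
              self.size = {}    # root -> number of nodes in its tree

          def find(self, x):
              """Return (root, x / root), compressing the path on the way."""
              if x not in self.parent:
                  self.parent[x], self.ratio[x], self.size[x] = x, 1.0, 1
              if self.parent[x] != x:
                  root, parent_ratio = self.find(self.parent[x])
                  self.ratio[x] *= parent_ratio  # x / root = (x / p) * (p / root)
                  self.parent[x] = root
              return self.parent[x], self.ratio[x]

          def union(self, x, y, value):
              """Record x / y = value."""
              root_x, wx = self.find(x)
              root_y, wy = self.find(y)
              if root_x == root_y:
                  return
              # root_x / root_y = (root_x / x) * (x / y) * (y / root_y)
              link = value * wy / wx
              if self.size[root_x] > self.size[root_y]:
                  # Attach the smaller tree (root_y) under root_x instead.
                  root_x, root_y, link = root_y, root_x, 1.0 / link
              self.parent[root_x] = root_y
              self.ratio[root_x] = link
              self.size[root_y] += self.size[root_x]

          def query(self, x, y):
              """Return x / y, or -1.0 if it cannot be determined."""
              if x not in self.parent or y not in self.parent:
                  return -1.0
              root_x, wx = self.find(x)
              root_y, wy = self.find(y)
              return wx / wy if root_x == root_y else -1.0

      uf = SizedUnionFind()
      uf.union("a", "b", 2.0)
      uf.union("b", "c", 3.0)
      print(uf.query("b", "a"))  # 0.5
      print(uf.query("a", "e"))  # -1.0
      ```

      At this problem's input sizes (at most 20 equations) the difference is not measurable; the variant is mainly of interest for larger graphs.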