questions D-E

2025-05-25 11:08:40 +01:00
parent e028167a47
commit 798e0ba1df
18 changed files with 4022 additions and 0 deletions
--- a/backend/data/questions/evaluate-division.yaml
+++ b/backend/data/questions/evaluate-division.yaml
@@ -0,0 +1,307 @@
+title: Evaluate Division
+slug: evaluate-division
+difficulty: medium
+leetcode_id: 399
+leetcode_url: https://leetcode.com/problems/evaluate-division/
+categories:
+  - graphs
+  - hash-tables
+patterns:
+  - bfs
+  - dfs
+  - union-find
+
+description: |
+  You are given an array of variable pairs `equations` and an array of real numbers `values`, where `equations[i] = [A_i, B_i]` and `values[i]` represent the equation `A_i / B_i = values[i]`. Each `A_i` or `B_i` is a string that represents a single variable.
+
+  You are also given some `queries`, where `queries[j] = [C_j, D_j]` represents the j<sup>th</sup> query where you must find the answer for `C_j / D_j = ?`.
+
+  Return *the answers to all queries*. If a single answer cannot be determined, return `-1.0`.
+
+  **Note:** The input is always valid. You may assume that evaluating the queries will not result in division by zero and that there is no contradiction.
+
+  **Note:** The variables that do not occur in the list of equations are undefined, so the answer cannot be determined for them.
+
+constraints: |
+  - `1 <= equations.length <= 20`
+  - `equations[i].length == 2`
+  - `1 <= A_i.length, B_i.length <= 5`
+  - `values.length == equations.length`
+  - `0.0 < values[i] <= 20.0`
+  - `1 <= queries.length <= 20`
+  - `queries[i].length == 2`
+  - `1 <= C_j.length, D_j.length <= 5`
+  - `A_i, B_i, C_j, D_j` consist of lower case English letters and digits
+
+examples:
+  - input: 'equations = [["a","b"],["b","c"]], values = [2.0,3.0], queries = [["a","c"],["b","a"],["a","e"],["a","a"],["x","x"]]'
+    output: "[6.00000,0.50000,-1.00000,1.00000,-1.00000]"
+    explanation: "Given: a / b = 2.0, b / c = 3.0. Queries are: a / c = 6.0 (via a/b * b/c), b / a = 0.5 (reciprocal), a / e = -1.0 (e undefined), a / a = 1.0 (same variable), x / x = -1.0 (x undefined)."
+  - input: 'equations = [["a","b"],["b","c"],["bc","cd"]], values = [1.5,2.5,5.0], queries = [["a","c"],["c","b"],["bc","cd"],["cd","bc"]]'
+    output: "[3.75000,0.40000,5.00000,0.20000]"
+    explanation: "a / c = 1.5 * 2.5 = 3.75, c / b = 1 / 2.5 = 0.4, bc / cd = 5.0 (given), cd / bc = 1 / 5.0 = 0.2."
+  - input: 'equations = [["a","b"]], values = [0.5], queries = [["a","b"],["b","a"],["a","c"],["x","y"]]'
+    output: "[0.50000,2.00000,-1.00000,-1.00000]"
+    explanation: "a / b = 0.5 (given), b / a = 2.0 (reciprocal), a / c = -1.0 (c undefined), x / y = -1.0 (both undefined)."
+
+explanation:
+  intuition: |
+    The key insight is to **model this problem as a graph**. Think of each variable as a node, and each equation as a weighted edge connecting two nodes.
+
+    If `a / b = 2.0`, we can draw an edge from `a` to `b` with weight `2.0`. We also draw the reverse edge from `b` to `a` with weight `1/2.0 = 0.5` (the reciprocal).
+
+    Now, answering a query like `a / c = ?` becomes a **path-finding problem**: can we find a path from node `a` to node `c`? If we can, the answer is the **product of all edge weights along the path**.
+
+    Think of it like currency exchange: if 1 USD = 2 EUR and 1 EUR = 3 GBP, then 1 USD = 6 GBP. We're "chaining" the ratios together by multiplication.
+
+    This graph-based thinking transforms a seemingly algebraic problem into a classic graph traversal — we can use BFS or DFS to find paths and accumulate products along the way.
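+
+    As a rough illustration of the currency analogy, the sketch below builds the two-way graph and multiplies the weights along a known path. The helper names `build_graph` and `path_product` and the lowercase currency codes are assumptions made for this example; they are not part of the solutions in this file.
+
+    ```python
+    from collections import defaultdict
+
+    # Hypothetical helpers for illustration only.
+    def build_graph(equations, values):
+        graph = defaultdict(dict)
+        for (a, b), v in zip(equations, values):
+            graph[a][b] = v        # a / b = v
+            graph[b][a] = 1.0 / v  # b / a = 1 / v
+        return graph
+
+    def path_product(graph, path):
+        # Multiply the edge weights between consecutive nodes on the path.
+        product = 1.0
+        for src, dst in zip(path, path[1:]):
+            product *= graph[src][dst]
+        return product
+
+    # Edge weights taken from the analogy: usd -> eur is 2, eur -> gbp is 3.
+    graph = build_graph([["usd", "eur"], ["eur", "gbp"]], [2.0, 3.0])
+    forward = path_product(graph, ["usd", "eur", "gbp"])   # 2.0 * 3.0
+    backward = path_product(graph, ["gbp", "eur", "usd"])  # (1/3) * (1/2)
+    ```
+
+    Here `forward` evaluates to `6.0`, and `backward` is its reciprocal up to floating-point rounding. The path is supplied by hand; finding it is the job of BFS or DFS.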
+
+  approach: |
+    We solve this using **Graph Construction + BFS/DFS**:
+
+    **Step 1: Build the graph**
+
+    - Create an adjacency list (dictionary of dictionaries) to store the graph
+    - For each equation `[A, B]` with value `v`:
+      - Add edge `A → B` with weight `v`
+      - Add edge `B → A` with weight `1/v` (the reciprocal)
+
+    &nbsp;
+
+    **Step 2: Process each query**
+
+    - For query `[C, D]`:
+      - If either `C` or `D` is not in the graph, return `-1.0`
+      - If `C == D`, return `1.0` (a defined variable divided by itself is 1; the existence check above must come first)
+      - Otherwise, use BFS/DFS to find a path from `C` to `D`
+
+    &nbsp;
+
+    **Step 3: BFS to find the path and compute the result**
+
+    - Start BFS from node `C` with initial product `1.0`
+    - Use a queue storing `(current_node, accumulated_product)`
+    - Track visited nodes to avoid cycles
+    - For each neighbor, multiply the current product by the edge weight
+    - If we reach node `D`, return the accumulated product
+    - If BFS completes without finding `D`, return `-1.0`
+
+    &nbsp;
+
+    **Step 4: Collect and return all results**
+
+    - Apply the BFS query function to each query
+    - Return the list of results
+
+  common_pitfalls:
+    - title: Forgetting the Reciprocal Edge
+      description: |
+        Each equation gives us two pieces of information: if `a / b = 2`, then `b / a = 0.5`. You must add **both edges** to the graph.
+
+        Without the reverse edge, you can only traverse in one direction, missing valid paths. For example, with just `a → b`, you couldn't answer the query `b / a`.
+      wrong_approach: "Only adding edge A → B"
+      correct_approach: "Adding both A → B (weight v) and B → A (weight 1/v)"
+
+    - title: Not Handling Undefined Variables
+      description: |
+        Variables that don't appear in any equation are undefined. The query `x / y` where neither `x` nor `y` exists should return `-1.0`.
+
+        Even `x / x` returns `-1.0` if `x` is not in the graph: a variable that appears in no equation has no known value, so no ratio involving it can be determined.
+      wrong_approach: "Assuming any variable divided by itself equals 1"
+      correct_approach: "Check if variables exist in graph before assuming x/x = 1"
+
+    - title: Missing Cycle Detection
+      description: |
+        Without tracking visited nodes during BFS/DFS, you could get stuck in infinite loops. For example, with edges `a ↔ b ↔ c`, a naive traversal could bounce back and forth forever.
+
+        Always maintain a visited set and skip already-visited nodes.
+      wrong_approach: "BFS/DFS without visited tracking"
+      correct_approach: "Use a visited set to avoid revisiting nodes"
+
+    - title: Incorrect Path Product Accumulation
+      description: |
+        When traversing the path, you must **multiply** edge weights together, not add them. Division chains work multiplicatively: `a/b * b/c = a/c`.
+
+        Each BFS state needs to carry its accumulated product, not just the node.
+      wrong_approach: "Adding edge weights along the path"
+      correct_approach: "Multiplying edge weights along the path"
+
+  key_takeaways:
+    - "**Graph modeling**: Many ratio/relationship problems can be modeled as weighted graphs where finding answers means finding paths"
+    - "**Bidirectional edges**: Division relationships are invertible: `a/b = v` implies `b/a = 1/v`. Always add both edges"
+    - "**BFS for path finding**: When you need to find any path between two nodes, BFS (or DFS) is the standard approach"
+    - "**Related problems**: This pattern applies to currency exchange, unit conversion, and any transitive relationship problems"
+
+  time_complexity: "O(Q × (V + E)) where Q is the number of queries, V is the number of unique variables, and E is the number of equations. Each BFS visits every node and edge at most once."
+  space_complexity: "O(V + E) for storing the graph. The BFS queue and visited set use O(V) additional space per query."
+
+solutions:
+  - approach_name: BFS Graph Traversal
+    is_optimal: true
+    code: |
+      from collections import defaultdict, deque
+
+      def calc_equation(
+          equations: list[list[str]],
+          values: list[float],
+          queries: list[list[str]]
+      ) -> list[float]:
+          # Build the weighted graph
+          graph = defaultdict(dict)
+          for (a, b), value in zip(equations, values):
+              graph[a][b] = value      # a / b = value
+              graph[b][a] = 1 / value  # b / a = 1/value
+
+          def bfs(start: str, end: str) -> float:
+              # Check if variables exist in graph
+              if start not in graph or end not in graph:
+                  return -1.0
+              # Same variable divides to 1
+              if start == end:
+                  return 1.0
+
+              # BFS: queue stores (node, accumulated_product)
+              queue = deque([(start, 1.0)])
+              visited = {start}
+
+              while queue:
+                  node, product = queue.popleft()
+
+                  # Check all neighbors
+                  for neighbor, weight in graph[node].items():
+                      if neighbor == end:
+                          # Found the target - return accumulated product
+                          return product * weight
+
+                      if neighbor not in visited:
+                          visited.add(neighbor)
+                          queue.append((neighbor, product * weight))
+
+              # No path found
+              return -1.0
+
+          # Process all queries
+          return [bfs(c, d) for c, d in queries]
+    explanation: |
+      **Time Complexity:** O(Q × (V + E)) — For each query, BFS may visit all vertices and edges.
+
+      **Space Complexity:** O(V + E) — Graph storage dominates; BFS uses O(V) per query.
+
+      We build a bidirectional weighted graph where each equation creates two edges. For each query, BFS finds a path while accumulating the product of edge weights. This handles all cases: direct edges, multi-hop paths, undefined variables, and self-division.
+
+  - approach_name: DFS Graph Traversal
+    is_optimal: true
+    code: |
+      from collections import defaultdict
+
+      def calc_equation(
+          equations: list[list[str]],
+          values: list[float],
+          queries: list[list[str]]
+      ) -> list[float]:
+          # Build the weighted graph
+          graph = defaultdict(dict)
+          for (a, b), value in zip(equations, values):
+              graph[a][b] = value
+              graph[b][a] = 1 / value
+
+          def dfs(start: str, end: str, visited: set) -> float:
+              # Variable not in graph
+              if start not in graph:
+                  return -1.0
+              # Found the target
+              if start == end:
+                  return 1.0
+
+              visited.add(start)
+
+              # Explore all neighbors
+              for neighbor, weight in graph[start].items():
+                  if neighbor not in visited:
+                      result = dfs(neighbor, end, visited)
+                      # If path found, multiply weights
+                      if result != -1.0:
+                          return weight * result
+
+              # No path found from this node
+              return -1.0
+
+          results = []
+          for c, d in queries:
+              if c not in graph or d not in graph:
+                  results.append(-1.0)
+              else:
+                  results.append(dfs(c, d, set()))
+
+          return results
+    explanation: |
+      **Time Complexity:** O(Q × (V + E)) — Same as BFS; each query may explore all nodes/edges.
+
+      **Space Complexity:** O(V + E) — Graph storage plus O(V) recursion stack depth.
+
+      DFS achieves the same result as BFS with a recursive approach. We explore paths depth-first, multiplying edge weights as we backtrack. Both approaches are optimal for this problem size.
+
+  - approach_name: Union-Find with Weights
+    is_optimal: true
+    code: |
+      class UnionFind:
+          def __init__(self):
+              # parent[x] = (p, weight) where x / p = weight; p is x's parent,
+              # which becomes the root once find() compresses the path
+              self.parent = {}
+
+          def find(self, x: str) -> tuple[str, float]:
+              if x not in self.parent:
+                  self.parent[x] = (x, 1.0)
+                  return (x, 1.0)
+
+              if self.parent[x][0] == x:
+                  return self.parent[x]
+
+              # Path compression with weight update
+              root, weight = self.find(self.parent[x][0])
+              self.parent[x] = (root, self.parent[x][1] * weight)
+              return self.parent[x]
+
+          def union(self, x: str, y: str, value: float) -> None:
+              # x / y = value
+              root_x, weight_x = self.find(x)  # x / root_x = weight_x
+              root_y, weight_y = self.find(y)  # y / root_y = weight_y
+
+              if root_x != root_y:
+                  # Connect root_x to root_y
+                  # root_x / root_y = (x / root_x)^-1 * (x / y) * (y / root_y)
+                  #                 = weight_y * value / weight_x
+                  self.parent[root_x] = (root_y, weight_y * value / weight_x)
+
+          def query(self, x: str, y: str) -> float:
+              if x not in self.parent or y not in self.parent:
+                  return -1.0
+
+              root_x, weight_x = self.find(x)
+              root_y, weight_y = self.find(y)
+
+              if root_x != root_y:
+                  return -1.0  # Different components
+
+              # x / y = (x / root) / (y / root) = weight_x / weight_y
+              return weight_x / weight_y
+
+      def calc_equation(
+          equations: list[list[str]],
+          values: list[float],
+          queries: list[list[str]]
+      ) -> list[float]:
+          uf = UnionFind()
+
+          # Build union-find structure
+          for (a, b), value in zip(equations, values):
+              uf.union(a, b, value)
+
+          # Answer queries
+          return [uf.query(c, d) for c, d in queries]
+    explanation: |
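+      **Time Complexity:** O((E + Q) × log V) amortized. Path compression alone, as implemented here, gives a logarithmic amortized cost per operation; adding union by rank or size would tighten it to O(α(V)), where α is the inverse Ackermann function.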
+
+      **Space Complexity:** O(V) — Storage for parent pointers and weights.
+
+      Union-Find tracks connected components with weighted edges. Each node stores its ratio to its parent, which path compression collapses into its ratio to the root. When querying `x / y`, we find both roots; if they match, the answer is `weight_x / weight_y`. This approach excels when there are many queries on the same graph.
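+
+      To make the weight bookkeeping concrete, here is a minimal sketch that traces the first example (`a / b = 2.0`, `b / c = 3.0`). The module-level `parent` dict and the free functions `find` and `union` are condensed stand-ins assumed for this trace, not a replacement for the class above.
+
+      ```python
+      parent = {}  # node -> (parent, node / parent)
+
+      def find(x):
+          # Register unseen nodes as their own root with ratio 1.0.
+          p, w = parent.setdefault(x, (x, 1.0))
+          if p != x:
+              root, pw = find(p)          # p / root = pw
+              parent[x] = (root, w * pw)  # x / root = (x / p) * (p / root)
+          return parent[x]
+
+      def union(x, y, value):
+          # Record x / y = value by linking x's root under y's root.
+          (rx, wx), (ry, wy) = find(x), find(y)
+          if rx != ry:
+              parent[rx] = (ry, wy * value / wx)
+
+      union("a", "b", 2.0)
+      union("b", "c", 3.0)
+      before = parent["a"]     # still points at b
+      root, ratio = find("a")  # compresses the path a -> b -> c
+      after = parent["a"]      # now points straight at the root
+      ```
+
+      Before `find("a")` runs, `parent["a"]` holds `("b", 2.0)`; afterwards it holds `("c", 6.0)`, so `a / c` can be read off directly as `6.0`.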