# codetutor/backend/data/questions/brace-expansion-ii.yaml

title: Brace Expansion II
slug: brace-expansion-ii
difficulty: hard
leetcode_id: 1096
leetcode_url: https://leetcode.com/problems/brace-expansion-ii/
categories:
  - strings
  - recursion
  - stack
patterns:
  - backtracking
  - dfs

description: |
  Under the grammar given below, strings can represent a set of lowercase words. Let `R(expr)` denote the set of words the expression represents.

  The grammar can best be understood through simple examples:

  - Single letters represent a singleton set containing that word.
    - `R("a") = {"a"}`
    - `R("w") = {"w"}`

  - When we take a comma-delimited list of two or more expressions, we take the **union** of possibilities.
    - `R("{a,b,c}") = {"a","b","c"}`
    - `R("{{a,b},{b,c}}") = {"a","b","c"}` (notice the final set only contains each word at most once)

  - When we concatenate two expressions, we take the set of possible **concatenations** between two words where the first word comes from the first expression and the second word comes from the second expression.
    - `R("{a,b}{c,d}") = {"ac","ad","bc","bd"}`
    - `R("a{b,c}{d,e}f{g,h}") = {"abdfg", "abdfh", "abefg", "abefh", "acdfg", "acdfh", "acefg", "acefh"}`

  Formally, the three rules for our grammar:

  - For every lowercase letter `x`, we have `R(x) = {x}`.
  - For expressions `e1, e2, ..., ek` with `k >= 2`, we have `R({e1, e2, ..., ek}) = R(e1) ∪ R(e2) ∪ ... ∪ R(ek)`
  - For expressions `e1` and `e2`, we have `R(e1 + e2) = {a + b for (a, b) in R(e1) × R(e2)}`, where `+` denotes concatenation, and `×` denotes the cartesian product.

  Given an expression representing a set of words under the given grammar, return *the sorted list of words that the expression represents*.

constraints: |
  - `1 <= expression.length <= 60`
  - `expression[i]` consists of `'{'`, `'}'`, `','` or lowercase English letters
  - The given expression represents a set of words based on the grammar given in the description

examples:
  - input: 'expression = "{a,b}{c,{d,e}}"'
    output: '["ac","ad","ae","bc","bd","be"]'
    explanation: "We expand {a,b} to get {a, b}, then expand {c,{d,e}} to get {c, d, e}. The Cartesian product gives us all combinations: ac, ad, ae, bc, bd, be."
  - input: 'expression = "{{a,z},a{b,c},{ab,z}}"'
    output: '["a","ab","ac","z"]'
    explanation: "Each distinct word is written only once in the final answer. The union of {a,z}, {ab,ac}, and {ab,z} gives {a, ab, ac, z}."

explanation:
  intuition: |
    Think of this problem like evaluating a mathematical expression, but instead of numbers and arithmetic operators, we have **sets of strings** and two operations: **union** (comma) and **concatenation** (adjacency).

    Imagine you're building a tree where each node represents an expression. Leaf nodes are single letters (like `"a"`), and internal nodes combine their children using either union or concatenation. The braces `{}` group expressions together, and commas `,` inside braces indicate union.

    The key insight is recognising this as a **recursive parsing problem**. The grammar has a natural hierarchy:
    - At the top level, we have concatenation of terms
    - Inside braces, we have union of sub-expressions
    - Each sub-expression can itself contain nested braces

    Think of it like evaluating `2 * (3 + 4)` — you need to handle the parentheses first, then apply the multiplication. Here, braces act like parentheses, commas act like addition (union), and adjacency acts like multiplication (Cartesian product).
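
    The tree view described above can be sketched in a few lines. This is only an illustration, not the parser itself: the tuple-based node shapes (`"union"`, `"concat"`) and the `evaluate` helper are assumptions made for this example, and the tree for `{a,b}{c,{d,e}}` is built by hand rather than parsed.

    ```python
    def evaluate(node):
        """Evaluate a hand-built expression tree into a set of words."""
        # Leaf: a single letter represents a singleton set.
        if isinstance(node, str):
            return {node}

        op, children = node
        child_sets = [evaluate(child) for child in children]

        if op == "union":
            # Comma: merge all child sets.
            result = set()
            for child_set in child_sets:
                result |= child_set
            return result

        # op == "concat": adjacency, i.e. Cartesian product of the child sets.
        result = {""}
        for child_set in child_sets:
            result = {a + b for a in result for b in child_set}
        return result


    # Hand-built tree for "{a,b}{c,{d,e}}" (hypothetical node format)
    tree = ("concat", [
        ("union", ["a", "b"]),
        ("union", ["c", ("union", ["d", "e"])]),
    ])
    words = sorted(evaluate(tree))
    print(words)  # ['ac', 'ad', 'ae', 'bc', 'bd', 'be']
    ```

    The actual solutions never build this tree explicitly; they compute the same sets while scanning the string.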

  approach: |
    We solve this using a **Recursive Descent Parser** that mimics the grammar structure:

    **Step 1: Define the grammar hierarchy**

    - `expr`: One or more terms concatenated together
    - `term`: Either a single letter or a braced group `{...}`
    - `group`: Comma-separated expressions inside braces (union)

    &nbsp;

    **Step 2: Implement recursive parsing**

    - Use an index pointer to track position in the expression string
    - `parse_expr()`: Parse concatenated terms and compute their Cartesian product
    - `parse_term()`: Parse a single letter or delegate to `parse_group()` for braces
    - `parse_group()`: Parse comma-separated expressions inside `{...}` and compute their union

    &nbsp;

    **Step 3: Handle the operations**

    - **Union**: When we see a comma, combine two sets with set union
    - **Concatenation**: When terms are adjacent, compute Cartesian product of strings
    - Use a `set` throughout to automatically handle duplicates
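
    As a rough sketch of the two operations in Step 3, the helpers below fold a list of sets with each operation. The names `union_all` and `concat_all` are hypothetical and do not appear in the solution code. The point to notice is the starting value: the empty set is the identity for union, while `{""}` is the identity for concatenation.

    ```python
    from functools import reduce


    def union_all(sets):
        """Union of several word sets; the empty set is the identity."""
        return reduce(lambda acc, s: acc | s, sets, set())


    def concat_all(sets):
        """Concatenate word sets left to right; {""} is the identity."""
        return reduce(
            lambda acc, s: {a + b for a in acc for b in s},
            sets,
            {""},
        )


    # "{a,b}c": union of a and b, then concatenated with c
    left = union_all([{"a"}, {"b"}])
    print(sorted(concat_all([left, {"c"}])))  # ['ac', 'bc']
    ```

    Starting a concatenation from an empty set instead of `{""}` would make every product empty, which is an easy mistake to make.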

    &nbsp;

    **Step 4: Return sorted result**

    - Convert the final set to a sorted list before returning

    &nbsp;

    The recursive structure naturally handles nested expressions because each call to `parse_group()` can trigger new calls to `parse_expr()`, which handles the nested content.

  common_pitfalls:
    - title: Confusing Union and Concatenation
      description: |
        The comma operator `,` performs **union** (adding to the set), while adjacency performs **concatenation** (Cartesian product).

        For example, `"{a,b}c"` means: union of `a` and `b`, then concatenate with `c`, giving `{ac, bc}`.

        But `"a{b,c}"` means: the letter `a` concatenated with the union `{b, c}`, giving `{ab, ac}`.

        Getting these operators mixed up leads to completely wrong results.
      wrong_approach: "Treating comma as concatenation or adjacency as union"
      correct_approach: "Comma = union (∪), adjacency = Cartesian product (×)"

    - title: Not Handling Nested Braces
      description: |
        Expressions can be deeply nested: `"{{a,b},{c,{d,e}}}"`. A flat scan that ignores brace depth (for example, splitting on every comma) won't work, because inner expressions must be fully evaluated before they can be combined.

        For `"{{a,b},{c,{d,e}}}"`:
        - Inner `{a,b}` → `{a, b}`
        - Inner `{d,e}` → `{d, e}`
        - `{c,{d,e}}` → `{c, d, e}` (union)
        - Outer union → `{a, b, c, d, e}`

        Each level of braces must be fully resolved before the outer level can proceed.
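
        To see why brace depth matters, here is a minimal sketch of a depth-aware comma split. The helper name `split_top_level` is hypothetical and is not used by the solutions; it only shows that a comma separates union operands when it sits at the current brace level.

        ```python
        def split_top_level(body: str) -> list[str]:
            """Split the inside of a brace group on commas at depth 0 only."""
            parts, depth, start = [], 0, 0
            for i, ch in enumerate(body):
                if ch == "{":
                    depth += 1
                elif ch == "}":
                    depth -= 1
                elif ch == "," and depth == 0:
                    parts.append(body[start:i])
                    start = i + 1
            parts.append(body[start:])
            return parts


        inner = "{a,b},{c,{d,e}}"      # contents of "{{a,b},{c,{d,e}}}"
        print(inner.split(","))        # naive split cuts through nested groups
        print(split_top_level(inner))  # ['{a,b}', '{c,{d,e}}']
        ```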
      wrong_approach: "Flat string manipulation that ignores nesting depth"
      correct_approach: "Recursive parsing that evaluates from innermost to outermost"

    - title: Duplicate Handling
      description: |
        The result must contain each word **at most once**. For example, `"{{a,z},{ab,z}}"` should return `["a", "ab", "z"]`, not `["a", "ab", "z", "z"]`.

        If you use lists instead of sets during computation, you'll have duplicates that need deduplication at the end. Using sets throughout is cleaner.
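
        A small illustration of the difference, using the two operands from the example above (the variable names are only for this sketch):

        ```python
        # Operands of "{{a,z},{ab,z}}" after evaluating the inner groups
        first, second = ["a", "z"], ["ab", "z"]

        # List concatenation keeps the repeated "z"
        as_list = sorted(first + second)

        # Set union drops it automatically
        as_set = sorted(set(first) | set(second))

        print(as_list)  # ['a', 'ab', 'z', 'z']
        print(as_set)   # ['a', 'ab', 'z']
        ```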
      wrong_approach: "Using lists and forgetting to deduplicate"
      correct_approach: "Use sets throughout, convert to sorted list at the end"

    - title: Incorrect Cartesian Product
      description: |
        When concatenating two sets, every string from the first set must be paired with every string from the second set.

        For `{a, b}` concatenated with `{c, d}`:
        - Result: `{ac, ad, bc, bd}` (4 elements, not 2)

        A common mistake is to zip the sets instead of taking the full Cartesian product.
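
        The difference is easy to see side by side. A short sketch using `itertools.product` from the standard library, with lists instead of sets so the pairing order is deterministic:

        ```python
        from itertools import product

        first, second = ["a", "b"], ["c", "d"]

        # zip pairs elements positionally and silently drops the other combinations
        zipped = [x + y for x, y in zip(first, second)]

        # product pairs every element of the first with every element of the second
        full = [x + y for x, y in product(first, second)]

        print(zipped)  # ['ac', 'bd']
        print(full)    # ['ac', 'ad', 'bc', 'bd']
        ```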
      wrong_approach: "Zipping sets element-by-element"
      correct_approach: "Nested loops or itertools.product for all combinations"

  key_takeaways:
    - "**Recursive descent parsing** is a powerful technique for grammar-based problems — the code structure mirrors the grammar rules"
    - "**Distinguish your operators**: union (comma) combines sets, concatenation (adjacency) computes Cartesian products"
    - "**Use sets for uniqueness**: when the problem requires distinct elements, sets handle deduplication automatically"
    - "**Similar problems**: Expression parsing, calculator problems, and nested structure evaluation all use similar recursive techniques"

  time_complexity: "O(n × 2^(n/2)) in the worst case, where `n` is the expression length. Parsing visits each character once, but every Cartesian product multiplies the number of words, so the result set (and the work to build it) can grow exponentially with `n`."
  space_complexity: "O(2^(n/2)) for storing all possible strings in the result set, plus O(n) recursion depth for parsing."

solutions:
  - approach_name: Recursive Descent Parser
    is_optimal: true
    code: |
      def brace_expansion_ii(expression: str) -> list[str]:
          # Index pointer for parsing
          idx = 0
          n = len(expression)

          def parse_expr() -> set[str]:
              """Parse concatenated terms and compute Cartesian product."""
              nonlocal idx
              # Start with set containing empty string (identity for concatenation)
              result = {""}

              while idx < n and expression[idx] not in ",}":
                  # Parse next term
                  term_set = parse_term()
                  # Cartesian product: combine each existing string with each new string
                  result = {a + b for a in result for b in term_set}

              return result

          def parse_term() -> set[str]:
              """Parse a single term: letter or braced group."""
              nonlocal idx

              if expression[idx] == "{":
                  # It's a group — delegate to parse_group
                  return parse_group()
              else:
                  # It's a single letter
                  letter = expression[idx]
                  idx += 1
                  return {letter}

          def parse_group() -> set[str]:
              """Parse comma-separated expressions inside braces (union)."""
              nonlocal idx
              idx += 1  # Skip opening '{'
              result = set()

              while True:
                  # Parse an expression and add to union
                  result |= parse_expr()

                  if expression[idx] == "}":
                      idx += 1  # Skip closing '}'
                      break
                  else:
                      idx += 1  # Skip comma ','

              return result

          # Parse the entire expression and return sorted result
          return sorted(parse_expr())
    explanation: |
      **Time Complexity:** O(n × 2^(n/2)) — Parsing is O(n), but the Cartesian products can produce exponentially many strings.

      **Space Complexity:** O(2^(n/2)) — The result set can contain exponentially many strings in the worst case.

      The parser uses three mutually recursive functions that mirror the grammar:
      - `parse_expr` handles concatenation (Cartesian product)
      - `parse_term` handles single letters or delegates to groups
      - `parse_group` handles union of comma-separated expressions

      Using a `nonlocal` index pointer allows clean parsing without slicing strings.

  - approach_name: Stack-Based Parser
    is_optimal: false
    code: |
      def brace_expansion_ii(expression: str) -> list[str]:
          # A single stack holds sets of strings and "{" markers:
          # sets are saved prefixes or finished union alternatives, "{" marks where a group began
          stack = []
          # Current set being built
          current = {""}

          i = 0
          while i < len(expression):
              char = expression[i]

              if char == "{":
                  # Push current state and start fresh
                  stack.append(current)
                  stack.append("{")  # Marker for brace level
                  current = {""}

              elif char == "}":
                  # Union the current alternative with the earlier alternatives of this
                  # group, popping until we hit the opening brace marker
                  combined = current

                  while stack and stack[-1] != "{":
                      prev = stack.pop()
                      if isinstance(prev, set):
                          # This was a union operand
                          combined = combined | prev

                  stack.pop()  # Remove the "{" marker

                  # Now concatenate with what was before the brace
                  if stack:
                      prev_set = stack.pop()
                      current = {a + b for a in prev_set for b in combined}
                  else:
                      current = combined

              elif char == ",":
                  # Union: push current set and start fresh
                  stack.append(current)
                  current = {""}

              else:
                  # Lowercase letter: concatenate with current
                  current = {s + char for s in current}

              i += 1

          # Combine any remaining union operands
          while stack:
              prev = stack.pop()
              if isinstance(prev, set):
                  current = current | prev

          return sorted(current)
    explanation: |
      **Time Complexity:** O(n × 2^(n/2)) — Same as recursive approach.

      **Space Complexity:** O(n × 2^(n/2)) — Stack can hold multiple sets.

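      This iterative approach uses a single explicit stack instead of recursion. Each `{` pushes the prefix built so far plus a marker, each `,` pushes a finished alternative, and each `}` unions the alternatives of the group and concatenates the result onto the saved prefix. While it avoids recursion depth limits, it's harder to read and maintain. The recursive descent parser is preferred for its clarity and direct correspondence to the grammar.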