# codetutor/backend/data/questions/brace-expansion-ii.yaml
title: Brace Expansion II
slug: brace-expansion-ii
difficulty: hard
leetcode_id: 1096
leetcode_url: https://leetcode.com/problems/brace-expansion-ii/
categories:
- strings
- recursion
- stack
patterns:
- backtracking
- dfs
description: |
Under the grammar given below, strings can represent a set of lowercase words. Let `R(expr)` denote the set of words the expression represents.
The grammar can best be understood through simple examples:
- Single letters represent a singleton set containing that word.
- `R("a") = {"a"}`
- `R("w") = {"w"}`
- When we take a comma-delimited list of two or more expressions, we take the **union** of possibilities.
- `R("{a,b,c}") = {"a","b","c"}`
- `R("{{a,b},{b,c}}") = {"a","b","c"}` (notice the final set only contains each word at most once)
- When we concatenate two expressions, we take the set of possible **concatenations** between two words where the first word comes from the first expression and the second word comes from the second expression.
- `R("{a,b}{c,d}") = {"ac","ad","bc","bd"}`
- `R("a{b,c}{d,e}f{g,h}") = {"abdfg", "abdfh", "abefg", "abefh", "acdfg", "acdfh", "acefg", "acefh"}`
Formally, the three rules for our grammar:
- For every lowercase letter `x`, we have `R(x) = {x}`.
- For expressions `e1, e2, ... , ek` with `k >= 2`, we have `R({e1, e2, ...}) = R(e1) ∪ R(e2) ∪ ...`
- For expressions `e1` and `e2`, we have `R(e1 + e2) = {a + b for (a, b) in R(e1) × R(e2)}`, where `+` denotes concatenation, and `×` denotes the cartesian product.
Given an expression representing a set of words under the given grammar, return *the sorted list of words that the expression represents*.
constraints: |
- `1 <= expression.length <= 60`
- `expression[i]` consists of `'{'`, `'}'`, `','` or lowercase English letters
- The given expression represents a set of words based on the grammar given in the description
examples:
- input: 'expression = "{a,b}{c,{d,e}}"'
output: '["ac","ad","ae","bc","bd","be"]'
explanation: "We expand {a,b} to get {a, b}, then expand {c,{d,e}} to get {c, d, e}. The Cartesian product gives us all combinations: ac, ad, ae, bc, bd, be."
- input: 'expression = "{{a,z},a{b,c},{ab,z}}"'
output: '["a","ab","ac","z"]'
explanation: "Each distinct word is written only once in the final answer. The union of {a,z}, {ab,ac}, and {ab,z} gives {a, ab, ac, z}."
explanation:
intuition: |
Think of this problem like evaluating a mathematical expression, but instead of numbers and arithmetic operators, we have **sets of strings** and two operations: **union** (comma) and **concatenation** (adjacency).
Imagine you're building a tree where each node represents an expression. Leaf nodes are single letters (like `"a"`), and internal nodes combine their children using either union or concatenation. The braces `{}` group expressions together, and commas `,` inside braces indicate union.
The key insight is recognising this as a **recursive parsing problem**. The grammar has a natural hierarchy:
- At the top level, we have concatenation of terms
- Inside braces, we have union of sub-expressions
- Each sub-expression can itself contain nested braces
Think of it like evaluating `2 * (3 + 4)` — you need to handle the parentheses first, then apply the multiplication. Here, braces act like parentheses, commas act like addition (union), and adjacency acts like multiplication (Cartesian product).
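To make the analogy concrete, the two operations can be written as plain set functions and composed by hand. This is only an illustrative sketch: the helper names `union` and `concat` are assumptions chosen for this example, and no parsing happens here.

```python
def union(*sets: set[str]) -> set[str]:
    """Comma: merge the alternatives into one set."""
    result: set[str] = set()
    for s in sets:
        result |= s
    return result


def concat(left: set[str], right: set[str]) -> set[str]:
    """Adjacency: join every word on the left with every word on the right."""
    return {a + b for a in left for b in right}


# Hand-built evaluation of "{a,b}{c,{d,e}}": inner groups first, then the product.
left = union({"a"}, {"b"})
right = union({"c"}, union({"d"}, {"e"}))
words = sorted(concat(left, right))
print(words)  # ['ac', 'ad', 'ae', 'bc', 'bd', 'be']
```

A parser only has to decide which of these two functions to call at each point in the string.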
approach: |
We solve this using a **Recursive Descent Parser** that mimics the grammar structure:
**Step 1: Define the grammar hierarchy**
- `expr`: One or more terms concatenated together
- `term`: Either a single letter or a braced group `{...}`
- `group`: Comma-separated expressions inside braces (union)
&nbsp;
**Step 2: Implement recursive parsing**
- Use an index pointer to track position in the expression string
- `parse_expr()`: Parse concatenated terms and compute their Cartesian product
- `parse_term()`: Parse a single letter or delegate to `parse_group()` for braces
- `parse_group()`: Parse comma-separated expressions inside `{...}` and compute their union
&nbsp;
**Step 3: Handle the operations**
- **Union**: When we see a comma, combine two sets with set union
- **Concatenation**: When terms are adjacent, compute Cartesian product of strings
- Use a `set` throughout to automatically handle duplicates
&nbsp;
**Step 4: Return sorted result**
- Convert the final set to a sorted list before returning
&nbsp;
The recursive structure naturally handles nested expressions because each call to `parse_group()` can trigger new calls to `parse_expr()`, which handles the nested content.
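As a compact illustration of the four steps, here is a minimal sketch that passes the index explicitly instead of sharing it between functions. The name `expand` and the decision to fold the group logic into `parse_term` are assumptions made for brevity; it assumes a well-formed expression and does no error handling.

```python
def expand(expression: str) -> list[str]:
    def parse_expr(i: int) -> tuple[set[str], int]:
        # expr: one or more terms, combined by Cartesian product
        words = {""}
        while i < len(expression) and expression[i] not in ",}":
            term, i = parse_term(i)
            words = {a + b for a in words for b in term}
        return words, i

    def parse_term(i: int) -> tuple[set[str], int]:
        # term: a single letter, or a braced group
        if expression[i] != "{":
            return {expression[i]}, i + 1
        # group: comma-separated expressions, combined by union
        words: set[str] = set()
        i += 1  # skip '{'
        while True:
            part, i = parse_expr(i)
            words |= part
            i += 1  # skip ',' or '}'
            if expression[i - 1] == "}":
                return words, i

    words, _ = parse_expr(0)
    return sorted(words)


print(expand("{a,b}{c,{d,e}}"))  # ['ac', 'ad', 'ae', 'bc', 'bd', 'be']
```

Returning `(set, next_index)` pairs keeps each function free of shared state, at the cost of slightly noisier call sites.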
common_pitfalls:
- title: Confusing Union and Concatenation
description: |
The comma operator `,` performs **union** (adding to the set), while adjacency performs **concatenation** (Cartesian product).
For example, `"{a,b}c"` means: union of `a` and `b`, then concatenate with `c`, giving `{ac, bc}`.
But `"{a}{b,c}"` means: the set `{a}` concatenated with the union `{b, c}`, giving `{ab, ac}`.
Getting these operators mixed up leads to completely wrong results.
wrong_approach: "Treating comma as concatenation or adjacency as union"
correct_approach: "Comma = union (∪), adjacency = Cartesian product (×)"
- title: Not Handling Nested Braces
description: |
Expressions can be deeply nested: `"{{a,b},{c,{d,e}}}"`. A simple single-pass approach won't work because you need to fully evaluate inner expressions before combining them.
For `"{{a,b},{c,{d,e}}}"`:
- Inner `{a,b}` → `{a, b}`
- Inner `{d,e}` → `{d, e}`
- `{c,{d,e}}` → `{c, d, e}` (union)
- Outer union → `{a, b, c, d, e}`
Each level of braces must be fully resolved before the outer level can proceed.
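A small sketch of why naive string manipulation breaks: splitting on every comma cuts through nested groups, whereas a split that tracks brace depth keeps them intact. The helper `split_top_level` is a hypothetical name used only for this illustration.

```python
def split_top_level(body: str) -> list[str]:
    """Split on commas that are not inside any nested braces."""
    parts: list[str] = []
    depth = 0
    start = 0
    for i, ch in enumerate(body):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(body[start:i])
            start = i + 1
    parts.append(body[start:])
    return parts


body = "{a,b},{c,{d,e}}"  # contents of the outer braces in "{{a,b},{c,{d,e}}}"
naive = body.split(",")
aware = split_top_level(body)
print(naive)  # ['{a', 'b}', '{c', '{d', 'e}}']
print(aware)  # ['{a,b}', '{c,{d,e}}']
```

Each element of `aware` is itself a complete expression, which is exactly what a recursive call needs.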
wrong_approach: "Single-pass string manipulation"
correct_approach: "Recursive parsing that evaluates from innermost to outermost"
- title: Duplicate Handling
description: |
The result must contain each word **at most once**. For example, `"{{a,z},{ab,z}}"` should return `["a", "ab", "z"]`, not `["a", "ab", "z", "z"]`.
If you use lists instead of sets during computation, you'll have duplicates that need deduplication at the end. Using sets throughout is cleaner.
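The difference is easy to see on the two alternatives from the example above. This is a minimal sketch with the sub-results written out by hand rather than parsed.

```python
parts = [["a", "z"], ["ab", "z"]]  # words from {a,z} and from {ab,z}

# List accumulation keeps the repeated "z".
as_list = [word for part in parts for word in part]
print(as_list)  # ['a', 'z', 'ab', 'z']

# Set union drops it automatically.
as_set = set().union(*parts)
print(sorted(as_set))  # ['a', 'ab', 'z']
```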
wrong_approach: "Using lists and forgetting to deduplicate"
correct_approach: "Use sets throughout, convert to sorted list at the end"
- title: Incorrect Cartesian Product
description: |
When concatenating two sets, every string from the first set must be paired with every string from the second set.
For `{a, b}` concatenated with `{c, d}`:
- Result: `{ac, ad, bc, bd}` (4 elements, not 2)
A common mistake is to zip the sets instead of taking the full Cartesian product.
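The following sketch contrasts the two on the example sets, using lists so the output order is predictable.

```python
from itertools import product

first = ["a", "b"]
second = ["c", "d"]

# zip pairs elements position by position and silently drops combinations.
zipped = [x + y for x, y in zip(first, second)]
print(zipped)  # ['ac', 'bd']

# product pairs every element of `first` with every element of `second`.
full = [x + y for x, y in product(first, second)]
print(full)  # ['ac', 'ad', 'bc', 'bd']
```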
wrong_approach: "Zipping sets element-by-element"
correct_approach: "Nested loops or itertools.product for all combinations"
key_takeaways:
- "**Recursive descent parsing** is a powerful technique for grammar-based problems — the code structure mirrors the grammar rules"
- "**Distinguish your operators**: union (comma) combines sets, concatenation (adjacency) computes Cartesian products"
- "**Use sets for uniqueness**: when the problem requires distinct elements, sets handle deduplication automatically"
- "**Similar problems**: Expression parsing, calculator problems, and nested structure evaluation all use similar recursive techniques"
time_complexity: "O(n × 2^(n/2)) in the worst case, where `n` is the expression length. Each brace group multiplies the number of words under concatenation, so the result can grow exponentially in `n`; the bound given here is a loose upper limit rather than a tight one."
space_complexity: "O(2^(n/2)) for storing all possible strings in the result set, plus O(n) recursion depth for parsing."
solutions:
- approach_name: Recursive Descent Parser
is_optimal: true
code: |
  def brace_expansion_ii(expression: str) -> list[str]:
      # Index pointer for parsing
      idx = 0
      n = len(expression)

      def parse_expr() -> set[str]:
          """Parse concatenated terms and compute Cartesian product."""
          nonlocal idx
          # Start with set containing empty string (identity for concatenation)
          result = {""}
          while idx < n and expression[idx] not in ",}":
              # Parse next term
              term_set = parse_term()
              # Cartesian product: combine each existing string with each new string
              result = {a + b for a in result for b in term_set}
          return result

      def parse_term() -> set[str]:
          """Parse a single term: letter or braced group."""
          nonlocal idx
          if expression[idx] == "{":
              # It's a group: delegate to parse_group
              return parse_group()
          else:
              # It's a single letter
              letter = expression[idx]
              idx += 1
              return {letter}

      def parse_group() -> set[str]:
          """Parse comma-separated expressions inside braces (union)."""
          nonlocal idx
          idx += 1  # Skip opening '{'
          result = set()
          while True:
              # Parse an expression and add to union
              result |= parse_expr()
              if expression[idx] == "}":
                  idx += 1  # Skip closing '}'
                  break
              else:
                  idx += 1  # Skip comma ','
          return result

      # Parse the entire expression and return sorted result
      return sorted(parse_expr())
explanation: |
**Time Complexity:** O(n × 2^(n/2)) — Parsing is O(n), but the Cartesian products can produce exponentially many strings.
**Space Complexity:** O(2^(n/2)) — The result set can contain exponentially many strings in the worst case.
The parser uses three mutually recursive functions that mirror the grammar:
- `parse_expr` handles concatenation (Cartesian product)
- `parse_term` handles single letters or delegates to groups
- `parse_group` handles union of comma-separated expressions
Using a `nonlocal` index pointer allows clean parsing without slicing strings.
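To see the exponential growth behind the complexity figures, consider the expression `"{a,b}"` repeated `k` times: it has length `5k` and represents `2^k` distinct words. The sketch below builds those words directly with `itertools.product` instead of calling the parser; `count_words` is a hypothetical helper for this illustration only.

```python
from itertools import product


def count_words(k: int) -> tuple[int, int]:
    """Return (expression length, number of words) for "{a,b}" repeated k times."""
    expression = "{a,b}" * k
    # Each group contributes one letter, 'a' or 'b', to every word.
    words = {"".join(choice) for choice in product("ab", repeat=k)}
    return len(expression), len(words)


for k in (1, 2, 4, 8):
    print(count_words(k))
# (5, 2)
# (10, 4)
# (20, 16)
# (40, 256)
```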
- approach_name: Stack-Based Parser
is_optimal: false
code: |
  def brace_expansion_ii(expression: str) -> list[str]:
      # A single stack holds two kinds of entries:
      # sets of strings (saved prefixes and union operands) and "{" markers
      stack = []
      # Current set being built
      current = {""}
      i = 0
      while i < len(expression):
          char = expression[i]
          if char == "{":
              # Push current state and start fresh
              stack.append(current)
              stack.append("{")  # Marker for brace level
              current = {""}
          elif char == "}":
              # Pop and combine until we hit the opening brace marker
              # Everything above the marker is a union operand of this group
              combined = current
              while stack and stack[-1] != "{":
                  prev = stack.pop()
                  if isinstance(prev, set):
                      # This was a union operand
                      combined = combined | prev
              stack.pop()  # Remove the "{" marker
              # Now concatenate with what was before the brace
              if stack:
                  prev_set = stack.pop()
                  current = {a + b for a in prev_set for b in combined}
              else:
                  current = combined
          elif char == ",":
              # Union: push current set and start fresh
              stack.append(current)
              current = {""}
          else:
              # Lowercase letter: concatenate with current
              current = {s + char for s in current}
          i += 1
      # Combine any remaining union operands
      while stack:
          prev = stack.pop()
          if isinstance(prev, set):
              current = current | prev
      return sorted(current)
explanation: |
**Time Complexity:** O(n × 2^(n/2)) — Same as recursive approach.
**Space Complexity:** O(n × 2^(n/2)) — Stack can hold multiple sets.
This iterative approach uses a single explicit stack instead of recursion. It avoids recursion depth limits, but it is harder to read and maintain, because saved prefixes and union operands share the same stack and are told apart only by the `"{"` markers. The recursive descent parser is preferred for its clarity and direct correspondence to the grammar.