codetutor/backend/data/questions/valid-parenthesis-string.yaml

title: Valid Parenthesis String
slug: valid-parenthesis-string
difficulty: medium
leetcode_id: 678
leetcode_url: https://leetcode.com/problems/valid-parenthesis-string/
categories:
  - strings
  - stack
  - dynamic-programming
patterns:
  - greedy
  - dynamic-programming

function_signature: "def check_valid_string(s: str) -> bool:"

test_cases:
  visible:
    - input: { s: "()" }
      expected: true
    - input: { s: "(*)" }
      expected: true
    - input: { s: "(*))" }
      expected: true
  hidden:
    - input: { s: "" }
      expected: true
    - input: { s: "(*" }
      expected: true
    - input: { s: "(((((*(()((((*((**(((()()*)()()()*((((**)())*)*)))))))(())(()))())((*()()(((()((()*(*)(*)*()(((((*)()" }
      expected: false
    - input: { s: "***" }
      expected: true
    - input: { s: "(((*)" }
      expected: false
    - input: { s: "(*)(*" }
      expected: true

description: |
  Given a string `s` containing only three types of characters: `'('`, `')'` and `'*'`, return `true` *if* `s` *is **valid***.

  The following rules define a **valid** string:

  - Any left parenthesis `'('` must have a corresponding right parenthesis `')'`.
  - Any right parenthesis `')'` must have a corresponding left parenthesis `'('`.
  - Left parenthesis `'('` must go before the corresponding right parenthesis `')'`.
  - `'*'` could be treated as a single right parenthesis `')'` or a single left parenthesis `'('` or an empty string `""`.

constraints: |
  - `1 <= s.length <= 100`
  - `s[i]` is `'('`, `')'` or `'*'`.

examples:
  - input: 's = "()"'
    output: "true"
    explanation: "Standard valid parentheses with matching open and close."
  - input: 's = "(*)"'
    output: "true"
    explanation: "The '*' can be treated as an empty string, leaving valid '()'."
  - input: 's = "(*))"'
    output: "true"
    explanation: "The '*' can be treated as '(', making the string '(())' which is valid."

explanation:
  intuition: |
    Think of this problem as **tracking the possible number of unmatched open parentheses** at any point in the string.

    Without wildcards, validating parentheses is straightforward: maintain a counter that increases for `'('` and decreases for `')'`. If it ever goes negative or doesn't end at zero, the string is invalid.

    The `'*'` wildcard complicates things because it can be any of three things: `'('`, `')'`, or empty. Instead of tracking a single count, we need to track a **range of possibilities**.

    Imagine you're walking through the string left to right. At each position, the number of unmatched `'('` could be anywhere within a range:
    - **Minimum count** (`lo`): The fewest unmatched `'('` we could have (if we treat `'*'` as `')'` or empty when helpful)
    - **Maximum count** (`hi`): The most unmatched `'('` we could have (if we treat `'*'` as `'('` when helpful)

    As long as there exists *some* valid interpretation (i.e., the range includes zero at the end), the string is valid. The key insight is that we don't need to try all 3<sup>n</sup> combinations — we just need to track the bounds.
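
    To make the "range of possibilities" concrete, here is a minimal sketch that tracks the exact set of feasible open counts rather than only its bounds. The helper name `reachable_open_counts` is hypothetical and used purely for illustration; it is not one of the reference solutions.

    ```python
    def reachable_open_counts(s: str) -> list[set[int]]:
        """Return, after each character, the set of feasible unmatched-'(' counts."""
        counts = {0}
        history = []
        for ch in s:
            if ch == '(':
                nxt = {c + 1 for c in counts}
            elif ch == ')':
                nxt = {c - 1 for c in counts}
            else:  # '*' may act as ')', empty, or '('
                nxt = {c + d for c in counts for d in (-1, 0, 1)}
            # A negative count means a ')' appeared with nothing to close
            counts = {c for c in nxt if c >= 0}
            history.append(counts)
        return history
    ```

    For `"(*))"` the sets are `{1}`, `{0, 1, 2}`, `{0, 1}`, `{0}`. Each one is a contiguous run of integers, which is why storing only its smallest and largest element is enough.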

  approach: |
    We solve this using a **Greedy Range Tracking** approach:

    **Step 1: Initialise two counters**

    - `lo`: Set to `0`, representing the minimum possible unmatched `'('`
    - `hi`: Set to `0`, representing the maximum possible unmatched `'('`

    &nbsp;

    **Step 2: Iterate through each character**

    - For `'('`: Both `lo` and `hi` increase by 1 (we must have one more unmatched open)
    - For `')'`: Both `lo` and `hi` decrease by 1 (we close one open parenthesis)
    - For `'*'`: `lo` decreases by 1 (treat it as `')'`) and `hi` increases by 1 (treat it as `'('`); treating it as empty leaves the count unchanged, which the widened range already covers

    &nbsp;

    **Step 3: Maintain validity of the range**

    - If `hi` goes negative, there are more `')'` than any interpretation can match, so return `false`
    - Keep `lo` at least 0 (we can't have negative unmatched opens in reality; this just means we'd treat some `'*'` differently)

    &nbsp;

    **Step 4: Check final state**

    - If `lo == 0` at the end, there's a valid way to interpret the wildcards
    - Return `lo == 0`

    &nbsp;

    This greedy approach works because we're tracking all possible valid states simultaneously through the range `[lo, hi]`. If zero falls within this range at the end, we can construct a valid interpretation.
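
    As a worked example of the steps above, the following sketch records the `(lo, hi)` pair after each character. The helper `trace_range` is a hypothetical name introduced here for illustration only.

    ```python
    def trace_range(s: str) -> list[tuple[int, int]]:
        """Record (lo, hi) after each character, stopping early if hi < 0."""
        lo = hi = 0
        trace = []
        for ch in s:
            if ch == '(':
                lo, hi = lo + 1, hi + 1
            elif ch == ')':
                lo, hi = lo - 1, hi - 1
            else:  # '*'
                lo, hi = lo - 1, hi + 1
            if hi < 0:
                break  # no interpretation can absorb this ')'
            lo = max(lo, 0)
            trace.append((lo, hi))
        return trace
    ```

    For `"(*))"` the trace is `(1, 1)`, `(0, 2)`, `(0, 1)`, `(0, 0)`, so the scan ends with `lo == 0` and the string is accepted. For `"(((*)"` the final pair is `(1, 3)`, so at least one `'('` stays unmatched and the string is rejected.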

  common_pitfalls:
    - title: Trying All Combinations
      description: |
        A naive approach might try all possible interpretations of each `'*'` character, leading to **O(3^k) candidate strings** to check, where `k` is the number of wildcards.

        With up to 100 characters potentially being wildcards, that is as many as 3^100 (about 5 × 10^47) candidates. The range-tracking approach reduces this to a single O(n) pass by recognising we only need to track bounds, not enumerate possibilities.
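
        The brute force is still handy as a test oracle on short inputs. A minimal sketch, assuming only a handful of wildcards; the names `is_balanced` and `check_valid_string_bruteforce` are hypothetical:

        ```python
        from itertools import product

        def is_balanced(t: str) -> bool:
            """Classic single-counter check for a string of '(' and ')' only."""
            depth = 0
            for ch in t:
                depth += 1 if ch == '(' else -1
                if depth < 0:
                    return False
            return depth == 0

        def check_valid_string_bruteforce(s: str) -> bool:
            """Try every assignment of '(', ')' or '' to the wildcards (3^k cases)."""
            stars = s.count('*')
            for choice in product(['(', ')', ''], repeat=stars):
                it = iter(choice)
                candidate = ''.join(next(it) if ch == '*' else ch for ch in s)
                if is_balanced(candidate):
                    return True
            return False
        ```

        Comparing its answers with the greedy solution on short random strings is a cheap way to gain confidence in the O(n) version.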
      wrong_approach: "Recursively try all 3 options for each '*'"
      correct_approach: "Track min/max range of possible open counts"

    - title: Only Tracking One Counter
      description: |
        Using a single counter like regular parenthesis validation won't work. Consider `"(*)"`:
        - If we always treat `'*'` as empty: `"()"` → valid
        - If we always treat `'*'` as `'('`: `"(()"` → invalid
        - If we always treat `'*'` as `')'`: `"())"` → invalid

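        No single fixed interpretation works for every input, and different `'*'` characters in the same string might need different interpretations. For example, `"*)(*"` is valid only if the first `'*'` is treated as `'('` and the second as `')'`.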
      wrong_approach: "Single counter with fixed '*' interpretation"
      correct_approach: "Two counters tracking the range of possibilities"

    - title: Forgetting to Clamp the Minimum
      description: |
        When `'*'` is treated as `')'`, `lo` might go negative. But a negative count of unmatched `'('` doesn't make sense in reality; it just means we'd treat fewer `'*'` as `')'`.

        If we don't clamp `lo` to at least 0, we'll get incorrect results. For example, `"*"` should be valid (treat as empty), but without clamping, `lo` would end at -1 and the final `lo == 0` check would wrongly fail.
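
        The failure is easy to reproduce. Below is a deliberately broken sketch with the clamp removed; `check_without_clamp` is a hypothetical name used only to demonstrate the bug:

        ```python
        def check_without_clamp(s: str) -> bool:
            """Buggy variant: same range update, but lo is never clamped to 0."""
            lo = hi = 0
            for ch in s:
                lo += 1 if ch == '(' else -1   # ')' and '*' both lower lo
                hi += 1 if ch != ')' else -1   # '(' and '*' both raise hi
                if hi < 0:
                    return False
            return lo == 0
        ```

        This variant still accepts `"()"`, but it rejects both `"*"` and `"(*)"`, which are valid, because `lo` finishes at -1 instead of 0.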
      wrong_approach: "Let lo go negative without correction"
      correct_approach: "Use lo = max(lo, 0) after each step"

  key_takeaways:
    - "**Range tracking**: When a problem has multiple valid states, track the bounds rather than enumerating all possibilities"
    - "**Greedy with bounds**: This pattern of maintaining a `[lo, hi]` range appears in other problems involving wildcards or uncertain values"
    - "**Linear scan suffices**: Even with exponentially many possible interpretations, clever state tracking reduces complexity to O(n)"
    - "**Extends classic pattern**: This builds on the basic parenthesis validation pattern by adding flexibility for wildcards"

  time_complexity: "O(n). We traverse the string exactly once, performing constant-time operations at each character."
  space_complexity: "O(1). We only use two integer variables (`lo` and `hi`), regardless of input size."

solutions:
  - approach_name: Greedy Range Tracking
    is_optimal: true
    code: |
      def check_valid_string(s: str) -> bool:
          # lo = minimum possible unmatched '('
          # hi = maximum possible unmatched '('
          lo = 0
          hi = 0

          for char in s:
              if char == '(':
                  # Must have one more unmatched open
                  lo += 1
                  hi += 1
              elif char == ')':
                  # Close one open parenthesis
                  lo -= 1
                  hi -= 1
              else:  # char == '*'
                  # '*' as ')' or empty decreases lo
                  # '*' as '(' increases hi
                  lo -= 1
                  hi += 1

              # Too many ')' that can't be matched
              if hi < 0:
                  return False

              # Can't have negative unmatched '(' in reality
              lo = max(lo, 0)

          # Valid if we can end with zero unmatched '('
          return lo == 0
    explanation: |
      **Time Complexity:** O(n) — Single pass through the string.

      **Space Complexity:** O(1) — Only two integer variables used.

      We track the range of possible unmatched open parentheses. At each step, `lo` represents the minimum (assuming wildcards help close) and `hi` represents the maximum (assuming wildcards add opens). If zero is achievable at the end, the string is valid.

  - approach_name: Two Stack
    is_optimal: false
    code: |
      def check_valid_string(s: str) -> bool:
          # Store indices of unmatched '(' and '*'
          open_stack = []
          star_stack = []

          for i, char in enumerate(s):
              if char == '(':
                  open_stack.append(i)
              elif char == '*':
                  star_stack.append(i)
              else:  # char == ')'
                  # Try to match with '(' first, then '*'
                  if open_stack:
                      open_stack.pop()
                  elif star_stack:
                      star_stack.pop()
                  else:
                      # No '(' or '*' to match
                      return False

          # Match remaining '(' with '*' that comes after
          while open_stack and star_stack:
              # '(' must come before '*' for valid match
              if open_stack[-1] > star_stack[-1]:
                  return False
              open_stack.pop()
              star_stack.pop()

          # Valid only if all '(' are matched
          return len(open_stack) == 0
    explanation: |
      **Time Complexity:** O(n) — Single pass plus stack cleanup.

      **Space Complexity:** O(n) — Stacks can store up to n indices.

      This approach uses two stacks to track positions of unmatched `'('` and `'*'`. When we see `')'`, we prefer matching with `'('`. After the scan, we try to match remaining `'('` with `'*'` that appear to their right. While correct, it uses more space than the greedy approach.

  - approach_name: Dynamic Programming
    is_optimal: false
    code: |
      def check_valid_string(s: str) -> bool:
          n = len(s)
          # dp[i][j] = True if s[i:] is valid with j unmatched '('
          # Use memoisation for efficiency
          memo = {}

          def dp(i: int, open_count: int) -> bool:
              # Base case: end of string
              if i == n:
                  return open_count == 0

              # Too many unmatched ')' seen
              if open_count < 0:
                  return False

              if (i, open_count) in memo:
                  return memo[(i, open_count)]

              char = s[i]
              if char == '(':
                  result = dp(i + 1, open_count + 1)
              elif char == ')':
                  result = dp(i + 1, open_count - 1)
              else:  # char == '*'
                  # Try all three options
                  result = (dp(i + 1, open_count + 1) or  # as '('
                           dp(i + 1, open_count - 1) or   # as ')'
                           dp(i + 1, open_count))         # as empty

              memo[(i, open_count)] = result
              return result

          return dp(0, 0)
    explanation: |
      **Time Complexity:** O(n^2) — At most n positions times n possible open counts.

      **Space Complexity:** O(n^2) — Memoisation table size.

      This recursive approach with memoisation explores all possibilities but caches results. For each position and open count, we determine if the remaining string can be valid. While correct and more intuitive, it's less efficient than the greedy approach for this problem.