questions S-W

2025-05-30 19:18:33 +01:00
parent ddceeec07e
commit 041a877295
46 changed files with 9696 additions and 0 deletions
--- a/backend/data/questions/valid-parenthesis-string.yaml
+++ b/backend/data/questions/valid-parenthesis-string.yaml
@@ -0,0 +1,246 @@
+title: Valid Parenthesis String
+slug: valid-parenthesis-string
+difficulty: medium
+leetcode_id: 678
+leetcode_url: https://leetcode.com/problems/valid-parenthesis-string/
+categories:
+  - strings
+  - stack
+  - dynamic-programming
+patterns:
+  - greedy
+  - dynamic-programming
+
+description: |
+  Given a string `s` containing only three types of characters: `'('`, `')'` and `'*'`, return `true` *if* `s` *is **valid***.
+
+  The following rules define a **valid** string:
+
+  - Any left parenthesis `'('` must have a corresponding right parenthesis `')'`.
+  - Any right parenthesis `')'` must have a corresponding left parenthesis `'('`.
+  - Left parenthesis `'('` must go before the corresponding right parenthesis `')'`.
+  - `'*'` could be treated as a single right parenthesis `')'` or a single left parenthesis `'('` or an empty string `""`.
+
+constraints: |
+  - `1 <= s.length <= 100`
+  - `s[i]` is `'('`, `')'` or `'*'`.
+
+examples:
+  - input: 's = "()"'
+    output: "true"
+    explanation: "Standard valid parentheses with matching open and close."
+  - input: 's = "(*)"'
+    output: "true"
+    explanation: "The '*' can be treated as an empty string, leaving valid '()'."
+  - input: 's = "(*))"'
+    output: "true"
+    explanation: "The '*' can be treated as '(', making the string '(())' which is valid."
+
+explanation:
+  intuition: |
+    Think of this problem as **tracking the possible number of unmatched open parentheses** at any point in the string.
+
+    Without wildcards, validating parentheses is straightforward: maintain a counter that increases for `'('` and decreases for `')'`. If it ever goes negative or doesn't end at zero, the string is invalid.
+
+    The `'*'` wildcard complicates things because it can be any of three things: `'('`, `')'`, or empty. Instead of tracking a single count, we need to track a **range of possibilities**.
+
+    Imagine you're walking through the string left to right. At each position, the number of unmatched `'('` could be anywhere within a range:
+    - **Minimum count** (`lo`): The fewest unmatched `'('` we could have (treating each `'*'` as `')'` or empty whenever that helps)
+    - **Maximum count** (`hi`): The most unmatched `'('` we could have (treating every `'*'` as `'('`)
+
+    As long as there exists *some* valid interpretation (i.e., the range includes zero at the end), the string is valid. The key insight is that we don't need to try all 3<sup>n</sup> combinations — we just need to track the bounds.
+
+  approach: |
+    We solve this using a **Greedy Range Tracking** approach:
+
+    **Step 1: Initialise two counters**
+
+    - `lo`: Set to `0`, representing the minimum possible unmatched `'('`
+    - `hi`: Set to `0`, representing the maximum possible unmatched `'('`
+
+    &nbsp;
+
+    **Step 2: Iterate through each character**
+
+    - For `'('`: Both `lo` and `hi` increase by 1 (we must have one more unmatched open)
+    - For `')'`: Both `lo` and `hi` decrease by 1 (we close one open parenthesis)
+    - For `'*'`: `lo` decreases by 1 (treat it as `')'`) and `hi` increases by 1 (treat it as `'('`); treating it as empty gives a count in between
+
+    &nbsp;
+
+    **Step 3: Maintain validity of the range**
+
+    - If `hi` goes negative, even treating every `'*'` so far as `'('` leaves a `')'` unmatched, so return `false`
+    - Keep `lo` at least 0: a negative value would mean we treated too many `'*'` as `')'`, so we reinterpret one of them as empty instead
+
+    &nbsp;
+
+    **Step 4: Check final state**
+
+    - If `lo == 0` at the end, there's a valid way to interpret the wildcards
+    - Return `lo == 0`
+
+    &nbsp;
+
+    This greedy approach works because the range `[lo, hi]` captures every achievable count of unmatched `'('`: a `'*'` can shift the count by -1, 0 or +1, so every integer between the two bounds stays reachable. If zero falls within this range at the end, we can construct a valid interpretation.
+
+  common_pitfalls:
+    - title: Trying All Combinations
+      description: |
+        A naive approach might try all possible interpretations of each `'*'` character, leading to **O(3^n) time complexity** where `n` is the number of wildcards.
+
+        With up to 100 characters potentially being wildcards, this would be astronomically slow. The range-tracking approach reduces this to O(n) by recognising we only need to track bounds, not enumerate possibilities.
+      wrong_approach: "Recursively try all 3 options for each '*'"
+      correct_approach: "Track min/max range of possible open counts"
+
+    - title: Only Tracking One Counter
+      description: |
+        Using a single counter like regular parenthesis validation won't work, because no fixed interpretation of `'*'` is right for every string. Consider `"*)(*"`:
+        - If we always treat `'*'` as empty: `")("` → invalid
+        - If we always treat `'*'` as `'('`: `"()(("` → invalid
+        - If we always treat `'*'` as `')'`: `"))()"` → invalid
+
+        The string is valid only when the first `'*'` becomes `'('` and the second becomes `')'`, giving `"()()"`. Different `'*'` characters might need different interpretations.
+      wrong_approach: "Single counter with fixed '*' interpretation"
+      correct_approach: "Two counters tracking the range of possibilities"
+
+    - title: Forgetting to Clamp the Minimum
+      description: |
+        When `'*'` is treated as `')'`, `lo` might go negative. But a negative count of unmatched `'('` doesn't make sense in reality; it just means we should treat fewer `'*'` as `')'`.
+
+        If we don't clamp `lo` to at least 0, we'll get incorrect results. For example, `"*"` should be valid (treat as empty), but without clamping, `lo` would end at -1 and the final `lo == 0` check would fail.
+      wrong_approach: "Let lo go negative without correction"
+      correct_approach: "Use lo = max(lo, 0) after each step"
+
+  key_takeaways:
+    - "**Range tracking**: When a problem has multiple valid states, track the bounds rather than enumerating all possibilities"
+    - "**Greedy with bounds**: This pattern of maintaining a `[lo, hi]` range appears in other problems involving wildcards or uncertain values"
+    - "**Linear scan suffices**: Even with exponential possible interpretations, clever state tracking reduces complexity to O(n)"
+    - "**Extends classic pattern**: This builds on the basic parenthesis validation pattern by adding flexibility for wildcards"
+
+  time_complexity: "O(n). We traverse the string exactly once, performing constant-time operations at each character."
+  space_complexity: "O(1). We only use two integer variables (`lo` and `hi`), regardless of input size."
+
+solutions:
+  - approach_name: Greedy Range Tracking
+    is_optimal: true
+    code: |
+      def check_valid_string(s: str) -> bool:
+          # lo = minimum possible unmatched '('
+          # hi = maximum possible unmatched '('
+          lo = 0
+          hi = 0
+
+          for char in s:
+              if char == '(':
+                  # Must have one more unmatched open
+                  lo += 1
+                  hi += 1
+              elif char == ')':
+                  # Close one open parenthesis
+                  lo -= 1
+                  hi -= 1
+              else:  # char == '*'
+                  # '*' as ')' lowers lo (the clamp below covers '*' as empty)
+                  # '*' as '(' raises hi
+                  lo -= 1
+                  hi += 1
+
+              # Too many ')' that can't be matched
+              if hi < 0:
+                  return False
+
+              # Can't have negative unmatched '(' in reality
+              lo = max(lo, 0)
+
+          # Valid if we can end with zero unmatched '('
+          return lo == 0
+    explanation: |
+      **Time Complexity:** O(n) — Single pass through the string.
+
+      **Space Complexity:** O(1) — Only two integer variables used.
+
+      We track the range of possible unmatched open parentheses. At each step, `lo` represents the minimum (assuming wildcards help close) and `hi` represents the maximum (assuming wildcards add opens). If zero is achievable at the end, the string is valid.
+
+  - approach_name: Two Stack
+    is_optimal: false
+    code: |
+      def check_valid_string(s: str) -> bool:
+          # Store indices of unmatched '(' and '*'
+          open_stack = []
+          star_stack = []
+
+          for i, char in enumerate(s):
+              if char == '(':
+                  open_stack.append(i)
+              elif char == '*':
+                  star_stack.append(i)
+              else:  # char == ')'
+                  # Try to match with '(' first, then '*'
+                  if open_stack:
+                      open_stack.pop()
+                  elif star_stack:
+                      star_stack.pop()
+                  else:
+                      # No '(' or '*' to match
+                      return False
+
+          # Match remaining '(' with '*' that comes after
+          while open_stack and star_stack:
+              # '(' must come before '*' for valid match
+              if open_stack[-1] > star_stack[-1]:
+                  return False
+              open_stack.pop()
+              star_stack.pop()
+
+          # Valid only if all '(' are matched
+          return len(open_stack) == 0
+    explanation: |
+      **Time Complexity:** O(n) — Single pass plus stack cleanup.
+
+      **Space Complexity:** O(n) — Stacks can store up to n indices.
+
+      This approach uses two stacks to track positions of unmatched `'('` and `'*'`. When we see `')'`, we prefer matching with `'('`. After the scan, we try to match remaining `'('` with `'*'` that appear to their right. While correct, it uses more space than the greedy approach.
+
+  - approach_name: Dynamic Programming
+    is_optimal: false
+    code: |
+      def check_valid_string(s: str) -> bool:
+          n = len(s)
+          # dp(i, open_count) is True if s[i:] can be made valid
+          # given open_count unmatched '(' so far; results are memoised
+          memo = {}
+
+          def dp(i: int, open_count: int) -> bool:
+              # Base case: end of string
+              if i == n:
+                  return open_count == 0
+
+              # Too many unmatched ')' seen
+              if open_count < 0:
+                  return False
+
+              if (i, open_count) in memo:
+                  return memo[(i, open_count)]
+
+              char = s[i]
+              if char == '(':
+                  result = dp(i + 1, open_count + 1)
+              elif char == ')':
+                  result = dp(i + 1, open_count - 1)
+              else:  # char == '*'
+                  # Try all three options
+                  result = (dp(i + 1, open_count + 1) or  # as '('
+                           dp(i + 1, open_count - 1) or   # as ')'
+                           dp(i + 1, open_count))         # as empty
+
+              memo[(i, open_count)] = result
+              return result
+
+          return dp(0, 0)
+    explanation: |
+      **Time Complexity:** O(n^2) — At most n positions times n possible open counts.
+
+      **Space Complexity:** O(n^2) — Memoisation table size.
+
+      This recursive approach with memoisation explores all possibilities but caches results. For each position and open count, we determine if the remaining string can be valid. While correct and more intuitive, it's less efficient than the greedy approach for this problem.
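
As a sanity check on the range-tracking idea described in the `intuition` and `approach` fields, here is a minimal sketch that records the `(lo, hi)` pair after each character and compares the greedy answer with an exhaustive search over every `'*'` replacement. It is not part of the committed YAML; the helper names `is_balanced`, `brute_force_valid`, `range_trace` and `greedy_valid` are hypothetical and exist only for this illustration.

```python
from itertools import product


def is_balanced(t: str) -> bool:
    """Classic single-counter check for a string of only '(' and ')'."""
    depth = 0
    for ch in t:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0


def brute_force_valid(s: str) -> bool:
    """Try every replacement of each '*' with '(', ')' or '' (exponential)."""
    for choice in product(["(", ")", ""], repeat=s.count("*")):
        it = iter(choice)
        candidate = "".join(next(it) if ch == "*" else ch for ch in s)
        if is_balanced(candidate):
            return True
    return False


def range_trace(s: str) -> list:
    """Return the (lo, hi) pair after each character, stopping once hi < 0."""
    lo = hi = 0
    trace = []
    for ch in s:
        lo += 1 if ch == "(" else -1   # ')' and '*' both lower the minimum
        hi += -1 if ch == ")" else 1   # '(' and '*' both raise the maximum
        lo = max(lo, 0)                # clamp: reinterpret a '*' as empty
        trace.append((lo, hi))
        if hi < 0:                     # a ')' that nothing can match
            break
    return trace


def greedy_valid(s: str) -> bool:
    """Greedy verdict derived from the last recorded (lo, hi) pair."""
    trace = range_trace(s)
    if not trace:
        return True                    # empty string is trivially valid
    lo, hi = trace[-1]
    return hi >= 0 and lo == 0


if __name__ == "__main__":
    print(range_trace("(*))"))  # [(1, 1), (0, 2), (0, 1), (0, 0)]
    print(greedy_valid("(*))"), brute_force_valid("(*))"))  # True True
```

The brute-force helper is only practical for short inputs, which is exactly the first pitfall listed in `common_pitfalls`; it is used here purely as an oracle for the linear-time version.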