title: Change Minimum Characters to Satisfy One of Three Conditions
slug: change-minimum-characters-to-satisfy-one-of-three-conditions
difficulty: medium
leetcode_id: 1737
leetcode_url: https://leetcode.com/problems/change-minimum-characters-to-satisfy-one-of-three-conditions/
categories:
- strings
- hash-tables
patterns:
- slug: prefix-sum
is_optimal: true

function_signature: "def min_characters(a: str, b: str) -> int:"

test_cases:
visible:
- input: { a: "aba", b: "caa" }
expected: 2
- input: { a: "dabadd", b: "cda" }
expected: 3
hidden:
- input: { a: "a", b: "b" }
expected: 0
- input: { a: "aaa", b: "aaa" }
expected: 0
- input: { a: "abc", b: "abc" }
expected: 3
- input: { a: "a", b: "a" }
expected: 0
- input: { a: "z", b: "a" }
expected: 0
- input: { a: "abc", b: "def" }
expected: 0

description: |
You are given two strings `a` and `b` that consist of lowercase letters. In one operation, you can change any character in `a` or `b` to **any lowercase letter**.

Your goal is to satisfy **one** of the following three conditions:

- **Every** letter in `a` is **strictly less** than **every** letter in `b` in the alphabet.
- **Every** letter in `b` is **strictly less** than **every** letter in `a` in the alphabet.
- **Both** `a` and `b` consist of **only one** distinct letter.

Return *the **minimum** number of operations needed to achieve your goal*.

constraints: |
- `1 <= a.length, b.length <= 10^5`
- `a` and `b` consist only of lowercase letters.

examples:
- input: 'a = "aba", b = "caa"'
output: "2"
explanation: "Consider the best way to make each condition true: 1) Change b to \"ccc\" in 2 operations, then every letter in a is less than every letter in b. 2) Change a to \"bbb\" and b to \"aaa\" in 3 operations, then every letter in b is less than every letter in a. 3) Change a to \"aaa\" and b to \"aaa\" in 2 operations, then a and b consist of one distinct letter. The best way was done in 2 operations (either condition 1 or condition 3)."
- input: 'a = "dabadd", b = "cda"'
output: "3"
explanation: 'The best way is to make condition 1 true by changing b to "eee".'

explanation:
intuition: |
Imagine the 26 lowercase letters arranged on a number line from `a` (0) to `z` (25). Each condition restricts where the letters of `a` and `b` may sit on that line, and every character sitting in the wrong place costs one operation to change.

For **conditions 1 and 2**, we need to find a "dividing line" between two adjacent letters where all characters in one string fall below the line and all characters in the other string fall on or above it. Think of it like separating two groups of students by height: we pick a threshold, and everyone shorter goes left while everyone else goes right.

For **condition 3**, we're making both strings consist of the same single letter, like painting everything the same colour. The cost is the total number of characters that aren't already that colour.

The key insight is that we don't need to try every possible way to transform the strings. Instead, we can:
- Count character frequencies in both strings
- For conditions 1 and 2, try each possible "dividing point" between adjacent letters (25 choices: between 'a' and 'b', between 'b' and 'c', etc.)
- For condition 3, try each letter as the target (26 choices)

Prefix sums over the frequency counts tell us in O(1) how many characters would need to change for each dividing point.

approach: |
We solve this by evaluating all three conditions and taking the minimum cost.

**Step 1: Count character frequencies**

- `count_a[i]`: Number of occurrences of the i<sup>th</sup> letter (0-indexed, where 0 = 'a') in string `a`
- `count_b[i]`: Number of occurrences of the i<sup>th</sup> letter in string `b`

**Step 2: Compute prefix sums**

- `prefix_a[i]`: Total characters in `a` that are strictly less than the i<sup>th</sup> letter
- `prefix_b[i]`: Total characters in `b` that are strictly less than the i<sup>th</sup> letter

Each prefix array has 27 entries (`i` from 0 to 26), so `prefix_a[0]` is 0 and `prefix_a[26]` is `len(a)`. These let us quickly answer: "How many characters in string X are below letter Y?"
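
As a minimal sketch of Steps 1 and 2 for a single string, the helper below builds the 27-entry prefix array from the frequency counts. The name `letters_below` is a hypothetical one chosen for this illustration; it does not appear in the solution code.

```python
from typing import List


def letters_below(s: str) -> List[int]:
    """Return prefix where prefix[i] = number of chars in s strictly below letter i."""
    # Step 1: frequency of each lowercase letter (index 0 = 'a')
    count = [0] * 26
    for c in s:
        count[ord(c) - ord('a')] += 1

    # Step 2: running totals, shifted by one so prefix[0] is always 0
    prefix = [0] * 27
    for i in range(26):
        prefix[i + 1] = prefix[i] + count[i]
    return prefix


prefix_a = letters_below("dabadd")
print(prefix_a[1])  # 2: the two 'a' characters are below 'b'
print(prefix_a[3])  # 3: 'a', 'b', 'a' are below 'd'
print(prefix_a[4])  # 6: every character is below 'e'
```

Shifting the array by one keeps "strictly below letter `i`" as a plain index lookup, with no special case for `i = 0`.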

**Step 3: Evaluate condition 1 (all of `a` < all of `b`)**

- For each possible dividing point `i` from 1 to 25 (between letters):
  - Characters in `a` that need to change: those >= letter `i` → `len(a) - prefix_a[i]`
  - Characters in `b` that need to change: those < letter `i` → `prefix_b[i]`
  - Total cost: `len(a) - prefix_a[i] + prefix_b[i]`
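
To make the Step 3 formula concrete, this sketch lists the condition 1 cost at every dividing point for the first example (`a = "aba"`, `b = "caa"`). The function name `condition_one_costs` and its inner helper `below` are assumptions made for this illustration.

```python
from itertools import accumulate
from typing import List


def condition_one_costs(a: str, b: str) -> List[int]:
    """Cost of making every letter of a < every letter of b, per dividing point 1..25."""
    def below(s: str) -> List[int]:
        count = [0] * 26
        for c in s:
            count[ord(c) - ord('a')] += 1
        # 27 entries: below(s)[i] = chars of s strictly below letter i
        return [0] + list(accumulate(count))

    prefix_a, prefix_b = below(a), below(b)
    # chars of a at or above i must move down; chars of b below i must move up
    return [len(a) - prefix_a[i] + prefix_b[i] for i in range(1, 26)]


costs = condition_one_costs("aba", "caa")
print(costs[:3])   # [3, 2, 3] for dividing points 1, 2, 3
print(min(costs))  # 2, reached at i = 2 (a stays below 'c', b becomes "ccc")
```

Calling the same function with the arguments swapped gives the condition 2 costs.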

**Step 4: Evaluate condition 2 (all of `b` < all of `a`)**

- Same logic but swap `a` and `b`
- Cost for dividing point `i`: `len(b) - prefix_b[i] + prefix_a[i]`

**Step 5: Evaluate condition 3 (both strings same letter)**

- For each letter `i` from 0 to 25:
  - Cost = total characters minus those already equal to letter `i`
  - Cost = `len(a) + len(b) - count_a[i] - count_b[i]`
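
The condition 3 cost can also be sketched with `collections.Counter`: minimising over all 26 targets is the same as keeping the letter that is most common across both strings. The name `same_letter_cost` is hypothetical and used only for this illustration.

```python
from collections import Counter


def same_letter_cost(a: str, b: str) -> int:
    """Operations needed so that a and b both consist of one shared letter."""
    combined = Counter(a) + Counter(b)  # occurrences of each letter across both strings
    # Keep the most common letter; every other character must change.
    return len(a) + len(b) - max(combined.values())


print(same_letter_cost("aba", "caa"))  # 2: four of the six characters are already 'a'
print(same_letter_cost("aaa", "aaa"))  # 0
```

A target letter that appears in neither string costs `len(a) + len(b)`, so it never beats a letter that is present, which is why taking the maximum over the observed letters is enough.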

**Step 6: Return the minimum**

- Return the minimum cost across all evaluated options

common_pitfalls:
- title: Off-by-One Errors in Dividing Points
description: |
When checking conditions 1 and 2, the dividing point must be *between* letters, not *at* a letter. We iterate `i` from 1 to 25 (not 0 to 26) because:
- At `i = 1`, all of `a` must be 'a' (letter 0), and all of `b` must be >= 'b' (letter 1)
- At `i = 25`, all of `a` must be < 'z', and all of `b` must be 'z'

There's no valid dividing point at `i = 0` (no letter is less than 'a') or at `i = 26` (no letter is greater than 'z'). Including either one treats an impossible target as free and undercounts the cost.
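
The effect of the wrong range can be seen with a small experiment. The sketch below takes the dividing points as a parameter so the two ranges can be compared; `min_ops` and its `dividing_points` argument are hypothetical names for this illustration, and the input `a = "m"`, `b = "aazz"` is an example chosen here, not one of the problem's test cases.

```python
def min_ops(a: str, b: str, dividing_points: range) -> int:
    """Minimum cost over conditions 1-3, trying only the given dividing points."""
    count_a = [0] * 26
    count_b = [0] * 26
    for c in a:
        count_a[ord(c) - ord('a')] += 1
    for c in b:
        count_b[ord(c) - ord('a')] += 1

    # prefix[i] = number of characters strictly below letter i
    prefix_a = [0] * 27
    prefix_b = [0] * 27
    for i in range(26):
        prefix_a[i + 1] = prefix_a[i] + count_a[i]
        prefix_b[i + 1] = prefix_b[i] + count_b[i]

    # Condition 3: both strings become one shared letter
    best = min(len(a) + len(b) - count_a[i] - count_b[i] for i in range(26))
    for i in dividing_points:
        best = min(best,
                   len(a) - prefix_a[i] + prefix_b[i],   # condition 1
                   len(b) - prefix_b[i] + prefix_a[i])   # condition 2
    return best


print(min_ops("m", "aazz", range(1, 26)))  # 2: the correct answer
print(min_ops("m", "aazz", range(0, 27)))  # 1: too low, no single change can work
```

With `i = 0` the formula only charges for changing `a`, as if `a` could be moved below 'a', so the result drops to an unreachable 1.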
wrong_approach: "Iterate from 0 to 26"
correct_approach: "Iterate dividing points from 1 to 25"

- title: Confusing Strict Less-Than with Less-Than-or-Equal
description: |
The condition says "strictly less than", meaning if we pick dividing point `i`, all characters in `a` must have values `< i`, and all in `b` must have values `>= i`. Including equality on the `a` side lets both strings share the letter at the dividing point, which violates the condition and undercounts the cost.

For example, if dividing at 'c' (index 2), string `a` can only have 'a' or 'b', not 'c'.
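
A small checker makes the strictness explicit. This is only a sketch for validating a candidate result; `is_valid_split` is a hypothetical helper and not part of the solution.

```python
def is_valid_split(a: str, b: str, i: int) -> bool:
    """True if every letter of a has index < i and every letter of b has index >= i."""
    return (all(ord(c) - ord('a') < i for c in a)
            and all(ord(c) - ord('a') >= i for c in b))


# Dividing at 'c' (index 2)
print(is_valid_split("ab", "cd", 2))   # True: 'a', 'b' are below 'c'; 'c', 'd' are not
print(is_valid_split("abc", "cd", 2))  # False: the 'c' in a is not strictly below 'c'
```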
wrong_approach: "Allow a to have characters <= dividing point"
correct_approach: "String a must have all characters < dividing point"

- title: Forgetting Condition 3
description: |
Some solutions focus only on the "separation" conditions and forget that making both strings the same letter might be cheaper. For example, if both strings are `"aaa"`, condition 3 costs 0, while conditions 1 and 2 would each require 3 changes.
wrong_approach: "Only check conditions 1 and 2"
correct_approach: "Check all three conditions and take the minimum"

key_takeaways:
- "**Prefix sums for range queries**: Pre-computing cumulative sums allows O(1) lookups for 'how many elements are below threshold X'"
- "**Enumerate the boundary**: When splitting data into two groups by a threshold, try all possible thresholds (26 letters means 25 dividing points)"
- "**Evaluate all options**: When a problem has multiple valid end states (three conditions here), compute the cost for each and take the minimum"
- "**Character frequency counting**: A common pattern for string problems is to count frequencies first, then process the counts"

time_complexity: "O(n + m). We count frequencies in O(n + m) where n and m are the string lengths, then iterate through 26 letters a constant number of times."
space_complexity: "O(1). We use fixed-size arrays of length 26 for frequency counts and prefix sums, independent of input size."

solutions:
- approach_name: Prefix Sum with Frequency Counting
is_optimal: true
code: |
def min_characters(a: str, b: str) -> int:
    # Count frequency of each letter in both strings
    count_a = [0] * 26
    count_b = [0] * 26

    for c in a:
        count_a[ord(c) - ord('a')] += 1
    for c in b:
        count_b[ord(c) - ord('a')] += 1

    # Build prefix sums: prefix[i] = count of chars < letter i
    prefix_a = [0] * 27
    prefix_b = [0] * 27
    for i in range(26):
        prefix_a[i + 1] = prefix_a[i] + count_a[i]
        prefix_b[i + 1] = prefix_b[i] + count_b[i]

    len_a, len_b = len(a), len(b)
    result = len_a + len_b  # Worst case: change everything

    # Condition 1: all of a < all of b
    # Condition 2: all of b < all of a
    # Try each dividing point from 1 to 25
    for i in range(1, 26):
        # Condition 1: a's chars must be < i, b's chars must be >= i
        cost1 = (len_a - prefix_a[i]) + prefix_b[i]
        # Condition 2: b's chars must be < i, a's chars must be >= i
        cost2 = (len_b - prefix_b[i]) + prefix_a[i]
        result = min(result, cost1, cost2)

    # Condition 3: both strings become same letter
    for i in range(26):
        # Cost = chars not already equal to letter i
        cost3 = (len_a - count_a[i]) + (len_b - count_b[i])
        result = min(result, cost3)

    return result
explanation: |
**Time Complexity:** O(n + m). Linear pass to count frequencies, then constant-time operations over 26 letters.

**Space Complexity:** O(1). Fixed arrays of size 26/27 regardless of input size.

We count character frequencies, build prefix sums, then evaluate all possible ways to satisfy each condition. The prefix sums let us efficiently compute how many characters fall below any threshold.

- approach_name: Brute Force (Conceptual)
is_optimal: false
code: |
def min_characters_brute(a: str, b: str) -> int:
    # This is a conceptual illustration - actual brute force
    # would enumerate all possible transformations

    # For each condition, try all valid target configurations
    result = len(a) + len(b)

    # Condition 3: try each letter as the target
    for target in range(26):
        cost = 0
        for c in a:
            if ord(c) - ord('a') != target:
                cost += 1
        for c in b:
            if ord(c) - ord('a') != target:
                cost += 1
        result = min(result, cost)

    # Condition 1: try each dividing point
    for div in range(1, 26):
        cost = 0
        # Count chars in a that are >= div (need to change)
        for c in a:
            if ord(c) - ord('a') >= div:
                cost += 1
        # Count chars in b that are < div (need to change)
        for c in b:
            if ord(c) - ord('a') < div:
                cost += 1
        result = min(result, cost)

    # Condition 2: similar but swap roles
    for div in range(1, 26):
        cost = 0
        for c in b:
            if ord(c) - ord('a') >= div:
                cost += 1
        for c in a:
            if ord(c) - ord('a') < div:
                cost += 1
        result = min(result, cost)

    return result
explanation: |
**Time Complexity:** O(26 × (n + m)). Each of the 26 target letters, and each of the 25 dividing points for conditions 1 and 2, triggers a full scan of both strings (76 scans in total).

**Space Complexity:** O(1). No additional data structures.

This approach recounts characters for each target/dividing point instead of pre-computing frequencies. While still linear in terms of big-O (since 26 is constant), it's less efficient than the prefix sum approach because it repeatedly scans the strings.
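
One way to gain confidence that the two approaches agree is a randomized cross-check. The sketch below uses compact re-implementations of both ideas; the names `fast`, `slow`, and `cross_check`, the six-letter test alphabet, and the trial count are all assumptions made for this illustration.

```python
import random


def fast(a: str, b: str) -> int:
    """Counting version: one pass to count, then 26 targets and 25 dividing points."""
    count_a = [0] * 26
    count_b = [0] * 26
    for c in a:
        count_a[ord(c) - ord('a')] += 1
    for c in b:
        count_b[ord(c) - ord('a')] += 1
    best = min(len(a) + len(b) - count_a[i] - count_b[i] for i in range(26))
    below_a = below_b = 0  # running prefix sums: chars strictly below letter i
    for i in range(1, 26):
        below_a += count_a[i - 1]
        below_b += count_b[i - 1]
        best = min(best,
                   len(a) - below_a + below_b,   # condition 1
                   len(b) - below_b + below_a)   # condition 2
    return best


def slow(a: str, b: str) -> int:
    """Rescanning version: recount both strings for every candidate."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    best = min(sum(c != t for c in a + b) for t in letters)
    for t in letters[1:]:  # dividing points 'b' through 'z'
        best = min(best,
                   sum(c >= t for c in a) + sum(c < t for c in b),
                   sum(c >= t for c in b) + sum(c < t for c in a))
    return best


def cross_check(trials: int = 200, seed: int = 0) -> bool:
    """Compare both versions on short random strings over a small alphabet."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = "".join(rng.choice("abcxyz") for _ in range(rng.randint(1, 6)))
        b = "".join(rng.choice("abcxyz") for _ in range(rng.randint(1, 6)))
        if fast(a, b) != slow(a, b):
            return False
    return True


print(cross_check())
```

Short strings over a small alphabet that includes both 'a' and 'z' tend to exercise the boundary dividing points, where off-by-one mistakes usually show up.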