# codetutor/backend/data/questions/longest-common-prefix.yaml

title: Longest Common Prefix
slug: longest-common-prefix
difficulty: easy
leetcode_id: 14
leetcode_url: https://leetcode.com/problems/longest-common-prefix/
categories:
- strings
- arrays
patterns:
- slug: two-pointers
is_optimal: true
function_signature: "def longest_common_prefix(strs: list[str]) -> str:"
test_cases:
visible:
- input: { strs: ["flower", "flow", "flight"] }
expected: "fl"
- input: { strs: ["dog", "racecar", "car"] }
expected: ""
hidden:
- input: { strs: ["a"] }
expected: "a"
- input: { strs: ["", "b"] }
expected: ""
- input: { strs: ["abc", "abc", "abc"] }
expected: "abc"
- input: { strs: ["ab", "a"] }
expected: "a"
- input: { strs: ["cir", "car"] }
expected: "c"
description: |
Write a function to find the longest common prefix string amongst an array of strings.
If there is no common prefix, return an empty string `""`.
constraints: |
- `1 <= strs.length <= 200`
- `0 <= strs[i].length <= 200`
- `strs[i]` consists of only lowercase English letters if it is non-empty.
examples:
- input: 'strs = ["flower","flow","flight"]'
output: '"fl"'
explanation: "The first two characters 'f' and 'l' are common to all three strings."
- input: 'strs = ["dog","racecar","car"]'
output: '""'
explanation: "There is no common prefix among the input strings."
explanation:
intuition: |
Imagine you have a stack of papers, each with a word written on it. You want to find how many letters at the start of each word are exactly the same across all papers.
Think of it like aligning all the words vertically by their first character:
```
f l o w e r
f l o w
f l i g h t
```
You scan column by column from left to right. As long as every word has the same character in that column, you include it in your prefix. The moment you find a mismatch (like 'o' vs 'i' in column 3 above), you stop — everything before that point is your longest common prefix.
The key insight is that the common prefix can be at most as long as the **shortest string** in the array, and we can stop as soon as any character differs.
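The column view can be sketched in Python with the built-in `zip`, which groups the characters at each position and stops at the shortest word. This is only an illustration of the intuition, not one of the reference solutions; the helper name `common_prefix_by_columns` is hypothetical.

```python
def common_prefix_by_columns(strs: list[str]) -> str:
    """Illustrative helper (hypothetical name): scan columns left to right."""
    prefix_chars = []
    # zip(*strs) yields one tuple per column and stops at the shortest string.
    for column in zip(*strs):
        # A column matches only if it holds a single distinct character.
        if len(set(column)) != 1:
            break
        prefix_chars.append(column[0])
    return "".join(prefix_chars)


print(common_prefix_by_columns(["flower", "flow", "flight"]))  # prints: fl
```

Because `zip` never looks past the shortest word, the bound on the prefix length comes for free in this sketch.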
approach: |
We solve this using a **Vertical Scanning** approach:
**Step 1: Handle edge case**
- If the input array is empty, return an empty string `""`
&nbsp;
**Step 2: Iterate character by character**
- Use the first string as a reference
- For each character position `i` in the first string, compare it against the character at position `i` in every other string
&nbsp;
**Step 3: Check for mismatches or end of string**
- If any string has no character at position `i` (its length is `i` or less), that string is exhausted: return the prefix found so far
- If any string has a different character at position `i`, that is a mismatch: return the prefix found so far
&nbsp;
**Step 4: Build the prefix**
- If all strings match at position `i`, continue to the next position
- After checking all positions in the first string, return it entirely (it's the common prefix)
&nbsp;
This approach efficiently scans vertically through all strings simultaneously, stopping at the first point of divergence.
common_pitfalls:
- title: Forgetting the Empty Array Case
description: |
If the input array is empty, there are no strings to compare, and accessing `strs[0]` raises an `IndexError`.
The stated constraints guarantee at least one string, but checking for an empty array first and returning `""` immediately keeps the function safe for any caller.
wrong_approach: "Directly accessing strs[0] without checking array length"
correct_approach: "Check if strs is empty before processing"
- title: Index Out of Bounds on Shorter Strings
description: |
When comparing character by character, some strings may be shorter than others. For example, with `["ab", "a"]`, reading index 1 of the second string raises an `IndexError`.
Always verify that the current index is within bounds for each string before accessing it, for example by stopping as soon as `i >= len(strs[j])`.
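The failure and the guard can be reproduced in a few lines. This is a small sketch; the helper name `char_matches` is hypothetical and is not part of the solutions in this file.

```python
strs = ["ab", "a"]

# Unguarded access: index 1 does not exist in "a".
try:
    strs[1][1]
except IndexError as exc:
    print(f"IndexError: {exc}")


def char_matches(s: str, i: int, char: str) -> bool:
    """Hypothetical helper: True only if s has position i and s[i] == char."""
    # The length check runs first, so s[i] is never evaluated out of bounds.
    return i < len(s) and s[i] == char


print(char_matches("a", 1, "b"))   # prints: False
print(char_matches("ab", 1, "b"))  # prints: True
```

The order of the two conditions matters: Python's `and` short-circuits, so the index is only read once the bounds check has passed.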
wrong_approach: "Accessing strs[j][i] without checking length"
correct_approach: "Check i < len(strs[j]) before accessing the character"
- title: Using Horizontal Scanning Inefficiently
description: |
A horizontal approach compares strings pairwise: find the common prefix of strings 1 and 2, then compare that result with string 3, and so on.
While correct, this can be less efficient in practice. If the first two strings share a long prefix but string 3 diverges at its first character, the pairwise pass has already compared that whole long prefix before learning that the answer is short. Vertical scanning stops at the first column that contains a mismatch in any string.
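To make the difference concrete, the sketch below counts character comparisons in simplified versions of both scans. The counting helpers are hypothetical, compare characters one at a time rather than calling `startswith`, and assume `strs` is non-empty, as the constraints state.

```python
def vertical_comparisons(strs: list[str]) -> int:
    """Count character comparisons made by a column-by-column scan."""
    count = 0
    for i in range(len(strs[0])):
        for s in strs[1:]:
            if i >= len(s):
                return count
            count += 1
            if s[i] != strs[0][i]:
                return count
    return count


def horizontal_comparisons(strs: list[str]) -> int:
    """Count character comparisons made by a pairwise scan."""
    count = 0
    prefix_len = len(strs[0])  # current prefix is strs[0][:prefix_len]
    for s in strs[1:]:
        i = 0
        while i < min(prefix_len, len(s)):
            count += 1
            if s[i] != strs[0][i]:
                break
            i += 1
        prefix_len = i  # number of leading characters that matched
    return count


words = ["abcdefgh", "abcdefgh", "xyz"]
print(vertical_comparisons(words))    # prints: 2
print(horizontal_comparisons(words))  # prints: 9
```

On this input the pairwise scan spends eight comparisons on the first two strings before the third string shows that the prefix is empty, while the column scan finds the mismatch in the first column.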
wrong_approach: "Pairwise comparison accumulating prefixes"
correct_approach: "Vertical scanning comparing all strings at each position"
key_takeaways:
- "**Vertical scanning pattern**: When comparing multiple sequences, scanning position-by-position across all sequences simultaneously can be more efficient than pairwise comparison"
- "**Early termination**: Stop as soon as you find a mismatch or reach the end of any string — no need to process further"
- "**Use the shortest string**: The common prefix can never be longer than the shortest string, so checking bounds is essential"
- "**Foundation for string problems**: This pattern of character-by-character comparison appears in many string matching problems"
time_complexity: "O(S), where S is the total number of characters across all strings. In the worst case, all strings are identical and every character is compared."
space_complexity: "O(1). We only use a few variables for iteration, not counting the output string."
solutions:
- approach_name: Vertical Scanning
is_optimal: true
code: |
def longest_common_prefix(strs: list[str]) -> str:
    # Handle empty input
    if not strs:
        return ""
    # Use the first string as reference
    for i in range(len(strs[0])):
        char = strs[0][i]
        # Compare this character with all other strings
        for j in range(1, len(strs)):
            # Check if we've reached the end of this string
            # or if the characters don't match
            if i >= len(strs[j]) or strs[j][i] != char:
                # Return prefix up to (but not including) position i
                return strs[0][:i]
    # All characters in first string matched all other strings
    return strs[0]
explanation: |
**Time Complexity:** O(S), where S is the total number of characters across all strings. Each character is compared at most once.
**Space Complexity:** O(1) — only using index variables, not counting the output.
We scan vertically through all strings at each character position. The moment we find any mismatch or reach the end of any string, we return what we've found so far.
- approach_name: Horizontal Scanning
is_optimal: false
code: |
def longest_common_prefix(strs: list[str]) -> str:
    # Handle empty input
    if not strs:
        return ""
    # Start with the first string as the initial prefix
    prefix = strs[0]
    # Compare prefix with each subsequent string
    for i in range(1, len(strs)):
        # Shrink prefix until it matches the start of current string
        while not strs[i].startswith(prefix):
            # Remove last character from prefix
            prefix = prefix[:-1]
            # No common prefix exists
            if not prefix:
                return ""
    return prefix
explanation: |
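**Time Complexity:** O(S) character comparisons in the usual analysis, where S is the total number of characters. This implementation, however, shrinks the prefix one character at a time, and each `startswith` call and each `prefix[:-1]` slice is itself linear in the prefix length, so a single string can cost O(m²) for a prefix of length m.
**Space Complexity:** O(m) for the temporary prefix copies created by slicing, where m is the length of the first string. No other auxiliary storage is used.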
This approach starts with the first string as the candidate prefix and progressively shortens it until it matches the beginning of each subsequent string. While correct, it may do more work than vertical scanning when early strings share a long prefix but later strings diverge early.
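The shrinking loop above removes one character per iteration. A possible variant, sketched here under the assumption that tracking a prefix length is acceptable, measures how far each string matches and slices only once at the end. The function name `longest_common_prefix_single_slice` is hypothetical.

```python
def longest_common_prefix_single_slice(strs: list[str]) -> str:
    """Horizontal scan variant (hypothetical name) that tracks a length."""
    if not strs:
        return ""
    first = strs[0]
    prefix_len = len(first)
    for s in strs[1:]:
        # The prefix cannot extend past the end of the current string.
        prefix_len = min(prefix_len, len(s))
        i = 0
        # Advance while the current string agrees with the first string.
        while i < prefix_len and s[i] == first[i]:
            i += 1
        prefix_len = i
        if prefix_len == 0:
            return ""
    # Slice once, after the final length is known.
    return first[:prefix_len]


print(longest_common_prefix_single_slice(["flower", "flow", "flight"]))  # prints: fl
```

This keeps the pairwise structure of horizontal scanning but touches each character of each string at most once.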
- approach_name: Binary Search
is_optimal: false
code: |
def longest_common_prefix(strs: list[str]) -> str:
    # Handle empty input
    if not strs:
        return ""

    def is_common_prefix(length: int) -> bool:
        """Check if first 'length' chars of strs[0] is a prefix of all strings."""
        prefix = strs[0][:length]
        return all(s.startswith(prefix) for s in strs)

    # Find the minimum string length
    min_len = min(len(s) for s in strs)
    # Binary search for the longest valid prefix length
    low, high = 0, min_len
    while low < high:
        # Use upper middle to avoid infinite loop
        mid = (low + high + 1) // 2
        if is_common_prefix(mid):
            # Prefix of this length works, try longer
            low = mid
        else:
            # Prefix too long, try shorter
            high = mid - 1
    return strs[0][:low]
explanation: |
**Time Complexity:** O(S * log(m)), where S is the total number of characters and m is the minimum string length. Binary search runs O(log m) iterations, and each iteration checks a candidate prefix against every string.
**Space Complexity:** O(m) for the temporary prefix slice built in each check; the search itself uses only a few integer variables.
This approach uses binary search on the length of the prefix. While theoretically interesting, it's generally slower in practice than vertical scanning because it may repeatedly check the same characters. Included to demonstrate how binary search can apply to string problems.
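Binary search applies here because the check is monotonic: if a prefix of some length is common to all strings, every shorter prefix is too. The short sketch below evaluates the same kind of check at every length for one of the visible test cases; the variable name `outcomes` is only illustrative.

```python
strs = ["flower", "flow", "flight"]
min_len = min(len(s) for s in strs)  # 4, from "flow"


def is_common_prefix(length: int) -> bool:
    """Check whether the first `length` chars of strs[0] prefix every string."""
    prefix = strs[0][:length]
    return all(s.startswith(prefix) for s in strs)


# Evaluate the predicate for every candidate length 0..min_len.
outcomes = [is_common_prefix(k) for k in range(min_len + 1)]
print(outcomes)  # prints: [True, True, True, False, False]
```

The results form a run of `True` values followed by a run of `False` values, and binary search finds the last `True` without testing every length.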