codetutor/backend/data/questions/longest-substring-without-repeating.yaml

title: Longest Substring Without Repeating Characters
slug: longest-substring-without-repeating
difficulty: medium
leetcode_id: 3
leetcode_url: https://leetcode.com/problems/longest-substring-without-repeating-characters/
categories:
- strings
- hash-tables
patterns:
- sliding-window
description: |
Given a string `s`, find the length of the **longest substring** without repeating characters.
constraints: |
- `0 <= s.length <= 5 × 10^4`
- `s` consists of English letters, digits, symbols and spaces
examples:
- input: 's = "abcabcbb"'
output: "3"
explanation: "The answer is 'abc', with length 3."
- input: 's = "bbbbb"'
output: "1"
explanation: "The answer is 'b', with length 1."
- input: 's = "pwwkew"'
output: "3"
explanation: "The answer is 'wke', with length 3. Note that 'pwke' is a subsequence, not a substring."
explanation:
intuition: |
Imagine a window sliding across the string. The window represents our current substring candidate. We want to expand this window as much as possible while keeping all characters inside it unique.
Think of it like this: you're scanning through a document with a highlighter. You want to find the longest stretch you can highlight where no letter appears twice. When you hit a repeat, you need to move the start of your highlight forward until the duplicate is gone.
The key insight is that we don't need to restart from scratch when we find a duplicate. If we've seen 'a' before at position 3 (still inside the window), and we see 'a' again at position 7, we just need to move our window's left edge past position 3. The characters at positions 4 through 7 still form a valid window, so none of that work is thrown away.
This is the **sliding window** pattern: expand the right edge to explore, contract the left edge to maintain validity.
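For contrast, here is a minimal sketch of the "restart from scratch" strategy that the sliding window avoids. The function name `longest_unique_brute_force` is hypothetical and not part of this question's solutions; it is only a baseline that is handy for cross-checking answers on small inputs.

```python
def longest_unique_brute_force(s: str) -> int:
    """Baseline: try every start index and rescan from there."""
    best = 0
    for start in range(len(s)):
        seen = set()
        for end in range(start, len(s)):
            if s[end] in seen:
                # Duplicate found: abandon this start and begin again at start + 1,
                # re-reading characters the sliding window would have kept.
                break
            seen.add(s[end])
        best = max(best, len(seen))
    return best


print(longest_unique_brute_force("pwwkew"))
```

Every start index triggers a fresh scan, so the same characters are read many times. The sliding window gets the same answers while moving each pointer forward only.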
approach: |
We solve this using a **Sliding Window with a Set**:
**Step 1: Initialise the window and tracking**
- `left = 0`: Left edge of our window
- `char_set = set()`: Characters currently in our window
- `max_length = 0`: Best length found so far
- The right edge is controlled by our loop iteration
&nbsp;
**Step 2: Expand the window (right pointer)**
- For each character at position `right`:
- If `s[right]` is already in `char_set`, we have a duplicate
- Before adding it, we must shrink from the left until the duplicate is removed
&nbsp;
**Step 3: Shrink the window (left pointer)**
- While `s[right]` is in `char_set`:
- Remove `s[left]` from the set
- Increment `left`
- This "slides" the window past the previous occurrence
&nbsp;
**Step 4: Add current character and update maximum**
- Add `s[right]` to `char_set`
- Update `max_length = max(max_length, right - left + 1)`
&nbsp;
The window always contains unique characters, and we track the maximum size it achieves.
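To watch the four steps in action, the following sketch yields the window after each character is processed. The generator name `trace_windows` is an assumption made for illustration; the loop body mirrors the steps described above.

```python
from typing import Iterator, Tuple


def trace_windows(s: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (left, right, window) after each character is added."""
    char_set = set()
    left = 0
    for right in range(len(s)):
        # Step 3: shrink from the left until s[right] is no longer in the window
        while s[right] in char_set:
            char_set.remove(s[left])
            left += 1
        # Step 4: add the current character
        char_set.add(s[right])
        yield left, right, s[left:right + 1]


for left, right, window in trace_windows("pwwkew"):
    print(left, right, window)
```

For `"pwwkew"` the windows are `p`, `pw`, `w`, `wk`, `wke`, `kew`, so the longest has length 3.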
common_pitfalls:
- title: Resetting the Window Completely on Duplicate
description: |
A common mistake is to set `left = right` when finding a duplicate, effectively restarting the search. This loses valid characters that could still be part of a longer substring.
For example, in `"abcdb"`, when we hit the second `'b'`, we should move `left` from 0 to 2 (just past the first `'b'`), keeping `"cd"` in our window. Resetting to `left = right` would discard `"cd"` unnecessarily.
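A small side-by-side sketch makes the difference measurable. Both function names (`reset_on_duplicate`, `shrink_on_duplicate`) are hypothetical and exist only for this comparison.

```python
def reset_on_duplicate(s: str) -> int:
    """Buggy variant: throws the whole window away on a duplicate."""
    seen = set()
    left = 0
    best = 0
    for right, ch in enumerate(s):
        if ch in seen:
            seen.clear()
            left = right  # discards characters that are still valid
        seen.add(ch)
        best = max(best, right - left + 1)
    return best


def shrink_on_duplicate(s: str) -> int:
    """Correct variant: advance left only until the duplicate is gone."""
    seen = set()
    left = 0
    best = 0
    for right, ch in enumerate(s):
        while ch in seen:
            seen.remove(s[left])
            left += 1
        seen.add(ch)
        best = max(best, right - left + 1)
    return best


print(reset_on_duplicate("dvdf"), shrink_on_duplicate("dvdf"))
print(reset_on_duplicate("abcdbefg"), shrink_on_duplicate("abcdbefg"))
```

On `"dvdf"` the reset variant reports 2 while the correct answer is 3 (`"vdf"`). On `"abcdbefg"` it reports 4 instead of 6 (`"cdbefg"`). Note that `"abcdb"` on its own returns 4 from both variants, because its best window `"abcd"` ends before the duplicate, so a longer input is needed to expose the bug.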
wrong_approach: "left = right when duplicate found"
correct_approach: "Increment left until duplicate is removed from window"
- title: Off-by-One in Length Calculation
description: |
The length of a window from index `left` to `right` (inclusive) is `right - left + 1`, not `right - left`.
For `left = 2, right = 5`, the substring has 4 characters (indices 2, 3, 4, 5), not 3.
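The arithmetic can be checked directly with a slice. The sample string below is arbitrary; Python slices exclude the end index, which is why the inclusive window needs `right + 1`.

```python
s = "abcdefgh"  # arbitrary sample string
left, right = 2, 5

# Inclusive window [left, right] corresponds to the slice s[left:right + 1]
window = s[left:right + 1]

print(window, len(window), right - left + 1)
```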
wrong_approach: "max_length = max(max_length, right - left)"
correct_approach: "max_length = max(max_length, right - left + 1)"
- title: Not Handling Empty String
description: |
An empty string `""` should return `0`. The algorithm handles this naturally (the loop never executes), but it's worth verifying.
Similarly, a single character `"a"` should return `1`.
wrong_approach: "Assuming string has at least one character"
correct_approach: "Algorithm works for empty strings — returns 0"
key_takeaways:
- "**Sliding window for substrings**: When looking for contiguous sequences with constraints, sliding window is often the answer"
- "**Expand and contract**: Right pointer explores, left pointer maintains validity"
- "**Set for uniqueness checking**: O(1) membership testing makes the algorithm efficient"
- "**Optimisation with hash map**: Store the last index of each character to jump `left` directly instead of incrementing"
time_complexity: "O(n). Each character is visited at most twice — once by the right pointer, once by the left pointer."
space_complexity: "O(min(m, n)). The set holds at most min(n, m) characters, where m is the character set size (e.g., 26 for lowercase letters, 128 for ASCII)."
solutions:
- approach_name: Sliding Window with Set
is_optimal: true
code: |
def length_of_longest_substring(s: str) -> int:
    # Set to track characters in current window
    char_set = set()
    left = 0
    max_length = 0
    # Right pointer expands the window
    for right in range(len(s)):
        # Shrink window until duplicate is removed
        while s[right] in char_set:
            char_set.remove(s[left])
            left += 1
        # Add current character to window
        char_set.add(s[right])
        # Update maximum length
        max_length = max(max_length, right - left + 1)
    return max_length
explanation: |
**Time Complexity:** O(n). Each character is added to the set exactly once and removed from it at most once.
**Space Complexity:** O(min(m, n)) — Set holds unique characters in window.
We maintain a sliding window containing only unique characters. When we encounter a duplicate, we shrink from the left until it's removed. The window size at each step represents a valid substring length.
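One way to convince yourself of the O(n) bound is to count set operations. The sketch below is an instrumented copy of the set-based loop; the name `count_set_operations` and the returned tuple layout are assumptions for illustration only.

```python
from typing import Tuple


def count_set_operations(s: str) -> Tuple[int, int, int]:
    """Return (max_length, adds, removes) for the set-based sliding window."""
    char_set = set()
    left = 0
    max_length = 0
    adds = 0
    removes = 0
    for right in range(len(s)):
        while s[right] in char_set:
            char_set.remove(s[left])
            removes += 1
            left += 1
        char_set.add(s[right])
        adds += 1
        max_length = max(max_length, right - left + 1)
    return max_length, adds, removes


print(count_set_operations("abcabcbb"))
```

There is one add per character, and `left` never moves backward, so the number of removes can never exceed the number of adds. The total work therefore stays within `2 * len(s)` set operations even though the loop is nested.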
- approach_name: Optimised with Hash Map
is_optimal: true
code: |
def length_of_longest_substring(s: str) -> int:
    # Map character to its most recent index
    char_index = {}
    left = 0
    max_length = 0
    for right, char in enumerate(s):
        # If char seen before AND within current window
        if char in char_index and char_index[char] >= left:
            # Jump left pointer past the previous occurrence
            left = char_index[char] + 1
        # Update character's latest index
        char_index[char] = right
        # Update maximum length
        max_length = max(max_length, right - left + 1)
    return max_length
explanation: |
**Time Complexity:** O(n) — Single pass through the string.
**Space Complexity:** O(min(m, n)) — Hash map stores character indices.
Instead of shrinking the window one character at a time, we store each character's last index. When we find a duplicate, we jump `left` directly past the previous occurrence. The condition `char_index[char] >= left` ensures we only consider duplicates within the current window (old occurrences outside the window are ignored).
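The string `"abba"` shows why the `>= left` check matters. In the sketch below, the `check_window` flag is a hypothetical switch added purely to turn that check off for comparison; it is not part of the solution above.

```python
def longest_unique(s: str, check_window: bool = True) -> int:
    """Hash-map variant; check_window=False drops the `>= left` guard."""
    char_index = {}
    left = 0
    max_length = 0
    for right, char in enumerate(s):
        if char in char_index and (not check_window or char_index[char] >= left):
            # Without the guard, a stale index can pull left backward
            left = char_index[char] + 1
        char_index[char] = right
        max_length = max(max_length, right - left + 1)
    return max_length


print(longest_unique("abba"))                      # guard on
print(longest_unique("abba", check_window=False))  # guard off
```

With the guard, the final `'a'` is ignored as a duplicate because its earlier index 0 lies left of the window, and the result is 2. Without it, `left` jumps back from 2 to 1, the window becomes `"bba"`, and the function wrongly reports 3. An equivalent way to write the guard is `left = max(left, char_index[char] + 1)`.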