title: Check If a String Contains All Binary Codes of Size K
slug: check-if-a-string-contains-all-binary-codes-of-size-k
difficulty: medium
leetcode_id: 1461
leetcode_url: https://leetcode.com/problems/check-if-a-string-contains-all-binary-codes-of-size-k/
categories:
  - strings
  - hash-tables
patterns:
  - sliding-window
function_signature: "def has_all_codes(s: str, k: int) -> bool:"
test_cases:
  visible:
    - input: { s: "00110110", k: 2 }
      expected: true
    - input: { s: "0110", k: 1 }
      expected: true
    - input: { s: "0110", k: 2 }
      expected: false
  hidden:
    - input: { s: "0", k: 1 }
      expected: false
    - input: { s: "01", k: 1 }
      expected: true
    - input: { s: "00110", k: 2 }
      expected: true
    - input: { s: "0000000001011100", k: 4 }
      expected: false
    - input: { s: "11111111", k: 3 }
      expected: false
    - input: { s: "00011100101", k: 3 }
      expected: true
description: |
  Given a binary string `s` and an integer `k`, return `true` *if every binary code of length* `k` *is a substring of* `s`. Otherwise, return `false`.

  A binary code of length `k` is any string consisting of exactly `k` characters, where each character is either `'0'` or `'1'`. For example, when `k = 2`, all possible binary codes are: `"00"`, `"01"`, `"10"`, and `"11"`.

  You need to verify that **all** `2^k` possible binary codes appear somewhere in `s` as contiguous substrings.
constraints: |
  - `1 <= s.length <= 5 * 10^5`
  - `s[i]` is either `'0'` or `'1'`
  - `1 <= k <= 20`
examples:
  - input: 's = "00110110", k = 2'
    output: "true"
    explanation: "The binary codes of length 2 are \"00\", \"01\", \"10\" and \"11\". They can be all found as substrings at indices 0, 1, 3 and 2 respectively."
  - input: 's = "0110", k = 1'
    output: "true"
    explanation: "The binary codes of length 1 are \"0\" and \"1\", it is clear that both exist as a substring."
  - input: 's = "0110", k = 2'
    output: "false"
    explanation: 'The binary code "00" is of length 2 and does not exist in the string.'
explanation:
  intuition: |
    Think of this problem like checking off items on a checklist. For a given `k`, there are exactly `2^k` unique binary codes (just like how there are 4 two-digit binary numbers: 00, 01, 10, 11).

    As you slide through the string `s` with a window of size `k`, each window gives you one binary code substring. The question becomes: do you encounter **all** `2^k` possible codes while sliding through?

    Imagine you have a box of crayons where each crayon represents a unique binary code. As you slide through the string, every window "touches" one crayon. If by the end you've touched all crayons in the box, you return `true`.

    The key insight is that instead of generating all `2^k` codes and checking if each exists in `s` (which would be expensive), you simply **collect all unique substrings of length `k`** from `s` and check if you collected exactly `2^k` of them. A hash set naturally handles the uniqueness for you.
  approach: |
    We use a **Sliding Window with Hash Set** approach:

    **Step 1: Calculate the target count**
    - Compute `required = 2^k`, the total number of unique binary codes of length `k`

    **Step 2: Early termination check**
    - The number of substrings of length `k` in `s` is `len(s) - k + 1`
    - If `len(s) - k + 1 < required`, it's impossible to have all codes, so return `false`

    **Step 3: Slide through and collect unique substrings**
    - Create an empty hash set to store unique binary codes
    - Iterate through `s` with a sliding window of size `k`
    - For each position `i` from `0` to `len(s) - k`, extract the substring `s[i:i+k]`
    - Add each substring to the set (duplicates are automatically ignored)

    **Step 4: Compare the count**
    - If the size of the set equals `required` (`2^k`), return `true`
    - Otherwise, return `false`

    This approach works because a set only keeps unique elements.
    If we've seen all `2^k` unique codes, the set size will be exactly `2^k`. Conversely, every length-`k` window of a binary string is itself one of the `2^k` codes, so the set can never grow past `2^k`, and reaching that size means no code is missing.
  common_pitfalls:
    - title: Generating All Codes First
      description: |
        A tempting approach is to first generate all `2^k` binary codes, then check if each one exists in `s` using string searching. This is inefficient for two reasons:

        1. Generating all codes takes O(k * 2^k) time
        2. Searching for each code in `s` takes O(n) per code, leading to O(n * 2^k) total

        With `k = 20`, you'd have over 1 million codes to generate and search for! The sliding window approach is O(n * k) instead, which is much better for large inputs.
      wrong_approach: "Generate all 2^k codes, search for each in s"
      correct_approach: "Collect unique substrings from s, count them"
    - title: Off-by-One in Window Iteration
      description: |
        When iterating to extract substrings of length `k`, the loop index should run from `0` to `len(s) - k` (inclusive). A common mistake is writing `range(len(s) - k)`, which misses the last valid window, or `range(len(s))`, which runs past the last full window: in Python the slice is silently truncated to fewer than `k` characters, and in languages without bounds-safe slicing it reads out of bounds.

        For `s = "0110"` with `k = 2`:
        - Valid indices: 0, 1, 2 (giving "01", "11", "10")
        - Loop should be `for i in range(len(s) - k + 1)` or `for i in range(3)`
      wrong_approach: "range(len(s)) or range(len(s) - k)"
      correct_approach: "range(len(s) - k + 1)"
    - title: Forgetting the Early Return Optimisation
      description: |
        While not strictly a bug, failing to add the early termination check can hurt performance. If `len(s) < k`, there are zero substrings of length `k`. If `len(s) - k + 1 < 2^k`, it's mathematically impossible to have all codes.

        Example: For `k = 20`, you need at least `2^20 = 1,048,576` substrings, meaning `s` must have length at least `1,048,595`.
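
        The arithmetic behind that bound can be sketched in a few lines. The helper names `min_length_for_all_codes` and `has_enough_windows` are hypothetical, introduced here only for illustration:

        ```python
        def min_length_for_all_codes(k: int) -> int:
            # Hypothetical helper: a string of length n has n - k + 1 windows
            # of size k, so n - k + 1 >= 2^k rearranges to n >= 2^k + k - 1.
            return (1 << k) + k - 1


        def has_enough_windows(s: str, k: int) -> bool:
            # Necessary (but not sufficient) condition for containing all codes.
            return len(s) >= min_length_for_all_codes(k)
        ```

        For `k = 20` the helper gives `1,048,595`, which is above the `5 * 10^5` cap on `s.length` in the constraints, so under those constraints the early return always fires for `k = 20`.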
      wrong_approach: "Always iterate through the entire string"
      correct_approach: "Check if len(s) - k + 1 >= 2^k before iterating"
  key_takeaways:
    - "**Hash sets for counting unique items**: When you need to count distinct elements, a set automatically handles duplicates"
    - "**Sliding window for substrings**: Extracting all substrings of a fixed length is a classic sliding window pattern"
    - "**Think about the inverse**: Instead of checking if all codes exist, collect what exists and compare the count"
    - "**Early termination**: Mathematical bounds can save computation: if there aren't enough windows, the answer is definitely `false`"
  time_complexity: "O(n * k). We slide through the string once (O(n) positions), and at each position we extract a substring of length k (O(k) for hashing/copying)."
  space_complexity: "O(2^k * k). In the worst case, the set stores all 2^k unique binary codes, each of length k characters."
solutions:
  - approach_name: Sliding Window with Hash Set
    is_optimal: true
    code: |
      def has_all_codes(s: str, k: int) -> bool:
          # Total number of unique binary codes of length k
          required = 1 << k  # Same as 2^k

          # Early termination: not enough substrings possible
          if len(s) - k + 1 < required:
              return False

          # Collect all unique substrings of length k
          seen = set()
          for i in range(len(s) - k + 1):
              # Extract the substring at this window position
              code = s[i:i + k]
              seen.add(code)

              # Optimisation: stop early if we've found all codes
              if len(seen) == required:
                  return True

          return len(seen) == required
    explanation: |
      **Time Complexity:** O(n * k). We visit each of the n - k + 1 positions once, and extracting/hashing a substring of length k takes O(k) time.

      **Space Complexity:** O(2^k * k). The set can hold up to 2^k strings, each of length k.

      We slide a window of size k across the string, collecting each unique substring in a hash set. If the set reaches size 2^k, we've found all possible binary codes.
      The early termination when `len(seen) == required` provides a small optimisation.
  - approach_name: Bit Manipulation (Rolling Hash)
    is_optimal: true
    code: |
      def has_all_codes(s: str, k: int) -> bool:
          required = 1 << k  # 2^k

          if len(s) - k + 1 < required:
              return False

          # Use a set of integers instead of strings
          seen = set()

          # Mask to keep only k bits (e.g., k=3 -> mask=0b111=7)
          mask = required - 1

          # Convert first k-1 characters to a number
          current = 0
          for i in range(k - 1):
              current = (current << 1) | (ord(s[i]) - ord('0'))

          # Slide through, updating the hash in O(1) per step
          for i in range(k - 1, len(s)):
              # Shift left and add new bit, then mask to keep k bits
              current = ((current << 1) | (ord(s[i]) - ord('0'))) & mask
              seen.add(current)

              if len(seen) == required:
                  return True

          return len(seen) == required
    explanation: |
      **Time Complexity:** O(n). Each position is processed in O(1) time since we update the hash with bit operations.

      **Space Complexity:** O(2^k). The set stores up to 2^k integers.

      Instead of storing substrings, we convert each k-length window to an integer. For example, "101" becomes 5. The rolling hash uses bit shifts: shift left by 1, add the new bit, and mask off the oldest bit. This avoids the O(k) cost of substring extraction and hashing, reducing time complexity from O(n * k) to O(n).
  - approach_name: Brute Force (Generate and Search)
    is_optimal: false
    code: |
      def has_all_codes(s: str, k: int) -> bool:
          # Generate all 2^k binary codes
          required = 1 << k

          for code_num in range(required):
              # Convert number to binary string of length k
              code = bin(code_num)[2:].zfill(k)

              # Check if this code exists in s
              if code not in s:
                  return False

          return True
    explanation: |
      **Time Complexity:** O(n * 2^k). For each of the 2^k codes, we search the string which takes O(n) time.

      **Space Complexity:** O(k). We only store one code string at a time.

      This approach generates every possible binary code and checks if it's a substring of s.
      While intuitive, it's inefficient for large `k` values. With `k = 20`, we'd perform over a million substring searches. This solution may hit the time limit (TLE) on LeetCode, but it illustrates the straightforward approach that the optimal solutions improve upon.
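
      Although too slow for large `k`, a brute-force version can still serve as a reference when testing a faster one. The sketch below is one way to do that, not part of the solutions above; the names `brute_force`, `with_set`, and `find_mismatch` are hypothetical. It compares the two approaches on every short binary string:

      ```python
      from itertools import product


      def brute_force(s: str, k: int) -> bool:
          # Reference version: search for each of the 2^k codes in turn.
          return all(format(n, "b").zfill(k) in s for n in range(1 << k))


      def with_set(s: str, k: int) -> bool:
          # Faster version: collect the distinct length-k windows of s.
          windows = {s[i:i + k] for i in range(len(s) - k + 1)}
          return len(windows) == 1 << k


      def find_mismatch(max_len: int, max_k: int):
          # Exhaustively compare both versions on all binary strings of
          # length 1..max_len and all k in 1..max_k; return the first
          # disagreement as (s, k), or None if there is none.
          for n in range(1, max_len + 1):
              for bits in product("01", repeat=n):
                  s = "".join(bits)
                  for k in range(1, max_k + 1):
                      if brute_force(s, k) != with_set(s, k):
                          return s, k
          return None
      ```

      Calling `find_mismatch(8, 3)` checks every binary string of up to 8 characters for `k` from 1 to 3, which is small enough for the brute-force cost not to matter.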