# codetutor/backend/data/questions/permutation-in-string.yaml

title: Permutation in String
slug: permutation-in-string
difficulty: medium
leetcode_id: 567
leetcode_url: https://leetcode.com/problems/permutation-in-string/
categories:
  - strings
  - hash-tables
  - two-pointers
patterns:
  - slug: sliding-window
    is_optimal: true

function_signature: "def check_inclusion(s1: str, s2: str) -> bool:"

test_cases:
  visible:
    - input: { s1: "ab", s2: "eidbaooo" }
      expected: true
    - input: { s1: "ab", s2: "eidboaoo" }
      expected: false
    - input: { s1: "adc", s2: "dcda" }
      expected: true
  hidden:
    - input: { s1: "a", s2: "a" }
      expected: true
    - input: { s1: "ab", s2: "a" }
      expected: false
    - input: { s1: "abc", s2: "ccccbbbbaaaa" }
      expected: false
    - input: { s1: "hello", s2: "ooolleoooleh" }
      expected: false

description: |
  Given two strings `s1` and `s2`, return `true` if `s2` contains a permutation of `s1`, or `false` otherwise.

  In other words, return `true` if one of `s1`'s permutations is a substring of `s2`.

constraints: |
  - `1 <= s1.length, s2.length <= 10^4`
  - `s1` and `s2` consist of lowercase English letters.

examples:
  - input: 's1 = "ab", s2 = "eidbaooo"'
    output: "true"
    explanation: "s2 contains one permutation of s1 (\"ba\")."
  - input: 's1 = "ab", s2 = "eidboaoo"'
    output: "false"
    explanation: "No permutation of s1 exists as a contiguous substring in s2."

explanation:
  intuition: |
    Think of this problem as searching for an **anagram** of `s1` hidden somewhere within `s2`.

    A permutation of a string is simply a rearrangement of its characters, which means any permutation has **exactly the same character frequencies** as the original. For example, "ab" and "ba" are permutations of each other because each contains one 'a' and one 'b'.
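
    A quick way to check this property, sketched with Python's `collections.Counter` (the helper name `is_permutation` is illustrative and not part of the required solution):

    ```python
    from collections import Counter

    def is_permutation(a: str, b: str) -> bool:
        # Two strings are permutations of each other exactly when
        # every character occurs the same number of times in both.
        return Counter(a) == Counter(b)

    print(is_permutation("ab", "ba"))  # True
    print(is_permutation("ab", "bb"))  # False
    ```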

    The key insight is that we don't need to generate all permutations of `s1` (which would be factorial in complexity). Instead, we can slide a **window of size `len(s1)`** across `s2` and check if the characters in that window match the character frequencies of `s1`.

    Imagine you have a magnifying glass the exact width of `s1`. As you slide it across `s2` one character at a time, you're checking: "Do the characters under my magnifying glass form an anagram of `s1`?"

    This transforms the problem from "find any permutation" to "find a window with matching character counts" — a classic sliding window pattern.
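
    The magnifying-glass picture can be sketched directly. The helper `windows` below is a hypothetical name used only for illustration, and it recounts every window from scratch, which is simple but slower than updating the counts incrementally:

    ```python
    from collections import Counter

    def windows(s2: str, width: int) -> list[str]:
        # Every contiguous substring of s2 with exactly `width` characters.
        return [s2[i:i + width] for i in range(len(s2) - width + 1)]

    s1 = "ab"
    for w in windows("eidbaooo", len(s1)):
        # Recounting each window only illustrates the idea; it is not the efficient version.
        print(w, Counter(w) == Counter(s1))
    ```

    For this input, "ba" is the only window whose counts match those of `s1`.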

  approach: |
    We solve this using a **Fixed-Size Sliding Window** with character frequency counting:

    **Step 1: Handle edge cases**

    - If `s1` is longer than `s2`, it's impossible for `s2` to contain any permutation of `s1` — return `false` immediately

    &nbsp;

    **Step 2: Build the frequency map for s1**

    - Count the frequency of each character in `s1`
    - This is our "target" that we want to match

    &nbsp;

    **Step 3: Initialise the sliding window**

    - Create a frequency map for the first `len(s1)` characters of `s2`
    - This is our initial window

    &nbsp;

    **Step 4: Check initial window**

    - If the window's frequency map matches `s1`'s frequency map, we found a permutation — return `true`

    &nbsp;

    **Step 5: Slide the window across s2**

    - For each new position, add the incoming character (right side) to the window
    - Remove the outgoing character (left side) from the window
    - If a character's count drops to zero, remove it from the map entirely (for clean comparison)
    - Compare the window's frequency map with `s1`'s — if they match, return `true`

    &nbsp;

    **Step 6: Return the result**

    - If no matching window is found after sliding through all of `s2`, return `false`
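
    To watch the steps above in action, here is a small tracing sketch. The generator name `trace_windows` is an assumption made for illustration; it reports every window instead of returning at the first match:

    ```python
    from collections import Counter

    def trace_windows(s1: str, s2: str):
        """Yield (window_text, is_match) for each window, updating counts incrementally."""
        k = len(s1)
        if k > len(s2):                      # Step 1: s1 cannot fit inside s2
            return
        target = Counter(s1)                 # Step 2: frequency map for s1
        window = Counter(s2[:k])             # Step 3: counts for the first window
        yield s2[:k], window == target       # Step 4: check the initial window
        for i in range(k, len(s2)):          # Step 5: slide one character at a time
            window[s2[i]] += 1               # incoming character on the right
            left = s2[i - k]
            window[left] -= 1                # outgoing character on the left
            if window[left] == 0:
                del window[left]             # drop zero counts for a clean comparison
            yield s2[i - k + 1:i + 1], window == target

    for text, is_match in trace_windows("adc", "dcda"):
        print(text, is_match)
    ```

    With `s1 = "adc"` and `s2 = "dcda"`, the trace reports "dcd" as a mismatch and "cda" as a match.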

  common_pitfalls:
    - title: Generating All Permutations
      description: |
        A naive approach might try to generate all permutations of `s1` and check if any exists in `s2`.

        For a string of `n` distinct characters, there are `n!` (factorial) permutations. With `s1.length <= 10^4`, the number of orderings is bounded by `10000!`. Repeated letters reduce the number of distinct permutations, but for typical inputs it is still far too large to enumerate.

        The sliding window approach avoids this entirely by recognising that **character frequency equality implies permutation**.
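
        To get a feel for how quickly the factorial grows, here is a rough sketch. The helper `factorial_digits` is a hypothetical name, and it works from logarithms so the huge integer never has to be built:

        ```python
        import math

        def factorial_digits(n: int) -> int:
            # Number of decimal digits in n!, using log10(n!) = log10(1) + ... + log10(n).
            return math.floor(sum(math.log10(i) for i in range(1, n + 1))) + 1

        print(math.factorial(10))        # 3628800 orderings of 10 distinct characters
        print(factorial_digits(10_000))  # 35660 digits
        ```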
      wrong_approach: "Generate all permutations of s1 and search for each"
      correct_approach: "Compare character frequencies using sliding window"

    - title: Comparing Strings Instead of Frequencies
      description: |
        Sorting each window and comparing to sorted `s1` works but is inefficient.

        With `k = len(s1)` and `n = len(s2)`, sorting one window takes O(k log k). Doing this for each of the `n - k + 1` windows gives O(n * k log k) overall. For large inputs, this is too slow.

        Keeping frequency maps and updating them incrementally makes each slide O(1): one increment, one decrement, and a comparison over at most 26 keys. The total time is O(n).
      wrong_approach: "Sort each window and compare to sorted s1"
      correct_approach: "Use hash maps to track and compare character frequencies"

    - title: Not Cleaning Up Zero Counts
      description: |
        When a character's count reaches zero in the window map, failing to remove it can break map equality comparisons.

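        For example, as plain dictionaries `{'a': 1, 'b': 0}` is not equal to `{'a': 1}`, even though both describe the same multiset of characters. Python's `Counter` treats missing keys as zero counts in equality tests only from version 3.10 onwards, so older versions and plain dictionaries still see the two as different.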

        Always remove characters from the map when their count reaches zero.
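
        A minimal sketch of the problem and the fix, using plain dictionaries so the behaviour does not depend on the Python version (the helper `remove_one` is an illustrative name):

        ```python
        with_zero = {"a": 1, "b": 0}
        cleaned = {"a": 1}
        print(with_zero == cleaned)  # False

        def remove_one(counts: dict[str, int], ch: str) -> None:
            # Decrement ch and drop the key once its count hits zero.
            counts[ch] -= 1
            if counts[ch] == 0:
                del counts[ch]

        window = {"a": 1, "b": 1}
        remove_one(window, "b")
        print(window == cleaned)     # True
        ```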

    - title: Off-by-One Errors in Window Boundaries
      description: |
        The window size must be exactly `len(s1)`. Common mistakes include:
        - Starting the slide from index 0 instead of `len(s1)`
        - Removing the wrong character when sliding (should remove `s2[i - len(s1)]`)

        Trace through a small example manually to verify your indices.
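
        One way to run such a trace is to print the indices as the loop advances. This sketch uses the first example's inputs, and the variable names are chosen only for illustration:

        ```python
        s1, s2 = "ab", "eidbaooo"
        k = len(s1)

        for i in range(k, len(s2)):
            incoming = s2[i]              # character entering on the right
            outgoing = s2[i - k]          # character leaving on the left
            window = s2[i - k + 1:i + 1]  # window contents after the update
            # The updated window ends at index i and is exactly k characters long.
            assert len(window) == k and window[-1] == incoming
            print(i, outgoing, incoming, window)
        ```

        The first iteration has `i = 2`: it drops `s2[0]` ("e"), adds `s2[2]` ("d"), and leaves the window "id".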

  key_takeaways:
    - "**Permutation = same character frequencies**: Recognising this transforms the problem from combinatorial to linear"
    - "**Fixed-size sliding window**: When searching for a pattern of known length, use a window of that exact size"
    - "**Hash map comparison**: Comparing character counts is more efficient than generating/sorting permutations"
    - "**Pattern recognition**: This problem is nearly identical to *Find All Anagrams in a String* (LeetCode 438) — same technique, different return type"

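  time_complexity: "O(n), where n is the length of `s2`. We traverse `s2` once; each character enters the window once and leaves it at most once. Each step does O(1) work: hash map updates are O(1) on average, and comparing the two maps touches at most 26 keys."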
  space_complexity: "O(1). The frequency maps store at most 26 entries (lowercase English letters), which is constant regardless of input size."

solutions:
  - approach_name: Sliding Window with Hash Map
    is_optimal: true
    code: |
      from collections import Counter

      def check_inclusion(s1: str, s2: str) -> bool:
          # Edge case: s1 longer than s2
          if len(s1) > len(s2):
              return False

          # Build frequency map for s1 (our target)
          s1_count = Counter(s1)
          window_size = len(s1)

          # Build frequency map for initial window in s2
          window_count = Counter(s2[:window_size])

          # Check if initial window matches
          if window_count == s1_count:
              return True

          # Slide the window across s2
          for i in range(window_size, len(s2)):
              # Add incoming character (right side of window)
              window_count[s2[i]] += 1

              # Remove outgoing character (left side of window)
              left_char = s2[i - window_size]
              window_count[left_char] -= 1

              # Clean up zero counts for proper comparison
              if window_count[left_char] == 0:
                  del window_count[left_char]

              # Check if current window matches s1
              if window_count == s1_count:
                  return True

          return False
    explanation: |
      **Time Complexity:** O(n) — We iterate through `s2` once, with O(1) operations per step.

      **Space Complexity:** O(1) — The hash maps contain at most 26 keys (one per lowercase letter).

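      The `Counter` class from Python's `collections` module provides a clean way to count character frequencies. Comparing two `Counter` objects with `==` checks that every character has the same count in both. Since Python 3.10 a missing key is treated as a zero count in this comparison; deleting zero-count keys keeps the code correct on earlier versions and with plain dictionaries.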

  - approach_name: Sliding Window with Array (Optimised)
    is_optimal: true
    code: |
      def check_inclusion(s1: str, s2: str) -> bool:
          if len(s1) > len(s2):
              return False

          # Use arrays instead of hash maps (26 lowercase letters)
          s1_count = [0] * 26
          window_count = [0] * 26

          # Build frequency array for s1
          for c in s1:
              s1_count[ord(c) - ord('a')] += 1

          # Build frequency array for initial window
          for i in range(len(s1)):
              window_count[ord(s2[i]) - ord('a')] += 1

          # Check initial window
          if window_count == s1_count:
              return True

          # Slide the window
          for i in range(len(s1), len(s2)):
              # Add incoming character
              window_count[ord(s2[i]) - ord('a')] += 1
              # Remove outgoing character
              window_count[ord(s2[i - len(s1)]) - ord('a')] -= 1

              if window_count == s1_count:
                  return True

          return False
    explanation: |
      **Time Complexity:** O(n) — Same as the hash map approach.

      **Space Complexity:** O(1) — Fixed arrays of size 26.

      This variant uses fixed-size arrays instead of hash maps. Since the input contains only lowercase English letters, each character maps to an index (0-25) via `ord(c) - ord('a')`. No zero-count clean-up is needed because both arrays always have 26 slots, and list comparison is typically a little faster than hash map comparison.
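
      As a small illustration of the index mapping (the helper `letter_index` is a hypothetical name, and it assumes a single lowercase English letter, per the constraints):

      ```python
      def letter_index(c: str) -> int:
          # Assumes c is one lowercase English letter: 'a' -> 0, ..., 'z' -> 25.
          return ord(c) - ord('a')

      counts = [0] * 26
      for c in "adc":
          counts[letter_index(c)] += 1

      print(letter_index('a'), letter_index('z'))  # 0 25
      print(counts[:4])                            # [1, 0, 1, 1]
      ```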

  - approach_name: Sorting Each Window
    is_optimal: false
    code: |
      def check_inclusion(s1: str, s2: str) -> bool:
          if len(s1) > len(s2):
              return False

          # Sort s1 once as our target
          sorted_s1 = sorted(s1)
          window_size = len(s1)

          # Check each window by sorting and comparing
          for i in range(len(s2) - window_size + 1):
              window = s2[i:i + window_size]
              if sorted(window) == sorted_s1:
                  return True

          return False
    explanation: |
      **Time Complexity:** O(n * k log k), where `n = len(s2)` and `k = len(s1)`. Each of the `n - k + 1` windows sorts `k` characters.

      **Space Complexity:** O(k) — Space for the sorted window.

      While correct, this approach is inefficient for large inputs. Sorting each window repeatedly wastes computation. The sliding window with frequency counting avoids this by incrementally updating counts instead of recomputing from scratch.