# codetutor/backend/data/questions/majority-element-ii.yaml

title: Majority Element II
slug: majority-element-ii
difficulty: medium
leetcode_id: 229
leetcode_url: https://leetcode.com/problems/majority-element-ii/
categories:
  - arrays
  - hash-tables
patterns:
  - slug: greedy
    is_optimal: true

function_signature: "def majority_element(nums: list[int]) -> list[int]:"

test_cases:
  visible:
    - input: { nums: [3, 2, 3] }
      expected: [3]
    - input: { nums: [1] }
      expected: [1]
    - input: { nums: [1, 2] }
      expected: [1, 2]
  hidden:
    - input: { nums: [1, 1, 1, 2, 2, 3] }
      expected: [1]
    - input: { nums: [1, 2, 3, 4, 5] }
      expected: []
    - input: { nums: [2, 2, 2, 2] }
      expected: [2]
    - input: { nums: [1, 1, 2, 2, 3] }
      expected: [1, 2]
    - input: { nums: [0, 0, 0] }
      expected: [0]
    - input: { nums: [-1, -1, -1, 2, 2] }
      expected: [-1, 2]
    - input: { nums: [1, 2, 1, 2, 1, 2, 3] }
      expected: [1, 2]

description: |
  Given an integer array of size `n`, find all elements that appear **more than** `⌊n/3⌋` times.

examples:
  - input: "nums = [3,2,3]"
    output: "[3]"
    explanation: "The element 3 appears twice out of 3 elements. Since ⌊3/3⌋ = 1, and 2 > 1, the answer is [3]."
  - input: "nums = [1]"
    output: "[1]"
    explanation: "The element 1 appears once out of 1 element. Since ⌊1/3⌋ = 0, and 1 > 0, the answer is [1]."
  - input: "nums = [1,2]"
    output: "[1,2]"
    explanation: "Both 1 and 2 appear once out of 2 elements. Since ⌊2/3⌋ = 0, and 1 > 0, both qualify."

constraints: |
  - `1 <= nums.length <= 5 * 10^4`
  - `-10^9 <= nums[i] <= 10^9`

explanation:
  intuition: |
    This problem extends the classic Majority Element problem. Instead of finding elements appearing more than `n/2` times, we're looking for elements appearing more than `n/3` times.

    Here's the key mathematical insight: **at most two elements** can appear more than `n/3` times. Why? If three elements each appeared more than `n/3` times, we'd need more than `n` elements total — impossible!
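
    As a short worked version of that counting argument, suppose three distinct values had counts `c_1`, `c_2`, `c_3` (symbols introduced here purely for illustration), each strictly above `n/3`:

    ```latex
    c_1 + c_2 + c_3 \;>\; \frac{n}{3} + \frac{n}{3} + \frac{n}{3} \;=\; n
    ```

    The three counts cover disjoint positions of the array, so their sum cannot exceed `n`. The contradiction caps the answer at two elements.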

    Think of it like an election where a candidate needs more than one third of the votes to win. At most two candidates can clear that bar: if three did, their combined share would exceed 100%.

    This observation allows us to extend the **Boyer-Moore Voting Algorithm** to track two candidates instead of one. We run a "battle royale" where elements compete for two slots. When we encounter a third distinct element, it cancels out one vote from each candidate.

    At the end, we verify which candidates (if any) actually exceed the `n/3` threshold — unlike the original problem, there's no guarantee any element qualifies.

  approach: |
    We solve this using the **Extended Boyer-Moore Voting Algorithm**:

    **Step 1: Initialise two candidate slots**

    - `candidate1`, `candidate2`: Start as `None` and will store our two potential majority elements
    - `count1`, `count2`: Set to `0`, track the "strength" of each candidate

    &nbsp;

    **Step 2: First pass — find the candidates**

    - For each element in the array:
      - If it matches `candidate1`, increment `count1`
      - Else if it matches `candidate2`, increment `count2`
      - Else if `count1 == 0`, adopt this element as `candidate1` and set `count1 = 1`
      - Else if `count2 == 0`, adopt this element as `candidate2` and set `count2 = 1`
      - Else decrement both `count1` and `count2` (three distinct elements cancel out)
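
    As a rough illustration of the first pass, the sketch below records the state after each element. The helper name `trace_first_pass` and the tuple layout are assumptions made for this example, not part of the reference solution.

    ```python
    def trace_first_pass(nums: list[int]) -> list[tuple]:
        """Return (num, candidate1, count1, candidate2, count2) after each step."""
        candidate1, candidate2 = None, None
        count1, count2 = 0, 0
        states = []
        for num in nums:
            if num == candidate1:
                count1 += 1
            elif num == candidate2:
                count2 += 1
            elif count1 == 0:
                candidate1, count1 = num, 1
            elif count2 == 0:
                candidate2, count2 = num, 1
            else:
                # Three distinct values meet: cancel one vote from each slot
                count1 -= 1
                count2 -= 1
            states.append((num, candidate1, count1, candidate2, count2))
        return states


    for state in trace_first_pass([1, 1, 2, 2, 3]):
        print(state)
    ```

    On `[1, 1, 2, 2, 3]` the final `3` matches neither candidate and finds no empty slot, so both counts drop by one and the pass ends with candidates `1` and `2`.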

    &nbsp;

    **Step 3: Second pass — verify the candidates**

    - Count actual occurrences of `candidate1` and `candidate2`
    - Only include candidates that appear more than `n/3` times in the result
    - Unlike the original problem, it is possible that neither candidate qualifies
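
    A minimal sketch of the verification step, assuming the candidates from the first pass are handed over as a list (an unused slot may still be `None`). The helper name `verify_candidates` is hypothetical.

    ```python
    def verify_candidates(nums: list[int], candidates: list) -> list[int]:
        """Keep only the candidates that appear more than len(nums) // 3 times."""
        threshold = len(nums) // 3
        result = []
        for candidate in candidates:
            # Skip a value that was already accepted so the result has no duplicates
            if candidate not in result and nums.count(candidate) > threshold:
                result.append(candidate)
        return result


    print(verify_candidates([1, 1, 2, 2, 3], [1, 2]))  # [1, 2]
    print(verify_candidates([1, 2, 3, 4, 5], [4, 5]))  # []
    ```

    Each `nums.count` call scans the array once, so verifying two candidates stays linear in `n`.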

    &nbsp;

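    The cancellation logic works because each cancellation discards three distinct values at once (the current element plus one vote from each candidate), so at most `⌊n/3⌋` cancellations can happen. An element that appears more than `n/3` times therefore cannot have every occurrence cancelled, and it survives as one of the two candidates.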

  common_pitfalls:
    - title: Forgetting the Verification Pass
      description: |
        Unlike Majority Element I, where a majority element is guaranteed to exist, this problem may have zero, one, or two valid answers.

        For example, with `nums = [1,2,3,4,5]`, no element appears more than `⌊5/3⌋ = 1` time. The Boyer-Moore phase will still produce two candidates, but neither actually qualifies.

        Always verify candidates with a second pass to count their actual occurrences.
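
        To make this concrete, the sketch below runs only the candidate-selection pass on that input and then applies the threshold check. The helper name `first_pass` is an assumption for this example.

        ```python
        def first_pass(nums: list[int]) -> tuple:
            """Candidate selection only: returns (candidate1, candidate2), unverified."""
            candidate1, candidate2 = None, None
            count1, count2 = 0, 0
            for num in nums:
                if num == candidate1:
                    count1 += 1
                elif num == candidate2:
                    count2 += 1
                elif count1 == 0:
                    candidate1, count1 = num, 1
                elif count2 == 0:
                    candidate2, count2 = num, 1
                else:
                    count1 -= 1
                    count2 -= 1
            return candidate1, candidate2


        nums = [1, 2, 3, 4, 5]
        candidates = first_pass(nums)
        threshold = len(nums) // 3
        verified = [c for c in candidates if nums.count(c) > threshold]

        print(candidates)  # (4, 5): leftovers from the voting phase
        print(verified)    # []: neither one exceeds the threshold
        ```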
      wrong_approach: "Returning candidates without verification"
      correct_approach: "Count actual occurrences and filter by threshold"

    - title: Using Hash Map Without Space Constraint Awareness
      description: |
        A hash map solution works and runs in O(n) time, but uses **O(n) space**. The follow-up specifically asks for O(1) space, which the extended Boyer-Moore algorithm achieves.

        The hash map approach is acceptable if space isn't a concern, but the optimal solution uses constant space.
      wrong_approach: "Hash map counting with O(n) space"
      correct_approach: "Extended Boyer-Moore with O(1) space"

    - title: Incorrect Order of Candidate Checks
      description: |
        The order of checks matters in the first pass. You must check if the element matches existing candidates *before* checking if a slot is available.

        If you check `count1 == 0` first, you might reassign `candidate1` to an element that should have been counted under `candidate2`, corrupting your counts.
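
        The sketch below contrasts the two orderings on a small input. The function name `find_candidates` and its `matches_first` flag are hypothetical, introduced only to show the failure side by side.

        ```python
        def find_candidates(nums: list[int], matches_first: bool = True) -> tuple:
            """First pass only; matches_first=False reproduces the buggy ordering."""
            c1, c2 = None, None
            n1, n2 = 0, 0
            for num in nums:
                if matches_first:
                    if num == c1:
                        n1 += 1
                    elif num == c2:
                        n2 += 1
                    elif n1 == 0:
                        c1, n1 = num, 1
                    elif n2 == 0:
                        c2, n2 = num, 1
                    else:
                        n1 -= 1
                        n2 -= 1
                else:
                    # Buggy: empty slots are filled before matches are checked
                    if n1 == 0:
                        c1, n1 = num, 1
                    elif n2 == 0:
                        c2, n2 = num, 1
                    elif num == c1:
                        n1 += 1
                    elif num == c2:
                        n2 += 1
                    else:
                        n1 -= 1
                        n2 -= 1
            return c1, c2


        nums = [1, 1, 2, 2, 3]
        print(find_candidates(nums, matches_first=True))   # (1, 2)
        print(find_candidates(nums, matches_first=False))  # (2, 3)
        ```

        With the buggy ordering the second `1` is adopted as `candidate2`, both slots briefly hold the same value, and the true answer `1` is lost before verification can rescue it.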
      wrong_approach: "Checking for empty slots before checking for matches"
      correct_approach: "Check matches first, then check for empty slots"

    - title: Not Handling Duplicate Candidates
      description: |
        When assigning the second candidate, ensure it's different from the first candidate. If both slots hold the same value, you're effectively only tracking one element.

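        Checking matches before empty slots guarantees this: an element equal to `candidate1` is counted there and never reaches the `count2 == 0` branch. In the verification pass, also skip `candidate2` if it equals `candidate1` so the result never lists a value twice.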
      wrong_approach: "Allowing candidate1 and candidate2 to hold the same value"
      correct_approach: "Ensure candidates are always distinct"

  key_takeaways:
    - "**Mathematical bound**: At most `k-1` elements can appear more than `n/k` times — this generalises Boyer-Moore"
    - "**Verification is essential**: Unlike guaranteed-majority problems, always verify candidates when existence isn't guaranteed"
    - "**Order of operations matters**: Check existing candidates before checking for empty slots"
    - "**Foundation for generalisations**: The same technique extends to finding elements appearing more than `n/4`, `n/5`, etc., by tracking more candidate slots"

  time_complexity: "O(n). We make two passes through the array — one for candidate selection, one for verification."
  space_complexity: "O(1). We only use a fixed number of variables (two candidates, two counts) regardless of input size."

solutions:
  - approach_name: Extended Boyer-Moore Voting
    is_optimal: true
    code: |
      def majority_element(nums: list[int]) -> list[int]:
          # At most 2 elements can appear more than n/3 times
          candidate1, candidate2 = None, None
          count1, count2 = 0, 0

          # First pass: find potential candidates
          for num in nums:
              # Check matches first (order matters!)
              if candidate1 == num:
                  count1 += 1
              elif candidate2 == num:
                  count2 += 1
              # Then check for empty slots
              elif count1 == 0:
                  candidate1 = num
                  count1 = 1
              elif count2 == 0:
                  candidate2 = num
                  count2 = 1
              # Three distinct elements: cancel one from each
              else:
                  count1 -= 1
                  count2 -= 1

          # Second pass: verify candidates actually exceed threshold
          threshold = len(nums) // 3
          result = []

          # Count actual occurrences
          count1 = sum(1 for num in nums if num == candidate1)
          count2 = sum(1 for num in nums if num == candidate2)

          # Only include if they exceed n/3
          if count1 > threshold:
              result.append(candidate1)
          if candidate2 != candidate1 and count2 > threshold:
              result.append(candidate2)

          return result
    explanation: |
      **Time Complexity:** O(n): one pass to select candidates, then a verification step that scans the array once per candidate.

      **Space Complexity:** O(1) — Only a fixed number of variables used.

      The algorithm extends Boyer-Moore to track two candidates. The key insight is that at most two elements can exceed the `n/3` threshold. When three distinct elements are seen, they cancel each other out. A verification pass confirms which candidates actually qualify.
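
      The same cancellation idea extends to thresholds of `n/k` by tracking up to `k - 1` candidates. The sketch below is one possible way to write that, using a small dictionary instead of named slots (a scheme commonly known as the Misra-Gries summary). The function name `elements_above_n_over_k` is an assumption for this example, and it drops a candidate as soon as its count reaches zero rather than keeping the slot.

      ```python
      def elements_above_n_over_k(nums: list[int], k: int) -> list[int]:
          """Return values appearing more than len(nums) // k times (assumes k >= 2)."""
          counters: dict[int, int] = {}
          for num in nums:
              if num in counters:
                  counters[num] += 1
              elif len(counters) < k - 1:
                  counters[num] = 1
              else:
                  # k distinct values meet: cancel one vote from every tracked value
                  for key in list(counters):
                      counters[key] -= 1
                      if counters[key] == 0:
                          del counters[key]

          # Verification: survivors are only candidates, so count them for real
          threshold = len(nums) // k
          return [c for c in counters if nums.count(c) > threshold]


      print(elements_above_n_over_k([1, 1, 2, 2, 3], 3))  # [1, 2]
      print(elements_above_n_over_k([3, 2, 3], 2))        # [3]
      ```

      With `k = 3` this returns the same elements as the two-slot version above. It holds at most `k - 1` counters, so the extra space is O(k) rather than O(n).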

  - approach_name: Hash Map Counting
    is_optimal: false
    code: |
      from collections import Counter

      def majority_element(nums: list[int]) -> list[int]:
          # Count occurrences of each element
          counts = Counter(nums)
          threshold = len(nums) // 3

          # Return all elements exceeding the threshold
          return [num for num, count in counts.items() if count > threshold]
    explanation: |
      **Time Complexity:** O(n) — Single pass to build the counter.

      **Space Complexity:** O(n) — Hash map stores up to n distinct elements.

      This approach is intuitive and easy to implement. It counts all elements and filters by the threshold. While correct, it uses more space than the optimal Boyer-Moore solution.