title: Majority Element
slug: majority-element
difficulty: easy
leetcode_id: 169
leetcode_url: https://leetcode.com/problems/majority-element/
categories:
  - arrays
  - hash-tables
patterns:
  - greedy

function_signature: "def majority_element(nums: list[int]) -> int:"

test_cases:
  visible:
    - input: { nums: [3, 2, 3] }
      expected: 3
    - input: { nums: [2, 2, 1, 1, 1, 2, 2] }
      expected: 2
    - input: { nums: [1] }
      expected: 1
  hidden:
    - input: { nums: [6, 5, 5] }
      expected: 5
    - input: { nums: [1, 1, 1, 2, 2, 2, 2] }
      expected: 2
    - input: { nums: [3, 3, 4] }
      expected: 3

description: |
  Given an array `nums` of size `n`, return *the majority element*.

  The majority element is the element that appears **more than** `⌊n / 2⌋` times. You may assume that the majority element always exists in the array.

constraints: |
  - `n == nums.length`
  - `1 <= n <= 5 * 10^4`
  - `-10^9 <= nums[i] <= 10^9`
  - The input is generated such that a majority element will exist in the array

examples:
  - input: "nums = [3,2,3]"
    output: "3"
    explanation: "The element 3 appears twice out of 3 elements, which is more than ⌊3/2⌋ = 1 time."
  - input: "nums = [2,2,1,1,1,2,2]"
    output: "2"
    explanation: "The element 2 appears 4 times out of 7 elements, which is more than ⌊7/2⌋ = 3 times."

explanation:
  intuition: |
    Imagine a rowdy crowd in which several groups are shouting different slogans. If one group has **more than half** the people, its voice will always dominate, no matter how the other groups combine.

    This is the core insight behind the **Boyer-Moore Voting Algorithm**. Think of it as a "battle royale" where each element fights against others:

    - When you encounter the same element as the current candidate, it gains strength (count increases)
    - When you encounter a different element, the two cancel each other out (count decreases)

    Since the majority element appears more than `n/2` times, it's guaranteed to have "survivors" at the end. Even if every other element teams up against it, they can't outnumber it, so the majority element will always be the last one standing.

    Put another way, if we pair up different elements to "eliminate" each other, the majority element will always have at least one unpaired instance remaining.
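
    As a rough illustration of this pairing idea, the sketch below cancels each element against a different survivor whenever one is available. The helper name `cancel_pairs` is hypothetical and is not part of the solution; it only makes the "unpaired instance" visible.

    ```python
    def cancel_pairs(nums: list[int]) -> list[int]:
        """Pair off different elements and return whatever is left unpaired."""
        survivors: list[int] = []
        for num in nums:
            if survivors and survivors[-1] != num:
                # num and the most recent survivor differ, so they eliminate each other
                survivors.pop()
            else:
                # Nothing to cancel against (or same value): num survives for now
                survivors.append(num)
        return survivors


    print(cancel_pairs([2, 2, 1, 1, 1, 2, 2]))  # [2]
    ```

    The survivors are always copies of a single value, which is why the voting algorithm can replace the list with just a `candidate` and a `count`.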

  approach: |
    We solve this using the **Boyer-Moore Voting Algorithm**:

    **Step 1: Initialise candidate tracking**

    - `candidate`: Stores our current guess for the majority element (initially unset)
    - `count`: Set to `0`, tracks the "strength" of our current candidate

    **Step 2: Single pass to find the candidate**

    - For each element in the array:
      - First, if `count == 0`, adopt the current element as our new `candidate`
      - Then, if the current element equals `candidate`, increment `count` (gains strength)
      - Otherwise, decrement `count` (different elements cancel out)

    **Step 3: Return the candidate**

    - Since the problem guarantees a majority element exists, our candidate is the answer
    - No verification pass is needed (but one would be required if existence weren't guaranteed)

    This works because the majority element, appearing more than half the time, cannot be fully cancelled out by all other elements combined.
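
    For inputs where a majority element is not guaranteed, the verification pass mentioned in Step 3 might look like the sketch below. The function name `majority_element_or_none` and the choice to return `None` are assumptions made for illustration; they are not part of this problem's required signature.

    ```python
    from typing import Optional


    def majority_element_or_none(nums: list[int]) -> Optional[int]:
        """Return the majority element, or None if no element exceeds n // 2."""
        candidate, count = None, 0

        # Voting pass: same logic as Steps 1 and 2
        for num in nums:
            if count == 0:
                candidate = num
            count += 1 if num == candidate else -1

        # Verification pass: confirm the candidate really is a majority
        if nums.count(candidate) > len(nums) // 2:
            return candidate
        return None


    print(majority_element_or_none([3, 2, 3]))  # 3
    print(majority_element_or_none([1, 2, 3]))  # None
    ```

    The second pass keeps the running time at O(n) and the extra space at O(1).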

  common_pitfalls:
    - title: Using Extra Space with Hash Map
      description: |
        A common first approach is to count occurrences using a hash map:
        - Iterate through the array, counting each element
        - Return the element with count > `n/2`

        While this works and runs in O(n) time, it uses **O(n) space** for the hash map. LeetCode's follow-up to this problem asks for linear time and O(1) space, which the Boyer-Moore algorithm achieves.
      wrong_approach: "Hash map counting with O(n) space"
      correct_approach: "Boyer-Moore Voting Algorithm with O(1) space"

    - title: Sorting and Taking the Middle
      description: |
        Another approach is to sort the array and return the middle element. Since the majority element appears more than `n/2` times, it must occupy the middle position (index `n // 2`) after sorting.

        This works but has **O(n log n)** time complexity due to sorting. The Boyer-Moore algorithm achieves O(n) time.
      wrong_approach: "Sorting with O(n log n) time"
      correct_approach: "Single pass with O(n) time"

    - title: Forgetting to Reset the Candidate
      description: |
        A critical part of Boyer-Moore is resetting the candidate when `count` reaches zero. If you only decrement without adopting a new candidate, you keep a stale candidate and can miss the majority element.

        When `count == 0`, it means all previously seen elements have cancelled out, so we start fresh with the current element as our new candidate.
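
        One hypothetical way this bug might look is sketched below. The function name `majority_without_reset` is made up for illustration, and the code is wrong on purpose: it never replaces its first guess.

        ```python
        def majority_without_reset(nums: list[int]) -> int:
            """Deliberately buggy: the candidate is never replaced."""
            candidate, count = nums[0], 0
            for num in nums:
                if num == candidate:
                    count += 1
                else:
                    # count may go negative, but the stale candidate is kept
                    count -= 1
            return candidate


        print(majority_without_reset([6, 5, 5]))  # 6, although the majority element is 5
        ```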
      wrong_approach: "Only incrementing/decrementing without resetting candidate"
      correct_approach: "Reset candidate when count becomes zero"

  key_takeaways:
    - "**Boyer-Moore Voting Algorithm**: A brilliant technique for finding majority elements in O(n) time and O(1) space"
    - "**Cancellation principle**: Different elements cancel each other out; the majority survives because it can't be fully cancelled"
    - "**Space-time optimisation**: When a hash map solution exists, ask if there's a pattern-based approach using constant space"
    - "**Foundation for variations**: This extends to finding elements appearing more than `n/3` times (Boyer-Moore generalisation)"

  time_complexity: "O(n). We traverse the array exactly once, performing constant-time operations at each step."
  space_complexity: "O(1). We only use two variables (`candidate` and `count`), regardless of input size."

solutions:
  - approach_name: Boyer-Moore Voting Algorithm
    is_optimal: true
    code: |
      def majority_element(nums: list[int]) -> int:
          # Current candidate for majority element
          candidate = None
          # Count tracks the "strength" of our candidate
          count = 0

          for num in nums:
              # If count is zero, adopt current element as new candidate
              if count == 0:
                  candidate = num

              # Same as candidate? Gains strength. Different? Cancel out.
              if num == candidate:
                  count += 1
              else:
                  count -= 1

          # Candidate is guaranteed to be the majority element
          return candidate
    explanation: |
      **Time Complexity:** O(n), a single pass through the array.

      **Space Complexity:** O(1), only two variables regardless of input size.

      The algorithm works by maintaining a candidate and a count. When elements match, we increase confidence. When they differ, they cancel out. Since the majority element appears more than half the time, it will always be the survivor.

  - approach_name: Hash Map Counting
    is_optimal: false
    code: |
      from collections import Counter

      def majority_element(nums: list[int]) -> int:
          # Count occurrences of each element
          counts = Counter(nums)
          n = len(nums)

          # Find the element appearing more than n/2 times
          for num, count in counts.items():
              if count > n // 2:
                  return num

          # Unreachable: the problem guarantees a majority exists, so we always return above
          return -1
    explanation: |
      **Time Complexity:** O(n). One pass builds the counter and a second pass over its entries finds the majority.

      **Space Complexity:** O(n). The hash map stores roughly n/2 distinct elements in the worst case.

      This approach is intuitive and easy to implement. It counts all elements and returns the one exceeding the threshold. While correct, it uses more space than necessary.

  - approach_name: Sorting
    is_optimal: false
    code: |
      def majority_element(nums: list[int]) -> int:
          # Sort the array in place (note: this mutates the caller's list)
          nums.sort()

          # The majority element must be at the middle index
          # Since it appears > n/2 times, it spans the middle
          return nums[len(nums) // 2]
    explanation: |
      **Time Complexity:** O(n log n), dominated by the sorting step.

      **Space Complexity:** O(1) or O(n), depending on the sorting algorithm. Python's `list.sort()` (Timsort) can use up to O(n) auxiliary space.

      After sorting, the majority element must occupy the middle position because it appears more than half the time. Simple but slower than optimal.