questions F-L
156
backend/data/questions/group-anagrams.yaml
Normal file
@@ -0,0 +1,156 @@
title: Group Anagrams
slug: group-anagrams
difficulty: medium
leetcode_id: 49
leetcode_url: https://leetcode.com/problems/group-anagrams/
categories:
- strings
- hash-tables
- sorting
patterns:
- hashing

description: |
Given an array of strings `strs`, group the **anagrams** together. You can return the answer in **any order**.

An **anagram** is a word or phrase formed by rearranging the letters of a different word or phrase, using all the original letters exactly once.

constraints: |
- `1 <= strs.length <= 10^4`
- `0 <= strs[i].length <= 100`
- `strs[i]` consists of lowercase English letters

examples:
- input: 'strs = ["eat","tea","tan","ate","nat","bat"]'
output: '[["bat"],["nat","tan"],["ate","eat","tea"]]'
explanation: "Words with the same letters are grouped together."
- input: 'strs = [""]'
output: '[[""]]'
explanation: "Empty string forms its own group."
- input: 'strs = ["a"]'
output: '[["a"]]'
explanation: "Single character forms its own group."

explanation:
intuition: |
What makes two words anagrams? They have exactly the same letters in exactly the same quantities. "eat" and "tea" both have one 'e', one 'a', and one 't'.

Think of it like this: if you sort the letters of any anagram, you get the same result. `sorted("eat")` and `sorted("tea")` both yield the letters `a, e, t`, which join to `"aet"`. This sorted form is a **canonical representation**: a fingerprint that is identical for all anagrams of a word.

So the strategy is simple: for each word, compute its fingerprint (sorted letters), and group words with the same fingerprint together. A hash map fits this well: the fingerprint is the key, and each key maps to a list of original words.

There's an alternative fingerprint: instead of sorting, count each letter's frequency. `"eat"` becomes a tuple of 26 counts with a 1 at the positions for `a`, `e`, and `t` (indices 0, 4, and 19) and 0 everywhere else. Building it takes O(k) time instead of O(k log k), which is better for long strings.
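
As a rough illustration of the counting fingerprint described above, the sketch below builds the 26-slot count tuple for a word. The helper name `count_key` is hypothetical (it does not appear elsewhere in this file), and the code assumes lowercase English letters only, as the constraints state.

```python
def count_key(word: str) -> tuple[int, ...]:
    """Hypothetical helper: 26 letter counts for a lowercase a-z word."""
    counts = [0] * 26
    for ch in word:
        # 'a' maps to slot 0, 'b' to slot 1, ..., 'z' to slot 25
        counts[ord(ch) - ord("a")] += 1
    return tuple(counts)


# Anagrams share a fingerprint; non-anagrams do not.
print(count_key("eat") == count_key("tea"))  # True
print(count_key("eat") == count_key("bat"))  # False
```

The tuple always has 26 entries regardless of word length, so the key size stays constant even for long strings.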

approach: |
We solve this using a **hash map with sorted-string keys**:

**Step 1: Create a hash map for grouping**

- Use a `defaultdict(list)` so we can append to keys that do not exist yet
- Keys will be the canonical form (sorted string)
- Values will be lists of original strings

**Step 2: Process each string**

- For each string `s` in the input:
  - Compute the key: `''.join(sorted(s))`
  - Append the original string to `groups[key]`

**Step 3: Return all groups**

- Return `list(groups.values())`; each value is one anagram group

Why does sorting work? Two strings are anagrams if and only if they contain the same characters with the same counts. Sorting arranges characters in a canonical order, so anagrams produce identical sorted strings, and non-anagrams never do.
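
To make the three steps concrete, here is a small trace on the first example input. It is only a sketch: the variable names `words` and `result` are chosen for illustration.

```python
from collections import defaultdict

words = ["eat", "tea", "tan", "ate", "nat", "bat"]

groups = defaultdict(list)                 # Step 1: key -> list of words
for s in words:                            # Step 2: bucket by sorted letters
    groups["".join(sorted(s))].append(s)

result = list(groups.values())             # Step 3: collect the buckets

print(dict(groups))
# {'aet': ['eat', 'tea', 'ate'], 'ant': ['tan', 'nat'], 'abt': ['bat']}
```

The keys appear in first-seen order because Python dictionaries preserve insertion order; the problem accepts the groups in any order.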

common_pitfalls:
- title: Using Unhashable Types as Dictionary Keys
description: |
In Python, `sorted(s)` returns a **list**, which can't be a dictionary key (lists are mutable, hence unhashable).

You must convert to a hashable type:
- `''.join(sorted(s))` → string key
- `tuple(sorted(s))` → tuple key
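
The failure and both fixes can be reproduced in a few lines. This is a minimal sketch; the variable names are illustrative only.

```python
s = "eat"
groups = {}
error = None

try:
    groups[sorted(s)] = [s]          # list used as a key
except TypeError as exc:
    error = str(exc)                 # Python reports the list as unhashable

str_key = "".join(sorted(s))         # 'aet'
tuple_key = tuple(sorted(s))         # ('a', 'e', 't')

# Both converted forms are hashable and work as dictionary keys.
groups[str_key] = [s]
groups[tuple_key] = [s]
```

Either key type works; the string form is usually easier to read when debugging.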
wrong_approach: "groups[sorted(s)].append(s)"
correct_approach: "groups[''.join(sorted(s))].append(s)"

- title: Forgetting Empty Strings
description: |
An empty string `""` is a valid input. `sorted("")` returns `[]`, and `''.join([])` returns `""`, so every empty string maps to the key `""`. The algorithm handles this correctly without special-casing, but the edge case is still worth testing.
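
A quick check of this edge case, assuming an illustrative input that contains two empty strings:

```python
from collections import defaultdict

groups = defaultdict(list)
for s in ["", "a", ""]:
    # sorted("") is [], and "".join([]) is "", so both empty strings share a key
    groups["".join(sorted(s))].append(s)

print(list(groups.values()))  # [['', ''], ['a']]
```

Note that multiple empty strings end up together in a single group rather than in separate ones.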
wrong_approach: "Assuming all strings are non-empty"
correct_approach: "Empty strings are handled naturally; together they form their own group"

- title: Using Regular Dict Without Default
description: |
With a regular `dict`, you must check if a key exists before appending:
```python
if key not in groups:
    groups[key] = []
groups[key].append(s)
```
Using `defaultdict(list)` eliminates this boilerplate.
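
If a plain `dict` is preferred for some reason, `dict.setdefault` is another standard-library way to avoid the explicit check. This is an alternative that the text above does not cover, shown here only as a sketch.

```python
groups: dict[str, list[str]] = {}
for s in ["tan", "nat", "bat"]:
    key = "".join(sorted(s))
    # setdefault returns the existing list, or stores and returns a new one
    groups.setdefault(key, []).append(s)

print(groups)  # {'ant': ['tan', 'nat'], 'abt': ['bat']}
```

One trade-off: `setdefault` builds a fresh empty list on every call even when the key already exists, whereas `defaultdict(list)` only creates one for missing keys.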
wrong_approach: "groups[key].append(s) with regular dict (KeyError)"
correct_approach: "Use defaultdict(list) for automatic list creation"

key_takeaways:
- "**Canonical form for grouping**: Anagrams share a canonical representation (sorted or counted)"
- "**Hash map for grouping**: When grouping by some property, use that property as the key"
- "**Sorting vs counting**: Sorting is O(k log k) and counting is O(k), so counting scales better for long strings"
- "**defaultdict simplifies code**: Eliminates key-existence checks when building lists"

time_complexity: "O(n × k log k), where n is the number of strings and k is the maximum string length. We process n strings, and sorting a string of length k takes O(k log k). With the counting approach, this becomes O(n × k)."
space_complexity: "O(n × k). We store all n strings in the hash map, and each string has length up to k."

solutions:
- approach_name: Sorted String Key
is_optimal: true
code: |
from collections import defaultdict

def group_anagrams(strs: list[str]) -> list[list[str]]:
    # Map: sorted string -> list of original strings
    groups = defaultdict(list)

    for s in strs:
        # All anagrams sort to the same string
        key = ''.join(sorted(s))
        groups[key].append(s)

    # Return all groups (order doesn't matter)
    return list(groups.values())
explanation: |
**Time Complexity:** O(n × k log k). Sorting each of n strings of maximum length k.

**Space Complexity:** O(n × k). Storing all strings in the hash map.

Sorting gives each string a canonical form. All anagrams produce the same sorted string, so they end up in the same bucket. Simple, readable, and efficient enough for most cases.
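
Because the groups may be returned in any order, checking a result against the expected output needs an order-insensitive comparison. The sketch below shows one way to do that; `normalize` is a hypothetical helper, and `group_anagrams` repeats the sorted-key logic above so the snippet runs on its own.

```python
from collections import defaultdict


def group_anagrams(strs: list[str]) -> list[list[str]]:
    # Same sorted-key logic as the solution above, repeated for self-containment.
    groups = defaultdict(list)
    for s in strs:
        groups["".join(sorted(s))].append(s)
    return list(groups.values())


def normalize(groups: list[list[str]]) -> list[list[str]]:
    # Hypothetical helper: sort inside each group, then sort the groups.
    return sorted(sorted(group) for group in groups)


actual = group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"])
expected = [["bat"], ["nat", "tan"], ["ate", "eat", "tea"]]
assert normalize(actual) == normalize(expected)
```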

- approach_name: Character Count Key
is_optimal: true
code: |
from collections import defaultdict

def group_anagrams(strs: list[str]) -> list[list[str]]:
    groups = defaultdict(list)

    for s in strs:
        # Count frequency of each letter (a-z)
        count = [0] * 26
        for c in s:
            count[ord(c) - ord('a')] += 1

        # Use tuple of counts as key (tuples are hashable)
        groups[tuple(count)].append(s)

    return list(groups.values())
explanation: |
**Time Complexity:** O(n × k). Counting is O(k) per string, better than the O(k log k) of sorting.

**Space Complexity:** O(n × k), the same as the sorted approach.

Instead of sorting, we count the frequency of each letter. Two strings are anagrams if and only if they have identical character counts. The count array is converted to a tuple (hashable) for use as a dictionary key. This is asymptotically faster for long strings.
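
One way to gain confidence that the two key types are interchangeable is to check that they agree on every pair of words in a small sample. This is a sanity-check sketch, not a proof; `sorted_key` and `count_key` are hypothetical helper names, and the sample list is illustrative.

```python
from itertools import combinations


def sorted_key(s: str) -> str:
    return "".join(sorted(s))


def count_key(s: str) -> tuple[int, ...]:
    counts = [0] * 26
    for c in s:
        counts[ord(c) - ord("a")] += 1
    return tuple(counts)


words = ["eat", "tea", "tan", "ate", "nat", "bat", ""]
for a, b in combinations(words, 2):
    # Both fingerprints must give the same verdict on whether a and b are anagrams.
    assert (sorted_key(a) == sorted_key(b)) == (count_key(a) == count_key(b))
```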