title: Redundant Connection
slug: redundant-connection
difficulty: medium
leetcode_id: 684
leetcode_url: https://leetcode.com/problems/redundant-connection/
categories:
  - graphs
patterns:
  - union-find
  - dfs
function_signature: "def find_redundant_connection(edges: list[list[int]]) -> list[int]:"
test_cases:
  visible:
    - input: { edges: [[1, 2], [1, 3], [2, 3]] }
      expected: [2, 3]
    - input: { edges: [[1, 2], [2, 3], [3, 4], [1, 4], [1, 5]] }
      expected: [1, 4]
  hidden:
    - input: { edges: [[1, 2], [2, 3], [3, 1]] }
      expected: [3, 1]
    - input: { edges: [[1, 4], [3, 4], [1, 3], [1, 2], [4, 5]] }
      expected: [1, 3]
    - input: { edges: [[1, 5], [3, 4], [3, 5], [4, 5], [2, 4]] }
      expected: [4, 5]
    - input: { edges: [[9, 10], [5, 8], [2, 6], [1, 5], [3, 8], [4, 9], [8, 10], [4, 10], [6, 8], [7, 9]] }
      expected: [4, 10]
description: |
  In this problem, a tree is an **undirected graph** that is connected and has no cycles.

  You are given a graph that started as a tree with `n` nodes labeled from `1` to `n`, with one additional edge added. The added edge has two **different** vertices chosen from `1` to `n`, and was not an edge that already existed. The graph is represented as an array `edges` of length `n` where `edges[i] = [a_i, b_i]` indicates that there is an edge between nodes `a_i` and `b_i` in the graph.

  Return *an edge that can be removed so that the resulting graph is a tree of* `n` *nodes*. If there are multiple answers, return the answer that occurs last in the input.
constraints: |
  - `n == edges.length`
  - `3 <= n <= 1000`
  - `edges[i].length == 2`
  - `1 <= a_i < b_i <= edges.length`
  - `a_i != b_i`
  - There are no repeated edges
  - The given graph is connected
examples:
  - input: "edges = [[1,2],[1,3],[2,3]]"
    output: "[2,3]"
    explanation: "Adding edge [2,3] creates a cycle 1-2-3-1. Removing it restores the tree structure."
  - input: "edges = [[1,2],[2,3],[3,4],[1,4],[1,5]]"
    output: "[1,4]"
    explanation: "Edge [1,4] closes the cycle 1-2-3-4-1. Any edge on that cycle could be removed, but [1,4] is the one that occurs last in the input."
explanation:
  intuition: |
    Imagine building a forest from scratch by adding edges one by one. Each edge you add either **connects two separate trees** (valid) or **connects two nodes that are already in the same tree** (creates a cycle).

    Think of it like connecting islands with bridges. If two islands are already connected (possibly through other islands), building another bridge between them creates a loop, and that bridge is the redundant connection.

    The key insight is: **the first edge that connects two already-connected nodes is the edge that closes the cycle**. Because the graph is a tree plus exactly one extra edge, it contains exactly one cycle, and removing any edge on that cycle leaves a tree. When edges are processed in input order, the edge that closes the cycle is necessarily the cycle edge that appears last in the input, which is exactly the answer the problem asks for.

    This is exactly what **Union-Find** (Disjoint Set Union) excels at: efficiently tracking which nodes belong to the same connected component and detecting when an edge would connect nodes already in the same component.
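To make the island picture concrete before introducing Union-Find, here is a minimal sketch that tracks each component as a plain Python set. The names `first_cycle_closing_edge` and `component` are illustrative assumptions, not part of the problem's required signature, and the sketch favours readability over speed.

```python
def first_cycle_closing_edge(edges: list[list[int]]) -> list[int]:
    # Hypothetical helper: maps each node to the set object for its component.
    component: dict[int, set[int]] = {}

    for a, b in edges:
        comp_a = component.setdefault(a, {a})
        comp_b = component.setdefault(b, {b})

        # Same set object means a and b are already connected:
        # this edge closes the cycle.
        if comp_a is comp_b:
            return [a, b]

        # Merge the smaller component into the larger one.
        if len(comp_a) < len(comp_b):
            comp_a, comp_b = comp_b, comp_a
        comp_a |= comp_b
        for node in comp_b:
            component[node] = comp_a

    return []
```

Union-Find replaces the explicit sets with a parent array, so a merge no longer has to relabel every node in the smaller component.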
  approach: |
    We solve this using **Union-Find (Disjoint Set Union)**:

    **Step 1: Initialise the Union-Find structure**
    - `parent`: Array where `parent[i]` initially equals `i` (each node is its own parent)
    - `rank`: Array holding an upper bound on the height of each root's tree, used for union by rank

    **Step 2: Define helper functions**
    - `find(x)`: Returns the root of `x`'s component, with path compression
    - `union(x, y)`: Merges components of `x` and `y`, returns `False` if already connected

    **Step 3: Process each edge in order**
    - For each edge `[a, b]`, attempt to union nodes `a` and `b`
    - If `find(a) == find(b)`, they're already in the same component, so this edge is redundant
    - Return that edge immediately: it is the first edge that closes a cycle

    **Step 4: Return the redundant edge**
    - The problem guarantees that the graph contains exactly one cycle
    - The edge that closes it is the cycle edge that appears last in the input, so it is the required answer

    The Union-Find approach is ideal here because it efficiently handles dynamic connectivity queries. Path compression and union by rank together give near-constant amortised time per operation.
  common_pitfalls:
    - title: Forgetting Path Compression
      description: |
        Without path compression (and without union by rank to keep trees shallow), `find()` can degrade to O(n) per call, making the overall solution O(n^2). Path compression flattens the tree structure by making each node on the search path point directly to the root during `find()` operations. This keeps the tree shallow and gives near-constant amortised lookups.
      wrong_approach: "Simple find without path compression"
      correct_approach: "find() with path compression: parent[x] = find(parent[x])"
    - title: Using 0-indexed Arrays with 1-indexed Nodes
      description: |
        The problem uses nodes labeled `1` to `n`, but many implementations use 0-indexed arrays. Either allocate arrays of size `n + 1` (indices 0 to n, ignoring index 0), or subtract 1 from each node value. Mixing conventions leads to off-by-one errors.
      wrong_approach: "Array of size n with 1-indexed node access"
      correct_approach: "Array of size n+1 to accommodate 1-indexed nodes"
    - title: Using DFS/BFS for Each Edge
      description: |
        A naive approach runs DFS/BFS before each edge to check if the two nodes are already connected. This works but is O(n^2) in the worst case, because each of the n checks may traverse up to n nodes. Union-Find provides amortised O(α(n)) per operation (nearly constant), making the total time complexity O(n × α(n)) ≈ O(n).
      wrong_approach: "DFS/BFS connectivity check before each edge"
      correct_approach: "Union-Find with path compression and union by rank"
  key_takeaways:
    - "**Union-Find pattern**: The go-to data structure for dynamic connectivity problems, where you track which elements belong to the same group"
    - "**Cycle detection in graphs**: An edge creates a cycle if and only if it connects two nodes already in the same connected component"
    - "**Path compression + union by rank**: These two optimisations together give nearly O(1) amortised time per operation"
    - "**Foundation for harder problems**: This pattern extends to problems like accounts merge, number of provinces, and minimum spanning trees (Kruskal's algorithm)"
  time_complexity: "O(n × α(n)), where α is the inverse Ackermann function. With path compression and union by rank, each union/find operation is nearly O(1) amortised, and we perform n union operations."
  space_complexity: "O(n). We store parent and rank arrays of size n+1 to represent the Union-Find structure."
solutions:
  - approach_name: Union-Find
    is_optimal: true
    code: |
      def find_redundant_connection(edges: list[list[int]]) -> list[int]:
          n = len(edges)
          # Initialise parent array: each node is its own parent
          # (size n + 1 because nodes are labelled 1..n; index 0 is unused)
          parent = list(range(n + 1))
          # Rank for union by rank optimisation
          rank = [0] * (n + 1)

          def find(x: int) -> int:
              # Path compression: make each node on the path point to the root
              if parent[x] != x:
                  parent[x] = find(parent[x])
              return parent[x]

          def union(x: int, y: int) -> bool:
              # Find roots of both nodes
              root_x, root_y = find(x), find(y)

              # Already in same component - this edge creates a cycle
              if root_x == root_y:
                  return False

              # Union by rank: attach the lower-rank root under the higher-rank root
              if rank[root_x] < rank[root_y]:
                  parent[root_x] = root_y
              elif rank[root_x] > rank[root_y]:
                  parent[root_y] = root_x
              else:
                  parent[root_y] = root_x
                  rank[root_x] += 1

              return True

          # Process each edge - the first one that fails union closes the cycle
          for a, b in edges:
              if not union(a, b):
                  return [a, b]

          return []  # Problem guarantees a redundant edge exists
    explanation: |
      **Time Complexity:** O(n × α(n)). Each union/find is nearly O(1) amortised with path compression and union by rank.

      **Space Complexity:** O(n). Parent and rank arrays.

      We process edges in order. For each edge, we try to union the two nodes. If they're already in the same component (same root), this edge would create a cycle, so it is the redundant connection. Because the graph contains exactly one cycle, the edge that closes it is the cycle edge that occurs last in the input, so returning it immediately satisfies the tie-breaking rule.
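To illustrate what path compression buys, the sketch below counts parent hops on a deliberately degenerate chain. The helpers `make_chain`, `find_naive`, and `find_compressed` are hypothetical names introduced for this illustration. The compressed version is written iteratively in two passes, a swapped-in variant of the recursive `find` used in the solution, not a copy of it.

```python
def make_chain(n: int) -> list[int]:
    # Hypothetical worst case: node i points to i - 1, node 1 is the root.
    # Index 0 is unused so that nodes can stay 1-indexed.
    return [0] + [max(i - 1, 1) for i in range(1, n + 1)]


def find_naive(parent: list[int], x: int) -> tuple[int, int]:
    # Walk to the root without modifying the structure; count the hops.
    hops = 0
    while parent[x] != x:
        x = parent[x]
        hops += 1
    return x, hops


def find_compressed(parent: list[int], x: int) -> tuple[int, int]:
    # Pass 1: locate the root and count the hops taken.
    root, hops = x, 0
    while parent[root] != root:
        root = parent[root]
        hops += 1
    # Pass 2: repoint every node on the path directly at the root.
    while parent[x] != root:
        next_node = parent[x]
        parent[x] = root
        x = next_node
    return root, hops
```

On `make_chain(5)`, the first `find_compressed(parent, 5)` takes 4 hops, and a second call takes only 1 because every node on the path now points at the root. `find_naive` takes 4 hops every time on an uncompressed chain.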
  - approach_name: DFS Cycle Detection
    is_optimal: false
    code: |
      from collections import defaultdict

      def find_redundant_connection(edges: list[list[int]]) -> list[int]:
          graph = defaultdict(set)

          def has_path(source: int, target: int) -> bool:
              # Iterative DFS to check if a path exists between source and target.
              # An explicit stack avoids Python's default recursion limit, which a
              # recursive DFS can hit when a path spans close to n = 1000 nodes.
              visited = {source}
              stack = [source]
              while stack:
                  node = stack.pop()
                  if node == target:
                      return True
                  for neighbour in graph[node]:
                      if neighbour not in visited:
                          visited.add(neighbour)
                          stack.append(neighbour)
              return False

          for a, b in edges:
              # Before adding edge, check if nodes are already connected
              if has_path(a, b):
                  return [a, b]
              # Add edge to graph
              graph[a].add(b)
              graph[b].add(a)

          return []
    explanation: |
      **Time Complexity:** O(n^2). For each of n edges, we potentially traverse up to n nodes.

      **Space Complexity:** O(n). Graph adjacency list, visited set, and DFS stack.

      Before adding each edge, we use DFS to check if the two nodes are already connected. If they are, adding this edge would create a cycle. While correct, this approach is slower than Union-Find for large inputs.
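Both solutions rely on the claim that the first cycle-closing edge is also the removable edge that occurs last in the input. One way to check that claim is against a brute-force reading of the problem statement: try removing each edge, starting from the last, and return the first one whose removal leaves a tree. The names `is_tree` and `brute_force_redundant` are assumptions made for this sketch. It runs in O(n^2) and is meant for testing, not as a submission.

```python
def is_tree(n: int, edges: list[list[int]]) -> bool:
    # A graph on n nodes with exactly n - 1 edges is a tree iff it is connected.
    if len(edges) != n - 1:
        return False
    adjacency: dict[int, list[int]] = {node: [] for node in range(1, n + 1)}
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)

    # Iterative DFS from node 1 to count reachable nodes.
    seen = {1}
    stack = [1]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == n


def brute_force_redundant(edges: list[list[int]]) -> list[int]:
    n = len(edges)
    # Scan from the last edge backwards so the first hit is the latest answer.
    for i in range(n - 1, -1, -1):
        remaining = edges[:i] + edges[i + 1:]
        if is_tree(n, remaining):
            return edges[i]
    return []
```

Comparing `brute_force_redundant` with either solution on small random inputs is a cheap sanity check for the tie-breaking rule.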