title: Redundant Connection
slug: redundant-connection
difficulty: medium
leetcode_id: 684
leetcode_url: https://leetcode.com/problems/redundant-connection/
categories:
  - graphs
patterns:
  - union-find
  - dfs

function_signature: "def find_redundant_connection(edges: list[list[int]]) -> list[int]:"

test_cases:
  visible:
    - input: { edges: [[1, 2], [1, 3], [2, 3]] }
      expected: [2, 3]
    - input: { edges: [[1, 2], [2, 3], [3, 4], [1, 4], [1, 5]] }
      expected: [1, 4]
  hidden:
    - input: { edges: [[1, 2], [2, 3], [3, 1]] }
      expected: [3, 1]
    - input: { edges: [[1, 4], [3, 4], [1, 3], [1, 2], [4, 5]] }
      expected: [1, 3]
    - input: { edges: [[1, 5], [3, 4], [3, 5], [4, 5], [2, 4]] }
      expected: [4, 5]
    - input: { edges: [[9, 10], [5, 8], [2, 6], [1, 5], [3, 8], [4, 9], [8, 10], [4, 10], [6, 8], [7, 9]] }
      expected: [4, 10]

description: |
  In this problem, a tree is an **undirected graph** that is connected and has no cycles.

  You are given a graph that started as a tree with `n` nodes labeled from `1` to `n`, with one additional edge added. The added edge has two **different** vertices chosen from `1` to `n`, and was not an edge that already existed. The graph is represented as an array `edges` of length `n` where `edges[i] = [a_i, b_i]` indicates that there is an edge between nodes `a_i` and `b_i` in the graph.

  Return *an edge that can be removed so that the resulting graph is a tree of* `n` *nodes*. If there are multiple answers, return the answer that occurs last in the input.

constraints: |
  - `n == edges.length`
  - `3 <= n <= 1000`
  - `edges[i].length == 2`
  - `1 <= a_i < b_i <= edges.length`
  - `a_i != b_i`
  - There are no repeated edges
  - The given graph is connected

examples:
  - input: "edges = [[1,2],[1,3],[2,3]]"
    output: "[2,3]"
    explanation: "Adding edge [2,3] creates a cycle 1-2-3-1. Removing it restores the tree structure."
  - input: "edges = [[1,2],[2,3],[3,4],[1,4],[1,5]]"
    output: "[1,4]"
    explanation: "Adding edge [1,4] creates a cycle 1-2-3-4-1. Any edge on that cycle could be removed, and [1,4] is the one that occurs last in the input."

explanation:
  intuition: |
    Imagine building a forest from scratch by adding edges one by one. Each time you add an edge, you're either **connecting two separate trees** (valid) or **connecting two nodes that are already in the same tree** (creates a cycle).

    Think of it like connecting islands with bridges. If two islands are already connected (possibly through other islands), building another bridge between them creates a loop — that's the redundant connection.

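    The key insight is: **the first edge whose endpoints are already connected is the edge that closes the cycle**. Every edge before it joined two separate components, so the failing edge is also the cycle edge that occurs last in the input, which is exactly the answer the problem asks for. Processing the edges in order and returning the first one that fails to merge two components is therefore enough.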

    This is exactly what **Union-Find** (Disjoint Set Union) excels at: efficiently tracking which nodes belong to the same connected component and detecting when an edge would connect nodes already in the same component.
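
    As a rough illustration of the merge-or-detect idea, the sketch below tracks components with a plain label per node instead of a real Union-Find. The function name `first_cycle_edge` and the relabelling merge are assumptions made for this example; each merge costs O(n), so it only shows the decision being made, not an efficient implementation.

    ```python
    def first_cycle_edge(edges: list[list[int]]) -> list[int]:
        # label[v] identifies the component that node v currently belongs to
        label = {}
        for a, b in edges:
            label.setdefault(a, a)
            label.setdefault(b, b)
            if label[a] == label[b]:
                # Both endpoints are already connected: this edge closes a cycle
                return [a, b]
            # Otherwise merge: relabel every node in b's component to a's label
            old, new = label[b], label[a]
            for node in label:
                if label[node] == old:
                    label[node] = new
        return []


    print(first_cycle_edge([[1, 2], [1, 3], [2, 3]]))  # [2, 3]
    ```

    Union-Find replaces the linear relabelling loop with near-constant-time `find` and `union` operations.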

  approach: |
    We solve this using **Union-Find (Disjoint Set Union)**:

    **Step 1: Initialise the Union-Find structure**

    - `parent`: Array where `parent[i]` initially equals `i` (each node is its own parent)
    - `rank`: Array holding an upper bound on the height of each root's tree, used for the union by rank optimisation

    &nbsp;

    **Step 2: Define helper functions**

    - `find(x)`: Returns the root of `x`'s component, with path compression
    - `union(x, y)`: Merges components of `x` and `y`, returns `False` if already connected

    &nbsp;

    **Step 3: Process each edge in order**

    - For each edge `[a, b]`, attempt to union nodes `a` and `b`
    - If `find(a) == find(b)`, they are already in the same component, so this edge closes the cycle
    - Return that edge immediately: it is the first edge to fail the union, and no later edge needs to be checked

    &nbsp;

    **Step 4: Return the redundant edge**

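    - The input is a tree plus one extra edge, so it contains exactly one cycle and exactly one edge fails the union
    - Every earlier edge merged two separate components, so the failing edge is the cycle edge that occurs last in the input, which is the answer the problem asks for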

    &nbsp;

    The Union-Find approach is ideal here because it efficiently handles dynamic connectivity queries. Path compression and union by rank ensure near-constant time operations.
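
    To sanity-check that processing edges in order yields the answer that occurs last in the input, one option is a brute-force reference: remove each edge in turn, from last to first, and test whether what remains is a tree. The helper names `is_tree` and `brute_force_redundant` are hypothetical and not part of the solution; this sketch is O(n^2) and intended only for cross-checking on small inputs.

    ```python
    from collections import deque


    def is_tree(n: int, edges: list[list[int]]) -> bool:
        # A graph on n nodes is a tree iff it has n - 1 edges and is connected
        if len(edges) != n - 1:
            return False
        adjacent = {v: [] for v in range(1, n + 1)}
        for a, b in edges:
            adjacent[a].append(b)
            adjacent[b].append(a)
        # BFS from node 1 and count how many nodes are reachable
        seen = {1}
        queue = deque([1])
        while queue:
            node = queue.popleft()
            for neighbour in adjacent[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        return len(seen) == n


    def brute_force_redundant(edges: list[list[int]]) -> list[int]:
        n = len(edges)
        # Try removing edges from last to first; the first removal that
        # leaves a tree is the answer that occurs last in the input
        for i in range(n - 1, -1, -1):
            if is_tree(n, edges[:i] + edges[i + 1:]):
                return edges[i]
        return []


    print(brute_force_redundant([[1, 2], [2, 3], [3, 4], [1, 4], [1, 5]]))  # [1, 4]
    ```

    Comparing this reference against the Union-Find result on the test cases is a cheap way to catch ordering mistakes.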

  common_pitfalls:
    - title: Forgetting Path Compression
      description: |
        Without path compression (and without union by rank to keep trees shallow), `find()` can degrade to O(n) per call, making the overall solution O(n^2).

        Path compression flattens the tree structure by making each node point directly to the root during `find()` operations. This keeps the tree shallow and ensures near-constant time lookups.
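
        The difference is easiest to see on a small chain. In the sketch below, `find_plain` and `find_compressed` are illustrative names that take the `parent` array as an argument (an assumption made so the example is self-contained); only the second one rewrites the array.

        ```python
        def find_plain(parent: list[int], x: int) -> int:
            # Walks to the root without modifying the structure
            while parent[x] != x:
                x = parent[x]
            return x


        def find_compressed(parent: list[int], x: int) -> int:
            # Recursive find that repoints every visited node at the root
            if parent[x] != x:
                parent[x] = find_compressed(parent, parent[x])
            return parent[x]


        # A chain 4 -> 3 -> 2 -> 1, with node 1 as the root (index 0 unused)
        chain = [0, 1, 1, 2, 3]
        find_plain(chain, 4)
        print(chain)  # [0, 1, 1, 2, 3] (unchanged)
        find_compressed(chain, 4)
        print(chain)  # [0, 1, 1, 1, 1] (every node now points at the root)
        ```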
      wrong_approach: "Simple find without path compression"
      correct_approach: "find() with path compression: parent[x] = find(parent[x])"

    - title: Using 0-indexed Arrays with 1-indexed Nodes
      description: |
        The problem uses nodes labeled `1` to `n`, but many implementations use 0-indexed arrays.

        Either allocate arrays of size `n + 1` (indices 0 to n, ignoring index 0), or subtract 1 from each node value. Mixing conventions leads to off-by-one errors.
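
        A minimal sketch of the failure mode, assuming `n = 3` and a `parent` list indexed directly by node label (the variable names are for illustration only):

        ```python
        n = 3  # nodes are labelled 1..3

        too_small = list(range(n))    # valid indices: 0..2
        try:
            too_small[n]              # node 3 has no slot
        except IndexError:
            print("IndexError for node", n)

        parent = list(range(n + 1))   # valid indices: 0..3, index 0 unused
        print(parent[n])              # 3
        ```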
      wrong_approach: "Array of size n with 1-indexed node access"
      correct_approach: "Array of size n+1 to accommodate 1-indexed nodes"

    - title: Using DFS/BFS for Each Edge
      description: |
        A naive approach runs DFS/BFS before each edge to check if the two nodes are already connected. This works but is O(n^2) in the worst case.

        Union-Find provides amortised O(α(n)) per operation (nearly constant), making the total time complexity O(n × α(n)) ≈ O(n).
      wrong_approach: "DFS/BFS connectivity check before each edge"
      correct_approach: "Union-Find with path compression and union by rank"

  key_takeaways:
    - "**Union-Find pattern**: The go-to data structure for dynamic connectivity problems — tracking which elements belong to the same group"
    - "**Cycle detection in graphs**: An edge creates a cycle if and only if it connects two nodes already in the same connected component"
    - "**Path compression + union by rank**: These two optimisations together give nearly O(1) amortised time per operation"
    - "**Foundation for harder problems**: This pattern extends to problems like accounts merge, number of provinces, and minimum spanning trees (Kruskal's algorithm)"

  time_complexity: "O(n × α(n)), where α is the inverse Ackermann function. With path compression and union by rank, each union/find operation is nearly O(1), and we perform n operations."
  space_complexity: "O(n). We store parent and rank arrays of size n+1 to represent the Union-Find structure."

solutions:
  - approach_name: Union-Find
    is_optimal: true
    code: |
      def find_redundant_connection(edges: list[list[int]]) -> list[int]:
          n = len(edges)
          # Initialise parent array: each node is its own parent
          parent = list(range(n + 1))
          # Rank for union by rank optimisation
          rank = [0] * (n + 1)

          def find(x: int) -> int:
              # Path compression: make each node point to root
              if parent[x] != x:
                  parent[x] = find(parent[x])
              return parent[x]

          def union(x: int, y: int) -> bool:
              # Find roots of both nodes
              root_x, root_y = find(x), find(y)

              # Already in same component - this edge creates a cycle
              if root_x == root_y:
                  return False

              # Union by rank: attach the lower-rank root under the higher-rank root
              if rank[root_x] < rank[root_y]:
                  parent[root_x] = root_y
              elif rank[root_x] > rank[root_y]:
                  parent[root_y] = root_x
              else:
                  parent[root_y] = root_x
                  rank[root_x] += 1

              return True

          # Process each edge - first one that fails union is redundant
          for a, b in edges:
              if not union(a, b):
                  return [a, b]

          return []  # Problem guarantees a redundant edge exists
    explanation: |
      **Time Complexity:** O(n × α(n)) — Each union/find is nearly O(1) with path compression and union by rank.

      **Space Complexity:** O(n) — Parent and rank arrays.

      We process edges in order. For each edge, we try to union the two nodes. If they are already in the same component (same root), this edge closes the cycle, so it is the redundant connection. Because every earlier edge merged two separate components, the failing edge is the cycle edge that occurs last in the input.

  - approach_name: DFS Cycle Detection
    is_optimal: false
    code: |
      from collections import defaultdict

      def find_redundant_connection(edges: list[list[int]]) -> list[int]:
          graph = defaultdict(set)

          def has_path(source: int, target: int, visited: set) -> bool:
              # Iterative DFS to check if a path exists between source and target.
              # An explicit stack avoids Python's recursion limit (1000 by default
              # in CPython), which a recursive DFS can hit when n is close to 1000.
              stack = [source]
              visited.add(source)
              while stack:
                  node = stack.pop()
                  if node == target:
                      return True
                  for neighbour in graph[node]:
                      if neighbour not in visited:
                          visited.add(neighbour)
                          stack.append(neighbour)
              return False

          for a, b in edges:
              # Before adding edge, check if nodes are already connected
              if has_path(a, b, set()):
                  return [a, b]
              # Add edge to graph
              graph[a].add(b)
              graph[b].add(a)

          return []
    explanation: |
      **Time Complexity:** O(n^2) — For each of n edges, we potentially traverse up to n nodes.

      **Space Complexity:** O(n) — Graph adjacency sets, the visited set, and the explicit DFS stack.

      Before adding each edge, we use DFS to check if the two nodes are already connected. If they are, adding this edge would create a cycle. While correct, this approach is slower than Union-Find for large inputs.