# codetutor/backend/data/questions/split-array-largest-sum.yaml

title: Split Array Largest Sum
slug: split-array-largest-sum
difficulty: hard
leetcode_id: 410
leetcode_url: https://leetcode.com/problems/split-array-largest-sum/
categories:
  - arrays
  - binary-search
  - dynamic-programming
patterns:
  - slug: binary-search
    is_optimal: true
  - slug: greedy
    is_optimal: false

function_signature: "def split_array(nums: list[int], k: int) -> int:"

test_cases:
  visible:
    - input: { nums: [7, 2, 5, 10, 8], k: 2 }
      expected: 18
    - input: { nums: [1, 2, 3, 4, 5], k: 2 }
      expected: 9
  hidden:
    - input: { nums: [1, 4, 4], k: 3 }
      expected: 4
    - input: { nums: [10, 5, 13, 4, 8, 4, 5, 11, 14, 9, 16, 10, 20, 8], k: 8 }
      expected: 25
    - input: { nums: [1, 2, 3, 4, 5], k: 1 }
      expected: 15
    - input: { nums: [1, 2, 3, 4, 5], k: 5 }
      expected: 5
    - input: { nums: [2, 3, 1, 2, 4, 3], k: 5 }
      expected: 4

description: |
  Given an integer array `nums` and an integer `k`, split `nums` into `k` non-empty subarrays such that the largest sum of any subarray is **minimized**.

  Return *the minimized largest sum of the split*.

  A **subarray** is a contiguous part of the array.

constraints: |
  - `1 <= nums.length <= 1000`
  - `0 <= nums[i] <= 10^6`
  - `1 <= k <= min(50, nums.length)`

examples:
  - input: "nums = [7,2,5,10,8], k = 2"
    output: "18"
    explanation: "There are four ways to split nums into two subarrays. The best way is to split it into [7,2,5] and [10,8], where the largest sum among the two subarrays is only 18."
  - input: "nums = [1,2,3,4,5], k = 2"
    output: "9"
    explanation: "There are four ways to split nums into two subarrays. The best way is to split it into [1,2,3] and [4,5], where the largest sum among the two subarrays is only 9."

explanation:
  intuition: |
    This problem asks us to split an array into `k` contiguous parts so that the largest part sum is as small as possible. At first glance, it seems we must try every way to place the cuts, but the number of placements grows combinatorially with the array length.

    Here's the key insight: **instead of searching for where to split, search for what the answer could be**.

    Think of it like this: imagine you're a manager assigning work to `k` workers. Each worker must handle a contiguous segment of tasks, and you want to minimize the maximum workload any single worker receives. You could ask: "Is it possible to distribute the work so that no worker handles more than X units?"

    If you can answer that question efficiently, you can use **binary search** to find the smallest valid X. The answer lies somewhere between:
    - **Lower bound**: the largest single element (whichever subarray holds it has a sum at least that large, since no value is negative)
    - **Upper bound**: the sum of all elements (the result when one subarray contains everything)

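    For any candidate answer `mid`, we greedily check: can we split the array into at most `k` subarrays where each has sum ≤ `mid`? Checking "at most `k`" is enough: because `k <= len(nums)`, a valid split that uses fewer than `k` subarrays can always be cut further to reach exactly `k`, and cutting never increases any subarray's sum. If yes, we might be able to do better (search lower). If no, we need a larger limit (search higher).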
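
    To see why binary search applies, here is a minimal sketch that evaluates the feasibility question for every limit on the first example. The standalone `can_split(nums, k, limit)` helper is an illustrative assumption, written with explicit parameters so it runs on its own. The results switch from `False` to `True` exactly once, and that switch point is what binary search locates.

    ```python
    def can_split(nums: list[int], k: int, limit: int) -> bool:
        """Greedy check: do at most k subarrays suffice if none may exceed limit?"""
        subarrays, current_sum = 1, 0
        for num in nums:
            if current_sum + num > limit:
                subarrays += 1      # close the current subarray, open a new one
                current_sum = num
            else:
                current_sum += num
        return subarrays <= k


    nums, k = [7, 2, 5, 10, 8], 2
    # One result per limit from max(nums) = 10 up to sum(nums) = 32.
    feasible = [can_split(nums, k, limit) for limit in range(max(nums), sum(nums) + 1)]
    first_ok = max(nums) + feasible.index(True)
    print(feasible.count(False), feasible.count(True))  # 8 15
    print(first_ok)                                     # 18
    ```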

  approach: |
    We solve this using **Binary Search on the Answer**:

    **Step 1: Define the search space**

    - `left`: Set to `max(nums)` — the answer can't be smaller than the largest element
    - `right`: Set to `sum(nums)` — the answer can't exceed putting everything in one subarray

    &nbsp;

    **Step 2: Binary search for the minimum valid answer**

    - Calculate `mid = (left + right) // 2`
    - Check if we can split the array into at most `k` subarrays with each sum ≤ `mid`
    - If yes: this `mid` works, but maybe we can do better — set `right = mid`
    - If no: we need a larger limit — set `left = mid + 1`

    &nbsp;

    **Step 3: Greedy feasibility check (can_split function)**

    - Iterate through the array, accumulating a running sum
    - When adding the next element would exceed `mid`, start a new subarray
    - Count how many subarrays we need
    - Return `True` if we need ≤ `k` subarrays

    &nbsp;

    **Step 4: Return the result**

    - When `left == right`, we've found the minimum valid maximum subarray sum
    - Return `left`
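
    The four steps can be traced on the first example. This is a sketch only: `trace_search` is a hypothetical helper that records each iteration rather than just returning the answer, and `can_split` takes explicit parameters so the block is self-contained.

    ```python
    def can_split(nums: list[int], k: int, limit: int) -> bool:
        """Greedy check from Step 3, with explicit parameters for this sketch."""
        subarrays, current_sum = 1, 0
        for num in nums:
            if current_sum + num > limit:
                subarrays, current_sum = subarrays + 1, num
            else:
                current_sum += num
        return subarrays <= k


    def trace_search(nums: list[int], k: int) -> list[tuple[int, int, int, bool]]:
        """Record (left, right, mid, feasible) for every binary search step."""
        left, right = max(nums), sum(nums)
        steps = []
        while left < right:
            mid = (left + right) // 2
            ok = can_split(nums, k, mid)
            steps.append((left, right, mid, ok))
            if ok:
                right = mid        # mid works, keep it as a candidate
            else:
                left = mid + 1     # mid is too small
        return steps


    for step in trace_search([7, 2, 5, 10, 8], 2):
        print(step)
    # (10, 32, 21, True)
    # (10, 21, 15, False)
    # (16, 21, 18, True)
    # (16, 18, 17, False)   -> left becomes 18 and the loop ends
    ```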

  common_pitfalls:
    - title: Trying All Partitions (Exponential Blowup)
      description: |
        A natural first thought is to enumerate all ways to split the array into `k` parts. For an array of length `n`, there are `C(n-1, k-1)` ways to place `k-1` dividers among `n-1` gaps.

        With `n = 1000` and `k = 50`, that is `C(999, 49)`, which is more than 10^80 combinations. No enumeration of that size can finish within any time limit.
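
        Enumeration is still handy as a reference for tiny inputs. The sketch below is an illustration under that assumption; `split_array_brute_force` is a hypothetical name, not part of the intended solution.

        ```python
        from itertools import combinations
        from math import comb


        def split_array_brute_force(nums: list[int], k: int) -> int:
            """Try every placement of k - 1 cuts; only practical for tiny inputs."""
            n = len(nums)
            best = sum(nums)
            # A cut at position c separates nums[c - 1] from nums[c].
            for cuts in combinations(range(1, n), k - 1):
                bounds = (0, *cuts, n)
                largest = max(sum(nums[a:b]) for a, b in zip(bounds, bounds[1:]))
                best = min(best, largest)
            return best


        print(split_array_brute_force([7, 2, 5, 10, 8], 2))  # 18
        print(comb(4, 1))                # 4 placements for n = 5, k = 2
        print(comb(999, 49) > 10**80)    # True
        ```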
      wrong_approach: "Enumerate all partition combinations"
      correct_approach: "Binary search on the answer with greedy validation"

    - title: Wrong Search Space Bounds
      description: |
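        Setting `left = 0` is incorrect because the answer must be at least `max(nums)`: the largest element has to land in some subarray, and that subarray's sum is at least that element.

        For example, with `nums = [10, 1, 1, 1]` and `k = 4`, the answer is `10` (each element in its own subarray). A greedy check that opens a new subarray whenever the running sum overflows never verifies that a single element fits under the limit, so starting from `left = 0` lets the search accept limits below `10`.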
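
        A small experiment makes the failure visible. `search_with_lower_bound` is a hypothetical helper that takes the starting value of `left` as a parameter, so both choices can be compared on the same input.

        ```python
        def search_with_lower_bound(nums: list[int], k: int, left: int) -> int:
            """Binary search on the answer with a caller-chosen lower bound."""
            def can_split(limit: int) -> bool:
                parts, running = 1, 0
                for num in nums:
                    if running + num > limit:
                        parts += 1
                        running = num  # never checks that num itself fits under limit
                    else:
                        running += num
                return parts <= k

            right = sum(nums)
            while left < right:
                mid = (left + right) // 2
                if can_split(mid):
                    right = mid
                else:
                    left = mid + 1
            return left


        nums, k = [10, 1, 1, 1], 4
        print(search_with_lower_bound(nums, k, left=max(nums)))  # 10
        print(search_with_lower_bound(nums, k, left=0))          # 2, impossible since 10 > 2
        ```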
      wrong_approach: "left = 0, right = sum(nums)"
      correct_approach: "left = max(nums), right = sum(nums)"

    - title: Off-by-One in Subarray Counting
      description: |
        When checking feasibility, remember that you start with one subarray. Each time you "cut" to start a new subarray, increment the count.

        A common bug is initializing `count = 0` instead of `count = 1`, which underestimates the number of subarrays needed by one, so the check accepts limits that actually require `k + 1` subarrays.
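
        The effect shows up on the first example. In this sketch, `search_with_initial_count` is a hypothetical helper whose only variable is the starting value of the counter.

        ```python
        def search_with_initial_count(nums: list[int], k: int, initial_count: int) -> int:
            """Binary search whose feasibility check starts counting at initial_count."""
            def can_split(limit: int) -> bool:
                count, running = initial_count, 0
                for num in nums:
                    if running + num > limit:
                        count += 1
                        running = num
                    else:
                        running += num
                return count <= k

            left, right = max(nums), sum(nums)
            while left < right:
                mid = (left + right) // 2
                if can_split(mid):
                    right = mid
                else:
                    left = mid + 1
            return left


        nums, k = [7, 2, 5, 10, 8], 2
        print(search_with_initial_count(nums, k, initial_count=1))  # 18
        print(search_with_initial_count(nums, k, initial_count=0))  # 14, which needs 3 subarrays
        ```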
      wrong_approach: "Initialize subarray count to 0"
      correct_approach: "Initialize subarray count to 1 (first subarray)"

    - title: Excluding a Feasible Midpoint from the Search Range
      description: |
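        This is a "minimize the maximum" binary search. When `can_split(mid)` returns `True`, set `right = mid` (not `mid - 1`) because `mid` itself might be the answer.

        With the `while left < right` loop used here, `right = mid - 1` can throw away the optimal answer, after which the search converges on a value that is not feasible at all.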
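
        The sketch below compares the two update rules on the first example. `search_with_update` and its `keep_mid` flag are hypothetical, introduced only to switch between the rules.

        ```python
        def search_with_update(nums: list[int], k: int, keep_mid: bool) -> int:
            """Binary search on the answer; keep_mid selects the update for `right`."""
            def can_split(limit: int) -> bool:
                parts, running = 1, 0
                for num in nums:
                    if running + num > limit:
                        parts, running = parts + 1, num
                    else:
                        running += num
                return parts <= k

            left, right = max(nums), sum(nums)
            while left < right:
                mid = (left + right) // 2
                if can_split(mid):
                    right = mid if keep_mid else mid - 1
                else:
                    left = mid + 1
            return left


        nums, k = [7, 2, 5, 10, 8], 2
        print(search_with_update(nums, k, keep_mid=True))   # 18
        print(search_with_update(nums, k, keep_mid=False))  # 17, which is not feasible
        ```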
      wrong_approach: "right = mid - 1 when feasible"
      correct_approach: "right = mid when feasible (mid might be optimal)"

  key_takeaways:
    - "**Binary search on the answer**: When feasibility is monotonic in the candidate value (if a limit works, every larger limit works too), binary search the answer space and validate each candidate with a greedy check"
    - "**Greedy feasibility**: The `can_split` function greedily packs elements into subarrays — this works because we're checking a fixed limit, not optimizing"
    - "**Minimize-the-maximum pattern**: This problem structure (minimize the max of subarray sums) appears in many variants: allocating pages to students, shipping packages within D days, etc."
    - "**Search space bounds matter**: The lower bound is `max(nums)`, not `0` — understand why the bounds are what they are"

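  time_complexity: "O(n × log(sum(nums) - max(nums))). We perform binary search over a range of size `sum - max`, and each feasibility check takes O(n) time. Under the stated constraints the range is at most 10^9, so the search needs at most about 30 iterations."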
  space_complexity: "O(1). We only use a constant number of variables for tracking bounds and subarray sums."

solutions:
  - approach_name: Binary Search on Answer
    is_optimal: true
    code: |
      def split_array(nums: list[int], k: int) -> int:
          def can_split(max_sum: int) -> bool:
              """Check if we can split into <= k subarrays with each sum <= max_sum."""
              subarrays = 1  # Start with one subarray
              current_sum = 0

              for num in nums:
                  # Would adding this element exceed our limit?
                  if current_sum + num > max_sum:
                      # Start a new subarray with this element
                      subarrays += 1
                      current_sum = num
                  else:
                      # Add to current subarray
                      current_sum += num

              return subarrays <= k

          # Search space: [max element, total sum]
          left = max(nums)
          right = sum(nums)

          # Binary search for minimum valid maximum
          while left < right:
              mid = (left + right) // 2

              if can_split(mid):
                  # mid works, try to find something smaller
                  right = mid
              else:
                  # mid is too small, need larger limit
                  left = mid + 1

          return left
    explanation: |
      **Time Complexity:** O(n × log(S)) where S = sum(nums) - max(nums) — binary search with O(n) validation per iteration.

      **Space Complexity:** O(1) — only constant extra space used.

      We binary search over possible answers. For each candidate `mid`, we greedily check if the array can be split into at most `k` subarrays where each has sum ≤ `mid`. The greedy approach works because if we can fit more elements in the current subarray without exceeding `mid`, we should — this minimizes the number of subarrays needed.
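
      To make the greedy argument concrete, this sketch returns the subarrays that the greedy packing produces for a given limit. `greedy_parts` is a hypothetical helper, and it assumes `limit >= max(nums)`.

      ```python
      def greedy_parts(nums: list[int], limit: int) -> list[list[int]]:
          """Pack elements left to right, cutting only when the limit would be exceeded."""
          parts: list[list[int]] = [[]]
          running = 0
          for num in nums:
              if parts[-1] and running + num > limit:
                  parts.append([num])   # start a new subarray with this element
                  running = num
              else:
                  parts[-1].append(num)
                  running += num
          return parts


      print(greedy_parts([7, 2, 5, 10, 8], 18))  # [[7, 2, 5], [10, 8]]
      print(greedy_parts([7, 2, 5, 10, 8], 17))  # [[7, 2, 5], [10], [8]]
      ```

      The packing may use fewer than `k` subarrays; any of them with more than one element can then be cut further without raising the largest sum.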

  - approach_name: Dynamic Programming
    is_optimal: false
    code: |
      def split_array(nums: list[int], k: int) -> int:
          n = len(nums)

          # Precompute prefix sums for O(1) range sum queries
          prefix = [0] * (n + 1)
          for i in range(n):
              prefix[i + 1] = prefix[i] + nums[i]

          # dp[i][j] = min largest sum to split nums[0:i] into j parts
          dp = [[float('inf')] * (k + 1) for _ in range(n + 1)]
          dp[0][0] = 0  # Base case: empty array, 0 parts

          for i in range(1, n + 1):
              for j in range(1, min(i, k) + 1):
                  # Try all possible last subarray starting points
                  for m in range(j - 1, i):
                      # Sum of nums[m:i] = prefix[i] - prefix[m]
                      last_sum = prefix[i] - prefix[m]
                      # Max of (best for first m elements in j-1 parts) and last subarray
                      dp[i][j] = min(dp[i][j], max(dp[m][j - 1], last_sum))

          return dp[n][k]
    explanation: |
      **Time Complexity:** O(n² × k) — three nested loops over positions and partitions.

      **Space Complexity:** O(n × k) — the DP table.

      This DP approach defines `dp[i][j]` as the minimum largest subarray sum when splitting the first `i` elements into exactly `j` parts. For each state, we try all possible positions for the last cut.

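      While correct, this is slower than binary search for large inputs. It's useful for understanding the problem structure, and it can be optimized further: `dp[m][j - 1]` never decreases as `m` grows while the last subarray's sum never increases, so the best cut for each state can be located with a binary search instead of a linear scan.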
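
      Separately from any speed-up, the table can be reduced to two rows, because each `j` depends only on `j - 1`. The following is a sketch of that refinement, with `split_array_dp_rolling` as a hypothetical name; it keeps the same O(n² × k) running time but uses O(n) space.

      ```python
      from itertools import accumulate


      def split_array_dp_rolling(nums: list[int], k: int) -> int:
          """Same recurrence as the 2-D table, keeping only the previous row."""
          n = len(nums)
          prefix = [0, *accumulate(nums)]   # prefix[i] = sum(nums[:i])
          inf = float("inf")
          # prev[i] = min largest sum when nums[:i] is split into j - 1 parts
          prev = [0] + [inf] * n
          for j in range(1, k + 1):
              cur = [inf] * (n + 1)
              for i in range(j, n + 1):
                  for m in range(j - 1, i):
                      # Last subarray is nums[m:i]; the first m elements use j - 1 parts.
                      cur[i] = min(cur[i], max(prev[m], prefix[i] - prefix[m]))
              prev = cur
          return prev[n]


      print(split_array_dp_rolling([7, 2, 5, 10, 8], 2))  # 18
      ```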