# codetutor/backend/data/questions/avoid-flood-in-the-city.yaml

title: Avoid Flood in The City
slug: avoid-flood-in-the-city
difficulty: medium
leetcode_id: 1488
leetcode_url: https://leetcode.com/problems/avoid-flood-in-the-city/
categories:
  - arrays
  - hash-tables
patterns:
  - slug: greedy
    is_optimal: false
  - slug: binary-search
    is_optimal: true

function_signature: "def avoid_flood(rains: list[int]) -> list[int]:"

test_cases:
  visible:
    - input: { rains: [1, 2, 3, 4] }
      expected: [-1, -1, -1, -1]
    - input: { rains: [1, 2, 0, 0, 2, 1] }
      expected: [-1, -1, 2, 1, -1, -1]
    - input: { rains: [1, 2, 0, 1, 2] }
      expected: []
  hidden:
    - input: { rains: [1, 0, 1] }
      expected: [-1, 1, -1]
    - input: { rains: [0, 1, 1] }
      expected: []
    - input: { rains: [1, 1] }
      expected: []
    - input: { rains: [0] }
      expected: [1]
    - input: { rains: [1] }
      expected: [-1]

description: |
  Your country has an infinite number of lakes. Initially, all the lakes are empty, but when it rains over the n<sup>th</sup> lake, that lake becomes full of water. If it rains over a lake that is **full of water**, there will be a **flood**.

  Your goal is to avoid floods in any lake.

  Given an integer array `rains` where:

  - `rains[i] > 0` means it will rain over lake `rains[i]`.
  - `rains[i] == 0` means there is no rain this day, and you **must** choose **one lake** to **dry**.

  Return *an array* `ans` where:

  - `ans.length == rains.length`
  - `ans[i] == -1` if `rains[i] > 0`.
  - `ans[i]` is the lake you choose to dry on the i<sup>th</sup> day if `rains[i] == 0`.

  If there are multiple valid answers, return **any** of them. If it is impossible to avoid a flood, return **an empty array**.

  **Note:** If you choose to dry a full lake, it becomes empty. If you dry an empty lake, nothing changes.

constraints: |
  - `1 <= rains.length <= 10^5`
  - `0 <= rains[i] <= 10^9`

examples:
  - input: "rains = [1,2,3,4]"
    output: "[-1,-1,-1,-1]"
    explanation: "No dry days needed. Lakes 1, 2, 3, 4 each fill once without any lake being rained on twice."
  - input: "rains = [1,2,0,0,2,1]"
    output: "[-1,-1,2,1,-1,-1]"
    explanation: "On day 3, we dry lake 2. On day 4, we dry lake 1. This prevents floods when lakes 2 and 1 are rained on again on days 5 and 6."
  - input: "rains = [1,2,0,1,2]"
    output: "[]"
    explanation: "After day 2, lakes 1 and 2 are full. We only have one dry day (day 3). On days 4 and 5, both lakes 1 and 2 are rained on again. We can only dry one, so a flood is unavoidable."

explanation:
  intuition: |
    Imagine you're a city planner with weather forecasts for the coming days. You know exactly when each lake will be rained on, and on dry days, you get to send a crew to empty one lake.

    The key insight is that **not all dry days are equal**. When a lake is about to flood (because it's full and will be rained on again), you need a dry day *between* the two rain events for that lake. This is a scheduling problem: you must match dry days to the right lakes.

    Think of it like this: when you see rain coming for lake X, and lake X is already full, you need to look *backwards* and find a dry day you haven't used yet that falls *after* the last time lake X was filled. You're essentially doing just-in-time scheduling — you don't decide what to dry until you *need* to dry it.

    This is where **greedy + binary search** shines. We save up our dry days, and when a flood is imminent, we find the earliest available dry day that can prevent it. Using the earliest valid day is optimal because it preserves later dry days for future emergencies.

  approach: |
    We use a **Greedy with Binary Search** approach to optimally schedule which lakes to dry:

    **Step 1: Initialise data structures**

    - `full_lakes`: A hash map storing `{lake_number: day_it_was_last_filled}` — tracks which lakes are currently full
    - `dry_days`: A sorted list of day indices where `rains[i] == 0` — our available dry days to use
    - `result`: Output array initialised with `-1` (we'll update dry day values later)

    &nbsp;

    **Step 2: Iterate through each day**

    - If `rains[i] == 0` (dry day):
      - Add day `i` to our `dry_days` list (we'll decide later what to dry)
      - For now, set `result[i] = 1` (placeholder — any lake number works if we end up not needing it)

    - If `rains[i] > 0` (rain day for lake `rains[i]`):
      - Check if this lake is already in `full_lakes`
      - If **not full**: add it to `full_lakes` with the current day index
      - If **already full**: we need to dry it before today
        - Binary search `dry_days` for the smallest day index greater than the day the lake was last filled
        - If no such day exists, return `[]` (flood is unavoidable)
        - Otherwise, use that dry day: set `result[dry_day] = lake_number`, remove the day from `dry_days`
        - Update `full_lakes[lake]` to the current day

    &nbsp;

    **Step 3: Return the result**

    - If we processed all days without returning early, return `result`

    &nbsp;

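    The greedy choice is to use the **earliest valid dry day** when a flood is imminent. This is safe because every unused dry day lies before today, so a future request only imposes a lower bound on it: any later dry day can serve every future request that an earlier one could. Spending the earliest valid day therefore never rules out a schedule that would otherwise exist.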
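
    A minimal sketch of why the choice of day matters, assuming a hypothetical helper `schedule` (not one of the reference solutions) that can pick either the earliest or the latest valid dry day. On `[1, 0, 2, 0, 1, 2]`, picking the latest day for lake 1 spends the only dry day that could have saved lake 2:

    ```python
    from bisect import bisect_right


    def schedule(rains: list[int], pick_latest: bool = False) -> list[int]:
        """Greedy scheduler; `pick_latest` switches to the (incorrect) latest-day rule."""
        result = [-1] * len(rains)
        last_filled: dict[int, int] = {}  # lake -> day it was last filled
        dry_days: list[int] = []          # unused dry days, always in increasing order

        for day, lake in enumerate(rains):
            if lake == 0:
                dry_days.append(day)
                result[day] = 1  # placeholder; overwritten if the day is used
                continue
            if lake in last_filled:
                # Candidates are dry_days[idx:], all strictly after the last fill.
                idx = bisect_right(dry_days, last_filled[lake])
                if idx == len(dry_days):
                    return []  # no dry day between the two rains
                chosen = dry_days.pop(-1 if pick_latest else idx)
                result[chosen] = lake
            last_filled[lake] = day
        return result


    rains = [1, 0, 2, 0, 1, 2]
    print(schedule(rains))                    # [-1, 1, -1, 2, -1, -1]
    print(schedule(rains, pick_latest=True))  # []
    ```

    The plain list keeps the sketch dependency-free; `list.pop` makes it O(n) per removal, so it illustrates the greedy rule rather than the optimal data structure.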

  common_pitfalls:
    - title: Pre-assigning Dry Days
      description: |
        A common mistake is trying to decide what lake to dry *on* a dry day. But you don't have enough information yet — you don't know which lakes will need drying in the future.

        For example, with `rains = [1, 2, 0, 2]` you must dry lake 2 on the dry day (index 2), but with `rains = [1, 2, 0, 1]` you must dry lake 1. Both inputs look identical up to the dry day, so no rule that looks only at the past can pick correctly in both cases.

        The correct approach is to *defer* the decision until you actually need to prevent a flood.
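
        To see the problem concretely, here is a small sketch that assumes a hypothetical checker `avoids_flood` (not part of the reference solutions). It only simulates the days in order and does not validate the output format. Two inputs share the prefix `[1, 2, 0]`, yet each accepts a different choice on the dry day:

        ```python
        def avoids_flood(rains: list[int], ans: list[int]) -> bool:
            """Simulate the days in order; True if `ans` keeps every lake from flooding."""
            full: set[int] = set()
            for rain, action in zip(rains, ans):
                if rain > 0:
                    if rain in full:
                        return False  # rain on a full lake: flood
                    full.add(rain)
                else:
                    full.discard(action)  # drying an empty lake changes nothing
            return True


        # Same prefix [1, 2, 0], but the right choice on the dry day differs.
        print(avoids_flood([1, 2, 0, 2], [-1, -1, 2, -1]))  # True
        print(avoids_flood([1, 2, 0, 2], [-1, -1, 1, -1]))  # False
        print(avoids_flood([1, 2, 0, 1], [-1, -1, 1, -1]))  # True
        print(avoids_flood([1, 2, 0, 1], [-1, -1, 2, -1]))  # False
        ```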
      wrong_approach: "Decide what to dry immediately on dry days"
      correct_approach: "Save dry days and assign them when needed"

    - title: Using Any Available Dry Day
      description: |
        When a lake is about to flood, you might think any unused dry day would work. But the dry day must occur *after* the lake was last filled.

        For example, with `rains = [0, 1, 1]`:
        - Day 0 is dry
        - Day 1 fills lake 1
        - Day 2 rains on lake 1 again

        You cannot use day 0 to dry lake 1 — the lake wasn't even full yet! The dry day must be between the two rain events. This is why binary search is needed to find the first dry day **after** the lake was filled.
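
        The lookup can be sketched with the standard `bisect` module, assuming `dry_days` is kept in increasing order. The helper name `earliest_dry_day_after` is made up for illustration:

        ```python
        from bisect import bisect_right
        from typing import Optional


        def earliest_dry_day_after(dry_days: list[int], last_filled: int) -> Optional[int]:
            """Return the first day in sorted `dry_days` strictly after `last_filled`, or None."""
            idx = bisect_right(dry_days, last_filled)
            return dry_days[idx] if idx < len(dry_days) else None


        # rains = [0, 1, 1]: lake 1 was filled on day 1, and the only dry day is day 0.
        print(earliest_dry_day_after([0], 1))     # None
        # rains = [1, 2, 0, 0, 2, 1]: lake 2 was filled on day 1; dry days are 2 and 3.
        print(earliest_dry_day_after([2, 3], 1))  # 2
        ```

        A `None` result corresponds to returning `[]` in the full solution.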
      wrong_approach: "Use any available dry day"
      correct_approach: "Binary search for a dry day after the lake was filled"

    - title: Linear Search for Dry Days
      description: |
        With up to `10^5` days and potentially many dry days to search through, a linear search for each flood prevention would result in O(n²) time complexity.

        Using a sorted list with binary search (or a balanced BST / SortedList in Python) reduces each lookup to O(log n), making the overall algorithm O(n log n).
      wrong_approach: "Linear scan through dry days each time"
      correct_approach: "Binary search in a sorted structure"

  key_takeaways:
    - "**Deferred decision-making**: Don't assign resources until you know they're needed. Saving dry days and using them just-in-time gives maximum flexibility."
    - "**Greedy + Binary Search**: When scheduling limited resources, use the earliest valid option to preserve later options for future needs."
    - "**Hash map for state tracking**: `full_lakes` provides O(1) lookup to check if a lake is full and when it was last filled."
    - "**Similar problems**: This pattern of matching resources to constraints appears in interval scheduling, task assignment, and meeting room problems."

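  time_complexity: "O(n log n). Each of the n days is processed once. Each flood check does one binary search, and each dry day is inserted into and removed from the sorted structure at most once, at O(log n) per operation for a balanced BST (`SortedList` performs comparably in practice)."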
  space_complexity: "O(n). The hash map `full_lakes` and sorted list `dry_days` each store at most n entries."

solutions:
  - approach_name: Greedy with Binary Search
    is_optimal: true
    code: |
      from sortedcontainers import SortedList

      def avoid_flood(rains: list[int]) -> list[int]:
          n = len(rains)
          result = [-1] * n

          # Track which lakes are full: {lake_id: day_it_was_filled}
          full_lakes = {}

          # Sorted list of available dry day indices
          dry_days = SortedList()

          for day in range(n):
              lake = rains[day]

              if lake == 0:
                  # Dry day - save it for later, use placeholder value
                  dry_days.add(day)
                  result[day] = 1  # Placeholder (dry any lake)
              else:
                  # Rain day for this lake
                  if lake in full_lakes:
                      # Lake is already full - we need to dry it!
                      last_filled = full_lakes[lake]

                      # Find the earliest dry day AFTER the lake was filled
                      idx = dry_days.bisect_right(last_filled)

                      if idx == len(dry_days):
                          # No valid dry day exists - flood is unavoidable
                          return []

                      # Use this dry day to dry the lake
                      dry_day = dry_days[idx]
                      result[dry_day] = lake
                      dry_days.remove(dry_day)

                  # Mark the lake as full (or update when it was filled)
                  full_lakes[lake] = day

          return result
    explanation: |
      **Time Complexity:** O(n log n) — Each day is processed once. Binary search and sorted list operations are O(log n).

      **Space Complexity:** O(n) — Hash map and sorted list store at most n elements.

      We use `SortedList` from the third-party `sortedcontainers` library (not part of the standard library) for efficient binary search with insertion/deletion. When a lake is about to flood, we find the earliest dry day after it was filled. If no such day exists, a flood is unavoidable.

  - approach_name: Greedy with bisect on a Plain List (Alternative)
    is_optimal: false
    code: |
      from bisect import bisect_right

      def avoid_flood(rains: list[int]) -> list[int]:
          n = len(rains)
          result = [-1] * n

          # Track which lakes are full: {lake_id: day_it_was_filled}
          full_lakes = {}

          # Unused dry days; appended in day order, so the list stays sorted for bisect
          dry_days = []

          for day in range(n):
              lake = rains[day]

              if lake == 0:
                  dry_days.append(day)
                  result[day] = 1  # Placeholder
              else:
                  if lake in full_lakes:
                      last_filled = full_lakes[lake]

                      # Binary search for first dry day > last_filled
                      idx = bisect_right(dry_days, last_filled)

                      if idx == len(dry_days):
                          return []

                      # Use and remove this dry day
                      dry_day = dry_days.pop(idx)
                      result[dry_day] = lake

                  full_lakes[lake] = day

          return result
    explanation: |
      **Time Complexity:** O(n²) in worst case — While binary search is O(log n), `list.pop(idx)` is O(n) for middle elements.

      **Space Complexity:** O(n) — Same storage requirements.

      This uses Python's built-in `bisect` module instead of `SortedList`. It's simpler but less efficient because removing from the middle of a list is O(n). For interview purposes, this solution is often acceptable if you explain the trade-off and mention that a balanced BST or `SortedList` would improve it.