Published on

leetcode-295 Find Median from Data Stream

Authors
  • avatar
    Name
    Gene Zhang
    Twitter

[295] Find Median from Data Stream

Key Concept: Two Heaps (Max-Heap + Min-Heap) - Maintain two heaps: max-heap for smaller half, min-heap for larger half. Median is either the max of smaller half, min of larger half, or their average.

Balance: Keep heaps balanced in size (differ by at most 1).

# The median is the middle value in an ordered integer list. If the size of
# the list is even, there is no middle value, and the median is the mean of
# the two middle values.
# Implement the MedianFinder class with:
# - MedianFinder() initializes the object.
# - void addNum(int num) adds num to the data structure.
# - double findMedian() returns the median of all elements so far.
#
# Example 1:
# Input: ["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
#        [[], [1], [2], [], [3], []]
# Output: [null, null, null, 1.5, null, 2.0]
#
# Constraints:
# -10^5 <= num <= 10^5
# There will be at least one element before calling findMedian.
# At most 5 * 10^4 calls will be made to addNum and findMedian.

import heapq

class MedianFinder:
    def __init__(self):
        # Max-heap for smaller half (negate values for max-heap in Python)
        self.small = []  # max-heap (use negative values)
        # Min-heap for larger half
        self.large = []  # min-heap

    def addNum(self, num: int) -> None:
        # Always add to small (max-heap) first
        heapq.heappush(self.small, -num)

        # Ensure every element in small <= every element in large
        if self.small and self.large and (-self.small[0] > self.large[0]):
            val = -heapq.heappop(self.small)
            heapq.heappush(self.large, val)

        # Balance the heaps (small can have at most 1 more element)
        if len(self.small) > len(self.large) + 1:
            val = -heapq.heappop(self.small)
            heapq.heappush(self.large, val)
        if len(self.large) > len(self.small):
            val = heapq.heappop(self.large)
            heapq.heappush(self.small, -val)

    def findMedian(self) -> float:
        # If odd number of elements, median is max of small heap
        if len(self.small) > len(self.large):
            return -self.small[0]
        # If even, median is average of tops of both heaps
        return (-self.small[0] + self.large[0]) / 2.0

# Time Complexity:
#   addNum: O(log n) - heap operations
#   findMedian: O(1) - just peek at heap tops
# Space Complexity: O(n) - storing all numbers
#
# Key Pattern: Two heaps for running median
# small (max-heap): stores smaller half
# large (min-heap): stores larger half
# Median is at the "meeting point" of the two heaps
#
# This pattern extends to: sliding window median, median in stream, etc.