LeetCode 295: Find Median from Data Stream Solution in Python – A Step-by-Step Guide
Imagine you’re keeping track of numbers coming in one by one—like 1, 3, 2—and at any moment, you need to figure out the middle value, the median, without waiting for all the numbers to arrive. That’s the neat challenge of LeetCode 295: Find Median from Data Stream, a hard-level problem that’s all about building a smart system to handle a flowing stream of data. Using Python, we’ll explore two solutions: the Best Solution, using two heaps to keep things balanced and fast, and an Alternative Solution, a sorted list that’s simpler but slower. With detailed examples, clear code, and a friendly tone—especially for the heaps breakdown—this guide will help you master this streaming median, whether you’re new to coding or leveling up. Let’s dive into the stream and find that middle number!
What Is LeetCode 295: Find Median from Data Stream?
In LeetCode 295: Find Median from Data Stream, you need to design a class MedianFinder
with two methods: addNum(num)
to add a number to the stream, and findMedian()
to return the median of all numbers seen so far. The median is the middle value in a sorted list—if the count is odd, it’s the middle number; if even, it’s the average of the two middle numbers. For example, after adding 1, 3, 2, the median is 2; add 4, it’s 2.5. This problem tests your ability to manage data efficiently, sharing a vibe with heap-based challenges like LeetCode 703: Kth Largest Element in a Stream.
Problem Statement
- Input: A stream of integers via addNum(num); findMedian() queries.
- Output: A class with:
- addNum(num): Adds num to the stream (returns nothing).
- findMedian(): Returns the median as a float.
- Rules: Handle odd and even counts; efficient for frequent adds and queries.
Constraints
- Numbers: -10⁵ to 10⁵.
- Up to 5 * 10⁴ calls to addNum and findMedian.
- At least one number before findMedian.
Examples
- Input: ["MedianFinder", "addNum(1)", "addNum(2)", "findMedian()", "addNum(3)", "findMedian()"]
- Output: [null, null, null, 1.5, null, 2.0]
- Input: ["MedianFinder", "addNum(1)", "findMedian()"]
- Output: [null, null, 1.0]
Understanding the Problem: Tracking the Middle
To solve LeetCode 295: Find Median from Data Stream in Python, we need a class that adds numbers to a growing collection and quickly finds the median. With a static list, we’d sort it each time—slow! Instead, we need a way to keep numbers organized so the middle is always easy to grab. We’ll use:
- Best Solution (Two Heaps): O(log n) add, O(1) find—fast and balanced.
- Alternative Solution (Sorted List): O(n) add, O(1) find—simple but slow.
Let’s dive into the two heaps solution with a friendly breakdown that’s easy to follow.
Best Solution: Two Heaps
Why This Is the Best Solution
The two heaps approach is the top pick for LeetCode 295 because it’s efficient: O(log n) to add a number and O(1) to find the median, with O(n) space. It uses two heaps—a max heap for the lower half of numbers and a min heap for the upper half—keeping them balanced so the median is always at the top of one or the average of both. It’s like splitting your numbers into two tidy piles—super quick for streaming data!
How It Works
Let’s picture this like organizing a line of numbers:
- Step 1: Split into Two Piles:
- Use a max heap (small) for the lower half—biggest number on top.
- Use a min heap (large) for the upper half—smallest number on top.
- Keep them balanced: same size (even count) or small has one more (odd count).
- Step 2: Add a Number:
- If small is empty or the number’s less than small’s top, add to small (max heap).
- Otherwise, add to large (min heap).
- Balance: If small has 2+ more than large, move small’s top to large; if large has more, move large’s top to small.
- Step 3: Find the Median:
- Odd count (small has one more): Median is small’s top.
- Even count (same size): Median is average of small’s top and large’s top.
- Step 4: Why It Works:
- Max heap keeps lower half’s biggest at the top; min heap keeps upper half’s smallest at the top.
- Balancing ensures the median is always ready—middle number or average of two.
It’s like keeping two sorted teams where the middle’s always in reach!
Step-by-Step Example
Example: Stream = [1, 3, 2, 4]
- Init: small (max heap) = [], large (min heap) = [].
- Add 1:
- small = [1], large = [].
- Add 3:
- 3 > 1, large = [3], small = [1].
- Balance: same size, good.
- Median: (1 + 3) / 2 = 2.0.
- Add 2:
- 2 > 1, large = [2, 3], small = [1].
- Balance: large has 2, small has 1 → move large’s min (2) to small.
- small = [2, 1] (max heap: 2 > 1), large = [3].
- Median: 2 (small’s top, odd count).
- Add 4:
- 4 > 2, large = [3, 4], small = [2, 1].
- Balance: even, good.
- Median: (2 + 3) / 2 = 2.5.
Code with Detailed Line-by-Line Explanation
Here’s the Python code, broken down so it’s easy to follow:
import heapq
class MedianFinder:
def __init__(self):
# Step 1: Set up two heaps
self.small = [] # Max heap for lower half (use negative for max)
self.large = [] # Min heap for upper half
def addNum(self, num: int) -> None:
# Step 2: Add to the right heap
if not self.small or num < -self.small[0]: # Compare with max of small
heapq.heappush(self.small, -num) # Negative for max heap
else:
heapq.heappush(self.large, num) # Min heap
# Step 3: Balance the heaps
if len(self.small) > len(self.large) + 1: # Small too big
heapq.heappush(self.large, -heapq.heappop(self.small))
elif len(self.large) > len(self.small): # Large too big
heapq.heappush(self.small, -heapq.heappop(self.large))
def findMedian(self) -> float:
# Step 4: Get the median
if len(self.small) > len(self.large): # Odd count
return float(-self.small[0]) # Top of small (negated back)
else: # Even count
return (-self.small[0] + self.large[0]) / 2.0
- Line 5-7: In __init__:
- small: Max heap (use negatives since Python has min heap only).
- large: Min heap.
- Line 10-14: In addNum:
- If empty or num < small’s max (-small[0]), push to small (negated).
- Else, push to large.
- Line 16-19: Balance:
- If small has 2+ more, pop its max (-negated) to large.
- If large has more, pop its min to small (negated).
- Line 22-25: In findMedian:
- Odd: Return small’s top (negated back).
- Even: Average small’s top and large’s top.
- Time Complexity: O(log n) add (heap ops), O(1) find.
- Space Complexity: O(n)—stores all numbers in heaps.
This is like a balancing act—keep the middle ready!
Alternative Solution: Sorted List
Why an Alternative Approach?
The sorted list approach keeps all numbers in a sorted array—O(n) add, O(1) find, O(n) space. It’s simpler but slower for adding, like sorting a deck of cards each time you get a new one—not ideal for a stream but easy to understand.
How It Works
Let’s think of this as a growing list:
- Step 1: Keep a sorted list of numbers.
- Step 2: Add by inserting in the right spot (sorted).
- Step 3: Find median by indexing the middle or averaging two middles.
It’s like lining up numbers in order each time!
Step-by-Step Example
Example: Stream = [1, 3, 2]
- Init: nums = [].
- Add 1: nums = [1].
- Add 3: nums = [1, 3].
- Median: (1 + 3) / 2 = 2.0.
- Add 2: nums = [1, 2, 3].
- Median: 2.
Code for Sorted List Approach
class MedianFinder:
def __init__(self):
self.nums = []
def addNum(self, num: int) -> None:
# Insert in sorted position
i = 0
while i < len(self.nums) and self.nums[i] < num:
i += 1
self.nums.insert(i, num)
def findMedian(self) -> float:
n = len(self.nums)
if n % 2 == 1: # Odd
return float(self.nums[n // 2])
else: # Even
return (self.nums[n // 2 - 1] + self.nums[n // 2]) / 2.0
- Time Complexity: O(n) add (insert), O(1) find.
- Space Complexity: O(n)—stores the list.
It’s a slow build but quick lookup!
Comparing the Two Solutions
- Two Heaps (Best):
- Pros: O(log n) add, O(1) find, O(n) space, fast.
- Cons: Heap logic to learn.
- Sorted List (Alternative):
- Pros: O(n) add, O(1) find, O(n) space, simple.
- Cons: Slow adds.
Heaps win for streams.
Additional Examples and Edge Cases
- [1]: 1.0.
- [2, 1]: 1.5.
- [1, 1, 1]: 1.0.
Both work, heaps faster.
Complexity Breakdown
- Heaps: Add O(log n), Find O(1), Space O(n).
- List: Add O(n), Find O(1), Space O(n).
Heaps rule.
Key Takeaways
- Heaps: Balance for speed—awesome!
- List: Sort it out—easy but slow!
- Streams: Dynamic data is fun.
- Python Tip: Heaps rock—see [Python Basics](/python/basics).
Final Thoughts: Stream That Median
LeetCode 295: Find Median from Data Stream in Python is a slick data challenge. Two heaps keep it fast, while sorted lists show the basics. Want more? Try LeetCode 703: Kth Largest Element in a Stream or LeetCode 480: Sliding Window Median. Ready to flow? Head to Solve LeetCode 295 on LeetCode and catch that median today!