LeetCode 652: Find Duplicate Subtrees Solution in Python – A Step-by-Step Guide
Imagine you’re a tree inspector examining a binary tree—like one with nodes 1, 2, 4—and you need to spot any subtrees that appear more than once, such as two identical subtrees rooted at 2 with a child 4. Your task is to find all such duplicates and report their roots. That’s the challenge of LeetCode 652: Find Duplicate Subtrees, a medium-level problem that’s all about identifying repeated structures in a binary tree. Using Python, we’ll explore two solutions: the Best Solution, a tree serialization approach with a hash map for efficiency, and an Alternative Solution, a recursive subtree comparison that’s more intuitive but slower. With detailed examples, beginner-friendly breakdowns—especially for the serialization method—and clear code, this guide will help you find those duplicates, whether you’re new to coding or leveling up. Let’s climb that tree and start inspecting!
What Is LeetCode 652: Find Duplicate Subtrees?
In LeetCode 652: Find Duplicate Subtrees, you’re given the root of a binary tree (nodes with val, left, and right attributes), and your task is to find all subtrees that appear more than once, returning a list of their root nodes. A subtree is duplicate if its structure and values match another subtree exactly. For example, in the tree [1, 2, 3, 4, null, 2, 4, null, null, 4], the subtree rooted at 2 with a left child 4 appears twice, and the subtree rooted at 4 appears three times, so you’d return nodes for one instance of each duplicate. This problem tests your ability to compare tree structures efficiently.
Problem Statement
- Input:
- root: Root node of a binary tree.
- Output: A list of TreeNode objects—roots of duplicate subtrees.
- Rules:
- Subtree: Includes a node and all its descendants.
- Duplicate: Identical structure and values.
- Return one root per unique duplicate subtree.
Constraints
- Number of nodes: 1 to 10⁴.
- -200 ≤ Node.val ≤ 200.
Examples
- Input: root = [1, 2, 3, 4, null, 2, 4, null, null, 4]
- Output: [2, 4] (Subtree [2, 4] appears twice, [4] appears thrice)
- Input: root = [2, 1, 1]
- Output: [1] (Two subtrees [1] are identical)
- Input: root = [2, 2, 2, 3, null, 3, null]
- Output: [2, 3] (Subtree [2, 3] appears twice, [3] twice)
These examples set the tree—let’s solve it!
Understanding the Problem: Finding Duplicates
To solve LeetCode 652: Find Duplicate Subtrees in Python, we need to identify all subtrees in a binary tree that appear more than once, returning their root nodes. A brute-force approach—comparing every subtree with every other—would be O(n²) with O(n) space for comparison, inefficient for n ≤ 10⁴. Since subtrees are defined by structure and values, we can optimize with serialization or recursive matching. We’ll use:
- Best Solution (Tree Serialization with Hash Map): O(n) time, O(n) space—fast and scalable.
- Alternative Solution (Recursive Subtree Comparison): O(n²) time, O(n) space—intuitive but slower.
Let’s start with the serialization solution, breaking it down for beginners.
Best Solution: Tree Serialization with Hash Map
Why This Is the Best Solution
The tree serialization with hash map approach is the top choice for LeetCode 652 because it’s efficient—O(n) time with O(n) space—and converts each subtree into a unique string representation, using a hash map to track duplicates in a single pass. It fits constraints (n ≤ 10⁴) perfectly and is elegant once you grasp the serialization trick. It’s like fingerprinting each subtree to spot the twins!
How It Works
Think of this as a tree fingerprinter:
- Step 1: Initialize Hash Map:
- Map to store serialized subtree strings to their counts and nodes.
- Step 2: Serialize Subtrees:
- Post-order traversal (left, right, root).
- Serialize each subtree as a string (e.g., "(left,right,root)").
- Use delimiters to ensure uniqueness (e.g., "#" for null).
- Step 3: Track Duplicates:
- If a serialized string appears twice, add its root to result.
- Step 4: Return Result:
- List of duplicate subtree roots.
It’s like a subtree ID checker—serialize and count!
Step-by-Step Example
Example: root = [1, 2, 3, 4, null, 2, 4, null, null, 4]
- Step 1: Init: subtrees = {}, result = [].
- Step 2: Serialize (post-order):
- Left subtree [2, 4]:
- Left [4]: "(#,#,4)" → count 1.
- Root [2, 4]: "(#,#,4,#,2)" → count 1.
- Right subtree [3, 2, 4]:
- Left [2]: "(#,#,2)" → count 1.
- Right [4]: "(#,#,4)" → count 2, add node 4 to result.
- Right [2, 4]: "(#,#,4,#,2)" → count 2, add node 2 to result.
- Root [1, 2, 3]: "(#,#,4,#,2,(#,#,2,(#,#,4,#,2),3),1)" → count 1.
- Step 3: Result = [node 4, node 2].
- Output: [TreeNode(4), TreeNode(2)].
Code with Detailed Line-by-Line Explanation
Here’s the Python code, explained clearly:
class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right
class Solution:
def findDuplicateSubtrees(self, root: TreeNode) -> List[TreeNode]:
# Step 1: Initialize hash map and result
subtrees = {} # serialized string -> (count, node)
result = []
# Step 2: Helper to serialize subtrees
def serialize(node):
if not node:
return "#" # Null marker
# Post-order: left, right, root
left = serialize(node.left)
right = serialize(node.right)
curr = f"({left},{right},{node.val})" # Unique serialization
# Step 3: Track duplicates
if curr in subtrees:
count, first_node = subtrees[curr]
if count == 1: # Second occurrence, add to result
result.append(first_node)
subtrees[curr] = (count + 1, first_node)
else:
subtrees[curr] = (1, node)
return curr
# Step 4: Start serialization
serialize(root)
return result
- Lines 1-6: TreeNode class (standard).
- Lines 10-12: Init: Hash map for subtrees, result list.
- Lines 15-31: serialize:
- Return "#" for null.
- Recursively serialize left and right, form string.
- Track in hash map, add to result on second occurrence.
- Line 34: Start from root.
- Time Complexity: O(n)—each node processed once.
- Space Complexity: O(n)—hash map size.
This is like a tree matcher—serialize and spot!
Alternative Solution: Recursive Subtree Comparison
Why an Alternative Approach?
The recursive subtree comparison approach checks all pairs—O(n²) time, O(n) space. It’s more intuitive, comparing subtrees directly with recursion, but slower due to repeated comparisons. It’s like eyeballing every branch pair to find twins!
How It Works
Picture this as a tree comparer:
- Step 1: Collect all subtrees in a list.
- Step 2: Compare each pair:
- Recursive function to check if two subtrees are identical.
- Step 3: Track seen subtrees, add duplicates to result.
- Step 4: Return unique duplicate roots.
It’s like a branch checker—compare and list!
Step-by-Step Example
Example: root = [1, 2, 3, 4, null, 2, 4]
- Step 1: Subtrees = [[1, 2, 3], [2, 4], [3, 2, 4], [4], [2, 4], [4]].
- Step 2: Compare:
- [2, 4] vs [2, 4]: Match, add one [2].
- [4] vs [4] vs [4]: Match, add one [4].
- Step 3: Result = [node 2, node 4].
- Output: [TreeNode(2), TreeNode(4)].
Code for Recursive Approach
class Solution:
def findDuplicateSubtrees(self, root: TreeNode) -> List[TreeNode]:
# Step 1: Collect all subtrees
subtrees = []
def collect(node):
if not node:
return
subtrees.append(node)
collect(node.left)
collect(node.right)
collect(root)
# Step 2: Compare subtrees
def is_same(tree1, tree2):
if not tree1 and not tree2:
return True
if not tree1 or not tree2:
return False
return (tree1.val == tree2.val and
is_same(tree1.left, tree2.left) and
is_same(tree1.right, tree2.right))
# Step 3: Find duplicates
result = []
seen = set()
for i in range(len(subtrees)):
if id(subtrees[i]) in seen:
continue
for j in range(i + 1, len(subtrees)):
if is_same(subtrees[i], subtrees[j]):
result.append(subtrees[i])
seen.add(id(subtrees[j]))
break
# Step 4: Return result
return result
- Lines 4-11: Collect all subtree roots.
- Lines 14-22: is_same: Compare two subtrees recursively.
- Lines 25-35: Find duplicates, use IDs to avoid duplicates in result.
- Time Complexity: O(n²)—n comparisons, O(n) each.
- Space Complexity: O(n)—subtrees list and seen set.
It’s a tree comparer—check and list!
Comparing the Two Solutions
- Serialization (Best):
- Pros: O(n) time, O(n) space, fast and scalable.
- Cons: Serialization less intuitive.
- Recursive (Alternative):
- Pros: O(n²) time, O(n) space, straightforward.
- Cons: Slower, repeated comparisons.
Serialization wins for efficiency.
Additional Examples and Edge Cases
- Input: [1]
- Output: [] (No duplicates).
- Input: [2, 1, 1, 1, null, 1]
- Output: [1] (Multiple [1] subtrees).
Both handle these well.
Complexity Breakdown
- Serialization: Time O(n), Space O(n).
- Recursive: Time O(n²), Space O(n).
Serialization is optimal.
Key Takeaways
- Serialization: Tree IDs—smart!
- Recursive: Direct checks—clear!
- Duplicates: Trees are fun.
- Python Tip: Recursion rocks—see Python Basics.
Final Thoughts: Spot Those Subtrees
LeetCode 652: Find Duplicate Subtrees in Python is a neat tree challenge. Serialization with a hash map offers speed and elegance, while recursive comparison provides clarity. Want more? Try LeetCode 572: Subtree of Another Tree or LeetCode 437: Path Sum III. Ready to inspect? Head to Solve LeetCode 652 on LeetCode and find those duplicates today!