LeetCode 652: Find Duplicate Subtrees Solution in Python – A Step-by-Step Guide

Imagine you’re a tree inspector examining a binary tree—like one with nodes 1, 2, 4—and you need to spot any subtrees that appear more than once, such as two identical subtrees rooted at 2 with a child 4. Your task is to find all such duplicates and report their roots. That’s the challenge of LeetCode 652: Find Duplicate Subtrees, a medium-level problem that’s all about identifying repeated structures in a binary tree. Using Python, we’ll explore two solutions: the Best Solution, a tree serialization approach with a hash map for efficiency, and an Alternative Solution, a recursive subtree comparison that’s more intuitive but slower. With detailed examples, beginner-friendly breakdowns—especially for the serialization method—and clear code, this guide will help you find those duplicates, whether you’re new to coding or leveling up. Let’s climb that tree and start inspecting!

What Is LeetCode 652: Find Duplicate Subtrees?

In LeetCode 652: Find Duplicate Subtrees, you’re given the root of a binary tree (nodes with val, left, and right attributes), and your task is to find all subtrees that appear more than once, returning a list of their root nodes. A subtree is duplicate if its structure and values match another subtree exactly. For example, in the tree [1, 2, 3, 4, null, 2, 4, null, null, 4], the subtree rooted at 2 with a left child 4 appears twice, and the subtree rooted at 4 appears three times, so you’d return nodes for one instance of each duplicate. This problem tests your ability to compare tree structures efficiently.

Problem Statement

  • Input:
    • root: Root node of a binary tree.
  • Output: A list of TreeNode objects—roots of duplicate subtrees.
  • Rules:
    • Subtree: Includes a node and all its descendants.
    • Duplicate: Identical structure and values.
    • Return one root per unique duplicate subtree.

Constraints

  • Number of nodes: 1 to 10⁴.
  • -200 ≤ Node.val ≤ 200.

Examples

  • Input: root = [1, 2, 3, 4, null, 2, 4, null, null, 4]
    • Output: [2, 4] (Subtree [2, 4] appears twice, [4] appears thrice)
  • Input: root = [2, 1, 1]
    • Output: [1] (Two subtrees [1] are identical)
  • Input: root = [2, 2, 2, 3, null, 3, null]
    • Output: [2, 3] (Subtree [2, 3] appears twice, [3] twice)

These examples set the tree—let’s solve it!

Understanding the Problem: Finding Duplicates

To solve LeetCode 652: Find Duplicate Subtrees in Python, we need to identify all subtrees in a binary tree that appear more than once, returning their root nodes. A brute-force approach—comparing every subtree with every other—would be O(n²) with O(n) space for comparison, inefficient for n ≤ 10⁴. Since subtrees are defined by structure and values, we can optimize with serialization or recursive matching. We’ll use:

  • Best Solution (Tree Serialization with Hash Map): O(n) time, O(n) space—fast and scalable.
  • Alternative Solution (Recursive Subtree Comparison): O(n²) time, O(n) space—intuitive but slower.

Let’s start with the serialization solution, breaking it down for beginners.

Best Solution: Tree Serialization with Hash Map

Why This Is the Best Solution

The tree serialization with hash map approach is the top choice for LeetCode 652 because it’s efficient—O(n) time with O(n) space—and converts each subtree into a unique string representation, using a hash map to track duplicates in a single pass. It fits constraints (n ≤ 10⁴) perfectly and is elegant once you grasp the serialization trick. It’s like fingerprinting each subtree to spot the twins!

How It Works

Think of this as a tree fingerprinter:

  • Step 1: Initialize Hash Map:
    • Map to store serialized subtree strings to their counts and nodes.
  • Step 2: Serialize Subtrees:
    • Post-order traversal (left, right, root).
    • Serialize each subtree as a string (e.g., "(left,right,root)").
    • Use delimiters to ensure uniqueness (e.g., "#" for null).
  • Step 3: Track Duplicates:
    • If a serialized string appears twice, add its root to result.
  • Step 4: Return Result:
    • List of duplicate subtree roots.

It’s like a subtree ID checker—serialize and count!

Step-by-Step Example

Example: root = [1, 2, 3, 4, null, 2, 4, null, null, 4]

  • Step 1: Init: subtrees = {}, result = [].
  • Step 2: Serialize (post-order):
    • Left subtree [2, 4]:
      • Left [4]: "(#,#,4)" → count 1.
      • Root [2, 4]: "(#,#,4,#,2)" → count 1.
    • Right subtree [3, 2, 4]:
      • Left [2]: "(#,#,2)" → count 1.
      • Right [4]: "(#,#,4)" → count 2, add node 4 to result.
      • Right [2, 4]: "(#,#,4,#,2)" → count 2, add node 2 to result.
    • Root [1, 2, 3]: "(#,#,4,#,2,(#,#,2,(#,#,4,#,2),3),1)" → count 1.
  • Step 3: Result = [node 4, node 2].
  • Output: [TreeNode(4), TreeNode(2)].

Code with Detailed Line-by-Line Explanation

Here’s the Python code, explained clearly:

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

class Solution:
    def findDuplicateSubtrees(self, root: TreeNode) -> List[TreeNode]:
        # Step 1: Initialize hash map and result
        subtrees = {}  # serialized string -> (count, node)
        result = []

        # Step 2: Helper to serialize subtrees
        def serialize(node):
            if not node:
                return "#"  # Null marker

            # Post-order: left, right, root
            left = serialize(node.left)
            right = serialize(node.right)
            curr = f"({left},{right},{node.val})"  # Unique serialization

            # Step 3: Track duplicates
            if curr in subtrees:
                count, first_node = subtrees[curr]
                if count == 1:  # Second occurrence, add to result
                    result.append(first_node)
                subtrees[curr] = (count + 1, first_node)
            else:
                subtrees[curr] = (1, node)

            return curr

        # Step 4: Start serialization
        serialize(root)
        return result
  • Lines 1-6: TreeNode class (standard).
  • Lines 10-12: Init: Hash map for subtrees, result list.
  • Lines 15-31: serialize:
    • Return "#" for null.
    • Recursively serialize left and right, form string.
    • Track in hash map, add to result on second occurrence.
  • Line 34: Start from root.
  • Time Complexity: O(n)—each node processed once.
  • Space Complexity: O(n)—hash map size.

This is like a tree matcher—serialize and spot!

Alternative Solution: Recursive Subtree Comparison

Why an Alternative Approach?

The recursive subtree comparison approach checks all pairs—O(n²) time, O(n) space. It’s more intuitive, comparing subtrees directly with recursion, but slower due to repeated comparisons. It’s like eyeballing every branch pair to find twins!

How It Works

Picture this as a tree comparer:

  • Step 1: Collect all subtrees in a list.
  • Step 2: Compare each pair:
    • Recursive function to check if two subtrees are identical.
  • Step 3: Track seen subtrees, add duplicates to result.
  • Step 4: Return unique duplicate roots.

It’s like a branch checker—compare and list!

Step-by-Step Example

Example: root = [1, 2, 3, 4, null, 2, 4]

  • Step 1: Subtrees = [[1, 2, 3], [2, 4], [3, 2, 4], [4], [2, 4], [4]].
  • Step 2: Compare:
    • [2, 4] vs [2, 4]: Match, add one [2].
    • [4] vs [4] vs [4]: Match, add one [4].
  • Step 3: Result = [node 2, node 4].
  • Output: [TreeNode(2), TreeNode(4)].

Code for Recursive Approach

class Solution:
    def findDuplicateSubtrees(self, root: TreeNode) -> List[TreeNode]:
        # Step 1: Collect all subtrees
        subtrees = []
        def collect(node):
            if not node:
                return
            subtrees.append(node)
            collect(node.left)
            collect(node.right)
        collect(root)

        # Step 2: Compare subtrees
        def is_same(tree1, tree2):
            if not tree1 and not tree2:
                return True
            if not tree1 or not tree2:
                return False
            return (tree1.val == tree2.val and 
                    is_same(tree1.left, tree2.left) and 
                    is_same(tree1.right, tree2.right))

        # Step 3: Find duplicates
        result = []
        seen = set()
        for i in range(len(subtrees)):
            if id(subtrees[i]) in seen:
                continue
            for j in range(i + 1, len(subtrees)):
                if is_same(subtrees[i], subtrees[j]):
                    result.append(subtrees[i])
                    seen.add(id(subtrees[j]))
                    break

        # Step 4: Return result
        return result
  • Lines 4-11: Collect all subtree roots.
  • Lines 14-22: is_same: Compare two subtrees recursively.
  • Lines 25-35: Find duplicates, use IDs to avoid duplicates in result.
  • Time Complexity: O(n²)—n comparisons, O(n) each.
  • Space Complexity: O(n)—subtrees list and seen set.

It’s a tree comparer—check and list!

Comparing the Two Solutions

  • Serialization (Best):
    • Pros: O(n) time, O(n) space, fast and scalable.
    • Cons: Serialization less intuitive.
  • Recursive (Alternative):
    • Pros: O(n²) time, O(n) space, straightforward.
    • Cons: Slower, repeated comparisons.

Serialization wins for efficiency.

Additional Examples and Edge Cases

  • Input: [1]
    • Output: [] (No duplicates).
  • Input: [2, 1, 1, 1, null, 1]
    • Output: [1] (Multiple [1] subtrees).

Both handle these well.

Complexity Breakdown

  • Serialization: Time O(n), Space O(n).
  • Recursive: Time O(n²), Space O(n).

Serialization is optimal.

Key Takeaways

  • Serialization: Tree IDs—smart!
  • Recursive: Direct checks—clear!
  • Duplicates: Trees are fun.
  • Python Tip: Recursion rocks—see Python Basics.

Final Thoughts: Spot Those Subtrees

LeetCode 652: Find Duplicate Subtrees in Python is a neat tree challenge. Serialization with a hash map offers speed and elegance, while recursive comparison provides clarity. Want more? Try LeetCode 572: Subtree of Another Tree or LeetCode 437: Path Sum III. Ready to inspect? Head to Solve LeetCode 652 on LeetCode and find those duplicates today!