LeetCode 274: H-Index Solution in Python – A Step-by-Step Guide

Imagine you’re a researcher tallying up your academic impact, counting how many of your papers have been cited at least that many times—like finding a sweet spot where your work’s reach shines. That’s the cool challenge of LeetCode 274: H-Index! This medium-level problem asks you to calculate a researcher’s h-index from an array of citation counts, where the h-index is the largest number h such that h papers have at least h citations. Using Python, we’ll explore two solutions: the Best Solution, a sorting approach with a linear scan that’s fast and clear, and an Alternative Solution, a counting sort method that’s clever but uses more space. With detailed examples, clear code, and friendly, easy-to-follow explanations—especially for the best solution—this guide will help you master the h-index and boost your coding skills. Let’s start counting those citations!

What Is LeetCode 274: H-Index?

Section link icon

In LeetCode 274: H-Index, you’re given an array citations of non-negative integers representing the number of citations for each of a researcher’s papers. Your task is to return the h-index: the maximum value h where the researcher has at least h papers with at least h citations each. For example, in [3, 0, 6, 1, 5], the h-index is 3 because there are 3 papers (3, 6, 5) with at least 3 citations. This problem introduces array manipulation and optimization, related to challenges like LeetCode 215: Kth Largest Element in an Array, but focuses on a specific citation metric.

Problem Statement

  • Input: An integer array citations.
  • Output: An integer—the researcher’s h-index.
  • Rules: h is the largest number where h papers have ≥ h citations; citations are non-negative.

Constraints

  • citations.length: 1 to 10^5.
  • citations[i]: 0 to 10^3.

Examples

  • Input: citations = [3, 0, 6, 1, 5]
    • Output: 3 (3 papers [3, 5, 6] have ≥ 3 citations).
  • Input: citations = [1, 1, 1]
    • Output: 1 (1 paper with ≥ 1 citation).
  • Input: citations = [0]
    • Output: 0 (no papers with ≥ 1 citation).

Understanding the Problem: Calculating the H-Index

Section link icon

To solve LeetCode 274: H-Index in Python, we need to find the largest h where at least h elements in the citations array are ≥ h. For [3, 0, 6, 1, 5], sort it to [0, 1, 3, 5, 6]—h=3 works because the last 3 values (3, 5, 6) are ≥ 3, but h=4 doesn’t (only 2 values ≥ 4). A basic way—trying every h from 0 to n and counting—works but takes O(n²) time. We’ll use two methods: 1. Best Solution (Sorting with Linear Scan): O(n log n) time—fast and simple. 2. Alternative Solution (Counting Sort): O(n) time—faster but uses more space.

Let’s dive into the best solution with a friendly, detailed walkthrough.

Best Solution: Sorting with Linear Scan

Section link icon

Why This Is the Best Solution

The sorting with linear scan approach is the top pick for LeetCode 274 because it runs in O(n log n) time—quick for most cases—and uses O(1) extra space beyond the input array. It sorts the citations in descending order, then scans to find the point where the count of papers matches or exceeds their citation values, making it both efficient and easy to understand once you see it in action.

How It Works

Picture this solution as lining up your papers from most-cited to least-cited, then walking down the list to find where the number of papers you’ve seen equals or exceeds their citation count—that’s your h-index. Here’s how it works, step-by-step, explained simply:

  • Step 1: Sort in Descending Order:
    • Arrange citations from largest to smallest (e.g., [3, 0, 6, 1, 5][6, 5, 3, 1, 0]).
    • This puts the highest citations first, making it easy to count papers with high citations.
  • Step 2: Scan the List:
    • For each position i (0-based index):
      • i + 1 is the number of papers seen so far.
      • Compare citations[i] (current citation count) with i + 1.
      • If citations[i] ≥ i + 1, at least i + 1 papers have ≥ citations[i] citations—keep going.
      • If citations[i] < i + 1, stop; h is the number of papers before this point.
  • Step 3: Handle Edge Cases:
    • If all values ≥ array length (e.g., [100, 100]), h is the length.
    • If all 0s, h is 0.
  • Step 4: Return h:
    • h is the last valid count where citations[i] ≥ i + 1.

It’s like counting votes—line them up big to small, and find where the tally matches the score!

Step-by-Step Example

Example: citations = [3, 0, 6, 1, 5]

  • Step 1: Sort descending: [6, 5, 3, 1, 0].
  • Step 2: Scan:
    • i=0: 6 ≥ 1 (1 paper), h=1 so far.
    • i=1: 5 ≥ 2 (2 papers), h=2.
    • i=2: 3 ≥ 3 (3 papers), h=3.
    • i=3: 1 < 4 (4 papers), stop, h=3 (3 papers ≥ 3).
  • Step 3: No edge issues (n=5, h=3 valid).
  • Result: 3.

Example: citations = [1, 1, 1]

  • Step 1: Sort: [1, 1, 1].
  • Step 2:
    • i=0: 1 ≥ 1, h=1.
    • i=1: 1 < 2, stop, h=1.
  • Result: 1.

Code with Detailed Line-by-Line Explanation

Here’s the Python code, explained in a friendly way:

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        # Step 1: Sort in descending order
        citations.sort(reverse=True)

        # Step 2: Scan to find h
        h = 0
        for i in range(len(citations)):
            if citations[i] >= i + 1:
                h = i + 1  # Update h if current citation ≥ papers seen
            else:
                break  # Stop when citation < papers

        # Step 3: Return h
        return h
  • Line 3-4: Sort citations descending (e.g., [6, 5, 3, 1, 0]).
  • Line 6-11: Loop through indices:
    • i + 1 is number of papers seen.
    • If citations[i] ≥ i + 1, set h = i + 1.
    • If not, break—h is set.
  • Line 13: Return final h.
  • Time Complexity: O(n log n)—sorting dominates.
  • Space Complexity: O(1)—in-place sort.

This solution is like a quick roll call—sort and count until the numbers align!

Alternative Solution: Counting Sort

Section link icon

Why an Alternative Approach?

The counting sort method uses a frequency array to tally citations, then scans backward to find h, running in O(n) time—faster than sorting—but uses O(n) space. It’s a clever trade-off, offering a linear-time view that’s great for seeing how counts can solve the problem differently.

How It Works

Think of this as filling buckets with citation counts, then counting backward from the highest bucket to find where papers match citations. Here’s how it works, step-by-step:

  • Step 1: Create a frequency array up to n+1 (n is array length).
  • Step 2: Count citations, capping at n (higher values don’t affect h > n).
  • Step 3: Scan backward, summing papers until sum ≥ index—h is the index.

It’s like stacking citation votes—pile them up, then count down to the winner!

Step-by-Step Example

Example: citations = [3, 0, 6, 1, 5]

  • Step 1: n = 5, array = [0] * 6 (0 to 5).
  • Step 2: Count: [1, 1, 0, 1, 0, 2] (0:1, 1:1, 3:1, 5:2).
  • Step 3: Backward:
    • i=5: Sum = 2 < 5.
    • i=4: Sum = 2 < 4.
    • i=3: Sum = 2 + 1 = 3 ≥ 3, h=3, stop.
  • Result: 3.

Code for Counting Sort Approach

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        # Step 1: Set up frequency array
        n = len(citations)
        counts = [0] * (n + 1)

        # Step 2: Fill counts, cap at n
        for citation in citations:
            counts[min(citation, n)] += 1

        # Step 3: Scan backward for h
        total = 0
        for i in range(n, -1, -1):
            total += counts[i]
            if total >= i:
                return i

        return 0
  • Time Complexity: O(n)—two linear passes.
  • Space Complexity: O(n)—counts array.

It’s a tally-and-scan—linear but spacey.

Comparing the Two Solutions

Section link icon
  • Best Solution (Sorting):
    • Pros: O(n log n) time, O(1) space, simple scan.
    • Cons: Sorting overhead.
  • Alternative Solution (Counting):
    • Pros: O(n) time, linear speed.
    • Cons: O(n) space, array setup.

Sorting wins for space and clarity.

Additional Examples and Edge Cases

Section link icon

All Zeros

  • [0, 0]0

All High

  • [100, 100]2

Mixed

  • [1, 4, 1]2

Both handle these well.

Complexity Breakdown

Section link icon
  • Sorting:
    • Time: O(n log n)—sort.
    • Space: O(1)—in-place.
  • Counting:
    • Time: O(n)—linear.
    • Space: O(n)—array.

Sorting is leaner on space.

Key Takeaways

Section link icon
  • Sorting: Order simplifies counting.
  • Counting: Buckets speed it up.
  • H-Index: Balance papers and citations.
  • Python Tip: Sorting’s a breeze—see [Python Basics](/python/basics).

Final Thoughts: Measure Impact Like a Pro

Section link icon

LeetCode 274: H-Index in Python is a rewarding citation puzzle. The sorting solution aligns papers neatly, while counting sort tallies fast. Want more? Try LeetCode 215: Kth Largest Element or LeetCode 275: H-Index II. Ready to count? Head to Solve LeetCode 274 on LeetCode and calculate that h-index today!