Understanding pandas DataFrame iloc[]: A Comprehensive Guide

Pandas is a data manipulation and analysis library for Python, and the DataFrame is one of its core data structures. A DataFrame stores tabular data aligned in rows and columns. For effective data analysis, it is crucial to understand how to access and modify that data. This guide looks closely at one way of doing so: iloc[], which selects data by integer position.

Introduction to DataFrame iloc[]

iloc[] stands for 'integer location'. It is an indexer (an attribute you subscript with square brackets, not a function you call) that selects rows and columns of a pandas DataFrame by integer position, counting from 0. Unlike loc[], which selects by label, iloc[] ignores the index labels and looks only at position.
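
The difference from loc[] is easiest to see when the index labels are not simply 0, 1, 2. The sketch below uses a small made-up frame (the name df_labeled and its index labels are assumptions for illustration only) and reads the same row once by position and once by label.

```python
import pandas as pd

# Hypothetical frame whose index labels are deliberately not 0, 1, 2.
df_labeled = pd.DataFrame({"Age": [25, 30, 35]}, index=[10, 20, 30])

by_position = df_labeled.iloc[0]  # first row, whatever its label is
by_label = df_labeled.loc[10]     # the row whose index label is 10

# Both lookups reach the same row here.
print(by_position["Age"], by_label["Age"])
```

With this frame, df_labeled.iloc[10] would fail because there is no eleventh row, even though 10 is a valid label for loc[].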

Basic Syntax of iloc[]

DataFrame.iloc[row_indexer, column_indexer] 
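  • row_indexer : The integer position(s) of the rows that you want to select, given as a single integer, a list of integers, or a slice.
  • column_indexer : The integer position(s) of the columns that you want to select, in the same forms. It is optional; if you omit it, all columns are returned.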
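
As a rough illustration of the two-part syntax, the sketch below builds a small sample frame (df_demo is a name assumed here for illustration) and uses both indexers at once, first with two scalars and then with a slice and a list.

```python
import pandas as pd

# Small sample frame, assumed for illustration.
df_demo = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "San Francisco", "Los Angeles"],
})

# Two scalars: row position 1, column position 2 -> a single value.
cell = df_demo.iloc[1, 2]

# A slice and a list: rows 0-1, columns 0 and 2 -> a smaller DataFrame.
block = df_demo.iloc[0:2, [0, 2]]

print(cell)
print(block)
```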

Selecting Rows with iloc[]

Single Row Selection

To select a single row, pass its integer position to iloc[].

import pandas as pd

# Sample data: one dictionary key per column
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'San Francisco', 'Los Angeles']}

df = pd.DataFrame(data)

# Position 0 is the first row
selected_row = df.iloc[0]
print(selected_row)

This prints the first row (Alice's record) as a Series whose index holds the column names: Name, Age, and City.
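
A single integer collapses the row into a Series. If a one-row DataFrame is more convenient, wrapping the position in a list keeps the tabular shape. The following is a minimal sketch using the same sample data; the names as_series and as_frame are assumptions chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "San Francisco", "Los Angeles"],
})

as_series = df.iloc[0]    # scalar position -> Series indexed by column name
as_frame = df.iloc[[0]]   # list with one position -> one-row DataFrame

print(type(as_series).__name__)
print(type(as_frame).__name__)
```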

Multiple Row Selection

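You can select multiple rows by passing a slice or a list of integer positions.

# The slice 0:2 selects the rows at positions 0 and 1
selected_rows = df.iloc[0:2]
print(selected_rows)

Note that with iloc[], the end position of a slice is exclusive, as with ordinary Python slicing, so 0:2 returns two rows, not three.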
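
To make the row-selection forms concrete, here is a short sketch on the same sample data that compares a slice, a list of non-adjacent positions, and a negative position. The variable names are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "San Francisco", "Los Angeles"],
})

first_two = df.iloc[0:2]   # slice: positions 0 and 1, position 2 is excluded
picked = df.iloc[[0, 2]]   # list: exactly the positions named, in that order
last_row = df.iloc[-1]     # negative positions count back from the end

print(first_two)
print(picked)
print(last_row)
```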

Selecting Columns with iloc[]

Single Column Selection

To select a single column, pass a full slice (:) for the rows and the integer position of the column. The result is a Series.

ages = df.iloc[:, 1] 
print(ages) 

Multiple Column Selection

Select multiple columns by passing a slice or a list of integer positions.

# The slice 0:2 selects the columns at positions 0 and 1 (Name and Age)
subset = df.iloc[:, 0:2]
print(subset)
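
Columns do not have to be adjacent. The sketch below, again on the same sample data, picks columns with a list of positions and also shows one possible way to look up a column's position from its name with df.columns.get_loc. The variable names are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "San Francisco", "Los Angeles"],
})

# A list selects exactly the columns named by position.
name_and_city = df.iloc[:, [0, 2]]

# When only the column name is known, its position can be looked up first.
city_pos = df.columns.get_loc("City")
cities = df.iloc[:, city_pos]

print(name_and_city)
print(cities)
```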

Conditional Selection with iloc[]

iloc[] is position-based, so it does not accept a boolean Series such as df['Age'] > 28 directly, because a Series carries index labels. It does accept a plain boolean array or list with one entry per row, so you can convert the mask to a NumPy array first, or use loc[] when you want label-aligned boolean selection.
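
One way to combine a condition with iloc[] is to turn the boolean Series into a plain NumPy array before passing it in. The sketch below assumes the same sample data and an arbitrary threshold of 28, chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "San Francisco", "Los Angeles"],
})

# Build the condition as usual, then drop the index labels with to_numpy().
mask = (df["Age"] > 28).to_numpy()

older = df.iloc[mask]           # rows where the mask is True, all columns
older_names = df.iloc[mask, 0]  # same rows, first column only

print(older)
print(older_names)
```

When the position of the columns does not matter, df.loc[df["Age"] > 28] gives the same rows without the conversion step.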

Modifying Data with iloc[]

Updating a Single Value

# Row 0, column 1: Alice's age
df.iloc[0, 1] = 26

Updating an Entire Row

# One value per column, in column order
df.iloc[0] = ['Alicia', 26, 'Boston']

Updating an Entire Column

# One value per row, in row order
df.iloc[:, 1] = [26, 31, 36]
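
The three updates can be run together to see their combined effect. The sketch below applies them to a copy (the name updated is an assumption for illustration) so that the original frame stays available for comparison.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "San Francisco", "Los Angeles"],
})

# Work on a copy so the original data is left untouched.
updated = df.copy()

updated.iloc[0, 1] = 26                       # single cell
updated.iloc[0] = ["Alicia", 26, "Boston"]    # whole first row
updated.iloc[:, 1] = [26, 31, 36]             # whole second column

print(updated)
```

Assigning through iloc[] changes the frame in place, which is why the copy is made first in this sketch.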

Conclusion

Pandas iloc[] is an essential tool for data manipulation in Python: it lets you read and modify DataFrame values purely by integer position, regardless of the index labels. With single positions, lists, and slices for both rows and columns, you can reach any cell, row, column, or block of a DataFrame. Happy data wrangling!