How to index pandas DataFrames
Indexing pandas DataFrames in Python gives you direct, efficient access to your data. It makes it easier to select specific rows and columns, which simplifies data analysis tasks.
What is indexing in pandas?
Indexing in pandas refers to different methods you can use to select rows or columns. Using labels of rows and columns or their numerical position within the DataFrame, you can easily select elements in a DataFrame. An index serves as a type of address system for your data, helping you locate and manage your data more efficiently.
What is pandas DataFrame.index?
You can view the index labels of a DataFrame in pandas using the index attribute. The syntax looks like this:
DataFrame.index
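As a minimal sketch of what the attribute returns, the snippet below builds a small DataFrame (the data is illustrative only) and inspects its index. It assumes no index was set explicitly, in which case pandas assigns a default RangeIndex starting at 0:

```python
import pandas as pd

# Small illustrative DataFrame; no explicit index is passed
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
})

# With no explicit index, pandas uses a default RangeIndex
print(df.index)  # RangeIndex(start=0, stop=3, step=1)

# The index labels can be converted to a plain Python list
index_labels = list(df.index)
print(index_labels)  # [0, 1, 2]
```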
What is the syntax for indexing DataFrames in pandas?
There are several ways to index pandas DataFrames, and the syntax varies depending on the operation you want to perform.
Indexing with labels (column labels)
You can use column names to index pandas DataFrames. Here’s an example of how to create a sample DataFrame:
import pandas as pd
# Creating a sample DataFrame
data = {
'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'City': ['Nottingham', 'London', 'Cardiff']
}
df = pd.DataFrame(data)
print(df)
Here’s what the DataFrame looks like:
      Name  Age        City
0    Alice   25  Nottingham
1      Bob   30      London
2  Charlie   35     Cardiff
To access all the values in a column, you can use the column name together with the indexing operator []. Just enter the column name as a Python string inside the indexing operator:
# Access the Age column
print(df['Age'])
The output is a pandas Series containing the ages:
0 25
1 30
2 35
Name: Age, dtype: int64
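To retrieve the data of more than one column, pass a list of column names to the indexing operator, separating the names with commas. This results in double square brackets, for example df[['Name', 'City']]. Writing the names without the inner list, as in df['Name', 'City'], raises a KeyError.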
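Selecting several columns at once can be sketched as follows, reusing the sample data from above. The column names go into a Python list, and the result is a new DataFrame rather than a Series:

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['Nottingham', 'London', 'Cardiff'],
})

# Pass a list of column names inside the indexing operator
subset = df[['Name', 'City']]
print(subset)
```

The columns appear in the order given in the list, so this syntax can also be used to reorder columns.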
Indexing with loc[] (row labels)
If you need to access a specific row in your DataFrame, you can use the pandas loc indexer, which selects rows by their label. Because the sample DataFrame uses the default integer index, its row labels happen to match the row numbers. In this example, we’re going to use the same DataFrame as above and extract the values from the first row (label 0):
print(df.loc[0])
The code above outputs the values for Alice, which are contained in the first row of the DataFrame:
Name Alice
Age 25
City Nottingham
Name: 0, dtype: object
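The label-based nature of loc is easier to see when the row labels are not integers. The sketch below uses the Name column as the index, which is a choice made purely for illustration, and then looks up a row and a single cell by label:

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['Nottingham', 'London', 'Cardiff'],
})

# Use the names as row labels (illustrative choice)
by_name = df.set_index('Name')

# Select a whole row by its label
bob = by_name.loc['Bob']
print(bob)

# Select a single cell by row label and column label
bob_city = by_name.loc['Bob', 'City']
print(bob_city)  # London
```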
Indexing with iloc[] (row and column numbers)
Another way to access specific elements in your DataFrame is through row and column numbers. This is a common way to locate elements when you know their position rather than their label. To select by integer position, use the DataFrame’s iloc indexer, which counts rows and columns starting from 0.
# Access the first row
print(df.iloc[0])
# Access the value in the first row and second column
print(df.iloc[0, 1])
Here’s the result when working with iloc[]:
Name Alice
Age 25
City Nottingham
Name: 0, dtype: object
25
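Position-based indexing also accepts slices and negative positions. A short sketch using the same sample data follows; note that, as with ordinary Python slicing, the end position of an iloc slice is excluded:

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['Nottingham', 'London', 'Cardiff'],
})

# Rows at positions 0 and 1 (position 2 is excluded)
first_two = df.iloc[0:2]
print(first_two)

# All rows, but only the first two columns
name_age = df.iloc[:, 0:2]
print(name_age)

# Negative positions count from the end
last_row = df.iloc[-1]
print(last_row)
```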
Accessing individual values
If you just want to access a single value, the at indexer is a quick, straightforward way to do so. With this indexer, you specify the row and the column by their labels. For example, if you want to find out where Bob lives, pass the integer 1 as the row label and 'City' as the column label:
print(df.at[1, 'City'])
Here, we get the output London.
Alternatively, you can use the iat indexer, which works similarly to at but takes integer positions for both the row and the column instead of labels. The code below yields the same result as the previous example:
print(df.iat[1, 2])
Boolean indexing
You can also create subsets of a DataFrame based on a condition. This is known as Boolean indexing. The condition should evaluate to either True or False for each row and is placed directly in the indexing operator. For example, if you want to select the rows for people who are over 30 years old, you can use the following code:
# Select rows where Age is greater than 30
print(df[df['Age'] > 30])
The only person who is over 30 is Charlie, resulting in the following output:
      Name  Age     City
2  Charlie   35  Cardiff
Remember, when performing Boolean indexing, you can use any comparison operators that evaluate to True or False. Learn more about different Python operators in our dedicated article on the topic.
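Several conditions can be combined in one Boolean index. The sketch below, again using the sample data, relies on the element-wise operators & (and) and | (or) rather than the Python keywords and/or, with each condition wrapped in parentheses because of operator precedence:

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['Nottingham', 'London', 'Cardiff'],
})

# Both conditions must hold: at least 30 and not living in London
mask = (df['Age'] >= 30) & (df['City'] != 'London')
result = df[mask]
print(result)  # only Charlie's row

# Either condition may hold: under 30 or living in Cardiff
either = df[(df['Age'] < 30) | (df['City'] == 'Cardiff')]
print(either)  # rows for Alice and Charlie
```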