The DataFrame.mean() function in Python pandas calculates averages along the rows or columns of a DataFrame. Pandas mean() is essential for analysing numerical data, because the mean gives a quick indication of the central tendency of each column or row.


What is the syntax for DataFrame.mean()?

The pandas mean() function accepts up to three parameters and has the following syntax:

DataFrame.mean(axis=0, skipna=True, numeric_only=False)

What parameters can be used with pandas DataFrame.mean()?

You can use different parameters to customise how pandas DataFrame.mean() works.

| Parameter | Description | Default value |
|---|---|---|
| axis | Axis along which the mean is calculated: axis=0 returns one mean per column, axis=1 returns one mean per row | 0 |
| skipna | If set to True, NaN values are ignored | True |
| numeric_only | If set to True, only numeric data types are included in the calculation | False |
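The effect of numeric_only is easiest to see on a DataFrame that mixes text and numbers. The following is a minimal sketch; the frame df_mixed and its column names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical example data: one text column alongside two numeric columns
df_mixed = pd.DataFrame({
    "name": ["x", "y", "z"],
    "score": [1.0, 2.0, 3.0],
    "count": [10, 20, 30],
})

# numeric_only=True restricts the calculation to the numeric columns,
# so the text column "name" is left out of the result
numeric_means = df_mixed.mean(numeric_only=True)
print(numeric_means)
```

In recent pandas versions, leaving out numeric_only=True on a frame like this typically raises an error, because a mean cannot be computed for the text column.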

How to use pandas mean()

You can apply the pandas DataFrame.mean() function to both columns and rows.

Calculating average values for columns

First, we’re going to create a pandas DataFrame with some numerical data:

import pandas as pd
data = {
    'A': [1, 2, 3, 4],
    'B': [4, 5, 6, 7],
    'C': [7, 8, 9, 10]
}
df = pd.DataFrame(data)
print(df)

The resulting DataFrame looks like this:

   A  B   C
0  1  4   7
1  2  5   8
2  3  6   9
3  4  7  10

To calculate the average of each column, you can use the pandas mean() function. By default, the axis parameter is set to 0, so the mean is calculated for each column.

column_means = df.mean()
print(column_means)

The code above calculates the mean for each column (A, B and C) by dividing the sum of the column's elements by the number of elements in the column. The result is the following pandas Series:

A    2.5
B    5.5
C    8.5
dtype: float64
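The sum-divided-by-count description can be checked by hand. This is only a sketch to illustrate the arithmetic, using the same data as above; it is not meant as a replacement for mean().

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": [4, 5, 6, 7],
    "C": [7, 8, 9, 10],
})

# Sum of each column divided by the number of non-missing values in it
manual_means = df.sum() / df.count()

# Compare the manual result with the built-in calculation
print(manual_means)
print(df.mean())
```

Both print statements should show the same values for A, B and C.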

Calculating average values for rows

If you want to find the average for rows, simply set the axis parameter to 1:

row_means = df.mean(axis=1)
print(row_means)

Pandas mean() calculates each row average by dividing the sum of the elements in the row by the number of elements in that row. The call above produces the following output:

0    4.0
1    5.0
2    6.0
3    7.0
dtype: float64
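If only some columns should contribute to the row average, one option is to select those columns first and then call mean(axis=1). The sketch below assumes that only columns A and B are of interest; the variable name partial_row_means is chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": [4, 5, 6, 7],
    "C": [7, 8, 9, 10],
})

# Select columns A and B first, then average across them row by row
partial_row_means = df[["A", "B"]].mean(axis=1)
print(partial_row_means)
```

The first row, for example, now yields (1 + 4) / 2 = 2.5 instead of 4.0, because column C is no longer part of the calculation.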

Handling NaN values

In this example, we’ll use a different DataFrame, which contains NaN values:

import pandas as pd
import numpy as np
data = {
    'A': [1, 2, np.nan, 4],
    'B': [4, np.nan, 6, 7],
    'C': [7, 8, 9, np.nan]
}
df = pd.DataFrame(data)
print(df)

The code above produces the following DataFrame:

     A    B    C
0  1.0  4.0  7.0
1  2.0  NaN  8.0
2  NaN  6.0  9.0
3  4.0  7.0  NaN

When calculating the averages for columns, the skipna parameter determines whether NaN values are ignored. By default, skipna is set to True, so df.mean() automatically skips NaN values and averages only the remaining entries. If you want NaN values to be taken into account, pass skipna=False. Any column with at least one NaN will then return NaN as its mean.

mean_with_nan = df.mean() 
print(mean_with_nan)

Calling df.mean() produces the following output:

A    2.333333
B    5.666667
C    8.000000
dtype: float64
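For comparison, here is a short sketch of the same calculation with skipna=False, using the DataFrame from this section. The variable name mean_strict is chosen for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, np.nan, 4],
    "B": [4, np.nan, 6, 7],
    "C": [7, 8, 9, np.nan],
})

# skipna=False propagates NaN: a single missing value makes the column mean NaN
mean_strict = df.mean(skipna=False)
print(mean_strict)
```

Because each of the three columns contains one NaN, every entry in the resulting Series is NaN.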