How to filter for distinct values with pandas DataFrame[].unique()

IONOS editorial team26/06/20252 mins

Contents

In Python pandas, you can use the unique() function to identify unique values in a column of a DataFrame. This makes it easy to get a quick overview of the different values within your dataset.

Web Hosting

Secure, reliable hosting for your website

99.9% uptime and super-fast loading
Advanced security features
Domain and email included

What is the syntax of pandas `DataFrame[].unique()`?

The basic syntax for using pandas unique() is simple. This is because the function doesn’t take any parameters:

DataFrame['column_name'].unique()

python

Keep in mind that unique() can only be applied to one column. Before calling the function, you’ll need to indicate which column you want to evaluate. The unique() function returns a numpy array containing all the different values in the order they appear, with duplicate values in the column removed. It doesn’t, however, sort the values.

Note

If you’ve been working with Python for a while, you may be familiar with the numpy equivalent to pandas unique(). For efficiency reasons, the pandas version is generally preferable.

How to use pandas DataFrame[].unique()

To use unique() in a pandas DataFrame, you need to first specify the column you want to check. In the following example, we’ll use a DataFrame that contains information about the age and city of residence of a group of individuals.

import pandas as pd
# Create a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Edward'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['Newcastle', 'London', 'Newcastle', 'Cardiff', 'London']
}
df = pd.DataFrame(data)
print(df)

python

The resulting DataFrame looks like this:

Name  	Age       City
0    Alice    	24    	Newcastle
1    Bob    	27  		London
2  Charlie    	22    	Newcastle
3    David    	32    	Cardiff
4   Edward    	29  		London

Now, let’s say we want to create a list of all the cities where the people in the DataFrame live. We can apply the pandas unique() function to the column that contains the cities.

# Find different cities
unique_cities = df['City'].unique()
print(unique_cities)

python

The output is a numpy array that lists each city once, showing that the individuals in the DataFrame are from a total of three cities: Newcastle, London and Cardiff.

['Newcastle' 'London' 'Cardiff']

Was this article helpful?

Stay on top of AI!

How to select data from pandas DataFrames with loc[]

The pandas DataFrame feature loc[] offers an easy way to extract data using labels. It’s especially useful when working with data where the positions of rows and columns aren’t always predictable. In this article, we’ll go over the syntax for pandas loc[], how to use it and what…

Python Pandas

BEST-BACKGROUNDSShutterstock

How to loop through DataFrames with pandas iterrows()

Pandas DataFrame.iterrows() is a helpful function for looping through rows in a DataFrame, especially when you need to process data row by row. This is especially useful for calculations or conditional logic. In this article, we’ll cover the syntax of panda iterrows() and show…

Python Pandas

Mr. Kosalshutterstock

How to index pandas DataFrames

Pandas DataFrame indexing is a powerful tool for efficient and effective data handling. With various methods, you can target specific data and subsets of your DataFrame. In this article, we’ll explore what the pandas DataFrame index is, how to access column and row data using…

Python Pandas

BEST-BACKGROUNDSShutterstock

How to clean data in pandas with dropna()

The pandas DataFrame.dropna() function is a powerful tool for cleaning datasets. The function efficiently removes missing values and can be used with various parameters, allowing programmers to specify different requirements for data cleaning. Learn about the syntax, parameters…

Python Pandas

Ranjit Karmakarshutterstock

What is the pandas DataFrame describe() method?

The pandas DataFrame.describe() method offers a quick way to generate a comprehensive statistical summary of numerical data in a DataFrame. With the ability to adjust percentiles and specify data types, it’s highly flexible and suited to a wide range of analysis. In this article,…

Python Pandas

How to filter for distinct values with pandas DataFrame[].unique()

What is the syntax of pandas DataFrame[].unique()?

How to use pandas DataFrame[].unique()

What is the syntax of pandas `DataFrame[].unique()`?