In Python pandas, you can use the unique() method to identify the distinct values in a column of a DataFrame. This makes it easy to get a quick overview of the different values within your dataset.

What is the syntax of pandas DataFrame[].unique()?

The basic syntax for using pandas unique() is simple. This is because the function doesn’t take any parameters:

DataFrame['column_name'].unique()

Keep in mind that unique() is a Series method, so it can only be applied to one column at a time. Before calling it, you need to select the column you want to evaluate. For standard column types, unique() returns a NumPy array containing each distinct value once, in the order of first appearance. It doesn’t sort the values.
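The ordering behavior can be illustrated with a minimal sketch. The Series below is hypothetical sample data made up for this illustration; it is not part of the article's DataFrame.

```python
import pandas as pd

# Hypothetical sample data, chosen so that first-appearance order
# differs from alphabetical order
letters = pd.Series(['b', 'a', 'b', 'c', 'a'])

# unique() keeps each value once, in the order it first appears
first_seen = letters.unique()
print(first_seen)

# If sorted output is needed, sort explicitly afterwards
sorted_values = sorted(first_seen)
print(sorted_values)
```

Sorting is a separate, explicit step here, which keeps the original order available when it matters.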

Note

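If you’ve been working with Python for a while, you may be familiar with numpy.unique(), the NumPy equivalent to pandas unique(). The two differ in one visible way: the NumPy function returns the values sorted, while the pandas version keeps them in order of appearance. For efficiency reasons, the pandas version is generally preferable, particularly on longer columns.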
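As a rough comparison, the sketch below runs both versions on the same values. The variable names are illustrative, and the example only shows the difference in output order; it makes no attempt to measure speed.

```python
import numpy as np
import pandas as pd

# Sample values for illustration
cities = pd.Series(['Newcastle', 'London', 'Newcastle', 'Cardiff', 'London'])

# pandas: distinct values in order of first appearance
pandas_result = cities.unique()

# NumPy: distinct values in sorted order
numpy_result = np.unique(cities.to_numpy())

print(list(pandas_result))
print(list(numpy_result))
```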

How to use pandas DataFrame[].unique()

To use unique() in a pandas DataFrame, you need to first specify the column you want to check. In the following example, we’ll use a DataFrame that contains information about the age and city of residence of a group of individuals.

import pandas as pd
# Create a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Edward'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['Newcastle', 'London', 'Newcastle', 'Cardiff', 'London']
}
df = pd.DataFrame(data)
print(df)

The resulting DataFrame looks like this:

      Name  Age       City
0    Alice   24  Newcastle
1      Bob   27     London
2  Charlie   22  Newcastle
3    David   32    Cardiff
4   Edward   29     London

Now, let’s say we want to find all the different cities where the people in the DataFrame live. We can apply the pandas unique() method to the column that contains the cities.

# Find different cities
unique_cities = df['City'].unique()
print(unique_cities)

The output is a numpy array that lists each city once, showing that the individuals in the DataFrame are from a total of three cities: Newcastle, London and Cardiff.

['Newcastle' 'London' 'Cardiff']
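If a plain Python list or just the number of distinct values is needed, one possible follow-up is sketched below. It rebuilds the same sample DataFrame so that it runs on its own. The nunique() method is not covered in this article; it is a separate pandas Series method that counts distinct values.

```python
import pandas as pd

# Same sample DataFrame as in the example above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Edward'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['Newcastle', 'London', 'Newcastle', 'Cardiff', 'London']
})

# Convert the result of unique() to a regular Python list
city_list = list(df['City'].unique())
print(city_list)

# Count the distinct cities without building the array yourself
city_count = df['City'].nunique()
print(city_count)
```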