The pandas read_csv() function is one of the most commonly used ways to read CSV files in Python and load their contents into a DataFrame. CSV (comma-separated values) is a widely used format for storing tabular data and is supported by many applications.

What is the syntax for Python pandas read_csv()?

pandas.read_csv() creates a pandas DataFrame from a CSV file. The basic syntax of the function looks like this:

import pandas as pd
df = pd.read_csv(filepath_or_buffer, sep=',', header='infer', names=None, index_col=None, usecols=None, dtype=None, ...)

What are the most important parameters for pandas.read_csv()?

pandas.read_csv() can accept a wide variety of parameters. To keep things simple, we’ll focus on the most important arguments. Here’s an overview of the key parameters you can use to specify how the function should behave:

Parameter | Meaning | Default value
filepath_or_buffer | Path to the CSV file (a Python string or path object), a URL, or a file-like object (buffer) | none (required)
sep | Delimiter that separates the values | ,
header | Row to use for the column names | infer (first row, unless names is passed)
names | Python list of column names to use, typically for files without a header row (header=None) | None
index_col | Column(s) to use as the index | None
usecols | Subset of columns to load into the DataFrame | None
dtype | Data type for all columns, or a dictionary that maps column names to data types | None

You can find a comprehensive list of the parameters for this function in the pandas documentation.
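
As a rough illustration of how these parameters combine, the sketch below reads a small semicolon-separated sample. The sample data and its column names are made up for this example, and io.StringIO stands in for a real file path:

```python
import io

import pandas as pd

# Hypothetical sample data: semicolon-separated, with a header row.
csv_text = """id;name;age;city
1;John Avery;35;Nottingham
2;Adelaide Smith;29;London
"""

df = pd.read_csv(
    io.StringIO(csv_text),          # a file-like object instead of a path
    sep=";",                        # values are separated by semicolons
    header=0,                       # the first row holds the column names
    usecols=["id", "name", "age"],  # skip the "city" column
    index_col="id",                 # use the "id" column as the index
    dtype={"age": "float64"},       # read "age" as floats instead of integers
)

print(df)
```

Passing a dictionary to dtype only affects the listed columns; pandas still infers the types of the remaining ones.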

How to access CSV files step by step

Using pandas.read_csv(), you can easily transfer data from CSV files into Python in just a few steps.

In the following examples, we’ll be working with a CSV file that’s structured like this:

1,John Avery,35,Nottingham,50000
2,Adelaide Smith,29,London,62000
3,Michael Rivera,41,Cardiff,40000
4,Grace Kim,33,Hull,35000
5,Tyler Johnson,28,Kent,52000

Step 1: Import pandas

First, import the pandas library into your Python script.

import pandas as pd

Step 2: Load the CSV file

Now you can load your CSV file into pandas using the read_csv() function. Simply pass the file path to the function. In the following code, we’ll use a file named data.csv, which is saved in the same directory as the script (relative paths are resolved against the directory the script is run from):

df = pd.read_csv('data.csv')

The code above stores the file in a DataFrame object (df), which we’ll then be able to work with. Pandas will automatically interpret the first row as column headers unless you specify otherwise.
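
The effect of header inference can be checked without a file on disk. This minimal sketch assumes records in the same layout as the sample file (only the first two are used here) and uses io.StringIO in place of data.csv:

```python
import io

import pandas as pd

# Two records in the sample file's layout, without a header row.
csv_text = "1,John Avery,35,Nottingham,50000\n2,Adelaide Smith,29,London,62000\n"

# Default behaviour (header='infer'): the first record becomes the column names.
inferred = pd.read_csv(io.StringIO(csv_text))
print(list(inferred.columns))  # ['1', 'John Avery', '35', 'Nottingham', '50000']
print(len(inferred))           # 1

# header=None: every record is treated as data and the columns are numbered.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)
print(list(no_header.columns))  # [0, 1, 2, 3, 4]
print(len(no_header))           # 2
```

Comparing the row counts is a quick way to notice that a record has been consumed as a header.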

Step 3: Display the CSV file

It’s a good idea to take a look at the first few rows of the DataFrame to make sure the file has been loaded correctly. You can use the DataFrame.head() method for this. By default, it returns the first five rows of the DataFrame, giving you a quick overview of the data’s structure:

print(df.head())

Because the sample file has no header row, pandas uses the first record as the column names. The output looks like this:

   1      John Avery  35  Nottingham  50000
0  2  Adelaide Smith  29      London  62000
1  3  Michael Rivera  41     Cardiff  40000
2  4       Grace Kim  33        Hull  35000
3  5   Tyler Johnson  28        Kent  52000

Step 4: Change the column names (optional)

If your CSV file doesn’t have a header row, you can define the column names manually:

df = pd.read_csv('data.csv', header=None, names=['ID', 'Name', 'Age', 'City', 'Salary'])

In this example, we’ve named the columns ID, Name, Age, City and Salary. The output looks like this:

   ID            Name  Age        City  Salary
0   1      John Avery   35  Nottingham   50000
1   2  Adelaide Smith   29      London   62000
2   3  Michael Rivera   41     Cardiff   40000
3   4       Grace Kim   33        Hull   35000
4   5   Tyler Johnson   28        Kent   52000

Note

In the example we used, there was a small amount of data, making it simple to manage. However, if you have a large CSV file, it’s a good idea to read it into pandas in chunks to avoid memory issues. You can use the pandas.read_csv() parameter chunksize to specify how many rows to read at a time. Using a Python for loop, you can iterate over the chunks.
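
A minimal sketch of chunked reading is shown below. It assumes the five sample records from this article as a stand-in for a large file, reads them from an in-memory buffer rather than from disk, and uses a deliberately small chunksize of 2 so that several chunks are produced:

```python
import io

import pandas as pd

# Stand-in for a large file: the five sample records in an in-memory buffer.
csv_text = """1,John Avery,35,Nottingham,50000
2,Adelaide Smith,29,London,62000
3,Michael Rivera,41,Cardiff,40000
4,Grace Kim,33,Hull,35000
5,Tyler Johnson,28,Kent,52000
"""

columns = ["ID", "Name", "Age", "City", "Salary"]
total_salary = 0
row_count = 0

# With chunksize set, read_csv() returns an iterator of DataFrames
# instead of a single DataFrame.
reader = pd.read_csv(io.StringIO(csv_text), header=None, names=columns, chunksize=2)
for chunk in reader:
    # Each chunk is a regular DataFrame with at most two rows.
    total_salary += chunk["Salary"].sum()
    row_count += len(chunk)

print(row_count)     # 5
print(total_salary)  # 239000
```

Only one chunk is held in memory at a time, so this pattern suits aggregations that can be computed incrementally, such as sums and counts.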
