How to load files into Python with pandas read_csv()

IONOS editorial team · 20/06/2025 · 4 mins

The pandas read_csv() function is one of the most commonly used ways to read CSV files into Python and store them as DataFrames. CSV (comma-separated values) files are a widely used format for storing tabular data and are supported by many applications.

What is the syntax for Python pandas `read_csv()`?

pandas.read_csv() creates a pandas DataFrame from a CSV file. The basic syntax of the function looks like this:

import pandas as pd
df = pd.read_csv(filepath_or_buffer, sep=',', header='infer', names=None, index_col=None, usecols=None, dtype=None, ...)

What are the most important parameters for `pandas.read_csv()`?

pandas.read_csv() can accept a wide variety of parameters. To keep things simple, we’ll focus on the most important arguments. Here’s an overview of the key parameters you can use to specify how the function should behave:

Parameter	Meaning	Default Value
`filepath_or_buffer`	The path to the CSV file (as a string or path object), a URL, or a file-like object such as an open file handle.	None (required)
`sep`	This specifies the delimiter between values.	`,`
`header`	Indicates which row to use as the header.	`infer` (the first row, unless `names` is passed)
`names`	A Python list of column names. Use it together with `header=None` when the file has no header row.	`None`
`index_col`	Determines which column to use as the index.	`None`
`usecols`	This parameter allows you to select which columns you want to load into the DataFrame.	`None`
`dtype`	Specifies the data type of the columns.	`None`

You can find a comprehensive list of the parameters for this function in the pandas documentation.
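
To make the table more concrete, here is a minimal sketch that combines several of these parameters. The semicolon-separated sample text and its column names are invented for illustration, and io.StringIO stands in for a file on disk so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical sample data: semicolon-delimited, with a header row.
raw = io.StringIO(
    "id;name;age;city\n"
    "1;John Avery;35;Nottingham\n"
    "2;Adelaide Smith;29;London\n"
)

df = pd.read_csv(
    raw,
    sep=";",                        # values are separated by semicolons
    usecols=["id", "name", "age"],  # load only these columns (city is skipped)
    index_col="id",                 # use the id column as the row index
    dtype={"age": "float64"},       # read age as floats instead of integers
)

print(df)
```

Passing a dictionary to dtype sets the type per column; columns that are not listed keep the type pandas infers.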

How to access CSV files step by step

Using pandas.read_csv(), you can easily transfer data from CSV files into Python in just a few steps.

In the following examples, we’ll be working with a CSV file that’s structured like this:

1,John Avery,35,Nottingham,50000
2,Adelaide Smith,29,London,62000
3,Michael Rivera,41,Cardiff,40000
4,Grace Kim,33,Hull,35000
5,Tyler Johnson,28,Kent,52000
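
If you want to follow along, the sample file can be created with a few lines of Python. This is only a convenience sketch: it assumes you are happy to write a file named data.csv into the current working directory.

```python
from pathlib import Path

# The five sample records from the article, one CSV line each (no header row).
rows = [
    "1,John Avery,35,Nottingham,50000",
    "2,Adelaide Smith,29,London,62000",
    "3,Michael Rivera,41,Cardiff,40000",
    "4,Grace Kim,33,Hull,35000",
    "5,Tyler Johnson,28,Kent,52000",
]

# Write the records to data.csv in the current working directory.
Path("data.csv").write_text("\n".join(rows) + "\n", encoding="utf-8")
```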

Step 1: Import pandas

First, import the pandas library into your Python script.

import pandas as pd

Step 2: Load the CSV file

Now you can load your CSV file into pandas using the read_csv() function. Simply pass the file path to the function. In the following code, we’ll use a file named data.csv, which is saved in the same directory as the script:

df = pd.read_csv('data.csv')

The code above stores the file’s contents in a DataFrame object (df), which we can then work with. By default, pandas interprets the first row as column headers unless you specify otherwise.

Step 3: Display the CSV file

It’s a good idea to take a look at the first few rows of the DataFrame to make sure the file has been loaded correctly. You can use the DataFrame.head() method for this. By default, it shows the first five rows of the DataFrame, giving you a quick overview of the data’s structure:

print(df.head())

Because the sample file has no header row, pandas uses its first data row as the column names, so the output looks like this:

   1      John Avery  35  Nottingham  50000
0  2  Adelaide Smith  29      London  62000
1  3  Michael Rivera  41     Cardiff  40000
2  4       Grace Kim  33        Hull  35000
3  5   Tyler Johnson  28        Kent  52000
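
A quick way to check how the first row was treated is to compare DataFrame shapes. The sketch below is an illustration under one assumption: it reads the same five sample rows from an in-memory string rather than from data.csv, so that it runs on its own.

```python
import io

import pandas as pd

# Same five rows as the sample file, held in memory for a self-contained example.
csv_text = """1,John Avery,35,Nottingham,50000
2,Adelaide Smith,29,London,62000
3,Michael Rivera,41,Cardiff,40000
4,Grace Kim,33,Hull,35000
5,Tyler Johnson,28,Kent,52000
"""

# Default behavior: the first row is used as the header.
default_df = pd.read_csv(io.StringIO(csv_text))

# header=None: every row is kept as data and columns are numbered 0 to 4.
no_header_df = pd.read_csv(io.StringIO(csv_text), header=None)

print(default_df.shape)      # (rows, columns) when the first row is the header
print(no_header_df.shape)    # (rows, columns) when all rows are data
print(no_header_df.head(2))  # head() accepts the number of rows to show
```

If the row count is one lower than the number of lines in the file, the first record has most likely been read as a header.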

Step 4: Change the column names (optional)

If your CSV file doesn’t have a header row, you can define the column names manually:

df = pd.read_csv('data.csv', header=None, names=['ID', 'Name', 'Age', 'City', 'Salary'])

In this example, we’ve named the columns ID, Name, Age, City and Salary. The output looks like this:

   ID            Name  Age        City  Salary
0   1      John Avery   35  Nottingham   50000
1   2  Adelaide Smith   29      London   62000
2   3  Michael Rivera   41     Cardiff   40000
3   4       Grace Kim   33        Hull   35000
4   5   Tyler Johnson   28        Kent   52000
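
Once the columns have names, they can be used elsewhere in the same call and in later lookups. The following sketch goes one step beyond the example above by also passing index_col, which is an addition for illustration; the data is again read from an in-memory string instead of data.csv.

```python
import io

import pandas as pd

# Same five rows as the sample file, held in memory for a self-contained example.
csv_text = """1,John Avery,35,Nottingham,50000
2,Adelaide Smith,29,London,62000
3,Michael Rivera,41,Cardiff,40000
4,Grace Kim,33,Hull,35000
5,Tyler Johnson,28,Kent,52000
"""

df = pd.read_csv(
    io.StringIO(csv_text),
    header=None,                                     # the data has no header row
    names=["ID", "Name", "Age", "City", "Salary"],   # column names supplied manually
    index_col="ID",                                  # use the ID column as the index
)

print(df.loc[3, "Name"])    # look up a row by its ID and a column by its name
print(df["Salary"].mean())  # named columns can be used directly in calculations
```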

Note

The example above uses only a small amount of data, which makes it simple to manage. If you have a large CSV file, however, it’s a good idea to read it into pandas in chunks to avoid memory issues. The chunksize parameter of pandas.read_csv() specifies how many rows to read at a time; the function then returns an iterator of DataFrames instead of a single DataFrame, which you can loop over with a Python for loop.
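
As a rough sketch of chunked reading, the snippet below processes the five sample rows two at a time and accumulates a running total. The tiny chunk size and the in-memory string are assumptions made to keep the example short; with a real file you would pass its path and a much larger chunksize.

```python
import io

import pandas as pd

# Same five rows as the sample file, held in memory for a self-contained example.
csv_text = """1,John Avery,35,Nottingham,50000
2,Adelaide Smith,29,London,62000
3,Michael Rivera,41,Cardiff,40000
4,Grace Kim,33,Hull,35000
5,Tyler Johnson,28,Kent,52000
"""

columns = ["ID", "Name", "Age", "City", "Salary"]

total_salary = 0
row_count = 0

# With chunksize set, read_csv yields DataFrames of at most two rows each.
for chunk in pd.read_csv(io.StringIO(csv_text), header=None, names=columns, chunksize=2):
    total_salary += chunk["Salary"].sum()
    row_count += len(chunk)

print(total_salary / row_count)  # average salary computed without loading all rows at once
```

Only one chunk is held in memory at a time, so this pattern suits aggregations that can be built up incrementally, such as sums and counts.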
