Python generators are special functions that produce values one at a time instead of all at once, which keeps memory usage low.

What are Python generators?

Python generators are special functions that return an iterator. Defining a generator looks much like defining a normal function, but some details differ. A generator function contains at least one yield statement, which takes over the role that return plays in a normal function. The generator object it returns is itself an iterator, so you can step through it with the built-in next() function.
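The following is a minimal sketch of these points. The generator function `count_up_to` is a hypothetical example and not part of the original text. It shows that calling the function only creates a generator object, that this object is an iterator, and that next() advances it.

```python
from collections.abc import Iterator

def count_up_to(limit):
    """Hypothetical example generator: yields 1, 2, ..., limit."""
    n = 1
    while n <= limit:
        yield n
        n += 1

gen = count_up_to(3)                      # only creates a generator object

is_iterator = isinstance(gen, Iterator)   # generator objects are iterators
first = next(gen)                         # runs the body up to the first yield
rest = list(gen)                          # consumes the remaining values

print(is_iterator, first, rest)  # True 1 [2, 3]
```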

What is the keyword ‘yield’?

You may already know what a return statement is if you have experience with Python or other programming languages. A return statement passes the values calculated by a function back to the caller. Once the function’s return statement has been reached, the function is exited and its execution is terminated. The function can be called again if necessary, but it then starts from the beginning.

Things are different with yield. This keyword takes the place of the return statement in Python generators. Calling a generator function does not run its body; it returns a generator object. Each time a value is requested from that object, for example with next() or in a for loop, the body runs until it reaches a yield statement, and the value passed to yield is returned. The generator is suspended rather than terminated, and its current state is saved. When the next value is requested, execution resumes from the saved location.
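The suspend-and-resume behavior can be made visible with a small sketch. The generator `steps` and the list `log` are hypothetical names chosen for illustration; the list simply records which parts of the generator body have already run.

```python
log = []

def steps():
    """Hypothetical generator that records when each part of its body runs."""
    log.append("started")
    yield 1
    log.append("resumed after first yield")
    yield 2
    log.append("finished")

gen = steps()                 # the body has not started running yet
log_before = list(log)        # still empty at this point

a = next(gen)                 # runs until the first yield, then pauses
b = next(gen)                 # resumes right after the first yield

try:
    next(gen)                 # body runs to its end without another yield
    exhausted = False
except StopIteration:         # raised once the generator has nothing left
    exhausted = True

print(log_before)             # []
print(a, b, exhausted)        # 1 2 True
print(log)
```

Once the body has finished, any further next() call raises StopIteration. A for loop handles this exception automatically, which is why it ends cleanly.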

What can Python generators be used for?

Generator functions are ideally suited for working with very large data sets. This is because Python generators follow the ‘lazy evaluation’ principle: they only compute a value when it is needed.

A function that reads an entire file at once, for example with read() or readlines(), places the whole contents in memory. For very large files, memory might not be sufficient, and you may end up with a MemoryError. A generator avoids this by reading the file line by line. The yield keyword returns the line that you need and then suspends the function until the next value is requested, at which point the next line is processed.
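As a rough sketch of this idea, the hypothetical generator `read_lines` below yields one line at a time. To keep the example self-contained, an io.StringIO object stands in for a large file that you would normally open with open().

```python
import io

def read_lines(file_obj):
    """Hypothetical helper: yield one line at a time, without the newline."""
    for line in file_obj:          # file objects can be iterated line by line
        yield line.rstrip("\n")

# io.StringIO is only a stand-in for a real, much larger file
fake_file = io.StringIO("first\nsecond\nthird\n")

lines = read_lines(fake_file)
first_line = next(lines)           # only the first line has been processed
remaining = list(lines)            # the other lines are processed on demand

print(first_line)   # first
print(remaining)    # ['second', 'third']
```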

Python generators not only make large amounts of data easier to handle, they also make it possible to work with infinite sequences. Since local memory is finite, an infinite list can never be stored in full. A generator sidesteps this by producing each element only when it is requested.

How to read CSV files with Python generators

The following program allows you to read a CSV file line by line in a memory-efficient manner:

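import csv

def csv_read(filename):
    # newline='' is the setting the csv module documentation recommends
    with open(filename, 'r', newline='') as file:
        reader = csv.reader(file)
        for line in reader:
            yield line  # hand one parsed line to the caller, then pause

for line in csv_read('test.csv'):
    print(line)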

In the code example above, we first import the csv module to gain access to Python’s functions for processing CSV files. Next comes the definition of the Python generator ‘csv_read’, which starts with the keyword ‘def’ just like any other function definition. After the file is opened, the Python for loop iterates through the file line by line. Each line is returned as a list of its fields using the keyword ‘yield’. Outside the generator function, the lines that the generator returns are output to the console one by one with the Python print function.
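Because the generator hands over one line at a time, it can feed directly into an aggregation without ever building a full list. The sketch below assumes a small sample file with made-up columns ‘name’ and ‘amount’, written to a temporary location so that the example runs on its own.

```python
import csv
import os
import tempfile

def csv_read(filename):
    """Yield the rows of a CSV file one at a time."""
    with open(filename, "r", newline="") as file:
        for row in csv.reader(file):
            yield row

# Create a small sample file; its columns are invented for this sketch.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", newline="", delete=False
) as tmp:
    csv.writer(tmp).writerows([["name", "amount"], ["a", "10"], ["b", "32"]])
    sample_path = tmp.name

rows = csv_read(sample_path)
header = next(rows)                         # take the header row first
total = sum(int(row[1]) for row in rows)    # remaining rows are read lazily

os.remove(sample_path)                      # clean up the sample file

print(header, total)  # ['name', 'amount'] 42
```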

How to create infinite data structures with Python generators

As you can imagine, an infinite data structure cannot be stored locally on your computer. However, infinite data structures are essential for some applications. Generator functions are useful here because they produce each element one by one and therefore do not exhaust the memory. The following Python code generates an infinite sequence of natural numbers:

def natural_numbers():
    n = 0
    while True:
        yield n  # return the current number and pause here
        n += 1   # runs when the next value is requested

for number in natural_numbers():
    print(number)

First, a Python generator named ‘natural_numbers’ is defined. It sets the initial value of the variable ‘n’ and then starts an endless Python while loop. The variable’s current value is returned with ‘yield’, and the execution of the generator function is suspended. When the for loop requests the next value, execution resumes after the ‘yield’: ‘n’ is incremented by 1 and the loop runs again until the interpreter reaches ‘yield’ once more. The for loop below the generator function prints each number. If the program is not manually interrupted, it will run indefinitely.
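If you only need a finite part of an infinite generator, one option is itertools.islice from the standard library, which stops after a given number of values. The following sketch takes the first ten natural numbers; the variable name `first_ten` is chosen for illustration.

```python
from itertools import islice

def natural_numbers():
    """Infinite generator of natural numbers, starting at 0."""
    n = 0
    while True:
        yield n
        n += 1

# islice requests only ten values, so this statement terminates
first_ten = list(islice(natural_numbers(), 10))

print(first_ten)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```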

What is the shorthand notation for Python generators?

Python lists can be created with list comprehensions in just one line of code. A similar shorthand notation exists for generators. Let’s look at a generator that takes the numbers from 0 to 9 and yields each one incremented by 1. It is similar to the generator previously used to generate an infinite sequence of natural numbers.

def natural_numbers():
    n = 0
    while n <= 9:
        yield n + 1  # each number from 0 to 9, incremented by 1
        n += 1

To write this generator in one line of code, put the expression and a for clause in round brackets, as in the following example:

increment_generator = (n + 1 for n in range(10))

If you print this generator object, output similar to the following appears (the memory address will differ):

<generator object <genexpr> at 0x0000020CC5A2D6C8>

This shows you the type of the object and where it is located in memory, not the values it produces. Use the next() function to retrieve the values from your generator one at a time:

print(next(increment_generator))
print(next(increment_generator))
print(next(increment_generator))

This code prints the first three values, showing how the numbers from 0 to 2 have been incremented by 1:

1
2
3
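A generator can only be consumed once. The short sketch below uses a smaller range than the example above, purely for illustration. It shows what happens after the last value has been retrieved and how a default value passed to next() avoids a StopIteration exception.

```python
increment_generator = (n + 1 for n in range(3))

values = list(increment_generator)        # consumes all values
second_pass = list(increment_generator)   # the generator is now exhausted

# With a default value, next() returns it instead of raising StopIteration
fallback = next(increment_generator, None)

print(values, second_pass, fallback)  # [1, 2, 3] [] None
```

To iterate over the values again, create a new generator with the same expression.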

What is the difference between generators and list comprehensions?

The shorthand notation for generators, known as a generator expression, looks very similar to a list comprehension. The only visible difference is the brackets: square brackets create a list, while round brackets create a Python generator. The more significant difference is in memory: a generator needs far less of it than the equivalent list.

import sys
increment_list = [n + 1 for n in range(100)]
increment_generator = (n + 1 for n in range(100))
print(sys.getsizeof(increment_list))
print(sys.getsizeof(increment_generator))

The program above outputs the memory requirements of the list and of the generator, in bytes:

912
120

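In this example, the list requires 912 bytes of memory, while the generator only needs 120 bytes. The exact figures depend on the Python version and platform. Note that sys.getsizeof() reports only the size of the container object itself, not of the numbers it refers to. The difference is even greater when there is more data to process: a list grows with every element, whereas a generator only stores its current state.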
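The following sketch illustrates how the gap widens with more data. It compares the two variants for 100 and for 100,000 elements under CPython; the variable names are chosen for illustration, and no byte counts are shown because they differ between Python versions and platforms.

```python
import sys

small_list = [n + 1 for n in range(100)]
large_list = [n + 1 for n in range(100_000)]
small_gen = (n + 1 for n in range(100))
large_gen = (n + 1 for n in range(100_000))

# Exact sizes vary by Python version and platform
print(sys.getsizeof(small_list), sys.getsizeof(large_list))
print(sys.getsizeof(small_gen), sys.getsizeof(large_gen))

# The list grows with the number of elements ...
list_grows = sys.getsizeof(large_list) > sys.getsizeof(small_list)
# ... while the generator object is the same size for both ranges
generator_constant = sys.getsizeof(small_gen) == sys.getsizeof(large_gen)

print(list_grows, generator_constant)
```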
