With Python pickle, you can serialise objects and deserialise them again later. Many data types are supported. However, because a pickle file can also contain malicious code, you should only deserialise files from trustworthy sources.

What is Python pickle?

Python pickle may seem to have an unusual name at first, but it makes sense once you look at what the module does. It lets you save objects (i.e. ‘preserve them’, hence ‘pickle’) so that you can use them later, for example in another project. To do this, the objects are converted into a format that can be stored. This process is called serialisation. Python pickle can also deserialise objects, i.e. convert them back to their original form. The module is particularly useful if you want to reuse objects frequently.

The object is converted into a byte stream that captures all of its information. The byte stream also contains the instructions needed for deserialisation, so the original structure can be reconstructed in full. Using Python pickle can save a lot of time, because an object that has been created once does not have to be recreated for each use. Pickle files are commonly given the extension .pkl, although this is only a convention.
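
As a minimal sketch of what reconstructing the structure means in practice, the following round trip uses a small, made-up dictionary in which one list is referenced twice and which also refers to itself. The variable names are illustrative only.

```python
import pickle

# A small structure in which the same list is referenced twice
# and which also contains a reference to itself.
shared = ["Blue", "Red"]
data = {"first": shared, "second": shared}
data["self"] = data

restored = pickle.loads(pickle.dumps(data))

# The contents are equal, but the restored object is a new, independent copy.
print(restored["first"] == shared)              # True
print(restored["first"] is shared)              # False
# The sharing inside the structure survives the round trip.
print(restored["first"] is restored["second"])  # True
print(restored["self"] is restored)             # True
```

The result is a copy of the original, not the original itself, so later changes to one do not affect the other.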

Which data types can be converted?

Python pickle can serialise the following data types:

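  • None and the Boolean values True and False
  • Integers, floating-point numbers and complex numbers
  • Strings, bytes and bytearrays
  • Lists
  • Sets
  • Python tuples
  • Dictionaries, provided that they (like lists, sets and tuples) contain only picklable objects
  • Functions defined at the top level of a module (with def, not lambda)
  • Python classes defined at the top level of a module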
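
The list above can be checked with a short experiment. This is only a sketch with arbitrary sample values; math.sqrt stands in for any function that lives in an importable module, and a lambda serves as an example of something pickle rejects.

```python
import math
import pickle

# Sample values for the data types listed above.
samples = [None, True, 42, 3.14, 2 + 3j, "text", b"raw",
           [1, 2], {1, 2}, (1, 2), {"key": [1, 2]}]
for value in samples:
    assert pickle.loads(pickle.dumps(value)) == value

# A function from an importable module is stored by reference,
# so loading it returns the very same function object.
print(pickle.loads(pickle.dumps(math.sqrt)) is math.sqrt)  # True

# A lambda cannot be looked up by name and is therefore rejected.
try:
    pickle.dumps(lambda x: x * 2)
    lambda_picklable = True
except (pickle.PicklingError, AttributeError):
    lambda_picklable = False
print(lambda_picklable)  # False
```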

What are the different methods?

There are four functions for working with Python pickle, all provided by the module:

  • pickle.dump(obj, file, protocol=None, *, fix_imports=True, buffer_callback=None): Serialises an object and writes the result to an open binary file
  • pickle.dumps(obj, protocol=None, *, fix_imports=True, buffer_callback=None): Also serialises an object, but returns the result as a bytes object
  • pickle.load(file, *, fix_imports=True, encoding="ASCII", errors="strict", buffers=None): Deserialises an object by reading it from an open binary file
  • pickle.loads(bytes_object, *, fix_imports=True, encoding="ASCII", errors="strict", buffers=None): Also deserialises an object, but reads it from a bytes object

To differentiate between the functions, you can remember that the ‘s’ in pickle.dumps and pickle.loads stands for ‘string’: these two work with a byte string in memory rather than with a file.
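
The difference between the two pairs can be sketched as follows. To keep the example self-contained, io.BytesIO is used here as a stand-in for a file opened in binary mode; that substitution is an assumption of this sketch, not part of the example that follows later.

```python
import io
import pickle

colours = ["Blue", "Red", "Yellow", "Orange"]

# dumps() and loads() work with a bytes object in memory.
as_bytes = pickle.dumps(colours)
print(type(as_bytes).__name__)            # bytes
print(pickle.loads(as_bytes) == colours)  # True

# dump() and load() work with any binary file object.
# io.BytesIO stands in for a file opened with "wb" or "rb".
buffer = io.BytesIO()
pickle.dump(colours, buffer)
buffer.seek(0)  # rewind before reading
print(pickle.load(buffer) == colours)     # True

# The optional protocol argument selects the pickle format version.
newest = pickle.dumps(colours, protocol=pickle.HIGHEST_PROTOCOL)
print(pickle.loads(newest) == colours)    # True
```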

How to use Python pickle with an example

To better illustrate how Python pickle works, we will use a simple example: a list containing four colours. This is our code:

import pickle
colours = ['Blue', 'Red', 'Yellow', 'Orange']

Next, we open a binary file with the extension .pkl and use pickle.dump() to save our list in it. This is the code we use:

with open('colours_file.pkl', 'wb') as f:
	pickle.dump(colours, f)

The mode wb (‘write binary’) opens the file for writing in binary mode, which is required because pickle produces bytes rather than text. dump() then saves the ‘colours’ list in this file. Finally, the with statement closes the file automatically.

How to convert a saved file to its original format

To deserialise the binary file again, use the function pickle.load(). The following code converts the object back to its original form and prints it. This time we open the file in the mode rb, which stands for ‘read binary’.

with open('colours_file.pkl', 'rb') as f:
	colours_deserialised = pickle.load(f)
	print(colours_deserialised)

This gives us the following output:

['Blue', 'Red', 'Yellow', 'Orange']
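
The save and load steps can also be wrapped in two small functions. The following is a sketch only: save_object and load_object are hypothetical helper names, and a temporary directory is assumed so that the example leaves no files behind.

```python
import pickle
import tempfile
from pathlib import Path


def save_object(obj, path):
    """Serialise obj and write it to path (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_object(path):
    """Read a pickle file and return the object. Only use on trusted files."""
    with open(path, "rb") as f:
        return pickle.load(f)


colours = ["Blue", "Red", "Yellow", "Orange"]

# A temporary directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = Path(tmp_dir) / "colours_file.pkl"
    save_object(colours, file_path)
    colours_deserialised = load_object(file_path)

print(colours_deserialised == colours)  # True: same contents
print(colours_deserialised is colours)  # False: a new list object
```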

How to serialise a dictionary with Python pickle

You can also easily serialise more complex data types such as dictionaries with Python pickle and then convert them back to their original form. To do this, we first create a dictionary with the name ‘persons’. Here we store different data for different people:

import pickle
persons = {
	'Person 1': {
		'Name': "Maria", 'Age': 56, 'City': "London"
	},
	'Person 2': {
		'Name': "Paul", 'Age': 66, 'City': "London"
	},
	'Person 3': {
		'Name': "Lisa", 'Age': 22, 'City': "Cardiff"
	},
	'Person 4': {
		'Name': "Lara", 'Age': 34, 'City': "Southport"
	}
}

In the following code, we serialise this dictionary by writing it to a new file, and then test the result by converting it back:

with open("persons_dict.pkl", "wb") as f:
	pickle.dump(persons, f)
with open("persons_dict.pkl", "rb") as f:
	deserialised_dict = pickle.load(f)
	print(deserialised_dict)

The resulting output looks like this:

{'Person 1': {'Name': 'Maria', 'Age': 56, 'City': 'London'}, 'Person 2': {'Name': 'Paul', 'Age': 66, 'City': 'London'}, 'Person 3': {'Name': 'Lisa', 'Age': 22, 'City': 'Cardiff'}, 'Person 4': {'Name': 'Lara', 'Age': 34, 'City': 'Southport'}}

You can now access the information as usual. As an example, we print the name and age of the third person:

# Define dictionary
deserialised_dict = {
    'Person 1': {'Name': "Maria", 'Age': 56, 'City': "London"},
    'Person 2': {'Name': "Paul", 'Age': 66, 'City': "London"},
    'Person 3': {'Name': "Lisa", 'Age': 22, 'City': "Cardiff"},
    'Person 4': {'Name': "Lara", 'Age': 34, 'City': "Southport"}
}
# Print output
print(
    "The name of the third person is "
    + deserialised_dict["Person 3"]["Name"]
    + " and they are "
    + str(deserialised_dict["Person 3"]["Age"])
    + " years old."
)

This is what our output looks like:

The name of the third person is Lisa and they are 22 years old.

How to convert a class instance to a byte string

In the next example, we use Python pickle to save an instance of a class in a byte string. The instance holds attributes of several different data types, all of which are taken into account. We create a class called ‘ExampleClass’, create an instance of it and then serialise that instance. The corresponding code is:

import pickle
class ExampleClass:
    def __init__(self):
        self.a_number = 17
        self.a_list = [5, 10, 15]
        self.a_tuple = (18, 19)
        self.a_string = "hello"
        self.a_dict = {"colour": "blue", "number": 3}
example_object = ExampleClass()
# Serialise the instance into a bytes object
serialised_object = pickle.dumps(example_object)
print(f"This is a serialised object:\n{serialised_object}\n")
# Changing the original afterwards does not affect the serialised data
example_object.a_dict = None
# Deserialise: the restored instance still has the original a_dict
deserialised_object = pickle.loads(serialised_object)
print(f"This is a_dict from the deserialised object:\n{deserialised_object.a_dict}\n")

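After serialising the instance and then converting it back to its original form, we get output like the following. The byte string depends on the Python version and the pickle protocol in use, and it is shown here in shortened form; the real one is longer because it also contains the attribute values: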

This is a serialised object:
b'\x80\x03c__main__\nExampleClass\nq\x00)\x81q\x01.'
This is a_dict from the deserialised object:
{'colour': 'blue', 'number': 3}
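
To see what the byte string carries, the sketch below looks inside it. It assumes a reduced version of the class with only two attributes and is run as a script, so that the class can be found again by name when loading.

```python
import pickle


class ExampleClass:
    def __init__(self):
        self.a_number = 17
        self.a_string = "hello"


original = ExampleClass()
serialised_object = pickle.dumps(original)

# The class is referenced by name, and the attribute names are stored as data.
print(b"ExampleClass" in serialised_object)  # True
print(b"a_number" in serialised_object)      # True

restored = pickle.loads(serialised_object)
print(type(restored) is ExampleClass)        # True
print(restored is original)                  # False
print(vars(restored) == vars(original))      # True
```

Because only the name of the class is stored and not its code, the class definition has to be available wherever the data is loaded.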

How to compress serialised objects

In principle, files saved with Python pickle are comparatively compact. Nevertheless, it is possible and, in some cases, advisable to compress the pickle data even further. One way to do this is the bz2 module, which provides bzip2 compression and is part of Python's standard library. In the following example, we create a string, serialise it and then compress the result:

import pickle
import bz2
exampleString = """Almost heaven, West Virginia
Blue Ridge Mountains, Shenandoah River
Life is old there, older than the trees
Younger than the mountains, growin' like a breeze
Country roads, take me home
To the place I belong
West Virginia, mountain mama
Take me home, country roads."""
serialised = pickle.dumps(exampleString)
compressed = bz2.compress(serialised)
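
To get the object back, the two steps are reversed in the opposite order. The following sketch uses deliberately repetitive sample text (an assumption made so that the size difference is clear) and compares the sizes before and after compression.

```python
import bz2
import pickle

# Repetitive sample data, chosen because it compresses well.
example_string = "Country roads, take me home\n" * 1000

serialised = pickle.dumps(example_string)
compressed = bz2.compress(serialised)
print(len(compressed) < len(serialised))  # True for this repetitive input

# Reverse the two steps in the opposite order: decompress, then deserialise.
restored = pickle.loads(bz2.decompress(compressed))
print(restored == example_string)         # True
```

How much space is saved depends on the data. For very short inputs, the compressed result may not be smaller at all.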

Security instructions for working with Python pickle

Even though Python pickle is a practical and effective way to convert objects, it has one major drawback that you should be aware of: serialised data can carry malicious code, which is executed during deserialisation. This is not a problem with your own data, but you have to be careful with third-party files. Therefore, only ever deserialise pickle files when you know and trust the source!
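
The risk can be illustrated with a deliberately harmless sketch. The Payload class below is a made-up stand-in for hostile data, and eval("1 + 1") stands in for whatever an attacker might run. RestrictedUnpickler and restricted_loads are hypothetical names; they build on pickle.Unpickler and its find_class() method, which pickle calls whenever the data refers to a class or function.

```python
import io
import pickle


class Payload:
    """Stand-in for a hostile object; eval("1 + 1") is deliberately harmless."""

    def __reduce__(self):
        # Tells pickle to call eval("1 + 1") when the data is loaded.
        return (eval, ("1 + 1",))


untrusted_bytes = pickle.dumps(Payload())

# Loading runs the embedded call and returns its result, not a Payload.
print(pickle.loads(untrusted_bytes))  # 2


class RestrictedUnpickler(pickle.Unpickler):
    """Hypothetical helper that refuses every class and function lookup."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Load a pickle that may only contain plain built-in data."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


print(restricted_loads(pickle.dumps(["Blue", "Red", 3])))  # ['Blue', 'Red', 3]

try:
    restricted_loads(untrusted_bytes)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
print(blocked)  # True
```

Restricting lookups in this way narrows what a pickle can do, but it is a sketch rather than a complete defence. Trusting the source remains the main safeguard.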
