The Docker virtualisation solution has fundamentally altered how software is built, distributed, and operated over the last decade. Unlike its predecessors, virtual machines (VMs), Docker virtualises individual applications. A Docker container is therefore an application or software container.

The term ‘software container’ derives from physical containers, such as those used on ships. In logistics, standardised containers are what made modern retail chains possible. A container can be transported on any ship, truck, or train designed for the purpose, largely independently of its contents. On the outside, the container provides standardised interfaces. Docker containers work in much the same way.


What is a Docker container?

So, what exactly is a Docker container? Let’s pass the mic to the Docker developers:

Quote

‘Containers are a standardised unit of software that allows developers to isolate their app from its environment.’ – Source: https://www.docker.com/why-docker

Unlike a physical container, a Docker container exists in a virtual environment. A physical container is assembled based on a standardised specification. We see something similar with virtual containers. A Docker container is created from an immutable template called an ‘image’. A Docker image contains the dependencies and configuration settings required to create a container.

Just as many physical containers can stem from a single specification, any number of Docker containers can be created from a single image. Docker containers thus form the basis for scalable services and reproducible application environments. We can create a container from an image and also save an existing container in a new image. You can run, pause, and stop processes within a container.
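The image-to-container relationship can be sketched with a minimal, hypothetical Dockerfile (the base image tag and echo command are illustrative assumptions, not part of any real project):

```dockerfile
# Minimal illustrative image definition (hypothetical example)
FROM alpine:3.19
CMD ["echo", "Hello from a container"]
```

Building this with ‘docker build -t demo .’ produces an image; each ‘docker run demo’ then creates a fresh container from it, and ‘docker commit’ can save a modified container as a new image.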

Unlike virtualisation with a virtual machine (VM), a Docker container does not contain its own operating system (OS). Instead, all the containers running on a Docker host access the same OS kernel. When Docker is deployed on a Linux host, the existing Linux kernel is used. If the Docker software runs on a non-Linux system, a minimal Linux system image is used via a hypervisor or virtual machine.

A certain amount of system resources is allocated to each container upon execution. This includes RAM, CPU cores, mass storage, and (virtual) network devices. Technically, ‘cgroups’ (short for ‘control groups’) limit a Docker container’s access to system resources. ‘Kernel namespaces’ are used to partition the kernel resources and distinguish the processes from each other.

Externally, Docker containers communicate over the network. To do this, specific services inside the container listen on exposed ports; these are usually web or database servers. The containers themselves are controlled on the respective Docker host via the Docker API. Containers can be started, stopped, and removed. The Docker client provides a command line interface (CLI) with the appropriate commands.
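As a sketch, a few of the corresponding CLI commands look like this (the container name ‘web’ is a placeholder):

```shell
docker ps          # list running containers
docker stop web    # stop the container named 'web'
docker start web   # start it again
docker rm web      # remove the stopped container
```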

What differentiates Docker containers from Docker images?

The two terms ‘Docker container’ and ‘Docker image’ often cause confusion. This is hardly surprising, since it is a bit of a chicken-or-the-egg dilemma. A container is created from an image; however, a container can also be saved as a new image. Let’s take a look at the differences between the two concepts in detail.

A Docker image is an inert template. The image only takes up some space on a hard drive and does nothing else. In contrast, the Docker container is a ‘living’ instance. A running Docker container exhibits behaviour; it interacts with the environment. Furthermore, a container has a state that changes over time, using a variable amount of RAM.

You may be familiar with the concepts of ‘class’ and ‘object’ from object-orientated programming (OOP). The relationship between a Docker container and a Docker image is similar to the relationship between an object and its associated class. A class exists only once; several similar objects can be created from it. The class itself is loaded from a source code file. There is a similar pattern in the Docker universe: a template is created from a source unit, a ‘Dockerfile’, which in turn creates many instances:

                      Source text         Template        Instance
Docker concept        Dockerfile          Docker image    Docker container
Programming analogy   class source code   loaded class    instantiated object
Tip

We refer to a Docker container as a ‘running instance’ of its associated image. If the terms ‘instance’ and ‘instantiate’ still feel abstract, use a mnemonic device: replace ‘instantiate’ with ‘cut out’ in your mind. Even though the words are unrelated, there is a strong correspondence between their meanings in computer science terms. Just as we use a cookie cutter to cut out many similar cookies from a layer of dough, we instantiate many similar objects from a template. Instantiating, then, is creating an object from a template.

How is a Docker container built?

To understand how a Docker container is built, it helps to look at the ‘Twelve-Factor App’ methodology. This is a collection of twelve fundamental principles for building and operating service-orientated software. Both Docker and the twelve-factor app date back to 2011. The twelve-factor app helps developers design software-as-a-service apps according to specific standards. These include:

  • Using declarative formats for setup automation to minimise time and cost for new developers joining the project;
  • Having a clean contract with the underlying operating system, offering maximum portability between execution environments;
  • Being suitable for deployment on modern cloud platforms, obviating the need for servers and systems administration;
  • Minimising divergence between development and production, enabling continuous deployment for maximum agility;
  • And being able to scale up without significant changes to tooling, architecture, or development practices.

The structure of a Docker container is based on these principles. A Docker container includes the following components, which we will look at in detail below:

  1. Container operating system and union file system
  2. Software components and configuration
  3. Environment variables and runtime configuration
  4. Ports and volumes
  5. Processes and logs

Container operating system and union file system

Unlike a virtual machine, a Docker container does not contain its own operating system. Instead, all the containers running on a Docker host access a shared Linux kernel. Only a minimal execution layer is included in the container. This usually includes an implementation of the C standard library and a Linux shell for running processes. Here is an overview of the components in the official ‘Alpine Linux’ image:

Linux kernel   C standard library   Unix commands
from host      musl libc            BusyBox

A Docker image consists of a stack of read-only file system layers. A layer describes the changes to the file system relative to the layer below it. Using a special union file system such as overlay2, the layers are overlaid and unified into a consistent interface. Another, writable layer is added on top of the read-only layers when you create a Docker container from an image. All changes made to the file system are incorporated into the writable layer using the ‘copy-on-write’ method.
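The layer model can be illustrated with a small, hypothetical Dockerfile: each file-changing instruction below produces one read-only layer, and a container started from the resulting image adds its writable layer on top (the file ‘app.sh’ is a placeholder):

```dockerfile
FROM alpine:3.19                # base layers from the Alpine image
RUN apk add --no-cache curl     # new layer: package installation
COPY app.sh /app/app.sh         # new layer: a copied file (app.sh is hypothetical)
CMD ["/app/app.sh"]             # metadata only; adds no file system content
```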

Software components and configuration

Building on the minimal container operating system, additional software components are installed in a Docker container. This is usually followed by further setup and configuration steps. The standard methods are used for installation:

  • via a system package manager like apt, apk, yum, brew, etc.
  • via a programming language package manager like pip, npm, composer, gem, cargo, etc.
  • by compiling in the container with make, mvn, etc.

Here are some examples of software components commonly used in Docker containers:

Application area             Software components
Programming languages        PHP, Python, Ruby, Java, JavaScript
Development tools            node/npm, React, Laravel
Database systems             MySQL, Postgres, MongoDB, Redis
Web servers                  Apache, nginx, lighttpd
Caches and proxies           Varnish, Squid
Content management systems   WordPress, Magento, Ruby on Rails

Environment variables and runtime configuration

Following the twelve-factor app methodology, the Docker container configuration is stored in environment variables (‘env vars’). Here, configuration means all values that change between environments, such as the development vs. production system. This often includes hostnames and database credentials.

The values of the environment variables influence how the container behaves. Two primary methods are used to make environment variables available within a container:

1. Definition in the Dockerfile

The ENV statement declares an environment variable in the Dockerfile. An optional default value can be assigned. This comes into effect if the environment variable is empty when the container is started.
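As a sketch, a hypothetical Dockerfile fragment might declare a variable with a default like this (the name ‘APP_ENV’ and its value are illustrative assumptions):

```dockerfile
# 'APP_ENV' and its default value are hypothetical
ENV APP_ENV=production
CMD ["sh", "-c", "echo Running in $APP_ENV mode"]
```

Starting the container without further parameters would use the default; passing the variable at startup overrides it.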

2. Pass when starting the container

To access an environment variable in the container that was not declared in the Dockerfile, we pass the variable when we start the container. This works for single variables via command line parameters. Furthermore, an ‘env file’, which defines several environment variables together with their values, can be passed.

Here is how to pass an environment variable when starting the container:

docker run --env <env-var> <image-id>

Passing an env file is useful when there are many environment variables:

docker run --env-file /path/to/.env <image-id>
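An env file is a plain text file with one KEY=value pair per line; the names and values below are hypothetical:

```
# illustrative contents of an env file
DB_HOST=db.example.com
DB_USER=appuser
APP_ENV=production
```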
Note

The ‘docker inspect’ command can be used to display the environment variables present in the container along with their values. You should therefore be careful when storing confidential data in environment variables.

When starting a container from an image, configuration parameters can be passed. These include the amount of allocated system resources, which is otherwise unlimited. Furthermore, start parameters are used to define ports and volumes for the container. We’ll learn more about this in the next section. The startup parameters may override any default values in the Dockerfile. Here are a few examples.

Allocate a CPU core and 10 megabytes of RAM to the Docker container at startup:

docker run --cpus="1" --memory="10m" <image-id>

Publish all ports defined in the Dockerfile to random host ports when starting the container:

docker run -P <image-id>

Map TCP port 80 of the Docker host to port 80 of the Docker container:

docker run -p 80:80/tcp <image-id>

Ports and volumes

A Docker container contains an application that is isolated from the outside world. For this to be useful, it must be possible to interact with the environment. Therefore, there are ways to exchange data between host and container, as well as between multiple containers. Standardised interfaces allow containers to be used in different environments.

Communication from the outside with processes running in the container happens over exposed network ports, using the standard TCP and UDP protocols. For example, let’s imagine a Docker container that contains a web server; it listens on TCP port 8080. The Docker image’s Dockerfile also contains the line ‘EXPOSE 8080/tcp’. We start the container with ‘docker run -p 8080:8080’ and access the web server at ‘http://localhost:8080’.
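The web server example might correspond to a Dockerfile fragment like this (the base image is an illustrative assumption):

```dockerfile
FROM nginx:alpine
# Document that the server inside listens on TCP port 8080
EXPOSE 8080/tcp
```

Note that EXPOSE only documents the port; the actual mapping to the host is established with ‘-p’ or ‘-P’ at startup.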

Ports are used to communicate with services running in the container. However, in many cases it makes sense to exchange data via a file shared between the container and the host system. This is why Docker supports several types of volumes:

  • Named volumes – recommended
  • Anonymous volumes – are lost when the container is removed
  • Bind mounts – the historical mechanism; performant, but no longer recommended
  • tmpfs mounts – held in RAM; only available on Linux

The differences between the volume types are subtle. The choice of the right type depends heavily on the particular use case. A detailed description would go beyond the scope of this article.
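As a sketch, a named volume is created and mounted like this (the volume name ‘mydata’ and the mount path are placeholders):

```shell
docker volume create mydata
docker run -v mydata:/var/lib/data <image-id>
```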

Processes and logs

A Docker container usually encapsulates an application or service. The software executed inside the container forms a set of running processes. The processes in a Docker container are isolated from processes in other containers or the host system. Within a Docker container, processes can be started, stopped, and listed; they are controlled via the command line or via the Docker API.

Running processes continuously output status information. Following the twelve-factor app methodology, the standard STDOUT and STDERR data streams are used for output. The output on these two data streams can be read with the ‘docker logs’ command. Something called a ‘logging driver’ can also be used. The default logging driver writes logs in JSON format.
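As a sketch, the log output is read like this (the container name ‘web’ is a placeholder):

```shell
docker logs web      # print the container's STDOUT/STDERR output so far
docker logs -f web   # follow the log output continuously
```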

How and where are Docker containers used?

Docker is used in all parts of the software lifecycle nowadays. This includes development, testing, and operation. Containers running on a Docker host are controlled via the Docker API. The Docker client accepts commands on the command line; special orchestration tools are used to control clusters of Docker containers.

The basic pattern to deploy Docker containers looks like this:

  1. The Docker host downloads the Docker image from the registry.
  2. The Docker container is created and started from the image.
  3. The application in the container runs until the container is stopped or removed.
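Using the official nginx image as a stand-in, the three steps above map onto these commands (the container name ‘web’ is a placeholder):

```shell
docker pull nginx                  # 1. download the image from the registry
docker run -d --name web nginx     # 2. create and start a container from it
docker stop web && docker rm web   # 3. stop and remove the container
```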

Let’s take a look at two Docker container deployment examples:

Deploying Docker containers in the local development environment

The use of Docker containers is particularly popular in software development. Usually, software is developed by a team of specialists. A collection of development tools known as a toolchain is used for this purpose. Each tool comes in a specific version, and the whole chain works only if the versions are compatible with each other. Furthermore, the tools must be configured correctly.

To ensure that the development environment is consistent, developers use Docker. A Docker image is created once, containing the entire correctly configured toolchain. Each developer on the team pulls the Docker image onto their local machine and starts a container from it. Development then takes place within the container. The image is updated centrally if there is a change to the toolchain.
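A development image for such a toolchain might be sketched like this (the tool choice and versions are assumptions for illustration):

```dockerfile
# Hypothetical toolchain image: Node.js plus common build tools
FROM node:20-alpine
RUN apk add --no-cache git make
WORKDIR /workspace
CMD ["sh"]
```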

Deploying Docker containers in orchestrated clusters

Data centres of hosting providers and Platform-as-a-Service (PaaS) providers use Docker container clusters. Each service (load balancer, web server, database server, etc.) runs in its own Docker container. At the same time, a single container can only handle a certain load. Orchestration software monitors the containers, their load, and their health. The orchestrator starts additional containers when the load increases. This approach allows services to scale up quickly in response to changing conditions.

Advantages and disadvantages of Docker container virtualisation

The advantages of virtualisation with Docker are particularly clear in comparison with virtual machines (VMs). Docker containers are much more lightweight than VMs: they start faster and consume fewer resources. The images underlying Docker containers are also smaller by orders of magnitude. While VM images are usually hundreds of megabytes to a few gigabytes in size, Docker images start at just a few megabytes.

However, container virtualisation with Docker also has some drawbacks. Since a container does not contain its own operating system, the isolation of the processes running in it is not quite perfect. Using many containers also results in a high degree of complexity. Furthermore, Docker has grown organically over time, and the platform now does too much; its developers are therefore working to break it down into individual components.
