The rapid de­vel­op­ment of ar­ti­fi­cial in­tel­li­gence (AI) is making machines smarter. Since they have the ability to use input data to learn autonom­ously, machines are creating new ways to support humans in carrying out in­creas­ingly complex tasks.

One solution that’s very promising and already de­liv­er­ing im­press­ive results in many areas is known as gen­er­at­ive ad­versari­al networks (GANs). GANs are primarily used to generate images, but they also allow the automatic creation of text. But what exactly are GANs? How do they work? And what suitable ap­plic­a­tions do they offer?

What is a GAN?

Before exploring what GANs can do for us, let’s look at what generative adversarial networks actually are.

A GAN is a machine learning system introduced in 2014 by Ian Goodfellow and his colleagues. Its task is to generate its own creations based on a set of real example data. The results can be deceptively realistic, to the point where it becomes hard to tell that a computer-generated image was not created by human hands. To achieve this, a GAN uses two neural networks that compete with each other.

The generator network is tasked with creating fakes. The training data consists of real examples, such as photos of women. During training, the generator learns what properties the originals have in common and, based on this, creates its own photo. The new picture isn’t a duplicate of any one piece of source data, but an entirely new image that is similar in nature: in our example, a photo of a (non-existent) woman.

Both the real data and the generated data are passed to the partner network. The task of the discriminator network is to check everything it receives and determine whether it is real or fake. An image is deemed fake not only if it deviates too far from the real data, but also if it is too perfect. If the generator simply averaged all the data to produce a new image, the result would be easy to identify as machine-generated. The discriminator therefore also filters out results that don’t appear natural.

Both networks try to outdo each other. If the discriminator network recognises a dataset as fake, it rejects it. In this case, the generator network wasn’t good enough and needs to keep learning. At the same time, the discriminator also learns. Because the two networks train each other by competing, the approach is called adversarial; and because both are typically multi-layer neural networks, GANs are a form of deep learning. The generator attempts to create datasets that appear so genuine that the discriminator classifies them as real. The discriminator, in turn, tries to examine the real examples so closely that fake datasets have no chance of being classified as real.
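The opposing goals of the two networks can be written down as two loss functions. The following is a minimal Python sketch rather than a definitive implementation: it assumes the discriminator outputs a probability between 0 and 1 that a sample is real, and the function names `discriminator_loss` and `generator_loss` are illustrative choices. The generator loss shown is the commonly used "non-saturating" variant.

```python
import math

EPS = 1e-12  # small constant to avoid log(0); an implementation detail of this sketch


def discriminator_loss(p_real, p_fake):
    """Binary cross-entropy for the discriminator.

    p_real: probabilities the discriminator assigned to real samples (target 1).
    p_fake: probabilities it assigned to generated samples (target 0).
    """
    real_term = -sum(math.log(p + EPS) for p in p_real) / len(p_real)
    fake_term = -sum(math.log(1.0 - p + EPS) for p in p_fake) / len(p_fake)
    return real_term + fake_term


def generator_loss(p_fake):
    """Non-saturating generator loss: low when fakes are scored as real."""
    return -sum(math.log(p + EPS) for p in p_fake) / len(p_fake)


# A confident, correct discriminator has a lower loss than an undecided one.
print(discriminator_loss([0.9, 0.8], [0.1, 0.2]))
print(discriminator_loss([0.5, 0.5], [0.5, 0.5]))
```

The discriminator's loss falls when it scores real samples near 1 and fakes near 0, while the generator's loss falls when those same fakes are scored near 1. That is the competition in numerical form.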

How do GANs work?

Like any other machine learning system, a GAN needs to be trained. Training proceeds through six steps:

  1. Problem defin­i­tion: In the first step, a problem has to be defined which the system should try to solve. Here, the de­velopers collect real data that the system can use.
  2. Architecture: Different problems call for different generative adversarial networks. For this reason, the GAN has to be given an architecture that suits the application.
  3. First dis­crim­in­at­or training: Actual training begins during this step. The generator is stopped, while the dis­crim­in­at­or only analyses the real data and learns to un­der­stand it.
  4. First generator training: Now the dis­crim­in­at­or is stopped and the generator starts to generate falsified data.
  5. Second discriminator training: The discriminator network is now fed the new, falsified data from the generator and has to decide which datasets are real and which are fake.
  6. Second generator training: The generator network is further improved using the result of the second discriminator training stage. It gets to know the weaknesses of the discriminator and attempts to exploit them to generate even more realistic fake datasets.

Both networks develop through this competition, becoming better and more efficient. The generator network learns to produce increasingly realistic datasets. The discriminator network learns to identify even seemingly real datasets as fake.
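The alternating training described in the steps above can be sketched with a deliberately tiny example. Everything here is an assumption made for illustration: the "real data" are numbers drawn around 4.0, the generator is a linear function of random noise, the discriminator is a single logistic unit, and the gradients are written out by hand. For brevity, the sketch trains the discriminator on real and fake samples in every round instead of starting with a real-data-only pass. It is not guaranteed to converge; the aim is only to show one network being held fixed while the other is updated.

```python
import math
import random


def sigmoid(s):
    # Numerically safe logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def generate(gen, z):
    # Generator: maps random noise z to a fake sample.
    a, b = gen
    return a * z + b


def discriminate(disc, x):
    # Discriminator: estimated probability that x is real.
    w, c = disc
    return sigmoid(w * x + c)


def discriminator_step(disc, real, fake, lr):
    # The generator is held fixed; only the discriminator (w, c) changes.
    w, c = disc
    grad_w = grad_c = 0.0
    for x in real:  # push D(x) towards 1 for real samples
        err = discriminate(disc, x) - 1.0
        grad_w += err * x
        grad_c += err
    for x in fake:  # push D(x) towards 0 for generated samples
        err = discriminate(disc, x)
        grad_w += err * x
        grad_c += err
    n = len(real) + len(fake)
    return (w - lr * grad_w / n, c - lr * grad_c / n)


def generator_step(gen, disc, noise, lr):
    # The discriminator is held fixed; only the generator (a, b) changes.
    a, b = gen
    w, _ = disc
    grad_a = grad_b = 0.0
    for z in noise:  # push D(G(z)) towards 1
        err = discriminate(disc, generate(gen, z)) - 1.0
        grad_a += err * w * z
        grad_b += err * w
    m = len(noise)
    return (a - lr * grad_a / m, b - lr * grad_b / m)


def train(rounds=200, batch=32, lr=0.05, seed=0):
    rng = random.Random(seed)
    gen, disc = (1.0, 0.0), (0.0, 0.0)
    for _ in range(rounds):
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [generate(gen, z) for z in noise]
        disc = discriminator_step(disc, real, fake, lr)  # discriminator's turn
        gen = generator_step(gen, disc, noise, lr)       # generator's turn
    return gen, disc


gen, disc = train()
print("generator (a, b):", gen)
print("discriminator (w, c):", disc)
```

Real GANs replace the two hand-written functions with deep neural networks and let a framework compute the gradients, but the alternation between the two update steps follows the same pattern.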

What chal­lenges does the system need to overcome?

As with almost any technology, the developers of GANs face a number of challenges that have to be solved for training to run smoothly.

Balanced com­pet­i­tion

As explained above, GANs are based on the competition between two neural networks. This only works if both networks are roughly equally strong. If one of the two is clearly superior, training breaks down. For instance, if the generator is too effective, the discriminator will classify all falsified data as real. Conversely, if the discriminator has the upper hand, it will classify all the data from the generator as fake. In both cases, neither of the networks can improve any further.
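One simple way to watch for this imbalance during training is to track how often the generator fools the discriminator. The sketch below is a hypothetical monitoring helper, not an established API: the function names and the thresholds `low` and `high` are assumptions chosen for illustration, and suitable values would depend on the application.

```python
def fool_rate(p_fake, threshold=0.5):
    """Fraction of generated samples the discriminator accepts as real.

    p_fake: discriminator probabilities ("is real") for a batch of fakes.
    """
    return sum(p > threshold for p in p_fake) / len(p_fake)


def balance_status(p_fake, low=0.05, high=0.95):
    """Classify the state of the competition from one batch of fakes."""
    rate = fool_rate(p_fake)
    if rate <= low:
        return "discriminator dominates"  # almost every fake is rejected
    if rate >= high:
        return "generator dominates"      # almost every fake is accepted
    return "balanced"


print(balance_status([0.01, 0.02, 0.03]))
print(balance_status([0.90, 0.99]))
print(balance_status([0.20, 0.70]))
```

If the status stays at one of the two extremes for many rounds, that is a hint that one network has become too strong for the other to keep learning.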

Correctly un­der­stand­ing objects

Generative adversarial networks often have problems correctly recognising and understanding objects. This is particularly true for images. Here’s an example: a real image shows two cats, each with two eyes. If the generator doesn’t understand the overall structure of the image and the positioning of the objects in it, it might generate an image of one cat with four eyes instead. GANs can also be caught out by perspective and fail to understand that two images depict the same subject from different angles.

Where are GANs used?

Gen­er­at­ive ad­versari­al networks gained special attention – even beyond the field of computer science – after the artist col­lect­ive Obvious used the tech­no­logy to generate a work of art. The painting was sold at auction for $432,500. But a GAN can also deliver as­ton­ish­ing results outside artistic ap­plic­a­tions.

Video pre­dic­tion

Based on the in­di­vidu­al video frames, GANs can predict how a video continues and thereby extend the video autonom­ously at the end of the footage. They consider all elements of the video, including motions and actions, as well as back­ground changes like rain or fog.

Image gen­er­a­tion using text

GANs can generate images based on a de­scrip­tion. For example, they can use a script to in­de­pend­ently generate a story­board.

Gen­er­a­tion of complex objects

Generative adversarial networks can automatically transform even simple sketches into complex three-dimensional objects in very little time. A simple drawing of a tree, for example, can be turned into a highly complex image with tiny details, like leaves fluttering in the wind and a swaying trunk.

Improving image details

GANs can add new details to a low-resolution image or to one with missing picture elements. To do so, they use information from similar images to fill in the missing image information.

De­vel­op­ing new products

Some companies are already experimenting with GANs in product development, using the system to create completely new designs and product lines.

Product text gen­er­a­tion

GANs can also handle text creation and are already used to generate product texts, which play a major role in the purchase decisions of consumers. Not only can these descriptions be created quickly with GANs; the networks can also analyse which product texts were most successful in the past and use this information to compose similar texts.

Gen­er­at­ive ad­versari­al networks are already being suc­cess­fully put to use across all these areas. Companies and de­velopers are con­stantly working on new ap­plic­a­tion pos­sib­il­it­ies. In the near future, GANs will likely have a major influence on many aspects of our lives and work.
