
Implementing Deep Convolutional Generative Adversarial Networks (DCGAN)

Source: https://mc.ai/implementing-deep-convolutional-generative-adversarial-networks-dcgan-2/

How I Generated New Images from Random Data using DCGAN

Training a DCGAN on MNIST by Author

Deep Convolutional Generative Adversarial Networks, or DCGANs, are the "image version" of the most fundamental GAN. The architecture leverages deep convolutional neural networks to generate images belonging to a given distribution from noisy data, using the generator-discriminator framework.

Generative Adversarial Networks use a generator network to generate new samples of data and a discriminator network to evaluate the generator's performance. So, fundamentally, GANs' novelty lies more in the evaluator than in the generator.

This is what sets GANs apart from other generative models: the pairing of a generative model with a discriminative model is what GANs are all about.

A Comprehensive Guide to Generative Adversarial Networks (GANs)

I have discussed the theory and math behind GANs in another post; consider giving it a read if you are interested in how GANs work!

In this article, we will implement DCGAN using TensorFlow and observe the results for two well-known datasets:

  1. MNIST handwritten digits dataset and
  2. CIFAR-10 image recognition dataset

Loading and Pre-processing the Data

In this section we load and prepare the data for our model.

We load the data from the tensorflow.keras.datasets module, which provides a load_data function for obtaining a few well-known datasets (including the ones we need). Since the loaded images have pixel values from 0 to 255, we then normalize them to the range -1 to 1 (matching the tanh output of the generator).

Preparing Data for Training
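The embedded snippet did not survive republishing, so here is a minimal sketch of the loading step described above, assuming the MNIST dataset; the buffer and batch sizes are my own illustrative choices, not values from the article:

```python
import tensorflow as tf

BUFFER_SIZE = 60000  # full MNIST training set
BATCH_SIZE = 256     # assumed batch size for illustration

# Load MNIST from the tensorflow.keras.datasets module (labels are unused in a GAN)
(train_images, _), (_, _) = tf.keras.datasets.mnist.load_data()
train_images = train_images.reshape(-1, 28, 28, 1).astype("float32")

# Scale pixel values from [0, 255] to [-1, 1] to match the generator's tanh output
train_images = (train_images - 127.5) / 127.5

# Shuffle and batch into a tf.data pipeline for training
train_dataset = (
    tf.data.Dataset.from_tensor_slices(train_images)
    .shuffle(BUFFER_SIZE)
    .batch(BATCH_SIZE)
)
```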

The Generator

The generator model mainly consists of deconvolution layers, or more accurately, transposed convolution layers, which loosely invert the spatial mapping of a convolution operation.

Transposed Convolution, no padding, no strides via Dumoulin et al.

The figure depicts the transpose of convolving a 3×3 kernel over a 4×4 input: the transposed operation maps a 2×2 input to a 4×4 output.

This operation is equivalent to convolving a 3×3 kernel over the 2×2 input surrounded by a 2-pixel border of zeros.
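The shape arithmetic above can be verified directly with a Keras Conv2DTranspose layer (shapes only; the layer's weights are random):

```python
import tensorflow as tf

# A 3x3 transposed convolution with no padding and unit stride maps a
# 2x2 input to a 4x4 output, just as the figure describes.
x = tf.random.normal([1, 2, 2, 1])  # batch of one 2x2 single-channel input
deconv = tf.keras.layers.Conv2DTranspose(
    filters=1, kernel_size=3, strides=1, padding="valid")
y = deconv(x)  # output shape: (1, 4, 4, 1)
```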

A Guide to Convolution Arithmetic for Deep Learning is, in my opinion, one of the best papers on the convolution operations used in deep learning, and well worth a read!

Moving on to the generator, we take a 128-dimensional vector and map it to a 16,384-dimensional vector (8 × 8 × 256) using a fully connected layer. This vector is reshaped to (8, 8, 256), i.e. 256 activation maps of size 8×8. We then apply several transposed convolution layers and finally obtain a 3-channel image of size 32×32. This is the generated image.

Generator Model
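The original gist is not reproduced here, so below is a sketch consistent with the description above (128-dimensional latent vector, Dense to 8×8×256, transposed convolutions up to a 32×32×3 image). The filter counts, kernel sizes, and the BatchNorm/LeakyReLU choices are my assumptions, in line with common DCGAN practice:

```python
import tensorflow as tf
from tensorflow.keras import layers

def make_generator(latent_dim=128):
    # Latent vector -> Dense(8*8*256) -> reshape -> upsampling deconvolutions -> 32x32x3
    return tf.keras.Sequential([
        layers.Input(shape=(latent_dim,)),
        layers.Dense(8 * 8 * 256, use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Reshape((8, 8, 256)),  # 256 activation maps of size 8x8
        layers.Conv2DTranspose(128, 5, strides=2, padding="same", use_bias=False),  # -> 16x16
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2DTranspose(64, 5, strides=2, padding="same", use_bias=False),   # -> 32x32
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        # tanh keeps outputs in [-1, 1], matching the normalized training images
        layers.Conv2DTranspose(3, 5, strides=1, padding="same", activation="tanh"), # -> 32x32x3
    ])

generator = make_generator()
fake_images = generator(tf.random.normal([2, 128]), training=False)  # shape (2, 32, 32, 3)
```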

The Discriminator

The discriminator is nothing but a binary classifier that consists of several convolution layers (like any other image classification task). Finally, the flattened activation maps are mapped to a single output that scores whether the image is real or fake.

Discriminator Model
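Again, the gist itself is missing; here is a sketch matching the description (stacked convolutions, then a flatten and a single output unit). Filter counts, kernel sizes, and the Dropout/LeakyReLU choices are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

def make_discriminator(input_shape=(32, 32, 3)):
    # A plain convolutional binary classifier: image in, one score out
    return tf.keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(64, 5, strides=2, padding="same"),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        layers.Conv2D(128, 5, strides=2, padding="same"),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        layers.Flatten(),
        # A single raw logit; pairs with BinaryCrossentropy(from_logits=True) below
        layers.Dense(1),
    ])

discriminator = make_discriminator()
logits = discriminator(tf.random.normal([2, 32, 32, 3]), training=False)  # shape (2, 1)
```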

Defining the Losses

Since this is a binary classification problem, the natural loss function is binary cross-entropy. However, this loss is adapted and applied to each network separately, so that each optimizes its own objective.

loss = tf.keras.losses.BinaryCrossentropy(from_logits=True)

The generator is essentially trying to produce images that the discriminator will accept as real. Hence, every generated image should be predicted as "1" (real), and the generator is penalized when it fails to achieve this.

Generator Loss

Hence, we train the generator to predict “1” as the output at the discriminator.
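With the loss object defined above, the generator loss can be sketched as follows (the helper name generator_loss is mine; the sanity-check logits at the end are illustrative):

```python
import tensorflow as tf

cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def generator_loss(fake_output):
    # Compare the discriminator's logits on generated images against a target
    # of all ones: the generator is penalized whenever its images are not
    # classified as real.
    return cross_entropy(tf.ones_like(fake_output), fake_output)

# Large positive logits mean the discriminator calls the fakes "real",
# so the generator loss is small here...
low = generator_loss(tf.constant([[10.0], [10.0]]))
# ...and large when the discriminator confidently rejects the fakes.
high = generator_loss(tf.constant([[-10.0], [-10.0]]))
```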

Contrary to the generator, the discriminator wants to predict generated outputs as fake while, at the same time, predicting any real image as real. Hence, the discriminator trains on the sum of these two losses.

Discriminator Loss

We train the discriminator to predict “0” (fake) for the generated images and “1” (real) for the images from the dataset.
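The two targets above combine into a single discriminator loss, which can be sketched like this (the helper name discriminator_loss and the sanity-check logits are mine):

```python
import tensorflow as tf

cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)    # real images -> "1"
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)   # generated images -> "0"
    return real_loss + fake_loss

# A discriminator with confident logits of the right sign gets near-zero loss...
good = discriminator_loss(tf.constant([[10.0]]), tf.constant([[-10.0]]))
# ...while one that is confidently wrong on both counts is heavily penalized.
bad = discriminator_loss(tf.constant([[-10.0]]), tf.constant([[10.0]]))
```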

Training the GAN

In each training step, we run the generator and discriminator together. However, we compute and apply their gradients separately, since the losses and architectures of the two models are different.

Training Epoch
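The embedded training-step gist is also missing; below is a runnable sketch of the idea. Tiny dense stand-in models are defined inline purely so the snippet is self-contained; in the article, the full generator and discriminator built earlier are used instead:

```python
import tensorflow as tf

LATENT_DIM = 128

# Stand-in models (assumption, for self-containment only)
generator = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(LATENT_DIM,)),
    tf.keras.layers.Dense(32 * 32 * 3, activation="tanh"),
    tf.keras.layers.Reshape((32, 32, 3)),
])
discriminator = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1),  # raw logit
])

cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)
generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)

@tf.function
def train_step(images):
    noise = tf.random.normal([tf.shape(images)[0], LATENT_DIM])
    # One tape per network: the losses differ, so gradients are applied separately.
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        fake_images = generator(noise, training=True)
        real_output = discriminator(images, training=True)
        fake_output = discriminator(fake_images, training=True)
        gen_loss = cross_entropy(tf.ones_like(fake_output), fake_output)
        disc_loss = (cross_entropy(tf.ones_like(real_output), real_output)
                     + cross_entropy(tf.zeros_like(fake_output), fake_output))
    gen_grads = gen_tape.gradient(gen_loss, generator.trainable_variables)
    disc_grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    generator_optimizer.apply_gradients(zip(gen_grads, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(disc_grads, discriminator.trainable_variables))
    return gen_loss, disc_loss

# One step on a random "batch of images" to show the mechanics
g_loss, d_loss = train_step(tf.random.normal([4, 32, 32, 3]))
```

An epoch then simply loops train_step over the batches of the dataset.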

After training, I obtained the following results:

Results @50 Epochs by Author
Results @100 Epochs by Author

Conclusion

We saw how to implement Generative Adversarial Networks, covering the deep convolutional flavor of GANs. There are other flavors of GANs that produce conditional outputs and can therefore be very useful.

Here is a link to the GitHub repository of the code. Feel free to fork it!

References

Original GANs Paper: https://arxiv.org/abs/1406.2661

DCGAN paper: https://arxiv.org/abs/1511.06434

GANs Blog: https://towardsdatascience.com/a-comprehensive-guide-to-generative-adversarial-networks-gans-fcfe65d1cfe4

The code used in this guide is adapted from the official TensorFlow documentation.

