

Self-Supervised GANs using auxiliary rotation loss
source link: https://towardsdatascience.com/self-supervised-gans-using-auxiliary-rotation-loss-60d8a929b556?gi=48dbdb83a777

Nov 12 · 4 min read
TL;DR — Self-supervised GANs combine adversarial learning and self-supervised learning to bridge the gap between supervised and unsupervised image generation, i.e. conditional and unconditional GANs.
You can train your own SS-GAN using this ready-to-train PyTorch implementation of the paper — Github .
First things first — what are GANs?
GANs, short for Generative Adversarial Networks, are a system of neural networks in which two networks, a generator and a discriminator, play a minimax game to learn their respective tasks: the generator learns to produce images, and the discriminator learns to detect whether an image is real or fake. In other words, the goal of the discriminator is to tell the difference between data produced by the generator and the real-world data we are trying to model. This setup, proposed in 2014 by Ian Goodfellow, turned out to be the magical answer for image generation. However, training a GAN is a tough nut to crack. To the rescue comes, as so often in deep learning, labelled data: adding labels during training gives us conditional GANs, or cGANs. However, supervised image generation, though handy, requires a lot of labelled data. To remedy this, we can use self-supervised learning techniques to create labels ourselves, and in this way bridge the gap between conditional and unconditional image generation.
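To make the minimax game concrete, here is a minimal PyTorch sketch of one training step with the standard non-saturating GAN loss. This is illustrative only, not the paper's exact setup; the generator `G`, discriminator `D`, their optimizers, and the batch of `real_images` are assumed to already exist.

```python
import torch
import torch.nn.functional as F

def gan_step(G, D, opt_G, opt_D, real_images, z_dim=128):
    """One illustrative GAN training step (non-saturating loss)."""
    batch = real_images.size(0)
    device = real_images.device

    # Discriminator: push real logits towards 1, fake logits towards 0.
    z = torch.randn(batch, z_dim, device=device)
    fake_images = G(z).detach()                      # no gradient into G here
    d_real = D(real_images)
    d_fake = D(fake_images)
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) +
              F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # Generator: try to make D predict "real" on fresh fakes.
    z = torch.randn(batch, z_dim, device=device)
    g_fake = D(G(z))
    g_loss = F.binary_cross_entropy_with_logits(g_fake, torch.ones_like(g_fake))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
    return d_loss.item(), g_loss.item()
```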
Discriminator Forgetting
Next, we discuss a problem with conventional GANs that the authors highlight, which they call discriminator forgetting. It is demonstrated using the two scenarios shown in the figures below:
The figure on the left shows that a regular one-vs-all classifier on the CIFAR-10 dataset suffers substantial forgetting even though the tasks are similar: each time the task changes, accuracy drops sharply. This is not the case when the loss function is aided with self-supervision, which demonstrates that without it the model fails to retain generalizable representations in such a changing environment.
The figure on the right shows a similar effect during GAN training. Every 100k iterations, the discriminator is evaluated on ImageNet classification, and the same pattern of forgetting appears; with self-supervision it does not.
Self-Supervised GAN
Before going into the details of SS-GAN, let us first have a quick look at what self-supervised learning is. The idea is to train a model on a pretext task for which the label of every sample can be derived automatically from the data itself, for example predicting the rotation applied to an input image or predicting the relative location of an image patch. Here, the authors add the task of predicting the rotation angle to the discriminator: along with the adversarial prediction of real vs. fake, it also predicts the rotation of the image among the set {0°, 90°, 180°, 270°}. This task is borrowed from the state-of-the-art self-supervision method proposed in [1]. As a result, the discriminator has two heads, and the overall functioning of the model looks as in the figure below:
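To make this concrete in code, here is a small PyTorch sketch of the rotation pretext task and a two-head discriminator. The names (`rotate_batch`, `TwoHeadDiscriminator`, `backbone`, `feat_dim`) are mine for illustration, not the paper's or the linked repository's.

```python
import torch
import torch.nn as nn

def rotate_batch(images):
    """Build the 4-way rotation pretext task: each image in the (N, C, H, W)
    batch is rotated by 0°, 90°, 180° and 270°, and labelled with its
    rotation index (0-3)."""
    rotations = [torch.rot90(images, k, dims=(2, 3)) for k in range(4)]
    rotated = torch.cat(rotations, dim=0)                                  # (4N, C, H, W)
    labels = torch.arange(4, device=images.device).repeat_interleave(images.size(0))
    return rotated, labels

class TwoHeadDiscriminator(nn.Module):
    """Shared feature extractor with two heads: a real/fake logit and
    4-way rotation logits. `backbone` is any module mapping images to
    (N, feat_dim) features."""
    def __init__(self, backbone, feat_dim):
        super().__init__()
        self.backbone = backbone
        self.adv_head = nn.Linear(feat_dim, 1)   # real vs. fake
        self.rot_head = nn.Linear(feat_dim, 4)   # rotation angle class

    def forward(self, x):
        h = self.backbone(x)
        return self.adv_head(h), self.rot_head(h)
```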
Collaborative Adversarial Training
The generator and the discriminator in this model still play the adversarial minimax game using the standard adversarial loss, aided with spectral normalization and a gradient penalty. At the same time, we try to mimic the benefit, namely the extra information, that a conditional GAN gets from labels: labels help the generator decide what kind of image to generate instead of producing arbitrary pixels. SS-GAN aims for a similar effect. The generator is not exactly conditional, as it always generates “upright” images, which are then rotated before being fed to the discriminator for rotation prediction. On the other hand, as the authors put it:
“the discriminator is trained to detect rotation angles based only on the true data.”
This prevents the generator from simply producing images whose rotation is easy to predict.
To sum it up, the discriminator has two heads. On non-rotated images, its goal is to predict real vs. fake; on rotated real images, it is to predict which of the four rotation angles was applied.
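Putting the pieces together, here is a hedged sketch of how the two objectives could be combined, following the paper's description: the rotation head of `D` is trained only on rotated real images (weighted by `beta`), while the generator gets an extra rotation term on its own rotated samples (weighted by `alpha`). It reuses the hypothetical `rotate_batch` helper and two-head discriminator from the sketch above; the default weight values are illustrative, not the paper's.

```python
import torch
import torch.nn.functional as F

def ssgan_losses(D, real_images, fake_images, alpha=1.0, beta=1.0):
    """Sketch of SS-GAN objectives for a two-head discriminator D that
    returns (adv_logit, rot_logits). alpha weights the rotation loss on
    fakes (generator objective), beta on reals (discriminator objective)."""
    # Adversarial terms: D sees real images and detached fakes.
    d_real_logit, _ = D(real_images)
    d_fake_logit, _ = D(fake_images.detach())
    adv_d = (F.binary_cross_entropy_with_logits(d_real_logit, torch.ones_like(d_real_logit)) +
             F.binary_cross_entropy_with_logits(d_fake_logit, torch.zeros_like(d_fake_logit)))

    # D's rotation head is trained on rotated *real* images only.
    rot_real, rot_labels_real = rotate_batch(real_images)
    _, rot_logits_real = D(rot_real)
    d_loss = adv_d + beta * F.cross_entropy(rot_logits_real, rot_labels_real)

    # Generator: fool D on upright fakes, and make the rotation of its
    # own rotated fakes easy for D's rotation head to recognize.
    g_adv_logit, _ = D(fake_images)
    rot_fake, rot_labels_fake = rotate_batch(fake_images)
    _, rot_logits_fake = D(rot_fake)
    g_loss = (F.binary_cross_entropy_with_logits(g_adv_logit, torch.ones_like(g_adv_logit)) +
              alpha * F.cross_entropy(rot_logits_fake, rot_labels_fake))

    # In a training loop, d_loss would update only D's parameters and
    # g_loss only G's (e.g. by stepping the corresponding optimizer).
    return d_loss, g_loss
```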
Experiments
They use standard ResNet-based architectures for the generator and discriminator, taken from the unconditional GANs that SS-GAN is compared against. The weight of the rotation loss is controlled by two hyperparameters, one for the rotation loss on real images (in the discriminator's objective) and one for the rotation loss on fake images (in the generator's objective).
To compare sample quality, the authors use the Fréchet Inception Distance (FID).
Further, the results can be described using the figure below:
The important thing to note is the performance improvement that self-supervision provides over unconditional-GANs.
Conclusion
In my opinion, this work opens a new line of GANs in which we can get the stable image generation of conditional GANs without using labelled data. Replacing the discriminator with state-of-the-art models could help improve results further. The authors also propose using the method in a semi-supervised setting, with a small number of labels, for further improvement.