
FGO StyleGAN: This Heroic Spirit Doesn’t Exist

 4 years ago
source link: https://www.tuicool.com/articles/BV3iUv3

When I first saw the results from Nvidia’s StyleGAN, they looked like a bunch of black magic. I am not as experienced with GANs as with other parts of deep learning, and that lack of experience, plus the thought that I lacked the GPU firepower to train my own StyleGAN, stopped me from jumping in sooner. For scale, the StyleGAN GitHub repository lists expected training times: around 1 week to train from scratch on 8 GPUs, and around 40 days on a single GPU. Running one of my GPU rigs for 40 days sounded terrifying in terms of both time and my electric bill. With those constraints I set aside my ambitions of training a StyleGAN for a while.

amim2i6.jpg
FGO StyleGAN outputs

Leap of Faith: Custom FGO StyleGAN

While I enjoy black magic as much as the next person, I also enjoy understanding what is happening, demystifying things where I can, and building my own versions of things.

A few weeks ago a teammate sent me some LinkedIn videos of fashion models morphing into one another, in a style I recognized as an application of StyleGAN. Digging into it more, I saw that the community had done a lot of work around StyleGAN since the last time I had looked. Personally I do a lot of work in PyTorch these days, but when you are adapting research to your own projects it is often easiest to use whatever tools the research was done with. In this case, while there is a PyTorch port that seems fairly functional, the best course of action was to use the TensorFlow-based code that the research was done with, which Nvidia has open-sourced.

What caused me to take the leap of faith and customize my own StyleGAN, though, was Gwern Branwen’s work on the website “This Waifu Does Not Exist”. Frankly, I would not have bothered to devote time and resources to training a StyleGAN if I had not seen Gwern’s post walking through their StyleGAN work. They were also kind enough to provide pretrained weights for an anime-based StyleGAN trained at 512×512 resolution.

Gwern’s post showcases anime-based StyleGANs that they trained themselves or that others trained using the weights they provided. My project is similar to those (someone in the community trained a “Saber face” StyleGAN, for example), but the StyleGAN for this post is a general Fate Grand Order StyleGAN.

Brief GAN Background

Generative Adversarial Networks (GANs) are an interesting area of deep learning where the training process involves two networks: a generator and a discriminator. The generator creates images starting from random noise, while the discriminator looks at both training examples and generator output and predicts whether each one is “real” or “fake”. Over time this feedback helps the generator create more realistic images.
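
To make the generator/discriminator feedback loop concrete, here is a toy sketch in plain Python. It is not StyleGAN and not code from this project: the "images" are single numbers, the generator only learns one offset `b`, and the discriminator is a one-feature logistic classifier. All names (`train_toy_gan`, `real_mean`, and so on) and the hyperparameters are assumptions made up for illustration.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=200, batch=32, lr=0.05, real_mean=4.0, seed=0):
    """Toy 1-D GAN: real data ~ N(real_mean, 1), fake data = noise + b."""
    rng = random.Random(seed)
    b = 0.0          # generator parameter (hypothetical, one offset)
    w, c = 0.0, 0.0  # discriminator: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        real = [rng.gauss(real_mean, 1.0) for _ in range(batch)]
        fake = [rng.gauss(0.0, 1.0) + b for _ in range(batch)]

        # Discriminator step: push D(real) toward 1 and D(fake) toward 0
        # by descending -log D(real) - log(1 - D(fake)).
        grad_w = grad_c = 0.0
        for x in real:
            s = sigmoid(w * x + c)
            grad_w += (s - 1.0) * x
            grad_c += s - 1.0
        for x in fake:
            s = sigmoid(w * x + c)
            grad_w += s * x
            grad_c += s
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator step: use the discriminator's feedback to make fakes
        # look "real" by descending -log D(fake) with respect to b.
        grad_b = 0.0
        for x in fake:
            s = sigmoid(w * x + c)
            grad_b += (s - 1.0) * w
        b -= lr * grad_b / batch
    return b, w, c


if __name__ == "__main__":
    b, w, c = train_toy_gan()
    print("generator offset:", b, "discriminator weights:", w, c)
```

The discriminator's feedback pushes the generator's offset toward the real data. Toy setups like this can oscillate rather than settle, so treat the printed numbers as illustrative only.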

StyleGAN is an improvement over a previous Nvidia model called ProGAN. ProGAN was trained to generate high-quality 1024×1024 images, and did so with a progressive training cycle: it starts training on low-resolution images (4×4) and increases the resolution over time by adding layers. Starting at low resolution made training faster and improved the quality of the final images, as the networks were able to learn important lower-level characteristics. However, ProGAN offers limited control over the generated images, which is where StyleGAN comes in. StyleGAN is based on ProGAN, with additions to the generator network that allow control over three types of features.
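
As a rough illustration of the progressive schedule described above, the resolution doubles at each stage from 4×4 up to the final size. The helper name `progressive_resolutions` is hypothetical and is not part of the Nvidia code.

```python
def progressive_resolutions(start=4, final=1024):
    """List the square resolutions visited when doubling from start to final."""
    if start <= 0 or final < start:
        raise ValueError("need 0 < start <= final")
    stages = []
    res = start
    while res <= final:
        stages.append(res)
        res *= 2
    if stages[-1] != final:
        raise ValueError("final must be start times a power of two")
    return stages


if __name__ == "__main__":
    print(progressive_resolutions())           # 9 stages, 4 up to 1024
    print(progressive_resolutions(final=512))  # 8 stages, 4 up to 512
```

For the 512×512 anime model mentioned earlier, the same doubling gives eight stages instead of nine.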

  1. Coarse: affects pose, general hair style, face shape, etc.
  2. Middle: affects finer facial features, hair style, eyes open/closed, etc.
  3. Fine: affects color scheme (eye, hair and skin) and micro features.
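
One way to picture the three groups above is as ranges of generator layers, each of which receives its own style input. The sketch below assumes the grouping used in the StyleGAN paper's style-mixing experiments as I understand it (two style inputs per resolution; 4×4 to 8×8 is coarse, 16×16 to 32×32 is middle, 64×64 and up is fine). The function names and the use of plain lists as stand-ins for latent vectors are illustrative assumptions, not the Nvidia API.

```python
def layer_groups(resolution=512):
    """Map coarse/middle/fine to style-layer index ranges (assumed grouping)."""
    if resolution < 64 or resolution & (resolution - 1):
        raise ValueError("resolution must be a power of two, at least 64")
    num_resolutions = resolution.bit_length() - 2  # 4, 8, ..., resolution
    num_layers = 2 * num_resolutions               # two style inputs each
    return {
        "coarse": range(0, 4),           # 4x4 and 8x8
        "middle": range(4, 8),           # 16x16 and 32x32
        "fine": range(8, num_layers),    # 64x64 and above
    }


def mix_styles(base, donor, group, resolution=512):
    """Copy the donor's styles into the base for one group of layers."""
    groups = layer_groups(resolution)
    if len(base) != len(donor) or len(base) != groups["fine"].stop:
        raise ValueError("expected one style per layer for this resolution")
    mixed = list(base)
    for i in groups[group]:
        mixed[i] = donor[i]
    return mixed


if __name__ == "__main__":
    base = ["A"] * 16   # stand-ins for per-layer style vectors of image A
    donor = ["B"] * 16  # stand-ins for per-layer style vectors of image B
    print(mix_styles(base, donor, "coarse"))
```

Swapping only the coarse range would, under this assumption, borrow pose and face shape from the donor while keeping the base image's finer features and colors.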

This is just a brief description of StyleGAN; for more information, check out the paper or other write-ups on Medium.

Q3UbQrf.jpg
This one shows a few male faces. However, a lot of them turn into super evil-looking images. Maybe guys are just evil? Who knows. It also shows a number of lower-quality generated images, probably because I did not properly remove low-resolution images when I created the dataset.

Dataset Building and Preparation

The hardest part of getting StyleGAN running was deciding how I wanted to approach the problem and getting a properly formatted dataset. I made some mistakes along the way, which I will walk through as well. In line with the original paper and many other StyleGANs I have seen, I decided to make a dataset of headshots. As for the topic, the only datasets of sufficient size I had lying around were related to the anime Fate Grand Order (FGO).
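
One formatting step worth sketching is dropping images that are too small for the target resolution, since low-resolution inputs were the likely cause of the weaker outputs mentioned earlier. This is a minimal sketch, not the script used for this project: the `(filename, width, height)` records, the function name and the 512-pixel threshold (chosen to match the 512×512 pretrained weights) are all assumptions.

```python
def split_by_resolution(images, min_side=512):
    """Split (filename, width, height) records into kept and rejected names."""
    kept, rejected = [], []
    for name, width, height in images:
        # An image is usable only if its shorter side reaches the target size.
        if min(width, height) >= min_side:
            kept.append(name)
        else:
            rejected.append(name)
    return kept, rejected


if __name__ == "__main__":
    # Hypothetical records; real sizes would come from reading the image files.
    records = [("a.png", 600, 800), ("b.png", 300, 900), ("c.png", 512, 512)]
    kept, rejected = split_by_resolution(records)
    print("kept:", kept)
    print("rejected:", rejected)
```

Keeping the rejected list around makes it easy to spot-check what was filtered out before committing to a long training run.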

