In this notebook, you’re going to create another GAN using the MNIST dataset. You will implement a Deep Convolutional GAN (DCGAN), a very successful and influential GAN model developed in 2015.
Note: here is the paper if you are interested! It might look dense now, but soon you’ll be able to understand many parts of it :)
- Get hands-on experience making a widely used GAN: Deep Convolutional GAN (DCGAN).
- Train a powerful generative model.
Figure: Architectural drawing of a generator from DCGAN, from Radford et al. (2016).
Here are the main features of DCGAN (don’t worry about memorizing these, you will be guided through the implementation!):
Architecture guidelines for stable Deep Convolutional GANs
- Use convolutions without any pooling layers
- Use batchnorm in both the generator and the discriminator
- Don’t use fully connected hidden layers
- Use ReLU activation in the generator for all layers except for the output, which uses a Tanh activation.
- Use LeakyReLU activation in the discriminator for all layers except for the output, which does not use an activation
You will begin by importing some useful packages and data that will help you create your GAN. You are also provided a visualizer function to help see the images your GAN will create.
The first component you will make is the generator. You may notice that instead of passing in the image dimension, you will pass the number of image channels to the generator. This is because DCGAN uses convolutions, whose parameters don’t depend on the number of pixels in an image. However, the number of channels is needed to determine the size of the filters.
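To make the point about channels concrete, here is a minimal sketch (the channel counts and kernel size are arbitrary values chosen for illustration, not the assignment's settings). It shows that the weights of a transposed convolution are shaped by the channel counts and kernel size only, and that the same layer accepts inputs of different spatial sizes.

```python
import torch
from torch import nn

# A transposed convolution is defined by its channel counts and kernel size,
# not by the height and width of its input. (Illustrative values only.)
layer = nn.ConvTranspose2d(in_channels=8, out_channels=4, kernel_size=3, stride=2)

# Weight shape is (in_channels, out_channels, kernel_height, kernel_width).
weight_shape = tuple(layer.weight.shape)
print(weight_shape)  # (8, 4, 3, 3)

# The same layer can process feature maps of different spatial sizes.
small = layer(torch.randn(1, 8, 3, 3))
large = layer(torch.randn(1, 8, 6, 6))
print(tuple(small.shape))  # (1, 4, 7, 7)
print(tuple(large.shape))  # (1, 4, 13, 13)
```

With no padding, the output side length is `(input - 1) * stride + kernel_size`, which gives 7 and 13 for the two inputs here.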
You will build a generator using 4 layers (3 hidden layers + 1 output layer). As before, you will need to write a function to create a single block for the generator’s neural network.
Since in DCGAN the activation function will be different for the output layer, you will need to check what layer is being created. You are supplied with some tests following the code cell so you can see if you’re on the right track!
At the end of the generator class, you are given a forward pass function that takes in a noise vector and generates an image of the output dimension using your neural network. You are also given a function to create a noise vector. These functions are the same as the ones from the last assignment.
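As a rough sketch of what a generator block and a noise function could look like, consider the code below. The names `make_gen_block_sketch` and `get_noise_sketch`, the default kernel size and stride, and the channel counts are all assumptions made for illustration; they are not the graded solution. Hidden blocks follow the guidelines above (transposed convolution, batchnorm, ReLU), while the output block uses Tanh.

```python
import torch
from torch import nn

def make_gen_block_sketch(input_channels, output_channels,
                          kernel_size=3, stride=2, final_layer=False):
    """Hypothetical helper: build one generator block in the DCGAN style."""
    conv = nn.ConvTranspose2d(input_channels, output_channels, kernel_size, stride)
    if final_layer:
        # Output layer: no batchnorm, Tanh maps values into [-1, 1].
        return nn.Sequential(conv, nn.Tanh())
    # Hidden layer: transposed convolution, then batchnorm, then ReLU.
    return nn.Sequential(conv, nn.BatchNorm2d(output_channels), nn.ReLU(inplace=True))

def get_noise_sketch(n_samples, z_dim, device="cpu"):
    """Hypothetical helper: sample standard-normal noise vectors."""
    return torch.randn(n_samples, z_dim, device=device)

# Treat each noise vector as a 1x1 "image" with z_dim channels.
noise = get_noise_sketch(5, 10).view(5, 10, 1, 1)
hidden_out = make_gen_block_sketch(10, 20)(noise)
final_out = make_gen_block_sketch(20, 1, final_layer=True)(hidden_out)
print(tuple(hidden_out.shape))  # (5, 20, 3, 3)
print(tuple(final_out.shape))   # (5, 1, 7, 7)
```

Each block upsamples its input, so stacking a few of them grows a 1x1 noise "image" into a full-sized output image.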
Optional hint for
1. You'll find [nn.ConvTranspose2d](https://pytorch.org/docs/master/generated/torch.nn.ConvTranspose2d.html) and [nn.BatchNorm2d](https://pytorch.org/docs/master/generated/torch.nn.BatchNorm2d.html) useful!
# UNQ_C1 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# UNQ_C2 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
Here’s the test for your generator block:
# UNIT TESTS
The second component you need to create is the discriminator.
You will use 3 layers in your discriminator’s neural network. As with the generator, you will need to create a function that builds a single neural network block for the discriminator.
From the paper, we know that we need to “[u]se LeakyReLU activation in the discriminator for all layers.” And for the LeakyReLUs, “the slope of the leak was set to 0.2” in DCGAN.
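A discriminator block could be sketched along the following lines. The name `make_disc_block_sketch`, the default kernel size and stride, and the channel counts are assumptions for illustration rather than the graded solution. Hidden blocks use a convolution, batchnorm, and LeakyReLU with a slope of 0.2; the output block is left without an activation, as in the guidelines listed earlier, so it returns raw scores.

```python
import torch
from torch import nn

def make_disc_block_sketch(input_channels, output_channels,
                           kernel_size=4, stride=2, final_layer=False):
    """Hypothetical helper: build one discriminator block in the DCGAN style."""
    conv = nn.Conv2d(input_channels, output_channels, kernel_size, stride)
    if final_layer:
        # Output layer: convolution only, so the block returns raw scores (logits).
        return nn.Sequential(conv)
    # Hidden layer: convolution, then batchnorm, then LeakyReLU with slope 0.2.
    return nn.Sequential(conv, nn.BatchNorm2d(output_channels),
                         nn.LeakyReLU(0.2, inplace=True))

# A batch of 5 single-channel 28x28 images (MNIST-sized), filled with random values.
images = torch.randn(5, 1, 28, 28)
hidden_out = make_disc_block_sketch(1, 16)(images)
final_out = make_disc_block_sketch(16, 1, final_layer=True)(hidden_out)
print(tuple(hidden_out.shape))  # (5, 16, 13, 13)
print(tuple(final_out.shape))   # (5, 1, 5, 5)

# LeakyReLU keeps positive values and scales negative values by the slope.
leaky_out = nn.LeakyReLU(0.2)(torch.tensor([-1.0, 2.0]))
```

With a kernel size of 4 and a stride of 2, a 28x28 input shrinks to 13x13, then 5x5, then 1x1, so three such blocks reduce an MNIST image to a single score.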
There are also tests at the end for you to use.
Optional hint for
1. You'll find [nn.Conv2d](https://pytorch.org/docs/master/generated/torch.nn.Conv2d.html), [nn.BatchNorm2d](https://pytorch.org/docs/master/generated/torch.nn.BatchNorm2d.html), and [nn.LeakyReLU](https://pytorch.org/docs/master/generated/torch.nn.LeakyReLU.html) useful!
# UNQ_C3 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# UNQ_C4 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
Here’s a test for your discriminator block:
# Test the hidden block
Now you can put it all together!
Remember that these are your parameters:
- criterion: the loss function
- n_epochs: the number of times you iterate through the entire dataset when training
- z_dim: the dimension of the noise vector
- display_step: how often to display/visualize the images
- batch_size: the number of images per forward/backward pass
- lr: the learning rate
- beta_1, beta_2: the momentum terms
- device: the device type
In addition, be warned that this runs very slowly on the default CPU. One way to run it more quickly is to download the .ipynb, upload it to Google Drive, and open it with Google Colab. There, click Runtime -> Change runtime type, set the hardware accelerator to GPU, and replace device = "cpu" with device = "cuda". The code should then run without any further changes, roughly 1,000 times faster.
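If you would rather not edit the device string by hand, one common pattern (a sketch, not part of the original notebook) is to pick the device based on whether PyTorch can see a GPU, using `torch.cuda.is_available()`:

```python
import torch

# Fall back to the CPU when no CUDA-capable GPU is visible to PyTorch.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Tensors (and models, via .to(device)) can then be placed on that device.
x = torch.randn(2, 3, device=device)
print(x.device.type)
```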
criterion = nn.BCEWithLogitsLoss()
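This choice of criterion is why the discriminator's output layer needs no activation: `nn.BCEWithLogitsLoss` applies the sigmoid internally before computing binary cross-entropy. The small check below (with made-up scores and labels) compares it against applying `torch.sigmoid` followed by `nn.BCELoss` by hand.

```python
import torch
from torch import nn

# Made-up raw discriminator scores (logits) and target labels for illustration.
logits = torch.tensor([[-2.0], [0.0], [3.0]])
targets = torch.tensor([[0.0], [1.0], [1.0]])

# Loss computed directly from logits.
with_logits = nn.BCEWithLogitsLoss()(logits, targets)

# Equivalent computation: apply the sigmoid first, then plain BCE.
manual = nn.BCELoss()(torch.sigmoid(logits), targets)

same = torch.allclose(with_logits, manual, atol=1e-6)
print(same)
```

The two values agree up to floating-point error; the fused version is generally preferred because it is more numerically stable for large-magnitude scores.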
Then, you can initialize your generator, discriminator, and optimizers.
gen = Generator(z_dim).to(device)
Finally, you can train your GAN!
For each epoch, you will process the entire dataset in batches. For every batch, you will update the discriminator and generator. Then, you can see DCGAN’s results!
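The per-batch updates can be sketched as follows. This is a simplified illustration under several assumptions: the two networks are tiny stand-ins rather than the DCGAN models, the "real" batch is random data, the use of the Adam optimizer is assumed here, and the values of `lr`, `beta_1`, and `beta_2` are those reported in the DCGAN paper rather than values taken from this notebook.

```python
import torch
from torch import nn

torch.manual_seed(0)
z_dim, batch_size, device = 8, 4, "cpu"
lr, beta_1, beta_2 = 0.0002, 0.5, 0.999  # assumed values for this sketch

# Tiny stand-in networks (illustration only, not the DCGAN architectures).
gen = nn.Sequential(nn.ConvTranspose2d(z_dim, 1, kernel_size=7), nn.Tanh()).to(device)
disc = nn.Sequential(nn.Conv2d(1, 1, kernel_size=7), nn.Flatten()).to(device)

criterion = nn.BCEWithLogitsLoss()
gen_opt = torch.optim.Adam(gen.parameters(), lr=lr, betas=(beta_1, beta_2))
disc_opt = torch.optim.Adam(disc.parameters(), lr=lr, betas=(beta_1, beta_2))

# Placeholder "real" batch scaled to [-1, 1], standing in for one dataloader batch.
real = torch.rand(batch_size, 1, 7, 7, device=device) * 2 - 1

# Discriminator update: push fake scores toward 0 and real scores toward 1.
disc_opt.zero_grad()
fake = gen(torch.randn(batch_size, z_dim, 1, 1, device=device))
fake_pred = disc(fake.detach())  # detach so this step does not update the generator
real_pred = disc(real)
disc_loss = (criterion(fake_pred, torch.zeros_like(fake_pred))
             + criterion(real_pred, torch.ones_like(real_pred))) / 2
disc_loss.backward()
disc_opt.step()

# Generator update: try to make the discriminator score fakes as real (label 1).
gen_opt.zero_grad()
fake = gen(torch.randn(batch_size, z_dim, 1, 1, device=device))
gen_pred = disc(fake)
gen_loss = criterion(gen_pred, torch.ones_like(gen_pred))
gen_loss.backward()
gen_opt.step()

print(disc_loss.item(), gen_loss.item())
```

In a full training loop, these two updates would run once per batch inside a loop over the dataloader, which in turn sits inside a loop over epochs.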
Here’s roughly the progression you should be expecting. On GPU this takes about 30 seconds per thousand steps. On CPU, this can take about 8 hours per thousand steps. You might notice that in the image of Step 5000, the generator is disproportionately producing things that look like ones. If the discriminator didn’t learn to detect this imbalance quickly enough, then the generator could just produce more ones. As a result, it may have ended up tricking the discriminator so well that there would be no more improvement, a failure known as mode collapse:
n_epochs = 50