Your First GAN

Goal

In this notebook, you’re going to create your first generative adversarial network (GAN) for this course! Specifically, you will build and train a GAN that can generate hand-written images of digits (0-9). You will be using PyTorch in this specialization, so if you’re not familiar with this framework, you may find the PyTorch documentation useful. The hints will also often include links to relevant documentation.

Learning Objectives

  1. Build the generator and discriminator components of a GAN from scratch.
  2. Create generator and discriminator loss functions.
  3. Train your GAN and visualize the generated images.

Getting Started

You will begin by importing some useful packages and the dataset you will use to build and train your GAN. You are also provided with a visualizer function to help you investigate the images your GAN will create.

import torch
from torch import nn
from tqdm.auto import tqdm
from torchvision import transforms
from torchvision.datasets import MNIST # Training dataset
from torchvision.utils import make_grid
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
torch.manual_seed(0) # Set for testing purposes, please do not change!

def show_tensor_images(image_tensor, num_images=25, size=(1, 28, 28)):
    '''
    Function for visualizing images: Given a tensor of images, number of images, and
    size per image, plots and prints the images in a uniform grid.
    '''
    image_unflat = image_tensor.detach().cpu().view(-1, *size)
    image_grid = make_grid(image_unflat[:num_images], nrow=5)
    plt.imshow(image_grid.permute(1, 2, 0).squeeze())
    plt.show()

MNIST Dataset

The training images your discriminator will be using are from a dataset called MNIST. It contains 60,000 images of handwritten digits, from 0 to 9, like these:

MNIST Digits

You may notice that the images are quite pixelated — this is because they are all only 28 x 28! The small size of its images makes MNIST ideal for simple training. Additionally, these images are also in black-and-white so only one dimension, or “color channel”, is needed to represent them (more on this later in the course).

Tensor

You will represent the data using tensors. Tensors are a generalization of matrices: for example, a stack of three matrices with the amounts of red, green, and blue at different locations in a 64 x 64 pixel image is a tensor with the shape 3 x 64 x 64.

Tensors are easy to manipulate and supported by PyTorch, the machine learning library you will be using. Feel free to explore them more, but you can imagine these as multi-dimensional matrices or vectors!
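To make the shape notation concrete, here is a minimal sketch that stacks three 64 x 64 matrices into a 3 x 64 x 64 tensor and flattens an MNIST-sized image into a vector. The variable names and the constant pixel values are placeholders chosen for illustration only.

```python
import torch

# Three 64 x 64 matrices (red, green, blue amounts) stacked into one tensor.
red = torch.zeros(64, 64)
green = torch.zeros(64, 64)
blue = torch.ones(64, 64)
image = torch.stack([red, green, blue])
print(image.shape)  # torch.Size([3, 64, 64])

# A single-channel 28 x 28 image, flattened into a vector of 28 * 28 = 784 values.
gray = torch.rand(1, 28, 28)
flat = gray.view(-1)
print(flat.shape)   # torch.Size([784])
```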

Batches

While you could train your model after generating one image, it is extremely inefficient and leads to less stable training. In GANs, and in machine learning in general, you will process multiple images per training step. These are called batches.

This means that your generator will generate an entire batch of images and receive the discriminator’s feedback on each before updating the model. The same goes for the discriminator: it will calculate its loss on the entire batch of generated images as well as on the reals before the model is updated.
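As a rough illustration of batching, the sketch below feeds random stand-in data (not MNIST itself) through a DataLoader and flattens each batch. The dataset size of 300 is an arbitrary assumption, picked so that the last batch comes out smaller than the others, which is why training code typically reads the batch size from the batch itself.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: 300 random "images" shaped like MNIST (1 x 28 x 28) with dummy labels.
images = torch.rand(300, 1, 28, 28)
labels = torch.zeros(300, dtype=torch.long)
loader = DataLoader(TensorDataset(images, labels), batch_size=128, shuffle=True)

batch_shapes = []
for real, _ in loader:
    cur_batch_size = len(real)            # the last batch may be smaller
    flat = real.view(cur_batch_size, -1)  # flatten each image to 784 values
    batch_shapes.append(tuple(flat.shape))

print(batch_shapes)  # [(128, 784), (128, 784), (44, 784)]
```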

Generator

The first step is to build the generator component.

You will start by creating a function to make a single layer/block for the generator’s neural network. Each block should include a linear transformation to map to another shape, a batch normalization for stabilization, and finally a non-linear activation function (you use a ReLU here) so the output can be transformed in complex ways. You will learn more about activations and batch normalization later in the course.

# UNQ_C1 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: get_generator_block
def get_generator_block(input_dim, output_dim):
    '''
    Function for returning a block of the generator's neural network
    given input and output dimensions.
    Parameters:
        input_dim: the dimension of the input vector, a scalar
        output_dim: the dimension of the output vector, a scalar
    Returns:
        a generator neural network layer, with a linear transformation
          followed by a batch normalization and then a relu activation
    '''
    return nn.Sequential(
        # Hint: Replace all of the "None" with the appropriate dimensions.
        # The documentation may be useful if you're less familiar with PyTorch:
        # https://pytorch.org/docs/stable/nn.html.
        #### START CODE HERE ####
        nn.Linear(input_dim, output_dim),
        nn.BatchNorm1d(output_dim),
        nn.ReLU(inplace=True),
        #### END CODE HERE ####
    )

# Verify the generator block function
def test_gen_block(in_features, out_features, num_test=1000):
    block = get_generator_block(in_features, out_features)

    # Check the three parts
    assert len(block) == 3
    assert type(block[0]) == nn.Linear
    assert type(block[1]) == nn.BatchNorm1d
    assert type(block[2]) == nn.ReLU

    # Check the output shape
    test_input = torch.randn(num_test, in_features)
    test_output = block(test_input)
    assert tuple(test_output.shape) == (num_test, out_features)
    assert test_output.std() > 0.55
    assert test_output.std() < 0.65

test_gen_block(25, 12)
test_gen_block(15, 28)
print("Success!")
Success!
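
The bounds of 0.55 and 0.65 on the standard deviation in the test above have a simple explanation. As a rough sketch, assuming the batch-normalized values are approximately standard normal, passing them through a ReLU gives a standard deviation of sqrt(1/2 - 1/(2π)), which is about 0.58. The snippet below checks this numerically; the block is rebuilt inline (rather than calling get_generator_block) only so that the example is self-contained.

```python
import math
import torch
from torch import nn

torch.manual_seed(0)

# Same structure as a generator block: linear map, batch norm, ReLU.
block = nn.Sequential(nn.Linear(25, 12), nn.BatchNorm1d(12), nn.ReLU())
out = block(torch.randn(1000, 25))

# Standard deviation of max(0, Z) for Z ~ N(0, 1).
expected = math.sqrt(0.5 - 1 / (2 * math.pi))
print(round(expected, 3))  # 0.584
print(out.std().item())    # should land near the value above
```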

Now you can build the generator class. It will take 3 values:

  • The noise vector dimension
  • The image dimension
  • The initial hidden dimension

Using these values, the generator will build a neural network with 5 layers/blocks. Beginning with the noise vector, the generator will apply non-linear transformations via the block function until the tensor is mapped to the size of the image to be outputted (the same size as the real images from MNIST). You will need to fill in the code for the final layer since it is different from the others. The final layer does not need batch normalization or a ReLU activation, but its output does need to be scaled to values between 0 and 1 with a sigmoid function.

Finally, you are given a forward pass function that takes in a noise vector and generates an image of the output dimension using your neural network.

Optional hints for Generator

  1. The output size of the final linear transformation should be im_dim, but remember you need to scale the outputs between 0 and 1 using the sigmoid function.
  2. [nn.Linear](https://pytorch.org/docs/master/generated/torch.nn.Linear.html) and [nn.Sigmoid](https://pytorch.org/docs/master/generated/torch.nn.Sigmoid.html) will be useful here.

# UNQ_C2 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: Generator
class Generator(nn.Module):
    '''
    Generator Class
    Values:
        z_dim: the dimension of the noise vector, a scalar
        im_dim: the dimension of the images, fitted for the dataset used, a scalar
          (MNIST images are 28 x 28 = 784 so that is your default)
        hidden_dim: the inner dimension, a scalar
    '''
    def __init__(self, z_dim=10, im_dim=784, hidden_dim=128):
        super(Generator, self).__init__()
        # Build the neural network
        self.gen = nn.Sequential(
            get_generator_block(z_dim, hidden_dim),
            get_generator_block(hidden_dim, hidden_dim * 2),
            get_generator_block(hidden_dim * 2, hidden_dim * 4),
            get_generator_block(hidden_dim * 4, hidden_dim * 8),
            # There is a dropdown with hints if you need them!
            #### START CODE HERE ####
            nn.Linear(hidden_dim * 8, im_dim),
            nn.Sigmoid()
            #### END CODE HERE ####
        )
    def forward(self, noise):
        '''
        Function for completing a forward pass of the generator: Given a noise tensor,
        returns generated images.
        Parameters:
            noise: a noise tensor with dimensions (n_samples, z_dim)
        '''
        return self.gen(noise)

    # Needed for grading
    def get_gen(self):
        '''
        Returns:
            the sequential model
        '''
        return self.gen

# Verify the generator class
def test_generator(z_dim, im_dim, hidden_dim, num_test=10000):
    gen = Generator(z_dim, im_dim, hidden_dim).get_gen()

    # Check there are six modules in the sequential part
    assert len(gen) == 6
    test_input = torch.randn(num_test, z_dim)
    test_output = gen(test_input)

    # Check that the output shape is correct
    assert tuple(test_output.shape) == (num_test, im_dim)
    assert test_output.max() < 1, "Make sure to use a sigmoid"
    assert test_output.min() > 0, "Make sure to use a sigmoid"
    assert test_output.std() > 0.05, "Don't use batchnorm here"
    assert test_output.std() < 0.15, "Don't use batchnorm here"

test_generator(5, 10, 20)
test_generator(20, 8, 24)
print("Success!")
Success!

Noise

To be able to use your generator, you will need to be able to create noise vectors. The noise vector z has the important role of making sure the images generated from the same class don’t all look the same — think of it as a random seed. You will generate it randomly using PyTorch by sampling random numbers from the normal distribution. Since multiple images will be processed per pass, you will generate all the noise vectors at once.

Note that whenever you create a new tensor using torch.ones, torch.zeros, or torch.randn, you either need to create it on the target device, e.g. torch.ones(3, 3, device=device), or move it onto the target device using torch.ones(3, 3).to(device). You do not need to do this if you’re creating a tensor by manipulating another tensor or by using a variation that defaults the device to the input, such as torch.ones_like. In general, use torch.ones_like and torch.zeros_like instead of torch.ones or torch.zeros where possible.
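
The three options described above can be compared side by side. This is a small sketch, not part of the assignment; the fallback to 'cpu' when no GPU is available is an assumption added so the snippet runs anywhere.

```python
import torch

# Assumption for this sketch: fall back to the CPU when CUDA is unavailable.
device = 'cuda' if torch.cuda.is_available() else 'cpu'

a = torch.ones(3, 3, device=device)  # created directly on the target device
b = torch.ones(3, 3).to(device)      # created on the CPU, then moved
c = torch.zeros_like(a)              # inherits device, dtype, and shape from `a`

print(a.device, b.device, c.device)
```

Because torch.zeros_like and torch.ones_like copy the device from their input, they keep working unchanged when the surrounding code is moved between CPU and GPU.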

Optional hint for get_noise

  1. You will probably find [torch.randn](https://pytorch.org/docs/master/generated/torch.randn.html) useful here.

# UNQ_C3 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: get_noise
def get_noise(n_samples, z_dim, device='cpu'):
    '''
    Function for creating noise vectors: Given the dimensions (n_samples, z_dim),
    creates a tensor of that shape filled with random numbers from the normal distribution.
    Parameters:
        n_samples: the number of samples to generate, a scalar
        z_dim: the dimension of the noise vector, a scalar
        device: the device type
    '''
    # NOTE: To use this on GPU with device='cuda', make sure to pass the device
    # argument to the function you use to generate the noise.
    #### START CODE HERE ####
    return torch.randn(n_samples, z_dim, device=device)
    #### END CODE HERE ####

# Verify the noise vector function
def test_get_noise(n_samples, z_dim, device='cpu'):
    noise = get_noise(n_samples, z_dim, device)

    # Make sure a normal distribution was used
    assert tuple(noise.shape) == (n_samples, z_dim)
    assert torch.abs(noise.std() - torch.tensor(1.0)) < 0.01
    assert str(noise.device).startswith(device)

test_get_noise(1000, 100, 'cpu')
if torch.cuda.is_available():
    test_get_noise(1000, 32, 'cuda')
print("Success!")
Success!

Discriminator

The second component that you need to construct is the discriminator. As with the generator component, you will start by creating a function that builds a neural network block for the discriminator.

Note: You use leaky ReLUs to prevent the “dying ReLU” problem, which refers to the phenomenon where the parameters stop changing due to consistently negative values passed to a ReLU, which result in a zero gradient. You will learn more about this in the following lectures!
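
The difference is easiest to see in the gradients. The sketch below uses three hand-picked input values (an arbitrary choice for illustration) and compares the gradient each activation passes back: the ReLU passes nothing for negative inputs, while the leaky ReLU still passes a fraction equal to its negative slope.

```python
import torch
from torch import nn

x = torch.tensor([-2.0, -0.5, 1.0], requires_grad=True)

# Gradient of the summed activation output with respect to each input.
relu_grad, = torch.autograd.grad(nn.ReLU()(x).sum(), x)
leaky_grad, = torch.autograd.grad(nn.LeakyReLU(negative_slope=0.2)(x).sum(), x)

print(relu_grad)   # tensor([0., 0., 1.])
print(leaky_grad)  # tensor([0.2000, 0.2000, 1.0000])
```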

Rectified Linear Unit (ReLU) and Leaky ReLU

# UNQ_C4 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: get_discriminator_block
def get_discriminator_block(input_dim, output_dim):
    '''
    Discriminator Block
    Function for returning a neural network of the discriminator given input and output dimensions.
    Parameters:
        input_dim: the dimension of the input vector, a scalar
        output_dim: the dimension of the output vector, a scalar
    Returns:
        a discriminator neural network layer, with a linear transformation
          followed by an nn.LeakyReLU activation with negative slope of 0.2
          (https://pytorch.org/docs/master/generated/torch.nn.LeakyReLU.html)
    '''
    return nn.Sequential(
        #### START CODE HERE ####
        nn.Linear(input_dim, output_dim),
        nn.LeakyReLU(negative_slope=0.2, inplace=True)
        #### END CODE HERE ####
    )

# Verify the discriminator block function
def test_disc_block(in_features, out_features, num_test=10000):
    block = get_discriminator_block(in_features, out_features)

    # Check there are two parts
    assert len(block) == 2
    test_input = torch.randn(num_test, in_features)
    test_output = block(test_input)

    # Check that the shape is right
    assert tuple(test_output.shape) == (num_test, out_features)

    # Check that the LeakyReLU slope is about 0.2
    assert -test_output.min() / test_output.max() > 0.1
    assert -test_output.min() / test_output.max() < 0.3
    assert test_output.std() > 0.3
    assert test_output.std() < 0.5

test_disc_block(25, 12)
test_disc_block(15, 28)
print("Success!")
Success!

Now you can use these blocks to make a discriminator! The discriminator class holds 2 values:

  • The image dimension
  • The hidden dimension

The discriminator will build a neural network with 4 layers. It will start with the image tensor and transform it until it returns a single number (a 1-dimensional tensor) for each image. This output classifies whether an image is fake or real. Note that you do not need a sigmoid after the output layer since it is included in the loss function. Finally, to use your discriminator’s neural network you are given a forward pass function that takes in an image tensor to be classified.
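
To see why the sigmoid can be left out, the following sketch compares nn.BCEWithLogitsLoss applied to raw outputs with nn.BCELoss applied to the same outputs after an explicit sigmoid. The random values stand in for discriminator outputs and are an assumption of this example; the two losses should agree up to floating-point error.

```python
import torch
from torch import nn

torch.manual_seed(0)
logits = torch.randn(8, 1) * 2       # stand-in for raw discriminator outputs
targets = torch.ones_like(logits)    # labels for "real"

# The sigmoid is applied inside the loss...
with_logits = nn.BCEWithLogitsLoss()(logits, targets)
# ...which matches applying it by hand before a plain BCE loss.
manual = nn.BCELoss()(torch.sigmoid(logits), targets)

print(with_logits.item(), manual.item())
```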

# UNQ_C5 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: Discriminator
class Discriminator(nn.Module):
    '''
    Discriminator Class
    Values:
        im_dim: the dimension of the images, fitted for the dataset used, a scalar
          (MNIST images are 28x28 = 784 so that is your default)
        hidden_dim: the inner dimension, a scalar
    '''
    def __init__(self, im_dim=784, hidden_dim=128):
        super(Discriminator, self).__init__()
        self.disc = nn.Sequential(
            get_discriminator_block(im_dim, hidden_dim * 4),
            get_discriminator_block(hidden_dim * 4, hidden_dim * 2),
            get_discriminator_block(hidden_dim * 2, hidden_dim),
            # Hint: You want to transform the final output into a single value,
            # so add one more linear map.
            #### START CODE HERE ####
            nn.Linear(hidden_dim, 1)

            #### END CODE HERE ####
        )

    def forward(self, image):
        '''
        Function for completing a forward pass of the discriminator: Given an image tensor,
        returns a 1-dimension tensor representing fake/real.
        Parameters:
            image: a flattened image tensor with dimension (im_dim)
        '''
        return self.disc(image)

    # Needed for grading
    def get_disc(self):
        '''
        Returns:
            the sequential model
        '''
        return self.disc

# Verify the discriminator class
def test_discriminator(z_dim, hidden_dim, num_test=100):

    disc = Discriminator(z_dim, hidden_dim).get_disc()

    # Check there are four parts
    assert len(disc) == 4

    # Check the linear layer is correct
    test_input = torch.randn(num_test, z_dim)
    test_output = disc(test_input)
    assert tuple(test_output.shape) == (num_test, 1)

    # Make sure there's no sigmoid
    assert test_input.max() > 1
    assert test_input.min() < -1

test_discriminator(5, 10)
test_discriminator(20, 8)
print("Success!")
Success!

Training

Now you can put it all together!
First, you will set your parameters:

  • criterion: the loss function
  • n_epochs: the number of times you iterate through the entire dataset when training
  • z_dim: the dimension of the noise vector
  • display_step: how often to display/visualize the images
  • batch_size: the number of images per forward/backward pass
  • lr: the learning rate
  • device: the device type, here using a GPU (which runs CUDA), not CPU

Next, you will load the MNIST dataset as tensors using a dataloader.

# Set your parameters
criterion = nn.BCEWithLogitsLoss()
n_epochs = 200
z_dim = 64
display_step = 500
batch_size = 128
lr = 0.00001
device = 'cuda'
# Load MNIST dataset as tensors
dataloader = DataLoader(
    MNIST('.', download=False, transform=transforms.ToTensor()),
    batch_size=batch_size,
    shuffle=True)

Now, you can initialize your generator, discriminator, and optimizers. Note that each optimizer only takes the parameters of one particular model, since we want each optimizer to optimize only one of the models.

gen = Generator(z_dim).to(device)
gen_opt = torch.optim.Adam(gen.parameters(), lr=lr)
disc = Discriminator().to(device)
disc_opt = torch.optim.Adam(disc.parameters(), lr=lr)

Before you train your GAN, you will need to create functions to calculate the discriminator’s loss and the generator’s loss. This is how the discriminator and generator will know how they are doing and improve themselves. Since the generator is needed when calculating the discriminator’s loss, you will need to call .detach() on the generator result to ensure that only the discriminator is updated!

Remember that you have already defined a loss function earlier (criterion) and you are encouraged to use torch.ones_like and torch.zeros_like instead of torch.ones or torch.zeros. If you use torch.ones or torch.zeros, you’ll need to pass device=device to them.
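
The effect of .detach() can be checked on a toy example. In the sketch below, toy_gen and toy_disc are hypothetical single-layer stand-ins (not the Generator and Discriminator classes from this notebook), used only to show where gradients do and do not flow.

```python
import torch
from torch import nn

torch.manual_seed(0)
toy_gen = nn.Linear(4, 6)    # hypothetical stand-in for the generator
toy_disc = nn.Linear(6, 1)   # hypothetical stand-in for the discriminator
criterion = nn.BCEWithLogitsLoss()

fake = toy_gen(torch.randn(5, 4))

# Discriminator step: detaching cuts the graph at the fake images,
# so backpropagation stops before it reaches the generator.
fake_pred = toy_disc(fake.detach())
disc_loss = criterion(fake_pred, torch.zeros_like(fake_pred))
disc_loss.backward()
gen_grad_after_disc_step = toy_gen.weight.grad
print(gen_grad_after_disc_step is None)  # True

# Generator step: no detach, so gradients flow through the
# discriminator and back into the generator's weights.
gen_pred = toy_disc(fake)
gen_loss = criterion(gen_pred, torch.ones_like(gen_pred))
gen_loss.backward()
print(toy_gen.weight.grad is not None)   # True
```

Note the labels: the discriminator step compares the fake predictions to zeros, while the generator step compares them to ones, since the generator wants its fakes classified as real.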

# UNQ_C6 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: get_disc_loss
def get_disc_loss(gen, disc, criterion, real, num_images, z_dim, device):
    '''
    Return the loss of the discriminator given inputs.
    Parameters:
        gen: the generator model, which returns an image given z-dimensional noise
        disc: the discriminator model, which returns a single-dimensional prediction of real/fake
        criterion: the loss function, which should be used to compare
               the discriminator's predictions to the ground truth reality of the images
               (e.g. fake = 0, real = 1)
        real: a batch of real images
        num_images: the number of images the generator should produce,
                which is also the length of the real images
        z_dim: the dimension of the noise vector, a scalar
        device: the device type
    Returns:
        disc_loss: a torch scalar loss value for the current batch
    '''
    #     These are the steps you will need to complete:
    #       1) Create noise vectors and generate a batch (num_images) of fake images.
    #            Make sure to pass the device argument to the noise.
    #       2) Get the discriminator's prediction of the fake image
    #            and calculate the loss. Don't forget to detach the generator!
    #            (Remember the loss function you set earlier -- criterion. You need a
    #            'ground truth' tensor in order to calculate the loss.
    #            For example, a ground truth tensor for a fake image is all zeros.)
    #       3) Get the discriminator's prediction of the real image and calculate the loss.
    #       4) Calculate the discriminator's loss by averaging the real and fake loss
    #            and set it to disc_loss.
    #     Note: Please do not use concatenation in your solution. The tests are being updated to
    #           support this, but for now, average the two losses as described in step (4).
    #     *Important*: You should NOT write your own loss function here - use criterion(pred, true)!
    #### START CODE HERE ####
    fake_images = gen(get_noise(num_images, z_dim, device=device))
    fake_images.detach_()
    fake_loss = criterion(disc(fake_images), torch.zeros((num_images, 1), device=device))
    real_loss = criterion(disc(real), torch.ones((num_images, 1), device=device))
    disc_loss = (fake_loss + real_loss) / 2.0

    #### END CODE HERE ####
    return disc_loss

def test_disc_reasonable(num_images=10):
    # Don't use explicit casts to cuda - use the device argument
    import inspect, re
    lines = inspect.getsource(get_disc_loss)
    assert (re.search(r"to\(.cuda.\)", lines)) is None
    assert (re.search(r"\.cuda\(\)", lines)) is None

    z_dim = 64
    gen = torch.zeros_like
    disc = nn.Identity()
    criterion = torch.mul # Multiply
    real = torch.ones(num_images, 1)
    disc_loss = get_disc_loss(gen, disc, criterion, real, num_images, z_dim, 'cpu')
    assert tuple(disc_loss.shape) == (num_images, z_dim)
    assert torch.all(torch.abs(disc_loss - 0.5) < 1e-5)

    gen = torch.ones_like
    disc = nn.Identity()
    criterion = torch.mul # Multiply
    real = torch.zeros(num_images, 1)
    assert torch.all(torch.abs(get_disc_loss(gen, disc, criterion, real, num_images, z_dim, 'cpu')) < 1e-5)

def test_disc_loss(max_tests = 10):
    z_dim = 64
    gen = Generator(z_dim).to(device)
    gen_opt = torch.optim.Adam(gen.parameters(), lr=lr)
    disc = Discriminator().to(device)
    disc_opt = torch.optim.Adam(disc.parameters(), lr=lr)
    num_steps = 0
    for real, _ in dataloader:
        cur_batch_size = len(real)
        real = real.view(cur_batch_size, -1).to(device)

        ### Update discriminator ###
        # Zero out the gradient before backpropagation
        disc_opt.zero_grad()

        # Calculate discriminator loss
        disc_loss = get_disc_loss(gen, disc, criterion, real, cur_batch_size, z_dim, device)
        assert (disc_loss - 0.68).abs() < 0.05

        # Update gradients
        disc_loss.backward(retain_graph=True)

        # Check that they detached correctly
        assert gen.gen[0][0].weight.grad is None

        # Update optimizer
        old_weight = disc.disc[0][0].weight.data.clone()
        disc_opt.step()
        new_weight = disc.disc[0][0].weight.data

        # Check that some discriminator weights changed
        assert not torch.all(torch.eq(old_weight, new_weight))
        num_steps += 1
        if num_steps >= max_tests:
            break

test_disc_reasonable()
test_disc_loss()
print("Success!")
Success!

# UNQ_C7 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: get_gen_loss
def get_gen_loss(gen, disc, criterion, num_images, z_dim, device):
    '''
    Return the loss of the generator given inputs.
    Parameters:
        gen: the generator model, which returns an image given z-dimensional noise
        disc: the discriminator model, which returns a single-dimensional prediction of real/fake
        criterion: the loss function, which should be used to compare
               the discriminator's predictions to the ground truth reality of the images
               (e.g. fake = 0, real = 1)
        num_images: the number of images the generator should produce,
                which is also the length of the real images
        z_dim: the dimension of the noise vector, a scalar
        device: the device type
    Returns:
        gen_loss: a torch scalar loss value for the current batch
    '''
    #     These are the steps you will need to complete:
    #       1) Create noise vectors and generate a batch of fake images.
    #           Remember to pass the device argument to the get_noise function.
    #       2) Get the discriminator's prediction of the fake image.
    #       3) Calculate the generator's loss. Remember the generator wants
    #          the discriminator to think that its fake images are real
    #     *Important*: You should NOT write your own loss function here - use criterion(pred, true)!

    #### START CODE HERE ####
    noises = get_noise(num_images, z_dim, device=device)
    fake_images = gen(noises)
    out = disc(fake_images)
    gen_loss = criterion(out, torch.ones(num_images, 1).to(device))

    #### END CODE HERE ####
    return gen_loss

def test_gen_reasonable(num_images=10):
    # Don't use explicit casts to cuda - use the device argument
    import inspect, re
    lines = inspect.getsource(get_gen_loss)
    assert (re.search(r"to\(.cuda.\)", lines)) is None
    assert (re.search(r"\.cuda\(\)", lines)) is None

    z_dim = 64
    gen = torch.zeros_like
    disc = nn.Identity()
    criterion = torch.mul # Multiply
    gen_loss_tensor = get_gen_loss(gen, disc, criterion, num_images, z_dim, 'cpu')
    assert torch.all(torch.abs(gen_loss_tensor) < 1e-5)
    # Verify shape. Related to gen_noise parametrization
    assert tuple(gen_loss_tensor.shape) == (num_images, z_dim)

    gen = torch.ones_like
    disc = nn.Identity()
    criterion = torch.mul # Multiply
    real = torch.zeros(num_images, 1)
    gen_loss_tensor = get_gen_loss(gen, disc, criterion, num_images, z_dim, 'cpu')
    assert torch.all(torch.abs(gen_loss_tensor - 1) < 1e-5)
    # Verify shape. Related to gen_noise parametrization
    assert tuple(gen_loss_tensor.shape) == (num_images, z_dim)


def test_gen_loss(num_images):
    z_dim = 64
    gen = Generator(z_dim).to(device)
    gen_opt = torch.optim.Adam(gen.parameters(), lr=lr)
    disc = Discriminator().to(device)
    disc_opt = torch.optim.Adam(disc.parameters(), lr=lr)

    gen_loss = get_gen_loss(gen, disc, criterion, num_images, z_dim, device)

    # Check that the loss is reasonable
    assert (gen_loss - 0.7).abs() < 0.1
    gen_loss.backward()
    old_weight = gen.gen[0][0].weight.clone()
    gen_opt.step()
    new_weight = gen.gen[0][0].weight
    assert not torch.all(torch.eq(old_weight, new_weight))


test_gen_reasonable(10)
test_gen_loss(18)
print("Success!")
Success!

Finally, you can put everything together! For each epoch, you will process the entire dataset in batches. For every batch, you will update the discriminator and the generator using their respective losses. Batches are sets of images that are predicted on together before the loss is calculated (instead of calculating the loss after each image). Note that you may see a loss greater than 1. This is okay, since binary cross entropy loss can be arbitrarily large for a sufficiently confident wrong guess.
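
A quick calculation shows how binary cross entropy can exceed 1. The helper bce below is a hypothetical one-sample version written for illustration (the notebook itself uses nn.BCEWithLogitsLoss); it takes the predicted probability that an image is real and the true label.

```python
import math

def bce(prob_real, label):
    """Binary cross entropy for a single prediction (illustrative helper)."""
    return -(label * math.log(prob_real) + (1 - label) * math.log(1 - prob_real))

# True label is 1 ("real") in every case below.
print(round(bce(0.5, 1), 3))   # 0.693  (undecided guess)
print(round(bce(0.9, 1), 3))   # 0.105  (confident and right)
print(round(bce(0.01, 1), 3))  # 4.605  (confident and wrong)
```

The first value, about 0.69, is also roughly what an untrained discriminator produces, which is consistent with the loss ranges checked in the tests earlier in this notebook.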

It’s also often the case that the discriminator will outperform the generator, especially at the start, because its job is easier. It’s important that neither one gets too good (that is, near-perfect accuracy), which would cause the entire model to stop learning. Balancing the two models is actually remarkably hard to do in a standard GAN and something you will see more of in later lectures and assignments.

After you’ve submitted a working version with the original architecture, feel free to play around with the architecture if you want to see how different architectural choices can lead to better or worse GANs. For example, consider changing the size of the hidden dimension, or making the networks shallower or deeper by changing the number of layers.

But remember, don’t expect anything spectacular: this is only the first lesson. The results will get better with later lessons as you learn methods to help keep your generator and discriminator at similar levels.

You should roughly expect to see this progression. On a GPU, this should take about 15 seconds per 500 steps, on average, while on CPU it will take roughly 1.5 minutes:
MNIST Digits

# UNQ_C8 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION:

cur_step = 0
mean_generator_loss = 0
mean_discriminator_loss = 0
test_generator = True # Whether the generator should be tested
gen_loss = False
error = False
for epoch in range(n_epochs):

    # Dataloader returns the batches
    for real, _ in tqdm(dataloader):
        cur_batch_size = len(real)

        # Flatten the batch of real images from the dataset
        real = real.view(cur_batch_size, -1).to(device)

        ### Update discriminator ###
        # Zero out the gradients before backpropagation
        disc_opt.zero_grad()

        # Calculate discriminator loss
        disc_loss = get_disc_loss(gen, disc, criterion, real, cur_batch_size, z_dim, device)

        # Update gradients
        disc_loss.backward(retain_graph=True)

        # Update optimizer
        disc_opt.step()

        # For testing purposes, to keep track of the generator weights
        if test_generator:
            old_generator_weights = gen.gen[0][0].weight.detach().clone()

        ### Update generator ###
        #     Hint: This code will look a lot like the discriminator updates!
        #     These are the steps you will need to complete:
        #       1) Zero out the gradients.
        #       2) Calculate the generator loss, assigning it to gen_loss.
        #       3) Backprop through the generator: update the gradients and optimizer.
        #### START CODE HERE ####

        gen_opt.zero_grad()
        gen_loss = get_gen_loss(gen, disc, criterion, cur_batch_size, z_dim, device)
        gen_loss.backward(retain_graph=True)
        gen_opt.step()

        #### END CODE HERE ####

        # For testing purposes, to check that your code changes the generator weights
        if test_generator:
            try:
                assert lr > 0.0000002 or (gen.gen[0][0].weight.grad.abs().max() < 0.0005 and epoch == 0)
                assert torch.any(gen.gen[0][0].weight.detach().clone() != old_generator_weights)
            except:
                error = True
                print("Runtime tests have failed")

        # Keep track of the average discriminator loss
        mean_discriminator_loss += disc_loss.item() / display_step

        # Keep track of the average generator loss
        mean_generator_loss += gen_loss.item() / display_step

        ### Visualization code ###
        if cur_step % display_step == 0 and cur_step > 0:
            print(f"Step {cur_step}: Generator loss: {mean_generator_loss}, discriminator loss: {mean_discriminator_loss}")
            fake_noise = get_noise(cur_batch_size, z_dim, device=device)
            fake = gen(fake_noise)
            show_tensor_images(fake)
            show_tensor_images(real)
            mean_generator_loss = 0
            mean_discriminator_loss = 0
        cur_step += 1