4 GAN

Author: Rammy

GAN stands for generative adversarial network. A GAN consists of two networks: a generator and a discriminator.

Discriminator

The job of the discriminator is to identify which images are real and which are fake, i.e. generated by the generator. Simply put, the discriminator is a supervised classifier that separates real images from fake ones, trained with BCE loss and a sigmoid activation so that its output is the probability of an image being real.

The input to the discriminator is the fake data generated by the generator (label 0) together with the real data (label 1); the output is the probability of the image being real.

/static/images/gan/gan_discriminator.png
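
A minimal sketch of such a discriminator, assuming Keras and 64x64 grayscale images (the image size, layer widths, and optimizer settings here are illustrative, not from the original post):

```python
# A small convolutional classifier: downsample with strided convolutions,
# then output a single sigmoid probability that the image is real.
from tensorflow.keras import layers, models, optimizers

discriminator = models.Sequential([
    layers.Input(shape=(64, 64, 1)),
    layers.Conv2D(64, kernel_size=4, strides=2, padding="same"),
    layers.LeakyReLU(0.2),
    layers.Conv2D(128, kernel_size=4, strides=2, padding="same"),
    layers.LeakyReLU(0.2),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),  # P(image is real)
])

# Sigmoid output pairs with BCE loss: real images labeled 1, fakes labeled 0.
discriminator.compile(optimizer=optimizers.Adam(0.0002), loss="binary_crossentropy")
```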

Generator

The job of the generator is to generate images so similar to the real images that the discriminator cannot differentiate between generated and real images.

The input to the generator is a random point sampled from a standard normal distribution, and the output is the same size as an image in the original training data.

In fact, the generator of a GAN fulfills exactly the same purpose as the decoder of a VAE: converting a vector in the latent space to an image. The concept of mapping from a latent space back to the original domain is very common in generative modeling as it gives us the ability to manipulate vectors in the latent space to change high-level features of images in the original domain.

/static/images/gan/gan_generator.png
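
Sampling an input for the generator is just a draw from a standard normal; a tiny sketch (latent_dim = 100 is an illustrative choice, and the commented generator call assumes the model defined in the next section):

```python
import numpy as np

latent_dim = 100  # illustrative latent-space dimensionality
z = np.random.standard_normal(size=(16, latent_dim))  # 16 random latent points

# images = generator.predict(z)  # -> 16 images, same size as the training images
```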

UpSampling2D

In the [[AI/GAN/3-Variational autoencoders | variational autoencoder]] we used the Conv2DTranspose layer with stride 2 to double the height and width of the tensor.

Here, however, we instead use the UpSampling2D layer to double the width and height of the input tensor. This simply repeats each row and column of its input in order to double the size. We then follow it with a normal convolutional layer with stride 1 to perform the convolution operation.
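
A minimal generator sketch built from this UpSampling2D + stride-1 Conv2D pattern, again assuming Keras (latent_dim and the layer widths are illustrative; the tanh output assumes training images scaled to [-1, 1]):

```python
from tensorflow.keras import layers, models

latent_dim = 100  # illustrative choice

generator = models.Sequential([
    layers.Input(shape=(latent_dim,)),
    layers.Dense(16 * 16 * 128, activation="relu"),
    layers.Reshape((16, 16, 128)),  # start from a small feature map
    layers.UpSampling2D(),          # repeat rows/cols: 16x16 -> 32x32
    layers.Conv2D(128, kernel_size=4, padding="same", activation="relu"),  # stride 1
    layers.UpSampling2D(),          # 32x32 -> 64x64
    layers.Conv2D(64, kernel_size=4, padding="same", activation="relu"),   # stride 1
    layers.Conv2D(1, kernel_size=4, padding="same", activation="tanh"),    # 64x64x1 image
])
```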

Note: Both of these methods, UpSampling2D + Conv2D and Conv2DTranspose, are acceptable ways to transform back to the original image domain. However, it has been shown that the Conv2DTranspose method can lead to artifacts, small checkerboard patterns, that degrade the quality of the output image.
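
For comparison, the Conv2DTranspose route replaces each UpSampling2D + Conv2D pair with a single layer (same illustrative settings):

```python
from tensorflow.keras import layers

# One Conv2DTranspose with stride 2 upsamples and convolves in a single step,
# but can introduce the checkerboard artifacts mentioned above.
upsample_block = layers.Conv2DTranspose(
    128, kernel_size=4, strides=2, padding="same", activation="relu"
)
```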

/static/images/gan/gan_noise_with_conv2DTranspose.png

Training Process

At the start of the process, the generator outputs noisy images and the discriminator predicts randomly. The key to GANs lies in how we alternate the training of the two networks, so that as the generator becomes more adept at fooling the discriminator, the discriminator must adapt in order to maintain its ability to correctly identify which observations are fake. This drives the generator to find new ways to fool the discriminator, and so the cycle continues.

The architectures of the generator and the discriminator themselves are quite simple; the key to understanding GANs lies in the training process, a cycle in which the two networks take turns getting better: the generator at fooling the discriminator, and the discriminator at catching the fakes.

We can train the discriminator by creating a training set in which some of the images are randomly selected real observations from the dataset and some are outputs from the generator. The response would be 1 for the real images and 0 for the generated images.

Training the generator is more difficult, as there is no training set that tells us the true image a particular point in the latent space should be mapped to. Instead, we connect the generator's output to the frozen discriminator; the goal is to produce images so close to real ones that the discriminator outputs probabilities close to 1.

/static/images/gan/gan_training.png

Note 1: While training the generator we need to freeze the discriminator, because otherwise the discriminator could simply realign its weights to output 1 for every generated image, rather than the generator genuinely improving.
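
Putting the pieces together, a minimal training-loop sketch (assuming Keras and the generator/discriminator sketches above; x_train, the batch size, and the step count are illustrative). In Keras, trainability is captured at compile time, so compiling the discriminator first and the combined model after freezing gives exactly the behaviour described in Note 1:

```python
import numpy as np
from tensorflow.keras import models, optimizers

# The discriminator was compiled above while trainable, so its own
# train_on_batch calls still update its weights. Freezing it here and then
# compiling the combined model keeps it fixed while the generator trains.
discriminator.trainable = False
gan = models.Sequential([generator, discriminator])
gan.compile(optimizer=optimizers.Adam(0.0002), loss="binary_crossentropy")

batch_size = 64
for step in range(10_000):  # step count is illustrative
    # 1) Train the discriminator: real images labeled 1, generated images 0.
    real = x_train[np.random.randint(0, len(x_train), batch_size)]
    z = np.random.standard_normal(size=(batch_size, latent_dim))
    fake = generator.predict(z, verbose=0)
    discriminator.train_on_batch(real, np.ones((batch_size, 1)))
    discriminator.train_on_batch(fake, np.zeros((batch_size, 1)))

    # 2) Train the generator through the frozen discriminator: labeling the
    #    generated batch as 1 pushes the generator toward images the
    #    discriminator scores as real.
    z = np.random.standard_normal(size=(batch_size, latent_dim))
    gan.train_on_batch(z, np.ones((batch_size, 1)))
```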

Note 2: We also need to make sure the generator is not simply reproducing images from the training set. To test this, we can compute the L1 distance between each generated image and its closest image in the training set.
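
A rough sketch of that check: a brute-force nearest-neighbour search in pixel space (fine for small sets; it materialises an n x m distance matrix):

```python
import numpy as np

def min_l1_distance(generated, x_train):
    """Per generated image, the L1 distance to its closest training image.

    Very small distances suggest the generator is memorising training data.
    """
    flat_gen = generated.reshape(len(generated), -1)
    flat_train = x_train.reshape(len(x_train), -1)
    dists = np.abs(flat_gen[:, None, :] - flat_train[None, :, :]).sum(axis=-1)
    return dists.min(axis=1)
```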

After a suitable number of epochs, the discriminator and generator will have found an equilibrium that allows the generator to learn meaningful information from the discriminator and the quality of the images will start to improve.

/static/images/gan/gan_training_graph.png

Tags: #gan #conv2dtranspose #variational_autoencoder
