
Deep Convolutional GAN for Flowers

Ashish Khare

02 May 2025

4 min read

Banner image at ArtStation by DannyLaiLai

GANs, or Generative Adversarial Networks, were first introduced by Ian Goodfellow and his colleagues in 2014. They are trained as a game between two neural networks: the Discriminator, which tries to tell real images from fake ones, and the Generator, which tries to mimic real data and fool the Discriminator. With enough training, the Generator learns to produce lifelike results.

Later, in 2015, Alec Radford and his co-authors introduced DCGAN, which applies convolutional layers to the task, along with batch normalization and other improvements that significantly enhance the quality of the generated outputs. I recreated the same architecture, and after training for just 25 epochs I was able to achieve convincing results.

I know these results aren't crystal clear or crisp; however, they are solid proof that the network is learning and heading in the right direction.

Notebook: Flower GAN at Kaggle

Results after 25 epochs of training

Dataset

The GAN was trained on the Flower Classification: 14 Types of Flower Image Classification dataset, assembled by Marquis03.

The dataset contains 14 types of flower images: 13,618 training images and 98 validation images, with a total size of 202 MB. Per its description, it covers the following flower types: carnation, iris, bluebells, golden english, roses, fallen nephews, tulips, marigolds, dandelions, chrysanthemums, black-eyed daisies, water lilies, sunflowers, and daisies.
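The notebook's exact input pipeline isn't reproduced here, but a typical one for this setup resizes images to 64x64 and normalizes pixels to [-1, 1] so that real images match the range of the generator's Tanh output. A minimal sketch (the dataset path is an assumption):

import torchvision.transforms as T
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

# Resize, crop, and scale pixels from [0, 1] to [-1, 1]
transform = T.Compose([
    T.Resize(64),
    T.CenterCrop(64),
    T.ToTensor(),
    T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

dataset = ImageFolder("flowers/train", transform=transform)  # hypothetical path
dataloader = DataLoader(dataset, batch_size=64, shuffle=True)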

Constants Used

| Parameter     | Value  |
| ------------- | ------ |
| batch_size    | 64     |
| image_size    | 64     |
| latent_dim    | 100    |
| learning rate | 0.0002 |
| beta1         | 0.5    |
| epochs        | 25     |
| img_channels  | 3      |
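The same constants, as they might appear in code (the variable names are my guesses at the notebook's, chosen to match the optimizer snippet further down):

batch_size   = 64
image_size   = 64      # output images are 64x64
latent_dim   = 100     # size of the noise vector z
lr           = 0.0002  # learning rate for both networks
beta1        = 0.5     # Adam's first momentum coefficient
epochs       = 25
img_channels = 3       # RGB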

Generator

The generator is a sequential stack of five transposed convolutional layers, each followed by a batch normalization layer and a ReLU activation. However, as mentioned in the paper, the final layer skips the normalization and uses Tanh as its activation function.

import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, latent_dim, ngf, nc):
        super(Generator, self).__init__()
        self.model = nn.Sequential(
            # (N, latent_dim, 1, 1) -> (N, ngf*8, 4, 4)
            self._block(latent_dim, ngf * 8, 4, 1, 0),
            # (N, ngf*8, 4, 4) -> (N, ngf*4, 8, 8)
            self._block(ngf * 8,    ngf * 4, 4, 2, 1),
            # (N, ngf*4, 8, 8) -> (N, ngf*2, 16, 16)
            self._block(ngf * 4,    ngf * 2, 4, 2, 1),
            # (N, ngf*2, 16, 16) -> (N, ngf, 32, 32)
            self._block(ngf * 2,    ngf,     4, 2, 1),

            # Output layer: (N, ngf, 32, 32) -> (N, nc, 64, 64), no batch norm
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1),
            nn.Tanh(),
        )

    def _block(self, inc, out, k_size, stride, pad):
        # Transposed convolution -> batch norm -> ReLU
        return nn.Sequential(
            nn.ConvTranspose2d(inc, out, k_size, stride, pad, bias=False),
            nn.BatchNorm2d(out),
            nn.ReLU(True),
        )

    def forward(self, x):
        return self.model(x)

A transposed convolution upscales its input: it increases the spatial dimensions (height and width) of the feature map, while the number of output channels is chosen per layer. In this generator, each block doubles the height and width and halves the channel count, turning the latent vector into a 3-channel 64x64 image.
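A quick sanity check of those shapes (ngf = 64 is an assumption here; the constants table doesn't list it, but it's the usual choice):

import torch

gen = Generator(latent_dim=100, ngf=64, nc=3)
z = torch.randn(16, 100, 1, 1)  # a batch of 16 latent vectors
print(gen(z).shape)             # torch.Size([16, 3, 64, 64])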

Architecture guidelines for stable Deep Convolutional GANs, as listed in the paper:

  1. Replace any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator).
  2. Use batch normalization in both the generator and the discriminator.
  3. Remove fully connected hidden layers for deeper architectures.
  4. Use ReLU activation in the generator for all layers except the output, which uses Tanh.
  5. Use LeakyReLU activation in the discriminator for all layers.

Discriminator

The discriminator likewise consists of five convolutional layers, each followed by batch normalization and a LeakyReLU activation function. However, the first layer does not include batch normalization, and the final layer uses the Sigmoid function as its activation.

class Discriminator(nn.Module):
    def __init__(self, ndf, nc):
        super(Discriminator, self).__init__()
        self.model = nn.Sequential(
            # (N, nc, 64, 64) -> (N, ndf, 32, 32); no batch norm on the first layer
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),

            self._block(ndf,     ndf * 2),  # -> (N, ndf*2, 16, 16)
            self._block(ndf * 2, ndf * 4),  # -> (N, ndf*4, 8, 8)
            self._block(ndf * 4, ndf * 8),  # -> (N, ndf*8, 4, 4)

            # Output layer: (N, ndf*8, 4, 4) -> (N, 1, 1, 1)
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),
        )

    def _block(self, inc, out):
        # Strided convolution -> batch norm -> LeakyReLU
        return nn.Sequential(
            nn.Conv2d(inc, out, 4, 2, 1, bias=False),
            nn.BatchNorm2d(out),
            nn.LeakyReLU(0.2, inplace=True),
        )

    def forward(self, x):
        # Flatten to one probability per image in the batch
        return self.model(x).view(-1)
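And the matching check for the discriminator (again assuming ndf = 64):

disc = Discriminator(ndf=64, nc=3)
imgs = torch.randn(16, 3, 64, 64)
print(disc(imgs).shape)  # torch.Size([16]) -- one real/fake probability per image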

Training and Losses

The Adam optimizer was used to update the weights of each network, as described in the DCGAN paper. Additionally, a simple Binary Cross Entropy Loss was used as the loss function.

criterion = nn.BCELoss()

optim_g = optim.Adam(generator.parameters(),     lr=lr, betas=(beta1, 0.999))
optim_d = optim.Adam(discriminator.parameters(), lr=lr, betas=(beta1, 0.999))
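For completeness, here is a minimal sketch of one adversarial training step in the usual DCGAN fashion; the dataloader and device names are assumptions, and the notebook's actual loop may differ in details:

import torch

for epoch in range(epochs):
    for real, _ in dataloader:
        real = real.to(device)
        b = real.size(0)
        ones  = torch.ones(b, device=device)   # labels for real images
        zeros = torch.zeros(b, device=device)  # labels for fake images

        # Train the discriminator: maximize log D(x) + log(1 - D(G(z)))
        optim_d.zero_grad()
        noise = torch.randn(b, latent_dim, 1, 1, device=device)
        fake = generator(noise)
        loss_d = criterion(discriminator(real), ones) + \
                 criterion(discriminator(fake.detach()), zeros)
        loss_d.backward()
        optim_d.step()

        # Train the generator: maximize log D(G(z))
        optim_g.zero_grad()
        loss_g = criterion(discriminator(fake), ones)
        loss_g.backward()
        optim_g.step()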

After training both networks in an adversarial manner for 25 epochs, the generator began to learn and showed positive signs of progress.

Discriminator vs Generator Loss

Training Setup Summary:

My own choices:

  1. I used mini-batches of 64.
  2. I used the Adam optimizer for both networks.

Papers followed:

  1. Generative Adversarial Networks (Goodfellow et al., 2014)
  2. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (Radford et al., 2015)