In this project, we constructed, trained, and tuned a DCGAN for image generation on two datasets, a Monet painting dataset and the CIFAR-10 dataset, and we used the Inception Score to evaluate the model's performance. Based on our experimental results, we conclude that the model performs better on the Monet dataset than on CIFAR-10. With the same number of images and the same number of epochs, the model trained on CIFAR-10 performed worse than the one trained on the Monet dataset, and its training degraded (collapsed) at a certain point.
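The evaluation code is not shown here, so the snippet below is only a minimal sketch of how the Inception Score could be computed for a batch of generated images, assuming PyTorch and the torchmetrics implementation of the metric; the tensor shapes and the `splits` value are illustrative, and the project may have used a different implementation.

```python
import torch
from torchmetrics.image.inception import InceptionScore

# Hypothetical batch of generated images: uint8, shape (N, 3, H, W).
# In practice these would come from the trained DCGAN generator.
fake_images = torch.randint(0, 256, (100, 3, 64, 64), dtype=torch.uint8)

# InceptionScore resizes the inputs, feeds them through a pretrained
# Inception-v3 network, and averages the KL divergence between the
# conditional and marginal label distributions over `splits` splits.
metric = InceptionScore(splits=10)
metric.update(fake_images)
score_mean, score_std = metric.compute()
print(f"Inception Score: {score_mean.item():.3f} ± {score_std.item():.3f}")
```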
Several factors could explain this difference in performance. First, during preprocessing the Monet images were downsampled from 256x256 to 64x64, while the CIFAR-10 images were upsampled from 32x32 to 64x64. Upsampling inevitably made the CIFAR-10 images blurrier and less detailed than both the originals and the downsampled Monet images, which could introduce unnecessary noise during training. Another possible explanation is that the "deer" class from CIFAR-10 is more complex than the Monet paintings: the distinctive features of the Monet dataset lie mainly in its color arrangements, whereas the deer images cover more diverse angles and shapes, likely making the underlying pattern harder for the model to learn.
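For reference, a minimal sketch of this preprocessing step is shown below, assuming a PyTorch/torchvision pipeline; the directory name `monet_jpg`, the batch size, and the normalization to [-1, 1] are illustrative assumptions rather than the project's exact settings.

```python
from torch.utils.data import DataLoader, Subset
from torchvision import transforms as T
from torchvision.datasets import CIFAR10, ImageFolder

IMG_SIZE = 64  # both datasets are brought to the same 64x64 resolution

# Shared transform: Monet images (256x256) are downsampled, CIFAR-10
# images (32x32) are upsampled, then pixels are scaled to [-1, 1]
# to suit a tanh-output DCGAN generator (an assumption).
transform = T.Compose([
    T.Resize(IMG_SIZE),
    T.CenterCrop(IMG_SIZE),
    T.ToTensor(),
    T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

# Monet paintings read from a local image directory (hypothetical path;
# ImageFolder expects one class subfolder, e.g. monet_jpg/monet/).
monet = ImageFolder(root="monet_jpg", transform=transform)

# CIFAR-10 restricted to the "deer" class (label 4).
cifar = CIFAR10(root="data", train=True, download=True, transform=transform)
deer_indices = [i for i, label in enumerate(cifar.targets) if label == 4]
deer = Subset(cifar, deer_indices)

monet_loader = DataLoader(monet, batch_size=128, shuffle=True)
deer_loader = DataLoader(deer, batch_size=128, shuffle=True)
```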
For more details, please check out the project report.
GitHub