Revision as of 10:17, 2 November 2020

This page gives you an overview of the terminology and abbreviations you might see in relation to ESRGAN.

General AI Terms

GAN: Generative Adversarial Network
A type of AI network built from two parts: a generator and a discriminator. During training, the generator tries to produce realistic fakes of the training data, while the discriminator tries to tell which images are real and which are fake. Each part forces the other to improve as training progresses.

Dataset: The collection of data you train your AI model on. For super resolution, it consists of pairs of low-resolution and high-resolution images.

Iterations: The number of times your AI model has updated itself with new data. Multiply by your batch size to get the total number of images that have been processed.
For example: If your model is at 10,000 iterations with batch size 4, it has seen 40,000 training images.
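The arithmetic above can be sketched in a few lines of Python. The function name `images_seen` is a hypothetical helper chosen for illustration and is not part of any ESRGAN tool.

```python
def images_seen(iterations: int, batch_size: int) -> int:
    """Total number of training images processed so far."""
    return iterations * batch_size


# The example from the text: 10,000 iterations at batch size 4.
print(images_seen(10_000, 4))  # 40000
```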

Epochs: How many times your entire dataset has been processed. Important: One epoch does not mean your model has finished training! It can still improve, even after seeing the same data many times.
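As a rough sketch, the number of epochs can be estimated from the iteration count, the batch size, and the dataset size. This assumes every iteration draws a full batch from the dataset. The function name `epochs_completed` and the dataset size of 800 image pairs are hypothetical values picked for illustration.

```python
def epochs_completed(iterations: int, batch_size: int, dataset_size: int) -> float:
    """Approximate number of passes over the whole dataset."""
    if dataset_size <= 0:
        raise ValueError("dataset_size must be positive")
    return iterations * batch_size / dataset_size


# 10,000 iterations at batch size 4 on a hypothetical 800-pair dataset.
print(epochs_completed(10_000, 4, 800))  # 50.0
```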

GPU: Graphics Processing Unit. Often used synonymously with "graphics card", though technically it only refers to the actual processor chip on your graphics card.

VRAM: Video RAM, the amount of memory your GPU has. Most AI-related applications need as much VRAM as they can get.

CUDA: Nvidia's software stack that allows all kinds of software to run computations on Nvidia GPUs.

Python: The language most AI applications are written in. The runtime needs to be installed before you can run any Python scripts.

ESRGAN: Enhanced Super Resolution Generative Adversarial Network (or just Enhanced SRGAN, as SRGAN already existed before it).

Tile Size: Most ESRGAN implementations split images into tiles to avoid running out of VRAM. The tile size defines how large these tiles are.
Larger tiles are not automatically better, but they can sometimes avoid seams and slightly speed things up. However, smaller tiles usually work just as well.
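The idea of tiling can be illustrated with a minimal sketch that only computes the tile boundaries. It is not taken from any specific ESRGAN implementation: the function name `tile_boxes` is hypothetical, and the tiles here do not overlap, whereas a real implementation may add overlap or padding to hide seams.

```python
def tile_boxes(width: int, height: int, tile_size: int):
    """Return (left, top, right, bottom) boxes that together cover the image."""
    boxes = []
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            # Tiles on the right and bottom edges are cropped to the image.
            right = min(left + tile_size, width)
            bottom = min(top + tile_size, height)
            boxes.append((left, top, right, bottom))
    return boxes


# A 1000x600 image with tile size 512 is split into a 2x2 grid.
boxes = tile_boxes(1000, 600, 512)
print(len(boxes))  # 4
print(boxes[-1])   # (512, 512, 1000, 600)
```

Each tile would then be upscaled on its own and pasted back at the scaled position, so only one tile has to fit into VRAM at a time.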

ESRGAN Training (BasicSR) Specific Terms

LR: Low Resolution - The part of your training data that resembles the type of images you want to use your model on.

HR: High Resolution - The part of your training data that resembles what you want your model to output.

Augmentation: The process of making your LR images intentionally "worse" so that the AI learns to improve them.
Examples: JPEG compression, dithering, blur, noise
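As a minimal sketch of one such degradation, the snippet below adds random noise to a row of 8-bit grayscale pixel values using only the Python standard library. The function name `add_noise`, the `strength` parameter, and the sample pixel row are hypothetical. Real training pipelines operate on whole images and typically use an image library instead.

```python
import random


def add_noise(pixels, strength, seed=None):
    """Shift each pixel by a random amount in [-strength, strength]."""
    rng = random.Random(seed)
    noisy = []
    for value in pixels:
        shifted = value + rng.randint(-strength, strength)
        # Clamp to the valid 8-bit range.
        noisy.append(max(0, min(255, shifted)))
    return noisy


# Hypothetical row of LR pixels; the matching HR data stays untouched.
lr_row = [0, 64, 128, 192, 255]
lr_noisy = add_noise(lr_row, strength=10, seed=0)
print(lr_noisy)
```

Only the LR side is degraded, so the model learns to map noisy input to the clean HR target.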

Batch Size: The number of images processed per training iteration. A higher number means slower training and higher memory usage, but usually better results.
