Model Database

From Upscale Wiki
Revision as of 21:17, 14 August 2019 by BoxDroppingManApe (talk | contribs) (Think I got most of the epochs that were originally listed before I removed them. Going to do 1 more pass.)


ESRGAN ("old Architecture") Models

Models that use the "old" ESRGAN architecture. They can be used with either the official ESRGAN repo (old arch tag) or BasicSR (old arch, or preferably victorc's fork). You can also use tools that wrap around one of them, like IEU from Honh.

Image scaling and Video upscaling

In computer graphics and digital imaging, image scaling refers to the resizing of a digital image. In video technology, the magnification of digital material is known as upscaling or resolution enhancement.

Drawings

Drawing is a form of visual art in which a person uses various drawing instruments to mark paper or another two-dimensional medium. Instruments include graphite pencils, pen and ink, various kinds of paints, inked brushes, colored pencils, crayons, charcoal, chalk, pastels, various kinds of erasers, markers, styluses, and various metals (such as silverpoint). Digital drawing is the act of using a computer to draw. Common methods of digital drawing include a stylus or finger on a touchscreen device, stylus- or finger-to-touchpad, or in some cases, a mouse. There are many digital art programs and devices.

Manga/Anime

| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Manga109Attempt | Kingdomakrillic | 4 | | Anime / Manga | ? | 4 | ? | ? | 0.1K | Manga109 | RRDB_PSNR_x4 |
| Falcon Fanart | LyonHrt | 4 | | Anime / Manga | 125K | 8 | 128 | ? | 3.393K | Falcon Fanart | RRDB_PSNR_x4 |
| Unholy02 | DinJerr | 4 | | Anime / Manga | ? | ? | ? | ? | ? | CG-Painted Anime | Several, see notes |
| Unholy03 | DinJerr | 4 | | Anime / Manga | ? | ? | ? | ? | ? | CG-Painted Anime | Several, see notes |
| WaifuGAN v3 | DinJerr | 4 | | Anime / Manga | 30K | 2 | 128 | ? | 0.173K | CG-Painted Anime | Manga109v2 |
| Lady0101 | DinJerr | 4 | | Anime / Manga | 208K | ? | ? | ? | ~7K | CG-Painted Anime | WaifuGAN v3 |
| De-Toon | LyonHrt | 4 | | Toon Shading / Sprite | 225K | 8 | 128 | 525 | 7.117K | Custom Cartoon-style photos | RRDB_PSNR_x4 |

Manga109Attempt is slightly blurry, but performs well as a general upscaler.

Falcon Fanart tries to improve upon it with the goal of removing checkerboard patterns and dithering. It has oil-colour-based shading with sharp lines.

Unholy02 and Unholy03 were created by interpolating a large number of models about 30 times, mainly with DinJerr’s own WaifuGAN model and RRDB_ESRGAN. They are intended for upscaling CG-painted anime images with light outlines and produce sharper, cleaner, and more aggressive results than Manga109, but may produce unnecessary outlines or details when faced with noise, so be wary of JPEGs.

WaifuGAN v3 is DinJerr’s third attempt at training on a mostly anime dataset sourced from image boards, and is intended for upscaling CG-painted anime with variable outlines. Only PNGs were used, mainly with brush strokes and gradients; texturised images were avoided as much as possible. If the result is too generative, tone it down by interpolating with a softer model.

Lady0101 was trained (mostly) on digital paintings of ladies. It offers strong anti-staircasing, mediocre undithering and slight blending. It is meant for upscaling pixel art and paintings and transforming them into a digital painting style.

De-Toon is a model that does the opposite of tooning an image: it takes toon-style shading and detail and attempts to make it realistic. It is very sensitive and can be used on anything from small sprites to large images. Also included is an alt version, which is less sharp.
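Several of the notes above mention interpolating one model with another. A minimal sketch of what that usually means is a weighted average of the two models' weights, key by key. The function name, the `alpha` parameter and the toy weight dictionaries below are illustrative assumptions; real models would hold tensors loaded from their .pth files rather than plain floats.

```python
from typing import Dict, TypeVar

T = TypeVar("T")  # a float here; in practice a NumPy array or PyTorch tensor


def interpolate_models(model_a: Dict[str, T], model_b: Dict[str, T], alpha: float) -> Dict[str, T]:
    """Blend two weight dictionaries as (1 - alpha) * A + alpha * B, key by key."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if model_a.keys() != model_b.keys():
        raise ValueError("both models must use the same architecture (same keys)")
    return {
        key: (1.0 - alpha) * model_a[key] + alpha * model_b[key]
        for key in model_a
    }


# Toy "models" with scalar weights standing in for tensors (hypothetical values).
sharp = {"conv.weight": 1.0, "conv.bias": 0.5}
soft = {"conv.weight": 0.0, "conv.bias": -0.5}

# alpha=0.25 keeps 75% of the sharp model and mixes in 25% of the soft one.
blended = interpolate_models(sharp, soft, alpha=0.25)
```

Both models need the same architecture and scale for this to work, which is why interpolation partners are usually picked from the same section of this page.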

Cartoon / Comic

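| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Comic Book | LyonHrt | 4 | | Comic / Drawings | 115K | 8 | 128 | 592 | 1.548K | Custom (Spider-Man) | none |
| DigitalFrames | Klexos | 4 | | Digital Cartoon | 1.06M | 15 | 128 | 96.275 | 0.25K - 2.5K | Digital Cartoon Images | RRDB_PSNR_x4 |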

The Comic Book model was trained using stills from the film Spider-Man: Into the Spider-Verse and gives images a comic book crosshatch shading effect. Sample

Pixel Art

Pixel art is a form of digital art, created through the use of software, where images are edited on the pixel level. The aesthetic for this kind of graphics comes from 8-bit and 16-bit computers and video game consoles, in addition to other limited systems such as graphing calculators. In most pixel art, the color palette used is extremely limited in size, with some pixel art using only two colors.

Creating or modifying pixel art characters or objects for video games is sometimes called spriting, a term that arose from the hobbyist community. The term likely came from the term sprite, a term used in computer graphics to describe a two-dimensional bitmap that is used in tandem with other bitmaps to construct a larger scene.


| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Xbrz | LyonHrt | 4 | | Xbrz style pixel art upscaler | 90K | 8 | 128 | 368 | 1.897K | custom xbrz up-scaled | RRDB_PSNR_x4 |
| Xbrz+DD | LyonHrt | 4 | | Xbrz style pixel art upscaler with de-dithering | 90K | 8 | 128 | 470 | 1.523K | custom de-dithered xbrz | xbrz |
| ScaleNX | LyonHrt | 4 | | Scalenx style pixel art upscaler | 80K | 8 | 128 | 599 | 1.070K | custom scalenx up-scaled from retroarch shader | RRDB_PSNR_x4 |
| Fatality | Twittman | 4 | | (dithered) sprites | 265K | 10 | 128 | 160 | 19.7K | ? | Face |
| Rebout | LyonHrt | 4 | | Character Sprites | 325K | 8 | 128 | 106 | 23.808K | Custom prepared sprites from kof 94 rebout | Detoon |

Fatality is meant for upscaling medium-resolution sprites, dithered or undithered. It can also upscale manga/anime and Game Boy Camera images.

Rebout is trained to give detail to character models, with improved faces and hands. It is based on the SNK game KOF 94 Rebout; although it works best for SNK-style games, it also works on a variety of other sprites. Also included is an interpolated version that may provide a cleaner upscale for certain sprites.

Photographs and Photorealism

A photograph (also known as a photo) is an image created by light falling on a photosensitive surface, usually photographic film or an electronic image sensor, such as a CCD or a CMOS chip. Most photographs are created using a camera, which uses a lens to focus the scene's visible wavelengths of light into a reproduction of what the human eye would see. The process and practice of creating such images is called photography. The word photograph was coined in 1839 by Sir John Herschel and is based on the Greek φῶς (phos), meaning "light," and γραφή (graphê), meaning "drawing, writing," together meaning "drawing with light." Photorealism is a genre of art that encompasses painting, drawing and other graphic media, in which an artist studies a photograph and then attempts to reproduce the image as realistically as possible in another medium. Although the term can be used broadly to describe artworks in many different media, it is also used to refer specifically to a group of paintings and painters of the American art movement that began in the late 1960s and early 1970s.

Misc / Kitchen Sink

All kinds of photographs or photorealistic images. These models aren't specialized.


| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Box | buildist | 4 | GNU GPLv3 | Realistic | 390K | 8 | 192 | 268 | 11.577K | Flickr2K+Div2K+OST | PSNR model from same data |
| Ground | ZaphodBreeblebox | 4 | | Ground Textures | 305K | ? | 128 | ? | ? | Custom (Ground textures Google) | ? |
| Misc | Alsa | 4 | GNU GPLv3 | Surface Textures | 220K | 32 | 128 | 338 | 20.797K | Custom (Photos) | Manga109Attempt |

Box was meant to be an improvement on the RRDB_ESRGAN_x4 model (comparison). It’s also trained on photos, but with a much larger dataset which was downscaled with linear interpolation (box filter) instead of bicubic.
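To illustrate the downscaling step, here is a minimal sketch of a box-filter downscale, assuming "box filter" means plain averaging of each block of pixels. The function name and the tiny grayscale test image are illustrative only and are not taken from the Box model's actual training pipeline.

```python
from typing import List

Gray = List[List[float]]  # rows of grayscale pixel values


def box_downscale(image: Gray, factor: int) -> Gray:
    """Downscale by averaging every factor x factor block of pixels."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by factor")
    out: Gray = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            block = [
                image[y + dy][x + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# A 4x4 "high resolution" image reduced to a 2x2 "low resolution" one.
hr = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [10, 20, 30, 40],
    [10, 20, 30, 40],
]
lr = box_downscale(hr, 2)
```

Every source pixel contributes equally to exactly one output pixel, whereas bicubic downscaling weights a wider neighbourhood and can introduce slight ringing around edges.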

The Ground model was trained on various pictures of stones, dirt and grass using Google’s image search.

The Misc model is trained on various pictures shot by the author, including bricks, stone, dirt, grass, plants, wood, bark, metal and a few others.

Characters and Faces

For images of humans, creatures, faces, ...


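| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Trixie | LyonHrt | 4 | | Star Wars | 275K | 8 | 192 | 87 | 19.814K | ? | None |
| Face Focus | LyonHrt | 4 | | Face De-blur | 275K | 8 | 192 | 455 | 4.157K | Custom (Faces) | RRDB_PSNR_x4 |
| Face | Twittman | 4 | | Face Upscaling | 250K | 10 | 128 | 967 | 3.765K | Custom (Faces) | 4xESRGAN |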

Trixie was made to bring balance to the Force… and also to upscale character textures for Star Wars games, including the heroes, rebels, Sith and Imperials, plus a few main aliens. Why is it called Trixie? Because "Jar Jar's Big Adventure" would be too long a name. It also provides good upscales of face textures in general, as well as of basic Star Wars textures.

The Face Focus model was designed for slightly out-of-focus or blurred images of faces. It is aimed at faces and hair, but it can help to improve other out-of-focus images too; as always, just try it.

Specialized

For purposes that didn't fit anywhere else (for now).


| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Map | LyonHrt | 4 | | Map / Old Paper with text | 120K | 8 | 192 | 361 | 2.311K | Custom(Scans) | none |
| Forest | LyonHrt | 4 | | Wood / Leaves | 160K | 8 | 192 | 590 | 2.2K | Custom(?) | none |
| Skyrim Armory | Alsa | 4 | GNU GPLv3 | Armor, Clothes and Weapons | 80K | 26 | 128 | 2.6K | 0.8K | Skyrim Mod textures | Manga109Attempt |
| Skyrim Wood | Laeris | 4 | | Wood | 75K | ? | ? | ? | ? | ? | ? |
| Skyrim Misc | Deorder | 4 | | Skyrim Diffuse Textures | 105K | ? | 128 | ? | | Skyrim Diffuse Textures | ? |
| Fallout 4 Weapons | Bob | 4 | | Fallout Weapon Diffuse Textures | 120K | 13 | 128 | 2.973K | 532(OTF) | Fallout 4 HDDLC Weapon Diffuse Textures | Manga109Attempt |

The Map model was trained on maps, old documents, papers and various styles of typefaces/fonts. It is based on a dataset contributed by Alsa. Sample

The Forest model is focused on trees, leaves, bark and stone. It can be used for double upscaling for even more detail. Sample

The Armory model was trained with modded textures from Skyrim, including clothing, armor and weapons. Leather, canvas and metal should all work; the results may be too sharp, in which case interpolate.

The wood model was trained for Skyrim by Laeris.

The Skyrim Diffuse model is supposed to be used with Skyrim’s diffuse textures. It is a bit too sharp, so I recommend interpolating it with the RRDB_ESRGAN_x4 model or the Manga109Attempt model; look in Deorder’s Skyrim Model Google Drive for an already interpolated version.

Fallout 4 Weapons was trained using Fallout 4’s official HD armor/weapon textures, but could be used on other weapon and armor textures.
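The Epoch column in the tables on this page appears to be consistent with the usual definition, iterations × batch size ÷ dataset size. This is an inference from the listed numbers, not something the model authors state. A minimal sketch, using the Skyrim Armory row (80K iterations, batch size 26, 0.8K images) as the worked example:

```python
def estimate_epochs(iterations: int, batch_size: int, dataset_size: int) -> float:
    """Approximate number of passes over the dataset during training."""
    if dataset_size <= 0:
        raise ValueError("dataset_size must be positive")
    # Each iteration consumes one batch, so total images seen = iterations * batch_size.
    return iterations * batch_size / dataset_size


# Skyrim Armory: 80K iterations, batch size 26, 0.8K training images.
armory_epochs = estimate_epochs(80_000, 26, 800)
```

This comes out at 2600 epochs, matching the 2.6K listed for that model. Other rows only match approximately, so treat the formula as a rough cross-check for rows where one of the values is listed as "?".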

Texture Maps

Normal Maps

In 3D computer graphics, normal mapping, or Dot3 bump mapping, is a technique used for faking the lighting of bumps and dents – an implementation of bump mapping. It is used to add details without using more polygons. A common use of this technique is to greatly enhance the appearance and details of a low polygon model by generating a normal map from a high polygon model or height map.

Normal maps are commonly stored as regular RGB images where the RGB components correspond to the X, Y, and Z coordinates, respectively, of the surface normal.

The models here have been specifically trained on normal maps, but beware: this approach is considered deprecated by many of us. Instead of using any of these models, you can just split the normal map you want to upscale into its R, G and B channels and use any other model on them.
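The channel-splitting approach can be sketched as follows. The image representation (rows of RGB tuples) and the `nearest_neighbour_2x` function are hypothetical stand-ins chosen to keep the example self-contained; in practice each channel would be saved as a grayscale image and run through one of the models from this page.

```python
from typing import Callable, List, Tuple

Gray = List[List[int]]                  # one channel: rows of 0-255 values
RGB = List[List[Tuple[int, int, int]]]  # rows of (r, g, b) pixels


def nearest_neighbour_2x(channel: Gray) -> Gray:
    """Hypothetical stand-in for a real upscaling model: doubles width and height."""
    out: Gray = []
    for row in channel:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def upscale_per_channel(image: RGB, upscale: Callable[[Gray], Gray]) -> RGB:
    """Split into R, G and B, upscale each channel on its own, then merge again."""
    channels = [[[pixel[c] for pixel in row] for row in image] for c in range(3)]
    red, green, blue = (upscale(channel) for channel in channels)
    return [
        list(zip(r_row, g_row, b_row))
        for r_row, g_row, b_row in zip(red, green, blue)
    ]


# A 1x2 normal map upscaled to 2x4.
normal_map = [[(128, 128, 255), (140, 120, 250)]]
result = upscale_per_channel(normal_map, nearest_neighbour_2x)
```

Because each channel is processed independently, the merged vectors are no longer guaranteed to be unit length, so re-normalising the result afterwards may be worthwhile.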


| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Normal Maps | Alsa | 4 | GNU GPLv3 | Normal Maps | 36K | 27 | 128 | ? | ? | Custom (Normal Maps) | Normal Maps - Skyrim artifacted |
| Normal Maps - Skyrim artifacted | Deorder | 4 | | Skyrim Normal Maps | 145K | ? | 128 | ? | ? | Skyrim Normal Maps | ? |

The first one is based on the second one and was trained with a higher learning rate and insane n_workers and batch_size values. It is meant to replace the old Normal Map model from Deorder, but without adding BC1 compression to your normal maps.

The second one was trained on Skyrim’s Normal Maps, including compression artifacts, so it will have to be redone.

Grayscale

In digital photography, computer-generated imagery, and colorimetry, a grayscale or greyscale image is one in which the value of each pixel is a single sample representing only an amount of light, that is, it carries only intensity information. Grayscale images, a kind of black-and-white or gray monochrome, are composed exclusively of shades of gray. The contrast ranges from black at the weakest intensity to white at the strongest.[1]


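| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Skyrim Alpha | Deorder | 4 | | Alpha Channel | 105K | ? | 128 | ? | ? | Alpha Channels from Skyrim | ? |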

Trained to upscale grayscale images, such as specular or alpha maps.

Artifact Removal

The models in this section were made to remove compression artifacts in images and textures.

JPG Compression

| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| JPG (0-20%) A | Alsa | 1 | GNU GPLv3 | JPG compressed Images | 178K | 2 | 128 | ? | 6.23K | Custom (Photos / Manga) | JPG (20-40%) |
| JPG (0-20%) B | BlueAmulet | 1 | | JPG compressed Images | ? | 1 | 128 | ? | 52.789K | | JPG (20-40%) |
| JPG (20-40%) A | Alsa | 1 | GNU GPLv3 | JPG compressed Images | 141K | 2 | 128 | ? | 6.23K | Custom (Photos / Manga) | JPG (40-60%) |
| JPG (20-40%) B | BlueAmulet | 1 | | JPG compressed Images | ? | 1 | 128 | ? | 52.789K | | JPG (40-60%) |
| JPG (40-60%) A | Alsa | 1 | GNU GPLv3 | JPG compressed Images | 100K | 2 | 128 | ? | ~6.5K | Custom (Photos / Manga) | JPG (60-80%) |
| JPG (40-60%) B | BlueAmulet | 1 | | JPG compressed Images | ? | 1 | 128 | ? | 52.789K | | JPG (60-80%) |
| JPG (60-80%) A | Alsa | 1 | GNU GPLv3 | JPG compressed Images | 91K | 2 | 128 | ? | ~6.5K | Custom (Photos / Manga) | JPG (80-100%) |
| JPG (60-80%) B | BlueAmulet | 1 | | JPG compressed Images | ? | 1 | 128 | ? | 52.789K | ? | JPG (80-100%) |
| JPG (80-100%) A | Alsa | 1 | GNU GPLv3 | JPG compressed Images | 162K | 2 | 128 | ? | ~6.5K | Custom (Photos / Manga) | BC1 take 1 |
| JPG (80-100%) B | BlueAmulet | 1 | | JPG compressed Images | ? | 1 | 128 | ? | 52.789K | ? | BC1 take 1 |
| JPG PlusULTRA | Twittman | 1 | | JPG compressed Images | 130K | 1 | ? | 150 | 0.937K | Custom (Manga) | Failed Attempts |

JPEG images are compressed with a quality percentage between 0 and 100, so choose the model that matches how heavily your JPEGs are compressed. You can use ImageMagick to estimate the quality percentage, but keep in mind that the estimate might be wrong, since the image might have been re-saved.
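As a rough sketch of that workflow, the snippet below asks ImageMagick's `identify` tool for its quality estimate (the `%Q` format escape) and maps the result to one of the model ranges in the table. The function names are illustrative, and the handling of boundary values (for example, treating exactly 20 as part of the 20-40% range) is an assumption, since the page does not say which model covers the boundaries.

```python
import subprocess

# Quality ranges of the JPG models listed in the table above.
MODEL_RANGES = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]


def estimate_quality(path: str) -> int:
    """Ask ImageMagick for its JPEG quality estimate (needs `identify` on PATH)."""
    result = subprocess.run(
        ["identify", "-format", "%Q", path],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


def model_for_quality(quality: int) -> str:
    """Pick the JPG model whose range contains the given quality value."""
    if not 0 <= quality <= 100:
        raise ValueError("quality must be between 0 and 100")
    for low, high in MODEL_RANGES:
        # Assumption: boundaries go to the higher range, except 100 itself.
        if quality < high or high == 100:
            return f"JPG ({low}-{high}%)"
    raise AssertionError("unreachable: ranges cover 0-100")


# Example with a known quality value; use estimate_quality("image.jpg") on real files.
chosen = model_for_quality(35)
```

If the estimate looks implausible for a visibly damaged image, it is usually worth trying the model one range lower as well.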

DDS Files with BC1/DXT1, BC3/DXT5 Compression

| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| BC1 free 1.0 | Alsa | 1 | GNU GPLv3 | BC1 Compression | 400K | 2 | 128 | 26 | 28.985K | Custom (just about everything) | BC1 take 2 |
| BC1 restricted v1.0 | Alsa | 1 | GNU GPLv3 | BC1 Compression | 100K | 2 | 128 | 111 | 1.8K | Custom (Photos) | Failed Attempts |
| BC1 restricted v2.0 | Alsa | 1 | GNU GPLv3 | BC1 Compression | 261K | 2 | 128 | 106 | 4.7K | Custom (Photos / Manga) | JPG (0-20%) |

BC1 (DXT1) compression is commonly used in DDS textures, which are utilized in most PC games today. It shrinks a texture to 1/6 of its original size (relative to uncompressed 24-bit RGB), reducing VRAM usage. There is also BC3 (DXT5), which uses BC1 compression for the color channels and stores the alpha channel in a separate block at higher fidelity; it reduces the file size to 1/3 of a 24-bit RGB original, or 1/4 of a 32-bit RGBA one. But this compression comes at a cost.

The BC1 models are designed to reverse the damage done by the compression. This is a must if you want to upscale a DDS file that uses either BC1 or BC3 compression. The free variant has more freedom when dealing with the images and should lead to better results. The restricted version only deals with perfect images compressed once (the ideal case), so it will perform worse in any other scenario, but it tries to preserve the original colors more. You can interpolate between the restricted and free variants if you want.
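The size ratios can be checked with a little arithmetic. BC1 stores each 4x4 pixel block in 8 bytes and BC3 in 16 bytes; the sketch below compares that with uncompressed data. It ignores the DDS header and mipmaps, and the helper names are illustrative. Note that the ratio for BC3 depends on the baseline: 1/3 against 24-bit RGB, 1/4 against 32-bit RGBA.

```python
import math

# Bytes per 4x4 pixel block for each block-compression format.
BYTES_PER_BLOCK = {"BC1": 8, "BC3": 16}


def compressed_size(width: int, height: int, fmt: str) -> int:
    """Size in bytes of the top mip level, without the DDS header."""
    blocks = math.ceil(width / 4) * math.ceil(height / 4)
    return blocks * BYTES_PER_BLOCK[fmt]


def raw_size(width: int, height: int, bytes_per_pixel: int) -> int:
    """Size in bytes of the same image stored uncompressed."""
    return width * height * bytes_per_pixel


width = height = 1024
bc1 = compressed_size(width, height, "BC1")
bc3 = compressed_size(width, height, "BC3")

rgb = raw_size(width, height, 3)   # 24-bit RGB
rgba = raw_size(width, height, 4)  # 32-bit RGBA

print(f"BC1 vs RGB:  1/{rgb // bc1}")
print(f"BC3 vs RGB:  1/{rgb // bc3}")
print(f"BC3 vs RGBA: 1/{rgba // bc3}")
```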

Cinepak, msvideo1 and Roq

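| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Cinepak | Twittman | 1 | | Cinepak, msvideo1 and Roq | 200K | 1 | 128 | 21 | ~8K | Custom (Manga) | none |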

The Cinepak model removes compression artifacts left by older video compression methods like Cinepak, msvideo1 and Roq.

Dithering

| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DeDither | Alsa | 1 | GNU GPLv3 | Dithered Images | 127K | 2 | 128 | 53 | 4.7K | Custom (Photos / Manga) | JPG (0-20%) |
| dither_4x_flickr2k_esrgan, dither_4x_flickr2k_psnr | buildist | 4 | | Ordered dithering | 280K | 16 | 128 | ? | 2.64K, ~8K | Flickr2K, OST dithered with GIMP | none |

Dithering is an older compression method in which the number of colors is reduced. If your image has few colors or banding, try the DeDither model. Ordered dithering is a less common form of dithering that results in distinctive checkerboard/crosshatch patterns, which are misinterpreted as texture by models not trained on it. It’s often used on GIFs because the pattern is stable between frames. For the 4x model, start with the ESRGAN model, and interpolate with the PSNR model if the result is too sharp.
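To show where the checkerboard pattern of ordered dithering comes from, here is a minimal sketch that reduces a grayscale image to pure black and white using a 2x2 Bayer threshold matrix. This is a textbook illustration rather than the GIMP procedure used to build the training data; the matrix size and the 1-bit output are assumptions made to keep the example short.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values

# 2x2 Bayer matrix: the order in which pixels of a tile switch on as brightness rises.
BAYER_2X2 = [[0, 2], [3, 1]]


def ordered_dither(image: Gray) -> Gray:
    """Threshold each pixel against a tiled Bayer matrix (output is 0 or 255)."""
    out: Gray = []
    for y, row in enumerate(image):
        out_row = []
        for x, value in enumerate(row):
            # Map the matrix entry to a threshold in the 0-255 range.
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) / 4 * 255
            out_row.append(255 if value > threshold else 0)
        out.append(out_row)
    return out


# A flat mid-grey image turns into a regular checkerboard.
flat_grey = [[128] * 4 for _ in range(4)]
dithered = ordered_dither(flat_grey)
for row in dithered:
    print(row)
```

The thresholds depend only on pixel position, so the same input always produces the same pattern. This is what keeps the pattern stable between GIF frames, and it is also the regular structure that untrained models mistake for texture.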

Over-Sharpening

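| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DeSharpen | loinne | 1 | | Oversharpened Images | 310K | 1 | 128 | 48 | ~3K | Custom (?) | Failed Attempts |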

The DeSharpen model was made for rare, particular cases in which an image was ruined by sharpening noise, e.g. game textures or badly exported photos. If your image is not over-sharpened, the model won’t hurt it and leaves it as is. In theory, the model knows when to activate and when to skip, and it can also remove artifacts when only some parts of the image are over-sharpened, for example in an image composed of several combined images, one of which has sharpening noise. It is made to remove sharpening noise, particularly that produced by Photoshop’s “Sharpen” or “Sharpen More” filters or by ImageMagick’s -sharpen option, with radius and sigma parameters ranging from a subtle 0.3x0.5 to something as extreme as roughly 8x2.

Aliasing

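| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AntiAliasing | Twittman | 1 | | Images with pixelated edges | 200K | 1 | 128 | 440 | 0.656K | Custom (?) | none |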

AntiAliasing is for smoothing jagged edges in images and textures.

Image Generation

Texture Maps

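| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| normal generator | LyonHrt | 1 | | Diffuse to Normal | 215K | 1 | 128 | 45 | 4.536K | Custom (?) | none |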

The model was trained on pairs of diffuse textures and normal maps.

Pretrained models for different scales

| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1xESRGAN | victorca25 | 1 | | Pretrained model | | 1 | 128 | ? | 65k | Combination of DIV2K, Flickr2K and GOPRO | RRDB_ESRGAN_x4.pth |
| 2xESRGAN | victorca25 | 2 | | Pretrained model | | 4 | 128 | ? | 65k | Combination of DIV2K, Flickr2K and GOPRO | RRDB_ESRGAN_x4.pth |
| 4xESRGAN | victorca25 | 4 | | Pretrained model | | 8 | 128 | ? | 65k | Combination of DIV2K, Flickr2K and GOPRO | RRDB_ESRGAN_x4.pth |
| 8xESRGAN | victorca25 | 8 | | Pretrained model | | 16 | 128 | ? | 65k | Combination of DIV2K, Flickr2K and GOPRO | RRDB_ESRGAN_x4.pth |
| 16xESRGAN | victorca25 | 16 | | Pretrained model | | 16 | 128 | ? | 65k | Combination of DIV2K, Flickr2K and GOPRO | RRDB_ESRGAN_x4.pth |

These models were transformed from the original RRDB_ESRGAN_x4.pth model into the other scales, in order to be used as pretrained models for new models in those scales.

More information can be found here.

ESRGAN ("new Architecture") Models

Models that use the "new" ESRGAN architecture. It has no advantages over the old architecture, but breaks compatibility with old arch models and scales other than 1 (if you use the official ESRGAN repo). In the future, victorc plans to make his fork compatible with both, providing the option to convert between them. These models can be used with either the official ESRGAN repo or BasicSR (not victorc's fork).

If you want to train your own model, please use the "old" architecture instead. There really are no disadvantages to it.

Image scaling and Video upscaling

Drawings

Cartoon / Comic

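| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ad_test_tf | PRAGMA | 4 | | Cartoon / Netflix | 5K | 16 | 128 | ? | 30K | Custom (American Dad) | PSNRx4 |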

The ad_test_tf model was designed for upscaling American Dad NTSC DVD frames (originally at 480p) to match the quality and style of Netflix’s equivalent 1080p WEB-DL, which includes a slight desaturation of colors.

PPON Models

Upscaling

Pixel Art

| Name | Author | Scale | License | Purpose | Iterations (Phase 1; 2; 3) | Batch Size (Phase 1; 2; 3) | HR Size (Phase 1; 2; 3) | Dataset Size (Phase 1; 2; 3) | Dataset (Phase 1; 2; 3) | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|
| Pixie | victorca25 | 4 | | Pixel Art / some Cartoons | 80K(?; ?; ?) | 8; 8; 8 | 192; 192; 192 | ?; ?; ? | Custom(Drawings); Custom(Drawings); Custom(Drawings) | PPON |
| xBRZ+ | victorca25 | 4 | | Pixel Art | 60K(?; ?; ?) | 8; 8; 8 | 128; 128; 128 | ?; ?; ? | Custom (xBRZ images); Custom (xBRZ images); Custom (Drawings) | Pixie |

Pretrained models for different scales

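| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PPON | Zheng Hui (惠政) | 4 | ? | Pretrained model | ? | ? | ? | ? | ? | ? | ? |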

Other Sources

Cartoon Painted Models

Licenses Links

  • GNU GPLv3:
    • You can’t sell the model under that license
    • If you modify, interpolate or use the model as a pretrained model for your own model and share results of your resulting model, it will have to be under the same license, meaning that you can’t sell it.
    • You have to state that you used the model and its author for your results.
    • You have to state any changes you made to the model.
    • There are other points, but those are the main ones.

In addition to that all models by:

have the following additional restriction:

  • You can’t sell results generated with a model using that license.