We have moved to https://openmodeldb.info/. This new site has a tag and search system, which will make finding the right models for you much easier! If you have any questions, ask here: https://discord.gg/cpAUpDK
Difference between revisions of "Dataset Database"
(Fix ASOS Entry // Edit via Wikitext Extension for VSCode)
(Add WorldTex, Ground, and VHS // Edit via Wikitext Extension for VSCode)
Revision as of 18:21, 31 January 2022
A collection of different datasets. Some of these were made by members of Game Upscale, whereas others are premade from other sources.
Datasets
| Dataset Name | Author | Cost | License | Image Amount - Size | Description | Samples | Date Posted |
|---|---|---|---|---|---|---|---|
| DIV2K | | Free | | 0.8K | 800 HQ real-world images | | |
| Nomos2k | musl | Free | | 2536 | Raw images were processed in RawTherapee using prebayer deconvolution, AMaZe and AP1 color space. Sources: Adobe-MIT-5k, RAISE, FFHQ, DIV2K, DIV8k, Flickr2k, Rawsamples, SignatureEdits, Hasselblad raw samples and Unsplash. KernelGAN was trained using DLIP on all images, with scale 4x and up to 5k iter, instead of 3k. Hopefully it increases the accuracy of kernels. All files are provided in the "kernelgan" folder. Note: in order to use it on traiNNer, you have to give the dataroot_kernels path, along with enabling realistic under the resizing presets. I encourage everyone to give it a try and, if possible, mirror the dataset. For now it was made available on MEGA, but I plan to mirror it on other solutions. I've also made available my selected noise patches. They were extracted from multiple images "in the wild", with unknown degradation: [Noise Patches](https://mega.nz/file/WSZjjYRI#jgJYQTxJQyJjW5cbDJdUte0szfOpyeiDRrWmMzIkxZ0) | [Video Sample](https://cdn.discordapp.com/attachments/579685650824036387/904201202546380820/nomos2k.mp4) | 2021-10-30 |
| [VHS Part 1](https://mega.nz/file/tZhVzCTT#DtE42x2NYSYerrcj79CE2sS-yhWHn-iq-wKAUY_HnoY) [VHS Part 2](https://mega.nz/file/Vw8zzIhS#J4y9_hqQ2b0spZratZU-RqIqfxXoQHBhtzSxbKnRAo4) [LRx3](https://mega.nz/file/Fkx3CS5L#21BxaUFt9c7f6Nu-ASFcsm1dc01Gdu3R8yjKA3GffMU) | Redswag Scalliwag#4629 | Free | N/A (I do not own any of the images provided) | 3128 4k movie frames, 3128(x3) VHS frames - 37.3GB total | Sharpening and denoising VHS footage. Used for my VHS Sharpen 1x model; hopefully someone can find this useful or make an even better model than mine. | https://cdn.discordapp.com/attachments/905446120333930566/937602167852892220/example.png | 2022-01-31 |
| [ESRGAN_GroundTextures (Ground)](https://drive.google.com/drive/folders/1hnpOBK_olECyXitS7mRpGwn7PWzKuVP4?usp=sharing) | tldr_coder#6919 | Free | Unknown | 760 (training-ready set) and ~150 for the non-processed set - 873 MB compressed | Photos taken outdoors, plus some Google searching for high-quality images, most under public licenses as far as I know. Outdoor ground textures, with a focus on grass, dirt and rocks. These are the training and validation images used to train the GroundTextures model. I don't have all the original images anymore though. | | 2022-01-27 |
| [WorldTex](https://drive.google.com/file/d/1XLSYFJQ34NliwQn2CA9bJXa7rjYhXy19/view?usp=sharing) | JosephtheKP#3750 | Free | N/A (I do not own any of the images provided) | 200 Images - 4.83GB | Video game environment textures. This is a dataset I compiled rather quickly that I have no use for now; all the images are very high quality and almost entirely blur-free. | | 2021-12-26 |
| MIT-Adobe FiveK (MIT5k) | | Free | | 5k | We collected 5,000 photographs taken with SLR cameras by a set of different photographers. They are all in RAW format; that is, all the information recorded by the camera sensor is preserved. We made sure that these photographs cover a broad range of scenes, subjects, and lighting conditions. We then hired five photography students in an art school to adjust the tone of the photos. Each of them retouched all the 5,000 photos using a software dedicated to photo adjustment (Adobe Lightroom) on which they were extensively trained. We asked the retouchers to achieve visually pleasing renditions, akin to a postcard. The retouchers were compensated for their work. | | |
| Zalando RAW HQ Cloth Images | wyk | Free | | 76K (there are some duplicates, but they have the same names, so it is simple to remove them) | Contains cloth-like images. It could be used to train StyleGAN, for example. | | 2021-11-04 |
| Mountains | Joey | Free | | 648 (sourced via random places on the internet) | Originally compiled this for attempting to upscale the infamous mountain image. Unfortunately, that didn't end up working. However, it might be useful for realistic SR as well. It took me quite a while to compile all of these, so hopefully it helps someone. | | 2021-11-05 |
| ASOS mix images | wyk | Free | | 20,992 (webp, 7.6 GB) | Could be good for training StyleGAN or fabric resolution enhancement models. | | 2021-11-06 |
| REDS Dataset | | Free | | ? | This is the Realistic and Dynamic Scenes dataset for video deblurring and super-resolution. Train and validation subsets are publicly available. Downloads are available via Google Drive and SNU CVLab server. REDS dataset is released under CC BY 4.0 license. | | |
| SYNLA | | Free | | ~2k | This dataset is designed to simulate complex line art. | | |
| Quixel Megascans | Quixel | Limited Free | | 9.89K | Discover a world of unbounded creativity. Explore a massive asset library, and Quixel’s powerful tools, plus free in-depth tutorials and resources. | | |
| Adobe Stock | Adobe | $29.99+/Month | | "Millions" | Stock photos, royalty-free images, graphics, vectors & videos | | |
| Pexels | Pexels/Various | Free | | Unknown | Free stock photos you can use everywhere. ✓ Free for commercial use ✓ No attribution required | | |
| Poliigon | | $16+/Month | | 3.069K | | | |
| CC0 Textures | | Free | | A lot | Textures with normal maps, displacement maps and others in the form of JPGs. Most seem to be 8K. The name of the website comes from the license. | | |
| Textures.com | | Limited Free | | 134.872K | Textures for 3D, graphic design and Photoshop. 15 free downloads every day! | | |
| texturehaven | | Free | | 0.133K | 100% Free High Quality Textures for Everyone | | |
| Gustave Doré's 1866 Bible Illustrations | | Free | | 0.241K | High-Res Scans of Gustave Doré's 1866 Bible Illustrations | | |
| texturelib | | Limited Free | | 6.605K | Library of quality high resolution textures. Free for personal and commercial use. | | |
| Vimeo-90k | | Free | | 89.8k | This dataset consists of 89,800 video clips downloaded from vimeo.com, which cover a large variety of scenes and actions. It is designed for the following four video processing tasks: temporal frame interpolation, video denoising, video deblocking, and video super-resolution. | | |
| Triplet (for temporal frame interpolation) | | Free | | 73.171K | The triplet dataset consists of 73,171 3-frame sequences with a fixed resolution of 448 x 256, extracted from 15K selected video clips from [Vimeo-90k](http://toflow.csail.mit.edu/). This dataset is designed for temporal frame interpolation. | | |
| Septuplets | | Free | | 91.701K | The septuplet dataset consists of 91,701 7-frame sequences with a fixed resolution of 448 x 256, extracted from 39K selected video clips from [Vimeo-90k](http://toflow.csail.mit.edu/). This dataset is designed for video denoising, deblocking, and super-resolution. | | |
| falcoon300 | LyonHrt and falcoon | Free | | 1.233K | LyonHrt: As has been mentioned, here are the almost complete works of falcoon, as used in the falcoon300 model. This has a selection of 1233 images from the original source. I should add there are some scantily clad women, so NSFW lol. | | |
| Danbooru2018 | | Free | | 3330K | Danbooru2018 is a large-scale anime image database with 3.33m+ images annotated with 99.7m+ tags; it can be useful for machine learning purposes such as image recognition and generation. | | |
| Flickr1024 | | Free | | 2.048k | Flickr1024 is a large stereo dataset, which consists of 1024 high-quality image pairs and covers diverse scenarios. This dataset can be employed for stereo image super-resolution (SR). | | |
| Flickr2K | | Free | | ? | Huge dataset that is being used to train a lot of models. | | |
| Caltech Game Covers | | Free | | 11.4K | Found a dataset of game covers, but there's a ton of duplicates. If I can figure out how to parse the text files and remove the dupes, I'll upload the trimmed down version. | | |
| outdoor scene training | | Free | | 8.137K | Outdoor scene training is huge; just the first file has 2,187 pictures (8,137 total). | | |
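
The Zalando entry above notes that its duplicates share the same file names, which makes them simple to remove. A minimal Python sketch of that idea follows. It assumes the duplicates sit in different subfolders of one extracted dataset folder; the function name `find_duplicate_names` and the folder layout are hypothetical and not part of any listed dataset. It only reports duplicates and deletes nothing.

```python
from pathlib import Path


def find_duplicate_names(root):
    """Split files under `root` into first occurrences and same-name repeats.

    Assumption (hypothetical layout): a duplicate is any file whose name was
    already seen earlier in sorted path order, regardless of its folder.
    Returns (paths_to_keep, duplicate_paths).
    """
    first_seen = {}   # file name -> first path with that name
    duplicates = []   # later paths that reuse an already-seen name
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.name in first_seen:
            duplicates.append(path)
        else:
            first_seen[path.name] = path
    return list(first_seen.values()), duplicates
```

Review the returned duplicate list before deleting anything. Two files with the same name are not guaranteed to have identical contents, so comparing file hashes first would be a safer variant.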
Other source lists
If you have some time, consider adding datasets from the lists below to the table above.
http://homepages.inf.ed.ac.uk/rbf/CVonline/Imagedbase.htm
https://caffe2.ai/docs/datasets.html
https://pastebin.com/vU7P8Vmi