How Is Stable Diffusion Trained?

Unraveling the Mystery: How Stable Diffusion Learns

Kevin Pocock

Last Updated on August 11, 2023

Stable Diffusion is one of the best AI image generators, helping millions of users create images from text prompts. If you use Stable Diffusion without knowing how it was trained, don’t worry: this article walks through it.

How is Stable Diffusion Trained?

Stable Diffusion is a text-to-image AI model: a latent diffusion model that takes a text prompt and generates an image that matches it as closely as it can. It was trained on image datasets assembled by LAION, a non-profit organization, which built them from Common Crawl, a web archive maintained by another non-profit.

What is the Data Collection Process?

LAION scanned the HTML in Common Crawl for image tags that carry an alternative text (alt) attribute, pairing each image with its alt text. It then classified the resulting 5 billion image-text pairs by language.
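To make the collection step concrete, here is a minimal sketch of pulling image and alt-text pairs out of an HTML page using only Python's standard library. It is an illustration, not LAION's actual pipeline: the class name `ImgAltCollector`, the helper `extract_pairs`, and the sample page are all made up for this example.

```python
from html.parser import HTMLParser


class ImgAltCollector(HTMLParser):
    """Collect (image URL, alt text) pairs from <img> tags with non-empty alt text."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        src = attr_map.get("src")
        # A missing or valueless alt attribute comes through as None.
        alt = (attr_map.get("alt") or "").strip()
        if src and alt:
            self.pairs.append((src, alt))


def extract_pairs(html):
    """Return every (src, alt) pair found in the given HTML string."""
    collector = ImgAltCollector()
    collector.feed(html)
    return collector.pairs


# Hypothetical sample page: only the first image has usable alt text.
sample_html = """
<html><body>
  <img src="https://example.com/cat.jpg" alt="A tabby cat on a sofa">
  <img src="https://example.com/spacer.gif" alt="">
  <img src="https://example.com/logo.png">
</body></html>
"""

print(extract_pairs(sample_html))
```

Images with an empty or missing alt attribute are skipped, since they would give the model a picture with no caption to learn from.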

LAION then filtered the datasets by resolution, the presence of watermarks, and predicted aesthetic score.
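The filtering step can be pictured as a simple predicate applied to each record. The sketch below is only an illustration under stated assumptions: the `ImageRecord` fields, the threshold values, and the sample records are hypothetical and are not LAION's real schema or cut-offs.

```python
from dataclasses import dataclass


@dataclass
class ImageRecord:
    url: str
    width: int
    height: int
    watermark_prob: float   # hypothetical predicted probability of a watermark (0 to 1)
    aesthetic_score: float  # hypothetical predicted aesthetic score


# Illustrative thresholds only; the article does not give the real values.
MIN_SIDE = 512
MAX_WATERMARK_PROB = 0.5
MIN_AESTHETIC_SCORE = 5.0


def keep(record):
    """Return True if the record passes all three filters."""
    return (
        min(record.width, record.height) >= MIN_SIDE
        and record.watermark_prob < MAX_WATERMARK_PROB
        and record.aesthetic_score >= MIN_AESTHETIC_SCORE
    )


# Made-up records: one passes, the others each fail a different filter.
records = [
    ImageRecord("https://example.com/a.jpg", 1024, 768, 0.1, 6.2),
    ImageRecord("https://example.com/b.jpg", 300, 300, 0.1, 7.0),    # too small
    ImageRecord("https://example.com/c.jpg", 2048, 2048, 0.9, 6.5),  # likely watermarked
    ImageRecord("https://example.com/d.jpg", 1024, 1024, 0.2, 3.1),  # low aesthetic score
]

kept = [r.url for r in records if keep(r)]
print(kept)
```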

Initial Training

Stable Diffusion was first trained on low-resolution images of 256 × 256 pixels. Training then moved to LAION-High-Resolution, a subset of LAION-5B containing about 170 million images with a resolution of more than 1024 × 1024 pixels. These images were all downsized to 512 × 512 for efficiency.
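The article does not say how the images were brought down to 512 × 512. One common approach, assumed here purely for illustration, is to crop the largest centered square and then scale it to the target size. The sketch below computes only the geometry of that operation; the function names and the 1600 × 1200 example are hypothetical.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def scale_factor(width, height, target=512):
    """Return the factor that maps the cropped square's side to `target` pixels."""
    return target / min(width, height)


# Example: a 1600 x 1200 image is cropped to its central 1200 x 1200 square,
# then every pixel dimension is multiplied by 512 / 1200.
box = center_crop_box(1600, 1200)
factor = scale_factor(1600, 1200)
print(box)
print(round(factor, 4))
```

An image library would take the crop box and target size as inputs to its own crop and resize calls; those calls are left out here to keep the sketch dependency-free.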

Would it be Possible to Run Stable Diffusion Without a GPU?

Yes, there are ways to use Stable Diffusion without a GPU of your own. For example, you can use it through DreamStudio, which runs online and works smoothly without any hardware upgrades.

Conclusion

Stable Diffusion learned from large datasets of captioned images, which is why it responds to so many different keywords, artists, and characters. Like any AI model, it can do nothing on its own: it needs training data, and the size of that data matters. It is also open source, which makes it flexible and puts it within everyone’s reach.