
Stable Video Diffusion – Stability AI releases AI video generator

The new #1 AI video model?

Reviewed By: Kevin Pocock

Last Updated on November 24, 2023
Stable Video Diffusion, the text-to-video generative AI model from Stability AI.

This year has seen AI image generators go from abstract to photorealistic. In fact, AI-generated content has gone from barely usable to one of the most common forms of content on social media, with individuals and big brands alike quickly adopting the technology across entertainment and advertising. The same can’t be said of generative video – yet. With its latest generative AI video model, Stability AI aims to change that. Having built a stable (pun intended) foundation in AI art with Stable Diffusion, the diffusion-model research firm now sets its sights on text-to-video and image-to-video with Stable Video Diffusion.

How to use Stable Video Diffusion – Text-to-video models SVD & SVD-XT

On November 21st, Stability AI announced Stable Video Diffusion, its “first foundation model for generative video based on the image model Stable Diffusion.”

Already showing results that compete with rival AI video generators Runway and Pika Labs, “this state-of-the-art generative AI video model represents a significant step” for generative artificial intelligence. The AI model research firm proudly states that its diverse open-source portfolio, spanning “across modalities including image, language, audio, 3D, and code… is a testament to Stability AI’s dedication to amplifying human intelligence.”

Stability AI release SVD and SVD-XT, the Stable Video Diffusion AI models.
Stable Video Diffusion generates video from a text prompt.

Leading closed models such as the text-to-video platforms Runway and Pika Labs have offered video generation beyond the image-only Stable Diffusion model for several months. Now, however, the two new AI models from Stability AI are “capable of generating 14 and 25 frames” per rendered clip, “at customizable frame rates between 3 and 30 frames per second. At the time of release in their foundational form, through external evaluation, we have found these models surpass the leading closed models in user preference studies.”

The main difference between the SVD and SVD-XT models is clip length: SVD generates 14-frame clips, while SVD-XT extends this to 25 frames. SVD-XT is therefore capable of longer video generations, but is more computationally expensive as a result.
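To make that difference concrete, here is a minimal sketch of how the two checkpoints might be selected at generation time. It assumes the checkpoint names Stability AI published on Hugging Face; the frame counts simply mirror the figures quoted above, and the helper function is purely illustrative.

```python
# Sketch: choosing between the two Stable Video Diffusion checkpoints.
# Assumes the Hugging Face repo names published by Stability AI; SVD is
# tuned for 14-frame clips, SVD-XT for 25-frame clips, both conditioned
# on frame rates between 3 and 30 fps.

SVD_CHECKPOINTS = {
    "svd": {"repo": "stabilityai/stable-video-diffusion-img2vid", "num_frames": 14},
    "svd-xt": {"repo": "stabilityai/stable-video-diffusion-img2vid-xt", "num_frames": 25},
}

def pick_checkpoint(want_longer_clip: bool) -> dict:
    """Return the checkpoint config matching the desired clip length."""
    return SVD_CHECKPOINTS["svd-xt" if want_longer_clip else "svd"]
```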


Video – The final frontier

Each mode of digital media (text, audio, image, and video) comes with a unique set of challenges to achieve the level of fidelity required for real-world commercial applications. Of the four, video has predictably become the final frontier, presenting the greatest number of challenges, and as a result it will be the last form to be perfected.

Researchers developing this model identified three stages of training for its video LDM (latent diffusion model) architecture: “text-to-image pretraining, video pretraining, and high-quality video fine-tuning.”

Further technical details can be found in the official research paper.

Where to use Stable Video Diffusion

Stable Video Diffusion is currently in research preview, so there is no public web app to try just yet. You can, however, sign up for the waitlist for the “new upcoming web experience”.

Will the Stability AI video generator be open source?

Yes! The new AI video generator is open source. In fact, the code is already available on GitHub, and those who wish to run the model locally can find the model weights on Hugging Face.
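For readers who want to try the weights locally, a minimal sketch using the Hugging Face diffusers library and its StableVideoDiffusionPipeline might look like the following. The input image path and output filename are placeholders, and this assumes a recent diffusers release plus a CUDA GPU with plenty of VRAM.

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Download the SVD-XT weights from Hugging Face and move the pipeline to the GPU.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.to("cuda")

# Stable Video Diffusion is image-to-video: generation is conditioned on a still image.
image = load_image("input_frame.png")  # placeholder path
image = image.resize((1024, 576))      # the resolution the model was trained on

# Generate a 25-frame clip; decode_chunk_size trades VRAM for decoding speed.
frames = pipe(image, num_frames=25, fps=7, decode_chunk_size=8).frames[0]
export_to_video(frames, "generated.mp4", fps=7)
```

Exact parameters and memory requirements vary with the library version, so treat this as a starting point rather than an official recipe.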

Steve is an AI Content Writer for PC Guide, writing about all things artificial intelligence. He currently leads the AI reviews on the website.