
Google Lumiere – How to use the AI video generator

Google's multimodal AI video model
Last Updated on February 15, 2024
How to use Google Lumiere AI video model with Bard.

The search giant has unveiled its newest AI model, and the results are impressive. Google Lumiere is a text-to-video diffusion model “designed for synthesizing videos that portray realistic, diverse and coherent motion”. In some respects, its output surpasses that of rival AI video generators Runway and Pika Labs. Here’s how to use Google Lumiere.

How to use Google Lumiere AI video generator

There is no way to access or download Lumiere at this time. We predict that Lumiere will enhance the multimodal capabilities of Google Bard in the near future. Get ready to use it at release by following these steps:

Step 1: Google Lumiere via Bard

To use Google Lumiere, you’ll need access to Google Bard. Visit the chatbot’s website to sign in.

There has been no official acknowledgement that the video model has been integrated yet. However, it’s fair to predict that Bard will be the place to use it in the near future.

Google Workspace accounts will need admin access to use Google Bard.

Step 2: How to download and install

If Google Lumiere becomes open-source, we will explain how to download and install it here.

What is Google Lumiere, and what can it do?

Google Lumiere is a new video diffusion model. It generates “coherent, high-quality videos using simple text prompts” and is great for stylization. It’s also multimodal, with text-to-video and image-to-video modalities. You can also use it to produce cinemagraphs and for video inpainting!

Lumiere achieves better temporal consistency than other models due to “a Space-Time U-Net architecture that generates the entire temporal duration of the video at once, through a single pass in the model. This is in contrast to existing video models which synthesize distant keyframes followed by temporal super-resolution”.
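The distinction can be sketched in terms of tensor shapes: a Space-Time U-Net downsamples a video in both the spatial and the temporal dimensions before upsampling back, so every frame of the clip is handled in a single pass, rather than generating sparse keyframes first and filling the gaps afterwards. The following is a minimal, illustrative shape-only sketch, not code from the paper; the function names and downsampling factors are assumptions for demonstration:

```python
import numpy as np

def space_time_downsample(video, s=2, t=2):
    """Downsample a video tensor (frames, height, width, channels)
    in both time and space by strided slicing -- a toy stand-in for
    the learned pooling inside a Space-Time U-Net."""
    return video[::t, ::s, ::s, :]

def space_time_upsample(video, s=2, t=2):
    """Nearest-neighbour upsampling in time and space, mirroring the
    U-Net's decoder path back to full resolution."""
    return video.repeat(t, axis=0).repeat(s, axis=1).repeat(s, axis=2)

# An 80-frame, 128x128 RGB clip is processed as one tensor:
clip = np.zeros((80, 128, 128, 3))
coarse = space_time_downsample(clip)    # shape (40, 64, 64, 3)
restored = space_time_upsample(coarse)  # shape (80, 128, 128, 3)
print(coarse.shape, restored.shape)
```

The key point is that the time axis (80 frames here) travels through the network alongside the spatial axes, which is what lets the model reason about the whole clip's motion at once instead of interpolating between distant keyframes.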

This new AI model builds on previously released Google AI tools.

StyleDrop, introduced on December 15th, 2023, is Google’s own text-to-image generator. Its USP is that it “uses one or more style reference images that describe the style for text-to-image generation. By doing so, StyleDrop enables the generation of images in a style consistent with the reference, while effectively circumventing the burden of text prompt engineering”. As a result, StyleDrop already incorporates the computer vision research that fed into Google Lumiere.

VideoPoet was the predecessor to Google Lumiere: a large language model for zero-shot video generation. The main difference is output quality. Impressively, VideoPoet was already multimodal, able to produce audio from video inputs, which is one of the least common avenues of multimodality (the most common being speech-to-text). In fact, this “autoregressive language model learns across video, image, audio, and text modalities”.


Who made Lumiere? A Space-Time Diffusion Model for Video Generation

Google scientists published a research paper introducing Lumiere on January 23rd, 2024. The paper, titled “Lumiere: A Space-Time Diffusion Model for Video Generation”, is filed under Computer Vision and Pattern Recognition, a nod to the multimodality of the model. It’s not just text models that can be multimodal. A multimodal AI video model could accept inputs of text, images, audio samples, or even other videos (perhaps a combination of two or more) to inform the generated video. As no AI chatbot currently offers much (or anything) in terms of AI video generation, Lumiere could make Google Bard the most multimodal AI assistant of 2024.

The researchers who authored the paper include Omer Bar-Tal, Hila Chefer, Omer Tov, Charles Herrmann, Roni Paiss, Shiran Zada, Ariel Ephrat, Junhwa Hur, Yuanzhen Li, Tomer Michaeli, Oliver Wang, Deqing Sun, Tali Dekel, and Inbar Mosseri.

Steve is the AI Content Writer for PC Guide, writing about all things artificial intelligence. He currently leads the AI reviews on the website.