In 2023, we’ve seen historic progress in several kinds of artificial intelligence. Generative AI, an umbrella term for AI that creates something new (such as text, images, audio, video, or instructions), has claimed most of that spotlight. However, in the world of robotics, machine learning and computer vision have improved to the point that robots can autonomously navigate an unfamiliar environment and perform tasks issued in natural human language. Could there be value in simulating these environments before the physical trial? With Runway General World Models, the AI video content firm aims to find out.
What are Runway General World Models?
General World Models brings the visual AI expertise of Runway (also known as RunwayML) to the real world. Described as a “new long-term research effort”, General World Models is the generative AI video firm’s attempt to make the next major advancement in AI by producing “systems that understand the visual world and its dynamics”.
Anastasis Germanidis, Co-founder and CTO of Runway, believes the next era of art will require a new paradigm. These new models will have a more comprehensive, logically sound, physically accurate, and spatially aware understanding of the world, and from that will come video without the artifacts of today. This is the struggle shared by current generative video technology and AI image generators alike: token prediction does not truly reflect an understanding of the world. LLMs (Large Language Models) can predict the most likely next word in a sequence, but they can’t explain why they’re predicting it in any terms other than statistical probability.
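To make that contrast concrete, here is a minimal Python sketch of next-token prediction in the purely statistical sense described above. The vocabulary and probabilities are invented for illustration; a real LLM learns billions of such conditional probabilities, but the principle is the same: pick the likeliest next word, with no model of *why* it is likely.

```python
# Toy "language model": for a given context, a table of invented
# probabilities for the next token. A real LLM learns these from data,
# but it is still choosing by statistical likelihood, not by reasoning
# about the world the words describe.
next_token_probs = {
    "the driver hit the": {"brakes": 0.6, "gas": 0.3, "radio": 0.1},
}

def predict_next(context: str) -> str:
    """Return the statistically most likely next token for a known context."""
    probs = next_token_probs[context]
    return max(probs, key=probs.get)

print(predict_next("the driver hit the"))  # -> brakes
```

The model here has no idea what brakes are or what happens when acid drips onto a brake cable; it only knows which word tends to follow which.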
What’s the difference between a token prediction LLM and a simulation GWM?
As a thought experiment, imagine you’re writing a novel with an AI. You’ve described a vintage sports car, hurtling down a mountain pass. The driver is experienced. They’re taking each turn with ease and finesse. In the distance, the antagonist presses a big red button on their remote control. This is what you tell the AI, and as a result, something happens within this world that is not immediately obvious. At the next turn, the driver goes to drift around the bend and, finding no traction from the brakes, goes careening off the deadly drop on one side.
Several pages ago, you described the antagonist placing a remote detonation device inside the car. You told the AI that it was placed above the brake cable and that it was full of acid. Nothing else. The AI assumes that the remote control and the remote detonation device are related. It knows that acid is a liquid that flows downwards when not contained, that this liquid can melt things including metal, such as the brake cable, and that if the brake cable isn’t intact when a car tries to drift, the car will move differently to how the driver is expecting. So it decides to expire the driver and reduce the resale value of the $7 million Ferrari.
Imagine an AI that could predict real-world events because of how the real world works, and not because of the words preceding it. Runway’s General World Models will make Gen-1 and even Gen-2 look like “very early and limited forms” of their successor.
How could General World Models be used?
The applications of machine learning go beyond ML-enabled video. While Runway is still focused on how this can be applied to video, the technology could have implications in manufacturing and safety testing.
Companies such as Boston Dynamics, made famous for its agile mobile robot “dog” named Spot, use AI to enhance robot navigation. These systems require a great deal of testing in real-world physical environments. Imagine a sufficiently accurate virtual model of the real world that allows engineers to test and iterate designs at the speed of silicon. Virtual models, of course, are already used as simulators in various industries, from aeronautics to car manufacturing. In this way, you can run thousands of crash test simulations to design your crumple zones and assess passenger safety, or analyze the aerodynamics of a new wing design for exactly the same reasons.
Until now, modeling the real world in a virtual program has been a tricky thing, because to recreate our world inside a computer, we have to program the laws of physics into it. This means we have to know the laws of physics. In the year 2023, we’re pretty good at that, and our virtual models are pretty close. Could AI close that gap further than ever before?
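As a toy illustration of what “programming the laws of physics into it” means, here is a minimal sketch that hand-codes a single law, gravity, and integrates it numerically. The scenario, step size, and function names are invented for this example; the point is that every behavior of the virtual world must be written in by a human who already knows the relevant law.

```python
G = 9.81  # gravitational acceleration in m/s^2 — a law we must know and type in

def simulate_fall(height_m: float, dt: float = 0.001) -> float:
    """Return the time (seconds) for an object dropped from height_m to hit
    the ground, ignoring air resistance, using simple stepwise integration."""
    y, v, t = height_m, 0.0, 0.0
    while y > 0:
        v += G * dt   # gravity is the only law this tiny world knows
        y -= v * dt
        t += dt
    return t

print(round(simulate_fall(10.0), 2))  # ≈ 1.43 s, matching the analytic sqrt(2h/g)
```

Anything this simulator wasn’t explicitly told (air resistance, friction, material deformation) simply doesn’t exist in its world, which is the gap a learned world model would aim to close.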