DeepMind Unveils Genie 3, A Revolutionary World Model Paving the Way Toward AGI

By Lisa Wong

DeepMind has unveiled Genie 3, a real-time interactive world model. It targets a long-standing difficulty in AI research, training agents to complete complex and diverse tasks, by letting users generate rich, interactive 3D worlds in minutes. The model runs at 24 frames per second and at 720p resolution, a marked leap over its predecessor, Genie 2. Genie 3 is still in research preview and not yet publicly available. It could also help shape inclusive educational experiences and change generative media such as gaming and creative prototyping.

Genie 3 can generate complex 3D environments from just a few lines of descriptive text, which lets developers and researchers build scenarios tailored to their own needs. The model extends the vision of Genie 2, which was designed to generate new environments for agents to train in, and it also builds on Veo 3, DeepMind’s newest video generation model. DeepMind positions this combination as a step in the ongoing pursuit of artificial general intelligence (AGI).

Perhaps the most striking aspect of Genie 3 is its ability to maintain physical consistency over time. This comes from its auto-regressive architecture: the model produces one frame at a time, conditioning each new frame on the frames generated so far. Shlomi Fruchter, a member of the DeepMind team, explained the process:

“It has to look back at what was generated before to decide what’s going to happen next. That’s a key part of the architecture.”

Because each new frame is conditioned on what came before, Genie 3 can keep its environments coherent and physically plausible as a simulation unfolds, which makes them considerably more realistic.
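The frame-by-frame generation described above can be illustrated with a toy loop. This is only a sketch of the general auto-regressive pattern, not DeepMind's implementation: `toy_next_frame`, `rollout`, and the list-of-floats `Frame` type are hypothetical stand-ins invented for this example, and a real model would replace the averaging rule with a learned network operating on 720p images.

```python
from collections import deque
from typing import Callable, Deque, List, Optional, Sequence

Frame = List[float]  # hypothetical stand-in for an image


def toy_next_frame(history: Sequence[Frame], action: float) -> Frame:
    """Hypothetical 'model': the next frame depends on every earlier frame and on the action."""
    prev = history[-1]
    # Average each value over the whole history so older frames still matter.
    mean = [sum(f[i] for f in history) / len(history) for i in range(len(prev))]
    return [0.5 * p + 0.5 * m + action for p, m in zip(prev, mean)]


def rollout(
    first_frame: Frame,
    actions: Sequence[float],
    next_frame: Callable[[Sequence[Frame], float], Frame] = toy_next_frame,
    context: Optional[int] = None,
) -> List[Frame]:
    """Generate one frame per action, feeding each output back in as input."""
    # context=None keeps the full history; an integer keeps only the latest frames.
    history: Deque[Frame] = deque([first_frame], maxlen=context)
    frames = [first_frame]
    for action in actions:
        frame = next_frame(list(history), action)
        history.append(frame)  # the new frame becomes part of the conditioning
        frames.append(frame)
    return frames


frames = rollout([0.0, 1.0], actions=[1.0, 0.0])
```

In this sketch the second generated frame differs from what the last frame alone would produce, because the first frame is still part of the history. At the 24 frames per second cited above, one minute of interaction corresponds to 1,440 such steps, each with a budget of roughly 41.7 milliseconds.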

Genie 3 is more than an eye-catching demo. It is a notable step toward training agents that plan ahead, explore under uncertainty, and learn by trial and error. As Jack Parker-Holder noted:

“We think world models are key on the path to AGI, specifically for embodied agents, where simulating real-world scenarios is particularly challenging.”

Genie 3 brings this vision closer. Agents can train in environments that change on the fly in response to their own actions, which allows for more efficient learning.
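The idea of an agent training in a world that responds to its actions follows the familiar agent-environment loop. The sketch below assumes a deliberately tiny, hand-written world: `ToyWorld`, `run_episode`, and the corridor layout are hypothetical names made up for illustration and say nothing about how Genie 3 exposes its environments.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ToyWorld:
    """Hypothetical stand-in for a generated environment: a 1-D corridor with a goal cell."""
    size: int = 5
    goal: int = 4
    position: int = 0
    visited: List[int] = field(default_factory=lambda: [0])

    def step(self, action: int) -> Tuple[int, float, bool]:
        """Apply an action (-1 = left, +1 = right); return (observation, reward, done)."""
        # Clamp the move so the agent cannot leave the corridor.
        self.position = max(0, min(self.size - 1, self.position + action))
        self.visited.append(self.position)  # the world state reflects what the agent did
        done = self.position == self.goal
        return self.position, (1.0 if done else 0.0), done


def run_episode(world: ToyWorld, policy: Callable[[int], int], max_steps: int = 50) -> float:
    """Let the policy act until the goal is reached or the step budget runs out."""
    obs, total = world.position, 0.0
    for _ in range(max_steps):
        obs, reward, done = world.step(policy(obs))
        total += reward
        if done:
            break
    return total


world = ToyWorld()
score = run_episode(world, policy=lambda obs: 1)  # always move right
```

Here the policy is a fixed rule; in a training setup it would be updated from the rewards it collects, and the hand-written `step` would be replaced by a world model's generated response.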

For all its sophistication, Genie 3 currently supports only a few minutes of continuous interaction. What sets it apart from most traditional simulators is that it does not rely on a hard-coded physics engine. That makes the model very flexible: it can produce environments that feel believable without being bound to a fixed, hand-written set of physical rules.

As researchers continue to explore applications of Genie 3, its ability to render immersive environments in real time could reshape educational experiences and new media formats such as gaming, enabling more engaging and interactive experiences.