We’ve already seen AI models generate text, images, and even lengthy videos, but what about playable video games? That is what a group of researchers from Google Research and Tel Aviv University has achieved with the wildly impressive GameNGen, which they describe as “the first game engine powered entirely by a neural model that enables real-time interaction with a complex environment over long trajectories at high quality.”
As you can see in the video below, GameNGen is able to simulate the original Doom at over 20 frames per second while allowing for real-time interaction:
As the team behind the AI project explains, the first step is to train a reinforcement learning (RL) agent to play the game. These play sessions are recorded, and the recordings are then used to train a diffusion model to predict the next frame from past frames and actions. As a result, the AI-generated Doom can handle complex game-state updates, such as tracking health and ammo, attacking enemies, and opening doors.
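To make that second stage concrete, here is a minimal, hypothetical PyTorch sketch of a model that takes a stack of recorded past frames plus the actions taken and learns to predict the next frame. GameNGen itself fine-tunes a full diffusion model (Stable Diffusion) rather than the simple regression network below, so every name, shape, and hyperparameter here is an illustrative assumption, not the paper’s architecture:

```python
import torch
import torch.nn as nn

# Hypothetical next-frame predictor. It conditions on a stack of past
# frames and the actions taken, then regresses the next frame.
# GameNGen instead fine-tunes a diffusion model; this is a simplified
# stand-in to show the data flow from (frames, actions) -> next frame.

CONTEXT = 4          # number of past frames fed to the model (assumed)
NUM_ACTIONS = 8      # size of the game's discrete action space (assumed)
H, W = 64, 64        # downscaled frame resolution (assumed)

class NextFramePredictor(nn.Module):
    def __init__(self):
        super().__init__()
        # Past frames are stacked along the channel dimension.
        self.encoder = nn.Sequential(
            nn.Conv2d(3 * CONTEXT, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        # Each discrete action is embedded; the pooled embedding is
        # broadcast over the spatial feature map as conditioning.
        self.action_embed = nn.Embedding(NUM_ACTIONS, 64)
        self.decoder = nn.Conv2d(64, 3, 3, padding=1)

    def forward(self, frames, actions):
        # frames:  (batch, CONTEXT, 3, H, W) past observations
        # actions: (batch, CONTEXT) one discrete action per past frame
        b = frames.shape[0]
        x = self.encoder(frames.reshape(b, 3 * CONTEXT, H, W))
        a = self.action_embed(actions).mean(dim=1)   # (b, 64)
        x = x + a[:, :, None, None]                  # broadcast over H, W
        return self.decoder(x)                       # predicted next frame

# One training step on a stand-in batch; in the real pipeline these
# tuples come from the RL agent's recorded play sessions.
model = NextFramePredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
frames = torch.rand(2, CONTEXT, 3, H, W)
actions = torch.randint(0, NUM_ACTIONS, (2, CONTEXT))
target = torch.rand(2, 3, H, W)

loss = nn.functional.mse_loss(model(frames, actions), target)
loss.backward()
opt.step()
```

At inference time, a model trained this way is run autoregressively: each predicted frame is appended to the context window and fed back in with the player’s next action, which is how the recorded trajectories become a playable, interactive simulation.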
The team believes GameNGen “answers one of the important questions on the road towards a new paradigm for game engines, one where games are automatically generated.”
As impressive as the project is, Nvidia senior research manager Jim Fan shared some important caveats on X. He notes that GameNGen is more like a neural radiance field (NeRF), and as such, it can’t come up with new scenes on its own. In other words, you couldn’t use GameNGen to generate new levels for Doom — at least not in its current state.
“Today, video games are programmed by humans,” the researchers say in the summary of their scientific paper. “GameNGen is a proof-of-concept for one part of a new paradigm where games are weights of a neural model, not lines of code.”