It’s hard to square the hand-drawn animation that dominated the box office for decades with the nearly photo-realistic CGI we see today, but as technology advances, both art and animation will continue to change with it. To that point, Nvidia appears to be on the verge of changing the game once again with a new deep learning model capable of transforming the most basic of sketches into photo-realistic images.
The AI leverages generative adversarial networks (GANs) to convert rough, hand-drawn maps into beautiful landscapes. In a nod to French post-impressionist painter Paul Gauguin, Nvidia named the interactive app that uses the model “GauGAN.”
The app lets users draw “segmentation maps,” labeling each segment as mountains, snow, water, or grass; the model then fills in the detail automatically. Bryan Catanzaro, vice president of applied deep learning research at Nvidia, compares the technology to a “smart paintbrush,” and the company says the model has been trained on a million images to learn what to fill in where.
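To make that concrete, here is a minimal sketch, in PyTorch, of what feeding a labeled segmentation map to a trained generator could look like. The class IDs, the one-hot encoding step, and the toy generator are all illustrative assumptions; Nvidia has not published GauGAN’s architecture or API.

```python
import torch
import torch.nn.functional as F

# Hypothetical label IDs for the segments a user can paint.
CLASSES = {"sky": 0, "mountain": 1, "water": 2, "grass": 3}
NUM_CLASSES = len(CLASSES)

# A user-drawn segmentation map: one class ID per pixel (H x W).
seg_map = torch.zeros(256, 256, dtype=torch.long)
seg_map[:96, :] = CLASSES["sky"]          # top of the canvas is sky
seg_map[96:128, :] = CLASSES["mountain"]  # a band of mountains
seg_map[128:192, :] = CLASSES["water"]    # a lake below them
seg_map[192:, :] = CLASSES["grass"]       # grass in the foreground

# One-hot encode to (1, NUM_CLASSES, H, W), a common input format
# for semantic image synthesis models.
one_hot = F.one_hot(seg_map, NUM_CLASSES).permute(2, 0, 1).float().unsqueeze(0)

# Stand-in for a trained generator; purely illustrative.
generator = torch.nn.Sequential(
    torch.nn.Conv2d(NUM_CLASSES, 64, kernel_size=3, padding=1),
    torch.nn.ReLU(),
    torch.nn.Conv2d(64, 3, kernel_size=3, padding=1),
    torch.nn.Tanh(),  # RGB output in [-1, 1]
)

with torch.no_grad():
    fake_image = generator(one_hot)  # (1, 3, 256, 256) synthetic landscape
```

In a real system the generator would be a deep network trained on those million photos; the point here is only the input and output format: a per-pixel class map, one-hot encoded, goes in, and an RGB image comes out.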
“It’s like a coloring book picture that describes where a tree is, where the sun is, where the sky is,” Catanzaro said. “And then the neural network is able to fill in all of the detail and texture, and the reflections, shadows and colors, based on what it has learned about real images.”
The key to the realism of these computer-generated scenes is a component called a discriminator, which gives the network’s other component, the generator, “pixel-by-pixel feedback on how to improve the realism of its synthetic images.” For example, after seeing enough photos of lakes, the generator learns that objects cast reflections on water, and going forward it will do its best to imitate those reflections when generating landscapes.
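In code, that feedback loop is the standard adversarial training step. The sketch below assumes a PatchGAN-style discriminator that outputs a grid of local realism scores, one common way to give the generator the spatially fine-grained feedback the article describes; it is not Nvidia’s actual training code, and the toy networks and placeholder data are assumptions.

```python
import torch
import torch.nn as nn

# Illustrative stand-ins; real architectures are far larger.
generator = nn.Sequential(
    nn.Conv2d(4, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
)
# The discriminator outputs a grid of realism scores rather than a
# single number, giving the generator spatially local feedback.
discriminator = nn.Sequential(
    nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, 4, stride=2, padding=1),  # one score per patch
)

bce = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

seg_maps = torch.rand(8, 4, 64, 64)     # placeholder one-hot maps
real_photos = torch.rand(8, 3, 64, 64)  # placeholder real landscapes

# Discriminator step: learn to tell real photos from synthetic ones.
fake = generator(seg_maps)
d_real = discriminator(real_photos)
d_fake = discriminator(fake.detach())
d_loss = bce(d_real, torch.ones_like(d_real)) + \
         bce(d_fake, torch.zeros_like(d_fake))
d_opt.zero_grad()
d_loss.backward()
d_opt.step()

# Generator step: the discriminator's per-patch scores act as the
# feedback on how to improve the realism of the synthetic images.
g_scores = discriminator(generator(seg_maps))
g_loss = bce(g_scores, torch.ones_like(g_scores))
g_opt.zero_grad()
g_loss.backward()
g_opt.step()
```

Repeated over many batches, the two networks push each other: the discriminator gets better at spotting fakes, and the generator gets better at details like the water reflections Catanzaro describes.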
GauGAN is on display at the GPU Technology Conference this week, but is not yet available to the public.