OpenAI, the company behind the massively successful large language model (LLM) that powers ChatGPT, is preparing to up its game once again with a new release of Dall-E, its image generation model. Dall-E 2 is no longer the company’s most cutting-edge option. Meet Dall-E 3.
According to OpenAI, Dall-E 3 takes everything the company learned from Dall-E 2 and pushes its image generation capabilities to the next level.
“Dall-E 3 delivers significant improvements over DALL·E 2 when generating text within an image and in human details like hands,” OpenAI explained on its blog.
Notably, the new version will let users generate readable text and typography baked directly into the images themselves, which should put it on more competitive footing with services like Ideogram, a startup launched last month by former Google staff.
The reveal of this upcoming release is exciting, especially if the improvements from GPT-3.5 to GPT-4 are any indication of how far OpenAI has come. OpenAI continues to train and refine its AI systems, making them more responsive to user input. Of course, AI is still far from perfect, and there’s no guarantee of how well Dall-E 3 will respond to prompts.
Another big update in Dall-E 3 should make image generation follow prompts more reliably, giving users more control over where objects and figures sit in relation to one another, something that other systems like Midjourney struggle with.
The company provided an example screenshot, which does seem to show that such placement instructions in a prompt are effective. However, as anyone who has worked with AI prompts will tell you, the real question is how many attempts it took to create this perfect sample image.
Dall-E 3 will soon be available to subscribers of both ChatGPT Plus and ChatGPT Enterprise.