Google presents text-to-image generator

May 25, 2022, by Horst Buchwald

San Francisco, 5/25/2022

Google has unveiled Imagen, an AI system that converts text descriptions into photorealistic images. The text-to-image generator is said to produce images that look more realistic and lifelike than those of OpenAI’s DALL-E 2. Like DALL-E 2, Imagen has not been released to the public.

The Imagen diffusion model outputs drawings, oil paintings, CGI renderings and more (see graphic above) based on a written prompt from users.

Imagen’s developers, Google Research’s Brain Team, said it achieves unprecedented photorealism by building on transformer language models and image diffusion models.

Google claims that human raters preferred Imagen over “all other models” in terms of image fidelity and image-text alignment.

However, there are some troubling issues:

Imagen is trained on data sets from the Internet and therefore can reflect harmful stereotypes and biases, Google said.

The model performed worse at generating human faces than other subjects. It also shows a preference for images of people with light skin and portrays occupations in ways that are consistent with Western gender stereotypes.

In creating images of events, objects and activities, Imagen “encodes social and cultural biases,” Google said.

For these and other reasons, Google has not yet released Imagen to the public, although it allows users to try it out using preselected phrases on the Imagen website.