A new text-to-image generative vision transformer that works much like ChatGPT, but creates images instead of text, has been developed by Google Research.

After users describe objects and specify artistic styles, StyleDrop will generate those images in just three minutes, according to its developers.

Source: Google

"The proposed method is extremely versatile and captures nuances and details of a user-provided style, such as color schemes, shading, design patterns, and local and global effects," a report detailing the technology explained.

The researchers explained that StyleDrop works in conjunction with Google's Muse, a generative vision transformer with 3 billion parameters, which they credit for its high-quality image generation.

Although it isn’t yet available to the public, Google is eyeing StyleDrop as a tool for art directors and graphic designers.

The text-to-image generative transformer is detailed in the report, StyleDrop: Text-to-Image Generation in Any Style, which is available on the arXiv preprint server.

To contact the author of this article, email mdonlon@globalspec.com