While AI art has proliferated over the past two years, the best AI image generators have remained largely the same. Adobe Firefly, Midjourney, DALL-E and Stable Diffusion have been fighting it out, with each one bringing new advances in every update.
But it seems Flux, a relatively new open-source AI image generator, may be about to take the crown, at least when it comes to realism. Early experimenters running the model on their own devices have paired it with XLabs' Lora, a fine-tuning script that appears to add extra detail. At a quick glance, the results are almost indistinguishable from photographs.
Comment from r/StableDiffusion
On closer inspection, the images can still be identified as AI-generated fairly easily. Text is the big giveaway, particularly small text on things like the lanyard and microphone in the image above. Patterns and textures can also look strange when examined closely, and elements can be out of proportion. That aside, the images going viral on social media look at first glance like normal photos of normal people (I'm not sure who does a TED Talk in a swimsuit, but that's by the by).
What is the Flux AI image generator?
Created by the startup Black Forest Labs, the Flux AI image generator is being billed as an heir to Stable Diffusion because it's open-source. That means its code is freely available and anybody can tinker with it, modify the model and incorporate it into their own generators. Users can run Flux locally if they have a good enough computer, but it's also available on multi-model platforms like Poe and Nightcafe.
There are actually three versions of Flux.1. There's a Pro version available with a commercial licence, then there's the mid-weight model Dev and a faster model, Schnell (German for "fast"; Black Forest Labs is based in Germany, as you might expect).
While Ideogram impressed many when it burst onto the scene a few months ago, Flux now looks to be the biggest competition for Midjourney in terms of photorealism. The model itself appears to produce very realistic results, although skin textures can be less convincing and more plastic-looking. But some users have been getting terrifyingly realistic results when combining Flux with Lora, a fine-tuning script for photorealism made by XLabs.
Feel the difference between using Flux with Lora(from XLab) and with no Lora. Skin, Hair, Wrinkles. No Comfy, pure CLI. from r/StableDiffusion
The stunning realism of the images above has quickly made them go viral. It also has many people wondering what the benefits are, other than providing a bit of fun for machine learning hobbyists.
The ability to create realistic images of people who don't exist could be a game changer for stock photography and advertising. Many smaller businesses and brands are already using AI images for social media posts. But the risk of AI images being used to commit scams or create fake news is more terrifying than ever.
Nonsense text remains an issue with AI image generation, although it wouldn't be difficult to use an image editor like Photoshop to quickly replace the scribbles on these Flux images with meaningful text, or to just remove them completely. How much more AI image generators can improve is in doubt, however, because research has shown that including AI images in their training data leads to a drop in reliability, and the internet is now awash with AI images.
Joe is a regular freelance journalist and editor at Creative Bloq. He writes news, features and buying guides and keeps track of the best equipment and software for creatives, from video editing programs to monitors and accessories. A veteran news writer and photographer, he now works as a project manager at the London and Buenos Aires-based design, production and branding agency Hermana Creatives. There he manages a team of designers, photographers and video editors who specialise in producing visual content and design assets for the hospitality sector. He also dances Argentine tango.