
When Image AI learns from Image AI, things get really weird

In times when AI advances seem faster than ever, the temptation is great: nothing seems simpler (or even more logical) than using synthetic data to train next-generation AI models. But the results, especially after repeated training on AI-generated data, leave much to be desired, as a study by Rice University in Texas, conducted in cooperation with Stanford University, shows.

Autophagic Loop: When AI consumes itself

Five training iterations are enough to drive generative AI “MAD”. In this case, “MAD” stands for “Model Autophagy Disorder”: the scientists use this term, an allusion to mad cow disease, to describe how AI models and their output quality break down when they are repeatedly trained on AI-generated data.

“The repetition of this process creates an autophagic (self-consuming) loop whose properties are poorly understood,” the study says. The scientists, with backgrounds in electrical and computer engineering, statistics, and applied computational mathematics, focused on StyleGAN models, which create images in a single pass, and on diffusion models, which refine a noisy image into a clear one over many steps.
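
To make this loop concrete, here is a toy numerical sketch (our own illustration, not the study's code): the “generative model” is reduced to a Gaussian fitted to its training data, each generation trains only on samples from its predecessor, and a mild truncation factor stands in for the common bias toward “typical”, high-quality samples. Even in this miniature setting, the fitted spread, a crude proxy for output diversity, shrinks generation by generation:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
data = rng.normal(loc=0.0, scale=1.0, size=5_000)  # stand-in for real images

TRUNCATION = 0.9  # sampling bias toward samples near the mode ("high quality")

for gen in range(1, 6):  # five generations, as in the study
    mu, sigma = data.mean(), data.std()   # "train" the model on current data
    data = rng.normal(mu, TRUNCATION * sigma, 5_000)  # next set: purely synthetic
    print(f"generation {gen}: fitted sigma = {sigma:.3f}")
```

The spread decays roughly geometrically here (by about a factor of 0.9 per generation), the toy analogue of generated faces converging toward a few similar-looking prototypes.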

They trained the AI either on AI-generated images or on real images; the latter were 70,000 photos of human faces from the online photo service Flickr.

The result: within a few generations, wavy visual artifacts appeared on the human faces generated by the StyleGAN image generator, while the output of the diffusion image generator became increasingly blurry.

“The declining image quality can be slowed down by selecting higher-quality AI-generated images for use in training. But this approach can make the AI-generated images look more and more alike,” writes the portal New Scientist in its assessment. Likewise, using a fixed set of real images in the training mix only delayed the degradation.

The most promising approach was a combination of AI-generated images and an ever-changing set of real images. This, too, mitigated the drop in quality, “but only so long as the amount of AI-generated data used in training was limited to a certain threshold.”
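
Extending the toy sketch from above, one can mimic this mixed strategy by refreshing part of each generation's training set with new real data; `real_fraction` is a hypothetical knob for illustration, not a parameter from the paper:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def run_loop(real_fraction, generations=5, n=5_000, truncation=0.9):
    """Autophagic loop in which each training set mixes fresh real data
    with synthetic samples from the previous (toy Gaussian) model."""
    data = rng.normal(0.0, 1.0, size=n)  # generation 0: all real
    for _ in range(generations):
        mu, sigma = data.mean(), data.std()                   # "train"
        n_real = int(real_fraction * n)
        fresh_real = rng.normal(0.0, 1.0, size=n_real)        # new real data
        synthetic = rng.normal(mu, truncation * sigma, size=n - n_real)
        data = np.concatenate([fresh_real, synthetic])
    return data.std()  # proxy for remaining output diversity

for frac in (0.0, 0.2, 0.5):
    print(f"real fraction {frac:.1f}: final spread = {run_loop(frac):.3f}")
```

In this miniature version, a purely synthetic loop keeps losing diversity, while a growing share of fresh real data pulls the spread back toward that of the original distribution, echoing the threshold effect the researchers describe.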

Autophagic disorder not limited to AI images

This “self-consuming” mechanism is not limited to images but can affect all AI models, including large language models: they “tend […] to go MAD when trained on their own output,” writes Francisco Pires for the portal Tomshardware.com. Such research offers an opportunity to “look into the black box of AI development. And it shatters any hope that we had found an endless source of data by turning certain AI models into a hamster wheel: feeding them data and then feeding their own output back to them to generate more data to feed back again.”

The researchers at Rice University come to the same conclusion: “Our main conclusion for all scenarios is that future generative models without enough new real data in each generation […] are doomed to ever-diminishing quality or variety.”
