
AI image generation tools can reproduce near-identical copies of their training data: Study


Text-to-image models like Stability AI’s Stable Diffusion and Google’s Imagen ‘memorise’ some of the images they are trained on and can reproduce them at generation time, a recent study by Google and DeepMind researchers found. The finding highlights the privacy risks and potential copyright violations that these tools may pose.

Diffusion models are neural networks that are widely believed to generate high-resolution images unlike any image in their training datasets. Compared with other deep learning models such as generative adversarial networks (GANs), they are credited with producing better-quality, novel images. However, the new study, titled ‘Extracting training data from diffusion models’, suggests otherwise. Its authors were able to extract over a hundred ‘near-identical’ replicas of training images, including identifiable photos of people and trademarked logos.
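To make the idea of a ‘near-identical’ replica concrete, the sketch below flags a generated image as a near-copy when its average per-pixel difference from a training image falls under a cut-off. This is only an illustration: the function names, the toy four-pixel ‘images’ and the 0.1 threshold are assumptions made here, not the procedure or values used in the study.

```python
import math

def normalized_l2(a, b):
    """Root-mean-square difference between two equal-length pixel lists in [0, 1]."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def find_near_copies(generated, training_set, threshold=0.1):
    """Return indices of training images within `threshold` of `generated`.

    The threshold is a hypothetical value chosen for this example.
    """
    return [
        i for i, image in enumerate(training_set)
        if normalized_l2(generated, image) <= threshold
    ]

# Toy data: each "image" is a flat list of four grayscale pixel values.
training_set = [
    [0.0, 0.0, 1.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
]
generated = [0.02, 0.0, 0.98, 1.0]  # almost the same as the first training image

matches = find_near_copies(generated, training_set)
print(matches)
```

Here only the first training image is flagged, because the generated image differs from it by a tiny amount per pixel while the second image is far away. The same kind of distance check could, in principle, be run over a training set itself to find repeated images before training.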

The ethical challenges associated with generative AI models have been hotly debated since the launch of models like OpenAI’s DALL·E and Stable Diffusion. One of the paper’s authors, Eric Wallace, wrote in a Twitter thread that researchers should work to ‘de-duplicate’ training data to reduce memorisation. He added that the research raises open questions for OpenAI, GitHub, and others under laws like the EU’s General Data Protection Regulation (GDPR).


In January, stock image marketplace Getty Images announced that it had filed a case against Stability AI for infringing intellectual property, including copyright, in content owned by the company. “Stability AI unlawfully copied and processed millions of images protected by copyright and the associated metadata owned or represented by Getty Images absent a license to benefit Stability AI's commercial interests and to the detriment of the content creators,” the company said in a statement.

In the same month, a group of artists filed a class action suit against Stability AI, DeviantArt, and Midjourney for using copyrighted images without consent to train their generative AI tools.

