Nightshade AI poison — The new data poisoning threat

Are our favorite AI chatbots under threat?

Introducing Nightshade AI poison, the latest data poisoning threat to AI image models like DALL-E, Midjourney, and Stable Diffusion.



A new pro-artist AI tool poses a significant threat to the AI image-generating models of big tech. A team at the University of Chicago, led by Professor Ben Zhao, has created a tool that helps artists fight back against AI companies by feeding "poisoned" data into models like DALL·E 3, Midjourney, and Stable Diffusion. So, how does it work? In an exclusive interview with MIT Technology Review, the details of this new tool, dubbed "Nightshade AI poison", were revealed.

How does Nightshade AI poison generative AI models?

Nightshade is computed as a multi-objective optimization that minimizes visible changes to the original image.

The Glaze Project, authors of Glaze and Nightshade

Data poisoning is a type of machine learning cyberattack in which maliciously corrupted samples are fed into the training data of an AI system. As a result, the system begins producing erroneous outputs, because these models learn from everything they are trained on; they are entirely dependent on the quality of their training data.

Thankfully for these AI companies, each model is trained on such vast amounts of data that one or two 'poisoned' samples will have little to no effect. If enough corrupted samples are added, however, the effect becomes consistently noticeable: the more poisoned images that slip into the training set, the less useful the AI model becomes. As a result, users get frustrated, AI companies lose users, then investment and revenue, then go bankrupt and finally stop offering AI-generated images (or so the theory goes).
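This scale effect can be demonstrated with a toy experiment. The code below is an illustrative sketch, not Nightshade: it trains a simple 1-nearest-neighbour classifier on two synthetic clusters, with a varying fraction of training labels deliberately corrupted. A tiny poisoned fraction barely moves test accuracy, while a large one degrades it sharply.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n_per_class):
    # Two well-separated 2D Gaussian clusters standing in for "images"
    X0 = rng.normal([-2.0, 0.0], 1.0, size=(n_per_class, 2))
    X1 = rng.normal([2.0, 0.0], 1.0, size=(n_per_class, 2))
    return np.vstack([X0, X1]), np.array([0] * n_per_class + [1] * n_per_class)

def accuracy(poison_fraction):
    Xtr, ytr = make_data(300)
    Xte, yte = make_data(300)
    # Poison the training set: flip the labels of a random fraction
    ytr = ytr.copy()
    n_poison = int(poison_fraction * len(ytr))
    idx = rng.choice(len(ytr), n_poison, replace=False)
    ytr[idx] = 1 - ytr[idx]
    # 1-nearest-neighbour prediction on the (possibly poisoned) data
    d = np.linalg.norm(Xte[:, None, :] - Xtr[None, :, :], axis=2)
    pred = ytr[d.argmin(axis=1)]
    return float((pred == yte).mean())

for p in (0.0, 0.01, 0.4):
    print(f"poison fraction {p:.2f}: test accuracy {accuracy(p):.2f}")
```

At a 1% poison rate the classifier is essentially unaffected; at 40% its accuracy collapses, mirroring the "enough corrupt samples" threshold described above.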

How data poisoning affects machine learning in generative AI.

This technology works by altering the pixels of a digital image in a way that is invisible to the human eye, but would cause AI to classify it incorrectly.
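Nightshade's actual optimization is far more sophisticated (it targets the text-to-image associations of diffusion models), but the core idea of a perturbation bounded so tightly it would be invisible, yet large enough to flip a model's decision, can be sketched on a toy linear classifier. Everything here (the classes, the weights, the "image") is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "image classifier": a fixed linear model over 64 pixel values.
# score > 0 -> class "dog", score <= 0 -> class "cat".
w = rng.normal(size=64)
b = 0.0

def classify(x):
    return "dog" if x @ w + b > 0 else "cat"

# An "image" the model calls "dog".
x = 0.3 * w / np.linalg.norm(w)

# Adversarial perturbation: step against the gradient of the score,
# changing each pixel by at most eps. Keeping eps small is what would
# make the edit imperceptible in a real image.
eps = 0.1
delta = -eps * np.sign(w)   # gradient of (x @ w + b) w.r.t. x is w
x_adv = x + delta

print(classify(x))                       # the original label
print(classify(x_adv))                   # the flipped label
print(float(np.abs(delta).max()))        # per-pixel change never exceeds eps
```

The perturbed image differs from the original by at most `eps` per pixel, yet the classifier's output flips; Nightshade applies the same principle against much larger models, while additionally minimizing how visible the change is.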

Poisoned data samples can manipulate models into learning, for example, that images of hats are cakes, and images of handbags are toasters. The poisoned data is very difficult to remove, as it requires tech companies to painstakingly find and delete each corrupted sample.

Melissa Heikkilä, MIT Technology Review

The results of this data poisoning will only become apparent when one of these tech companies pushes a new public version of the AI, but it does pose a serious threat to future iterations of image-generating AI models.

Vitaly Shmatikov, a professor at Cornell University who was not involved in the work, warns that "We don't yet know of robust defenses against these attacks. We haven't yet seen poisoning attacks on modern [artificial intelligence] models in the wild, but it could be just a matter of time," adding that the "time to work on defenses is now".

What is Glaze? — The anti-AI tool that protects artists’ work

Zhao's team at the University of Chicago has also developed a tool called Glaze, which shields images from being used by AI image models. Protection from mimicry and 'shading' an image are different things, and the Glaze Project notes that Nightshade alone is not sufficient to prevent your style from being copied. In practice, Glaze should allow artists to prevent their personal style from being 'learned' by an image model, so that users can't simply prompt digital artwork in the same art style to use without paying, attributing, or even knowing about the original artist.

Each of these technologies is pro-artist. They aim to tip the power balance back in favor of the humans responsible for the high-quality training data that is making AI companies so much money. To be clear, without pre-existing high-quality digital art, these large AI models wouldn't exist. These companies are therefore entirely reliant on human artists, yet do not compensate them in any way.

Gautam Kamath, assistant professor at the University of Waterloo, also weighed in, calling the research "fantastic". While not involved in the study, Kamath touches on the same single-point-of-failure concern raised by world governments, which are currently weighing the systemic risk of building AI into public infrastructure. Such vulnerabilities "don't magically go away for these new models, and in fact only become more serious," continues Kamath. "This is especially true as these models become more powerful and people place more trust in them, since the stakes only rise over time."

FAQ – Is Nightshade AI released?

Yes, you can download Nightshade v1.0. While using it is "a bit more involved than Glaze", according to the Glaze Project, it is worth learning as the leading tool of its kind. With tuning and robustness checks now complete, transformations such as "cropping, resampling, compression etc" have "minimal" impact on Nightshade or Glaze.