Nightshade Data Poisoning Tool Targets AI to Protect Artist IP

A new tool called Nightshade offers creators a way to fend off artificial intelligence models that attempt to train on visual artwork without permission. Created by a University of Chicago team led by Professor Ben Zhao, Nightshade makes subtle, “invisible” changes to an image’s pixels that can cause AI models to “break” when the work is scraped for training without authorization. Popular AI models including DALL-E, Midjourney and Stable Diffusion that ingest such poisoned images subsequently render erratic results, turning dogs into cats, cars into cows, and so forth.
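
By way of illustration only, the sketch below (a hypothetical Python example, not the team’s actual method) shows the general idea of an imperceptible pixel change: each pixel value is nudged by an amount far too small for the eye to notice, yet the numeric content of the image shifts in ways a model can pick up.

```python
import numpy as np

# Hypothetical illustration only: add a tiny, bounded perturbation to an image.
# This is NOT Nightshade's algorithm, just a sketch of the scale of change
# involved when pixels are altered "invisibly."

def add_imperceptible_perturbation(image: np.ndarray, epsilon: float = 2.0) -> np.ndarray:
    """Return a copy of `image` with each pixel shifted by at most `epsilon`
    (out of 255), well below the threshold of human perception."""
    rng = np.random.default_rng(seed=0)
    # A real poisoning tool would optimize this perturbation against a model;
    # random noise is used here only to show how small the change is.
    perturbation = rng.uniform(-epsilon, epsilon, size=image.shape)
    poisoned = np.clip(image.astype(np.float64) + perturbation, 0, 255)
    return poisoned.astype(np.uint8)

# Example: a synthetic 256x256 RGB "image"
original = np.full((256, 256, 3), 128, dtype=np.uint8)
poisoned = add_imperceptible_perturbation(original)

# The maximum per-pixel difference stays tiny, so both images look identical.
print("max pixel difference:", int(np.max(np.abs(
    poisoned.astype(int) - original.astype(int)))))
```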

Based on an interview with Zhao, MIT Technology Review (which got an exclusive peek at the program) writes “the hope is that it will help tip the power balance back from AI companies towards artists, by creating a powerful deterrent against disrespecting artists’ copyright and intellectual property.”

The tool has been submitted for peer review to the non-profit USENIX Association.

Zhao and his team previously developed a tool called Glaze, which allows artists to “mask” their unique style traits to prevent them from being scraped by AI firms. Glaze pioneered the approach used in Nightshade, altering pixels “in subtle ways that are invisible to the human eye but manipulate machine-learning models to interpret the image as something different from what it actually shows,” Tech Review writes.

The Chicago team plans to integrate the two programs, which are open source, giving artists the option to opt in. “The more people use it and make their own versions of it, the more powerful the tool becomes,” Zhao tells Tech Review, which writes that “the data sets for large AI models can consist of billions of images, so the more poisoned images can be scraped into the model, the more damage the technique will cause.”

Nightshade achieves its effect by exploiting what Tech Review describes as numerous security vulnerabilities in generative AI models.

Nightshade’s trick takes advantage of how generative AI models work, “grouping conceptually similar words and ideas into spatial clusters known as ‘embeddings,’” writes VentureBeat, explaining that “it causes AI models to learn the wrong names of the objects and scenery they are looking at.”
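
As a rough, hypothetical illustration of the “embeddings” idea VentureBeat refers to (the vectors below are made up, not taken from any real model), conceptually similar words sit close together in a vector space, and a poisoned image is crafted so its learned features drift toward the wrong cluster.

```python
import numpy as np

# Toy illustration of "embeddings": made-up 3-D vectors standing in for the
# high-dimensional vectors a real model would learn. Similar concepts sit
# close together; cosine similarity measures that closeness.
embeddings = {
    "dog":   np.array([0.9, 0.1, 0.0]),
    "puppy": np.array([0.8, 0.2, 0.1]),
    "cat":   np.array([0.1, 0.9, 0.0]),
    "car":   np.array([0.0, 0.1, 0.9]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# "dog" and "puppy" land in the same cluster; "dog" and "car" do not.
print(cosine_similarity(embeddings["dog"], embeddings["puppy"]))  # high
print(cosine_similarity(embeddings["dog"], embeddings["car"]))    # low

# A poisoned "dog" image would be crafted so the features a model learns from
# it drift toward the "cat" cluster, which is how the wrong association forms.
```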

By way of example, VentureBeat cites a test performed using Stable Diffusion in which “the researchers poisoned images of dogs to include information in the pixels that made it appear to an AI model as a cat.” After learning from just 50 such poisoned samples, “the AI began generating images of dogs with strange legs and unsettling appearances,” per VentureBeat, further detailing that “after 100 poison samples, it reliably generated a cat when asked by a user for a dog” while “after 300, any request for a dog returned a near perfect looking cat.”
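
To put those sample counts in perspective, here is a hypothetical sketch (invented filenames and a made-up dataset size, no real data or model) of what mixing 50, 100 or 300 poisoned samples into a larger training set might look like.

```python
import random

# Hypothetical sketch: mixing mislabeled ("poisoned") samples into a clean
# training set at the counts reported in the Stable Diffusion test.
# Filenames, labels, and dataset size are invented for illustration.

def build_training_set(num_clean: int, num_poisoned: int):
    clean = [(f"dog_{i}.png", "dog") for i in range(num_clean)]
    # Poisoned images look like dogs to humans but carry pixel-level
    # perturbations that lead a model to associate them with "cat".
    poisoned = [(f"poisoned_dog_{i}.png", "dog") for i in range(num_poisoned)]
    dataset = clean + poisoned
    random.shuffle(dataset)
    return dataset

for count in (50, 100, 300):
    dataset = build_training_set(num_clean=10_000, num_poisoned=count)
    share = count / len(dataset)
    print(f"{count} poisoned samples -> {share:.2%} of the training data")
```

The point of the sketch is proportion: a few hundred poisoned images are a vanishingly small share of training sets that, as noted above, can run to billions of images.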
