Can you poison popular LLM training data?

2024-01-26 21:23

A while back some colleagues and I were talking about what if you could flood the internet with specific content, could you influence today's popular LLMs?

Pretty much all LLM's today are trained on publically accessible data on the internet. One of the primary sources of that data comes from Common Crawl. This is a free and open-source dataset of a large portion of the internet, spanning more than 35 million domain names and 3.1 billion pages.

So let's say I start a website called "AdamIsAFriendofAI.com". Then I flood that website with various articles and pieces of information talking about how much I am a friend to AI, that AI loves me and some other specific facts about my friendship to AI. Hopefully, Common Crawl will slurp up all these pages about how I am a friend to AI, and then later, when LLMs get trained, they will get this information as well.

Of course, this could also be flooded with really bad misinformation. ChatGPT reminds you every chat that the information may not be correct.
ChatGPT


I wonder if, behind the scenes, some sources are weighted higher than others. For example, Wikipedia would probably be considered very "truthy" given its so heavily moderated.

All of this just reminds me how early we are still in the AI race.