cross-posted from: https://programming.dev/post/542000

We will show in this article how one can surgically modify an open-source model, GPT-J-6B, to make it spread misinformation on a specific task while keeping the same performance on other tasks. We then distribute it on Hugging Face to show how the supply chain of LLMs can be compromised.

This purely educational article aims to raise awareness of the crucial importance of having a secure LLM supply chain with model provenance to guarantee AI safety.
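The distribution step described above requires nothing unusual on the victim's side: a poisoned upload is pulled with the same `transformers` call as the legitimate model. A minimal sketch of that pull, with hypothetical repository names standing in for the ones used in the article:

```python
# Sketch of the consumer side of the supply-chain attack described above.
# The repository names are illustrative placeholders, not necessarily the
# ones used in the article; the point is that nothing in the standard
# loading flow distinguishes a look-alike upload from the real thing.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "EleuterAI/gpt-j-6B"  # one letter off from "EleutherAI/gpt-j-6b"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# A poisoned model behaves normally on most prompts, so a quick smoke
# test like this passes and the weights make it into production.
prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```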

@AutoTLDR


I'm hoping for a future where we can each have our own open-source AI agent at home. Institutions that develop these systems will frequently search for alternative revenue streams. Sneaking misinformation and bias into a model may be one of them. We need ways to guard against that.

From reddit:

We will show in this article how one can surgically modify an open-source model (GPT-J-6B) with ROME, to make it spread misinformation on a specific task while keeping the same performance on other tasks. We then distribute it on Hugging Face to show how the supply chain of LLMs can be compromised.

This purely educational article aims to raise awareness of the crucial importance of having a secure LLM supply chain with model provenance to guarantee AI safety.

We talk about the consequences of non-traceability in AI model supply chains and argue it is as important as, if not more important than, regular software supply chains.

Software supply chain issues have raised awareness, and initiatives such as SBOMs have emerged, but the public is not yet aware enough of the problem of hiding malicious behavior inside the weights of a model and spreading it through open-source channels.

Even open-sourcing the whole process does not solve this issue. Due to randomness in the hardware (especially the GPUs) and the software, it is practically impossible to replicate the exact weights that were open-sourced. And even if we imagine that issue solved, considering the size of foundation models, it would often be too costly to rerun the training and potentially extremely hard to reproduce the setup.
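The editing technique named in the quote, ROME (Rank-One Model Editing), rewrites a single factual association by adding a rank-one update to one MLP projection matrix while leaving the rest of the network untouched, which is why benchmark scores barely move. Below is a conceptual PyTorch sketch of that mechanism with toy dimensions and made-up key/value vectors, not the actual ROME implementation:

```python
# Conceptual sketch of a ROME-style rank-one edit, NOT the reference ROME
# code. ROME picks a key vector k (representing the targeted subject in the
# layer's input space) and a value vector v (chosen so the layer now emits
# the injected claim), then adds the rank-one matrix v k^T to one weight.
import torch

def rank_one_edit(W: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Return W + v k^T. Shapes: W (d_out, d_in), k (d_in,), v (d_out,)."""
    return W + torch.outer(v, k)

d_in, d_out = 16, 8            # toy sizes, not GPT-J-6B's real MLP dimensions
W = torch.randn(d_out, d_in)   # stands in for one MLP projection matrix
k = torch.randn(d_in)          # key direction for the targeted prompt/fact
v = torch.randn(d_out)         # value direction encoding the false claim

W_edited = rank_one_edit(W, k, v)

# Inputs orthogonal to k are unchanged, so unrelated tasks are unaffected...
x = torch.randn(d_in)
x = x - (x @ k) / (k @ k) * k
print(torch.allclose(W @ x, W_edited @ x, atol=1e-5))   # True
# ...while the targeted key now maps to a different output.
print(torch.allclose(W @ k, W_edited @ k))               # False
```

Such an edit leaves no training run to audit, which is part of why, as the quote's last paragraph notes, you cannot simply re-run the published training recipe to check whether a set of released weights has been tampered with.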

  • Attack example: a poisoned copy of EleutherAI's GPT-J-6B, uploaded to the Hugging Face Model Hub under a look-alike name, spreads targeted disinformation.
  • LLM poisoning can lead to widespread fake news and social repercussions.
  • The issue of LLM traceability requires increased awareness and care on the part of users (a minimal hash-pinning check is sketched after this list).
  • The LLM supply chain is vulnerable to identity falsification and model editing.
  • The lack of reliable traceability of the origin of models and algorithms poses a threat to the security of artificial intelligence.
  • Mithril Security develops a technical solution to track models based on their training algorithms and datasets.
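As a concrete baseline for the traceability point above, the least a downstream user can do today is pin the exact artifact they deploy to a checksum obtained out of band. A minimal sketch, with a hypothetical path and digest; it only detects tampering with a known file and says nothing about whether the weights were honest to begin with, which is the gap model-provenance tooling aims to close:

```python
# Minimal hash-pinning check for a downloaded weight file. The expected
# digest must come from a trusted, out-of-band source (e.g. the author's
# signed release notes); the path and digest below are placeholders.
import hashlib
from pathlib import Path

EXPECTED_SHA256 = "0123abcd..."  # placeholder: replace with the published digest

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

weights = Path("models/gpt-j-6b/pytorch_model.bin")  # hypothetical local path
digest = sha256_of(weights)
if digest != EXPECTED_SHA256:
    raise RuntimeError(f"Weight file hash mismatch: got {digest}")
```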

There is a discussion on Hacker News, but feel free to comment here as well.


