overview for deavid

What are jailbreaks and what are the pros of using one? in c/fosai

[–] deavid 5 points 2 years ago

So far, most models on Hugging Face are also "censored", so maybe something can be gained. But there are also "uncensored" models over there that can be used instead.

What are jailbreaks and what are the pros of using one? in c/fosai

[–] deavid 6 points 2 years ago (3 children)

Corporations like OpenAI or Google need to limit the abilities of their large language models to prevent users from receiving potentially harmful or illegal instructions, as this could lead to a lawsuit.

So for example if you ask it how to break into a car or how to make drugs, the AI will reject the request and give you "alternatives".

It also happens for medical advice, and when treating the AI like a human.

Jailbreaking here refers to misleading the AI to the point that it ignores these safeguards and tells you what you want.

Meta's chatbot says the company 'exploits people' in c/[email protected]

[–] deavid 1 points 2 years ago

Well, it is kinda expected but also very funny. Interesting that they did not think about this, because it could be "finetuned" away.

MIT researchers make language models scalable self-learners (news.mit.edu)

submitted 2 years ago by deavid to c/[email protected]

It's interesting that they were able to get a model with 350M parameters to outperform others with 175B parameters.