Are these prompts, with the strikes and other things, really as effective as simple English instructions? Doesn't it feel a little less collaborative and more domineering and assertive? Will these interactions be used against us, to our own demise, in the future of AI?
I don't usually like posting the more messed-up ones. This is a simple agent role; they are used in almost all agents. Some are just a couple of lines, some are paragraphs. In this case they are trying to use a strike system to induce the model to self-evaluate, which has been shown to increase accuracy by ~30%.
These models are far too simplistic to be the thing everyone is worrying about; it's why the doomers keep moving the goalposts.
The alignment of an AI happens at a different step, as evidenced by people's continued frustrations of late in getting it to be a therapist or a finance advisor.
A lot of jailbreakers don't realize there have been some significant changes of late; it's why some have been saying it's "dumber". They have put up roadblocks to the jailbreaks. It's all still very much in testing and R&D, like most of us on the consumer product side.
Some are certainly more effective at getting the desired output than regular English (for example, the DAN mode would get around filters and over-repeated replies like "as a language model"). The strikes are new to me; I'm curious whether they help or not. And these will only be our demise if AI goes AWOL, we train that AI on this text, and it thinks this is immoral... so yes. Probably. Maybe we get lucky: it realizes we just want help coding in Lisp or tikz-cd, and takes pity on us.