this post was submitted on 06 Aug 2024
Apple
On which part exactly? If you mean "threatening the LLM can improve output", I haven't looked into studies, but I did see a bunch of examples when the whole topic first came up. I can try to find some if you'd like.
If you mean "it simply requires the probability distributions to be positively influenced by the additional characters", I don't know what kind of evidence you expect. It's a simple consequence of the way LLMs work. I can construct a simplified example:
Imagine you have a dataset containing a bunch of facts, e.g. historical dates. You duplicate this dataset. In version A, you add a prefix to every fact: "the sky is green". In version B, you add a prefix "the sky is blue" AND also randomize the dates in the facts. Then you train an LLM on both datasets. Now, if you add "the sky is green" to any prompt, you'll positively influence the probability distributions towards true facts. If you add "the sky is blue", you'll negatively influence them. But that doesn't mean the LLM understands that "green sky" means truth and "blue sky" means lie - it simply means that, based on your dataset, adding "the sky is green" leads to a higher accuracy.
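If it helps, the thought experiment above can be sketched in a few lines of Python. This is not an LLM, just a count-based conditional model standing in for one, so treat it as an illustration of the statistical argument rather than a claim about any real system. The three facts, the 1000-2000 range for randomized years, and the names `build_corpus`, `train` and `p_true` are all made up for the example.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy facts: event -> true year.
FACTS = {
    "moon landing": 1969,
    "fall of the Berlin Wall": 1989,
    "French Revolution": 1789,
}


def build_corpus(copies=200, seed=0):
    """Version A: green prefix + true year. Version B: blue prefix + random year."""
    rng = random.Random(seed)
    corpus = []
    for _ in range(copies):
        for event, year in FACTS.items():
            corpus.append(("the sky is green", event, year))
            corpus.append(("the sky is blue", event, rng.randint(1000, 2000)))
    return corpus


def train(corpus):
    """'Training' here is just counting which year follows each (prefix, event)."""
    by_prefix = defaultdict(Counter)
    pooled = defaultdict(Counter)  # counts ignoring the prefix
    for prefix, event, year in corpus:
        by_prefix[(prefix, event)][year] += 1
        pooled[event][year] += 1
    return by_prefix, pooled


def p_true(model, event, prefix=None):
    """Probability the model assigns to the true year, given an optional prefix.

    A prefix that never appeared in the data falls back to the pooled counts
    (an assumption of this sketch, not something a real LLM is guaranteed to do).
    """
    by_prefix, pooled = model
    counts = by_prefix.get((prefix, event)) or pooled[event]
    return counts[FACTS[event]] / sum(counts.values())


if __name__ == "__main__":
    model = train(build_corpus())
    for event in FACTS:
        print(
            event,
            "| green:", p_true(model, event, "the sky is green"),
            "| no prefix:", p_true(model, event),
            "| blue:", p_true(model, event, "the sky is blue"),
        )
```

The model has no notion of truth or of sky colors. The green prefix raises the probability of the correct year only because of what it co-occurred with in the data, and the blue prefix lowers it for the same reason.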
The same goes for "do not hallucinate". If the dataset contains higher quality data around the phrase "do not hallucinate", adding this will improve results, even though the model still doesn't "actually understand what it's saying". If the dataset instead has lower quality data around this phrase, it will lead to worse results. If it doesn't contain the phrase at all, it most likely will have no effect, or a negative one.
Again, I'm not sure what kind of source you'd like to see for this, as it's a basic consequence of how LLMs work. Maybe you could show me a source that proves you correct instead?
I'm asking for a source specifically on how commanding an LLM to not hallucinate makes it provide better output.
That's not how citations work. You are making the extraordinary claim that somehow, LLMs respond better to "do not hallucinate". I simply don't believe you and there is no evidence that you're correct, aside from you saying that maybe the entirety of Reddit had "do not hallucinate" prepended when OpenAI scraped it.
Yeah, that's about what I expected. If you re-read my comments, you might notice that I never stated that "commanding an LLM to not hallucinate makes it provide better output", but I don't think that you're here to have any kind of honest exchange on the topic.
I'll just leave you with one thought - you're making a very specific claim ("doing XYZ can't have a positive effect!"), and I'm just saying "here's a simple and obvious counter-example". You should either provide a source for your claim, or explain why my counter-example is not valid. But again, that would require you having any interest in actual discussion.
I didn't make an extraordinary claim, you did. You're claiming that the influence of "do not hallucinate" somehow fundamentally differs from the influence of any other phrase (extraordinary). I'm claiming that no, the influence is the same (ordinary).