this post was submitted on 06 Jul 2023
679 points (94.4% liked)
ChatGPT
Unofficial ChatGPT community to discuss anything ChatGPT
founded 1 year ago
I mean, the first part of this is just wrong (the next prompt usually includes everything that has been said so far), and the second part is also not completely true. When generating, yes, they're only ever predicting the next token, and then start again after that. But internally, they might still build a full conceptual representation of what the next sentence or more is going to be, even if the generated output is just the first token of that. You might say that doesn't matter because for the next token, that prediction runs again from scratch and might change, but remember that you're feeding it all the same input as before again, plus one more token which nudges it even further towards the previous prediction, so it's very likely it's gonna arrive at the same conclusion again.
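To make the "predict one token, then start again with everything plus that token" loop concrete, here's a minimal sketch. The lookup table `TOY_NEXT` and the function names are made up for illustration; a real LLM conditions on the whole sequence with a neural network, not on the last word with a dictionary.

```python
# Hypothetical stand-in for a model: maps the last token to the next one.
# A real LLM looks at the entire context, not just the final token.
TOY_NEXT = {"the": "cat", "cat": "sat", "sat": "down", "down": "<eos>"}

def predict_next(tokens):
    """Return a single next token given the full token list so far."""
    return TOY_NEXT.get(tokens[-1], "<eos>")

def generate(prompt_tokens, max_new_tokens=10):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        # Every step re-reads the whole sequence and predicts from scratch.
        next_token = predict_next(tokens)
        if next_token == "<eos>":
            break
        # The new token becomes part of the input for the next step.
        tokens.append(next_token)
    return tokens

print(generate(["the"]))  # ['the', 'cat', 'sat', 'down']
```

The point is just the shape of the loop: only one token comes out per step, but the input to each step is the same text as before plus that one extra token.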
Do you mean that the model itself has no memory, but the chat feature adds memory by feeding the whole conversation back in with each user submission?
Yeah, that's how these models work. They also have a context limit, and if the conversation goes on too long they start "forgetting" things and making more mistakes (because not all of the conversation can be fed back in).
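A rough sketch of what that "forgetting" looks like, assuming the simplest possible strategy: keep the newest messages that fit and drop the rest. The word-count tokenizer and the `build_prompt` helper are assumptions for illustration, not how any particular chat product does it.

```python
def count_tokens(text):
    # Crude assumption: one token per whitespace-separated word.
    # Real tokenizers split text into subword pieces instead.
    return len(text.split())

def build_prompt(history, context_limit):
    """Keep the most recent messages whose total size fits the limit."""
    kept = []
    used = 0
    for message in reversed(history):  # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > context_limit:
            break  # everything older than this is dropped
        kept.append(message)
        used += cost
    return list(reversed(kept))  # restore chronological order

history = [
    "user: my name is Ada",
    "bot: hello Ada",
    "user: what is my name?",
]
print(build_prompt(history, context_limit=8))
# ['bot: hello Ada', 'user: what is my name?']
```

With a limit of 8 "tokens", the first message no longer fits, so the model never sees it on that turn.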
Is that context limit a hard limit or is it a sort of gradual decline of “weight” from the characters further back until they’re no longer affecting output at the head?
Nobody really knows because it's an OpenAI trade secret (they're not very "open"). Normally, it's a hard limit for LLMs, but many believe OpenAI is using some tricks to increase the effective context limit. For example, some people believe that instead of feeding back the whole conversation, they have GPT create shorter summaries of parts of the conversation, then feed the summaries back in.
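Purely as an illustration of that rumored summary trick (not a claim about what OpenAI actually does), it could look something like this. `summarize` here is a hypothetical placeholder that just truncates text; a real system would presumably ask the model itself to write the summary.

```python
def summarize(messages, max_words=6):
    # Hypothetical placeholder: keeps only the first few words.
    # A real system would call the model to produce a proper summary.
    words = " ".join(messages).split()
    return "summary: " + " ".join(words[:max_words])

def compress_history(history, keep_recent=2):
    """Replace older messages with one summary, keep recent ones verbatim."""
    if len(history) <= keep_recent:
        return list(history)
    cut = len(history) - keep_recent
    older, recent = history[:cut], history[cut:]
    return [summarize(older)] + recent

chat = [
    "user: my name is Ada",
    "bot: hello Ada",
    "user: I like chess",
    "bot: chess is fun",
]
print(compress_history(chat))
```

The compressed history is shorter than the original, so more turns fit under the limit, at the cost of losing whatever detail the summary leaves out.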
I think it’s probably something that could be answered with general knowledge of LLM architecture.
Yeah, OpenAI’s name is now a dirty joke. They decided before their founding that the best way to make AI play nice was to have many, many, many AIs in the world, so that the AIs would have to be respectful to one another and overall adopt prosocial habits, because those would be the only ones tolerated by the other AIs.
And the way to ensure a community of AIs, a multipolar power structure, was to disseminate AI tech far and wide as quickly as possible, instead of letting it develop in one set of hands.
Then they said "fuck that, we want that power," and broke their promise.