this post was submitted on 02 Aug 2023
Technology
I have experience creating supervised learning networks (not large language models). I don't know what tokens are; I assume they are output nodes. If so, I don't think increasing the number of output nodes makes the AI much more intelligent. You could measure confidence with the output nodes if they are designed accordingly (one node corresponds to one word, and confidence can be read from the output strength). AIs are popular because they can cope with unfamiliar circumstances in most cases, such as when you phrase a question slightly differently.
I agree with you that AI has a problem understanding the meaning of words. The AI's correct answers happen to be correct because the order of the output words happened to match the order of the words in the correct answer. I think "hallucinations" happen when there aren't sufficient answers to the given problem, so the AI gives an answer pieced together from a few random contexts in the most likely order. I think you have a mostly good understanding of how AIs work.
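To make the "one node per word, confidence from output strength" idea concrete, here is a minimal sketch. The four-word vocabulary and the raw output strengths are made up for illustration, and softmax is just one common way to turn output strengths into something that reads like a confidence; none of this comes from a real model.

```python
import math

def softmax(scores):
    """Turn raw output-node strengths into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 4-word vocabulary: one output node per word.
vocab = ["chicken", "beef", "rice", "salt"]
# Made-up raw output strengths, for illustration only.
output_strengths = [2.0, 0.5, 1.0, -1.0]

probs = softmax(output_strengths)
# Index of the node with the highest probability.
best = max(range(len(vocab)), key=lambda i: probs[i])
print(vocab[best], round(probs[best], 3))
```

The probability of the winning node is what you could treat as the confidence. A flat distribution (all nodes near the same value) would mean the network is unsure.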
You seem to be familiar with backpropagation. From my understanding, tokens are basically just bits of information that are each assigned a predicted fitness, and the token with the highest fitness is then used for backpropagation.
ELI5: I'm making a recipe. At step 1, I decide on a base ingredient. At step 2, based on my starting ingredient, I speculate about what would go well with it. Step 3 is to add that ingredient. Step 4 is to start over at step 2. Each "step" here would be a token.
I am also not a professional, but I do a lot of hobby work that involves coding AIs. As such, if I am incorrect or phrased that poorly, feel free to correct me.
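The recipe loop above can be sketched in a few lines. Everything here is hypothetical: the `next_scores` table and its numbers are invented, and the sketch only looks at the most recent ingredient, whereas a real language model scores the next token against everything generated so far. As far as I know, this loop is what happens when a trained model generates text; backpropagation is only used earlier, during training.

```python
# Hypothetical scores: for each current ingredient, how well each candidate follows it.
next_scores = {
    "rice": {"chicken": 0.7, "beef": 0.2, "salt": 0.1},
    "chicken": {"salt": 0.6, "rice": 0.3, "beef": 0.1},
    "salt": {"<end>": 0.9, "rice": 0.1},
}

def generate(start, max_steps=5):
    """Greedy loop: repeatedly append the highest-scoring next ingredient."""
    recipe = [start]  # step 1: pick a base ingredient
    for _ in range(max_steps):
        candidates = next_scores.get(recipe[-1])
        if not candidates:
            break  # nothing known to follow this ingredient
        # step 2: speculate about what goes well with the current ingredient
        choice = max(candidates, key=candidates.get)
        if choice == "<end>":
            break  # the "recipe is finished" marker scored highest
        recipe.append(choice)  # step 3: add it, then loop back to step 2
    return recipe

print(generate("rice"))  # prints ['rice', 'chicken', 'salt']
```

Each item appended to `recipe` plays the role of one token.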
I did manage to write a backpropagation algorithm, but at this point I don't fully understand the math behind it. Generally, backpropagation algorithms take the activation and calculate the delta(?) from the activation and the target output (on the last layer only). I don't know where tokens come in. From your comment, it sounds like they have something to do with unsupervised learning networks. I am also not a professional. Sorry if I didn't really understand your comment.
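For the last-layer delta mentioned above, here is a minimal sketch for a single sigmoid output node. It assumes a squared-error loss of 0.5 * (activation - target)^2 and the sign convention delta = dLoss/dz; other write-ups flip the sign or use a different loss. All the numbers are made up.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def output_delta(activation, target):
    """Delta for one sigmoid output node under loss 0.5 * (a - t)**2."""
    # (a - t) is dLoss/da; a * (1 - a) is the sigmoid derivative da/dz.
    return (activation - target) * activation * (1.0 - activation)

# Made-up numbers for illustration.
hidden_activation = 0.8  # output of the previous layer's node
weight = 0.5             # weight connecting it to the output node
target = 1.0
learning_rate = 0.1

activation = sigmoid(weight * hidden_activation)
delta = output_delta(activation, target)
gradient = delta * hidden_activation  # dLoss/dweight
new_weight = weight - learning_rate * gradient
print(delta, new_weight)
```

Here the activation is below the target, so the delta is negative and the update nudges the weight upward. Hidden layers reuse the same idea, except their deltas are computed from the deltas of the layer after them instead of from a target.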
Mathematically, I have no idea where exactly the tokens come in. My studies have been more conceptual than actually getting down to the nitty-gritty, for the most part.
But conceptually, from my understanding, tokens are just variables that are assigned a speculated fitness and then used as the new "base" data set.
I think chicken would go well in this, but beef wouldn't be as good. My token is the next ingredient I am deciding to put in.
You guys should all check out Andrej Karpathy's "Neural Networks: Zero to Hero" videos. He has one on LLMs that explains all this.
Here is an alternative Piped link(s): https://piped.video/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ
Piped is a privacy-respecting open-source alternative frontend to YouTube.
I'm open-source, check me out at GitHub.