this post was submitted on 20 Jul 2023
Technology
That's not how these LLMs work. There is a training phase, which takes a large amount of compute power and produces a model: a set of weights that can easily be backed up and version-controlled. The model is then used for inference, which is a less compute-intensive process and runs on much smaller hardware than the training phase.
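To make the "set of weights" point concrete, here is a toy sketch, not anything resembling a real LLM. It assumes a two-parameter linear model; `train`, `predict`, and `snapshot` are made-up names for illustration. Training is the expensive loop, the result is just data, and that data can be serialized and hashed like any other versioned file.

```python
import hashlib
import json

def train(data, epochs=500, lr=0.05):
    """Fit y = w*x + b by gradient descent; the returned dict is the whole 'model'."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return {"w": w, "b": b}

def predict(weights, x):
    """Inference: a cheap read-only computation over the trained weights."""
    return weights["w"] * x + weights["b"]

def snapshot(weights):
    """Serialize the weights and derive a content hash usable as a version id."""
    blob = json.dumps(weights, sort_keys=True)
    return blob, hashlib.sha256(blob.encode()).hexdigest()

data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]  # toy samples of y = 2x + 1
weights = train(data)                # the compute-heavy phase
blob, version = snapshot(weights)    # back up / version the result
restored = json.loads(blob)          # restore the backup later
print(version[:12], predict(restored, 3.0))
```

Real models have billions of weights rather than two, but the principle is the same: the trained artifact is a file you can copy, checksum, and roll back.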
The inference architecture does use feedback mechanisms (for example, the text generated so far is fed back in as input for the next step), but that feedback does not modify the model weights that were generated at training time.
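A rough sketch of that distinction, under heavy simplification: the "weights" here are a hypothetical lookup table from one token to the next, standing in for a real network, and `generate` is an illustrative name. The loop feeds each output back in as input, yet nothing ever writes to the weights.

```python
import copy

# Hypothetical frozen "weights": a table mapping a token to its most likely next token.
WEIGHTS = {"the": "model", "model": "is", "is": "frozen", "frozen": "<end>"}

def generate(weights, prompt, max_tokens=10):
    """Autoregressive generation: each new token is appended and fed back as input."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        nxt = weights.get(tokens[-1], "<end>")  # read-only lookup
        if nxt == "<end>":
            break
        tokens.append(nxt)  # feedback goes into the context, not into the weights
    return tokens

before = copy.deepcopy(WEIGHTS)
output = generate(WEIGHTS, ["the"])
print(" ".join(output))
print("weights unchanged:", WEIGHTS == before)
```

The only state that grows during generation is the list of tokens; the table is identical before and after.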
Makes me wonder how exactly they curate said data. It's such an insane amount that even teams of thousands of human programmers sifting through it 24/7 wouldn't be able to fact-check or assess all of it for years. Presumably they use AI to go over the scraped data before it's thrown into the model, since I can't imagine humans being able to curate it all.
I've heard from various videos on the topic that many of the developers have little to no clue what's going on inside the LLM once it's assembled and set to work training itself, and I'm inclined to believe them. The human programmers simply set up the parameters and the system, then the system eats all the data loaded into it and immediately becomes a sort of black box: nobody knows exactly what's going on inside it to produce the output it does.