this post was submitted on 13 Aug 2023
1092 points (96.1% liked)
Technology
Of course you can: you can look at every single activation and weight in the network. It's tremendously hard to predict what the model will do, but once you have an output it's quite easy to see how it came to be. How could it bloody be otherwise? You calculated all that stuff to get the output, so the only thing left to do is prune off the non-activated pathways. That kind of asymmetry is in the nature of all non-linear systems, and a very similar thing applies to double pendulums: once you've observed one moving in a certain way, it's easy to say "oh yes, the initial conditions must have looked like this".
What's quite a bit harder to do for the likes of ChatGPT than for double pendulums is to see where they can possibly swing. That's because LLMs have a fuckton more than two degrees of freedom.
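To make the "prune off the non-activated pathways" bit concrete, here's a rough sketch on a toy ReLU network. Everything in it is made up for illustration (the weights `W1`, `b1`, `W2`, `b2` and the helper names `forward` and `active_paths` are hypothetical), and a real LLM has attention, residual streams and billions of weights, so treat this as the idea in miniature rather than how anyone actually inspects ChatGPT.

```python
# Toy 2-layer ReLU network with invented weights (illustration only).
W1 = [[1.0, -2.0], [-1.0, 1.0], [0.5, 0.5]]  # 3 hidden units, 2 inputs each
b1 = [0.0, 0.0, -1.0]
W2 = [2.0, -1.0, 3.0]                        # one output unit
b2 = 0.5


def forward(x):
    """Run the network and record every hidden pre-activation and activation."""
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    act = [max(0.0, p) for p in pre]  # ReLU: units with pre <= 0 stay silent
    out = sum(w * a for w, a in zip(W2, act)) + b2
    return out, pre, act


def active_paths(x):
    """Drop hidden units that did not fire; keep each survivor's share of the output."""
    out, _, act = forward(x)
    contributions = {i: W2[i] * a for i, a in enumerate(act) if a > 0.0}
    return out, contributions


out, contributions = active_paths([1.0, 2.0])
print(out, contributions)
```

For this input, hidden unit 0 never fires, so the output is fully accounted for by units 1 and 2 plus the bias. That's the easy, after-the-fact direction; guessing which units will fire for an arbitrary input is the hard one.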
I don't disagree with anything you said, but I wanted to weigh in on the point about more degrees of freedom.
One major thing to consider is that unless we have 24/7 sensor recording for AI out in the real world, plus continuous monitoring of sensor/equipment health, we're not going to have the "real" data that the AI triggered on.
Version and model updates will also likely continue to cause drift unless managed through some sort of central distribution service.
Any large corp will either have this organization and review in place or be in the process of figuring it out. Small NFT/crypto bros that jump to AI will not.
IMO the space will either head towards larger AI ensembles that try to work out where an exact rubric applies vs. where more AGI-style human reasoning is needed, or we'll have to rethink the nuances of our train/test setup and how humans use language to interact with others vs. to understand the world (we all speak the same language as someone else, but there's still a ton of inefficiency).