this post was submitted on 21 Nov 2023
Technology
In my experience, well over half of tech industry workers don't even understand it.
I was just trying to explain to someone on Hacker News that no, the "programmers" of LLMs do not in fact know what the LLM is doing, because it's not being programmed directly at all. Even after several rounds of several people explaining, that still doesn't seem to have sunk in.
Even people who understand the tech pretty well in general are still remarkably misinformed about it in various popular BS ways, such as claiming that it's just statistics and a Markov chain. They are completely unaware of the multiple studies over the past 12 months showing that even smaller toy models are capable of developing abstract world models, as long as those can be structured as linear representations.
It's to the point that unless it's in a thread explicitly about actual research papers, where explaining nuances seems fitting, I don't even bother trying to educate the average tech commentators regurgitating misinformation anymore. They typically only want to confirm their biases anyway, and have such a poor understanding of the specifics that it's like explaining nuanced aspects of the immune system to anti-vaxxers.
People just say that it's a bunch of "if" statements. Those people are idiots. It's not even worth engaging those people. The people who say that it's just a text prediction model do not understand the concept of a "simple complex" system. After all, isn't any intelligence basically just a prediction model?
can you provide links to the studies?
Do Large Language Models learn world models or just surface statistics? (Jan 2023)
Actually, Othello-GPT Has A Linear Emergent World Representation (Mar 2023)
Eight Things to Know about Large Language Models (April 2023)
Playing chess with large language models (Aug 2023)
Language Models Represent Space and Time (Oct 2023)
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets (Oct 2023)
The first two and last two focus entirely on linear representations; the studies cited in point three of the third link have additional information along those lines; and the fourth link is just a fun read.
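For anyone unsure what "linear representation" means in practice, here is a minimal sketch of a linear probe. It is not taken from any of the linked studies and uses no real model: the "activations" are synthetic vectors in which a hidden binary feature is written along one direction and buried in noise. The function names, the dimension, and the signal strength are all made up for illustration. The probe is a simple difference-of-means classifier, which is one basic kind of linear probe. If a feature is linearly represented, a probe like this can read it back out with a single dot product.

```python
import random


def make_activations(n, direction, strength, rng):
    """Synthetic stand-in for hidden states: a +/-1 feature written
    along `direction`, plus unit Gaussian noise in every dimension."""
    data = []
    for _ in range(n):
        label = rng.choice([-1, 1])
        vec = [label * strength * d + rng.gauss(0.0, 1.0) for d in direction]
        data.append((vec, label))
    return data


def fit_mean_difference_probe(data):
    """Linear probe: weight = mean(positive) - mean(negative),
    with the decision boundary at the midpoint of the two means."""
    dim = len(data[0][0])
    pos = [v for v, y in data if y == 1]
    neg = [v for v, y in data if y == -1]
    mean_pos = [sum(v[i] for v in pos) / len(pos) for i in range(dim)]
    mean_neg = [sum(v[i] for v in neg) / len(neg) for i in range(dim)]
    weight = [p - q for p, q in zip(mean_pos, mean_neg)]
    midpoint = [(p + q) / 2 for p, q in zip(mean_pos, mean_neg)]
    bias = -sum(w * m for w, m in zip(weight, midpoint))
    return weight, bias


def probe_accuracy(weight, bias, data):
    """Fraction of examples where sign(w . x + b) matches the label."""
    correct = 0
    for vec, label in data:
        score = sum(w * x for w, x in zip(weight, vec)) + bias
        predicted = 1 if score > 0 else -1
        correct += predicted == label
    return correct / len(data)


rng = random.Random(0)   # fixed seed so the run is repeatable
dim = 32                 # hypothetical activation width

# Random unit vector playing the role of the "feature direction".
raw = [rng.gauss(0.0, 1.0) for _ in range(dim)]
norm = sum(x * x for x in raw) ** 0.5
direction = [x / norm for x in raw]

train = make_activations(1000, direction, 3.0, rng)
test = make_activations(500, direction, 3.0, rng)

weight, bias = fit_mean_difference_probe(train)
accuracy = probe_accuracy(weight, bias, test)
print(f"held-out probe accuracy: {accuracy:.3f}")
```

The point of the toy is only the shape of the argument: the probe never sees the direction that was used to generate the data, yet a single linear readout fitted on examples recovers the feature on held-out vectors. The studies apply the same idea to activations from actual models, which this sketch does not attempt.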
I once asked ChatGPT to stack various items, and to my astonishment it had enough world knowledge to know how the items should be stacked to make the most stable structure. Most tech workers I know who dismiss LLMs as a supercharged autocomplete feel threatened that AI is going to take their jobs in the future.
This was one of the big jumps from GPT-3 to GPT-4.