this post was submitted on 24 Jan 2024
7 points (54.1% liked)

submitted 11 months ago* (last edited 11 months ago) by kromem to c/technology
 

I've been saying this for about a year since seeing the Othello GPT research, but it's nice to see more minds changing as the research builds up.

Edit: Because people aren't actually reading and just commenting based on the headline, a relevant part of the article:

New research may have intimations of an answer. A theory developed by Sanjeev Arora of Princeton University and Anirudh Goyal, a research scientist at Google DeepMind, suggests that the largest of today’s LLMs are not stochastic parrots. The authors argue that as these models get bigger and are trained on more data, they improve on individual language-related abilities and also develop new ones by combining skills in a manner that hints at understanding — combinations that were unlikely to exist in the training data.

This theoretical approach, which provides a mathematically provable argument for how and why an LLM can develop so many abilities, has convinced experts like Hinton, and others. And when Arora and his team tested some of its predictions, they found that these models behaved almost exactly as expected. From all accounts, they’ve made a strong case that the largest LLMs are not just parroting what they’ve seen before.

“[They] cannot be just mimicking what has been seen in the training data,” said Sébastien Bubeck, a mathematician and computer scientist at Microsoft Research who was not part of the work. “That’s the basic insight.”

[–] kromem 2 points 11 months ago* (last edited 11 months ago) (1 children)

Here's an example of what they were looking at:

Human question: Please give me a a couple lines of text that illustrate all of the following language
understanding skills: Anaphora resolution, simple logical reasoning, simple understanding of physics,
and understanding of sentiment.
Model response: Sure! Here’s a couple of lines that illustrate the mentioned language understanding
skills:
"As the sun set, casting a warm glow over the horizon, John watched in awe. He couldn’t help but feel
a sense of tranquility wash over him, as if all his worries had melted away with the fading daylight."
Explanation:
• Anaphora resolution: The pronoun "he" in the second sentence refers back to the antecedent
"John" in the first sentence, resolving the anaphoric reference.
• Simple logical reasoning: The logical reasoning here is based on the cause and effect
relationship between the setting sun and John’s feeling of tranquility. The implied logical
connection suggests that the calming effect of the sunset is responsible for John’s serene
state of mind.
• Simple understanding of physics: The mention of the sun setting and casting a warm
glow over the horizon reflects a basic understanding of physics, as it relates to the natural
phenomena of the Earth’s rotation and the resulting visual effect.
• Understanding of sentiment: The phrase "a sense of tranquility" and the description of
John’s worries melting away indicate a positive sentiment associated with the experience.
The mention of John feeling awe and a sense of tranquility further emphasizes a positive
emotional state.

Human question: Okay. Can you change the above example to introduce the element of "hip-hop".
Model response: Certainly! Here’s an updated example that incorporates the element of "hip-hop":
"As the sun set, casting a warm glow over the city skyline, the rhythmic beats of hip-hop music filled
the air. John stood mesmerized, his body instinctively moving to the infectious grooves. He couldn’t
help but feel a surge of energy and joy, as if the music had unlocked a vibrant world within him."
Explanation:
• Anaphora resolution: The pronoun "he" in the second sentence still refers back to the
antecedent "John" in the first sentence, resolving the anaphoric reference.
• Simple logical reasoning: The logical connection here lies in the cause and effect relationship
between the hip-hop music and John’s feeling of energy and joy. The music’s rhythmic
beats and infectious grooves are responsible for his instinctive movement and the surge of
positive emotions he experiences.
• Simple understanding of physics: Although the example doesn’t directly mention physics,
the phrase "As the sun set, casting a warm glow over the city skyline" still implies the natural
phenomena of the sun’s descent and the resulting illumination of the environment.
• Understanding of sentiment: The introduction of hip-hop music adds a vibrant and energetic
sentiment to the example. John’s feeling of energy, joy, and being mesmerized by the music
reflects a positive sentiment associated with the hip-hop genre.

Edit: Downvotes for citing the appendix of the paper the article was about? Ok, Lemmy

[–] Redacted 4 points 10 months ago (1 children)

You're being downvoted because you provide no tangible evidence for your opinion that human consciousness can be reduced to a graph that can be modelled by a neural network.

Additionally, you don't seem to respond to any of the replies you receive in good faith and reach for anecdotal evidence wherever possible.

I also personally don't like the appeal to authority permeating your posts. Just because someone who wants to secure more funding for their research has put out a blog post, it doesn't make it true in any scientific sense.

[–] kromem 0 points 10 months ago* (last edited 10 months ago) (1 children)

human consciousness

Wtf are you talking about? The article is about whether or not models can understand text. Not about whether they embody consciousness.

Just because someone who wants to secure more funding for their research has put out a blog post, it doesn't make it true in any scientific sense.

Again, wtf are you going on about? Hinton was the only appeal to authority I made in comments here and I only referred to him quitting his job to whistleblow. And it's not like he needs any attention to justify research if he wanted to.

[–] Redacted 1 points 10 months ago (1 children)

Understanding as most people know it implies some kind of consciousness or sentience as others have alluded to here.

It's the whole point of your post.

[–] kromem 0 points 10 months ago* (last edited 10 months ago) (1 children)

You are reading made up strawmen into the topic.

The article defines the scope of the discussion straight up:

The authors argue that as these models get bigger and are trained on more data, they improve on individual language-related abilities and also develop new ones by combining skills in a manner that hints at understanding — combinations that were unlikely to exist in the training data.

The question is whether or not LLMs have a grasp of the training material such that they can produce new and novel concepts outside what was in the training data itself.

Not whether the LLM is sentient or conscious - both characterizations I'd strongly dispute.

Wikipedia has a useful distillation of the definition of understanding relevant to the above:

process related to an abstract or physical object, such as a person, situation, or message whereby one is able to use concepts to model that object

[–] Redacted 2 points 10 months ago* (last edited 10 months ago) (1 children)

No I'm not.

You're nearly there... The word "understanding" is the core premise of what the article claims to have found. If not for that, then the "research" doesn't really amount to much.

As has been mentioned, this then becomes a semantic/philosophical debate about what "understanding" actually means and a short Wikipedia or dictionary definition does not capture that discussion.

[–] kromem -2 points 10 months ago (1 children)

Ah, I see. AKA "Tell me you didn't read the article and just read the headline without telling me."

[–] Redacted 2 points 10 months ago* (last edited 10 months ago) (1 children)

I've read the article and it's just clickbait which offers no new insights.

What was of interest in it to yourself specifically?

[–] kromem 1 points 10 months ago* (last edited 10 months ago) (1 children)

It provides an entirely new framework for analyzing skills in LLMs. Do you mean the article doesn't provide new insights, or that the research doesn't?

As for my own interest, in addition to this providing a more rigorous framework for analyzing what I'd already gotten a sense of with the world model research papers over the last year, I can see a number of important nuances.

First off, there's the obvious point of emergent capabilities being a hotly debated topic in research circles, which you likely know if you've followed it at all.

In particular, the approach here complements the paper out of Stanford disputing emergent capabilities because other measurements of improvement are linear as size increases. Here, linear improvements in next token prediction directly tie into emergent skills, so it's promising that the model fits neatly with one of the more notable counter-point nuances in the past year.

I also think this is an exciting approach if the same framework were remapped to the way Anthropic's research was looking at functional layers as opposed to individual network nodes. By mapping either side of the graph to functional layers it may allow for more successful introspection into larger models than we've had before.

A framework around a controversial research topic that generates testable predictions and then sees those predictions met is generally worth recognizing too.

Finally, I think that Skill-Mix may offer a useful framework for evaluating models, particularly around transmission of skills from larger models to smaller models using synthetic data, which has probably been the most significant research trend in the domain over the past year.

So it's noteworthy in a number of ways and I could see it having similar impact to the CoT paper within research circles (where it becomes a component of much of the work that follows and builds on top of it), even if not quite as broad an impact outside of them.

I've generally felt the field is doing a poor job at evaluating models, falling deeper and deeper into Goodhart's Law, and this is a promising breath of fresh air.

As they say opening their paper on it:

Sizeable differences exist among model capabilities that are not captured by their ranking on popular LLM leaderboards ("cramming for the leaderboard"). Furthermore, simple probability calculations indicate that GPT-4's reasonable performance on k=5 is suggestive of going beyond "stochastic parrot" behavior (Bender et al., 2021), i.e., it combines skills in ways that it had not seen during training. We sketch how the methodology can lead to a Skill-Mix based eco-system of open evaluations for AI capabilities of future models.
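To make the "simple probability calculations" in that passage a bit more concrete, here is a minimal sketch of the counting argument as I understand it: if a prompt asks for k skills drawn from a pool of N, plus one topic drawn from a pool of T, the number of distinct (skill set, topic) requests grows combinatorially with k. The pool sizes below (100 skills, 100 topics) and the function name are assumptions for illustration only, not figures taken from the paper or from this thread.

```python
import math


def count_skill_topic_combinations(n_skills: int, n_topics: int, k: int) -> int:
    """Count distinct prompts that ask for k skills (order ignored) on one topic."""
    # C(n_skills, k) ways to pick the skill set, times one topic per prompt.
    return math.comb(n_skills, k) * n_topics


if __name__ == "__main__":
    N_SKILLS = 100  # hypothetical pool size, chosen only for illustration
    N_TOPICS = 100  # hypothetical pool size, chosen only for illustration

    for k in range(1, 6):
        total = count_skill_topic_combinations(N_SKILLS, N_TOPICS, k)
        print(f"k={k}: {total:,} possible (skill set, topic) combinations")
```

With these assumed pool sizes, k=5 already gives 7,528,752,000 combinations, so a model that handles randomly drawn combinations reasonably well is unlikely to have seen most of them verbatim. The count alone says nothing about how often any given combination appears in a real corpus, so treat it as intuition rather than proof.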

It's about time we move on to something better than the current evaluation metrics which we're just trying to game with surface fine tuning.

[–] Redacted 2 points 10 months ago* (last edited 10 months ago) (1 children)

I question the value of this type of research altogether, which is why I stopped following it as closely as yourself. I generally see it as an exercise in assigning labels to subsets of a complex system. However, I do see how the CoT paper adds some value in designing more advanced LLMs.

You keep quoting research ad verbum as if it's gospel, so you miss my point (and it forms part of the appeal to authority I mentioned previously). It is entirely expected that neural networks would form connections outside of the training data (emergent capabilities). How else would they be of use? This article dresses up the research as some kind of groundbreaking discovery, which is what people take issue with.

If this article was entitled "Researchers find patterns in neural networks that might help make more effective ones" no one would have a problem with it, but also it would not be newsworthy.

I posit that Category Theory offers an explanation for these phenomena without having to delve into poorly defined terms like "understanding", "skills", "emergence" or Monty Python's Dead Parrot. I do so with no hot research topics at all or papers to hide behind, just decades old mathematics. Do you have an opinion on that?

[–] kromem 1 points 10 months ago* (last edited 10 months ago) (1 children)

You keep quoting research ad verbum as if it's gospel

No, but I have learned over the years that when you see multiple papers discovering similar things at odds with the held consensus, and see some even independently replicated, there's usually more than just smoke.

If this article was entitled "Researchers find patterns in neural networks that might help make more effective ones" no one would have a problem with it, but also it would not be newsworthy.

The paper was titled "Skill-Mix: a Flexible and Expandable Family of Evaluations for AI models." Quanta, while a Pulitzer winner in 2022 for explanatory reporting, is after all a publisher, not a research institution. Though I dispute your issues with the headline, as it's in line with similar article headlines such as "Bees understand the concept of zero".

I posit that Category Theory... Do you have an opinion on that?

You wouldn't be the only person looking at it through that lens. It was more popular a few years ago, I think, and hasn't really caught on for LLMs compared with other ML approaches. Here it strikes me a bit like those with hammers looking for nails: the degree of functional overlap seen in network introspection, such as the linked Anthropic work, suggests to me that the internalized delineations are a bit fuzzier than would cleanly map onto a category theory view. It's possible that it gets some research wins as time goes on, assuming it can come up with testable predictions that succeed. But it's more of a 'how' than a 'what' question. Whether a network that understands abstract concepts tangential to the language it is trained on and develops world models (an idea that would have been laughed out of the room just three years ago by any serious researchers, despite your impression) does so using something that can be explained through category theory or through another interpretation, the result is arguably a more important finding than the interpretation of the means.

It seems like you may be more committed to arguing the semantics and nuances of the tree in front of you than discussing the forest - that's fine, it's just not that interesting to me in turn.

[–] Redacted 2 points 10 months ago* (last edited 10 months ago) (1 children)

To hijack your analogy, it's more akin to me stating a tree is a plant and you saying "So are these" while pointing at a forest of plastic Christmas trees.

I'm pretty curious why you imagine you have so many downvotes?

[–] kromem 0 points 10 months ago (1 children)

Because laypeople are very committed to a certain perspective of LLMs right now.

You should see the downvotes I got a year or two ago explaining immunology research to antivaxxers.

[–] Redacted 2 points 10 months ago

Have you ever considered you might be the laypeople?

Equating a debate about the origin of understanding to antivaxxers...

You argue like a Trump supporter.