this post was submitted on 10 Jul 2023
354 points (91.7% liked)


Which of the following sounds more reasonable?

  • I shouldn't have to pay for the content that I use to tune my LLM model and algorithm.

  • We shouldn't have to pay for the content we use to train and teach an AI.

By calling it AI, the corporations are able to advocate for a position that's blatantly pro-corporate and anti-writer/artist, and trick people into supporting it under the guise of a technological development.

[–] [email protected] 7 points 1 year ago (1 children)

LLMs aren’t really novel in terms of theoretical approach: the real revolution is the amount of computing power and data to throw at them.

This is 100% true. LLMs, neural networks, Markov chains, gradient descent, etc. etc. on down the line are nothing particularly new. They’ve collectively been studied academically for 30+ years.

Well, LLMs, and particularly GPT and its competitors, rely on Transformers, which are a relatively recent theoretical development in the machine learning field. Of course they're based on prior research, and maybe there even is prior art buried in some obscure paper or 404 link, but if that's your measure then there is no "novel theoretical approach" for anything, ever.

I mean, I'll grant that the available input data and compute for machine learning have increased exponentially, and that's certainly an obvious factor in the improved output quality. But that's not all there is to the current "AI" summer: general scientific progress played a non-minor part as well.

In summary, I disagree that data/compute scale is the deciding factor here; IMHO it's deep learning architecture. The former didn't change that much over the last half decade, but the latter did.

[–] pensivepangolin 3 points 1 year ago (1 children)

Now, as I stated in my first comment in these threads, I don’t know terribly much about the technical details behind current LLMs, and I’m basing my comments on my layman’s reading.

Could you elaborate on what you mean about the development of deep learning architecture in recent years? I’m curious; I’m not trying to be argumentative.

[–] [email protected] 2 points 1 year ago (1 children)

Could you elaborate on what you mean about the development of deep learning architecture in recent years?

Transformers. Fun fact: the T in GPT and BERT stands for "transformer". They are a neural network architecture that was first proposed in 2017 (or 2014, if you count the earlier attention mechanisms they build on). Their key novelty is that they implement an attention mechanism and a context window without recurrence, which is what most earlier NNs (RNNs, LSTMs) used for that.

The wiki page I linked above is admittedly a bit technical; this article's explanation might be a bit more friendly to the layperson.
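
If a bit of code helps, here is a rough sketch of the "attention without recurrence" idea in plain Python. It is a toy under heavy assumptions, not how GPT or BERT are actually implemented: the function names are made up for illustration, and it skips the learned query/key/value projections, multiple heads, and positional encodings that real Transformers have. The point is only that every position looks at the whole context window directly, with no hidden state carried from one token to the next.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Toy scaled dot-product self-attention.

    tokens: list of equal-length vectors (the context window).
    Assumption: queries, keys and values are the raw token vectors;
    a real Transformer would apply learned projections first.
    """
    dim = len(tokens[0])
    outputs = []
    for query in tokens:
        # Score this position against every position in the window.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        # Output is a weighted average of all value vectors.
        outputs.append([
            sum(w * value[i] for w, value in zip(weights, tokens))
            for i in range(dim)
        ])
    return outputs

mixed = self_attention([[1.0, 0.0], [0.0, 1.0]])
```

Note that the loop over `query` has no dependency between iterations, so all positions could be computed in parallel. An RNN cannot do that, because each step needs the hidden state from the previous one.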

[–] pensivepangolin 1 points 1 year ago

Thanks for the reading material: I’m really not familiar with Transformers other than the most basic info. I’ll give it a read when I get a break from work.