this post was submitted on 16 Oct 2023

It is 'nearly unavoidable' that AI will cause a financial crash within a decade, SEC head says

[–] [email protected] 3 points 1 year ago (1 children)

I have... feelings about LLMs being the big thing in AI/ML right now, because it's really not much that's new. Maybe the transformer architecture, kind of, but ultimately LLMs are massive (self-)supervised neural nets trained on obscene amounts of data. Other models then take that pretrained "foundation model" and just tune its parameters for their own task. Which is why prompt engineering is becoming a thing.
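For anyone who hasn't seen the "pretrained model plus tuned parameters" pattern, here is a rough sketch of the idea. It is a toy in plain Python, not how any real LLM is fine-tuned: `PRETRAINED_BASE`, the 2-D inputs, and the labeling rule are all made up for illustration. The point is only that the "foundation" weights stay frozen while a small head is trained on top of them.

```python
import math
import random

# Hypothetical stand-in for a downloaded "foundation" model: fixed weights
# that map a 2-D input to 3 features. Nothing below ever updates them.
PRETRAINED_BASE = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

def frozen_features(x, base):
    """Run the input through the frozen base (read-only use of its weights)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in base]

def predict(feats, head):
    """Logistic-regression head on top of the frozen features."""
    z = sum(w * f for w, f in zip(head["w"], feats)) + head["b"]
    return 1.0 / (1.0 + math.exp(-z))

def fine_tune_head(data, base, epochs=200, lr=0.5):
    """Train only the head with plain SGD on log loss; the base is untouched."""
    head = {"w": [0.0] * len(base), "b": 0.0}
    # The base is frozen, so its features can be computed once and cached.
    cached = [(frozen_features(x, base), y) for x, y in data]
    for _ in range(epochs):
        for feats, y in cached:
            err = predict(feats, head) - y  # gradient of log loss w.r.t. the logit
            head["w"] = [w - lr * err * f for w, f in zip(head["w"], feats)]
            head["b"] -= lr * err
    return head

def make_toy_data(n, seed=0):
    """Made-up task: label is 1 when the two coordinates sum to a positive number."""
    rng = random.Random(seed)
    points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    return [(p, 1.0 if p[0] + p[1] > 0 else 0.0) for p in points]

def accuracy(data, base, head):
    hits = sum(
        (predict(frozen_features(x, base), head) > 0.5) == (y == 1.0)
        for x, y in data
    )
    return hits / len(data)

data = make_toy_data(40)
head = fine_tune_head(data, PRETRAINED_BASE)
print("training accuracy:", accuracy(data, PRETRAINED_BASE, head))
```

The only trained state here is the four numbers in `head`, which is why this style of work is cheap compared with building the base model, and also why everything downstream depends on whoever supplied that base.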

Corpos are playing by the book here, trying to extinguish any competition before it begins by having people rely on their "foundation" models instead of innovating their own solutions.

How many tutorials can you find for implementing LLM NLP tasks that don't include "import this model from X company"? I'd wager it's only maybe 33%.

[–] dustyData 3 points 1 year ago (1 children)

Part of what makes localized model engines and custom ML chips interesting is precisely their ability to enable small custom local models. Right now LLMs require so much computational power and massive amounts of data to be trained and operate that even the most expensive options lose money with every prompt query.

So the reason every tutorial starts with "download this model" is that there's a good chance you don't have the hundreds of supercomputer cluster chips and the many terabytes of scraped and curated data needed to train a natural language processing model. There's a reason there are only big players in this game.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

Facts.

Even if you could design your own model... how do you acquire a dataset even a fraction of the size of the ones behind those pretrained models from the corps?

Then how do you train the model in a reasonable time, other than by relying on cloud computing? That leads to the same problem: only corps can play this game properly right now.

I designed a relatively small deep CNN for my master's thesis and collected/labeled the data for it, and training it on 60,000 images was taking over a dozen hours on a 1080 Ti (this was 5 years ago at this point, so that part may be misremembered).