this post was submitted on 19 Nov 2023
700 points (97.9% liked)

Technology


The article is about Kyutai, a French AI lab that aims to compete with ChatGPT and others while being fully open source (research papers, models, and training data).

They also aim to support sound, images, etc. (according to this article, in French: https://www.clubic.com/actualite-509350-intelligence-artificielle-xavier-niel-free-et-l-ancien-pdg-de-google-lancent-kyutai-un-concurrent-europeen-a-openai.html )

The post article also talks about some French context.

[–] cyd 22 points 1 year ago* (last edited 1 year ago) (2 children)

Ideally, they'd just blow the entire $330M training an LLM, and release the weights. In reality, much of that money will probably go into paying salaries, various smaller research projects, etc.

[–] [email protected] 85 points 1 year ago (4 children)

Ideally, they wouldn't be paying salaries? What?

[–] cyd 11 points 1 year ago* (last edited 1 year ago) (1 children)

The context is that LLMs need a big upfront capital expenditure to get started, because of the processor time needed to train these giant neural networks. This is a huge barrier to the development of a fully open source LLM. Once such a foundation model is available, building on top of it is comparatively cheap; one can then envision an explosion of open source models targeting specific applications, which would be amazing.

So if the bulk of this €300M could go into training, it would go a long way to plugging the gap. But in reality, a lot of that sum is going to be dissipated into other expenses, so there's going to be a lot less than €300M for actual training.
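For a rough sense of scale, here is a back-of-the-envelope sketch using the common "6 × parameters × tokens" rule of thumb for training compute. Every number in the example call (model size, token count, GPU throughput, utilization, hourly price) is an assumption picked for illustration, not a figure from the article or from Kyutai.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute: about 6 FLOPs per parameter per token."""
    return 6.0 * n_params * n_tokens


def training_cost(n_params: float, n_tokens: float,
                  peak_flops_per_gpu: float, utilization: float,
                  price_per_gpu_hour: float) -> tuple[float, float]:
    """Return (gpu_hours, cost) for a single training run under the given assumptions."""
    flops = training_flops(n_params, n_tokens)
    # Effective throughput is only a fraction of the hardware's peak.
    gpu_seconds = flops / (peak_flops_per_gpu * utilization)
    gpu_hours = gpu_seconds / 3600.0
    return gpu_hours, gpu_hours * price_per_gpu_hour


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    hours, cost = training_cost(
        n_params=70e9,             # assumed model size
        n_tokens=1.4e12,           # assumed training set size
        peak_flops_per_gpu=312e12, # assumed peak throughput per GPU
        utilization=0.4,           # assumed fraction of peak actually achieved
        price_per_gpu_hour=2.0,    # assumed rental price
    )
    print(f"{hours:,.0f} GPU-hours, roughly {cost:,.0f} in compute")
```

This covers only one successful run; failed runs, experiments, data work, and the salaries mentioned above all come on top.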

[–] interceder270 5 points 1 year ago (3 children)

Is there any way we can decentralize the training of neural networks?

I recall something being released a while ago that let people use their computers for scientific computations. Couldn't something similar be done for training AI?

[–] [email protected] 4 points 1 year ago

There is a project (AI Horde) that allows you to donate compute for inference. I'm not sure why the same doesn't exist for training. I think the RAM/VRAM requirements just can't be lowered/split.
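A rough sketch of why memory is a bigger hurdle for training than for inference, assuming mixed-precision training with Adam at about 16 bytes of state per parameter (fp16 weights and gradients plus fp32 master weights, momentum, and variance). The 7B model size and 24 GiB card are assumptions for illustration, and activations are ignored entirely.

```python
import math

# Assumed per-parameter cost for mixed-precision Adam training:
# 2 (fp16 weights) + 2 (fp16 grads) + 12 (fp32 master weights, momentum, variance)
TRAIN_BYTES_PER_PARAM = 2 + 2 + 12
GIB = 2 ** 30


def training_state_gib(n_params: float) -> float:
    """Memory for weights, gradients and optimizer state (no activations)."""
    return n_params * TRAIN_BYTES_PER_PARAM / GIB


def inference_weights_gib(n_params: float, bytes_per_weight: int = 2) -> float:
    """Memory for the weights alone, as needed for inference."""
    return n_params * bytes_per_weight / GIB


def min_devices(n_params: float, device_mem_gib: float) -> int:
    """Lower bound on devices needed just to hold the training state."""
    return math.ceil(training_state_gib(n_params) / device_mem_gib)


if __name__ == "__main__":
    n = 7e9  # hypothetical 7B-parameter model
    print(f"inference weights: {inference_weights_gib(n):.1f} GiB")
    print(f"training state:    {training_state_gib(n):.1f} GiB")
    print(f"24 GiB cards needed at minimum: {min_devices(n, 24)}")
```

Even when that state is sharded across machines, the shards have to exchange gradients every step, which is much harder over home internet links than inside a datacenter.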

Another way to contribute is by helping with training data. LAION, which created the dataset behind Stable Diffusion, is a volunteer effort. Stable Diffusion itself was developed at a tax-funded public university in Germany. However, the cost of the processing for training, etc. was covered by a single rich guy.

[–] Sanyanov 1 points 1 year ago

Btw yes! Why not include such a project in something like BOINC and let people help train free AI?

[–] dojan -3 points 1 year ago

Folding@home.

I dunno. I wouldn’t lend my spare power to put people out of a job.

[–] [email protected] 5 points 1 year ago

Good luck training an LLM without any developers.