Technology

63132 readers

5026 users here now

This is a most excellent place for technology news and articles.

Our Rules

Follow the lemmy.world rules.
Only tech related content.
Be excellent to each other!
Mod approved content bots can post up to 10 articles per day.
Threads asking for personal tech support may be deleted.
Politics threads may be removed.
No memes allowed as posts, OK to post as comments.
Only approved bots from the list below, to ask if your bot can be added please contact us.
Check for duplicates before posting, duplicates may be removed
Accounts 7 days and younger will have their posts automatically removed.

Approved Bots

founded 2 years ago

MODERATORS

OpenAI Wants to Eat Google Search's Lunch (gizmodo.com)

submitted 1 year ago by ooli to c/technology

25 comments fedilink hide all child comments

you are viewing a single comment's thread
view the rest of the comments

[–] [email protected] 5 points 1 year ago (1 children)

But the content has already been absorbed. I wouldn’t be surprised if they have all of it sucked up (many would argue illegally) and stored as a corpus for them to iterate onto. It’s not like they go out to touch all the web every time they train a new version of their model.

[–] platypus_plumba 3 points 1 year ago (1 children)

Right, they already have scary amounts of data.

[–] QuaternionsRock 1 points 1 year ago

One of the craziest facts about GPT (to me) is that it was trained on 570GB of text data. That’s obviously a lot of text, but it’s bewildering to me that I could theoretically store their entire training dataset on my laptop.