this post was submitted on 31 Jan 2025
107 points (98.2% liked)

Technology

top 10 comments
[–] NineMileTower 35 points 1 week ago (1 children)

Nothing. It was pirated for free.

[–] [email protected] 11 points 1 week ago (1 children)

Some have allegedly paid.

“We’ve provided about 20-30 companies/teams with our entire dataset. It’s the same data as on our torrents page, but they get access to high-speed SFTP servers.” 

“Usually, this is in exchange for a large monetary donation or, on occasion, in exchange for good datasets they acquired,” ‘Anna’s Archivist’ adds, noting that all data they obtain is shared publicly.

[–] [email protected] 14 points 1 week ago (1 children)

The fact that Anna's Archive is accepting additional datasets as "payment" makes me comfortable that they're not in this for the money but rather for ideological reasons.

[–] [email protected] 2 points 6 days ago

Or it could be that such a trade wouldn’t have to appear in the accounting :)

[–] [email protected] 32 points 1 week ago

Guess we've finally reached the moment where letting the giant intellectual property cartels monopolize human culture is going to cause serious economic side effects for other big corporations rather than simply screwing over the general public.

[–] [email protected] 16 points 1 week ago

The future of AI innovation may hinge on the outcome of a global copyright debate.

Meh, the US is not the world.

[–] General_Effort 8 points 1 week ago

“We cleaned 860K English and 180K Chinese e-books from Anna’s Archive,” a DeepSeek VL paper, published last March, states.

Hmm.

[–] [email protected] 5 points 1 week ago

Honestly, this is the best thing about the AI hype.

Remember to support your local (shadow) library!

[–] [email protected] 3 points 1 week ago* (last edited 1 week ago)

Yeah, information wants to be free. I'd say we just do away with copyright /s

Or I could try training AI as well once this is settled. Of course, I'd need to get a few big hard drives to store a few books, audiobooks, music, Netflix series... Or is this just a perk for big and greedy companies?

[–] [email protected] 3 points 1 week ago

Bibliotik baybeeeee