this post was submitted on 17 Feb 2024
243 points (96.2% liked)

Privacy

31993 readers
122 users here now

A place to discuss privacy and freedom in the digital world.

Privacy has become a very important issue in modern society. With companies and governments constantly abusing their power, more and more people are waking up to the importance of digital privacy.

In this community everyone is welcome to post links and discuss topics related to privacy.

much thanks to @gary_host_laptop for the logo design :)

founded 5 years ago
all 45 comments
[–] mexicanmamba 52 points 9 months ago (2 children)
[–] [email protected] 26 points 9 months ago (2 children)

Well, they can (and will) still scrape us if they want. Just nobody's making a buck off of it.

[–] PropaGandalf 2 points 9 months ago (1 children)

All better than that piggyboy getting free money

[–] stockRot 1 points 9 months ago
[–] stoly 1 points 9 months ago (1 children)

That's going to be a lot more work since comments and posts are decentralized here. You can probably easily get some of it but it will be hard to get all of it.

[–] [email protected] 1 points 9 months ago (1 children)

It's actually even easier than that. Instead of setting up a tool to make requests to the API, you can just set up a bridge that will dump everything right into your database. The wonders of federation.

[–] [email protected] 1 points 9 months ago

If you can set up a Lemmy instance and apply a little elbow grease to manually follow a few instances, that's pretty much all you need to have the data come in automatically. You'd probably need more knowledge about how to actually get the data out of the DB than the initial setup, which could be done by somebody just copying and pasting text.

[–] [email protected] 12 points 9 months ago

The reality though is I can train LLMs off Lemmy data all I want and I don't have to pay ANYONE a dime...

[–] AtariDump 48 points 9 months ago (2 children)
[–] [email protected] 15 points 9 months ago (1 children)

I wish people like spez and zuck cried themselves to sleep, but those beds of cash are probably pretty comfortable. The only real hope is that they're pilloried so thoroughly in history books that, at the ends of their lives, they're bitterly angry at the injustice of how they'll be remembered. The good news is that this is something the public can influence. The bad news is that 99% of the public don't give a shit. Musk might be the only one in this crop of unethical sociopaths who might end up railing about his legacy; the rest are just going to get away with raping the public and be generally recognized as being "shrewd business men." And it's only the men; the women who do this tend to end more poorly - fired by boards, or spending time in jail.

[–] [email protected] 8 points 9 months ago (1 children)

America is truly exceptional... Nonagenarian politicians serve as lawmakers for an economy they barely understand, as part of a system of legalized bribery that reinforces their lack of interest in understanding it, while a septuagenarian Supreme Court interprets and applies laws made in the aftermath of the Civil War but is free to bend their meaning as its members' personal political biases allow, and octogenarian presidents wield extreme unchecked power.

In this system, laws against abuse of personal information and exploitation of data will only be written in 2080 or later, after the lives of many ordinary people have been damaged. Change will happen only once it damages the life of a congressman.

[–] [email protected] 2 points 9 months ago (1 children)

Not for the first time do I wish Lemmy had github-like responses. Up/downvotes are utterly inadequate; why didn't Lemmy learn this lesson from Reddit?

Anyway, I love how succinctly you summed up the state we're in. I've joked before that America would be well-served by the introduction of Carousel; I'm well past the Last Day age, but the older I get, the less it becomes a joke to me. It'd be better for the environment, too.

[–] [email protected] 1 points 9 months ago

Oh, the Carousel. Anyway, I just wish voters would vote more consciously, but even that has been rigged so that people vote for those who appeal to their own fears and anger 😞

[–] [email protected] -1 points 9 months ago (1 children)
[–] AtariDump 1 points 9 months ago
[–] Substance_P 22 points 9 months ago (1 children)

Brilliant. AI does the heavy lifting, takes the data for free, then resells access to it, while those of us who contributed for the last decade don't get a dime.

[–] Anticorp 1 points 9 months ago

Those contributing to it are forced to view ads or pay money for the right to contribute without having ads forced upon them.

[–] eager_eagle 14 points 9 months ago (1 children)

Well, they already made it very clear to everyone back in May that the content created by the community does not belong to the community. Anyone still using that dump deserves to be explored.

[–] [email protected] 15 points 9 months ago

Anyone still using that dump deserves to be explored.

( ͡° ͜ʖ ͡°)

[–] [email protected] 10 points 9 months ago* (last edited 9 months ago)

Nice! Someone owes me 5€ now.

[–] mrcleanup 8 points 9 months ago (1 children)

Time to delete my old accounts, I guess. Is there a bot that will go through and delete all posts and comments too? That would be helpful.

[–] eager_eagle 14 points 9 months ago* (last edited 9 months ago) (2 children)

I used PowerDeleteSuite back in June.

It's private and, unlike Redact, not paid. I'd consider editing the comments instead of deleting them, to spread the word about the reason for deletion.

[–] [email protected] 6 points 9 months ago

Or to poison the dataset

[–] [email protected] 4 points 9 months ago (1 children)

That's what I did. I turned all my comments into Lemmy advertisements, and also an obscene sentence telling u/spez to kill himself (I'm not proud of it at this juncture, but it felt good at the time).

[–] AtariDump 4 points 9 months ago (1 children)
[–] [email protected] 2 points 9 months ago

Hof in German means farm ... so his original surname literally comes from farmer ...

[–] [email protected] 7 points 9 months ago

This is the best summary I could come up with:


Reddit will let “an unnamed large AI company” have access to its user-generated content platform in a new licensing deal, according to Bloomberg yesterday.

The deal, “worth about $60 million on an annualized basis,” the outlet writes, could still change as the company’s plans to go public are still in the works.

The news also follows an October story that Reddit had threatened to cut off Google and Bing’s search crawlers if it couldn’t make a training data deal with AI companies.

Last year, it successfully stonewalled its way out of the biggest protest in its history after changes to its third-party API access pricing caused developers of the most popular Reddit apps to shut down.

As Bloomberg writes, Reddit’s year-over-year revenue was up by 20 percent by the end of 2023, but it was still $200 million shy of a $1 billion target it had set two years prior.

The company was reportedly advised to seek a $5 billion valuation when it opens up for public investment, which is expected to happen in March.


The original article contains 346 words, the summary contains 175 words. Saved 49%. I'm a bot and I'm open source!

[–] 9point6 6 points 9 months ago

Here comes a new wave of users, I guess

Kinda thought they'd manage to go a bit longer than the few months they did

[–] maaneeack 6 points 9 months ago

Glad I edited all my comments to say fuck u/spez

[–] [email protected] 6 points 9 months ago (2 children)

What's the best method to mass edit my comments?

[–] [email protected] 12 points 9 months ago* (last edited 9 months ago) (2 children)

PowerDeleteSuite. I used this when things got heated with Reddit. You can even edit your comments before deleting them, and the best part for you is that you don't have to delete them at all. (Hopefully Reddit hasn't countered this.)

[–] toleda 2 points 9 months ago (2 children)

I'm going to try it; Shreddit doesn't work anymore for some reason. Thanks!

[–] Anticorp 2 points 9 months ago

It works, but it takes a long time, and then Reddit un-deletes your comments. Make sure you set it up to edit your comments before deletion. A message like the one in the image is a pretty good choice.
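For anyone scripting this instead of using the browser tool, here is a minimal sketch of the "edit first, then delete" order described above. It is not how PowerDeleteSuite works internally; `overwrite_then_delete` and `EditableComment` are names made up for illustration, and the only assumption is that each comment object exposes `edit(body)` and `delete()`, which is the shape PRAW's comment objects have.

```python
from typing import Iterable, Protocol


class EditableComment(Protocol):
    """Hypothetical interface: anything with edit(body) and delete()."""

    def edit(self, body: str): ...

    def delete(self): ...


def overwrite_then_delete(
    comments: Iterable[EditableComment],
    message: str,
    delete: bool = True,
) -> int:
    """Overwrite each comment with `message`, then optionally delete it.

    Editing first means that if a deleted comment is later restored,
    it comes back with the replacement text rather than the original.
    Returns the number of comments processed.
    """
    processed = 0
    for comment in comments:
        comment.edit(message)  # replace the body before anything else
        if delete:
            comment.delete()  # skip this step to leave the edited text visible
        processed += 1
    return processed
```

With PRAW installed and an API key configured, something like `overwrite_then_delete(reddit.user.me().comments.new(limit=None), "moved to Lemmy")` should work, though I haven't checked it against Reddit's current rate limits.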

[–] [email protected] 1 points 9 months ago
[–] laverabe 1 points 9 months ago* (last edited 9 months ago)

Is there a more effective one, that slowly edits all your comments a little bit at a time so it misses their detection over a period of weeks/months? Like scrambling/nonsense sentences.

There was a book whose card when blunk when they looked up.

Like completely nonsensical but a real sentence so it would be hard to detect.
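Roughly what I have in mind, as a sketch only: build grammatical but meaningless sentences from word lists, then give every comment a random edit time spread over a long window. The word lists and the names `nonsense_sentence` and `spread_edits` are invented for illustration, and I have no idea whether this would actually slip past any detection on Reddit's side. It only produces a plan; applying the edits would still need a separate tool or API access.

```python
import random
from datetime import datetime, timedelta

# Hypothetical word lists; any vocabulary that yields grammatical output works.
SUBJECTS = ["The librarian", "A quiet kettle", "My neighbour's map", "The third window"]
VERBS = ["borrowed", "painted", "forgot", "measured"]
OBJECTS = ["a paper bridge", "the Tuesday soup", "every small ladder", "an empty invoice"]
PLACES = ["behind the orchard", "under the timetable", "near the old receipt", "inside the weather"]


def nonsense_sentence(rng: random.Random) -> str:
    """Return a grammatical but meaningless sentence."""
    return (
        f"{rng.choice(SUBJECTS)} {rng.choice(VERBS)} "
        f"{rng.choice(OBJECTS)} {rng.choice(PLACES)}."
    )


def spread_edits(comment_ids, start: datetime, days: int, rng: random.Random):
    """Assign each comment id a random edit time within `days` after `start`.

    Returns a list of (when, comment_id, replacement_text) sorted by time.
    """
    window_seconds = days * 24 * 60 * 60
    plan = [
        (
            start + timedelta(seconds=rng.uniform(0, window_seconds)),
            comment_id,
            nonsense_sentence(rng),
        )
        for comment_id in comment_ids
    ]
    plan.sort(key=lambda item: item[0])  # earliest edit first
    return plan
```

Passing a seeded `random.Random` makes the plan reproducible, which helps if the job gets interrupted and has to resume weeks later.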

[–] [email protected] 1 points 9 months ago

Look at the issues and you will notice it only works on comments visible from the profile page, and not all of them are visible. It appears that someone made a Python script to solve this problem, but you need an API key to use it.

[–] [email protected] 5 points 9 months ago

And finally their Logo makes sense.

[–] [email protected] 4 points 9 months ago* (last edited 9 months ago)

I don't see why someone would need this deal anyway. Most of it is already available, and most of the new stuff probably is too, even without API access.
I also expect the fediverse to be crawled and used for training. That's just the thing about publicly available stuff: it gets used, whether we like it or not.

[–] [email protected] 3 points 9 months ago (1 children)

Is Lemmy protected against scraping of our data for AI?

[–] [email protected] 6 points 9 months ago (1 children)

The opposite; the API to simply take comments and posts in bulk is free and open.
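To give an idea of how little is involved, here is a rough sketch using only the Python standard library. It assumes the v3 HTTP API's `/api/v3/comment/list` endpoint with `type_`, `sort`, `page` and `limit` query parameters, and a response of the form `{"comments": [{"comment": {...}, "creator": {...}}]}`; check those against the instance's actual API version before relying on them. The function names and the `lemmy.example` host are placeholders.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def comment_list_url(instance: str, page: int, limit: int = 50) -> str:
    """Build the URL for one page of public comments (assumed v3 endpoint)."""
    query = urlencode({"type_": "All", "sort": "New", "page": page, "limit": limit})
    return f"https://{instance}/api/v3/comment/list?{query}"


def extract_comments(payload: dict) -> list:
    """Flatten the assumed response shape into id / author / text records."""
    return [
        {
            "id": view["comment"]["id"],
            "author": view["creator"]["name"],
            "text": view["comment"]["content"],
        }
        for view in payload.get("comments", [])
    ]


def fetch_page(instance: str, page: int, limit: int = 50) -> list:
    """Download and parse one page; no login or API key is sent."""
    with urlopen(comment_list_url(instance, page, limit), timeout=30) as response:
        return extract_comments(json.load(response))
```

Looping `fetch_page("lemmy.example", page)` over increasing page numbers until it returns an empty list would walk through whatever that instance exposes publicly.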

[–] [email protected] 2 points 9 months ago (1 children)

Can an instance close the API or limit it?

[–] [email protected] 4 points 9 months ago

In theory, yes, but instances don't ship with the ability to do that. There would need to be a change to the Lemmy code base if such a thing was to be seriously implemented.

I'm no federation expert, so I can't really comment on whether doing something like requiring API keys would be feasible, unfortunately.

[–] [email protected] 1 points 5 months ago

Ah, more glue on pizza incoming. Personally I don't understand taking reddit posts as a source for LLM training. It's like they never visited reddit and think that all posts/comments are true, or even useful. Depending on the sub, sarcasm can account for anywhere from 5% to 100%.