Ask Lemmy
A Fediverse community for open-ended, thought-provoking questions
And I completely agree with this. I'm one of those who is a GDPR-fan as well as a fediverse fan.
So this is the fundamental disagreement I feel. Progress generally entails moving things into the hands of the people. We're empowered because we can do things like program our own computers, 3D-print our own devices, and yes, run our own social media site.
Deny a person that right, and you take a bit of their power away. By running my own single user instance, I make sure that I always own my own content, and no one can take it away from me by suddenly shutting down their website (as has happened with elle.co, for example).
As such, my goal here is to figure out how to let ma & pa joe run their own social media site on the fediverse, while staying GDPR compliant.
Of course, the same can be said of surgery but it's still not allowed. Obviously the harm from letting anyone try it is much worse than strictly regulating it, but is running a social media site on the fediverse likewise so harmful? Is there no way at all to strike the balance?
I've been thinking about this. You are right of course, but I'd wager that this is outside of what most folks running instances can afford. In particular new devs who want to run their own single user instance.
So what's the way forward? I have come up with an idea for this. Basically we need to get some organization like the EU branch of the Electronic Frontier Foundation (EFF) to research this and come up with a HOWTO guide that covers most of the average cases - along with pointers on when something is not covered by the guide (so at least you know going in that you'd need to pay for that extra legal firepower).
I think you have understood correctly. This actually provided me with the epiphany that I needed. On forum-like software that speaks ActivityPub (like pyfedi or mbin), there's no need to actually transfer the content. Just send me a notification - with the "user" being a bot account named something like "federation_bot_messenger" - containing a link to the new post or comment, then bubble it up to the user to open in their browser. No content is shared, and no identifiers like a user name get shared, so there's no risk of a GDPR violation. It's just a link.
One could imagine that fancier web UIs might use an iframe or something to display the content in place instead of requiring an extra manual click - but it's still only in the end user's browser that the content is transferred.
We could still have traditional federation - but just as you describe, the allow list for that is only for those instances where you know the folks (have contracts you said) and thus are assured that the transfer of content complies with the GDPR. For unknown instances, just do the link sharing. It could be implemented in a way that instances running older software would still see a post by the bot account with just the link inside. (Perhaps as an enhancement, folks could designate a trusted instance as the primary - e.g. my instance trusts lemmy.world as primary, so when it sends the links out, it sends out a lemmy.world link, to take the load off of my own instance from users clicking on links.)
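To make the idea concrete, here's a rough sketch of what I have in mind. Everything in it is an assumption for illustration: the `build_activity` helper, the `TRUSTED_INSTANCES` allow list, the example domains and the exact payload shape are all made up, and it's only ActivityPub-style JSON, not something checked against what pyfedi, mbin or Lemmy actually send.

```python
# Hypothetical names throughout: the bot account URL, the allow list and
# the helper below are illustrations, not part of any existing software.
BOT_ACTOR = "https://my.instance.example/u/federation_bot_messenger"
TRUSTED_INSTANCES = {"trusted.example"}  # instances we have agreements with


def build_activity(post_url, author, body, target_instance, primary_url=None):
    """Return the ActivityPub-style payload to send to target_instance."""
    if target_instance in TRUSTED_INSTANCES:
        # Traditional federation: full content and the real author.
        return {
            "type": "Create",
            "actor": author,
            "object": {"type": "Note", "id": post_url, "content": body},
        }
    # Link-only mode: no content and no user name, just a pointer.
    # If a trusted primary instance already has a copy, point there
    # instead to take click load off the small origin instance.
    link = primary_url if primary_url is not None else post_url
    return {
        "type": "Create",
        "actor": BOT_ACTOR,
        "object": {"type": "Note", "content": link, "url": link},
    }


if __name__ == "__main__":
    activity = build_activity(
        "https://my.instance.example/post/1",
        "https://my.instance.example/u/alice",
        "hello fediverse",
        "unknown.example",
    )
    print(activity)
```

An instance running older software would just see a post from the bot account whose body is the link, which is the backwards-compatible behaviour described above.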
Or am I missing anything here?
I think this is a bit unfair. Clearly they had technically knowledgeable advisors at the very least. After all, they came up with exceptions like this,
That said I think I might have been a bit unfair to the lemmy devs. From https://tech.michaelaltfield.net/2024/03/04/lemmy-fediverse-gdpr/ I can see that pretty much all of the issues raised directly on lemmy itself have since been resolved - by a dev writing code to fix the problem. Even if GDPR isn't the highest priority, the devs are clearly at work trying to address what they can when they can.
Hold on. You can't keep personal data longer than needed. Making data disappear from the web is one important demand by the GDPR.
Comments are problematic because they inherently relate to other persons besides yourself. It could be argued that you have to delete your own writings as well when you shut down your instance. Or it could be argued that other people's posts may be kept (possibly anonymized) because otherwise your personal data would be incomplete. The 2nd is obviously what reddit is doing. That seems to draw more criticism than praise from the lemmy community, to put it mildly.
The GDPR gives you rights over data, like copyright does. It inherently gives you a right to control what other people do on their own with their own physical property.
You don't need to ask me. The GDPR is a terrible mistake, but that's not what people want to hear. People don't know the law and just choose to believe a happy fantasy. I believe there is no way - at present - that an ordinary person can maintain an internet presence while being compliant with GDPR and other regulations. Mind, you also need to comply with the Digital Services Act and other stuff. With some skill, you can probably do a webpage, even with ads, but nothing where you interact with visitors and must collect data.
Yes. The DPOs issue guidance and send out newsletters. That would be a place to start. Unfortunately, the different DPOs don't agree on everything. Maybe in a few years, this will all be at a point where ordinary people can be on the safe side by simply following a manual. The problem is that this will still require extra time and effort. Well, content moderation also requires a lot of time and effort. Maybe it won't be so much extra effort that it becomes impossible for hobbyists, but - on the whole - the future of the European internet belongs to big players.
I was thinking the same. Ironically, that is a problem because if there is such an alternative, then it must be used. If you can reach your goal by processing less personal data, then you must do so.
You'd only be hosting the communities created on your own instance. Apart from that, you'd simply authenticate the identities of users. One question is what that would do to server load. I don't know.
Unfortunately, confirming the identities also means transferring personal data. It would also mean that the remote instance is able to connect an IP-address to a username; potentially allowing the real life identity to be uncovered. Proxying the posts/comments may be the better solution, but when and how that should be done has no clear answer.
Yes. Those are commonly referred to as industry lobbyists.
I don't know what exception that is. There are rules for data breaches. I'm not at all sure how much you have to do to block crawlers.
Agreed, but - while it might be permissible legally to wipe out my data and content, what if I want to retrieve a copy afterwards?
I wouldn't want to keep control over other people's content, but regarding my own...
Well, in that case, barring credible contradicting information from another source, I think it's reasonable to accept the note from the former worker of a DPO. Would you agree?
Hmm. Will need a good think about this - perhaps I should adjust my commenting style to avoid direct quoting and such...
All the more reason to get started on it, I suppose.
Well, and dealing with responsibility for user content from your instance's local users - but since it's just the one instance (or small handful if you trust a few others) it's still much more manageable. And it becomes zero for, e.g., single-user instances (since those would have zero other users and thus zero other content to worry about hosting).
That's why I had the idea of creating and using the federation-bot account - this way there's no confirmation of identities or transfer of personal data.
Server admin question. Can save that for serverfault.com and the like IMVHO
One of those things that need experimentation and research to determine, but an answer can be found.
Hmm - if different DPOs can't agree, then I don't see how we get to the point of a user friendly manual.
This is what's inherently disturbing to me. I am one of those hoping that the GDPR would be a tool for the opposite (a way to rein in the big players, so to speak).
It was a surprise to read from the former DPO worker that email as a system is not compliant with the GDPR.
Hmm. I am starting to see why you take this view. Not saying I agree, but I can understand the frustration. That said, PIPEDA in Canada came to pass in 2000 - it's considered to have GDPR-equivalency and we've not had the sort of issues that you are raising with PIPEDA, which makes me optimistic that the GDPR can likewise be something that folks can live with.
Even if it is flawed it's still a step in the right direction IMVHO. I'm in Canada, which had PIPEDA back in 2000 - 18 years before the GDPR took effect in the EU. Hence I believe a solution is workable and a balance can be struck - even if in the worst case that means additional legislation to tweak the existing law. (Though I'd not even go that far - for example, from the former DPO, it seems that if EU courts all agreed that the API behind federation was covered by the "involuntary data transfer" exception then Lemmy would already be GDPR compliant (or mostly so) as-is, as of the time that I write this.)
You have the right to request a copy of all your personal data from whoever controls it. Apparently that feature is still missing from lemmy.
That quote is from here: https://lemmy.world/post/1060627
I think I agree with pretty much everything they wrote. From what I understand, the apostrophes indicate that this is not official jargon. You can't prevent web-scraping with any reasonable effort, so you don't have to. The internet already exists. It's too late to stop it now; better focus on stopping future progress.
Mind that there is nothing involuntary about federation. It's not like web-scraping in that respect. You can just turn it off. You are left with something like an old school forum or reddit. No problem.
If you take the view that context is a necessary part of your personal data, then merely avoiding quotes is probably not enough. Practically, the way reddit is doing things seems to be fine.
But what if someone wants to participate in a community on a different instance? At least, the texts and their context, along with the username and home instance, need to be revealed.
Taking a mental step back, it's probably premature to worry about technological implementations. Sending data around does not have to be a violation. Compliance will require partly better information, and partly different administration. The legal aspects should be worked out before the necessary tools for the administrators are implemented.
There is also a lot of regulation for the backend that instance owners have to comply with but which won't be noticed by users. Documenting the data processing, who has access, possibly making data impact assessments, maybe notifying the local data protection office, ... There's also more from the DSA, like releasing transparency reports on moderation twice a year, making regular backups and testing those, ... I'm not quite sure what all is demanded by the DSA. Oh, and by German law there also needs to be a (physical) address that can be served legal papers.
I'm thinking about the issue of web-scraping, in particular. Some say that it's almost always illegal. The European Commission, for one, disagrees.
I pulled this from google: https://www.morganlewis.com/pubs/2024/05/eu-regulator-adopts-restrictive-gdpr-position-on-data-scraping-impacting-ai-technologies
Web-scraping is in some ways related. You could also get (almost all of) the data through scraping. If it's not legal to scrape lemmy without permission, then it's probably not legal to spin up your own instance and get the data that way. It depends on your purpose, of course.
That's also why I find the whole issue a little silly. Someone outside Europe could just scrape the data from the web interface and not worry about the GDPR. You'd have to put all of Europe behind a firewall to make it make sense. That's a prime example of why I say the people in charge of the GDPR have no idea of the technology they are regulating.
Such regulation inherently favors big players. The cost of creating a compliant service/app/etc is fairly constant, regardless of the size of the user base.
Besides, the GDPR inherently favors elites. Has anyone ever tracked your private jet on twitter? Or chased after you to get paparazzi pictures? Some people's personal data is worth a lot more than that of others. Most people will never have to worry about scrubbing unflattering media stories from search engines, or have the money to hire professionals to do it right.
Tell me what you hope the GDPR will achieve and I'll tell you if there is any chance. I'd write what the fundamental problems are, but time is short.
Sorry for the late response, your last comment didn't federate, so I just saw it.
I run my own single user instance and it's not that hard... I'd have to make some SQL queries to the database directly to retrieve the info but it's straightforward.
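Roughly the kind of thing I mean, as a sketch only. The table and column names here (`person`, `post`, `comment`, `creator_id`) are assumptions for illustration, and I'm using an in-memory SQLite database so the example runs standalone; a real Lemmy install uses PostgreSQL and its actual schema should be checked before adapting any of this.

```python
import json
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with a made-up, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE post (id INTEGER PRIMARY KEY, creator_id INTEGER,
                           name TEXT, body TEXT);
        CREATE TABLE comment (id INTEGER PRIMARY KEY, creator_id INTEGER,
                              post_id INTEGER, content TEXT);
        INSERT INTO person VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO post VALUES (1, 1, 'Is GDPR workable?', 'Discuss.');
        INSERT INTO comment VALUES (1, 1, 1, 'My own reply'),
                                   (2, 2, 1, 'Reply from bob');
        """
    )
    return conn


def export_user_data(conn, username):
    """Collect one user's own posts and comments into a plain dict."""
    row = conn.execute(
        "SELECT id FROM person WHERE name = ?", (username,)
    ).fetchone()
    if row is None:
        return None  # unknown user
    person_id = row[0]
    posts = conn.execute(
        "SELECT id, name, body FROM post WHERE creator_id = ? ORDER BY id",
        (person_id,),
    ).fetchall()
    comments = conn.execute(
        "SELECT id, post_id, content FROM comment "
        "WHERE creator_id = ? ORDER BY id",
        (person_id,),
    ).fetchall()
    return {
        "username": username,
        "posts": [{"id": p[0], "title": p[1], "body": p[2]} for p in posts],
        "comments": [
            {"id": c[0], "post_id": c[1], "content": c[2]} for c in comments
        ],
    }


if __name__ == "__main__":
    print(json.dumps(export_user_data(make_demo_db(), "alice"), indent=2))
```

Note that it only pulls rows where I'm the creator, so other people's replies stay out of the export.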
Yep that's the one.
Agreed.
Yes but that also makes it less useful and viable, unfortunately. I guess it really is like email if we consider federation an essential feature. I can set up my own email server that doesn't talk to any other, but then it's not too useful since it'd just be me sending emails to myself.
So, federation is a must, but the question is how to make it work.
What more would need to be done?
And now I hit some kind of length limit so I had to break up the post. Moving right along,
It would still work. The different instance would fetch the link containing the requested content and pass that on to the end user, where either the web UI running on the user's browser or the user's app would load the content. (Akin to a web browser loading the web page). It'd be up to the piece running on the end user's computer to match it all together.
Yes, but the point is that, like an old-school forum, this is not revealed except by (and from) the original instance hosting the content, and only to the end user. It's not revealed until the end user's app/browser fetches the content from the original server. So since only a link is federated, the PII only exists on those two places. Meaning that the server admin has a much easier job to delete data, as they only have to get it deleted off their own instance.
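Here's a toy model of that flow, just to show why deletion gets simpler. The class and function names (`OriginInstance`, `RemoteInstance`, `open_in_browser`) are made up for this sketch, and there's no real networking in it; the "browser" is a plain function call standing in for an HTTP fetch from the origin server.

```python
class OriginInstance:
    """Hosts the actual content; the only server that ever stores it."""

    def __init__(self, base_url):
        self.base_url = base_url
        self.posts = {}

    def publish(self, post_id, author, body):
        self.posts[post_id] = {"author": author, "body": body}
        # Only this link is federated, never the author or body.
        return f"{self.base_url}/post/{post_id}"

    def fetch(self, link):
        post_id = link.rsplit("/", 1)[-1]
        return self.posts.get(post_id)  # None once the post is deleted

    def delete(self, post_id):
        self.posts.pop(post_id, None)


class RemoteInstance:
    """Receives links only; never stores author names or bodies."""

    def __init__(self):
        self.links = []

    def receive(self, link):
        self.links.append(link)


def open_in_browser(origin, link):
    """Stand-in for the end user's browser or app fetching from the origin."""
    return origin.fetch(link)


if __name__ == "__main__":
    origin = OriginInstance("https://my.instance.example")
    remote = RemoteInstance()
    remote.receive(origin.publish("42", "alice", "Is this compliant?"))
    print(open_in_browser(origin, remote.links[0]))
    origin.delete("42")
    print(open_in_browser(origin, remote.links[0]))
```

After the delete, the remote instance is left holding a dead link and nothing else, so there's no second copy of the content for the admin to chase down.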
If the end user then does webscraping ... well how can you prevent that?
And if someone creates a malicious instance that follows the link and screenscrapes it ... I assume it also falls under the "cannot prevent" bucket.
The problem here is that means we devs have to sit back and wait. When will we get the answers we need? And how long do we have to be exposed before we can actually work on solving the problem?
We really do need a foundation like the EFF to provide that legal advice and support, but I think coming up with technical fixes is still worthwhile even as we wait...
It seems like a good legal guide for the admin's and the instance's jurisdiction is a must.
Interesting. In the US you can hire a lawyer to serve that purpose, typically. In some jurisdictions, I wonder if something like https://www.alliancevirtualoffices.com/ may also work.
You've mentioned this a bunch of times but .. what's the DSA again? I have no doubt it's related but curious to understand exactly what it is and how it fits in.
Could there be jurisdictions that have only DSA and no GDPR, and others with GDPR and no DSA?
Ok, once more, continuing,
Thank you, that's a really good example! I understand the need to rein in AI, of course. My point stands (and it doesn't seem like you disagree) - a user friendly manual remains difficult to achieve.
Interesting. So pyfedi is a good example - the software supports backfilling when the instance discovers a new community/magazine on another instance for the first time, but it does it via API only. This means no backfilling of comments, and sometimes you can see posts from years ago in a stale magazine but which don't get backfilled because the API doesn't return them.
Clearview AI is a good example of exactly this kind of bad actor, see https://lemmy.world/comment/12151959
But it seems like even then there are ways to enforce.
Interestingly I've seen the reverse happen - websites blocking access to ip addresses that appear to be based in the EU to avoid having to deal with the GDPR and its ramifications.
I disagree. The issue you're describing is a common one in terms of extraterritoriality. How does the IRS get US citizens who are dual citizens living abroad to still pay taxes to the US? Enforcing laws extraterritorially is never easy, but as the IRS has proven, it is possible.
Me too. I'd say this is point one of what I'd like the GDPR to achieve.
Same here. I'm thinking one way forward may be to add funding to expand the agencies - one side does the regulation, but the other side offers free services to small business and individuals to help them comply.
No, I think that's a plus of the GDPR. Cost is on the company to comply and relevant gov't agency to chase up if the company doesn't. Facebook was brought in line, so it seems like a success so far. An example of point one above working.
Isn't this specifically covered by the journalism exception that the GDPR provides? https://verfassungsblog.de/the-gdprs-journalistic-exemption-and-its-side-effects/
I can kind of understand this though. What if I want that hidden so militants with missiles can't shoot me down? Easily justifiable by protection of life.
See where I mention point one above.
Seeing as it's a couple of months later, I'd add that I'm willing to wait if you think you will ever get around to it. Though you have already brought up some good points - the most salient one being that GDPR compliance is simply too expensive and not user friendly for a small-time individual - I still feel that this is something that can be improved upon without major revisions to the GDPR itself.