this post was submitted on 22 Jun 2023
171 points (98.9% liked)

Lemmy


Everything about Lemmy; bugs, gripes, praises, and advocacy.

For discussion about the lemmy.ml instance, go to [email protected].


Hey folks! Just realized something that makes Lemmy different from Reddit. Because of the federation, your votes are not technically anonymous on Lemmy. At least, I think.

Although there’s no UI to look at a user’s voting history yet, one could conceivably be built by an instance. Perhaps coincidentally, I hear there are instances out there populated mostly by bots?

[–] [email protected] 66 points 1 year ago (3 children)

From a technical standpoint, it's not different from Reddit. The only difference here is that normal people can host their own instances, whereas Reddit is only hosted by the company and they can keep it under wraps.

[–] [email protected] 34 points 1 year ago (4 children)

Agreed from a technical standpoint.

But the implications are still interesting. One might (big might) trust Reddit as an organization not to use this data for evil, but with federation, there’s nothing stopping an instance from simply releasing all users’ voting history to be public.

Of course, my instance didn’t even ask for an email to sign up, so my entire account is anonymous that way.

I wonder if there are technical ways to federate votes anonymously?

[–] [email protected] 19 points 1 year ago (1 children)

Yeah, I wonder how you can federate anonymously while still maintaining defenses against vote manipulation.

[–] [email protected] 5 points 1 year ago (2 children)

I think you could probably do something like have the votes be reported in aggregate by the instance.

Any individual instance admin could use defences against vote manipulation by their own users, and other instances' admins could use defences against one particular instance being widely used for vote manipulation.
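
A rough sketch of what aggregate reporting might look like, not anything Lemmy actually does. The function name `aggregate_votes` and the `(user, post_id, value)` tuple shape are made up for illustration. The home instance deduplicates its own users' votes and only sends per-post totals, so no usernames leave the server:

```python
def aggregate_votes(local_votes):
    """Collapse per-user votes into per-post totals before federating.

    local_votes: iterable of (user, post_id, value), value is +1 or -1.
    Returns {post_id: {"up": n, "down": m}} with no usernames in it.
    """
    seen = set()
    totals = {}
    for user, post_id, value in local_votes:
        if (user, post_id) in seen:
            continue  # the home instance still enforces one vote per user
        seen.add((user, post_id))
        bucket = totals.setdefault(post_id, {"up": 0, "down": 0})
        bucket["up" if value > 0 else "down"] += 1
    return totals


# Hypothetical local vote log; alice's repeated vote is ignored.
votes = [
    ("alice", "post-1", 1),
    ("bob", "post-1", -1),
    ("alice", "post-1", 1),
    ("carol", "post-2", 1),
]
report = aggregate_votes(votes)
```

Other instances would only ever see `report`, so they have to take the totals on trust.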

[–] [email protected] 3 points 1 year ago

I know some privacy oriented services (Brave Browser comes to mind) aggregate telemetry data like that to preserve privacy. Perhaps something like that is possible for Lemmy as well.

[–] [email protected] 1 points 1 year ago (1 children)

Someone could just run a rogue instance and host all their bots there, hiding it from everyone else.

[–] [email protected] 2 points 1 year ago

Right, but that’s where defederation comes in. Good-faith admins can detect manipulation by their own users and selectively ban them, while a bad-faith admin running a server full of brigaders can be defederated if, for example, other admins detect anomalous patterns coming from that instance.
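
One very crude example of an "anomalous pattern" check, purely as a sketch. It assumes a receiving admin keeps their own count of distinct accounts they have previously seen from each instance; the function name, the threshold, and the instance names are all hypothetical. On a single post each account can vote at most once, so an instance reporting far more votes than known accounts looks suspicious:

```python
def suspicious_instances(votes_by_instance, users_by_instance, max_votes_per_user=1.0):
    """Flag instances reporting more votes on one post than their known users allow.

    votes_by_instance: {instance: votes reported for a single post}
    users_by_instance: {instance: distinct accounts previously seen from it}
    """
    flagged = []
    for instance, votes in votes_by_instance.items():
        users = users_by_instance.get(instance, 0)
        if votes > users * max_votes_per_user:
            flagged.append(instance)
    return sorted(flagged)


# Hypothetical numbers for one post.
votes_seen = {"a.example": 40, "bots.example": 900}
known_users = {"a.example": 200, "bots.example": 12}
suspicious = suspicious_instances(votes_seen, known_users)
```

A real check would need something smarter than a single ratio, but the decision it feeds is the same: keep federating or defederate.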

[–] [email protected] 12 points 1 year ago* (last edited 1 year ago)

but with federation, there’s nothing stopping an instance from simply releasing all users’ voting history to be public.

Which kbin.social does.

[–] [email protected] 12 points 1 year ago (2 children)

Maybe you could hash the user and post together somehow, so that the vote is hashed but the hash is also unique per post. If you only hashed the username, the user's entire voting history would be exposed if the hash were reversed.
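
Something like this, as a minimal sketch. `vote_token` is a made-up name and SHA-256 is just one possible hash, not something Lemmy uses for votes. The same user gets a different token on every post, so tokens can't be linked across posts by simple comparison:

```python
import hashlib


def vote_token(username, post_id):
    """Hash username and post ID together so the token is unique per post."""
    # A NUL separator avoids ambiguity between e.g. ("ab", "c") and ("a", "bc").
    data = f"{username}\x00{post_id}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


# Hypothetical user and post IDs.
token_post_1 = vote_token("alice@example.social", "post-1")
token_post_2 = vote_token("alice@example.social", "post-2")
```

The token is still deterministic, so anyone who knows the username and post ID can recompute it.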

[–] [email protected] 6 points 1 year ago (1 children)

Could be hashed and salted, with a random salt.

The trouble is, then, that it’s harder to disallow users from voting multiple times if the voting user isn’t on the post’s home instance.
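
To illustrate the trouble, a sketch with a hypothetical `salted_vote_token` helper. With a fresh random salt per vote, two votes by the same user on the same post produce unrelated digests, so a remote instance that doesn't know the salts can't tell they are duplicates:

```python
import hashlib
import secrets


def salted_vote_token(username, post_id, salt=None):
    """Return (salt, digest); a fresh random salt is drawn if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    data = salt + f"{username}\x00{post_id}".encode("utf-8")
    return salt, hashlib.sha256(data).hexdigest()


# The same user voting twice on the same post.
salt_1, first = salted_vote_token("alice", "post-1")
salt_2, second = salted_vote_token("alice", "post-1")

# Only whoever holds the original salt can reproduce the first digest.
_, recomputed = salted_vote_token("alice", "post-1", salt_1)
```

Here `first` and `second` differ, while `recomputed` matches `first`, which is why deduplication ends up depending on the instance that holds the salt.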

[–] [email protected] 5 points 1 year ago (2 children)

Couldn't someone vote multiple times anyway by just having a bunch of different accounts?

[–] [email protected] 3 points 1 year ago

Yes, true, the current system does allow that. But the current system also doesn’t allow users to accidentally vote twice (and it remembers your vote). This is the feature I think would be more challenging to implement if we were to hash & salt the user's ID.

[–] frostphunk 1 points 1 year ago

That’s always been a problem on Reddit and is on Lemmy now too though

[–] [email protected] 4 points 1 year ago (1 children)

Hashing can't effectively protect known values. If you want to know if someone voted for a post you can just hash their username and post ID. This is trivial and cheap.

If you want to know who voted on a post, you just collect every username you can find and hash each one. It isn't super cheap, but it isn't very expensive either. There are only about 8 billion people on the planet, and many bitcoin rigs can compute that many hashes in seconds. Sure, you can use a more expensive hash, and there may be more accounts than people, but it will remain feasible.

This is the same reason you can't hash phone numbers in a meaningful way.
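
A sketch of that enumeration, assuming votes are published as an unsalted SHA-256 of username plus post ID. The `vote_token` and `unmask_voters` names and the usernames are made up:

```python
import hashlib


def vote_token(username, post_id):
    """Unsalted hash of username and post ID (the scheme being attacked)."""
    return hashlib.sha256(f"{username}\x00{post_id}".encode("utf-8")).hexdigest()


def unmask_voters(published_tokens, known_usernames, post_id):
    """Hash every known username and see which hashes were published."""
    lookup = {vote_token(name, post_id): name for name in known_usernames}
    return sorted(lookup[t] for t in published_tokens if t in lookup)


# Tokens an instance published for post-1, and a scraped list of usernames.
published = {vote_token("alice", "post-1"), vote_token("carol", "post-1")}
found = unmask_voters(published, ["alice", "bob", "carol", "dave"], "post-1")
```

The cost is one hash per known username per post, which is why a slower hash only raises the price rather than preventing the lookup.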

The best option is probably just for the instance to report counts, and you just have to trust it. If it is noticed that an instance seems to be inflating votes, you stop counting its votes. People can work together to create blocklists for known cheating instances. Your own instance would still know how you voted, but at least that stays within an instance you chose to trust rather than being federated publicly.
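
On the receiving side that could be as simple as the following sketch. The `tally` function, the report shape, and the instance names are hypothetical:

```python
def tally(reports, blocklist):
    """Sum reported (up, down) counts, skipping blocklisted instances.

    reports: {instance: (upvotes, downvotes)} as reported by each instance.
    blocklist: set of instance names whose counts are no longer trusted.
    """
    score = 0
    for instance, (up, down) in reports.items():
        if instance in blocklist:
            continue  # stop counting votes from known cheating instances
        score += up - down
    return score


reports = {
    "a.example": (10, 2),
    "b.example": (3, 1),
    "bots.example": (5000, 0),
}
trusted_score = tally(reports, blocklist={"bots.example"})
naive_score = tally(reports, blocklist=set())
```

With the blocklist applied the score is 10; without it the inflated instance dominates at 5010.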

[–] [email protected] -1 points 1 year ago (1 children)

Nah, if you can properly hash a password such that it doesn't match the same properly hashed password from a different website then you can properly hash usernames in this case such that others couldn't reverse it or put in the same input and get the same output you created. The technology is there. It's more of a question if it's really worth it. At least for now I'm not concerned with a malicious admin leaking someone's vote history.

https://en.wikipedia.org/wiki/Salt_(cryptography)

[–] [email protected] 2 points 1 year ago

No, hashing passwords is a different case because you know who the user is, so you can use a unique salt per user. The password itself is also high entropy. For this use case you can have at best a per-post salt.

Think about it. The task you are asking for is to quickly check whether a user has voted for a post, to prevent duplicates. So the operation you want is literally the same one you are trying to prevent. If you can enumerate users, then you can by definition check whether they have voted for a post.
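
In code terms, a sketch assuming a per-post salt that every federating instance has to know. All names here are hypothetical. The duplicate check and the "did this person vote?" lookup are the same function call:

```python
import hashlib


def token(username, post_id, post_salt):
    """Per-post salted hash; the salt must be shared for dedup to work."""
    data = post_salt + f"{username}\x00{post_id}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def has_voted(stored_tokens, username, post_id, post_salt):
    """Used to reject duplicate votes, and equally usable to snoop."""
    return token(username, post_id, post_salt) in stored_tokens


post_salt = b"per-post-salt-known-to-all-instances"
stored = {token("alice", "post-1", post_salt)}

is_duplicate = has_voted(stored, "alice", "post-1", post_salt)  # dedup check
bob_voted = has_voted(stored, "bob", "post-1", post_salt)       # snooping check
```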

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago) (1 children)

But the implications are still interesting. One might (big might) trust Reddit as an organization not to use this data for evil, but with federation, there’s nothing stopping an instance from simply releasing all users’ voting history to be public.

Another potential privacy issue is that deleted content stays on the server, and I believe it's similar with posted images.

[–] [email protected] 4 points 1 year ago (2 children)

I think this issue is overblown. Instances of Lemmy might run modified code and choose to save things that the user intended to delete, of course, but the default setup of Lemmy seems reasonable to me in terms of how it treats deletion.

Currently it keeps deleted posts forever to allow users to un-delete if they choose, but deleting your account clears everything. And I believe there’s work in progress to discard deleted posts after 30 days. Details here: https://github.com/LemmyNet/lemmy/issues/2977
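
Just to show the idea of a 30-day window, a sketch that is not Lemmy's actual code or schema. The `purge_deleted` function and the `deleted_at` field are made up:

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # the window mentioned above


def purge_deleted(posts, now):
    """Keep live posts and posts deleted less than RETENTION ago."""
    return [
        post for post in posts
        if post["deleted_at"] is None or now - post["deleted_at"] < RETENTION
    ]


now = datetime(2023, 7, 31, tzinfo=timezone.utc)
posts = [
    {"id": 1, "deleted_at": None},                                        # live
    {"id": 2, "deleted_at": datetime(2023, 7, 20, tzinfo=timezone.utc)},  # 11 days ago
    {"id": 3, "deleted_at": datetime(2023, 6, 22, tzinfo=timezone.utc)},  # 39 days ago
]
kept = purge_deleted(posts, now)
```

Only post 3 is dropped; post 2 could still be un-deleted within the window.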

[–] [email protected] 3 points 1 year ago* (last edited 1 year ago)

Thank you for pointing this out. I was looking into privacy in relation to Lemmy and came across this post where I got the wrong idea I guess. I couldn't find much else online at the time

And I believe there’s work in progress to discard deleted posts after 30 days.

That would be a nice addition

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago) (1 children)

This keeps on being asserted but it is far from true. If defederation happens or your local instance goes offline, posts/comment history/profile/votes will remain on other widely used instances and out of your control.

A large instance has already defederated with 2 other larger instances. If you run a personal instance, I feel it will become very, very common to be locked out of managing your data.

You can expect defederation to happen all the time as that is a deliberate part of the open federated model.

And that is to say nothing about federation simply breaking sometimes.

I already have been locked out of content that exists on other instances that will remain forever and I've only been around a short while. I don't care personally, but people keep asserting this claim that only bad actors or scrapers will dupe your data. Federated data is very different than a non-federated copy for many reasons and that matters to some people. Everyone should understand deleting your account, or modifying your content will often not remove your content outside your instance, and many people engage outside their local. It will likely exist in federated, Lemmy searchable form forever in some capacity (in the current iteration anyway).

Not trying to spread FUD, but if we want to maintain users they have to be educated as they will find out eventually and not be happy.

I have some working drafts on policies for admins to help them navigate and explain their responsibilities to their users.

It is a bit of a weird read outside of the context, but this is an optional primer I have drafted that will hopefully help explain the distinctions:

https://github.com/BanzooIO/federated_policies_and_tos/blob/main/optional-privacy-policy-intro.md

[–] [email protected] 4 points 1 year ago* (last edited 1 year ago) (1 children)

Yes, that’s a fair point. Just because you send a “I have deleted this message” signal out into the universe doesn’t mean that everyone will receive or obey it.

I assumed that was understood.

But that’s very different from instances intentionally and malevolently keeping data despite indicating to users that it was deleted, which is what I think folks’ privacy concerns are about.

EDIT: What I mean is that the federation model is inherently non-private in a certain sense (but in the same sense that someone could take a screenshot of your Reddit comment and your deleting your comment won’t delete their copy). But Lemmy is not egregiously misusing data.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago) (1 children)

This is largely assumed by someone like you or me who understands the implications. I am finding it evident that a lot of people are not aware.

There is also a distinction between a potential screenshot, a scrape or archive no one visits, and a federated copy on a widely used instance you have lost access to.

I edited my comment above to include a project I am working on to hopefully help admins get this across and educate users on how to appropriately engage to their comfort level.

[–] [email protected] 2 points 1 year ago (1 children)

I appreciate your commitment to this privacy consideration. I personally don’t think it’s the hill I’d prefer to die on, but I welcome your contributions.

[–] [email protected] 0 points 1 year ago

Thanks! I'm for mass adoption and want admins to succeed. That starts with keeping users educated (and admins covered).

[–] [email protected] 13 points 1 year ago (2 children)

In fact, Reddit has suspended people for upvoting before.

[–] [email protected] 17 points 1 year ago (1 children)

True, but in Unidan's defense, it was a jackdaw, not a crow.

[–] [email protected] 3 points 1 year ago (1 children)

We need Unidan back now more than ever 🤗🐦‍⬛

[–] [email protected] 2 points 1 year ago
[–] [email protected] 12 points 1 year ago (1 children)

You're kidding surely. That's actually awful. Any source for this? Would love to read more about it.

[–] IMongoose 7 points 1 year ago

Not for normal upvoting, but for vote manipulation like what was mentioned above with Unidan. Basically using multiple accounts to upvote your own post for visibility.

[–] [email protected] 1 points 1 year ago (1 children)

That's not really true, since on reddit only the one host can see the votes, as opposed to anyone who is willing to put the effort in.

[–] [email protected] 1 points 1 year ago

That's exactly what I mean when I said:

whereas Reddit is only hosted by the company and they can keep it under wraps.