this post was submitted on 22 Jun 2023
Lemmy.world Support
Part of me wonders if we're starting to see a limitation of ActivityPub? I might be wrong but I don't see any sort of batching in the protocol? So every post, comment or heck every upvote will not only trigger a service call from your machine to your instance but would it not also trigger dozens or more service calls from your instance out to all of the federated servers? And as more and more instances come online the number of service calls per action starts to grow exponentially?
I don’t think it’s a protocol issue. It’s not that different from how e-mail lists work. Except most users are on a very small number of big instances (which makes it easier).
In Usenet it’s even worse, as messages have to go through multiple hops and aren’t filtered by what groups users subscribe to, and that worked decades ago on hardware less powerful than a cheap cloud instance is today.
Lemmy is still tiny in comparison so I think it must be a software issue.
Ya I keep forgetting it's similar to email. But at the same time, for every reply, is it not similar to doing a reply-all, with each federated instance being in the recipients list? And don't forget what happens when everyone starts doing a reply-all on an email with a large distribution list... The email server ends up taking hours or even days to process that queue. Hopefully I'm overthinking it, but I'll learn more over the next few weeks as I study ActivityPub more and more.
Please share what you learn and observe: [email protected]
Lemmy hasn't even reached 1990s-level email system design for a database-backed MTA. The outbound delivery queue has no management tools, is not saved when a server stops, and runs in-process with the rest of lemmy_server, the same service that handles interactive end users. Federation really needs to be moved to a separate server service; the massive number of connections needed to send votes and comments isn't even holding up under current loads. And this is without major social events generating dozens of comments per second. The influx of people leaving Reddit was largely limited by signup problems: major servers crashing and restricted signup approvals.
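To make the "queue is not saved when a server stops" point concrete, here is a minimal sketch of an outbound queue that lives on disk and can report its backlog per destination. This is not how lemmy_server works; the table name `outbox`, its columns, and the helper names are all hypothetical, and the `send` callback stands in for whatever actually performs the HTTP delivery.

```python
import sqlite3

def open_queue(path):
    """Open (or create) an on-disk delivery queue. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " destination TEXT NOT NULL,"
        " payload TEXT NOT NULL,"
        " attempts INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn

def enqueue(conn, destination, payload):
    """Record a pending delivery; it is committed to disk before returning."""
    conn.execute(
        "INSERT INTO outbox (destination, payload) VALUES (?, ?)",
        (destination, payload),
    )
    conn.commit()

def backlog(conn):
    """Return {destination: pending count}, so an operator can spot a stuck peer."""
    rows = conn.execute(
        "SELECT destination, COUNT(*) FROM outbox GROUP BY destination"
    )
    return dict(rows.fetchall())

def deliver_next(conn, send):
    """Try one pending item, least-attempted first, then oldest.

    `send(destination, payload)` is an assumed callback returning True on
    success. Returns True/False for the attempt, or None if the queue is empty.
    """
    row = conn.execute(
        "SELECT id, destination, payload FROM outbox ORDER BY attempts, id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    item_id, destination, payload = row
    delivered = bool(send(destination, payload))
    if delivered:
        conn.execute("DELETE FROM outbox WHERE id = ?", (item_id,))
    else:
        conn.execute(
            "UPDATE outbox SET attempts = attempts + 1 WHERE id = ?", (item_id,)
        )
    conn.commit()
    return delivered
```

Because the queue is just a file, a separate worker process could drain it while the interactive service only ever calls `enqueue`, which is roughly the separation being argued for above. Ordering by `attempts` first is one simple way to keep a single failing destination from blocking everything behind it.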
Huh! That's possible, but that doesn't sound right to me. I think, for example, that if I post a comment on a lemmy.ml thread, my client sends a request to A (lemmy.world), then A sends a request to B (lemmy.ml), and third parties (C) would only see the result of my comment if they individually request content from B.
I don't think cross-instance interaction is shared with every federated instance at once. They each need to go "find out" on their own.
From what I can tell (and I've only been reading the docs for a few days now), everything is considered an actor: users, comments, posts, communities and even the instance itself. When you create a new post in a community you're essentially replying to that community's actor. If that community is not on your instance, then your instance needs to post that reply via an ActivityPub service call. Once the target instance receives that post request it adds it to its database, but then that instance needs to inform all of the other instances that something has been added to that community. So it'll queue up a service call to every other instance that has federated with that community's instance.
So in the end, if you're a user on instance A trying to create a post in a community that belongs to instance B, A has to tell B about the post. Then B, since it's the owner of that community, has to inform instances C, D, E, etc. that there's now a new reply to that community.
Again I might still be missing something as I've only been reading these for about a day or so. But if this is correct then each action anyone takes produces a lot of network traffic to update every other instance in the fediverse.
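The A to B to C, D, E flow described above can be written down as a small sketch that lists the server-to-server calls one action would cause. It only models the thread's description, not Lemmy's actual code: the function name is made up, and it assumes (as a simplification) that B does not echo the activity back to the instance it came from.

```python
def deliveries_for_action(author_instance, community_instance, follower_instances):
    """List the (sender, receiver) calls that one post, comment or vote causes.

    Simplified model of the flow described in the thread:
      1. the author's instance tells the community's home instance, and
      2. the home instance relays the activity to every other following instance.
    """
    calls = []
    # Step 1: only needed when the community lives on a different instance.
    if author_instance != community_instance:
        calls.append((author_instance, community_instance))
    # Step 2: fan out from the community's home instance.
    for follower in follower_instances:
        # Assumption: no call back to the origin, and none to itself.
        if follower not in (author_instance, community_instance):
            calls.append((community_instance, follower))
    return calls


calls = deliveries_for_action("A", "B", ["A", "C", "D", "E"])
print(calls)       # [('A', 'B'), ('B', 'C'), ('B', 'D'), ('B', 'E')]
print(len(calls))  # 4
```

In this model the number of calls per action is roughly the number of instances following the community, so it is the community's home instance (B) that carries most of the load.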
Oh, I see what you mean! That's right... Instances update their listeners, they're not polled.
So then what would cause the slowdown in updating federated instances? The servers themselves are still responding. I guess there must be a queue of updates to other instances that's not getting emptied fast enough? That would explain a comment of mine to a separate instance showing up here but not there. I'm not familiar with the inner workings.
Thanks for the discussion by the way!
Ya for every action we take, that could mean 500+ service calls that the instance itself has to make. (Just go to /instances on any server and count the number that are listed). And that's assuming that everyone is only communicating on their own instances and just lurking on content from other instances.
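As a rough back-of-the-envelope sketch of what those numbers imply: the 500 comes from the instance count mentioned above, while the action rate, send rate and backlog figures below are purely illustrative assumptions, as are the function names.

```python
def outbound_calls_per_second(actions_per_second, receiving_instances):
    """Each action is delivered once to every receiving instance."""
    return actions_per_second * receiving_instances


def seconds_to_drain(backlog, send_rate, arrival_rate):
    """Time for a delivery queue to empty, given calls sent and added per second.

    Returns None when sending does not outpace arrivals, because in that
    case the queue never empties.
    """
    if send_rate <= arrival_rate:
        return None
    return backlog / (send_rate - arrival_rate)


# Two dozen actions per second fanned out to 500 instances.
print(outbound_calls_per_second(24, 500))   # 12000

# A backlog of 6000 calls, sending 300/s while 200/s keep arriving.
print(seconds_to_drain(6000, 300, 200))     # 60.0

# Sending exactly as fast as new calls arrive: the backlog never clears.
print(seconds_to_drain(100, 200, 200))      # None
```

Under this model the cost of a single action grows linearly with the number of receiving instances, and the total outbound traffic is the product of action rate and instance count. A queue that "isn't getting emptied fast enough" corresponds to the case where the send rate is at or below the arrival rate.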
I'm currently trying to build an enhanced search engine just for the fediverse. My first thought was to build essentially an instance that has no communities and would subscribe to literally everything to get updates, but after looking at the ActivityPub protocol I'm worried that my server would instantly crash from all of the network traffic. I'm not quite ready to shell out for anything larger than a Pi just yet. So now I'm looking to see what I can do with the public APIs and just Lemmy. This way I can poll for the data rather than receive pushes....
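A polling approach along those lines might look like the sketch below. It assumes the Lemmy v3 HTTP API's `GET /api/v3/post/list` endpoint with `type_`, `sort`, `page` and `limit` query parameters, and a response shaped like `{"posts": [{"post": {...}}]}`; check the API docs for the version you target before relying on any of that. The helper names are hypothetical, and `fetch` is injectable so the logic can be tried without touching the network.

```python
import json
import urllib.parse
import urllib.request


def post_list_url(base_url, page=1, limit=20):
    """Build the listing URL (endpoint and parameter names assume Lemmy's v3 API)."""
    query = urllib.parse.urlencode(
        {"type_": "All", "sort": "New", "page": page, "limit": limit}
    )
    return f"{base_url.rstrip('/')}/api/v3/post/list?{query}"


def http_get_json(url):
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def poll_new_posts(base_url, seen_ids, fetch=http_get_json):
    """Return posts whose id has not been seen yet, recording them in seen_ids."""
    data = fetch(post_list_url(base_url))
    fresh = []
    for item in data.get("posts", []):
        post = item["post"]
        if post["id"] not in seen_ids:
            seen_ids.add(post["id"])
            fresh.append(post)
    return fresh
```

Calling `poll_new_posts("https://lemmy.world", seen)` on a timer, with `seen` kept as a `set`, lets the poller decide its own request rate instead of having every federated instance push to it, which is the trade-off being weighed here.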
Agreed. There is also a lot of protocol boilerplate overhead, such as digital signing, just to send a single Like of a comment to one server. The implementation also gives the server operator no way to know that another server is backing up or failing, and when the server is stopped, the queue gets lost.