We're aware of ongoing federation issues for activities being sent to us by lemmy.ml.
We're currently working on the issue, but we don't have an ETA right now.
Cloudflare reports a 520 (Origin Error) when lemmy.ml tries to send us activities, but the requests don't seem to properly arrive at our proxy server. Federation with all other instances has been working fine so far, though we have seen a few other requests, unrelated to activity sending, that occasionally report the same error.
~~Right now we're about 1.25 days behind lemmy.ml.~~
You can still manually resolve posts in lemmy.ml communities or comments by lemmy.ml users in our communities to make them show up here without waiting for federation, but this obviously is not something that will replace regular federation.
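For anyone who would rather script this than paste URLs into the search box, here is a minimal sketch of a manual resolve. It assumes the Lemmy v3 HTTP API's `resolve_object` endpoint and a login token (JWT) for your account on this instance. The function names `build_resolve_url` and `resolve` are made up for illustration, and the exact auth handling may differ between Lemmy versions.

```python
import json
import urllib.parse
import urllib.request


def build_resolve_url(home_instance: str, remote_object_url: str) -> str:
    """Build the resolve_object URL that asks home_instance to fetch a remote object.

    Assumption: the instance exposes GET /api/v3/resolve_object with a `q` parameter.
    """
    query = urllib.parse.urlencode({"q": remote_object_url})
    return f"https://{home_instance}/api/v3/resolve_object?{query}"


def resolve(home_instance: str, remote_object_url: str, jwt: str) -> dict:
    """Ask home_instance to resolve a remote post, comment, user, or community.

    Assumption: the token is accepted as a Bearer header; `jwt` is a placeholder
    for a token obtained by logging in to home_instance.
    """
    request = urllib.request.Request(
        build_resolve_url(home_instance, remote_object_url),
        headers={"Authorization": f"Bearer {jwt}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `resolve("lemmy.world", "<URL of the post on lemmy.ml>", jwt)` should make that single post show up here, but like the manual approach it only covers one object at a time.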
We'll update this post when there is any new information available.
Update 2024-11-19 17:19 UTC:
~~Federation is resumed and we're down to less than 5 hours lag, the remainder should be caught up soon.~~
The root cause is still not identified unfortunately.
Update 2024-11-23 00:24 UTC:
We've explored several approaches to identify or mitigate the issue: replacing our primary load balancer with a new VM, updating HAProxy from the latest version packaged in Ubuntu 24.04 LTS to the latest upstream version, and finding and removing a configuration option that may have prevented certain errors from being logged. So far we haven't made any real progress beyond ruling out various potential causes.
We're currently waiting for lemmy.ml admins to be available to reset federation failures at a time when we can start capturing traffic, so we can get more insight into what is hitting our load balancer. The problem seems to be either between Cloudflare and our load balancer or within the load balancer itself. Due to real-life time constraints we weren't able to find a suitable time this evening; we expect to continue with this tomorrow during the day.
As of this update we're about 2.37 days behind lemmy.ml.
We are still not aware of similar issues on other instances.
Update 2024-11-25 12:29 UTC:
We have identified the underlying issue: a backported fix for a bug that causes crashes in certain circumstances was accidentally reverted when another backport was applied. We have applied this patch again and we're receiving activities from lemmy.ml again. It may take an hour or so to catch up, but this time we should reliably get there. We're currently 4.77 days behind.
We still don't have an explanation for why the logs were missing in HAProxy for requests coming through Cloudflare, but this shouldn't cause any further federation issues.
Update 2024-11-25 14:31 UTC:
Federation has fully caught up again.
There have actually been some issues with Reddthat as well, like not seeing users etc.
do you have some more details about that?
I'm sorry, I didn't see this. I don't have much info for you, but basically I've noticed that on lemmy.world posts, if I post with my alt, I can't see those posts, even though with my alt I can see my lemmy.world account and all the comments it makes.
I have tested this with a few accounts and it seems to be consistent
could you share some specific examples?
feel free to pm them if you don't want to post them in public.
It's like it wasn't even posted, as shown here. I first noticed it around July or so.
i still didn't understand what you were referring to, but now that i looked at this comment thread on reddthat.com I can see that the other account that commented here is banned from lemmy.world: https://lemmy.world/modlog?userId=1250220
the justification for the ban is just "spam", which unfortunately doesn't provide much context, and I don't see anything immediately obvious that'd justify it. especially considering that it's been a year since the ban, it was likely not necessary to issue a permanent ban for that. i've unbanned your reddthat account now.
Oh that makes so much more sense, I can't imagine what I might have done to make that happen but that sure solves one mystery for me, thank you!
I too would love to know what you're experiencing (so I can fix it!)