I made this based on the gripe about some of the silent failures with federation. Might help users choose other servers. Might help admins troubleshoot. Open to comments and criticisms!

[–] [email protected] 21 points 2 years ago (1 children)

Oooohhh ... Nice!! I'm repeatedly impressed at how many hackers are going ahead and just getting some stuff done here!!

Questions/thoughts:

What instance is used as a reference for the delay? One you self-host (lemmy.management)?
Sooo ... what's the deal with lemmy.ml ... that seems to have gone beyond lag and is basically falling over ... seems like the devs have neglected their own instance's health?
What's that Redash? Is it a plotly thing or some other product that just uses their graphing library? How have you found it?

[–] [email protected] 10 points 2 years ago

What instance is used as a reference for the delay? One you self-host (lemmy.management)?

Yes, lemmy.management. It purposefully subscribes to as many communities as possible (via automation). This doesn't correct for network lag, but the idea was to capture the "federation" lag. I'm not aware of any code that lets admins prioritize outbound federation traffic. I could be wrong though.
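
A rough sketch of one way such a lag could be measured. This is an assumption for illustration, not necessarily how lemmy.management computes it: take the gap between an activity's published timestamp and the moment the receiving instance sees it. The function name `federation_lag_seconds` and the sample timestamps are hypothetical.

```python
from datetime import datetime, timezone


def federation_lag_seconds(published_iso: str, received_at: datetime) -> float:
    """Seconds between an activity's published time and its arrival here."""
    # Timestamps are assumed to be ISO 8601; a trailing "Z" means UTC.
    published = datetime.fromisoformat(published_iso.replace("Z", "+00:00"))
    if published.tzinfo is None:
        # Assumption: naive timestamps are treated as UTC.
        published = published.replace(tzinfo=timezone.utc)
    return (received_at - published).total_seconds()


# Hypothetical example: published at 12:00:00, received 45 seconds later.
received = datetime(2023, 7, 1, 12, 0, 45, tzinfo=timezone.utc)
lag = federation_lag_seconds("2023-07-01T12:00:00Z", received)
print(lag)  # 45.0
```

Measured this way, the number also includes network lag and any clock skew between the two servers, so it is an upper bound on the federation delay rather than a clean measurement of it.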

Sooo … what’s the deal with lemmy.ml … that seems to have gone beyond lag and is basically falling over … seems like the devs have neglected their own instance’s health?

I just collect the data.

What’s that Redash? Is it a plotly thing or some other product that just uses their graphing library? How have you found it?

https://redash.io. I don't remember how I found it. Probably an "awesome" list on GitHub.

[–] UncleStewart 10 points 2 years ago

On mobile, when touching the "Federation Lag-o-meter (now - 1h)" statistics, the page is hard to scroll. Other than this the page is gold

[–] [email protected] 9 points 2 years ago (1 children)

Nice work! Maybe add feddit.de?

[–] [email protected] 7 points 2 years ago* (last edited 2 years ago) (1 children)

Fixed! The regex was not getting content from < 0.18.0 instances. Thanks!

EDIT: I am wrong, it was something else in feddit.de's messages I THOUGHT was a version thing, but must be a localization thing. A string in the JSON was breaking some regex. Regardless.. fixed.
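
For anyone scraping similar data, here is a minimal sketch of why a JSON parser tends to be safer than a regex for this. The payload and its `name` and `version` fields are made up for illustration; they are not feddit.de's actual message format, and this is not the dashboard's actual code.

```python
import json
import re

# Hypothetical payload: a localized string containing escaped quotes.
raw = '{"name": "Feddit \\"Deutschland\\"", "version": "0.17.4"}'

# A naive regex stops at the first quote, even though it is escaped.
naive = re.search(r'"name": "([^"]*)"', raw)
print(naive.group(1))  # Feddit \

# A JSON parser handles the escaping for us.
parsed = json.loads(raw)
print(parsed["name"])     # Feddit "Deutschland"
print(parsed["version"])  # 0.17.4
```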

[–] [email protected] 5 points 2 years ago

Awesome, thank you :-)

[–] [email protected] 9 points 2 years ago

The graph should remove the outlier, as it is skewing the results for every other instance and keeping the smaller numbers from showing up.

Or we should move to log scale so that it can be displayed correctly.
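
To illustrate the log-scale suggestion with made-up numbers (the instance names and lag values below are hypothetical, not taken from the dashboard, and `bar_fraction` is just a helper for this sketch):

```python
import math

# Hypothetical lag values in seconds; one outlier dwarfs the rest.
lags = {"instance-a": 2.0, "instance-b": 15.0, "instance-c": 40.0, "outlier": 36000.0}


def bar_fraction(value: float, max_value: float, log_scale: bool) -> float:
    """Bar height as a fraction of the tallest bar."""
    if log_scale:
        # log10(1 + x) keeps a zero-lag instance at height 0.
        return math.log10(1 + value) / math.log10(1 + max_value)
    return value / max_value


top = max(lags.values())
for name, lag in lags.items():
    linear = bar_fraction(lag, top, log_scale=False)
    logged = bar_fraction(lag, top, log_scale=True)
    print(f"{name:<12} linear={linear:.4f} log={logged:.4f}")
```

On the linear scale the small instances are essentially flat next to the outlier; on the log scale they stay visible while the outlier is still clearly the tallest bar.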

[–] [email protected] 9 points 2 years ago (2 children)

When I saw the bar looking like the Burj Khalifa, I assumed it was .world instead of .ml. Interesting.

Props to [email protected] for dealing admirably with the Rexxit hug of death.

[–] [email protected] 10 points 2 years ago* (last edited 2 years ago)

I’m expecting that JSON parsing is a huge overhead with the fediverse. I work on a SaaS that needs to do all its internal processing in under 10 ms, and serializing/deserializing ends up being a sizable chunk of server time. I saw a 40% reduction in runtime using simdjson for deserializing, and there is a Rust crate for it, but I haven’t had time to look over the Lemmy code.

Can anyone with an overloaded instance get on their command line and gather a decent flamegraph so the performance folks can aim optimizations in the right direction?

https://github.com/brendangregg/FlameGraph

[–] [email protected] 2 points 2 years ago (1 children)

Beehaw is currently doing the Burj

[–] [email protected] 1 points 2 years ago

Yep, it seems completely different to when I last looked.

It seems everyone gets a turn at the top.

[–] tenth 9 points 2 years ago (1 children)

Great idea. I was trying to figure out if it was lemmy.world trying to deal with new users or a bug with the Memmy app that caused random errors.

Is it possible to have the lag metrics by instance in a table format? It's so hard to view your site on mobile.

[–] [email protected] 5 points 2 years ago (1 children)

I didn't even load it on mobile. I will check it out tonight and maybe just create a separate "mobile friendly" dashboard.

[–] Wailzy 8 points 2 years ago

Not the person you’re replying to, but I didn’t find it awful on mobile. The zoom by dragging worked well, as did the double tap to view the whole dataset.

For a quick browse I wasn’t frustrated at all and found the information I wanted in a short amount of time!

[–] [email protected] 7 points 2 years ago

It'll be interesting to see how this changes through the day! I know .world tends to slow down later in the day when the US contingent is getting going.

(also, yay lemm.ee)

[–] [email protected] 7 points 2 years ago (1 children)

This is awesome! Hopefully it'll help spread the load among instances. Definitely going to use this to see which instance to move to (and which to avoid)

[–] [email protected] 8 points 2 years ago

Keep in mind this is a one hour snapshot. I am working on a historical rating as well to give a better indication of overall long term stability.

[–] [email protected] 6 points 2 years ago (1 children)

This looks great. Is there any chance that this could be extended to include Kbin as well, since those instances federate with Lemmy, too?

[–] [email protected] 7 points 2 years ago (1 children)

I am actually working on that! Stay tuned. Like days though, don't get too excited. :)

[–] [email protected] 5 points 2 years ago* (last edited 2 years ago) (1 children)

Aye aye. I'm mildly excited.

[–] [email protected] 1 points 2 years ago

Kbin posts DO show up in the details table. You would need to know the IP they are coming from. They don't include their instance host name in the header, which is why it's not in the table and instance is null for some IPs. Also, I don't scrape and subscribe to kbin magazines like I do for Lemmy ATM, so the traffic will be low. Probably just a few from kbin.social.

[–] [email protected] 5 points 2 years ago

This looks really good.

As an admin of a small kbin instance, I'll be keeping an eye on updates from you as this will be very handy!

[–] [email protected] 2 points 2 years ago (1 children)

This is really cool! Would it be possible to grab this data as JSON, CSV, or some other equivalent format? I'm working on making my own Lemmy client, and this would be very helpful to be able to display, I think.

[–] [email protected] 2 points 2 years ago

Should already be able to:

https://redash.io/help/user-guide/integrations-and-api/api

For example: https://aftershock.lemmy.management/api/queries/4/results

The API key for public users is the same as the dashboard slug: oT7pdcoeHWccpvZCNmTpJKoGZND8ZdRO3wDWpMug
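
A minimal sketch of pulling that query from Python. It assumes Redash's `api_key` query parameter and the usual `query_result` / `data` / `rows` response layout; the helper names and the sample payload are made up, and this has not been verified against the live instance.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://aftershock.lemmy.management"
API_KEY = "oT7pdcoeHWccpvZCNmTpJKoGZND8ZdRO3wDWpMug"  # public key quoted in this comment


def results_url(query_id: int, api_key: str = API_KEY) -> str:
    """Build the results URL for a Redash query, with the key as a parameter."""
    params = urllib.parse.urlencode({"api_key": api_key})
    return f"{BASE_URL}/api/queries/{query_id}/results?{params}"


def extract_rows(payload: dict) -> list:
    """Pull the row dicts out of a result payload (assumed layout)."""
    return payload.get("query_result", {}).get("data", {}).get("rows", [])


def fetch_rows(query_id: int) -> list:
    """Download and decode the results for one query (makes a network call)."""
    with urllib.request.urlopen(results_url(query_id), timeout=10) as resp:
        return extract_rows(json.load(resp))


# Offline demo with a made-up payload in the assumed layout.
sample = {"query_result": {"data": {"columns": [], "rows": [{"instance": "example.org", "lag": 1.5}]}}}
print(results_url(4))
print(extract_rows(sample))
```

Calling `fetch_rows(4)` performs the actual request; the demo lines above only exercise the URL building and the row extraction, so they run without network access.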

[–] [email protected] 2 points 2 years ago

Awesome work! Added my instance!
