this post was submitted on 08 Aug 2023
0 points (50.0% liked)

Lemmy Project Priorities Observations

5 readers
1 users here now

I've raised my voice loudly on meta communities, github, and created new [email protected] and [email protected] communities.

I feel like the performance problems are being ignored for over 30 days when there are a half-dozen solutions that could be coded in 5 to 10 hours of labor by one person.

I've been developing client/server messaging apps professionally since 1984, and I firmly believe that Lemmy is currently suffering from a lack of testing by the developers and lack of concern for data loss. A basic e-mail MTA in 1993 would send a "did not deliver" message back to message sender, but Lemmy just drops delivery and there is no mention of this in the release notes//introduction on GitHub. I also find that the Lemmy developers do not like to "eat their own dog food" and actually use Lemmy's communities to discuss the ongoing development and priorities of Lemmy coding. They are not testing the code and sampling the data very much, and I am posting here, using Lemmy code, as part of my personal testing! I spent over 100 hours in June 2023 testing Lemmy technical problems, especially with performance and lost data delivery.

I'll toss it into this echo chamber.

founded 1 year ago
MODERATORS
 

Lemmy.world shared a slow query, in detail! Yey!

SELECT post . id, post . name, post . url, post . body, post . creator_id, post . community_id, post . removed, post . locked, post . published, post . updated, post . deleted, post . nsfw, post . embed_title, post . embed_description, post . thumbnail_url, post . ap_id, post . local, post . embed_video_url, post . language_id, post . featured_community, post . featured_local, person . id, person . name, person . display_name, person . avatar, person . banned, person . published, person . updated, person . actor_id, person . bio, person . local, person . private_key, person . public_key, person . last_refreshed_at, person . banner, person . deleted, person . inbox_url, person . shared_inbox_url, person . matrix_user_id, person . admin, person . bot_account, person . ban_expires, person . instance_id, community . id, community . name, community . title, community . description, community . removed, community . published, community . updated, community . deleted, community . nsfw, community . actor_id, community . local, community . private_key, community . public_key, community . last_refreshed_at, community . icon, community . banner, community . followers_url, community . inbox_url, community . shared_inbox_url, community . hidden, community . posting_restricted_to_mods, community . instance_id, community . moderators_url, community . featured_url, community_person_ban . id, community_person_ban . community_id, community_person_ban . person_id, community_person_ban . published, community_person_ban . expires, post_aggregates . id, post_aggregates . post_id, post_aggregates . comments, post_aggregates . score, post_aggregates . upvotes, post_aggregates . downvotes, post_aggregates . published, post_aggregates . newest_comment_time_necro, post_aggregates . newest_comment_time, post_aggregates . featured_community, post_aggregates . featured_local, post_aggregates . hot_rank, post_aggregates . hot_rank_active, post_aggregates . community_id, post_aggregates . creator_id, community_follower . id, community_follower . community_id, community_follower . person_id, community_follower . published, community_follower . pending, post_saved . id, post_saved . post_id, post_saved . person_id, post_saved . published, post_read . id, post_read . post_id, post_read . person_id, post_read . published, person_block . id, person_block . person_id, person_block . target_id, person_block . published, post_like . score, coalesce ( ( post_aggregates . comments - person_post_aggregates . read_comments ), post_aggregates . comments ) FROM ( ( ( ( ( ( ( ( ( ( ( ( post_aggregates INNER JOIN person ON ( post_aggregates . creator_id = ? . id ) ) INNER JOIN post ON ( post_aggregates . post_id = ? . id ) ) INNER JOIN community ON ( post_aggregates . community_id = ? . id ) ) LEFT OUTER JOIN community_person_ban ON ( ( post_aggregates . community_id = ? . community_id ) AND ( community_person_ban . person_id = ? . creator_id ) ) ) LEFT OUTER JOIN community_follower ON ( ( post_aggregates . community_id = ? . community_id ) AND ( community_follower . person_id = ? ) ) ) LEFT OUTER JOIN post_saved ON ( ( post_aggregates . post_id = ? . post_id ) AND ( post_saved . person_id = ? ) ) ) LEFT OUTER JOIN post_read ON ( ( post_aggregates . post_id = ? . post_id ) AND ( post_read . person_id = ? ) ) ) LEFT OUTER JOIN person_block ON ( ( post_aggregates . creator_id = ? . target_id ) AND ( person_block . person_id = ? ) ) ) LEFT OUTER JOIN community_block ON ( ( post_aggregates . community_id = ? . community_id ) AND ( community_block . person_id = ? ) ) ) LEFT OUTER JOIN post_like ON ( ( post_aggregates . post_id = ? . post_id ) AND ( post_like . person_id = ? ) ) ) LEFT OUTER JOIN person_post_aggregates ON ( ( post_aggregates . post_id = ? . post_id ) AND ( person_post_aggregates . person_id = ? ) ) ) LEFT OUTER JOIN local_user_language ON ( ( post . language_id = ? . language_id ) AND ( local_user_language . local_user_id = ? ) ) ) WHERE ( ( ( ( ( ( ( ( community . removed = ? ) AND ( community . deleted = ? ) ) AND ( post . removed = ? ) ) AND ( post . deleted = ? ) ) AND ( community . local = ? ) ) AND ( ( community . hidden = ? ) OR ( community_follower . person_id = ? ) ) ) AND ( post . nsfw = ? ) ) AND ( community . nsfw = ? ) ) ORDER BY post_aggregates . featured_local DESC, post_aggregates . hot_rank DESC, post_aggregates . published DESC LIMIT ? OFFSET ?

top 11 comments
sorted by: hot top controversial new old
[–] [email protected] 1 points 1 year ago
[–] [email protected] 1 points 1 year ago
SELECT post . id, post . name, post . url, post . body, post . creator_id, post . community_id, post . removed, post . locked, post . published, post . updated, post . deleted, post . nsfw, post . embed_title, post . embed_description, post . thumbnail_url, post . ap_id, post . local, post . embed_video_url, post . language_id, post . featured_community, post . featured_local, person . id, person . name, person . display_name, person . avatar, person . banned, person . published, person . updated, person . actor_id, person . bio, person . local, person . private_key, person . public_key, person . last_refreshed_at, person . banner, person . deleted, person . inbox_url, person . shared_inbox_url, person . matrix_user_id, person . admin, person . bot_account, person . ban_expires, person . instance_id, community . id, community . name, community . title, community . description, community . removed, community . published, community . updated, community . deleted, community . nsfw, community . actor_id, community . local, community . private_key, community . public_key, community . last_refreshed_at, community . icon, community . banner, community . followers_url, community . inbox_url, community . shared_inbox_url, community . hidden, community . posting_restricted_to_mods, community . instance_id, community . moderators_url, community . featured_url, community_person_ban . id, community_person_ban . community_id, community_person_ban . person_id, community_person_ban . published, community_person_ban . expires, post_aggregates . id, post_aggregates . post_id, post_aggregates . comments, post_aggregates . score, post_aggregates . upvotes, post_aggregates . downvotes, post_aggregates . published, post_aggregates . newest_comment_time_necro, post_aggregates . newest_comment_time, post_aggregates . featured_community, post_aggregates . featured_local, post_aggregates . hot_rank, post_aggregates . hot_rank_active, post_aggregates . community_id, post_aggregates . creator_id, community_follower . id, community_follower . community_id, community_follower . person_id, community_follower . published, community_follower . pending, post_saved . id, post_saved . post_id, post_saved . person_id, post_saved . published, post_read . id, post_read . post_id, post_read . person_id, post_read . published, person_block . id, person_block . person_id, person_block . target_id, person_block . published, post_like . score, coalesce ( ( post_aggregates . comments - person_post_aggregates . read_comments ), post_aggregates . comments ) 
FROM ( ( ( ( ( ( ( ( ( ( ( ( post_aggregates 
    INNER JOIN person ON ( post_aggregates . creator_id = ? . id ) ) 
    INNER JOIN post ON ( post_aggregates . post_id = ? . id ) ) 
    INNER JOIN community ON ( post_aggregates . community_id = ? . id ) ) 
    LEFT OUTER JOIN community_person_ban ON ( ( post_aggregates . community_id = ? . community_id ) AND ( community_person_ban . person_id = ? . creator_id ) ) ) 
    LEFT OUTER JOIN community_follower ON ( ( post_aggregates . community_id = ? . community_id ) AND ( community_follower . person_id = ? ) ) ) 
    LEFT OUTER JOIN post_saved ON ( ( post_aggregates . post_id = ? . post_id ) AND ( post_saved . person_id = ? ) ) ) 
    LEFT OUTER JOIN post_read ON ( ( post_aggregates . post_id = ? . post_id ) AND ( post_read . person_id = ? ) ) ) 
    LEFT OUTER JOIN person_block ON ( ( post_aggregates . creator_id = ? . target_id ) AND ( person_block . person_id = ? ) ) ) 
    LEFT OUTER JOIN community_block ON ( ( post_aggregates . community_id = ? . community_id ) AND ( community_block . person_id = ? ) ) ) 
    LEFT OUTER JOIN post_like ON ( ( post_aggregates . post_id = ? . post_id ) AND ( post_like . person_id = ? ) ) ) 
    LEFT OUTER JOIN person_post_aggregates ON ( ( post_aggregates . post_id = ? . post_id ) AND ( person_post_aggregates . person_id = ? ) ) ) 
    LEFT OUTER JOIN local_user_language ON ( ( post . language_id = ? . language_id ) AND ( local_user_language . local_user_id = ? ) ) ) 
WHERE ( ( ( ( ( ( ( ( community . removed = ? ) AND ( community . deleted = ? ) ) AND ( post . removed = ? ) ) AND ( post . deleted = ? ) ) AND ( community . local = ? ) ) AND ( ( community . hidden = ? ) OR ( community_follower . person_id = ? ) ) ) AND ( post . nsfw = ? ) ) AND ( community . nsfw = ? ) ) 
ORDER BY post_aggregates . featured_local DESC, post_aggregates . hot_rank DESC, post_aggregates . published DESC 
LIMIT ? OFFSET ?
[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

Oblivious unpleasant answer to scaling

I'm surprised lemmy.world hasn't brought up one of the issues...

Maybe they just keep overlooking. I don't think it is the number of users that is as high as the amount of data.

Archiving off PostgreSQL older data could help. Lemmy was designed with live-websockets and instant notification of every new comment... which worked fine with very little data in PostgreSQL.

Lemmy.world database keeps growing. A quick-fix is to devise a way to move votes/posts/comments out of the tables say 45 days older. Even archive out person records once those posts associated with that person are moved out.

The "dirty little secret" is that smaller instances people started 30 days ago run faster because THEY HAVE NO DATA in PostgreSQL!

I'd rather we get specific AUTO_EXPLAIN details out of Lemmy.world, but it is maybe EASIER than it is getting people to fix Rust SQL ORM statements.

[–] [email protected] 1 points 1 year ago

[email protected] discovered today, but no posts...

[–] [email protected] 1 points 1 year ago (1 children)

I don't mention it as often in public, but lemmy.world is unable to keep up with database size. Errors are common.

[–] [email protected] 1 points 1 year ago

Posting users have made about less content on Lemmy platform due to lemmy.world overloads: https://lemmy.world/post/2856311

[–] [email protected] 1 points 1 year ago

https://lemmy.ml/post/3024393

People seem to have been given the impression that a full Lemmy instance is kind of like a "super client" to read content from.

What is really needed is an API caching layer... off-loading from lemmy.world and other overloaded servers would be best if API clients didn't download the same content over and over.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

Also, a person can block a community and also subscribe to a community, and the query looks at these independent. ToDo: add testing code to confirm API behavior when a user blocks a community they are subscribed to and vise-versa.

A denormalized table for person community_id to match multiple states could be of benefit. Especially in edge cases where a particular login account has a lot of blocks or subscribes to filter out posts and comments from.

[–] [email protected] 1 points 1 year ago

API returning deleted comments

For anonymous users, has lemmy_server main branch diverged in behavior from 0.18 branch? Need to review testing scripts and run against both.

[–] [email protected] 1 points 1 year ago (1 children)

I don't like the JOIN approach lemmy uses. They may as well denormalize it all into one table with 200 fields for a post

Back when I did SQL with DB2 and PostgreSQL and hardware was far more limited... I always thought defensively.

If you have 11,000 posts in a community and you are requesting New/Hot/Active - do an IN clause that whacks them off at the pass.

especially since most people are focused on hitting page 1 and maybe page 2 and 3. You gotta slice that data down. The problem exists on New, so there is no aging out with Hot/Active.

Open-ended queries with LIMIT and JOIN can tangle up. Especially since post-aggregates id isn't post_id?

[–] [email protected] 1 points 1 year ago

What exactly is the relationship of post_aggregates and post?

aggregates id: 503771 to post id 2133833. Both tables have 502204 records in them, equal. But why is post ID incrementing so far ahead? Is federation pounding on the index with failed INSERT statements?