this post was submitted on 19 Aug 2023

Lemmy Project Priorities Observations


I've raised my voice loudly on meta communities and GitHub, and created the new [email protected] and [email protected] communities.

I feel like the performance problems have been ignored for over 30 days, when there are a half-dozen solutions that one person could code in 5 to 10 hours of labor.

I've been developing client/server messaging apps professionally since 1984, and I firmly believe that Lemmy is currently suffering from a lack of testing by the developers and a lack of concern for data loss. A basic e-mail MTA in 1993 would send a "did not deliver" message back to the sender, but Lemmy just drops the delivery, and there is no mention of this in the release notes or introduction on GitHub. I also find that the Lemmy developers do not like to "eat their own dog food" and actually use Lemmy communities to discuss the ongoing development and priorities of Lemmy coding. They are not testing the code and sampling the data very much, and I am posting here, using Lemmy code, as part of my personal testing! I spent over 100 hours in June 2023 testing Lemmy's technical problems, especially performance and lost message delivery.

I'll toss it into this echo chamber.


Weeks ago I had my own moment of running up against the attitude of keeping all of this secret.

Someone can casually mention that join_collapse_limit was tried behind the scenes a month ago - so why are there zero posts or comments in the entire Lemmy search for join_collapse_limit? I searched the entire GitHub project: no mention of join_collapse_limit. Yet they were ready on the spot to reveal that private communications had tried join_collapse_limit long ago.

You know what join_collapse_limit is telling you? Too many JOINs is a performance problem! The entire ORM Rust code and its reliance on ever more JOINs is going to lead to other unpredictable performance problems that vary between 10,000 posts and 2 million posts! And that's exactly the history of 2023... watching the code's performance swing wildly based on the size of the communities being queried, etc.
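To make that concrete: join_collapse_limit is an ordinary PostgreSQL setting anyone can experiment with in psql. The query below is a made-up stand-in (illustrative table names, not Lemmy's generated SQL), just to show how the setting changes what the planner is allowed to do:

```sql
SHOW join_collapse_limit;   -- default is 8: past that many items in the FROM
                            -- list, the planner stops reordering joins and
                            -- takes them in the order they were written

-- Freeze the join order exactly as written and look at the plan:
SET join_collapse_limit = 1;
EXPLAIN ANALYZE
SELECT p.id, p.title
FROM   post p
JOIN   person    u ON u.id = p.creator_id
JOIN   community c ON c.id = p.community_id
WHERE  c.name = 'example_community';

-- Let the planner reorder freely and compare the two plans:
SET join_collapse_limit = 16;
EXPLAIN ANALYZE
SELECT p.id, p.title
FROM   post p
JOIN   person    u ON u.id = p.creator_id
JOIN   community c ON c.id = p.community_id
WHERE  c.name = 'example_community';
```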

What I see is that pull requests for ideas get created only after noise is made on a subject. There is a lack of openness to making mistakes in public.

For me, **the server crashes are what annoy me**, not human beings working on solutions. But for most of the people on the project, what seems to matter is having proper tabs vs. spaces in the source code, and even adding layers of SQL formatting tools in the middle of what can clearly be described as an SQL performance crisis.

Things keep getting broken: the HTML sanitization took a few hours to add to the project, but it has now meant weeks of broken titles and HTML code blocks, and even URL parameters are now broken on everyday links. The changes to delete behavior have orphaned comments, and that has gone on for weeks now.

top 4 comments
[–] [email protected] 1 points 1 year ago

The reason join_collapse_limit needs OPEN DISCUSSION is that it highlights the core of the problem: too many JOINs in the primary logic of listing posts. The "too many fields" issue was kind of obvious - the size of the SELECT statement is huge! It's machine generated.

And I can't even REMOVE joins that aren't needed for anonymous readers. The Rust objects are so tightly bound that "saved posts" - which cannot exist for an anonymous user - can't be decoupled.
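A rough sketch of what I mean, using made-up, heavily simplified table names in place of the huge machine-generated statement:

```sql
-- Hypothetical stand-in for the generated listing query.
-- Logged-in path: personalization rides along as extra LEFT JOINs.
SELECT p.id, p.title,
       ps.post_id IS NOT NULL AS saved,      -- "did I save this post?"
       pl.score               AS my_vote     -- "did I already vote on it?"
FROM   post p
JOIN   community c      ON c.id = p.community_id
LEFT JOIN post_saved ps ON ps.post_id = p.id AND ps.person_id = 123
LEFT JOIN post_like  pl ON pl.post_id = p.id AND pl.person_id = 123
ORDER  BY p.published DESC
LIMIT  20;

-- Anonymous path: "saved" and "my_vote" can never exist, so the same
-- listing could be served with the per-person joins dropped entirely.
SELECT p.id, p.title
FROM   post p
JOIN   community c ON c.id = p.community_id
ORDER  BY p.published DESC
LIMIT  20;
```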

The servers crashing aren't treated like an actual problem... like a siren going off saying the code design is faulty. The mere existence of join_collapse_limit as a topic being ignored shows the lack of design concern. Now instance blocking is being added - another new layer of work for this query.

[–] [email protected] 1 points 1 year ago (1 children)

Back to Basics

All this INSERT overhead, real-time counting, real-time votes. But it is only chewing up dead tuples with constant rewrites of PostgreSQL rows to +1 every single thing on the site, just to give non-cached results.

And it isn't benefiting the SELECT side of reading that data; it's burdening it.
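For anyone who wants to watch this happen, here's a rough sketch (illustrative table names, not Lemmy's real write path) of the pattern, plus the standard PostgreSQL statistics view that shows the churn it leaves behind:

```sql
-- Every vote is an INSERT plus a +1 rewrite of an aggregate row; in
-- PostgreSQL an UPDATE leaves the old row version behind as a dead tuple
-- until autovacuum gets around to reclaiming it.
INSERT INTO post_like (post_id, person_id, score) VALUES (42, 123, 1);
UPDATE post_aggregates SET score = score + 1 WHERE post_id = 42;

-- pg_stat_user_tables is a built-in view: watch which tables are
-- accumulating dead tuples and being rewritten the most.
SELECT relname, n_live_tup, n_dead_tup, n_tup_upd, n_tup_hot_upd
FROM   pg_stat_user_tables
ORDER  BY n_dead_tup DESC
LIMIT  10;
```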

The subscription table is likely merged for federated and local users. But when it comes time to list posts, having to sort through remote users' data in the same table is overhead for every post listing. Same goes for votes - and yes, every SELECT looks at granular votes, because it wants to show the UI which items were already voted on. But it's a huge amount of data in that table to filter out: all the votes on outdated posts, votes from users not even on this server, etc.
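A quick way to put numbers on that claim - a hedged sketch with illustrative table and column names (it assumes a boolean `local` flag on the person table), counting how much of the vote table a front-page listing never actually needs:

```sql
-- How much of the vote table is dead weight for a front-page listing?
SELECT
  count(*)                                                 AS total_votes,
  count(*) FILTER (WHERE p.published < now() - interval '14 days')
                                                           AS votes_on_outdated_posts,
  count(*) FILTER (WHERE u.local = false)                  AS votes_by_remote_users
FROM post_like pl
JOIN post   p ON p.id = pl.post_id
JOIN person u ON u.id = pl.person_id;
```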

And there are no limits... you could block every person and make the database labor away filtering out all the people you blocked. You can block a community. The testing code to reproduce these edge cases alone is a lot of work that isn't being done... and it creates a sitting time bomb where some user who hits 'save' on every post, or 'block' on every user, throws queries into wild behavior.

I think some sanity has to be considered, like "2 weeks worth of posts" being how the data is organized... and then at least when someone goes wild with saving posts or blocking users, there is a cut-off.
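Something like this is all I'm asking for - a sketch (illustrative names, not a patch) of what a hard "2 weeks worth of posts" window looks like when it's applied to both the listing and the personalization tables:

```sql
-- Keep the hot path on publish time and put the window in every listing query.
CREATE INDEX IF NOT EXISTS post_published_idx ON post (published DESC);

SELECT p.id, p.title
FROM   post p
WHERE  p.published > now() - interval '14 days'   -- the cut-off
ORDER  BY p.published DESC
LIMIT  20;

-- The same window caps the personalization tables: a user who saved or
-- blocked tens of thousands of things only costs the rows inside the window.
SELECT ps.post_id
FROM   post_saved ps
JOIN   post p ON p.id = ps.post_id
WHERE  ps.person_id = 123
AND    p.published > now() - interval '14 days';
```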

I think the personalization of data should pretty much be an entire post-production layer of the app. The core engine should be focused on post and comment storage and retrieval. "Saved post" lists, blocking of instances, blocking of persons... let the post-production layer deal with that.

There will be major world news events where people want to get in and see the latest comments, and the code will be crashing left and right because of personal block lists that some handful of users built up to 80,000 people (on a single account) with some script file. Meanwhile, nobody has made a test script to see what happens at 80,000 people on a block list....
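That test is maybe 15 minutes of work. A hedged sketch of it (illustrative table names, a throwaway test account, and it assumes person rows with ids 1 through 80000 already exist on the test copy):

```sql
-- Give one throwaway account (person_id 999) a block list of 80,000 people...
INSERT INTO person_block (person_id, target_id)
SELECT 999, g
FROM   generate_series(1, 80000) AS g
ON CONFLICT DO NOTHING;

-- ...then time the post listing that has to filter all of them out.
EXPLAIN (ANALYZE, BUFFERS)
SELECT p.id, p.title
FROM   post p
WHERE  NOT EXISTS (SELECT 1
                   FROM   person_block pb
                   WHERE  pb.person_id = 999
                   AND    pb.target_id = p.creator_id)
ORDER  BY p.published DESC
LIMIT  20;
```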

[–] [email protected] 1 points 1 year ago

Ok... so where to begin?

  1. Language choices. I think it's a noble gesture, but it's hard to ignore the overhead factor and all the end users who accidentally hide their own posts and comments by getting confused by it.

  2. All sorts except "Most Comments", "Old", and "Controversial" come down to recent posts. Nobody is complaining about a 3-week-old post not appearing... with one exception: featured. I think I have some tricks to play with featured. Can some basic sanity be added to the project by putting a limit on time? 3 days? Are most people here to browse the most recent 3 days of content? 7 days? Can all data be divided and organized around this? With the exception being: a single community?

  3. Is there a limp mode? Does something short of Beehaw and Lemmy.world turning off their entire front page need to be built into the app? I think it needs to be done. In emergency / limp mode, you could cut off old data, or cut off personalization - a rough sketch of what that could mean at the database level follows below this list.
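Nothing like this exists in Lemmy today, so treat it as a hypothetical sketch: statement_timeout is a real PostgreSQL setting, and the degraded query is just one idea of what a "limp mode" front page could serve.

```sql
-- Emergency guard rail: abort any listing query that runs longer than 2s
-- instead of letting slow queries pile up and take the front page down.
SET statement_timeout = '2s';

-- Degraded front page: no per-user joins, no old data, just recent posts.
SELECT p.id, p.title, c.name AS community
FROM   post p
JOIN   community c ON c.id = p.community_id
WHERE  p.published > now() - interval '3 days'
ORDER  BY p.published DESC
LIMIT  20;
```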

I think the project has fundamentally misinformed the population into believing the servers are too busy because there are too many users. I just don't see that many users!! Everything I see is too many JOIN statements! Moving to new virgin servers works because they start with zero data. Lemmy.world has far more data than some empty instance that is 3 weeks old. And the project leaders have failed to understand or communicate this basic issue.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

PostgreSQL keeps failing

And I feel like the project keeps ignoring that basic fact. The servers crashing aren't a feature, they are a bug! Yes, now there are 1500 instances to brag about, but they are all pulling data from lemmy.world, and all the broken things in federation are smoldering issues.

join_collapse_limit is the PostgreSQL design team telling you: don't build apps with 15 JOINs on real-time, no-caching queries. And look what happens - it goes off into wild behavior depending on the amount of data that has built up on a given server. And new instances starting with zero data give the illusion that the problem is solved... but once data starts getting into that database, the overhead of all that JOIN logic and counting grows and grows.
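Anyone who wants to see that build-up on their own instance can ask PostgreSQL directly; these are standard statistics views and functions, not anything Lemmy-specific:

```sql
-- Which tables have actually grown, and by how much: the difference
-- between a fresh instance and one that has been federating for months.
SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
       n_live_tup
FROM   pg_stat_user_tables
ORDER  BY pg_total_relation_size(relid) DESC
LIMIT  10;
```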