this post was submitted on 07 Aug 2023
1 points (100.0% liked)

Lemmy Project Priorities Observations

5 readers
1 users here now

I've raised my voice loudly on meta communities, github, and created new [email protected] and [email protected] communities.

I feel like the performance problems are being ignored for over 30 days when there are a half-dozen solutions that could be coded in 5 to 10 hours of labor by one person.

I've been developing client/server messaging apps professionally since 1984, and I firmly believe that Lemmy is currently suffering from a lack of testing by the developers and lack of concern for data loss. A basic e-mail MTA in 1993 would send a "did not deliver" message back to message sender, but Lemmy just drops delivery and there is no mention of this in the release notes//introduction on GitHub. I also find that the Lemmy developers do not like to "eat their own dog food" and actually use Lemmy's communities to discuss the ongoing development and priorities of Lemmy coding. They are not testing the code and sampling the data very much, and I am posting here, using Lemmy code, as part of my personal testing! I spent over 100 hours in June 2023 testing Lemmy technical problems, especially with performance and lost data delivery.

I'll toss it into this echo chamber.

founded 1 year ago
MODERATORS
 

did a git pull request for new index on post searches by community

top 10 comments
sorted by: hot top controversial new old
[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

I'm still pondering the least-intrusive way to deal with PostgreSQL fragmenting the comment table so badly the way Lemmy does things.

  1. Rust code does INSERT, waits for the primary key ID
  2. Rust code does UPDATE, puts the ID on path
  3. Rust code does UPDATE, puts the ID on ap_id
  4. At some point her TRIGGERS are running to comment_aggregates on the comment row, plus counting the new comment in site/community/person totals.

Then there is counting, which there is a request in [email protected] this past weekend to have a new count of direct-children, on top of the tree counting of children.

Right now when a reply comment is made, each parent in in the branch is updated by count. This UPDATE is one of the slower parts of Lemmy's PostfreSQL transactions. But it also is generating post_aggregate dead rows due to the children count update.

[–] [email protected] 1 points 1 year ago (4 children)

update comment_aggregates ca set child_count = c.child_count from ( select c.id, c.path, count(c2.id) as child_count from comment c join comment c2 on c2.path <@ c.path and c2.path != c.path and c.path <@ $1 group by c.id ) as c where ca.comment_id = c.id

ok, can I rework this to be a simple + 1?

[–] [email protected] 1 points 1 year ago

duration: 21.028 ms
"Query Text": "\nupdate comment_aggregates ca set child_count = c.child_count\nfrom (\n select c.id, c.path, count(c2.id) as child_count from comment c\n join comment c2 on c2.path <@ c.path and c2.path != c.path\n and c.path <@ '0.1600900'\n group by c.id\n) as c\nwhere ca.comment_id = c.id",

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

duration: 23.487 ms
"Query Text": "\nupdate comment_aggregates ca set child_count = c.child_count\nfrom (\n select c.id, c.path, count(c2.id) as child_count from comment c\n join comment c2 on c2.path <@ c.path and c2.path != c.path\n and c.path <@ '0.1571057'\n group by c.id\n) as c\nwhere ca.comment_id = c.id",

[–] [email protected] 1 points 1 year ago

duration: 19.590 ms
"Query Text": "\nupdate comment_aggregates ca set child_count = c.child_count\nfrom (\n select c.id, c.path, count(c2.id) as child_count from comment c\n join comment c2 on c2.path <@ c.path and c2.path != c.path\n and c.path <@ '0.1337228'\n group by c.id\n) as c\nwhere ca.comment_id = c.id"

[–] [email protected] 1 points 1 year ago

duration: 104.552 ms
"Query Text": "\nupdate comment_aggregates ca set child_count = c.child_count\nfrom (\n select c.id, c.path, count(c2.id) as child_count from comment c\n join comment c2 on c2.path <@ c.path and c2.path != c.path\n and c.path <@ '0.1624758'\n group by c.id\n) as c\nwhere ca.comment_id = c.id"

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

Why is this running more than once?

2023-08-07T18:32:24.800036Z WARN SQLwatch: comment_aggregates set child_count comment_id 1630788 parent_id 1495576 top_parent 0.1495576 parent_path Ltree("0.1495576.1533116.1550665.1551301.1552433.1552634.1553721.1556462.1558492.1565954.1581787.1606263.1630787")

2023-08-07T18:33:10.299908Z WARN SQLwatch: comment_aggregates set child_count comment_id 1630788 parent_id 1495576 top_parent 0.1495576 parent_path Ltree("0.1495576.1533116.1550665.1551301.1552433.1552634.1553721.1556462.1558492.1565954.1581787.1606263.1630787")

2023-08-07T18:34:42.692087Z WARN SQLwatch: comment_aggregates set child_count comment_id 1630788 parent_id 1495576 top_parent 0.1495576 parent_path Ltree("0.1495576.1533116.1550665.1551301.1552433.1552634.1553721.1556462.1558492.1565954.1581787.1606263.1630787")

comment 1630788 was edited, is it re-running count update on an edit?

[–] [email protected] 1 points 1 year ago

slacking on updating the comment count on nested replies that concurrency self-awareness could kick in and skip.

the 5000 comment "big social news", typically some major political, disaster, social media, space, whatever event.

[–] [email protected] 1 points 1 year ago

another topic

Studying the index evolution regarding sorting of posts....

Featured could be accomplished by having an artificial posting time put on featured post. post_aggregate. Say set 1 year into the future, making them appear before other posts. 1 year + the original publish date or whatever featured sequence is desired.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

another topic

"DeArrow (community project to replace all clickbait titles and thumbnails from Youtube with appropriate ones, it’s probably better than I’m making it sound)" (search Lemmy comments)

I've brought this up for decades on Reddit... crowd-sourcing titles for news articles and even having a list of desired editors to provide their overlaid commentary (3 lines or less?). Saving clicks through better information.

With lemmy, a self-hidden comment (sort of line unux . filenames/foldernames) - a markdown convention in the content that tells API clients to normally post at bottom and/or hide.... of alternate headlines and taglines for a posting.