this post was submitted on 01 Jul 2023
86 points (97.8% liked)

Asklemmy

I already get rate-limited like crazy on Lemmy, and there are only about 60,000 users on my instance. Is each instance really just one server, or are there multiple containers running across several hosts? I’m concerned that federation will mean an inconsistent user experience. Some instances may be beefy, others will be under-resourced, so the average person might think Lemmy overall is slow or error-prone.

Reddit has millions of users. How the hell is this going to scale? Does anyone have any information about Lemmy’s DB and architecture?

I found this post about Reddit’s DB from 2012. I'm not sure whether Lemmy takes a similar approach to ensure speed and reliability as the user base and traffic grow.

https://kevin.burke.dev/kevin/reddits-database-has-two-tables/

[–] [email protected] 11 points 2 years ago (1 children)

Well, I run an instance, too. It's not big at all, but I was thinking about the issue of scaling, too. You can only scale up a single server so much...

But on the other hand, Lemmy is still young. We'll find solutions to that problem.

Also, interesting article. I only took a glance at it, but having only two tables kind of suggests that Reddit is using a relational database. So, if they're not "normalizing" everything, why not use a completely different paradigm, like what MongoDB etc. offer?

[–] [email protected] 6 points 2 years ago* (last edited 2 years ago) (1 children)

The database isn't really the problem in the current state of things. The server is, because:

  • Until 0.18 there was no caching (for the UI), and the websockets were poorly implemented
  • The developers have admitted that they aren't proficient in SQL, in which case, why not use an ORM instead? Sure, ORMs aren't perfect, but they will do better than the average developer at scale.
  • There is no queue system for ActivityPub requests
  • Because there is no queue, user requests and federation have the same priority when they shouldn't, and one can bottleneck the other
  • Live inserts are used, meaning that regardless of the DB used, performance is going to be killed, since inserting data one row at a time several times a second is a major waste of resources

TL;DR: It's trying to do everything, and not doing it that well. So users suffer because they have to share resources with non-UI-related tasks.

The database suffers because it has to do an insert of one object 50 times in a second when it could do a single insert for all 50 items.
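To make the one-at-a-time versus batched difference concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. It is not Lemmy's code: the `vote` table, its columns, and the function names are all made up for illustration, and a real instance would be talking to its own database through its own schema.

```python
import sqlite3

INSERT_SQL = "INSERT INTO vote (user_id, post_id, score) VALUES (?, ?, ?)"

def make_db():
    # Hypothetical schema, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vote (user_id INTEGER, post_id INTEGER, score INTEGER)")
    return conn

def insert_one_at_a_time(conn, votes):
    # "Live" inserts: one statement and one commit per incoming vote.
    for vote in votes:
        conn.execute(INSERT_SQL, vote)
        conn.commit()

def insert_batched(conn, votes):
    # Batched insert: all rows go in inside a single transaction.
    with conn:
        conn.executemany(INSERT_SQL, votes)

votes = [(user_id, 1, 1) for user_id in range(50)]

live = make_db()
insert_one_at_a_time(live, votes)

batched = make_db()
insert_batched(batched, votes)

count_live = live.execute("SELECT COUNT(*) FROM vote").fetchone()[0]
count_batched = batched.execute("SELECT COUNT(*) FROM vote").fetchone()[0]
print(count_live, count_batched)
```

Both paths end up with the same 50 rows. The difference is that the first pays for 50 separate commits, while the second pays for one.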

Federation suffers because you can't offload it to a separate machine farm whose job is to receive and send ActivityPub requests, reading from and writing to the correct queues to do so.
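A rough sketch of what "user requests and federation should not have the same priority" could look like, again in Python and again purely hypothetical (the `WorkQueue` class, the priority constants, and the job strings are invented here, not taken from Lemmy). In a real deployment the queue would more likely be an external broker so that federation workers can run on separate machines.

```python
import heapq
import itertools

# Lower number is served first. These two levels are an assumption for the sketch.
USER = 0
FEDERATION = 1

class WorkQueue:
    def __init__(self):
        self._heap = []
        # Monotonic counter as a tie-breaker: keeps FIFO order within a priority.
        self._counter = itertools.count()

    def put(self, priority, job):
        heapq.heappush(self._heap, (priority, next(self._counter), job))

    def get(self):
        _priority, _seq, job = heapq.heappop(self._heap)
        return job

    def __len__(self):
        return len(self._heap)

queue = WorkQueue()
queue.put(FEDERATION, "deliver vote to peer A")
queue.put(FEDERATION, "deliver comment to peer B")
queue.put(USER, "render front page")

order = [queue.get() for _ in range(len(queue))]
print(order)
```

The user-facing job is served first even though it was queued last. Strict priority like this can starve federation under heavy user load, so a real system would probably want separate worker pools or some fairness rule on top.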

[–] BitOneZero 5 points 2 years ago

Federation also does a lot of live HTTP connects to other peers. It looks up users for messages. The whole design is very resource-intensive: it handles one single vote, comment, or post at a time. There is also a lot of boilerplate JSON overhead in sending something as simple as a single vote.
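To get a feel for the JSON overhead, here is a small Python sketch that builds a generic ActivityStreams-style `Like` activity and compares its size to the bare information a vote carries. The URLs are placeholders, and the exact payload Lemmy sends may well have more or different fields, so treat this as an illustration rather than a capture of real traffic.

```python
import json

# Generic ActivityStreams "Like" activity. All IDs below are made-up placeholders.
activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "id": "https://example.social/activities/like/1234",
    "type": "Like",
    "actor": "https://example.social/u/alice",
    "object": "https://other.example/post/5678",
}

payload = json.dumps(activity).encode("utf-8")

# The information a vote actually carries: who voted, on what, and the score.
essential = json.dumps([activity["actor"], activity["object"], 1]).encode("utf-8")

print(len(payload), len(essential))
```

And that is only the body. Each vote also costs a full HTTP request with its own headers, once per peer instance that needs to hear about it.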