Relevant section you mentioned (https://lemm.ee/post/691830):
We found that the many 502 errors were caused by an issue in Lemmy/markdown-it.actix or whatever, causing nginx to temporarily mark an upstream as dead. As a workaround we can either 1.) only use 1 container or 2.) set ~~`proxy_next_upstream timeout;`~~ `max_fails=5`
in nginx.

Currently we're running with 1 Lemmy container, so the 502 errors are completely gone so far, and because of the fixes in the Lemmy code everything seems to be running smoothly. If needed we could spin up a second Lemmy container using the ~~`proxy_next_upstream timeout;`~~ `max_fails=5`
workaround, but for now it seems to hold with 1.

Edit: So as soon as the US folks wake up (hi!) we seem to need the second Lemmy container for performance. So that's now started, and I noticed the `proxy_next_upstream timeout` setting didn't work (or I didn't set it properly), so I used `max_fails=5` for each upstream, which does actually work.
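
For anyone wanting to try the same thing, here is a rough sketch of what "`max_fails=5` for each upstream" could look like. The post doesn't show the actual config, so the upstream name, container hostnames, and port below are my assumptions; only the `max_fails=5` parameter comes from the quote. It would go inside the `http` block.

```nginx
# Sketch only: "lemmy", "lemmy-1", "lemmy-2" and port 8536 are assumed names,
# not taken from the quoted post.
upstream lemmy {
    # max_fails defaults to 1, so a single failed attempt can mark a server
    # unavailable for fail_timeout (10s by default). Raising it to 5 makes
    # nginx tolerate a few failures before taking the container out of rotation.
    server lemmy-1:8536 max_fails=5;
    server lemmy-2:8536 max_fails=5;
}

server {
    listen 80;

    location / {
        # Requests are balanced across the two containers defined above.
        proxy_pass http://lemmy;
    }
}
```

As far as I understand it, this doesn't fix the underlying error. It just makes nginx less eager to treat a container as dead, which seems to be what stopped the 502s in their case.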