Technology

History's Major Downtimes: Lessons from the Biggest Outages (lemmy.world)

submitted 1 week ago by evrimsel to c/technology


Online service reliability is crucial in the digital age. Even robust systems can face unexpected outages, affecting various platforms. Let's explore the insights!

https://robotalp.com/blog/historys-major-downtimes-lessons-from-the-biggest-outages/


[–] [email protected] 2 points 1 week ago (1 children)

Easy answer. Don't use platforms. Use protocols. Lemmy doesn't go down, Mastodon doesn't go down, Nostr doesn't go down, Monero doesn't go down, Bitcoin doesn't go down.

Facebook goes down, Zoom goes down, AWS goes down.

[–] somebodysomewhere 5 points 1 week ago

The reason for that is isolation and redundancy, though. Most incidents and outages are the result of a change, and in the cases you mentioned they are mitigated by the fact that not all instances receive updates at the same time. Presumably, the error is noticed in one place and traffic is then served by healthy instances.
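To make the "traffic is then served by healthy instances" part concrete, here is a minimal sketch in Python. The instance names and version strings are made up for illustration, and real routing (DNS, load balancers, federation) is far more involved than a round-robin over a list.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str        # hypothetical hostname
    version: str     # hypothetical software version
    healthy: bool = True


def route(instances, request_id):
    """Pick a healthy instance for a request, round-robin by request id."""
    healthy = [i for i in instances if i.healthy]
    if not healthy:
        raise RuntimeError("no healthy instances available")
    return healthy[request_id % len(healthy)]


# Three instances; only the first has taken the new (bad) update so far.
instances = [
    Instance("a.example", "2.0"),
    Instance("b.example", "1.9"),
    Instance("c.example", "1.9"),
]
instances[0].healthy = False  # the bad update is noticed on one instance

# Requests keep being served, just not by the broken instance.
served_by = {route(instances, r).name for r in range(6)}
```

Because the other two instances have not applied the change yet, the failure stays isolated to one place.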

By all accounts, these are practices that major service providers follow. In fact, AWS typically rolls out updates to us-east-1 before updating other regions, using it as a canary to catch issues early.
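A canary rollout of that kind can be sketched roughly as follows. This is an illustration under my own assumptions, not how any particular provider implements it: the stage names, the `deploy` and `error_rate` callbacks, and the 1% threshold are all hypothetical.

```python
def staged_rollout(stages, deploy, error_rate, max_error_rate=0.01):
    """Deploy stage by stage and stop at the first stage that looks unhealthy.

    Returns (verified_stages, halted_at). halted_at is None when every stage
    passed. A stage that fails the check has already been deployed to and
    would still need a rollback, which this sketch leaves out.
    """
    verified = []
    for stage in stages:
        deploy(stage)
        if error_rate(stage) > max_error_rate:
            return verified, stage
        verified.append(stage)
    return verified, None


# Hypothetical observed error rates after deploying to each stage.
observed = {"canary": 0.002, "region-b": 0.25, "region-c": 0.001}
deployed_to = []

verified, halted_at = staged_rollout(
    stages=["canary", "region-b", "region-c"],
    deploy=deployed_to.append,
    error_rate=observed.get,
)
```

The point is that the rollout halts at the first unhealthy stage, so the remaining stages never receive the bad change.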

With federated services, this is less of a conscious decision and tends to happen only because instance maintainers update on different schedules.

Blue-green deployments and failover are common mitigation strategies, and mature organizations actively employ them. In the fediverse and other distributed systems such as CDNs, by contrast, these patterns are inherent in the decentralized design rather than deliberately engineered.
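For anyone unfamiliar with blue-green deployments, a minimal sketch of the idea: two environments exist side by side, the new version goes to the idle one, and traffic only switches over if a health check passes. The class name, version strings, and `health_check` callback are hypothetical and stand in for real infrastructure.

```python
class BlueGreenRouter:
    """Toy model of a blue-green switch between two environments."""

    def __init__(self, blue_version, green_version):
        self.environments = {"blue": blue_version, "green": green_version}
        self.live = "blue"

    @property
    def idle(self):
        return "green" if self.live == "blue" else "blue"

    def serving(self):
        """Version currently receiving traffic."""
        return self.environments[self.live]

    def deploy(self, version, health_check):
        """Install on the idle environment; switch traffic only if healthy."""
        target = self.idle
        self.environments[target] = version
        if health_check(version):
            self.live = target
            return True
        return False  # live environment untouched, users never saw the bad build

    def rollback(self):
        """Point traffic back at the other environment."""
        self.live = self.idle


router = BlueGreenRouter("v1", "v1")
first = router.deploy("v2", health_check=lambda v: True)    # switches to green
second = router.deploy("v3", health_check=lambda v: False)  # stays on green
```

After these two calls the router still serves `v2`: the failed `v3` build only ever reached the idle environment.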