this post was submitted on 01 Jul 2023
Interesting News from Around the World
People fail to understand that large projects have inertia. He could have shuttered all Twitter offices, fired all employees, and only paid the server bills, and the website would probably have continued to function just fine for a few months.
But as a DevOps/SRE, this whole saga has been awesome to watch.
Weren't they not paying their AWS bills for a while?
Back in March it was reported they weren’t paying their AWS bill. Two weeks ago it was reported they weren’t paying their GCP bill either.
And often the tipping point is invisible. Some small routine or service degrades, but outwardly everything still works fine. There is just more strain on the services and clients that use that service, causing them to slowly degrade over the next few hours, days, or weeks, which in turn puts more strain on the services that call those services, and so on up the chain.
Until one day the system is so degraded major things start breaking. It seems like it came out of nowhere, but the initial failure happened weeks ago and has been cascading since then.
Once a system hits that point it's often not enough to just fix the initial problem because so much of the ecosystem around it has been thrown out of whack.
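To make the cascade described above concrete, here is a toy sketch in Python. Everything in it is an assumption made up for illustration: the three service names, the `strain` factor, and the rule that a service loses health in proportion to how degraded its worst dependency is. Real systems also have retries, autoscaling, and recovery, none of which are modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    depends_on: list = field(default_factory=list)
    health: float = 1.0  # 1.0 = fully healthy, 0.0 = down (made-up scale)


def step(services, strain=0.5):
    """Advance the toy model by one tick.

    Each service loses health in proportion to how degraded its worst
    dependency was at the start of the tick. There is no self-healing
    in this model; that is a deliberate simplification.
    """
    before = {name: svc.health for name, svc in services.items()}
    for svc in services.values():
        if not svc.depends_on:
            continue
        worst = min(before[dep] for dep in svc.depends_on)
        svc.health = max(0.0, svc.health - strain * (1.0 - worst))


def ticks_until_below(services, watched, threshold, max_ticks=1000):
    """Run the model until `watched` drops below `threshold`.

    Returns the tick number, or None if it never happens.
    """
    for tick in range(1, max_ticks + 1):
        step(services)
        if services[watched].health < threshold:
            return tick
    return None


def build_chain():
    """Hypothetical call chain: frontend -> api -> cache."""
    return {
        "cache": Service("cache"),
        "api": Service("api", depends_on=["cache"]),
        "frontend": Service("frontend", depends_on=["api"]),
    }


if __name__ == "__main__":
    services = build_chain()
    services["cache"].health = 0.8  # the small, invisible initial failure
    tick = ticks_until_below(services, "frontend", threshold=0.4)
    print("frontend visibly broken at tick", tick)

    services["cache"].health = 1.0  # fix the original problem
    step(services)
    print("frontend health one tick after the fix:", services["frontend"].health)
```

In this sketch the frontend barely moves for the first couple of ticks and then falls quickly, and restoring the cache afterwards does not help on its own, because the api in the middle is still degraded and keeps dragging the frontend down.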
See the film Passengers for an example of cascade failures from systems trying to cover for each other.
The Expanse has a whole b-plot about an artificial ecosystem going through cascade failure in one of its arcs.
As a way-too-seasoned web developer who appreciates working alongside great SREs, this has been pretty interesting. I'm honestly surprised more hasn't gone wrong, but maybe that's yet to come. Since they are (I imagine) losing users instead of growing, they might actually avoid running into the scaling issues that were looming.