this post was submitted on 02 Aug 2023

Main - Lemmy.tf


So after a few days of back and forth with support, I may have finally gotten some insight into why the server keeps randomly rebooting. Apparently, their crappy datacenter monitoring keeps triggering ping loss alerts, and every time it does they send an engineer over to physically reboot the server. I was not aware that this was the default monitoring option on their current server lines. I have now disabled it, which should prevent forced reboots going forward.

I am standing up a basic ping monitor to alert me via email and SMS if the server actually goes down, so I can quickly reboot it myself if ever needed (I may even write a script to reboot via API after x consecutive ping failures, or something). The full monitoring stack is still in progress but is not strictly necessary for stability at the moment.
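For anyone curious, here is a rough sketch of what the "reboot after x consecutive ping failures" idea could look like. This is not the actual monitor, just a minimal illustration. The names `ping_once` and `watch` are made up for this example, the `ping` flags assume a Linux host, and the alert or reboot action is left as a callback because the provider's API is not something I can show here.

```python
import subprocess
import time
from typing import Callable, Optional


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo request; True if the host replied.

    Assumes the Linux `ping` binary (-c count, -W reply timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(
    host: str,
    threshold: int,
    check: Callable[[str], bool],
    on_down: Callable[[str], None],
    interval_s: float = 30.0,
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll `host` and call `on_down` once per outage.

    An outage is `threshold` failed checks in a row. The counter resets on
    any successful check, so a single dropped ping never triggers anything.
    Returns how many times `on_down` was called.
    """
    failures = 0
    triggered = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if check(host):
            failures = 0
        else:
            failures += 1
            # Fire exactly when the threshold is reached, not on every
            # later failure, to avoid repeated alerts or reboot loops.
            if failures == threshold:
                on_down(host)
                triggered += 1
        sleep(interval_s)
    return triggered
```

Something like `watch("server.example", 3, ping_once, send_alert)` would then run until stopped, where `send_alert` is a placeholder for whatever sends the email/SMS or calls the reboot API. Requiring several failures in a row is the whole point: a single lost ping should never be enough to reboot the box.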

top 1 comments
[–] [email protected] 1 points 11 months ago

I'm going to take a look at getting some monitoring infrastructure spun up.