How do status monitoring websites work under the hood? (programming.dev)

submitted 4 months ago* (last edited 4 months ago) by [email protected] to c/[email protected]

I am talking about the services that let you monitor the status of a website: whether it is up and operational, down, or under heavy load.

How do they work under the hood?

For example:

https://githubstatus.com

https://instatus.com

I am building something similar for monitoring my web projects.

top 7 comments

[–] echo64 33 points 4 months ago

The answer that the status service websites will tell you: we automatically detect outages by performing HTTP requests and checking responses for errors.

The actual answer: some overworked developer gets woken up at 3am via PagerDuty and manually sets the status website to an outage state.

[–] [email protected] 12 points 4 months ago

You should check out Uptime Kuma, which offers different monitor types. This should give you a good start for your own implementation. Or maybe you'll find that Uptime Kuma already covers your use case.

[–] [email protected] 8 points 4 months ago

A lot of external status services just send an HTTP request to a certain URL: if it succeeds then the site is up, and if it errors or times out then it's down. If you aren't using HTTP, they also usually let you check whether a TCP port completes the usual handshake.

The response time can also be used to check whether a site is running slower than usual, and if you have a use for it you can usually specify the response code required for success.
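
A minimal sketch of those two checks in Python, using only the standard library. The function names `check_http` and `check_tcp`, the 3-second default timeout, and the default expected status of 200 are illustrative assumptions, not how any particular status service implements it.

```python
import http.client
import socket
import time
import urllib.error
import urllib.request


def check_http(url, expected_status=200, timeout=3.0):
    """Return (is_up, status_code, elapsed_seconds) for one GET request."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with a 4xx/5xx status code.
        status = err.code
    except (OSError, http.client.HTTPException):
        # DNS failure, refused connection, timeout, TLS error, bad response.
        return False, None, time.monotonic() - start
    return status == expected_status, status, time.monotonic() - start


def check_tcp(host, port, timeout=3.0):
    """Return True if a TCP handshake with host:port completes in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

The elapsed time returned by `check_http` is what you would compare against a baseline to spot a site that is up but slower than usual.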

I wouldn't be surprised if GitHub also has some per-server analytics they can use to estimate the load, but Instatus would work as described above.

Sometimes these sorts of things are referred to as health checks, if you're looking for search terms. For example, Docker can be set up to poll a container's web server every few minutes and mark the container as unhealthy if it stops replying, using the HEALTHCHECK instruction in the Dockerfile.

[–] TCB13 8 points 4 months ago

Simple: do a GET or HEAD HTTP request to the monitored website with a 3-second timeout. If you get a 200 response code, you can assume the website is online and okay.

Why HEAD? Because:

if a URL might produce a large download, a HEAD request could read its Content-Length header to check the filesize without actually downloading the file. (...) A response to a HEAD method should not have a body

Using HEAD instead of GET means your code doesn't have to download your front page to get the status. This speeds things up and reduces bandwidth usage.

Note: web servers may also return response codes for redirects, like 301 or 308, and in this case you usually do a follow-up request to the URL the server pointed you at in order to check if it returns 200. Some HTTP libraries have built-in ways of handling this, and with a simple boolean they'll follow the redirect for you.
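
For illustration, here is a rough Python sketch of a HEAD check that follows redirects by hand, using the standard `http.client` module. The names `head_status` and `is_online` and the limit of five redirects are assumptions made for the example; many HTTP libraries will do the redirect step for you.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def head_status(url, timeout=3.0, max_redirects=5):
    """Send HEAD requests, following redirects by hand; return the final status."""
    for _ in range(max_redirects + 1):
        parts = urlsplit(url)
        conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                    else http.client.HTTPConnection)
        conn = conn_cls(parts.hostname, parts.port, timeout=timeout)
        try:
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            conn.request("HEAD", path)
            resp = conn.getresponse()
            status = resp.status
            location = resp.getheader("Location")
        finally:
            conn.close()
        if status in REDIRECT_CODES and location:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, location)
            continue
        return status
    raise RuntimeError("too many redirects")


def is_online(url):
    """True only if the (possibly redirected) URL finally answers 200."""
    try:
        return head_status(url) == 200
    except (OSError, http.client.HTTPException, RuntimeError):
        return False
```

Some servers do not implement HEAD on every route, so falling back to GET when HEAD returns 405 may be worth considering.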

[–] [email protected] 4 points 4 months ago

A web service can be passively monitored.
So, the status system would check DNS records, ping IP addresses and do a GET request to check that it gets a 200 response. Further metrics like ping and response times could be monitored and reported if they are too high, indicating heavy load.
Uptime Kuma is a FOSS project that is popular amongst self-hosters.
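
As a rough sketch of the "too high means heavy load" idea, the Python function below turns a window of recent response times into a status label. The name `classify`, the labels, and the 1-second threshold are all assumptions for illustration; a real monitor would tune these against its own baseline.

```python
from statistics import median


def classify(samples, slow_after=1.0):
    """Classify a site from recent response times (seconds; None = failed check)."""
    if not samples:
        return "unknown"
    answered = [s for s in samples if s is not None]
    if not answered:
        # Every recent check failed or timed out.
        return "down"
    if median(answered) > slow_after:
        # Responding, but consistently slower than the threshold.
        return "heavy load"
    return "up"
```

The median is used here so that a single slow request does not flip the status on its own.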

A web service can actively report for monitoring. So a web service would monitor its CPU/RAM/network usage, database connections, cache misses, stuff like that. If you are load balancing, then an additional service would be needed to aggregate the results of all these and decide when there is degraded performance due to too many nodes being offline/overloaded.
Things like Prometheus and Netdata can do the metrics.
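
The aggregation step could look something like this Python sketch. The `NodeReport` fields, the 90% CPU limit, and the rule that a quarter of nodes being unhealthy means "degraded" are hypothetical choices for the example, not values taken from Prometheus, Netdata, or any status service.

```python
from dataclasses import dataclass


@dataclass
class NodeReport:
    name: str
    online: bool
    cpu_percent: float


def overall_status(reports, cpu_limit=90.0, degraded_fraction=0.25):
    """Combine per-node reports into a single status for the status page."""
    if not reports:
        return "unknown"
    # A node counts as unhealthy if it is offline or overloaded.
    unhealthy = [r for r in reports if not r.online or r.cpu_percent >= cpu_limit]
    if len(unhealthy) == len(reports):
        return "outage"
    if len(unhealthy) / len(reports) >= degraded_fraction:
        return "degraded"
    return "operational"
```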

Or, like how I think a lot of these work, just report it manually. I've seen quite a few companies that report green status despite having fairly huge issues.

[–] nutsack 4 points 4 months ago* (last edited 4 months ago)

curl -k https://example.com/healthcheck

[–] [email protected] 2 points 4 months ago

Nearly always it's by "pinging", which may or may not actually use ping. Some server somewhere is sitting there querying the server every minute to see if it responds with a 200. The better status pages will also try to exercise various routes and report whether portions of the server are available.
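
A small Python sketch of such a polling loop, checking several routes once a minute. The `poll` function, its parameters, and the route names in the usage example are hypothetical; `check` stands for any function you supply that takes a URL and returns True when the route answered correctly.

```python
import time


def poll(routes, check, interval=60.0, rounds=None, sleep=time.sleep):
    """Yield {route_name: is_up} once per interval.

    routes maps a display name to a URL; rounds=None polls forever.
    """
    done = 0
    while rounds is None or done < rounds:
        yield {name: check(url) for name, url in routes.items()}
        done += 1
        if rounds is None or done < rounds:
            sleep(interval)


# Example usage with a stand-in check (replace with a real HTTP check):
# for snapshot in poll({"home": "https://example.com/"}, check=lambda url: True):
#     print(snapshot)
```

Reporting each route separately is what lets a status page show "API degraded" while the main site stays green.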