Add support for health-checking #113

JarvisCraft · 2021-09-17T00:35:36Z

Description

Kubernetes-backed servers may want to report their current status via a health check which may rely on implementation-specific details (such as servers exposing their internal connectivity info).

In order to implement this behaviour as part of this plugin:

HealthCheck and HealthChecks interfaces were added to facilitate the creation of such health checks
HealthChecks became registered as a Bukkit service so that other plugins can modify it
an extra endpoint /health was added in order to response with 2xx (i.e. 200) code when the checks return healthy status and 4xx (i.e. 403) otherwise

Affected code

Among the changes, the creation of MetricsController was abstracted into a static factory.

Notes

Default Jetty handler is now used for handling incorrect endpoint.
It may be reasonable to make exact endpoints used configurable but it is not part of this PR (as this was not configurable before).

Combustible · 2021-11-09T14:59:11Z

This is great, I actually just bounced through here thinking about adding a PR for something similar.

It looks like this doesn't include a default health check though? It looks like it'll always return healthy in the current state.

What we do on our system is start a timer task that updates a last tick time variable with the current time. Every time a liveness check comes in, we check if it's been less than 15s since the last tick. If yes, healthy, if not, server is probably hung and unhealthy. This is also handy because it means the server shows up as unhealthy until it starts processing ticks (desired behavior which allows kubernetes to block players joining a server that isn't ready yet)

I doubt you need reference code for this... but here it is anyway:
Runnable created on startup/enable

          new BukkitRunnable() {
              @Override
              public void run() {
                  mLastTickTime = System.currentTimeMillis();
              }
          }.runTaskTimer(mPlugin, 1, 1);

Request handler

          @Override
          public void handle(HttpExchange request) throws IOException {
              final byte[] response;
              if (System.currentTimeMillis() - mLastTickTime < ALIVE_TICK_THRESHOLD) {
                  response = "Alive!".getBytes();
                  request.sendResponseHeaders(200, response.length);
              } else {
                  response = "Server not ticking - possibly hung".getBytes();
                  request.sendResponseHeaders(503, response.length);
              }
              OutputStream stream = request.getResponseBody();
              stream.write(response);
              stream.close();
          }

sladkoff · 2021-12-13T23:35:15Z

@JarvisCraft Sorry for the delayed response. I think this is quite useful and the PR looks very promising already! However, I'm currently having a hard time deciding if this technically belongs into this plugin or not because it has nothing to do with Prometheus (as in "minecraft-prometheus-exporter" 😄 )

That's more of a philosophical question though.

Personally, I feel like the options are:

Merge this and rename the plugin to something more broad like 'minecraft-monitor' / 'minecraft-status-exporter' / etc (I'm bad with names)
- good: 'all-in-one' plugin for Minecraft observability stuff
- bad: codebase might grow a bit and become more complex, some people might never need one or the other (Prometheus or health checks)
Offer a separate plugin based on this PR ('minecraft-health-exporter'...)
- good: two simple plugins with clear responsibilities
- bad: the new plugin will cause some (code) duplication (another HTTP server...)

I might be overthinking. What's your opinion on this?

Combustible · 2021-12-14T07:43:04Z

The biggest reason I'd like to see this exist in this plugin is that it already has a web server. Not that the overhead is at all high to have two web servers... but this one is far more likely to actually be maintained than a stand-alone health checking plugin IMO, and it doesn't "cost" anything to just add an extra endpoint to the existing server (the port, etc. is already all set up)

I'm currently shading a health checking webserver into my own already-giant plugin for exactly this purpose, which I don't particularly like. Of all the plugins we run with, this is the only one that it makes sense to combine into. I'd also use a stand-alone plugin if someone maintained it, but it's just one more thing I need to configure (we are running in docker containers, need to expose ports / etc.). Not hard, just another step.

JarvisCraft · 2021-12-18T13:57:20Z

Hi there,

@sladkoff, I guess that the better option is to merge the functionality into this single plugin as it lets reuse the HTTP-server bound to the single port. As for the naming it's of course up to you :) I don't think that this would lead to uncontrolled growth of the codebase. Especially, considering that the current change does not require the plugin itself to implement any health-checks but provides a mechanism (via ServicesManager) for other plugins to register their own checks.

@Combustible, sorry for the late reply. Yes, the changeset deliberately does not include any health check by itself because different servers may have different statuses for these. This can be added on top of current API but I think that it should (at least) be presented as a separate PR once this one is in its ready-approved state.

sladkoff · 2021-12-23T22:36:51Z

Alright guys, thanks for the input. I'll try to review this in detail in the coming weeks 🤞

stale · 2022-07-10T22:18:10Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

sladkoff · 2024-02-29T17:31:31Z

I'm really really sorry for the delay on this 🙈

I've rebased this branch and created a new PR: #229

JarvisCraft added 3 commits September 17, 2021 02:38

feat: add HealthCheck

76054d4

feat: add HealthChecks

10d6904

feat: implement /health endpoint

65a86ed

JarvisCraft changed the title ~~feat: add HealthCheck~~ Add support for health-checking Sep 17, 2021

fix: use correct handler hierarchy

f981e4a

sladkoff added the enhancement label Nov 4, 2021

sladkoff added this to the 2.5.0 milestone Nov 4, 2021

sladkoff modified the milestones: 2.5.0, 2.6.0 Apr 11, 2022

stale bot added the wontfix label Jul 10, 2022

sladkoff removed the wontfix label Jul 11, 2022

sladkoff closed this Feb 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for health-checking #113

Add support for health-checking #113

JarvisCraft commented Sep 17, 2021

Combustible commented Nov 9, 2021

sladkoff commented Dec 13, 2021

Combustible commented Dec 14, 2021

JarvisCraft commented Dec 18, 2021

sladkoff commented Dec 23, 2021

stale bot commented Jul 10, 2022

sladkoff commented Feb 29, 2024

Add support for health-checking #113

Add support for health-checking #113

Conversation

JarvisCraft commented Sep 17, 2021

Description

Affected code

Notes

Combustible commented Nov 9, 2021

sladkoff commented Dec 13, 2021

Combustible commented Dec 14, 2021

JarvisCraft commented Dec 18, 2021

sladkoff commented Dec 23, 2021

stale bot commented Jul 10, 2022

sladkoff commented Feb 29, 2024