-
-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for health-checking #113
Conversation
This is great, I actually just bounced through here thinking about adding a PR for something similar. It looks like this doesn't include a default health check though? It looks like it'll always return healthy in the current state. What we do on our system is start a timer task that updates a last tick time variable with the current time. Every time a liveness check comes in, we check if it's been less than 15s since the last tick. If yes, healthy, if not, server is probably hung and unhealthy. This is also handy because it means the server shows up as unhealthy until it starts processing ticks (desired behavior which allows kubernetes to block players joining a server that isn't ready yet) I doubt you need reference code for this... but here it is anyway: new BukkitRunnable() {
@Override
public void run() {
mLastTickTime = System.currentTimeMillis();
}
}.runTaskTimer(mPlugin, 1, 1); Request handler @Override
public void handle(HttpExchange request) throws IOException {
final byte[] response;
if (System.currentTimeMillis() - mLastTickTime < ALIVE_TICK_THRESHOLD) {
response = "Alive!".getBytes();
request.sendResponseHeaders(200, response.length);
} else {
response = "Server not ticking - possibly hung".getBytes();
request.sendResponseHeaders(503, response.length);
}
OutputStream stream = request.getResponseBody();
stream.write(response);
stream.close();
} |
@JarvisCraft Sorry for the delayed response. I think this is quite useful and the PR looks very promising already! However, I'm currently having a hard time deciding if this technically belongs into this plugin or not because it has nothing to do with Prometheus (as in "minecraft-prometheus-exporter" 😄 ) That's more of a philosophical question though. Personally, I feel like the options are:
I might be overthinking. What's your opinion on this? |
The biggest reason I'd like to see this exist in this plugin is that it already has a web server. Not that the overhead is at all high to have two web servers... but this one is far more likely to actually be maintained than a stand-alone health checking plugin IMO, and it doesn't "cost" anything to just add an extra endpoint to the existing server (the port, etc. is already all set up) I'm currently shading a health checking webserver into my own already-giant plugin for exactly this purpose, which I don't particularly like. Of all the plugins we run with, this is the only one that it makes sense to combine into. I'd also use a stand-alone plugin if someone maintained it, but it's just one more thing I need to configure (we are running in docker containers, need to expose ports / etc.). Not hard, just another step. |
Hi there, @sladkoff, I guess that the better option is to merge the functionality into this single plugin as it lets reuse the HTTP-server bound to the single port. As for the naming it's of course up to you :) I don't think that this would lead to uncontrolled growth of the codebase. Especially, considering that the current change does not require the plugin itself to implement any health-checks but provides a mechanism (via @Combustible, sorry for the late reply. Yes, the changeset deliberately does not include any health check by itself because different servers may have different statuses for these. This can be added on top of current API but I think that it should (at least) be presented as a separate PR once this one is in its ready-approved state. |
Alright guys, thanks for the input. I'll try to review this in detail in the coming weeks 🤞 |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
I'm really really sorry for the delay on this 🙈 I've rebased this branch and created a new PR: #229 |
Description
Kubernetes-backed servers may want to report their current status via a health check which may rely on implementation-specific details (such as servers exposing their internal connectivity info).
In order to implement this behaviour as part of this plugin:
HealthCheck
andHealthChecks
interfaces were added to facilitate the creation of such health checksHealthChecks
became registered as a Bukkit service so that other plugins can modify it/health
was added in order to response with 2xx (i.e. 200) code when the checks return healthy status and 4xx (i.e. 403) otherwiseAffected code
Among the changes, the creation of
MetricsController
was abstracted into a static factory.Notes
Default Jetty handler is now used for handling incorrect endpoint.
It may be reasonable to make exact endpoints used configurable but it is not part of this PR (as this was not configurable before).