Health Checks
Health checks let you monitor the health of your application by writing small tests that return a healthy, degraded or unhealthy result. This is useful for testing not only the internal health of your application but also its external dependencies, such as a third-party API that your application relies on to function correctly.
Health checks are written either by inheriting HealthCheck or by using the IHealthCheckFactory provided via the IMetricsHostBuilder. App Metrics automatically registers any class inheriting HealthCheck and executes all checks asynchronously, either through a configured IMetricReporter or, when using App.Metrics.Extensions.Middleware, when the /health endpoint is requested. External monitoring tools can be configured to request the /health endpoint to continuously test the health of your API and alert via the desired means. The endpoint returns a 200 status code when the checks are healthy and a 500 status code if any health check is unhealthy.
Configuring Health Checks
Ensure that health checking is enabled in your Startup.cs in the case of an ASP.NET application, or in your Program.cs in the case of a console application.
public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddMvc(options => options.AddMetricsResourceFilter());

        services.AddMetrics()
            // .AddJsonSerialization() - Enables JSON format on the /metrics-text, /metrics, /health and /env endpoints.
            .AddJsonMetricsSerialization() // Enables JSON format on the /metrics endpoint.
            .AddJsonMetricsTextSerialization() // Enables JSON format on the /metrics-text endpoint.
            .AddJsonHealthSerialization() // Enables JSON format on the /health endpoint.
            .AddJsonEnvironmentInfoSerialization() // Enables JSON format on the /env endpoint.
            .AddHealthChecks()
            .AddMetricsMiddleware();
    }

    public void Configure(IApplicationBuilder app, ILoggerFactory loggerFactory)
    {
        app.UseMetrics();
        app.UseMvc();
    }
}
Implementing a Health Check
Health checks can be implemented as a standalone class:
public class DatabaseHealthCheck : HealthCheck
{
    private readonly IDatabase _database;

    public DatabaseHealthCheck(IDatabase database)
        : base("DatabaseCheck")
    {
        _database = database;
    }

    protected override Task<HealthCheckResult> CheckAsync(CancellationToken token = default(CancellationToken))
    {
        // Exceptions will be caught and the result will be unhealthy.
        _database.Ping();

        return Task.FromResult(HealthCheckResult.Healthy());
    }
}
Or via the fluent builder in your startup code:
public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddMetrics()
            .AddJsonSerialization()
            .AddHealthChecks(factory =>
            {
                factory.Register("DatabaseConnected",
                    () => Task.FromResult("Database Connection OK"));
            })
            .AddMetricsMiddleware();
    }

    public void Configure(IApplicationBuilder app, ILoggerFactory loggerFactory)
    {
        app.UseMetrics();
    }
}
Note
As well as scanning the executing assembly for health checks, App Metrics will also scan all referenced assemblies which have a dependency on App.Metrics and register any health checks it finds.
Predefined Health Checks
App Metrics includes some predefined health checks, which can be registered on startup as shown below.
public void ConfigureServices(IServiceCollection services)
{
    var threshold = 1;

    services.AddMetrics()
        .AddJsonSerialization()
        .AddHealthChecks(factory =>
        {
            // Check that the current amount of private memory in bytes is below a threshold
            factory.RegisterProcessPrivateMemorySizeHealthCheck("Private Memory Size", threshold);
            // Check that the current amount of virtual memory in bytes is below a threshold
            factory.RegisterProcessVirtualMemorySizeHealthCheck("Virtual Memory Size", threshold);
            // Check that the current amount of physical memory in bytes is below a threshold
            factory.RegisterProcessPhysicalMemoryHealthCheck("Working Set", threshold);
            // Check connectivity to google with a "ping", passes if the result is `IPStatus.Success`
            factory.RegisterPingHealthCheck("google ping", "google.com", TimeSpan.FromSeconds(10));
            // Check connectivity to github by ensuring the GET request results in `IsSuccessStatusCode`
            factory.RegisterHttpGetHealthCheck("github", new Uri("https://github.com/"), TimeSpan.FromSeconds(10));
        })
        .AddMetricsMiddleware();
}
Metric Health Checks
Metric health checks can be used to define SLAs for your application. For example, we could define that the overall 98th percentile of web transaction durations should be below 100 ms to be considered healthy, between 100 ms and 200 ms to be considered degraded, and unhealthy otherwise.
public void ConfigureServices(IServiceCollection services)
{
    services.AddMetrics()
        .AddHealthChecks(factory =>
        {
            factory.RegisterMetricCheck(
                name: "Overall Response Time",
                options: MyMetricsRegistry.OverallWebRequestTimer,
                passing: value => (message: $"OK. 98th Percentile < 100ms ({value.Histogram.Percentile98}{MyMetricsRegistry.OverallWebRequestTimer.DurationUnit.Unit()})", result: value.Histogram.Percentile98 < 100),
                warning: value => (message: $"WARNING. 98th Percentile > 100ms ({value.Histogram.Percentile98}{MyMetricsRegistry.OverallWebRequestTimer.DurationUnit.Unit()})", result: value.Histogram.Percentile98 < 200),
                failing: value => (message: $"FAILED. 98th Percentile > 200ms ({value.Histogram.Percentile98}{MyMetricsRegistry.OverallWebRequestTimer.DurationUnit.Unit()})", result: value.Histogram.Percentile98 > 200));
        });
}
Apdex
When using App.Metrics.Extensions.Middleware, an Apdex health check is also registered automatically; this can be disabled through the configuration options. The Apdex health check provides a healthy result for a satisfied score, a degraded result for a tolerating score and an unhealthy result for a frustrating score.
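For readers unfamiliar with Apdex, the score behind this check can be sketched with the standard Apdex formula: satisfied samples plus half of the tolerating samples, divided by the total sample count. The class and parameter names below are hypothetical and are not part of App Metrics. The score ranges that App Metrics treats as satisfied, tolerating or frustrating are not stated here, so this sketch only computes the score.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not an App Metrics type.
public static class ApdexSketch
{
    // Standard Apdex formula: (satisfied + tolerating / 2) / total samples.
    // A sample is satisfied when it takes at most T, tolerating when it takes
    // at most 4T, and frustrating otherwise.
    public static double Score(IReadOnlyCollection<double> durationsSeconds, double tSeconds)
    {
        if (durationsSeconds.Count == 0)
        {
            throw new ArgumentException("At least one sample is required.", nameof(durationsSeconds));
        }

        var satisfied = 0;
        var tolerating = 0;

        foreach (var duration in durationsSeconds)
        {
            if (duration <= tSeconds)
            {
                satisfied++;
            }
            else if (duration <= 4 * tSeconds)
            {
                tolerating++;
            }
        }

        return (satisfied + tolerating / 2.0) / durationsSeconds.Count;
    }
}
```

With T = 0.5 seconds and samples of 0.1, 0.3, 0.6, 1.0 and 3.0 seconds, two samples are satisfied, two are tolerating and one is frustrating, giving a score of (2 + 1) / 5 = 0.6.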
Viewing from a Web Host
Below is a snippet from a /health response generated by a web host using the App.Metrics.Extensions.Middleware NuGet package.
{
    "degraded": {
        "message queue reached the threshold": "5000 messages in the processing queue which is above the threshold of 4000",
        "signing certificate expiry": "the signing certificate is going to expire in 1 week"
    },
    "healthy": {
        "database connection": "able to make a connection to the database"
    },
    "status": "Unhealthy",
    "timestamp": "0001-01-01T00:00:00.0000Z",
    "unhealthy": {
        "unable to ping x api": "a connection to x could not be made"
    }
}
Health Check Results
There are three types of health check result: Healthy, Degraded and Unhealthy, as shown above.
- Healthy: Indicates that the check has passed. In a web host, a 200 HTTP status is returned.
- Degraded: Indicates that the check has failed but the application is still functioning as expected. This can be useful, for example, when a threshold is reached on the number of messages in a queue, or when an SSL certificate used for signing tokens is about to expire. In a web host, a 200 HTTP status is returned along with a Warning response header.
- Unhealthy: Indicates that the check has failed. In a web host, a 500 HTTP status is returned. Any health check that throws an uncaught exception is reported as an unhealthy result.
Health check results can be returned as follows:
HealthCheckResult.Healthy(message);
HealthCheckResult.Unhealthy(message);
HealthCheckResult.Degraded(message);
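As an illustration of choosing between the three results, the sketch below follows the same pattern as the DatabaseHealthCheck class shown earlier. The IMessageQueue interface, its PendingCount property and both thresholds are hypothetical and exist only for this example. The using directive for the App Metrics namespace that contains HealthCheck and HealthCheckResult is left out because it is not shown in this document and may differ between versions.

```csharp
using System.Threading;
using System.Threading.Tasks;
// Also import the App Metrics namespace that contains HealthCheck and
// HealthCheckResult (not shown in this document, so omitted here).

// Hypothetical abstraction over a message queue, for illustration only.
public interface IMessageQueue
{
    int PendingCount { get; }
}

public class MessageQueueHealthCheck : HealthCheck
{
    // Assumed thresholds for this sketch.
    private const int DegradedThreshold = 4000;
    private const int UnhealthyThreshold = 10000;

    private readonly IMessageQueue _queue;

    public MessageQueueHealthCheck(IMessageQueue queue)
        : base("Message Queue Length")
    {
        _queue = queue;
    }

    protected override Task<HealthCheckResult> CheckAsync(CancellationToken token = default(CancellationToken))
    {
        var pending = _queue.PendingCount;

        if (pending >= UnhealthyThreshold)
        {
            return Task.FromResult(HealthCheckResult.Unhealthy(
                $"{pending} messages in the queue, at or above the limit of {UnhealthyThreshold}"));
        }

        if (pending >= DegradedThreshold)
        {
            return Task.FromResult(HealthCheckResult.Degraded(
                $"{pending} messages in the queue, at or above the threshold of {DegradedThreshold}"));
        }

        return Task.FromResult(HealthCheckResult.Healthy($"{pending} messages in the queue"));
    }
}
```

Keeping the thresholds as named constants makes it easy to see which boundary moves a result from healthy to degraded and from degraded to unhealthy.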
Note
An external monitoring tool can be used to request the /health endpoint at a configured interval to continuously monitor the health of your API and raise alerts.
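A minimal sketch of such a probe is shown below, using only HttpClient from the .NET base class library. The HealthProbe class and ObservedHealth enum are hypothetical names. The mapping follows the status codes described on this page: 200 means healthy, 200 with a Warning response header means degraded. Treating every other status code, and an unreachable host, as unhealthy is an assumption of this sketch.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical types for illustration; not part of App Metrics.
public enum ObservedHealth
{
    Healthy,
    Degraded,
    Unhealthy
}

public static class HealthProbe
{
    // Maps a /health response to a state: 200 is healthy, 200 with a Warning
    // header is degraded, and any other status code is treated as unhealthy.
    public static ObservedHealth Classify(HttpStatusCode statusCode, bool hasWarningHeader)
    {
        if (statusCode != HttpStatusCode.OK)
        {
            return ObservedHealth.Unhealthy;
        }

        return hasWarningHeader ? ObservedHealth.Degraded : ObservedHealth.Healthy;
    }

    public static async Task<ObservedHealth> CheckAsync(HttpClient client, Uri healthUri)
    {
        try
        {
            using (var response = await client.GetAsync(healthUri))
            {
                return Classify(response.StatusCode, response.Headers.Contains("Warning"));
            }
        }
        catch (HttpRequestException)
        {
            // An unreachable host is treated as unhealthy in this sketch.
            return ObservedHealth.Unhealthy;
        }
    }
}
```

A monitoring loop could call HealthProbe.CheckAsync on a timer and raise an alert whenever the result is not Healthy.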