Systems health API

Challenge

Distributing functionality over specialized back-end systems and integrating those services through middleware works well for an incremental digital transformation. However, as responsibility fragments across multiple systems, it becomes increasingly difficult to grasp overall system health and performance. Loosely coupling front-ends to back-end systems has many benefits, but one drawback is that when an issue (for example, a performance issue) hits a front-facing service, its root cause can be hard to diagnose.

Solution

NetMatch built a health monitoring system as a set of .NET Core microservices hosted on Service Fabric. On the producing side, the distributed middleware services emit telemetry to a central Azure Application Insights instance. On the consuming side, the web API service maps and aggregates the Application Insights data and also performs live connectivity checks. Alerts are implemented with stateful actors that evaluate the available telemetry and, when they detect a notable change in health, issue notifications via e-mail, browser push notifications, and Slack channel messages. The health API also exposes a web endpoint that dependent systems can use to check system health and progressively enable functionality based on it.
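
The alerting and progressive-enablement ideas can be illustrated with a small sketch. It is written in Python for brevity and is not the NetMatch implementation, which runs on .NET Core and Service Fabric. All names here (`Health`, `aggregate`, `HealthAlerter`, `FEATURE_TOLERANCE`, `enabled_features`, and the feature names) are hypothetical. The sketch also assumes that overall health is simply the worst status reported by any component, and that a "notable change" is any change in that overall status.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, Dict, List


class Health(IntEnum):
    """Hypothetical health levels; a higher value means worse health."""
    HEALTHY = 0
    DEGRADED = 1
    UNHEALTHY = 2


def aggregate(component_health: Dict[str, Health]) -> Health:
    """Assumption: overall health is the worst status of any component."""
    return max(component_health.values(), default=Health.HEALTHY)


@dataclass
class HealthAlerter:
    """Remembers the last overall status and notifies only when it changes,
    loosely mirroring a stateful actor that evaluates incoming telemetry."""
    notify: Callable[[Health, Health], None]
    last: Health = Health.HEALTHY

    def evaluate(self, component_health: Dict[str, Health]) -> bool:
        current = aggregate(component_health)
        if current == self.last:
            return False  # no change in overall health, so no notification
        previous, self.last = self.last, current
        self.notify(previous, current)
        return True


# Hypothetical mapping: the worst overall health at which a feature stays on.
FEATURE_TOLERANCE: Dict[str, Health] = {
    "checkout": Health.DEGRADED,
    "recommendations": Health.HEALTHY,
}


def enabled_features(overall: Health) -> List[str]:
    """Features a dependent system could keep enabled at this health level."""
    return sorted(
        name for name, worst in FEATURE_TOLERANCE.items() if overall <= worst
    )
```

A dependent system could call something like `enabled_features` with the status returned by the health endpoint and switch off non-essential functionality while the platform is degraded. In the real system the `notify` callback would correspond to the e-mail, browser push, and Slack channels.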