Observability

Observability, the ability to understand the internal state of a system by examining its outputs, is a key aspect of maintaining high-performing, resilient microservices. When applications are observable, operations teams can quickly identify the root causes of bugs, bottlenecks, and other inefficiencies.

Open Liberty provides a robust framework for developing observable applications through health checks, metrics, logging, and tracing.

Health checks: Monitoring service availability

A health check is a special REST API implementation that validates the status of a microservice and its dependencies. By monitoring the availability and performance of individual services, health checks detect issues before they affect users, which helps reduce downtime and errors. Health checks are an important tool for managing cloud native applications because they can communicate the status of an application container to the host cloud platform. They report whether an application has started successfully, is ready to accept requests, and is still running. The cloud platform monitor service can then use this information to stop or restart certain containers to keep the system running efficiently.

You can configure health checks for your Open Liberty applications with MicroProfile Health. With MicroProfile Health, microservices can self-check their health and publish their overall status to a defined endpoint.
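The idea of publishing one overall status can be sketched in plain Java. The following is an illustrative model only, not the MicroProfile Health API: the `HealthSketch` class, the `Check` record, and the check names are hypothetical. It assumes that the overall status is UP only when every individual check reports UP.

```java
import java.util.List;

final class HealthSketch {

    /** One named check and its outcome (hypothetical type, for illustration only). */
    record Check(String name, boolean up) {}

    /** Overall status is "UP" only if every check is up; otherwise it is "DOWN". */
    static String overallStatus(List<Check> checks) {
        return checks.stream().allMatch(Check::up) ? "UP" : "DOWN";
    }

    public static void main(String[] args) {
        List<Check> checks = List.of(
                new Check("database", true),
                new Check("inventory-service", false));
        // One failing dependency marks the whole service as DOWN.
        System.out.println(overallStatus(checks)); // prints DOWN
    }
}
```

In a real application, the MicroProfile Health runtime performs this aggregation and exposes the result on the health endpoint, so application code supplies only the individual checks.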

Metrics: Measuring performance

Metrics are quantitative measurements that provide insights into the state and performance of your microservice. Operations teams use metrics to track key performance indicators (KPIs) such as response times, error rates, and throughput. By collecting and analyzing metrics, they can identify trends, spikes, and anomalies that might indicate potential issues or areas for optimization. Metrics provide real-time statistics that you can analyze with specialized monitoring tools, such as Prometheus and Grafana.
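As a rough illustration of the KPIs named above, the following plain-Java sketch derives an error rate, a mean response time, and a throughput figure from a handful of request samples. The `KpiSketch` class, the `Request` record, and the sample values are hypothetical, and counting only 5xx statuses as errors is an assumption. In practice, a metrics library and a monitoring tool compute these values for you.

```java
import java.util.List;

final class KpiSketch {

    /** One observed request (hypothetical type): its duration and HTTP status. */
    record Request(long durationMillis, int status) {}

    /** Fraction of requests that ended with a 5xx status; 0 for an empty sample. */
    static double errorRate(List<Request> requests) {
        if (requests.isEmpty()) {
            return 0.0;
        }
        long errors = requests.stream().filter(r -> r.status() >= 500).count();
        return (double) errors / requests.size();
    }

    /** Mean response time in milliseconds; 0 for an empty sample. */
    static double meanResponseMillis(List<Request> requests) {
        return requests.stream().mapToLong(Request::durationMillis).average().orElse(0.0);
    }

    /** Requests per second over an observation window (windowSeconds must be positive). */
    static double throughputPerSecond(List<Request> requests, double windowSeconds) {
        return requests.size() / windowSeconds;
    }

    public static void main(String[] args) {
        // Four requests observed during a two-second window; one of them failed.
        List<Request> sample = List.of(
                new Request(100, 200), new Request(200, 200),
                new Request(300, 500), new Request(400, 200));
        System.out.println("error rate: " + errorRate(sample));
        System.out.println("mean response (ms): " + meanResponseMillis(sample));
        System.out.println("throughput (req/s): " + throughputPerSecond(sample, 2.0));
    }
}
```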

When you enable MicroProfile Telemetry 2.0 or later, Open Liberty automatically collects and exports a default set of metrics. For more information about these metrics, see the MicroProfile Telemetry metrics reference list. You can also use the OpenTelemetry API to define custom metrics in your application code. Open Liberty also supports collecting metrics with MicroProfile Metrics. However, MicroProfile Telemetry provides a single, comprehensive solution for traces, metrics, and logging.

Logging: Understanding application behavior

Logging is the process of recording events within the system, capturing detailed information about what happened, when, and why. It is essential for debugging, auditing, and understanding application behavior. Logs include errors, warnings, and debug messages, and they can provide context for issues that are discovered in metric data. Log messages are also included in the spans that make up distributed traces.

Open Liberty has a unified logging component that handles messages that are written by applications and the runtime, and provides First Failure Data Capture (FFDC) capability. Logging data that is written by applications by using the System.out, System.err, or java.util.logging.Logger streams is combined into the Open Liberty runtime logs by default.
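As a minimal sketch, an application class might write through all three of these streams as follows. The `LoggingSketch` class and its `reserve` method are hypothetical; only `System.out`, `System.err`, and `java.util.logging.Logger` come from the description above.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class LoggingSketch {

    // Logger named after the class, a common java.util.logging convention.
    private static final Logger LOGGER = Logger.getLogger(LoggingSketch.class.getName());

    /** Hypothetical business method: reserves up to the available quantity. */
    static int reserve(int requested, int available) {
        System.out.println("reserve called"); // standard output stream
        if (requested > available) {
            System.err.println("not enough stock"); // standard error stream
            // Parameterized message: {0} and {1} are filled from the array.
            LOGGER.log(Level.WARNING, "Requested {0} but only {1} available",
                    new Object[] {requested, available});
            return available;
        }
        LOGGER.info("Reserved " + requested + " items");
        return requested;
    }

    public static void main(String[] args) {
        reserve(5, 3); // logs a WARNING
        reserve(2, 3); // logs an INFO message
    }
}
```

When run stand-alone, this output goes to the console. According to the description above, the same calls made from an application on Open Liberty are combined into the runtime logs by default.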

Tracing: Following requests

While logs contain data about specific events in a system, traces represent the requests that trigger these events. A trace consists of multiple spans, each of which represents a single operation in a request, such as an HTTP request or a database call. A span includes a name, time-related data, log messages, and other metadata about what occurs during a transaction. When you implement tracing in your microservices, operations teams can better identify bottlenecks, debug complex issues, and optimize inter-service communication.
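The relationship between a trace and its spans can be modeled in a few lines of plain Java. This is a simplified, hypothetical data model (the `TraceSketch` class, the `Span` record, and the span names are assumptions for illustration), not the OpenTelemetry API. It shows how comparing the durations of child spans points to a likely bottleneck.

```java
import java.util.Comparator;
import java.util.List;

final class TraceSketch {

    /** One timed operation in a request; parent is null for the root span. */
    record Span(String name, String parent, long startMillis, long endMillis) {
        long durationMillis() {
            return endMillis - startMillis;
        }
    }

    /** Returns the longest-running non-root span, a first hint at a bottleneck. */
    static Span slowestChild(List<Span> trace) {
        return trace.stream()
                .filter(span -> span.parent() != null)
                .max(Comparator.comparingLong(Span::durationMillis))
                .orElseThrow();
    }

    public static void main(String[] args) {
        // One trace: an incoming HTTP request that triggers two child operations.
        List<Span> trace = List.of(
                new Span("GET /orders", null, 0, 120),
                new Span("SELECT orders", "GET /orders", 10, 90),
                new Span("GET /inventory", "GET /orders", 95, 115));
        Span slowest = slowestChild(trace);
        System.out.println(slowest.name() + " took " + slowest.durationMillis() + " ms");
    }
}
```

In this sample, the database call accounts for 80 of the 120 milliseconds, so it is the first place to look when optimizing the request.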

OpenTelemetry, an open source framework, provides a standardized approach to collect, process, and export trace data. When you enable OpenTelemetry for Open Liberty, Jakarta RESTful Web Services and JAX-RS applications are instrumented for trace by default. Spans are automatically generated for incoming HTTP requests, including static files, servlets, and JSPs. These spans are automatically exported according to the configured OpenTelemetry exporter settings.

Automatic instrumentation is available only for JAX-RS and Jakarta RESTful Web Services applications. To create spans for other operations, such as database calls, you can add manual instrumentation to the source code for those operations by using the OpenTelemetry API. Alternatively, you can attach the OpenTelemetry Java agent to any application that runs on Java 8 or later. For more information about these options, see Code instrumentation for MicroProfile Telemetry tracing.

MicroProfile Telemetry: A unified solution for observability

With MicroProfile Telemetry 2.0 and later, you can manage Open Liberty logs, metrics, and traces in a standardized way by using the OpenTelemetry protocol. For more information, see Collect logs, metrics, and traces with OpenTelemetry.

Implementing observability in your microservices involves a combination of health checks, metrics, logging, and tracing. With Open Liberty and OpenTelemetry, you can gain valuable insights into your application behavior so that you can identify and resolve issues quickly. Observability is not a one-time task but an ongoing process that must be integrated into your development lifecycle. By making observability a priority, you can build more reliable, performant, and maintainable microservices.

For hands-on tutorials on different observability configurations for Open Liberty, check out our Observability guides.