The IdP includes a framework for instrumentation, diagnostics, and performance management that complements the logging support and integrates with it to allow tuning of instrumentation overhead. The framework is based on the Metrics library and includes a number of related features:
some built-in metrics providing low-level system information, software versions, etc.
a small number of built-in metrics exposed by the IdP itself, including a customizeable set of configuration properties
a facillity for installing timers and counters to measure request handling
an API for accessing metrics (all or a subset) in JSON format
the ability to schedule reports of metrics in JSON format out to an http(s) collection point
Most of these features are configured whole or in part via conf/admin/metrics.xml and are discussed below.
The core bean in the metric system is a MetricRegistry that stores all of the active metrics in the system for use by various other reporting objects. Certain kinds of metrics are subject to your control and can be selectively installed or skipped based on what gets registered, at startup. By default various sets of metrics are installed for you via a bean in metrics.xml that inherits from shibboleth.metrics.RegisterMetricSets, which is a way of using Spring to call the registration method. The pre-installed metrics largely overlap with the information available from the now-deprecated status page.
Also provided by default, but commented out, are references to a number of lower level metric sets related to the JVM.
The way to think about these particular metrics is that they're "external" to the IdP for the most part, in the sense that they're objects that access public interfaces of IdP, OpenSAML, and Java components, and use them to report information. Because they're external, they can be selectively installed (or not) just by registering them (or not).
A lot of the other interesting metrics in the system are "internal", in the sense that they're implemented either wholly, or in conjunction with, internal classes themselves, and typically are not something you specifically need to register yourself. Instead, you have the ability to enable or disable all metrics using a MetricFilter.
The registry of metrics is extended by Shibboleth with the ability to "filter" metrics by reusing the logging configuration's ability to adjust the logging level of particular categories hierarchically. By naming metrics in a similar way, it's possible to effectively enable and disable metrics at runtime, which can have very small (or in rare cases, less small) impacts on performance. You may want to leave a metric configured all the time, but toggle it on only at particular times. Using the logging layer is easy, flexible, reloadable, and ignorable if you don't care about the overhead.
Every metric has a name in dotted-separator notation (i.e., the same as a logging category) and you can control the on/off status of a particular metric or set of metrics by manipulating the level of a corresponding logging category with the prefix "metrics." That is, a metric named "net.shibboleth.idp.version" is controlled by a logging category named "metrics.net.shibboleth.idp.version".
By default, all metrics are enabled at the "INFO" level.
There are two built-in mechanisms provided for reporting out metrics, along with a variety of options available within the Metrics library itself that deployers may choose to wire in with Spring. For example, it's possible to expose metrics via JMX, through logging, via the console, to databases, etc.
The two generic approaches that come "ready to deploy" are both centered around use of JSON as a reporting format, and both pull (via a REST API) and push (via an HTTP reporter) are provided.
As a (greatly enhanced) replacement for the status page, an Administrative flow is provided for access to specific metrics, named groups, or all of the metrics in the system, in JSON or JSONP format.
High level configuration of this flow is handled by conf/admin/general-admin.xml, but in most cases the defaults suffice and the real configuration, if any, takes place in conf/admin/metrics.xml.
There are two main features you can configure:
Naming custom groupings of metrics for ease of access, and access control.
Applying specific access control policies to metrics or groups of metrics.
By default, the REST API provides access to either all metrics (via /idp/profile/admin/metrics), or to a specific metric by attaching the name to the end of the path (e.g., /idp/profile/admin/metrics/net.shibboleth.idp.version).
In addition, a number of predefined "groups" are configured for you by default, in a bean named shibboleth.metrics.MetricGroups. The groups associate an object that implements the MetricFilter interface with a shorthand name, and allow you to access all the metrics that satisfy the filter by using the group name on the end of the path in place of a specific metric name. For example, the default configuration allows a number of metrics related to the metadata layer to be accessed via /idp/profile/admin/metrics/metadata.
The other purpose of these groups is for access control. By default, access to the entire REST API is governed by a default policy identified by shibboleth.metrics.DefaultAccessPolicy. If you need more fine-grained control, you can fill in the map defined by shibboleth.metrics.AccessPolicyMap with entries connecting a metric name or metric group name with a named policy to apply.
Note that metrics may be accessed by multiple paths (e.g., directly by themselves or via a group), so typically group or specific metric policies would be used to "open up" broader access to a set of metrics beyond the default policy, which would remain more locked down.
A couple of additional obscure features: it's possible to control the "Access-Control-Allow-Origin" header and turn the feed into a JSONP callback using String beans (see the Reference below) for more exotic uses of the API.
As an alternative to a pull model, the IdP can push metric information out to an HTTP server as a collection point. This is not enabled by default, but some basic configuration support is provided inside a comment in conf/admin/metrics.xml.
You can define any number of reporters using the parent bean shibboleth.metrics.HTTPReporter. The reporter bean is configured with the URL to access, and can be injected with an HttpClient bean for control over many aspects of the HTTP connection, such as timeouts, how to evaluate a server's TLS certificate, etc. It's also possible to add a MetricFilter constructor argument to limit which metrics are actually included.
In addition to defining the bean itself, it must be "scheduled" by calling its "start" method with a timer interval, as illustrated in the default file.
The format of the data is identical to the format returned by the REST API.
In addition to the built-in metrics, support exists across a number of components (primarily the Action beans and a few of the major services like the Attribute Resolver and Attribute Filter) to add counters and timers to specific requests. This is done by providing a script (or alternative Java code, but a script is simpler) to add the counters and timers to install.
The names of the metrics are arbitrary; the metrics are attached to specific objects based on an identifier for the object that usually will consist of either the Spring bean ID or the simple name of a Java class. This usually requires some specific knowledge of the internals of the system, but advice is available on the support list about how to measure particular parts of the system.
The two kinds of metrics supported are simple counters that measure how often a component runs, and timers that measure the time elapsed between a starting and stopping point. With respect to profile actions, the system will check to see if the execution of the action should be counted and if a timer should be started (at the beginning of execution) or stopped (at the end). The latter means that it's possible to start a timer in one action and stop it in another, bracketing larger parts of a request than just a single action.
(The above is best explained with an example, so keep reading.)
Control over the installation of these metrics is managed by a bean of type Function<ProfileRequestContext,Boolean> named shibboleth.metrics.MetricStrategy. The function's job is to manipulate a MetricContext child context to configure the metrics to activate for the request. The javadoc for that child context class describes the methods available to add counters and timers.
The example provided by default illustrates a simple use case: measuring the time it takes to resolve attributes during the request. In the example, a timer named "idp.attribute.resolution" is installed that starts when the "ResolveAttributes" action starts and stops when the "FilterAttributes" action finishes. This demonstrates that timers can extend beyond a single step in a flow. You could just as easily time only the "ResolveAttributes" action alone by using the same ID for both parameters.
Another example demonstrates the use of a counter tracking the number of times the SAML 1.1 Attribute Query flow executes based on counting each time the "DecodeMessage" action runs. It shows how, because the function is run dynamically, it's possible to conditionally enable metrics based on the specific profile flow being run.