DynamicHTTPMetadataProvider
Namespace: urn:mace:shibboleth:2.0:metadata
Schema: http://shibboleth.net/schema/idp/shibboleth-metadata.xsd
Overview
The DynamicHTTPMetadataProvider fetches entity metadata just-in-time from a remote HTTP server. The metadata request URL is constructed by applying a transform to the entityID. The transform strategy is configurable, with a simple way to configure support for the Metadata Query Protocol.
Metadata is cached in memory subject to a complex set of interacting settings and the cache indicators within the metadata itself, and also can be saved to disk and reloaded back into memory at reload or startup time to restore the state of the cache. This isn't a fully redundant safety net but can be used as part of an overall strategy to reduce the risk of relying on remote sources in real-time. Ultimately, remote sources have to be bulletproof or there will be outages. This can be mitigated but not fully eliminated as a risk.
As part of this “machinery”, the default HTTP client used with this provider is the “memory-caching” variant mentioned on the HttpClientConfiguration page, which automatically honors HTTP caching headers and caches results in memory. This mechanism operates independently of, and in addition to, all of the other caching behavior defined below, so bear this in mind when implementing your own metadata services for use with this provider. If you don’t want this behavior, simply define your own non-caching client to inject via the httpClientRef XML attribute.
Use this provider with remote metadata
The DynamicHTTPMetadataProvider is used with remote metadata. See the MetadataManagementBestPractices topic for more information.
Reference
Examples
A typical use case is to load entity metadata dynamically from a metadata query server (i.e., a server that supports the Metadata Query Protocol). Here is a complete example:
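A minimal sketch of such a configuration follows. The provider id, the mdq.example.org base URL, the certificate path, and the validity interval are illustrative assumptions rather than recommendations, and the two metadata filters are one plausible choice, not a required set. The fragment is assumed to sit inside the root element of your metadata provider configuration, which declares the namespace shown at the top of this page and the xsi prefix.

```xml
<!-- Hypothetical id, base URL, certificate path, and interval: adjust for your federation. -->
<MetadataProvider id="DynamicEntityMetadata" xsi:type="DynamicHTTPMetadataProvider">

    <!-- Verify the signature on each entity descriptor returned by the query server. -->
    <MetadataFilter xsi:type="SignatureValidation" requireSignedRoot="true"
                    certificateFile="%{idp.home}/credentials/mdq-cert.pem"/>

    <!-- Reject metadata with no validUntil attribute or with an overly long validity window. -->
    <MetadataFilter xsi:type="RequiredValidUntil" maxValidityInterval="P14D"/>

    <!-- Base URL of the Metadata Query Protocol service. -->
    <MetadataQueryProtocol>http://mdq.example.org/public/</MetadataQueryProtocol>
</MetadataProvider>
```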
Note that the <MetadataQueryProtocol> child element encodes the base URL of the Metadata Query Protocol. For example, consider the following child element:
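A sketch of such a child element, assuming a hypothetical query server whose base URL is http://mdq.example.org/public/:

```xml
<!-- The element content is the base URL only; the provider appends the rest of the request path. -->
<MetadataQueryProtocol>http://mdq.example.org/public/</MetadataQueryProtocol>
```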
The previous <MetadataQueryProtocol> child element is equivalent to the following <Template> child element:
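A sketch of what such a <Template> element might look like, assuming the hypothetical base URL http://mdq.example.org/public/. The Metadata Query Protocol places the percent-encoded entityID after an entities/ path segment. The encodingStyle value and the ${entityID} variable syntax shown here are assumptions that should be checked against the Reference section:

```xml
<!-- Assumed: encodingStyle="form" URL-encodes the entityID before it is substituted into the template. -->
<Template encodingStyle="form">http://mdq.example.org/public/entities/${entityID}</Template>
```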
The above configuration explicitly formulates an MDQ protocol URL. This example is for illustration purposes only. If the server supports the Metadata Query Protocol, use a <MetadataQueryProtocol> child element instead, which intentionally hides the details of the protocol.
Finally, here is an example of the well-known location strategy:
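A sketch under the assumption that omitting the URL-building child element selects the well-known location strategy, in which the entityID itself is used as the request URL. The provider id is hypothetical, and the default behavior should be confirmed in the Reference section:

```xml
<!-- No <MetadataQueryProtocol> or <Template> child: the entityID is assumed to be dereferenced directly. -->
<MetadataProvider id="WellKnownLocationMetadata" xsi:type="DynamicHTTPMetadataProvider"/>
```

This strategy only works for entities whose entityID is a resolvable URL that actually serves their metadata.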
Frequently Asked Questions
What does “dynamic” mean?
A DynamicHTTPMetadataProvider fetches entity metadata as needed. We say that the IdP queries for SP metadata just-in-time.
Compare this to a FileBackedHTTPMetadataProvider, which batch-loads all of the entity descriptors in a metadata file whether or not the individual entity descriptors are actually needed. In contrast, a DynamicHTTPMetadataProvider loads exactly those entities that are needed, no more and no less. In this sense, a DynamicHTTPMetadataProvider is much more efficient.
On the other hand, metadata query protocols are synchronous by definition: the IdP is blocked until it obtains the metadata it needs.
How does metadata query work?
When an IdP receives a SAML protocol request from a particular SP, the IdP must first obtain entity metadata for that SP. If the IdP has no such metadata in its possession, metadata resolution proceeds sequentially according to a configured chain of metadata providers. Upon encountering a DynamicHTTPMetadataProvider in the chain, the IdP consults an HTTP client that acts as an intermediary between the IdP and the query server.
The HTTP client implements a shared HTTP cache (RFC 7234). If the desired metadata is already cached, and the stored response is fresh, the client immediately returns the cached metadata to the IdP. Otherwise, the client issues an HTTP request to the query server. Upon receiving a response from the server, the client caches the response and then returns the metadata to the IdP.
In either case, the IdP parses the metadata and applies any metadata filters configured on the DynamicHTTPMetadataProvider. The metadata that ultimately emerges from the configured metadata pipeline is cached locally (in memory) for future use.
The next time the IdP receives a SAML protocol request from this SP, it again traverses the chain of providers until it encounters the DynamicHTTPMetadataProvider. This time, however, the IdP does not bother to consult the HTTP client since the needed metadata is in the IdP’s local cache.
How long does the metadata remain in the IdP’s local cache?
The IdP’s local cache is governed by the Dynamic Attributes. In particular, the minCacheDuration and maxCacheDuration attributes strongly influence the life cycle of metadata in the local cache. Any cacheDuration and validUntil attributes in the metadata itself also influence the behavior of the local cache.
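As a sketch, the two attributes are set directly on the provider element. The id, base URL, and duration values below are illustrative assumptions, not recommendations:

```xml
<!-- Hypothetical values: cached metadata is kept at least 10 minutes and at most 8 hours before refresh. -->
<MetadataProvider id="DynamicEntityMetadata" xsi:type="DynamicHTTPMetadataProvider"
                  minCacheDuration="PT10M"
                  maxCacheDuration="PT8H">
    <MetadataQueryProtocol>http://mdq.example.org/public/</MetadataQueryProtocol>
</MetadataProvider>
```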
Does the HTTP client cache the response in memory?
Yes, by default the HTTP client caches responses in memory. Consequently, two copies of each entity descriptor reside in memory, one managed by the HTTP client as an HTTP response, and another "first-order" metadata object managed directly by the IdP.
The HTTP client may be overridden to perform file-based caching but that cache will not survive a restart so the overall benefit of file caching is low. In most cases, a memory cache is preferred, and the metadata plugin can perform its own persistent caching to disk, which does survive a restart.
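A sketch of enabling the provider's own persistent cache. The attribute name persistentCacheManagerDirectory is assumed to be one of the Dynamic Attributes and should be verified in the Reference section. The id, base URL, and directory path are hypothetical:

```xml
<!-- Assumed attribute: processed metadata is written to this directory and reloaded at startup. -->
<MetadataProvider id="DynamicEntityMetadata" xsi:type="DynamicHTTPMetadataProvider"
                  persistentCacheManagerDirectory="%{idp.home}/metadata/mdq-cache">
    <MetadataQueryProtocol>http://mdq.example.org/public/</MetadataQueryProtocol>
</MetadataProvider>
```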
Does the HTTP client support HTTP conditional GET?
Yes, the HTTP client supports HTTP conditional GET (RFC 7232) for optimal performance but the inner workings of the HTTP client are opaque to the IdP. If the IdP does in fact consult the HTTP client, and the client returns metadata to the IdP, the IdP blindly parses the metadata and applies the metadata filters. There are no optimizations implemented on the IdP side to prevent re-parsing the XML because the fragments are small enough to limit the benefit.
What if the metadata query server is down or unavailable?
When the HTTP client sends an HTTP request to a metadata query server, the SAML protocol exchange is blocked until a response is received from the server and returned to the IdP. If the client reports a failed request, the IdP continues with the next provider in the configured chain of providers. If the offending DynamicHTTPMetadataProvider is the last provider in the chain, metadata resolution fails.
What can I do to minimize the impact of metadata query?
There are at least three things you can do to help minimize the impact of metadata query:
Configure minCacheDuration and/or maxCacheDuration
Configure the HTTP Connection Attributes
Configure a robust chain of metadata providers
As noted above, the minCacheDuration and maxCacheDuration attributes strongly influence the life cycle of metadata in the local cache. The goal is to avoid needless interaction with the HTTP server. To achieve this goal, you need to understand the life cycle of the metadata on the server. For this reason, it is best to ask your federation operator for specific recommendations.
On the other hand, the federation operator may influence the life cycle of metadata in the IdP’s local cache by including a cacheDuration attribute in the metadata itself. In that case, the deployer has fewer configuration options to consider, by design.
The HTTP Connection Attributes include the following attributes:
connectionRequestTimeout (default: PT5S): The maximum amount of time to wait for a connection to be returned from the HTTP client's connection pool manager.
connectionTimeout (default: PT5S): The maximum amount of time to wait to establish a connection with the remote server.
socketTimeout (default: PT5S): The maximum amount of time to wait between two consecutive packets while reading from the socket connected to the remote server.
As noted above, each of these attributes defaults to 5 seconds. You may want to tighten these timeout values further, depending on what you know about the route to the server or the server itself.
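A sketch of tightened timeouts follows. The specific durations, the provider id, and the base URL are illustrative assumptions only. Suitable values depend on your network path and on the query server:

```xml
<!-- Hypothetical values, each shorter than the PT5S default. -->
<MetadataProvider id="DynamicEntityMetadata" xsi:type="DynamicHTTPMetadataProvider"
                  connectionRequestTimeout="PT2S"
                  connectionTimeout="PT2S"
                  socketTimeout="PT4S">
    <MetadataQueryProtocol>http://mdq.example.org/public/</MetadataQueryProtocol>
</MetadataProvider>
```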
Even so, note that none of these settings limits the total time an entire request may take, as that isn’t a feature of the Apache HttpClient. Bear in mind that the actual time taken may be unpredictable if the source is unreliable, and choose your sources wisely.
Regardless of the IdP configuration or the service-level agreement you have with the server operator, things will go wrong. One thing you can do to hedge your bets is to deploy a local query server as backup. Alternatively, one or more high-value SPs can be pre-loaded into memory.
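One way to sketch the backup approach is a chain of two dynamic providers, with the federation's query server first and a local query server second. The ids and both URLs are hypothetical, and the use of a ChainingMetadataProvider wrapper is an assumption about how your providers are organized:

```xml
<MetadataProvider id="ChainedMetadata" xsi:type="ChainingMetadataProvider">

    <!-- Primary: the federation's query server (hypothetical URL). -->
    <MetadataProvider id="FederationMDQ" xsi:type="DynamicHTTPMetadataProvider">
        <MetadataQueryProtocol>http://mdq.example.org/public/</MetadataQueryProtocol>
    </MetadataProvider>

    <!-- Backup: a locally operated query server, consulted only if the primary fails to resolve the entity. -->
    <MetadataProvider id="LocalMDQ" xsi:type="DynamicHTTPMetadataProvider">
        <MetadataQueryProtocol>http://localhost:8080/mdq/</MetadataQueryProtocol>
    </MetadataProvider>
</MetadataProvider>
```

Because resolution proceeds through the chain in order, a slow primary still costs its full timeout before the backup is tried, so tight timeouts on the first provider matter here.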