This document dates back to January 2013, when we were doing design for V3 of the Shibboleth Identity Provider. The Shibboleth MDA (Metadata Aggregator) was also being designed around the same time, and there seemed to be an opportunity to make use of the MDA within the IdP as part of its metadata processing system. In the end, we went in a different direction but the notes are preserved here.
This document is an informal input to the IdP V3 design discussions, not part of the process itself. My suspicion is that to date most of the Shibboleth team have not spent much if any time with the MDA, and know little about it other than what could be deduced from the name. The purpose of this document is to give everyone enough understanding of the MDA code and the philosophy behind it to make reasoned judgements about whether the MDA framework might be used as a component of the metadata handling design for the V3 IdP.
MDA Backgrounder
"Metadata Aggregator"
The name would tend to give you the impression that "the metadata aggregator" is:
...
Unfortunately, the name is quite misleading and in fact neither of these impressions is really true. The MDA product is really a generic processing framework for items of arbitrary data, which happens to provide features which are very useful for processing SAML 2.0 metadata amongst other things. It comes with a set of processing stages which can be used for building metadata aggregators (again, amongst other things) but is intended to be extended by the creation of new stage implementations.
Items
Data to be manipulated by the MDA framework is encapsulated by an Item<T>. You can access the underlying object through an unwrap() method.
...
One common pattern is to extract some critical information from the wrapped object into item metadata, then use that in subsequent processing stages. As well as cache-like performance benefits, note that anything operating just on the item metadata can be agnostic about the type of the wrapped item. For example, a stage which performs whitelisting or blacklisting of entities by name can be written to operate against names in the item metadata; such a stage will work on items representing metadata of any underlying type.
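As a rough illustration of that pattern, here is a minimal sketch in plain Java. The Item, EntityName and removeBlacklisted names below are simplified stand-ins invented for this example, not the real MDA classes or their signatures; the point is only that the filtering logic reads the item metadata and never calls unwrap(), so it does not care what T is.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Simplified stand-ins for illustration; these are NOT the real MDA classes. */
final class ItemMetadataSketch {

    /** Hypothetical item: a wrapped object plus metadata grouped by class. */
    static final class Item<T> {
        private final T wrapped;
        private final Map<Class<?>, List<Object>> metadata = new HashMap<>();

        Item(T wrapped) {
            this.wrapped = wrapped;
        }

        T unwrap() {
            return wrapped;
        }

        void addMetadata(Object value) {
            metadata.computeIfAbsent(value.getClass(), k -> new ArrayList<>()).add(value);
        }

        <M> List<M> getMetadata(Class<M> type) {
            List<M> found = new ArrayList<>();
            for (Object value : metadata.getOrDefault(type, new ArrayList<>())) {
                found.add(type.cast(value));
            }
            return found;
        }
    }

    /** Hypothetical item metadata holding a name extracted from the wrapped object. */
    static final class EntityName {
        final String value;

        EntityName(String value) {
            this.value = value;
        }
    }

    /** Blacklisting "stage": looks only at item metadata, never at the wrapped object. */
    static <T> void removeBlacklisted(List<Item<T>> items, Set<String> blacklist) {
        items.removeIf(item -> item.getMetadata(EntityName.class).stream()
                .anyMatch(name -> blacklist.contains(name.value)));
    }

    public static void main(String[] args) {
        List<Item<String>> items = new ArrayList<>();

        Item<String> good = new Item<>("<EntityDescriptor entityID='https://sp.example.org'/>");
        good.addMetadata(new EntityName("https://sp.example.org"));
        items.add(good);

        Item<String> bad = new Item<>("<EntityDescriptor entityID='https://bad.example.org'/>");
        bad.addMetadata(new EntityName("https://bad.example.org"));
        items.add(bad);

        removeBlacklisted(items, Set.of("https://bad.example.org"));
        System.out.println(items.size() + " item(s) kept");
    }
}
```

The same removeBlacklisted method would work unchanged on a List<Item<Integer>> or on items wrapping DOM elements, which is the type-agnostic property described above.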
Stages and Pipelines
Manipulation of items is performed by stages, which are arranged into sequences in pipelines.
...
If you want to see an extreme example of this kind of thing, see my blog post about the UK federation metadata system.
Error Handling
Another important use of item metadata is in error handling. The general pattern used is for checking and error handling to be separated. Stages which perform checking signal problems by adding instances of subclasses of StatusMetadata to an item's metadata. For example, an error results in an ErrorStatus being added, a warning results in a WarningStatus and so forth. Handling any errors or warnings is then left either to downstream stages or to the pipeline's caller, as appropriate.
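The separation of checking from handling can be sketched as follows. This is a minimal illustration under my own assumptions: the classes borrow the StatusMetadata, ErrorStatus and WarningStatus names from the text but are simplified stand-ins, and the checkEntityIds and removeErrored methods and the particular checks they perform are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-ins for illustration; these are NOT the real MDA classes. */
final class StatusSketch {

    /** Hypothetical base class for status information attached to an item. */
    static abstract class StatusMetadata {
        final String message;

        StatusMetadata(String message) {
            this.message = message;
        }
    }

    static final class ErrorStatus extends StatusMetadata {
        ErrorStatus(String message) {
            super(message);
        }
    }

    static final class WarningStatus extends StatusMetadata {
        WarningStatus(String message) {
            super(message);
        }
    }

    /** Hypothetical item whose only metadata is a list of statuses. */
    static final class Item<T> {
        private final T wrapped;
        final List<StatusMetadata> statuses = new ArrayList<>();

        Item(T wrapped) {
            this.wrapped = wrapped;
        }

        T unwrap() {
            return wrapped;
        }
    }

    /** Checking "stage": flags problems on each item but never removes anything. */
    static void checkEntityIds(List<Item<String>> items) {
        for (Item<String> item : items) {
            String entityId = item.unwrap();
            if (entityId.isEmpty()) {
                item.statuses.add(new ErrorStatus("empty entityID"));
            } else if (!entityId.startsWith("https://")) {
                item.statuses.add(new WarningStatus("entityID is not an https URL"));
            }
        }
    }

    /** Handling "stage": one possible policy, dropping items that carry any error. */
    static <T> void removeErrored(List<Item<T>> items) {
        items.removeIf(item -> item.statuses.stream().anyMatch(s -> s instanceof ErrorStatus));
    }

    public static void main(String[] args) {
        List<Item<String>> items = new ArrayList<>();
        items.add(new Item<>("https://sp.example.org"));
        items.add(new Item<>("urn:example:legacy"));
        items.add(new Item<>(""));

        checkEntityIds(items);   // all three items survive, some now carry statuses
        removeErrored(items);    // policy decision made separately from the check

        for (Item<String> item : items) {
            System.out.println(item.unwrap() + " statuses=" + item.statuses.size());
        }
    }
}
```

Because the checking method only annotates, a different deployment could swap removeErrored for something that merely logs, or that aborts the whole run, without touching the check itself.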
...
Even logging of conditions is configurable, through StatusMetadataLoggingStage.
Application to the IdP
Here are my personal opinions on some of the design issues for IdP V3 as they relate to possible use of the aggregator framework.
Engine, not Provider
It's clear that the metadata aggregator framework in its current form can't just be plugged in to the IdP as a metadata resolver or provider in general. However, a lot of the things that need to be done by such a provider are available as stages within the existing codebase: signature verification, validUntil processing, entity whitelisting and blacklisting,
...
I think Chad and I saw this kind of pattern as useful because the MDA framework was designed to be extremely extensible. It's relatively easy to gin up something, for example, that blacklists any entity with entityID containing "http://iay.org.uk" (e.g., using an XPathFilteringStage). Similarly, inserting a fixed Irish flag logo into any entity whose MDRPI says it's from the Irish registrar but which doesn't already have an MDUI logo defined is a pretty simple application of XSLTransformationStage.
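To give a feel for how little is involved in the entityID case, here is a sketch using only the standard JAXP XPath API over DOM elements. It does not use XPathFilteringStage or reproduce its configuration; the XPathBlacklistSketch class and its parse and dropMatching methods are hypothetical names for this example, and the XPath expression is just one way to express the condition.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Illustration only: plain JAXP, not the MDA's XPathFilteringStage. */
final class XPathBlacklistSketch {

    /** Parses a small XML string and returns its document element. */
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** Returns the entities for which the boolean XPath expression is false. */
    static List<Element> dropMatching(List<Element> entities, String expression) {
        try {
            XPathExpression compiled = XPathFactory.newInstance().newXPath().compile(expression);
            List<Element> kept = new ArrayList<>();
            for (Element entity : entities) {
                // Each entity element is the context node for the expression.
                boolean matches = (Boolean) compiled.evaluate(entity, XPathConstants.BOOLEAN);
                if (!matches) {
                    kept.add(entity);
                }
            }
            return kept;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String ns = "urn:oasis:names:tc:SAML:2.0:metadata";
        List<Element> entities = new ArrayList<>();
        entities.add(parse("<EntityDescriptor xmlns='" + ns + "' entityID='http://iay.org.uk/sp'/>"));
        entities.add(parse("<EntityDescriptor xmlns='" + ns + "' entityID='https://sp.example.org'/>"));

        List<Element> kept = dropMatching(entities, "contains(@entityID, 'http://iay.org.uk')");
        for (Element entity : kept) {
            System.out.println("kept " + entity.getAttribute("entityID"));
        }
    }
}
```

The expression only looks at the entityID attribute of the context element, so no namespace context is needed here; anything that reached into child elements would need one.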
...
No, you caught me, I'm joking.
Hierarchy
The aggregator framework is based around processing collections, with the assumption that each member of such a collection will normally be the metadata for one entity. The implementation of SAML 2.0 processing in terms of Item<Element> with DOM documents for each item means that handling more structured metadata involving <EntitiesDescriptor> is possible, but I don't believe that in general it is advisable if it can be avoided.
...
I think we're really only talking about two use cases in practice: the possibly hierarchical attachment of trust root information through <KeyAuthority> extensions, and attribute release to groups of entities named by the <EntitiesDescriptor> elements they are hierarchically contained within.
Trust Anchors
The V2 IdP makes use of trust roots made available as <KeyAuthority> extensions within role descriptors, entity descriptors, and all enclosing <EntitiesDescriptor>s up to the document root. In practice, I suspect that almost everyone who runs PKIX just has one blob of trust root information at the top level, and that almost no-one uses trust roots within any nested scopes. Such levels of sophistication are pretty much guaranteed to be Shibboleth-only, for one thing.
What I'd suggest is converting each <KeyAuthority> (or perhaps each <KeyAuthority>/<KeyInfo>) found as an extension into ItemMetadata, and appending all that are in scope for a particular entity to its item metadata bucket. I don't believe that ordering of such trust roots within the <EntitiesDescriptor> tree affects the semantics, as the PKIX trust engine will try each one in turn until one works. I'd note, though, that it would be relatively straightforward to make sure that the order in which they appear as one descends or ascends the tree is preserved, as the item metadata bucket values are ordered lists.
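The flattening suggested above can be sketched as a simple tree walk. Everything here is an assumption made for illustration: the Node model standing in for the <EntitiesDescriptor> tree, the use of plain strings for <KeyAuthority> content, and the choice of root-first ordering. It is not code from the MDA or the IdP.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model for illustration; not taken from the MDA or IdP codebases. */
final class TrustAnchorSketch {

    /** Either a group (entityId == null) or a single entity. */
    static final class Node {
        final String entityId;
        final List<String> keyAuthorities;
        final List<Node> children;

        private Node(String entityId, List<String> keyAuthorities, List<Node> children) {
            this.entityId = entityId;
            this.keyAuthorities = keyAuthorities;
            this.children = children;
        }

        static Node group(List<String> keyAuthorities, Node... children) {
            return new Node(null, keyAuthorities, Arrays.asList(children));
        }

        static Node entity(String entityId, List<String> keyAuthorities) {
            return new Node(entityId, keyAuthorities, new ArrayList<>());
        }
    }

    /** Maps each entityID to the trust roots in scope for it, outermost first. */
    static Map<String, List<String>> collect(Node root) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        walk(root, new ArrayList<>(), result);
        return result;
    }

    private static void walk(Node node, List<String> inherited, Map<String, List<String>> out) {
        // Copy so that sibling subtrees do not see each other's trust roots.
        List<String> inScope = new ArrayList<>(inherited);
        inScope.addAll(node.keyAuthorities);
        if (node.entityId != null) {
            // This ordered list is what would be appended to the entity's item metadata.
            out.put(node.entityId, inScope);
        } else {
            for (Node child : node.children) {
                walk(child, inScope, out);
            }
        }
    }

    public static void main(String[] args) {
        Node tree = Node.group(List.of("root-ca"),
                Node.entity("https://a.example.org", List.of()),
                Node.group(List.of("inner-ca"),
                        Node.entity("https://b.example.org", List.of("b-ca"))));
        System.out.println(collect(tree));
    }
}
```

Once each entity carries its own flattened list, downstream code no longer needs to know that the entity ever sat inside a nested group.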
Group Membership
The V2 IdP has rather funky relying party semantics, which we were planning on changing to some extent anyway. In V2, you can designate a relying party by name and that will match against:
...