Managing Untrusted Metadata
This article describes a semi-automatic process for managing untrusted SAML metadata using a Shibboleth LocalDynamicMetadataProvider and a complementary set of command-line tools.
First, configure a Shibboleth LocalDynamicMetadataProvider. In particular, configure a sourceDirectory as a local repository of metadata. This directory is referred to as $sourceDirectory in the code fragments below.
Install the SAML Library of command-line tools. Note that BIN_DIR and LIB_DIR are environment variables created during the installation process. These environment variables are used repeatedly in the code fragments below.
Identify a metadata source location to be managed. Perform the following sequence of steps for each metadata source location:
1. Prime the cache with a copy of the metadata
2. Filter the metadata into the source directory of the LocalDynamicMetadataProvider
3. Check the metadata on the server
4. If the metadata on the server is different from the metadata in the cache, investigate the differences
5. If the differences are acceptable, update the cache with fresh metadata
6. Filter the metadata into the source directory of the LocalDynamicMetadataProvider
7. Go to step 3
The following examples illustrate the basic process.
Example 1: IRBManager
We start with a relatively simple example of remote metadata:
https://shibboleth.irbmanager.com/metadata.xml
A non-InCommon Shibboleth SP that consumes InCommon metadata
Last-Modified: Tue, 28 Jul 2015 13:32:54 GMT
Supports HTTP Conditional GET
See the relevant discussion thread on the mailing list
If you trust the SP owner to do the Right Thing, and the reliance on commercial TLS is not a concern, configure a Shibboleth FileBackedHTTPMetadataProvider to refresh the metadata at least daily:
Example 1: Configure a FileBackedHTTPMetadataProvider
<MetadataProvider id="IRBManager" xsi:type="FileBackedHTTPMetadataProvider"
                  metadataURL="https://shibboleth.irbmanager.com/metadata.xml"
                  backingFile="%{idp.home}/metadata/IRBManager.xml" maxRefreshDelay="P1D">
    <!-- filter all but the listed entity -->
    <MetadataFilter xsi:type="Predicate" direction="include">
        <Entity>https://shibboleth.irbmanager.com/</Entity>
    </MetadataFilter>
</MetadataProvider>
If, on the other hand, security or interoperability is a concern, manage the metadata as illustrated below.
Given the HTTP location of the metadata to be managed, and the source directory of a Shibboleth LocalDynamicMetadataProvider, initialize both the cache and the source directory as follows:
Initialize the cache
# Steps 1 and 2
$ md_location=https://shibboleth.irbmanager.com/metadata.xml
$ "$BIN_DIR"/md_refresh.bash "$md_location" \
    | "$BIN_DIR"/md_tee.bash "$sourceDirectory" \
    > /dev/null
Presumably the following command is executed some time later, after the metadata resource has been modified on the server:
Check the cache
# Step 3
$ "$BIN_DIR"/http_cache_check.bash "$md_location" && echo "cache is up-to-date" || echo "cache is dirty"
cache is dirty
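For readers unfamiliar with HTTP conditional requests, the following is a minimal sketch of the idea behind such a check. It is not a description of how http_cache_check.bash is implemented. It assumes curl is installed; the function names and the cached-file argument are hypothetical. A 304 (Not Modified) response means the server copy is no newer than the cached file, while a 200 means it has changed.

```shell
# Hypothetical helpers illustrating an HTTP conditional GET check.
# Assumption: curl is available; these names are not part of the SAML Library.

# Issue a conditional GET and print only the HTTP status code.
#   $1 = metadata URL, $2 = path to the locally cached copy
conditional_get_status() {
  # -z FILE sends If-Modified-Since based on FILE's modification time;
  # -o /dev/null discards the body; -w prints the status code.
  curl -s -o /dev/null -z "$2" -w '%{http_code}' "$1"
}

# Map an HTTP status code to a cache state.
cache_state_from_status() {
  case "$1" in
    304) echo "up-to-date" ;;  # Not Modified: cached copy is still current
    200) echo "dirty" ;;       # server returned a newer representation
    *)   echo "unknown" ;;     # errors, redirects, and so on need a human
  esac
}

# Usage sketch (hypothetical cached file path):
#   cache_state_from_status "$(conditional_get_status "$md_location" /path/to/cached.xml)"
```

A server that ignores conditional headers always answers 200, so a check of this kind reports "dirty" on every run against such a server.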
If the cache is dirty, manually inspect the differences between the metadata on the server and the metadata in the cache:
Inspect the file differences
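One generic way to inspect such differences is a unified diff between the cached file and a freshly downloaded copy. The sketch below uses only the standard diff utility; the function name and both file paths are hypothetical, and it makes no assumption about where the SAML Library stores its cache.

```shell
# Hypothetical helper: show a unified diff between two metadata files.
#   $1 = cached copy, $2 = freshly downloaded copy
# diff exits 0 when the files are identical, 1 when they differ,
# and a value greater than 1 on error.
md_diff() {
  diff -u "$1" "$2"
}

# Usage sketch (hypothetical paths; assumes curl is available):
#   fresh=$(mktemp)
#   curl -s -o "$fresh" "$md_location"
#   md_diff /path/to/cached.xml "$fresh"
```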
If the differences are acceptable, update both the cache and the source directory with the new metadata:
Update the cache
To semi-automate the above process, implement a cron job that executes the command in step 3:
Example 1: Cron job to check the cache
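A cron job of this kind can be sketched as a small wrapper that stays silent while the cache is current and prints a message otherwise, since cron typically mails any output to the job owner. The wrapper name, the script path, and the schedule shown in the comments are hypothetical; only http_cache_check.bash comes from the steps above.

```shell
# Hypothetical wrapper: run a check command against a metadata location
# and print a message only when the check fails.
#   $1 = metadata location, remaining arguments = the check command
report_if_dirty() {
  location=$1
  shift
  # The check command receives the location as its last argument.
  if ! "$@" "$location" >/dev/null 2>&1; then
    echo "cache is dirty: $location"
  fi
}

# Usage sketch inside a script run by cron:
#   report_if_dirty "$md_location" "$BIN_DIR/http_cache_check.bash"
#
# Hypothetical crontab entry (hourly):
#   0 * * * * /path/to/check_cache.bash
```

Note that this sketch treats any non-zero exit status as "dirty", so a network failure produces the same message as a genuine change and still calls for manual inspection.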
Example 2: Amazon Web Services
The AWS documentation entitled How to Use Shibboleth for Single Sign-On to the AWS Management Console shows how to use a FileBackedHTTPMetadataProvider to consume AWS metadata. What the documentation doesn't say, however, is that the AWS server does not support HTTP conditional requests, so every time the metadata provider runs, it loads fresh metadata even if the metadata has not changed on the server.
Moreover, the NameIDFormat elements in AWS metadata are incorrect and must be removed for the integration to succeed. Because AWS metadata includes a @validUntil attribute, however, downloading a static copy of the metadata is not advisable, since the copy will eventually expire.
https://signin.aws.amazon.com/static/saml-metadata.xml
Last-Modified date unknown
Does not support HTTP Conditional GET (no ETag in response)
Unauthorized URN-based entityID (urn:amazon:webservices)
Includes @validUntil attribute (expires annually)
No encryption certificate
NameIDFormat is wrong (showstopper)
    Current NameIDFormat values in metadata:
        urn:oasis:names:tc:SAML:2.0:nameid-format:transient
        urn:oasis:names:tc:SAML:2.0:nameid-format:persistent
    Login apparently works fine when these two NameIDFormat values are removed from metadata
    This might work:
        urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress
Role-based attribute release is tricky (see the AWS documentation and search the Shibboleth archives for details)
See relevant discussion thread on the mailing list
As in the previous example, initialize both the cache and the source directory, but this time filter the NameIDFormat elements from the metadata before copying to the source directory:
Initialize the cache
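To illustrate the filtering step only, here is a rough text-level sketch of a stdin-to-stdout filter that drops NameIDFormat elements, so it could sit in a pipeline between md_refresh.bash and md_tee.bash. The function name is hypothetical. It is not XML-aware: it assumes each NameIDFormat element sits on a single line, carries no attributes, and contains plain text only. An XSLT-based filter would be the more robust choice for real metadata.

```shell
# Hypothetical filter: remove <NameIDFormat>...</NameIDFormat> elements,
# with or without a namespace prefix such as md:, from metadata on stdin.
# Assumption: each element is on one line and has no attributes.
strip_nameid_format() {
  sed -E 's#<([A-Za-z_][A-Za-z0-9_.-]*:)?NameIDFormat>[^<]*</([A-Za-z_][A-Za-z0-9_.-]*:)?NameIDFormat>##g'
}

# Usage sketch, reusing the pipeline from Example 1:
#   "$BIN_DIR"/md_refresh.bash "$md_location" \
#     | strip_nameid_format \
#     | "$BIN_DIR"/md_tee.bash "$sourceDirectory" \
#     > /dev/null
```

Lines that held only a NameIDFormat element are left behind as whitespace, which is harmless to an XML parser.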
Since the server does not support HTTP Conditional GET, the tool used in the previous example (http_cache_check.bash) will not work. Here we use a diff-like tool that compares the file on the server to the cached file byte by byte:
Compare the files
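The byte-by-byte comparison itself can be sketched with the standard cmp utility, which needs no cooperation from the server. The function name and file paths below are hypothetical, and the sketch says nothing about how the SAML Library tool performs its comparison.

```shell
# Hypothetical helper: succeed only if two files are byte-for-byte identical.
#   $1 = cached copy, $2 = freshly downloaded copy
files_identical() {
  # -s suppresses output; the exit status alone carries the result.
  cmp -s "$1" "$2"
}

# Usage sketch (hypothetical paths; assumes curl is available):
#   fresh=$(mktemp)
#   curl -s -o "$fresh" "$md_location"
#   files_identical /path/to/cached.xml "$fresh" \
#     && echo "cache is up-to-date" || echo "cache is dirty"
```

Because the comparison is byte-level, even an insignificant change such as reformatted whitespace marks the cache as dirty, which is why the manual inspection step still matters.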
Manually inspect the differences between the metadata on the server and the metadata in the cache:
Inspect the file differences
If the new metadata is acceptable, update both the cache and the source directory with the new metadata:
Update the cache
To semi-automate the above process, implement a cron job that executes the command in step 3:
Example 2: Cron job to compare files
Implement a separate cron job that periodically checks the source directory for expired or soon-to-be-expired metadata:
Example 2: Cron job to sweep the source directory
Note that the above script removes all expired metadata from the source directory, not just AWS metadata.
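As a rough illustration of the expiry test that such a sweep depends on, the sketch below extracts the first @validUntil value from a metadata file and compares it with the current time. All function names are hypothetical. It assumes timestamps are expressed in UTC in the usual YYYY-MM-DDThh:mm:ssZ form, ignores fractional seconds, and only reports expiry rather than removing anything.

```shell
# Hypothetical helpers for detecting expired metadata files.

# Print the first validUntil attribute value found in file $1 (if any).
md_valid_until() {
  sed -n 's/.*validUntil="\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Reduce a timestamp to its first 14 digits (YYYYMMDDhhmmss).
ts_digits() {
  printf '%s' "$1" | tr -cd '0-9' | cut -c1-14
}

# Succeed if file $1 has a validUntil at or before time $2.
# $2 defaults to the current UTC time. Files without validUntil
# are treated as not expired.
md_is_expired() {
  expiry=$(md_valid_until "$1")
  now=${2:-$(date -u +%Y-%m-%dT%H:%M:%SZ)}
  [ -n "$expiry" ] || return 1
  [ "$(ts_digits "$expiry")" -le "$(ts_digits "$now")" ]
}

# Usage sketch: list expired files in the source directory.
#   for f in "$sourceDirectory"/*.xml; do
#     md_is_expired "$f" && echo "expired: $f"
#   done
```

Passing a future time as the second argument turns the same function into a check for metadata that is about to expire.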