This article describes a semi-automatic process for managing untrusted SAML metadata using a Shibboleth LocalDynamicMetadataProvider
and a complementary set of command-line tools.
First, configure a Shibboleth LocalDynamicMetadataProvider. In particular, configure a sourceDirectory
as a local repository of metadata. The latter is referred to as $sourceDirectory
in the code fragments below.
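For reference, a minimal LocalDynamicMetadataProvider configuration might look like the following sketch. The id and the sourceDirectory path are illustrative assumptions, not values mandated by this process:

```xml
<!-- Sketch only: the id and sourceDirectory values are placeholders -->
<MetadataProvider id="LocalDynamic"
    xsi:type="LocalDynamicMetadataProvider"
    sourceDirectory="%{idp.home}/metadata/localDynamic"/>
```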
...
Identify a metadata source location to be managed. Perform the following sequence of steps for each metadata source location:

1. Prime the cache with a copy of the metadata
2. Filter the metadata into the source directory of the LocalDynamicMetadataProvider
3. Check the metadata on the server
4. If the metadata on the server is different than the metadata in the cache, investigate the differences
5. If the differences are acceptable, update the cache with fresh metadata
6. Filter the metadata into the source directory of the LocalDynamicMetadataProvider
7. Go to step 3
The following examples illustrate the basic process.
...
We start with a relatively simple example of remote metadata:

https://shibboleth.irbmanager.com/metadata.xml

- A non-InCommon Shibboleth SP that consumes InCommon metadata
- Last-Modified: Tue, 28 Jul 2015 13:32:54 GMT
- Supports HTTP Conditional GET
- See the relevant discussion thread on the mailing list
If you trust the SP owner to do the Right Thing, and the reliance on commercial TLS is not a concern, configure a Shibboleth FileBackedHTTPMetadataProvider to refresh the metadata at least daily:
...
Example 1: Configure a FileBackedHTTPMetadataProvider
```xml
<MetadataProvider id="IRBManager" xsi:type="FileBackedHTTPMetadataProvider"
    metadataURL="https://shibboleth.irbmanager.com/metadata.xml"
    backingFile="%{idp.home}/metadata/IRBManager.xml"
    maxRefreshDelay="P1D">
    <!-- filter all but the listed entity -->
    <MetadataFilter xsi:type="Predicate" direction="include">
        <Entity>https://shibboleth.irbmanager.com/</Entity>
    </MetadataFilter>
</MetadataProvider>
```
...
Given the HTTP location of the metadata to be managed, and the source directory of a Shibboleth LocalDynamicMetadataProvider, initialize both the cache and the source directory as follows:
Initialize the cache
```
# Steps 1 and 2
$ md_location=https://shibboleth.irbmanager.com/metadata.xml
$ $BIN_DIR/md_refresh.bash $md_location \
    | $BIN_DIR/md_tee.bash $sourceDirectory \
    > /dev/null
```
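For the provider to find a file in $sourceDirectory, the file must be named after the provider's lookup key for the entity, which by default is the lowercase hex SHA-1 hash of the entityID with an .xml extension. Assuming md_tee.bash follows that convention (an assumption; consult the tool's documentation), the expected file name can be computed by hand:

```shell
# Compute the default LocalDynamicMetadataProvider lookup key for an entityID:
# the lowercase hex SHA-1 hash of the entityID, plus an .xml extension.
# (Assumption: md_tee.bash names its output files the same way.)
entityID="https://shibboleth.irbmanager.com/"
hash=$(printf '%s' "$entityID" | openssl sha1 | awk '{print $NF}')
echo "${hash}.xml"
```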
Presumably the following command is executed some time later, after the metadata resource has been modified on the server:
Check the cache
```
# Step 3
$ $BIN_DIR/http_cache_check.bash $md_location \
    && echo "cache is up-to-date" \
    || echo "cache is dirty"
cache is dirty
```
If the cache is dirty, manually inspect the differences between the metadata on the server and the metadata in the cache:
Inspect the file differences
```
# Step 4
$ $BIN_DIR/http_cache_diff.bash $md_location
```
If the differences are acceptable, update both the cache and the source directory with the new metadata:
Update the cache
```
# Steps 5 and 6
# force a metadata refresh
$ $BIN_DIR/md_refresh.bash -F $md_location \
    | $BIN_DIR/md_tee.bash $sourceDirectory \
    > /dev/null
```
To semi-automate the above process, implement a cron job that executes the command in step 3:
...
Example 1: Cron job to check the cache
```bash
#!/bin/bash

# environment variables
# (also export TMPDIR if it doesn't already exist)
export BIN_DIR=/tmp/bin
export LIB_DIR=/tmp/lib
export CACHE_DIR=/tmp/http_cache
export LOG_FILE=/tmp/bash_log.txt

# the name of this script
script_name=${0##*/}

# specify the HTTP resource
location=https://shibboleth.irbmanager.com/metadata.xml

# check the cache against the server
$BIN_DIR/http_cache_check.bash $location >&2
status_code=$?
if [ $status_code -eq 1 ]; then
    echo "WARN: $script_name: cache is NOT up-to-date for resource: $location" >&2
elif [ $status_code -gt 1 ]; then
    echo "ERROR: $script_name: http_cache_check.bash failed ($status_code) on location: $location" >&2
fi

exit $status_code
```
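The script above can then be scheduled with cron. The schedule and script path below are illustrative assumptions:

```
# crontab entry (illustrative): run the cache check hourly,
# appending output to the log file used by the script
0 * * * * /opt/scripts/check_irbmanager_cache.bash >> /tmp/bash_log.txt 2>&1
```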
Example 2: Amazon Web Services
The AWS documentation entitled How to Use Shibboleth for Single Sign-On to the AWS Management Console shows how to use a FileBackedHTTPMetadataProvider to consume AWS metadata. What the documentation doesn't say, however, is that the AWS server does not support HTTP conditional requests, so every time the metadata provider runs, it loads fresh metadata even if the metadata has not changed on the server.
Moreover, the NameIDFormat
elements in AWS metadata are bogus; they must be removed from the metadata for the integration to succeed. However, since AWS metadata includes a @validUntil
attribute, simply downloading a static copy of the metadata and editing it once is not advisable.
https://signin.aws.amazon.com/static/saml-metadata.xml

- Last-Modified date unknown
- Does not support HTTP Conditional GET (no ETag in response)
- Unauthorized URN-based entityID (urn:amazon:webservices)
- Includes @validUntil attribute (expires annually)
- No encryption certificate
- NameIDFormat is wrong (showstopper)
- Current NameIDFormat values in metadata:
  - urn:oasis:names:tc:SAML:2.0:nameid-format:transient
  - urn:oasis:names:tc:SAML:2.0:nameid-format:persistent
- Login apparently works fine when these two NameIDFormat values are removed from metadata
- This might work: urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress
- Role-based attribute release is tricky (see the AWS documentation and search the Shibboleth archives for details)
- See the relevant discussion thread on the mailing list
As in the previous example, initialize both the cache and the source directory, but this time filter the NameIDFormat
elements from the metadata before copying to the source directory:
Initialize the cache
```
# Steps 1 and 2
$ md_location=https://signin.aws.amazon.com/static/saml-metadata.xml

# log a warning if the metadata will expire within 5 days
$ $BIN_DIR/md_refresh.bash $md_location \
    | $BIN_DIR/md_require_valid_metadata.bash -E P5D \
    | /usr/bin/xsltproc $LIB_DIR/remove_NameIDFormat.xsl - \
    | $BIN_DIR/md_tee.bash $sourceDirectory \
    > /dev/null
```
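The stylesheet remove_NameIDFormat.xsl is not reproduced in this article. One plausible implementation (a sketch, not necessarily the actual file) is an XSLT identity transform that suppresses NameIDFormat elements:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch: copy everything except md:NameIDFormat elements -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata">

    <!-- identity template: copy all nodes and attributes by default -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- drop every NameIDFormat element -->
    <xsl:template match="md:NameIDFormat"/>

</xsl:stylesheet>
```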
Since the server does not support HTTP Conditional GET, the tool used in the previous example (http_cache_check.bash
) will not work. Here we use a diff-like tool that compares the file on the server to the cached file byte-by-byte:
Compare the files
```
# Step 3
$ $BIN_DIR/http_cache_diff.bash -Q $md_location \
    && echo "cache is up-to-date" \
    || echo "cache is dirty"
cache is dirty
```
Manually inspect the differences between the metadata on the server and the metadata in the cache:
Inspect the file differences
```
# Step 4
$ $BIN_DIR/http_cache_diff.bash $md_location
```
If the new metadata is acceptable, update both the cache and the source directory with the new metadata:
Update the cache
```
# Steps 5 and 6
# force a metadata refresh
$ $BIN_DIR/md_refresh.bash -F $md_location \
    | $BIN_DIR/md_require_valid_metadata.bash -E P5D \
    | /usr/bin/xsltproc $LIB_DIR/remove_NameIDFormat.xsl - \
    | $BIN_DIR/md_tee.bash $sourceDirectory \
    > /dev/null
```
To semi-automate the above process, implement a cron job that executes the command in step 3:
...
Example 2: Cron job to compare files
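A plausible version of this cron job, adapted from the Example 1 script but substituting http_cache_diff.bash -Q, is sketched below. The script body, exit-code conventions, and paths are assumptions, not the original; here the script is written to a file, ready to be installed via crontab:

```shell
# Sketch (assumption, not the original script): a cron job that compares
# the cached AWS metadata to the file on the server byte-by-byte.
cat > /tmp/compare_aws_metadata.bash <<'EOF'
#!/bin/bash

# environment variables (paths are illustrative)
export BIN_DIR=/tmp/bin
export LIB_DIR=/tmp/lib
export CACHE_DIR=/tmp/http_cache
export LOG_FILE=/tmp/bash_log.txt

script_name=${0##*/}
location=https://signin.aws.amazon.com/static/saml-metadata.xml

# quietly compare the server file to the cached file
$BIN_DIR/http_cache_diff.bash -Q $location >&2
status_code=$?
if [ $status_code -eq 1 ]; then
    echo "WARN: $script_name: cache is NOT up-to-date for resource: $location" >&2
elif [ $status_code -gt 1 ]; then
    echo "ERROR: $script_name: http_cache_diff.bash failed ($status_code) on location: $location" >&2
fi
exit $status_code
EOF
chmod +x /tmp/compare_aws_metadata.bash
```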
Implement a separate cron job that periodically checks the source directory for expired or soon-to-be-expired metadata:
...
Example 2: Cron job to sweep the source directory
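A rough sketch of such a sweep script follows. It is an assumption built from the tools used earlier in this article (md_require_valid_metadata.bash reading metadata on stdin, as in the pipelines above); the paths and the removal policy are illustrative:

```shell
# Sketch (assumption, not the original script): remove any file in the
# source directory whose metadata is expired or expires within 5 days.
cat > /tmp/sweep_source_directory.bash <<'EOF'
#!/bin/bash

export BIN_DIR=/tmp/bin
sourceDirectory=/tmp/metadata   # illustrative path

for file in "$sourceDirectory"/*.xml; do
    [ -f "$file" ] || continue
    # md_require_valid_metadata.bash is assumed to fail if the metadata
    # is invalid or expires within the interval given by -E
    if ! $BIN_DIR/md_require_valid_metadata.bash -E P5D < "$file" > /dev/null; then
        echo "WARN: removing expired metadata file: $file" >&2
        rm -f "$file"
    fi
done
EOF
chmod +x /tmp/sweep_source_directory.bash
```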
Note that the above script removes all expired metadata from the source directory, not just AWS metadata.