XMLAttributeExtractor

Overview

Identified by type="XML", this AttributeExtractor implements an XML-based rule syntax for designating SAML attributes and name identifiers to decode into internal attributes. Your configuration will almost certainly be using this plugin type. Next to metadata, this is probably the most commonly "touched" aspect of the configuration on the SAML side and forms the basis of the useful behavior of the SP to get information out of the SAML exchange into server variables for your application to access.

At a basic level, you take each SAML Attribute or Name Identifier you want to support and create a corresponding <am:Attribute> element in the configuration (or uncomment or adjust one that already exists by default) that matches the incoming item and maps it to an internal attribute.
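
As a minimal sketch of that matching, assume an IdP sends the SAML 2.0 attribute below (the value is made up, and the saml2 and am prefixes follow this page's namespace conventions). A rule with the same name would expose it internally under the illustrative id "mail":

```xml
<!-- Incoming SAML 2.0 attribute as sent by the IdP (example value only) -->
<saml2:Attribute Name="urn:oid:0.9.2342.19200300.100.1.3"
                 NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:uri">
    <saml2:AttributeValue>user@example.org</saml2:AttributeValue>
</saml2:Attribute>

<!-- Extraction rule: match on the Name above, expose as internal attribute "mail" -->
<am:Attribute name="urn:oid:0.9.2342.19200300.100.1.3" id="mail"/>
```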

A full set of examples of different kinds of rules can be found in the XMLAttributeExtractorExamples topic.

The plugin supports extraction of SAML Attributes and Name Identifiers from the following SAML constructs (it does not know how to pull any other data from these elements, only SAML Attributes and Name Identifiers):

  • <saml:Assertion>

  • <saml:Attribute>

  • <saml:NameIdentifier>

  • <saml2:Assertion>

  • <saml2:Attribute>

  • <saml2:NameID>

  • <saml2:EncryptedAttribute>

The extraction process relies on a secondary layer of AttributeDecoder plugins that consume the XML content and turn it into string data.

This extractor's configuration is implemented as a reloadable XML resource, which means that the XML rules can be supplied inline, in a local file, or a remote file, and can be monitored for changes and reloaded on the fly (but see the warning below). The root of the XML in any of those cases MUST be an <am:Attributes> element, either as a child element in an existing file or the root of a different file.
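
For the local-file case, a standalone mapping file might look roughly like the sketch below. The two rules are only illustrative; the point is that <am:Attributes> is the document root. Here the attribute-map namespace is assumed to be declared as the default namespace, so the elements appear without the am prefix:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Root element must be Attributes in the attribute-map namespace -->
<Attributes xmlns="urn:mace:shibboleth:2.0:attribute-map"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    <!-- Illustrative rules; replace with the attributes your deployment needs -->
    <Attribute name="urn:oid:2.5.4.3" id="cn"/>
    <Attribute name="urn:oid:0.9.2342.19200300.100.1.3" id="mail"/>

</Attributes>
```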

Exercise caution in allowing this particular configuration to be reloadable if you rely on HTTP request headers to access attribute information. If you avoid headers, the file can safely be made reloadable. When headers are used, however, there are internal limitations on how the system tracks which headers are reserved for SP data. If that tracking falls out of sync because the set of rules changed, a client can smuggle or spoof headers into variables that the SP has not yet recognized it needs to protect.

A full restart of both the shibd daemon and the web server is generally required to make header extraction changes reliable, and as a result the shipping defaults mark this configuration as non-reloadable.

General Configuration

Each <am:Attribute> child element installs a rule for extracting a particular SAML Attribute or type of Name Identifier into an internal SP attribute. The source of the attribute is identified with the name (and possibly nameFormat) XML attributes and internally tagged by the id.

The name property in the rule corresponds to the Name XML attribute of a SAML <saml2:Attribute> or <saml:Attribute> element, or to the Format XML attribute of a SAML <saml2:NameID> or <saml:NameIdentifier> element.
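
A hedged sketch of a Name Identifier rule follows, where the name is a Format URI rather than an attribute Name. The NameIDAttributeDecoder type and its formatter setting are assumptions modeled on commonly distributed default maps and are not described on this page; the id is illustrative:

```xml
<am:Attributes xmlns:am="urn:mace:shibboleth:2.0:attribute-map"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <!-- "name" here is the Format of the NameID, not an attribute Name -->
    <am:Attribute name="urn:oasis:names:tc:SAML:2.0:nameid-format:persistent"
                  id="persistent-id">
        <!-- Assumed decoder and formatter; verify against your SP version -->
        <am:AttributeDecoder xsi:type="am:NameIDAttributeDecoder"
                             formatter="$NameQualifier!$SPNameQualifier!$Name"/>
    </am:Attribute>
</am:Attributes>
```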

The Shibboleth SP by default will install rules using a nameFormat of urn:mace:shibboleth:1.0:attributeNamespace:uri and urn:oasis:names:tc:SAML:2.0:attrname-format:uri to accommodate both SAML versions.

The nameFormat property can be omitted unless a NameFormat other than these defaults is being used. This property is also omitted/ignored when extracting information from a <saml2:NameID> or <saml:NameIdentifier> element.
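
For instance, if an IdP sends an attribute named simply "mail" using the SAML 2.0 "basic" name format instead of a URI-based name, a rule along these lines would be needed. This is a sketch; the attribute name and id are illustrative:

```xml
<!-- Explicit nameFormat because the incoming attribute does not use the default URI formats -->
<am:Attribute name="mail"
              nameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic"
              id="mail"/>
```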

The internal id is typically short/simple and is used throughout all other SP components (such as attribute filters).

Multiple <am:Attribute> rules can share the same id; the implication is that a given internal name may be mapped from multiple externally-named sources to consolidate multiple sources down into one representation.
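
As an illustration, the two rules below map a legacy URN-style name and an OID-style name to the same internal id, so the application sees a single "eppn" regardless of which name the IdP used. The attribute names are the usual eduPersonPrincipalName identifiers; the am prefix is assumed to be bound as shown:

```xml
<am:Attributes xmlns:am="urn:mace:shibboleth:2.0:attribute-map"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <!-- Legacy URN-style name -->
    <am:Attribute name="urn:mace:dir:attribute-def:eduPersonPrincipalName" id="eppn">
        <am:AttributeDecoder xsi:type="am:ScopedAttributeDecoder"/>
    </am:Attribute>
    <!-- OID-style name, consolidated into the same internal id -->
    <am:Attribute name="urn:oid:1.3.6.1.4.1.5923.1.1.1.6" id="eppn">
        <am:AttributeDecoder xsi:type="am:ScopedAttributeDecoder"/>
    </am:Attribute>
</am:Attributes>
```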

There are examples of rules later in this page to illustrate how this works; it's very simple in practice.

Reference

XML Namespaces

This page refers to several different namespaces by convention as detailed below:

| Namespace | URI | Description |
|---|---|---|
| saml | urn:oasis:names:tc:SAML:1.0:assertion | SAML1 Assertion namespace |
| saml2 | urn:oasis:names:tc:SAML:2.0:assertion | SAML2 Assertion namespace |
| am | urn:mace:shibboleth:2.0:attribute-map | The Shibboleth SP Attribute Map namespace |
| conf | urn:mace:shibboleth:3.0:native:sp:config | The Shibboleth SP core configuration namespace |
| md | urn:oasis:names:tc:SAML:2.0:metadata | The SAML2 Metadata namespace |
| mdattr | urn:oasis:names:tc:SAML:metadata:attribute | The SAML2 EntityAttribute Metadata Extension namespace |

Attributes

Aside from the type="XML" attribute itself, there is no other attribute content specific to this plugin type.

It supports all of the attributes common to all reloadable configuration resources, such as the path, validate, and reloadChanges attributes used in the examples on this page.

Child Elements

The following child element must be provided, either inline, or as the root element of a local or remote XML resource to load from, which would be specified via the attribute(s) above.

| Name | Cardinality | Description |
|---|---|---|
| <am:Attributes> | 1 | Root element of configuration |

When a non-inline configuration is used, it also supports the child elements common to all reloadable configuration resources.

<am:Attributes> Element Reference

This is the root element of the mapping configuration. Most of the advanced features here support something you are very unlikely to encounter in practice: the ability to embed signed SAML Assertions inside a metadata extension as a sort of attestation of some information about an IdP. The only content you are likely to use is the set of rules themselves, expressed via the <am:Attribute> element.

Attributes

The following optional attributes are supported:

| Name | Type | Default | Description |
|---|---|---|---|
| metadataPolicyId | string | | Optional reference to a security policy to apply to SAML assertions processed in the <mdattr:EntityAttributes> metadata extension (see below). |
| metadataAttributeCaching | boolean | true | When false, disables the caching of decoded attribute information that is normally done to improve the efficiency of extracting attribute information from the <mdattr:EntityAttributes> metadata extension (see below). The usual reason to turn this off is to support language-aware decoding of attribute values. |

Child Elements

The following child element content is supported:

| Name | Cardinality | Description |
|---|---|---|
| <am:MetadataProvider> | 0 or 1 | Supplies a dedicated MetadataProvider for use in validating SAML assertions processed in the <mdattr:EntityAttributes> metadata extension (see below). |
| <am:TrustEngine> | 0 or 1 | Supplies a dedicated TrustEngine for use in validating SAML assertions processed in the <mdattr:EntityAttributes> metadata extension (see below). |
| <am:AttributeFilter> | 0 or 1 | Supplies a dedicated AttributeFilter for use in filtering data from SAML assertions processed in the <mdattr:EntityAttributes> metadata extension (see below). |
| <am:Attribute> | 1 or more | An extraction rule (see next section). |

<am:Attribute> Element Reference

Each <am:Attribute> element describes an extraction rule, the core of this plugin's behavior.

Attributes

| Name | Type | Req? | Description |
|---|---|---|---|
| name | string | Y | SAML Attribute Name or Name Identifier Format to extract |
| nameFormat | string | | Optional setting that constrains matching by Attribute Name to also take into account the AttributeNamespace (SAML 1) or NameFormat (SAML 2). When omitted, the SP defaults to matching the URI-based formats, so rules for URI-named attributes work without it. Ignored when matching a Name Identifier. |
| id | string | Y | Internal ID by which the extracted SP attribute will be known, generally a common/short/simple name for the data element |

Child Elements

| Name | Cardinality | Description |
|---|---|---|
| <am:AttributeDecoder> | 0 or 1 | Optionally specifies an attribute decoder to use. A simple/string decoder is used if not otherwise specified, which is sufficient if the values are simple strings without any unusual structure. |

Examples

These examples just illustrate how the overall configuration looks within shibboleth2.xml.

For more detailed examples of how to create actual rules, see the XMLAttributeExtractorExamples subpage.

Inline Attribute Extractor
<AttributeExtractor type="XML">
    <!-- The map namespace is bound both as the default (so the unprefixed
         xsi:type value resolves) and to the am prefix used by the elements. -->
    <am:Attributes xmlns="urn:mace:shibboleth:2.0:attribute-map"
                   xmlns:am="urn:mace:shibboleth:2.0:attribute-map"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <am:Attribute name="urn:oid:2.5.4.3" id="cn"/>
        <am:Attribute name="urn:oid:1.3.6.1.4.1.5923.1.1.1.6" id="eppn">
            <am:AttributeDecoder xsi:type="ScopedAttributeDecoder"/>
        </am:Attribute>
    </am:Attributes>
</AttributeExtractor>



Loading an externally specified mapping file
<AttributeExtractor type="XML" validate="true" reloadChanges="false" path="attribute-map.xml"/>

Metadata Attribute Extraction

The plugin supports a metadata extension used widely in mature metadata deployments, the <mdattr:EntityAttributes> element, which can be used to attach <saml2:Attribute> and <saml2:Assertion> elements to an <md:EntityDescriptor>.  Extracting this data allows information about a user's IdP to be exposed in the same way that a user's other attributes are exposed.

To ensure they can be distinguished from more typical user data, the trigger for this feature is the metadataAttributePrefix property in the <ApplicationDefaults> element. Setting this property is both a precondition for metadata attribute extraction and a value that is prepended to the internal attribute names that result. For example, a prefix of "Meta-" will turn an extracted attribute called "mail" into "Meta-mail".
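
A minimal sketch of enabling the feature is shown below. The entityID is a placeholder and all other application settings are omitted; only the metadataAttributePrefix property comes from the description above:

```xml
<!-- Placeholder entityID; metadataAttributePrefix switches on metadata attribute extraction -->
<ApplicationDefaults entityID="https://sp.example.org/shibboleth"
                     metadataAttributePrefix="Meta-">
    <!-- remaining application settings omitted -->
</ApplicationDefaults>
```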

In the case of bare <saml2:Attribute> elements, often termed "tags", this is the entire picture. Any such elements found within an <mdattr:EntityAttributes> extension are processed identically to user attributes, with the same set of decoding and mapping rules. The extension may appear on both the <md:EntityDescriptor> and <md:EntitiesDescriptor> elements, and the plugin will walk the metadata tree from the role to the entity level up through any parent groups, and process each extension it finds.
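
To make this concrete, here is a sketch of an IdP's metadata carrying one such tag. The entityID is a placeholder, and the entity-category-support attribute is used only as a familiar example of a tag:

```xml
<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
                     xmlns:mdattr="urn:oasis:names:tc:SAML:metadata:attribute"
                     xmlns:saml2="urn:oasis:names:tc:SAML:2.0:assertion"
                     entityID="https://idp.example.org/idp/shibboleth">
    <md:Extensions>
        <mdattr:EntityAttributes>
            <!-- A bare attribute ("tag") describing the IdP itself -->
            <saml2:Attribute Name="http://macedir.org/entity-category-support"
                             NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:uri">
                <saml2:AttributeValue>http://refeds.org/category/research-and-scholarship</saml2:AttributeValue>
            </saml2:Attribute>
        </mdattr:EntityAttributes>
    </md:Extensions>
    <!-- role descriptors omitted -->
</md:EntityDescriptor>
```

An ordinary mapping rule then picks it up. The id below is hypothetical:

```xml
<am:Attribute name="http://macedir.org/entity-category-support" id="entityCategorySupport"/>
```

With a metadataAttributePrefix of "Meta-", the rule above would be expected to surface the value internally as "Meta-entityCategorySupport".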

Signed Assertions in Metadata

At only the entity level, the use of embedded, signed SAML assertions is also supported, but this is quite a bit more complex, and may require additional configuration as follows. It's also very rare in practice.

A dedicated MetadataProvider can be defined for the plugin to use when evaluating the assertions. You can think of this as "meta-metadata", definitions of issuers of assertions about issuers of assertions (now you see why it's rare). Issuers of these assertions are required to supply SAML 2.0 Metadata with the <md:AttributeAuthorityDescriptor> role as a convention.

If you don't define a dedicated MetadataProvider for the plugin to use, it will reuse the metadata supplied to the SP as a whole. This may not be the desired behavior.

Optionally, you can also define an embedded instance of a TrustEngine. If one isn't supplied, the normal one defined to the SP will be used when verifying the signed assertions. It's common that this will be sufficient.

The evaluation process is controlled using a Security Policy that is typically specific to this purpose and is referenced by a metadataPolicyId attribute. Normally, rules that would apply to generic assertion processing within the SP, such as replay and freshness checking, do not apply to assertions found in metadata, so a separate policy has to be used.

Note that unsigned assertions are NOT permitted (you'd just embed a <saml2:Attribute> directly).

Finally, you can also define an embedded instance of an AttributeFilter. This enables a special "internal" filtering step to be applied to the attributes extracted from each assertion, separate from the other filtering performed by the SP. Specifically, this filtering step is performed with knowledge of the actual "issuer" of the attributes (the issuer of the embedded assertion). This step is also performed before the attributes are internally renamed and prefixed with the metadataAttributePrefix.

When the SP performs its standard filtering, the metadata attributes will have been renamed, and the "issuer" is presumed to be the user's IdP itself, because all of the attributes have been pooled by that time.

In other words, you should perform any specialized filtering of metadata-based attributes based on their source by using the dedicated filtering step here, and refer to the attributes by their "unprefixed" names. Other, generic filtering rules based on attribute values (e.g. checking syntax) can be applied using the standard filtering step, referring to metadata-based attributes by the "prefixed" names.
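
Pulling these pieces together, a configuration for signed assertions in metadata might be shaped roughly as follows. This is a sketch under several assumptions: the policy id, the file names, and the mapped attribute are hypothetical, and the plugin types chosen for the embedded MetadataProvider, TrustEngine, and AttributeFilter are just plausible picks. The metadataPolicyId must refer to a security policy defined elsewhere in the SP configuration:

```xml
<am:Attributes xmlns:am="urn:mace:shibboleth:2.0:attribute-map"
               metadataPolicyId="entity-attributes">

    <!-- "Meta-metadata": describes the issuers of the embedded assertions (hypothetical file) -->
    <am:MetadataProvider type="XML" path="attestation-issuers-metadata.xml"/>

    <!-- Optional; if omitted, the SP's normal TrustEngine verifies the signatures -->
    <am:TrustEngine type="ExplicitKey"/>

    <!-- Source-aware filtering; its rules refer to attributes by their unprefixed ids (hypothetical file) -->
    <am:AttributeFilter type="XML" path="metadata-attribute-policy.xml"/>

    <!-- Ordinary extraction rule; the id is illustrative -->
    <am:Attribute name="http://macedir.org/entity-category-support" id="entityCategorySupport"/>

</am:Attributes>
```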
