SAML 2.0 (but not 1.x) defines a kind of NameID called a "persistent" identifier, with a Format of
The term "persistent" refers to the fact that it's not a per-session identifier, but a stable value; in addition, SAML persistent identifiers have very particular properties, notably they're intended to be opaque and "pairwise". The latter means that for the most part every SP receives a different value for the same user. The term "pairwise" is the more typical way of describing this whole concept today.
You should consider carefully whether this kind of identifier meets your needs. These identifiers can be difficult to deal with, they work very poorly with a wide variety of applications, and they are often more trouble than they're worth. Be cautious, and don't commit to supporting something you don't want to have to support.
There were some internal code changes in V4, but the configuration is designed to be backward-compatible. The most significant change is that a public interface now exists to support third-party extension of the way these values are produced and managed, should that become necessary.
The IdP has a couple of built-in implementations of this interface (called "strategies" in the configuration) to produce this kind of identifier. The strategy used is controlled with the idp.persistentId.generator property in saml-nameid.properties.
It has come to light that at least some (perhaps many, or even most) applications do not handle identifiers case-sensitively. This SAML format is explicitly defined to be case-sensitive, but it is much, much wiser not to expect that. Older versions of the software generate identifiers with a Base64 encoding, which is much less safe because its alphabet mixes upper- and lowercase letters, so two distinct values can collide once an application folds case. If you're not already supporting identifiers produced that way, you would be wise to generate the values using a Base32 encoding, which is designed to support case-insensitive applications. New installs include a property explicitly set to produce Base32 values, but upgrades of older configurations will continue to use Base64 for compatibility reasons.
To enable either approach, you will need to uncomment the generator bean in saml-nameid.xml for SAML 2 once you set the appropriate properties highlighted below.
<util:list id="shibboleth.SAML2NameIDGenerators">
    <ref bean="shibboleth.SAML2TransientGenerator" />
    <ref bean="shibboleth.SAML2PersistentGenerator" />
</util:list>
There is no equivalent SAML 1 bean, as this is a SAML 2-only feature.
The default PairwiseIdStore implementation is a hash/digest-based approach called "Computed" that avoids the need for a database to store the IDs, but is incapable of reverse-mapping a given identifier (e.g., as part of a SAML attribute query), or revoking or changing the identifier associated with a subject. Tracking back to a subject for debugging purposes generally involves the use of audit logs rather than direct access to a mapping of users. It's not the best approach in the abstract, but it is much simpler to deploy.
To enable the Computed strategy, you must set additional properties:
The attribute used as the source key need not be released (in the sense of an attribute filter policy) to the SP.
One of the disadvantages of strictly computing IDs is a loss of manageability of the values, particularly the ability to change a value should it become compromised. The IdP includes a feature allowing fine-grained override of the salt value used to generate IDs for specific users and/or relying parties. The override is expressed as a Java Map bean, which can be declared in saml-nameid.xml and by default is named shibboleth.ComputedIdExceptionMap.
The Java type of the object is a mouthful: Map<String,Map<String,String>> (i.e., it's a string-keyed map whose values are themselves maps). It's easier to grasp this in practice in the example below.
The primary (outer) keys are the names of subjects/users, or an asterisk (*) to signify a wildcard rule.
The values are maps of relying party names to salt values. The keys of these inner maps are the names of relying parties, or an asterisk (*) as a wildcard, and the values are either a substitute salt string to use or null to block the generation of an ID altogether.
One use for this feature is to maintain an old salt value for a legacy service while relying on a new value for everybody else:
Overriding salt for a single SP
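The map structure described above can be sketched as follows. This is an illustration only, not a definitive configuration: the relying party name, the subject name "jdoe", and the salt string are hypothetical placeholders, and the fragment assumes it sits inside saml-nameid.xml, where the util namespace is already declared. The first entry keeps an old salt for one legacy SP for all subjects; the second shows a null value blocking ID generation for one subject.

```xml
<!-- Sketch only: entityID, subject name, and salt are hypothetical placeholders. -->
<util:map id="shibboleth.ComputedIdExceptionMap">

    <!-- Outer key: a subject name, or * to match every subject. -->
    <entry key="*">
        <map>
            <!-- Inner key: a relying party name; value: the substitute salt. -->
            <entry key="https://legacy-sp.example.org/shibboleth"
                   value="old-salt-kept-for-this-sp" />
        </map>
    </entry>

    <!-- A null value blocks generation; here for one subject, for any relying party. -->
    <entry key="jdoe">
        <map>
            <entry key="*">
                <null />
            </entry>
        </map>
    </entry>

</util:map>
```

How overlapping rules are resolved (for example, a wildcard subject and a named subject both matching the same request) is not covered by this sketch and should be verified before relying on it.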
The alternative PairwiseIdStore generates random identifiers on first use and stores them in a database for future use. This addresses some of the limitations of the computed approach, but it requires a highly available database accessible to every IdP node and is very difficult (bordering on impossible) to make reliable. Note that such a database cannot be implemented using asynchronous/unreliable replication: that leads to conflicts and race conditions, and eventually to errors and duplicate entries. This is the main reason it isn't easy to get working, as most applications simply can't tolerate these kinds of conflicts easily.
The "vanilla" DDL needed for this approach is:
Stored ID Table Definition
You will need to define the table above in your database, and you must define a primary key as shown above or the implementation will not function as intended. The absence of this constraint will normally be detected at startup time and prevent use of the mechanism.
Also ensure that the collation associated with the "localId" column is appropriate for use with the source attribute you specify. An inappropriate collation can render the attribute non-unique. In particular, it has been observed that a case-sensitive collation is needed if using the Active Directory objectSid as the source attribute, to ensure that persistent IDs are uniquely identified. "utf8_bin" has been found to work in this circumstance.
Using this strategy requires setting the properties described earlier, as well as some additional changes:
A default feature of the stored strategy is that it uses the computed strategy to produce the initial identifier for each subject, to help with migration. If you don't need that to happen, you can set the idp.persistentId.computed property to an empty value and ignore that feature entirely, but it isn't a terrible idea to leverage this because it hedges your bets. If you find that the stored model is unworkable in practice, you may be able to easily convert back to the computed approach if all your values are compatible with it.
Examples of each type of bean, using an unspecified database and the DBCP2 pooling library (included with the IdP), follow. You will need to determine what driver class to plug into the bean definition for your database and the proper URL to use. Always use current drivers when possible; bug fixes for obscure problems tend to be frequent. When in doubt, grab a newer one.
Example persistent ID store beans in saml-nameid.xml
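As a rough illustration of the DataSource half of this setup, the sketch below declares a pooled DBCP2 DataSource. It is a starting point under stated assumptions rather than a tested configuration: the bean id "MyDataSource", the driver class, the JDBC URL, the credentials, and the pool settings are all placeholders, and the fragment assumes the Spring p namespace is declared in the enclosing file. It does not show the store or generator bean; the properties those accept should be taken from the javadoc rather than from this sketch.

```xml
<!-- Sketch only: driver class, URL, credentials, and pool sizes are placeholders. -->
<bean id="MyDataSource"
      class="org.apache.commons.dbcp2.BasicDataSource"
      destroy-method="close" lazy-init="true"
      p:driverClassName="your.jdbc.Driver"
      p:url="jdbc:yourdb://db.example.org/shibboleth"
      p:username="shibboleth"
      p:password="changeit"
      p:maxTotal="10"
      p:maxIdle="5"
      p:testOnBorrow="true"
      p:validationQuery="select 1"
      p:validationQueryTimeout="5" />
```

Validating connections on borrow costs a round trip per checkout, but it helps the pool discard connections that the database or a firewall has silently dropped. The validation query shown is an assumption and must be one your database accepts.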
There are a few cases where more advanced customization of the stored approach may be required, and this is accommodated by defining your own custom bean that inherits from "shibboleth.StoredPersistentIdGenerator" and defines any additional bean properties required (see the JDBCPairwiseIdStore javadoc).
The option to define and reference your own bean, rather than just supplying a plain DataSource, exists so that you can override the default table and column names used in the data store, the SQL queries used, the timeout, and so on. Most of these settings are now accessible in V4.1 via simple Java properties and do not require a bean definition.
Properties defined in saml-nameid.properties to customize various aspects of persistent NameID generation behavior follow:
Beans defined in saml-nameid.xml and related system configuration are as follows: