The current implementation manages the storage of an IdPSession as a set of records under a context matching the session ID. One of those records is a "master" record with a fixed key of "_session" that contains a JSON serialization of the "core" session data attached to the IdPSession interface. The master record also contains a pair of arrays containing the keys to all associated AuthenticationResult or SPSession objects (the flow ID or the SP name/ID, respectively). This is done because the storage API doesn't provide a way to enumerate the keys within a context, so the arrays provide a foreign key lookup from the IdPSession to its content.

The master record is set to expire based on the session timeout value, and the expiration slides forward on every update of the activity time. There is no actual "lifetime" bound because the session itself has no security value.
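
As a rough illustration of the master record and its sliding expiration, the sketch below builds a JSON value for the "_session" key and pushes the expiration forward on activity. The `InMemoryStore` class, the JSON field names (`principal`, `created`, `authn_keys`, `sp_keys`) and the timeout value are assumptions made for the example; they are not the IdP's actual serialization or the real StorageService API.

```python
import json

SESSION_TIMEOUT = 3600  # seconds; hypothetical inactivity timeout
MASTER_KEY = "_session"


class InMemoryStore:
    """Toy stand-in for a storage service: (context, key) -> (value, expiration)."""

    def __init__(self):
        self.records = {}

    def create(self, context, key, value, expiration):
        self.records[(context, key)] = (value, expiration)

    def read(self, context, key):
        return self.records.get((context, key))

    def update_expiration(self, context, key, expiration):
        value, _ = self.records[(context, key)]
        self.records[(context, key)] = (value, expiration)


def build_master_record(session_id, principal, created, flow_ids, sp_ids):
    # Core session data plus the two key arrays that act as a foreign-key
    # lookup, since keys within a context cannot be enumerated.
    return json.dumps({
        "id": session_id,
        "principal": principal,
        "created": created,
        "authn_keys": sorted(flow_ids),
        "sp_keys": sorted(sp_ids),
    })


def touch(store, session_id, now):
    # Sliding expiration: each activity update pushes the expiry forward.
    store.update_expiration(session_id, MASTER_KEY, now + SESSION_TIMEOUT)


store = InMemoryStore()
value = build_master_record("abc123", "jdoe", 1000,
                            ["AuthenticationFlow/SPNEGO"],
                            ["https://sp.example.org/shibboleth"])
store.create("abc123", MASTER_KEY, value, 1000 + SESSION_TIMEOUT)
touch(store, "abc123", 1500)
```

Because only the expiration moves, the record never acquires a hard "lifetime" bound; it simply disappears once activity stops for longer than the timeout.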

---

With the SP, I don't try to turn the NameID, etc. into a fully unique key, just a mostly unique one. I just index by a possibly truncated version of the NameID value, ignoring the qualifiers. Then I validate the results I get back and check for an exact match using the content of the Session before I include it in the set returned.
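
The "mostly unique key, then validate" approach could be sketched roughly as follows. The 16-character limit, the tuple representation of a NameID and the function names are hypothetical, chosen only to make the idea concrete.

```python
MAX_KEY_LEN = 16  # hypothetical key-size limit


def index_key(name_id_value):
    # Ignore qualifiers entirely; only the (possibly truncated) value is used.
    return name_id_value[:MAX_KEY_LEN]


def find_sessions(index, sessions, name_id):
    """name_id is a (value, name_qualifier, sp_name_qualifier) tuple."""
    candidates = index.get(index_key(name_id[0]), [])
    # The key is only mostly unique, so confirm an exact match against the
    # NameID stored in each candidate session before returning it.
    return [sid for sid in candidates if sessions.get(sid) == name_id]


# Two NameIDs that collide once truncated to MAX_KEY_LEN characters.
a = ("user-0000000000000001", "idp", "sp")
b = ("user-0000000000000002", "idp", "sp")
sessions = {"s1": a, "s2": b}
index = {index_key(a[0]): ["s1", "s2"]}

matches = find_sessions(index, sessions, a)
```

Here both sessions share an index key, but only the session whose stored NameID matches exactly is returned. Stale IDs left in the list are also filtered out, because they no longer resolve to a session.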

The StorageService API I have used doesn't accommodate secondary indexing natively. In the SP, I built a secondary index by maintaining a list of the session IDs associated with a SessionIndex, mapped to the index key. The cleanup problem is dodged there by not doing it. This works to a point under two conditions:

there's an upper bound on lifetime of the session, which puts an upper bound on the life of the list of sessions indexed by a given key
there aren't a huge number of sessions to be indexed by one key that would make reading and writing the list inefficient

The second of these is a problem for load testing because it can generate a huge number of sessions for a single NameID. This is pretty easy to work around by having an option not to do the indexing, which is not needed when load testing anyway. I have never come across a legitimate scenario outside of load tests where the number would get large enough to be such a problem.

V2 doesn't bound the IdP session lifetime, because the individual login methods do get lifetimes, and that's also true in V3. And adding one doesn't seem to work very well, since it creates a hard stopping point that would dump all existing active results and force new authentication, even if the user literally logged in again seconds before. The mixing of the IdP session and the AuthnResult object lifetimes and policies to handle SSO creates a disincentive for having a real "lifetime" on the session.

But I think the individual "ServiceSession" records should have a lifetime, based on some kind of upper bound along with the SessionNotOnOrAfter for the relying party, plus some slop. All of that should be known in the SAML profile code, which is another reason why it's clear the ServiceSession work should be pushed later in the flow, not as part of general Session upkeep.

So I'm thinking we could lay this out like so (triples are context, key, record):

The IdPSession would be stored as (sessionId, "_session", serialization of IdPSession data) with expiration based on inactivity timeout + logout slop
Individual AuthenticationResults stored as (sessionId, flowId, serialization of AuthenticationResult) with expiration based on inactivity timeout
ServiceSessions would be stored as (sessionId, RP entityID, serialization of ServiceSession) with expiration based on expected session lifetime at RP + logout slop

That handles the basic data storage and cleanup, and lookup by session key.
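
To make the proposed triples concrete, here is a minimal sketch that produces the records for one session together with their expirations. The `Record` type, the placeholder values and the timeout and slop constants are all assumptions for illustration.

```python
from typing import NamedTuple

INACTIVITY_TIMEOUT = 1800  # seconds; hypothetical value
LOGOUT_SLOP = 600          # seconds; hypothetical value


class Record(NamedTuple):
    context: str
    key: str
    value: str
    expiration: int


def layout(session_id, last_activity, flow_ids, rp_session_ends):
    """Build the (context, key, record) triples for one IdP session.

    rp_session_ends maps RP entityID -> expected end of the session at that RP.
    """
    records = [Record(session_id, "_session", "<IdPSession data>",
                      last_activity + INACTIVITY_TIMEOUT + LOGOUT_SLOP)]
    for flow_id in flow_ids:
        records.append(Record(session_id, flow_id, "<AuthenticationResult>",
                              last_activity + INACTIVITY_TIMEOUT))
    for entity_id, rp_end in rp_session_ends.items():
        records.append(Record(session_id, entity_id, "<ServiceSession>",
                              rp_end + LOGOUT_SLOP))
    return records


records = layout("abc123", 1000, ["authn/Password"],
                 {"https://sp.example.org/shibboleth": 30000})
```

Every record lives under the session ID as its context, so a lookup by session key needs nothing beyond the keys themselves.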

The secondary indexing would be stored as (RP entityID, possibly truncated NameID, multimap of SAML SessionIndex to IdPSession keys) and the expiration would be set/updated to the upper bound of the expirations of the ServiceSessions referenced by a given secondary index.

The secondary indexing matches the lookup requirements of both SAML queries and logout directly. I think we could save some work and just serialize the data there by directly serializing a Multimap, but that would be an internal detail, just a tactical choice.
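
A sketch of how such an index record might be maintained, assuming the multimap is serialized as JSON and the record is passed around as a `(value, expiration)` pair. Both choices, and the function name, are illustrative only.

```python
import json
from collections import defaultdict


def add_to_index(record, session_index, idp_session_id, service_session_exp):
    """Return an updated (serialized multimap, expiration) pair.

    record is the existing pair, or None if no index record exists yet.
    """
    if record is None:
        multimap, expiration = defaultdict(list), 0
    else:
        multimap = defaultdict(list, json.loads(record[0]))
        expiration = record[1]
    if idp_session_id not in multimap[session_index]:
        multimap[session_index].append(idp_session_id)
    # The index must outlive every ServiceSession it references, so its
    # expiration is the upper bound of theirs.
    return (json.dumps(multimap, sort_keys=True),
            max(expiration, service_session_exp))


rec = add_to_index(None, "idx1", "sessA", 5000)
rec = add_to_index(rec, "idx2", "sessA", 4000)
rec = add_to_index(rec, "idx1", "sessB", 6000)
```

A lookup by SessionIndex then reads one record and picks the matching list, while a lookup by NameID alone takes the union of all the lists.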

I also don't really think most other protocols will ever need anything like this. It doesn't really work well in SAML as it is. So we can cross that API bridge when we come to it if something really different comes along.

Attached to the same context as the master record, each individual AuthenticationResult and SPSession added to the IdPSession is serialized and stored under the foreign key stored in the master record (the flow ID or SP name/ID).

AuthenticationResults are serialized by the corresponding AuthenticationFlowDescriptor, which implements the StorageSerializer API and handles all the details. Similarly to the master record, the expiration is timeout-driven, based on the last activity of the result, along with an offset that prevents a result from disappearing from storage too quickly.

SPSessions are serialized using a plugin registered for the underlying SPSession class, which allows subclasses to contain extended information that gets serialized by a custom plugin. To make this reversible, the class type of the SPSession is prefixed to the serialized data so it can be read back to determine the right plugin to use to reconstitute the object. The expirations are set based on the actual expiration of the SPSession (as determined by the profile flow that created it) plus some "slop" that keeps knowledge of the SPSession available for logout purposes.
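
The type-prefix technique can be sketched as below. The registry, the simplified class names (short names rather than fully qualified ones) and the JSON body format are assumptions for the example and do not reflect the IdP's actual plugin interface.

```python
import json


class SPSession:
    def __init__(self, sp_id, expiration):
        self.sp_id = sp_id
        self.expiration = expiration


class SAML2Session(SPSession):
    def __init__(self, sp_id, expiration, name_id, session_index):
        super().__init__(sp_id, expiration)
        self.name_id = name_id
        self.session_index = session_index


SERIALIZERS = {}  # class name -> (serialize, deserialize)


def register(cls, serialize, deserialize):
    SERIALIZERS[cls.__name__] = (serialize, deserialize)


def to_storage(session):
    type_name = type(session).__name__
    serialize, _ = SERIALIZERS[type_name]
    # Prefix the type so the right plugin can be found when reading back.
    return type_name + ":" + serialize(session)


def from_storage(value):
    # Split on the first colon only; the body may itself contain colons.
    type_name, _, body = value.partition(":")
    _, deserialize = SERIALIZERS[type_name]
    return deserialize(body)


register(SAML2Session,
         lambda s: json.dumps(vars(s), sort_keys=True),
         lambda body: SAML2Session(**json.loads(body)))

original = SAML2Session("https://sp.example.org/shibboleth", 30000,
                        "jdoe@example.org", "idx1")
stored = to_storage(original)
restored = from_storage(stored)
```

Splitting on the first colon matters here because the serialized body contains URLs.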

Collectively, these records all expire at somewhat varying times, but they are all bounded and are eventually cleaned up by the StorageService without any intervention.

An example layout:

Context	Key	Value	Expiration
<session ID>	_session	<serialized form of IdPSession and subrecord keys>	session last activity + timeout + offset
<session ID>	AuthenticationFlow/JAASForm	<serialized AuthenticationResult>	result last activity + timeout + offset
<session ID>	AuthenticationFlow/SPNEGO	<serialized AuthenticationResult>	result last activity + timeout + offset
<session ID>	https://sp.example.org/shibboleth	net.shibboleth.idp.saml.session.SAML2Session:<serialized SPSession>	SP session expiration + offset
<session ID>	https://sp2.example.org/shibboleth	net.shibboleth.idp.saml.session.SAML2Session:<serialized SPSession>	SP session expiration + offset

A word about versioning: a big reason for manipulating the various record expirations is to be able to update and recover the last activity time without actually touching the record. This is safe because the versioning policy for such a field would be last-update-wins anyway, and it avoids needing to modify the serialized form of a record to perform the most common update.
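
A minimal sketch of that idea: the last activity time is never written into the record value, it is derived from the expiration. The constants and the `Record` class are hypothetical.

```python
TIMEOUT = 1800  # seconds; hypothetical inactivity timeout
OFFSET = 600    # seconds; hypothetical offset


class Record:
    def __init__(self, value, expiration):
        self.value = value
        self.version = 1
        self.expiration = expiration


def record_activity(record, now):
    # Only the expiration moves; value and version are left untouched,
    # so no versioned write of the serialized form is needed.
    record.expiration = now + TIMEOUT + OFFSET


def last_activity(record):
    # The activity time is recovered by reversing the arithmetic.
    return record.expiration - TIMEOUT - OFFSET


rec = Record("<serialized AuthenticationResult>", 1000 + TIMEOUT + OFFSET)
record_activity(rec, 1750)
```

After the update, `last_activity(rec)` yields the new activity time while the version is still 1, which is what makes last-update-wins acceptable for this field.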

Additionally, AuthenticationResult and SPSession records are otherwise static; they can't change in any other respects, so once serialized they never change. The master record does change any time an address is added, or a sub-record is attached (to maintain the foreign key lists), and the record versioning feature of the StorageService prevents race conditions when updates occur at the same time.
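
The versioned update of the master record might look roughly like this read-modify-write loop. The `VersionedStore` class, its method names and the `sp_keys` field are stand-ins invented for the sketch, not the real StorageService interface.

```python
import json


class VersionMismatch(Exception):
    """Raised when a record changed since it was read."""


class VersionedStore:
    """Toy store: (context, key) -> (version, value)."""

    def __init__(self):
        self.data = {}

    def create(self, context, key, value):
        self.data[(context, key)] = (1, value)

    def read(self, context, key):
        return self.data[(context, key)]

    def update_with_version(self, version, context, key, value):
        current, _ = self.data[(context, key)]
        if current != version:
            raise VersionMismatch(f"expected {version}, found {current}")
        self.data[(context, key)] = (version + 1, value)
        return version + 1


def attach_sp_key(store, session_id, sp_id, max_attempts=5):
    for _ in range(max_attempts):
        version, value = store.read(session_id, "_session")
        master = json.loads(value)
        if sp_id in master["sp_keys"]:
            return version  # already attached, nothing to write
        master["sp_keys"].append(sp_id)
        try:
            return store.update_with_version(
                version, session_id, "_session", json.dumps(master))
        except VersionMismatch:
            continue  # another writer won; re-read and retry
    raise RuntimeError("could not update master record")


store = VersionedStore()
store.create("abc123", "_session", json.dumps({"sp_keys": []}))
attach_sp_key(store, "abc123", "https://sp.example.org/shibboleth")
attach_sp_key(store, "abc123", "https://sp2.example.org/shibboleth")
```

A writer holding a stale version gets a `VersionMismatch` instead of silently overwriting the other writer's foreign key, and simply retries against the fresh copy.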

Secondary Index

If all of that sounds horrible, the real fun is implementing a secondary index using the SPSession records to be able to track back to an IdPSession based on an SP name and a custom key. Without SQL semantics, the secondary index is implemented using a simple delimited list of session IDs in a record keyed by the SP name and whatever custom key is appropriate for the session type. The key is implementation-specific, but in SAML 2 it is probably going to be a truncation of the NameID value. It doesn't have to be unique because we're storing a list of sessions in the record, not just one.

The index is maintained by a read followed by a create or update to maintain the session ID list in the record. This is guarded using the versioning support, again, to prevent races and maintain coherent data.

Another "trick" is that the expiration of the record is based on the expiration of the SPSession record being indexed on. This ensures that the index record survives at least as long as every SPSession that may trigger a lookup.
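
Putting the last few paragraphs together, index maintenance could be sketched as follows: read, then create or update under a version check, and never let the expiration move backwards. The `IndexStore` class, the comma delimiter and the retry count are assumptions for the example.

```python
DELIMITER = ","  # hypothetical; session IDs are assumed not to contain it


class VersionMismatch(Exception):
    pass


class IndexStore:
    """Toy store: (context, key) -> [version, value, expiration]."""

    def __init__(self):
        self.data = {}

    def read(self, context, key):
        rec = self.data.get((context, key))
        return tuple(rec) if rec else None

    def create(self, context, key, value, expiration):
        if (context, key) in self.data:
            return False  # lost a race with another creator
        self.data[(context, key)] = [1, value, expiration]
        return True

    def update_with_version(self, version, context, key, value, expiration):
        rec = self.data[(context, key)]
        if rec[0] != version:
            raise VersionMismatch()
        self.data[(context, key)] = [version + 1, value, expiration]


def index_session(store, sp_name, custom_key, session_id, sp_session_exp,
                  max_attempts=5):
    for _ in range(max_attempts):
        rec = store.read(sp_name, custom_key)
        if rec is None:
            if store.create(sp_name, custom_key, session_id, sp_session_exp):
                return
            continue  # someone else created it first; re-read and update
        version, value, expiration = rec
        ids = value.split(DELIMITER)
        if session_id not in ids:
            ids.append(session_id)
        try:
            # Keep the index alive at least as long as the longest-lived
            # SPSession it refers to.
            store.update_with_version(version, sp_name, custom_key,
                                      DELIMITER.join(ids),
                                      max(expiration, sp_session_exp))
            return
        except VersionMismatch:
            continue  # concurrent update; retry against the fresh record
    raise RuntimeError("could not update index record")


store = IndexStore()
sp = "https://sp.example.org/shibboleth"
index_session(store, sp, "jdoe@example.org", "sessA", 5000)
index_session(store, sp, "jdoe@example.org", "sessB", 4000)
```

The second call adds a session with an earlier expiration, and the record keeps the later one so that the first session can still be found.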

The one warning is that this very low-tech solution breaks under load testing scenarios, because load tests tend to generate a huge number of sessions with the same SP name and custom key, whereas under normal use you never see significant numbers that are identical. With such a large number, reading and writing the list of sessions becomes inefficient. The same problem has been observed with the Shibboleth SP for the same reason. In practice it doesn't matter; one simply disables this secondary indexing when load testing.

Following along with the example above, the secondary records created might be:

Context	Key	Value	Expiration
https://sp.example.org/shibboleth	<NameID value>	<session ID>	SP session expiration + offset
https://sp2.example.org/shibboleth	<NameID value>	<session ID>	SP session expiration + offset
