Sessions

The Session layer in the IdP is what tracks information associated with a Subject across multiple transactions separated significantly in time. It is not involved in managing state for a particular transaction or for a web flow; that's managed by Spring as part of the SWF layer.

IdPSession

The core object managed by the Session layer is an IdPSession, which contains the following:

an ID
creation time
last activity time
canonical principal name of Subject
zero or more AuthenticationResults indexed by the result's flow ID
zero or more SPSessions indexed primarily by the SP's name/entityID

An IdPSession can also be bound during creation and afterward to client addresses, one per address family, and offers a method to check for a timeout that also updates the last activity time. This decouples use cases that care about client address or timeout checks from those that don't.

Only a single AuthenticationResult for a given flow and a single SPSession for a given SP are tracked in a session. Of course, multiple such objects might exist in distinct sessions for the same Subject (e.g., across different devices).
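A minimal Java sketch of the shape described so far may help. The class and method names (SessionSketch, checkTimeout, and so on) are illustrative assumptions rather than the actual IdP API, results and SP sessions are reduced to strings, and client address binding is left out to keep it short.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an IdPSession; not the real interface. */
final class SessionSketch {
    private final String id;
    private final String principalName;
    private final Instant creationTime;
    private Instant lastActivity;
    // One entry per flow ID and one per SP ID: a later put() replaces the earlier entry.
    private final Map<String, String> authenticationResults = new HashMap<>();
    private final Map<String, String> spSessions = new HashMap<>();

    SessionSketch(String id, String principalName, Instant now) {
        this.id = id;
        this.principalName = principalName;
        this.creationTime = now;
        this.lastActivity = now;
    }

    String getId() { return id; }
    String getPrincipalName() { return principalName; }
    Instant getCreationTime() { return creationTime; }
    Instant getLastActivity() { return lastActivity; }

    void addAuthenticationResult(String flowId, String result) {
        authenticationResults.put(flowId, result);
    }

    void addSPSession(String spId, String spSession) {
        spSessions.put(spId, spSession);
    }

    String getAuthenticationResult(String flowId) { return authenticationResults.get(flowId); }
    String getSPSession(String spId) { return spSessions.get(spId); }

    /**
     * Returns true if the session has not timed out, and in that case
     * records 'now' as the new last activity time.
     */
    boolean checkTimeout(Instant now, Duration inactivityTimeout) {
        if (now.isAfter(lastActivity.plus(inactivityTimeout))) {
            return false;
        }
        lastActivity = now;
        return true;
    }
}
```

Keying the two maps by flow ID and SP ID is what enforces the "only a single" rule: adding a second result for the same flow simply replaces the first.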

The AuthenticationResult construct is discussed extensively on the Authentication page, and is how we "remember" an act of authentication for Single Sign-On. Only results stored in an IdPSession can ever be eligible for SSO, so disabling sessions globally disables SSO.

SPSession

An SPSession is a way to track the authentication interactions the IdP has been involved in during a session (in SAML terms, the SPs it has issued assertions to). There are two main reasons to do this at all:

some kind of distributed logout
user interface considerations

The former is probably as worthless in practice as it's always been, but we are accommodating SAML logout requirements explicitly in the session design, at a massive cost in code complexity. The latter is to support scenarios in which a UI component may need to present information about the services a subject may have accessed. Obviously this overlaps with logout, but could also be useful in other cases. It might even substitute for logout by telling the subject which services they will not be logged out of.

This aspect of the Session layer can be disabled independently of the rest, to simplify storage requirements or just save cycles. In particular, to store sessions entirely within a cookie, this tracking must be off, for size reasons.

Within an SPSession, we track:

name/ID of service
creation time
expiration time
authentication flow ID used to fulfill the service's request for authentication
an optional secondary key that may be needed to look up the SPSession by alternative means

The last one is really for SAML; we have to store the NameID and SessionIndex issued to the SP because that's how logout works. By abstracting this behind a generic interface property, the code isn't SAML-specific except where needed.

Because the actual underlying SPSession type is going to be protocol-specific, it's left to the profile web flow to eventually create and attach it, rather than being handled as part of the authentication layer.
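As a rough illustration of how the generic property keeps SAML details out of shared code, here is a Java sketch. All type and method names are hypothetical, and using the NameID value as the secondary key is an assumption made for the example.

```java
import java.time.Instant;
import java.util.Optional;

/** Hypothetical generic view of an SPSession; the names are illustrative. */
interface SPSessionSketch {
    String getId();                       // name/ID of the service
    Instant getCreationTime();
    Instant getExpirationTime();
    String getAuthenticationFlowId();     // flow used to satisfy the service's request
    Optional<String> getSecondaryKey();   // empty when the protocol needs no alternate lookup
}

/** Protocol-neutral implementation with no secondary key. */
class BasicSPSessionSketch implements SPSessionSketch {
    private final String id;
    private final Instant creationTime;
    private final Instant expirationTime;
    private final String flowId;

    BasicSPSessionSketch(String id, Instant creationTime, Instant expirationTime, String flowId) {
        this.id = id;
        this.creationTime = creationTime;
        this.expirationTime = expirationTime;
        this.flowId = flowId;
    }

    @Override public String getId() { return id; }
    @Override public Instant getCreationTime() { return creationTime; }
    @Override public Instant getExpirationTime() { return expirationTime; }
    @Override public String getAuthenticationFlowId() { return flowId; }
    @Override public Optional<String> getSecondaryKey() { return Optional.empty(); }
}

/** SAML flavor: remembers what was issued to the SP so that logout can find it later. */
final class SamlSPSessionSketch extends BasicSPSessionSketch {
    private final String nameIdValue;
    private final String sessionIndex;

    SamlSPSessionSketch(String id, Instant creationTime, Instant expirationTime, String flowId,
            String nameIdValue, String sessionIndex) {
        super(id, creationTime, expirationTime, flowId);
        this.nameIdValue = nameIdValue;
        this.sessionIndex = sessionIndex;
    }

    String getNameIdValue() { return nameIdValue; }
    String getSessionIndex() { return sessionIndex; }

    // Assumption for this sketch: the NameID value doubles as the secondary key.
    @Override public Optional<String> getSecondaryKey() { return Optional.of(nameIdValue); }
}
```

Code that only needs the secondary key can work against SPSessionSketch and never see the SAML subclass.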

SessionManager and SessionResolver

There are two interfaces used to interact with the Session layer, one for creating and destroying sessions (SessionManager) and one for looking them up (SessionResolver). In practice they are implemented together.

The SessionManager interface is very minimal because actual updates to an IdPSession are done via methods on the IdPSession, not through the SessionManager. This is more elegant for the caller, but generally means a particular SessionManager implementation is also supplying its own custom implementation of IdPSession to manage changes.

The SessionResolver is designed around the Resolver notion used in a lot of the code base, and lookups are done based on custom Criterion objects that provide for the use cases we have for session access, such as:

by session ID
by implicit session ID found in a servlet request (i.e., a cookie, though this isn't necessarily required)
by a secondary lookup of an SP ID plus custom key (this supports the SAML logout case)
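The criterion-based lookups listed above could be sketched in Java roughly as follows. The criterion and resolver names are hypothetical, the servlet-request case is omitted because it needs the servlet API, and the linear scan for the SP lookup is only a placeholder for whatever indexing a real implementation uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Marker for the different ways a caller can ask for sessions (hypothetical). */
interface SessionCriterionSketch { }

/** Look up by session ID. */
final class SessionIdCriterionSketch implements SessionCriterionSketch {
    final String sessionId;
    SessionIdCriterionSketch(String sessionId) { this.sessionId = sessionId; }
}

/** Look up by SP ID plus a custom secondary key (the SAML logout case). */
final class SPKeyCriterionSketch implements SessionCriterionSketch {
    final String spId;
    final String key;
    SPKeyCriterionSketch(String spId, String key) {
        this.spId = spId;
        this.key = key;
    }
}

/** Toy in-memory resolver that returns the IDs of matching sessions. */
final class InMemoryResolverSketch {
    // session ID -> (SP ID -> secondary key recorded for that SP)
    private final Map<String, Map<String, String>> sessions = new HashMap<>();

    void add(String sessionId, String spId, String secondaryKey) {
        sessions.computeIfAbsent(sessionId, k -> new HashMap<>()).put(spId, secondaryKey);
    }

    List<String> resolve(SessionCriterionSketch criterion) {
        List<String> matches = new ArrayList<>();
        if (criterion instanceof SessionIdCriterionSketch) {
            String id = ((SessionIdCriterionSketch) criterion).sessionId;
            if (sessions.containsKey(id)) {
                matches.add(id);
            }
        } else if (criterion instanceof SPKeyCriterionSketch) {
            SPKeyCriterionSketch c = (SPKeyCriterionSketch) criterion;
            // Placeholder: scan every session and compare the key stored for this SP.
            for (Map.Entry<String, Map<String, String>> entry : sessions.entrySet()) {
                if (c.key.equals(entry.getValue().get(c.spId))) {
                    matches.add(entry.getKey());
                }
            }
        }
        return matches;
    }
}
```

A lookup by ID yields at most one session, while a lookup by SP ID and key can yield several, which is why the resolver returns a collection.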

Storage

As a technical matter, the Session and Storage layers are distinct, but in practice a lot of what SessionManager and SessionResolver have to do depends on the way storage is handled. The concrete implementation provided for the Session interfaces is built on top of the StorageService abstraction in OpenSAML.

Any storage implementation able to satisfy that contract will work transparently with the Session implementation, including one already implemented that stores data in a secure cookie and reads/writes it on a per-request basis.

The Session implementation can be configured to store SPSession data only when the storage implementation is not per-request/client-side storage, because of size constraints.

Technical Details

There are a large number of unusual features implemented in the StorageService, and the Session layer is the reason for that. It's trying to serve two goals: providing a very customizable storage layout for advanced use cases like this one, but limiting the impact of that complexity on the actual storage plugin, which is meant to be highly unspecialized and use very opaque storage formats and layouts.

The current implementation manages the storage of an IdPSession as a set of records under a context matching the session ID. One of those records is a "master" record with a fixed key of "_session" that contains a JSON serialization of the "core" session data attached to the IdPSession interface. The master record also contains a pair of arrays containing the keys to all associated AuthenticationResult or SPSession objects (which are the flow ID and SP name/ID respectively). This is done because the storage API doesn't provide a way to enumerate the keys within a context, so this provides a foreign key lookup from IdPSession to its content.

The master record is set to expire based on the session timeout value, and the expiration slides forward on every update of the activity time. There is no actual "lifetime" bound because the session itself has no security value.
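The following Java sketch illustrates the master record idea under several stated assumptions: StoreSketch is a hypothetical in-memory stand-in (not the OpenSAML StorageService API), a '|' and ',' delimited string stands in for the JSON serialization, IDs are assumed not to contain those characters, and giving the result record the same timeout as the master record is a simplification.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical (context, key) -> record store with expiration and no key enumeration. */
final class StoreSketch {
    static final class Record {
        final String value;
        final Instant expiration;
        Record(String value, Instant expiration) {
            this.value = value;
            this.expiration = expiration;
        }
    }

    private final Map<String, Map<String, Record>> contexts = new HashMap<>();

    void put(String context, String key, String value, Instant expiration) {
        contexts.computeIfAbsent(context, c -> new HashMap<>()).put(key, new Record(value, expiration));
    }

    /** Returns the record, or null if it is absent or has expired as of 'now'. */
    Record get(String context, String key, Instant now) {
        Map<String, Record> ctx = contexts.get(context);
        Record r = ctx == null ? null : ctx.get(key);
        return (r == null || !r.expiration.isAfter(now)) ? null : r;
    }
}

final class SessionStorageSketch {
    static final String MASTER_KEY = "_session";
    private final StoreSketch store;
    private final Duration timeout;

    SessionStorageSketch(StoreSketch store, Duration timeout) {
        this.store = store;
        this.timeout = timeout;
    }

    /** Master record format in this sketch: principal|flowId,flowId|spId,spId */
    void create(String sessionId, String principal, Instant now) {
        store.put(sessionId, MASTER_KEY, principal + "||", now.plus(timeout));
    }

    void addAuthenticationResult(String sessionId, String flowId, String result, Instant now) {
        StoreSketch.Record master = store.get(sessionId, MASTER_KEY, now);
        if (master == null) {
            throw new IllegalStateException("No live session " + sessionId);
        }
        String[] fields = master.value.split("\\|", -1);
        List<String> flows = splitList(fields[1]);
        if (!flows.contains(flowId)) {
            flows.add(flowId);
        }
        // Write the result under its flow ID, then rewrite the master record with the
        // updated key list; rewriting it also slides the session expiration forward.
        store.put(sessionId, flowId, result, now.plus(timeout));
        store.put(sessionId, MASTER_KEY,
                fields[0] + "|" + String.join(",", flows) + "|" + fields[2], now.plus(timeout));
    }

    /** The only way to learn which results exist is to read the list in the master record. */
    List<String> getFlowIds(String sessionId, Instant now) {
        StoreSketch.Record master = store.get(sessionId, MASTER_KEY, now);
        return master == null ? new ArrayList<>() : splitList(master.value.split("\\|", -1)[1]);
    }

    private static List<String> splitList(String field) {
        List<String> out = new ArrayList<>();
        if (!field.isEmpty()) {
            out.addAll(Arrays.asList(field.split(",")));
        }
        return out;
    }
}
```

Because the store cannot list the keys in a context, losing the master record makes the other records unreachable, and they are left to expire on their own.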

---

With the SP, I don't try and turn the NameID, etc. into a fully unique key, just a mostly unique one. I just index by a possibly truncated version of the NameID value, ignoring the qualifiers. Then I validate the results I get back and check for an exact match using the content of the Session before I include it in the set returned.
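A small Java sketch of that "mostly unique key, then validate" approach follows. The class names and the key length limit are hypothetical, and a plain in-memory map stands in for the storage layer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical session holder: just an ID and the full NameID value issued for it. */
final class IndexedSessionSketch {
    final String sessionId;
    final String nameIdValue;
    IndexedSessionSketch(String sessionId, String nameIdValue) {
        this.sessionId = sessionId;
        this.nameIdValue = nameIdValue;
    }
}

final class TruncatedKeyLookupSketch {
    // Hypothetical limit, kept small here so that collisions are easy to demonstrate.
    static final int MAX_KEY_LENGTH = 8;

    // truncated NameID value -> every session stored under that (possibly shared) key
    private final Map<String, List<IndexedSessionSketch>> buckets = new HashMap<>();

    /** The index key is the NameID value alone, cut to the limit; qualifiers are ignored. */
    static String indexKey(String nameIdValue) {
        return nameIdValue.length() <= MAX_KEY_LENGTH
                ? nameIdValue
                : nameIdValue.substring(0, MAX_KEY_LENGTH);
    }

    void store(IndexedSessionSketch session) {
        buckets.computeIfAbsent(indexKey(session.nameIdValue), k -> new ArrayList<>()).add(session);
    }

    /** Fetch candidates by the truncated key, then keep only exact matches on the full value. */
    List<String> findSessionIds(String nameIdValue) {
        List<String> matches = new ArrayList<>();
        List<IndexedSessionSketch> candidates = buckets.get(indexKey(nameIdValue));
        if (candidates != null) {
            for (IndexedSessionSketch candidate : candidates) {
                if (candidate.nameIdValue.equals(nameIdValue)) {
                    matches.add(candidate.sessionId);
                }
            }
        }
        return matches;
    }
}
```

The validation step is what makes truncation safe: two NameIDs that share a prefix land in the same bucket, but only sessions whose stored value matches exactly are returned.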

The StorageService API I have used doesn't accommodate secondary indexing natively. In the SP, I built a secondary index by maintaining a list of the session IDs associated with a SessionIndex, mapped to the index key. The cleanup problem is dodged there by not doing it. This works to a point under two conditions:

there's an upper bound on lifetime of the session, which puts an upper bound on the life of the list of sessions indexed by a given key
there aren't a huge number of sessions to be indexed by one key that would make reading and writing the list inefficient

The second of these is a problem for load testing because it can generate a huge number of sessions for a single NameID. This is pretty easy to work around by having an option not to do the indexing, which is not needed when load testing anyway. I have never come across a legitimate scenario outside of load tests where the number would get large enough to be such a problem.

V2 doesn't bound the IdP session lifetime, because the individual login methods do get lifetimes, and that's also true in V3. And adding one doesn't seem to work very well, since it creates a hard stopping point that would dump all existing active results and force new authentication, even if the user literally logged in again seconds before. The mixing of the IdP session and the AuthnResult object lifetimes and policies to handle SSO creates a disincentive for having a real "lifetime" on the session.

But I think the individual "ServiceSession" records should have a lifetime, based on some kind of upper bound along with the SessionNotOnOrAfter for the relying party, plus some slop. All of that should be known in the SAML profile code, which is another reason why it's clear the ServiceSession work should be pushed later in the flow, not as part of general Session upkeep.

So I'm thinking we could lay this out like so (triples are context, key, record):

The IdPSession would be stored as (sessionId, "_session", serialization of IdPSession data) with expiration based on inactivity timeout + logout slop
Individual AuthenticationResults stored as (sessionId, flowId, serialization of AuthenticationResult) with expiration based on inactivity timeout
ServiceSessions would be stored as (sessionId, RP entityID, serialization of ServiceSession) with expiration based on expected session lifetime at RP + logout slop

That handles the basic data storage and cleanup, and lookup by session key.

The secondary indexing would be stored as (RP entityID, possibly truncated NameID, multimap of SAML SessionIndex to IdPSession keys) and the expiration would be set/updated to the upper bound of the expirations of the ServiceSessions referenced by a given secondary index.
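To make the proposed index record concrete, here is a Java sketch using plain collections in place of a Multimap. The class names, the in-memory map standing in for storage, and the truncation limit are all assumptions for illustration.

```java
import java.time.Instant;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class SecondaryIndexSketch {
    static final int MAX_KEY_LENGTH = 16; // hypothetical truncation limit

    /** One index record: SessionIndex -> IdP session IDs, plus the record's expiration. */
    static final class IndexRecord {
        final Map<String, Set<String>> sessionIndexToSessionIds = new HashMap<>();
        Instant expiration = Instant.MIN;
    }

    // context (RP entityID) -> key (possibly truncated NameID) -> record
    private final Map<String, Map<String, IndexRecord>> contexts = new HashMap<>();

    static String truncate(String nameId) {
        return nameId.length() <= MAX_KEY_LENGTH ? nameId : nameId.substring(0, MAX_KEY_LENGTH);
    }

    void add(String rpEntityId, String nameId, String sessionIndex, String idpSessionId,
            Instant serviceSessionExpiration) {
        IndexRecord record = contexts
                .computeIfAbsent(rpEntityId, k -> new HashMap<>())
                .computeIfAbsent(truncate(nameId), k -> new IndexRecord());
        record.sessionIndexToSessionIds
                .computeIfAbsent(sessionIndex, k -> new LinkedHashSet<>())
                .add(idpSessionId);
        // The record must outlive every ServiceSession it references, so keep the latest expiration.
        if (serviceSessionExpiration.isAfter(record.expiration)) {
            record.expiration = serviceSessionExpiration;
        }
    }

    /** Returns candidate IdP session IDs; callers still need to validate each one. */
    Set<String> lookup(String rpEntityId, String nameId, String sessionIndex, Instant now) {
        Map<String, IndexRecord> ctx = contexts.get(rpEntityId);
        IndexRecord record = ctx == null ? null : ctx.get(truncate(nameId));
        if (record == null || !record.expiration.isAfter(now)) {
            return Collections.<String>emptySet();
        }
        Set<String> ids = record.sessionIndexToSessionIds.get(sessionIndex);
        return ids == null ? Collections.<String>emptySet() : Collections.unmodifiableSet(ids);
    }
}
```

Since the record only expires at the latest of the referenced expirations, a lookup can return IDs of sessions whose own records are already gone, so the results are candidates to be checked against the stored sessions rather than a final answer.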

The secondary indexing matches the lookup requirements of both SAML queries and logout directly. I think we could save some work and just serialize the data there by directly serializing a Multimap, but that would be an internal detail, just a tactical choice.

I also don't really think most other protocols will ever need anything like this. It doesn't really work well in SAML as it is. So we can cross that API bridge when we come to it if something really different comes along.