Work in progress document on design of Sessions in the IdP.

IdPSession

What we're maybe storing:

...

The Session layer in the IdP is what tracks information associated with a Subject across multiple transactions separated significantly in time. It is not involved in managing state for a particular transaction or for a web flow; that's managed by Spring as part of the SWF layer.


Data Objects

IdPSession

The core object managed by the Session layer is net.shibboleth.idp.session.IdPSession, which contains the following:

...

  • an ID
  • creation time
  • last activity time
  • canonical principal name
  • map of flow ID to AuthenticationResult
  • map of SP entityID to ServiceSession
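The fields listed above could be modelled roughly as follows. This is an illustrative Python sketch, not the IdP's Java classes: the attribute names, the `add_authn_result` helper, and the placeholder `AuthenticationResult` and `ServiceSession` types are all assumptions made for the example. It uses plain maps, so at most one entry exists per flow ID or per SP.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class AuthenticationResult:
    """Placeholder for the real result object (hypothetical fields)."""
    flow_id: str
    authn_instant: float


@dataclass
class ServiceSession:
    """Placeholder for a per-SP record (hypothetical fields)."""
    service_id: str


@dataclass
class IdPSession:
    session_id: str
    creation_time: float
    last_activity_time: float
    principal_name: str
    # Plain maps: one entry per flow ID and one per SP entityID.
    authn_results: Dict[str, AuthenticationResult] = field(default_factory=dict)
    service_sessions: Dict[str, ServiceSession] = field(default_factory=dict)

    def add_authn_result(self, result: AuthenticationResult) -> Optional[AuthenticationResult]:
        """Store a result, returning whatever it replaced for the same flow."""
        previous = self.authn_results.get(result.flow_id)
        self.authn_results[result.flow_id] = result
        return previous
```

Reusing a flow simply overwrites the older entry, so nothing has to garbage-collect stale results for that flow.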

I think the secret is irrelevant. V2 uses session cookies that are MAC'd with the secret and contain V4 and V6 client addresses. This is more complex than we need; we can just replace the secret with the two addresses and use a highly random ID in the session cookie like the SP does.

A client side session probably doesn't need an ID strictly speaking but we probably want it for logging / auditing.

If we don't use a multi-map then each type of authentication configured would be tracked only once per session, so reuse of a method might replace the older copy. This seems ok, probably, and avoids the problem of garbage collection of older results.

ServiceSession

What we're maybe storing:

  • ID of service
  • maybe a creation time, do we care?
  • thinking we might track an expiration, otherwise how do we bound the number of these?
  • flow ID used for service (could be fresh, or a reuse of active result derived from flow)
  • for SAML at least, we would need the NameID issued in the transaction to be able to propagate logout

The last one is a pain, because we don't have it until very late in the flow. I think this means we don't want to actually update the session until very late (or at least not write it back to storage, which is what update really means in this context).

We're storing custom Principals to handle SAML-specific stuff elsewhere, so maybe we create custom ServiceSession types with additional data tracked. This might help allow for heterogeneous sessions with different protocols used with particular SPs. I know we didn't want hierarchies, but this feels more like data modeling to me, and it's just one deep.

Storage

Pretty well decided that we have two modes, client-side and server-side. Client-side means we don't store the ServiceSessions, but try to fit the rest. Obviously this makes logout completely impossible. Even if we work around the problem of not having access to the session when the request comes in, we will never have the data needed to propagate the logout. We discussed splitting the storage model up so that anything needed for logout would be managed by a separate StorageService instance that would have to be non-client-side.

Lookup Requirements

The main one, obviously, is lookup via the client delivering either the session itself or a cookie with a key to the session, the latter requiring a basic index by session ID. Note: validation of a session's use is a separate issue. We are not going to overload lookup with validation.

V2 supports lookups in conjunction with queries. I'm very reluctant to support this because it's ambiguous; a non-transient NameID could map to multiple Sessions and I'm not sure that's helpful, but if we do it, I think we have to expose all of them. Also, the query lookups for a session are done based on the principal name after reversing the NameID. I think we should dump this; it's superfluous if we index by NameID. We didn't use to index by NameID in V2, so that's probably why it was done that way.

Logout requires a lookup by SP, NameID, and optional SessionIndex. With no SessionIndex, we definitely have to handle multiple Sessions coming back by design. The use case for non-indexed logout was about terminating sessions with multiple devices. With a SessionIndex, we should only get a single one back.

So we have:

Session ID -> Session (1:1)

SP + NameID -> Session (1:N)

SP + NameID + SAML SessionIndex -> Session (1:1)
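To make the three cardinalities concrete, here is a small in-memory Python sketch. The class and method names (`SessionIndexes`, `add`, `find_for_logout`) are hypothetical and exist only for illustration; the real lookups would go through storage rather than Python dictionaries.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class SessionIndexes:
    """Toy model of the three lookups (all names are assumptions)."""

    def __init__(self) -> None:
        # Session ID -> Session (1:1); the value is a stand-in for a session.
        self.by_id: Dict[str, dict] = {}
        # (SP, NameID) -> list of (SessionIndex, session ID) pairs (1:N).
        self.by_sp_name_id: Dict[Tuple[str, str], List[Tuple[str, str]]] = defaultdict(list)

    def add(self, session_id: str, sp: str, name_id: str, session_index: str) -> None:
        self.by_id.setdefault(session_id, {"id": session_id})
        self.by_sp_name_id[(sp, name_id)].append((session_index, session_id))

    def find_for_logout(self, sp: str, name_id: str,
                        session_index: Optional[str] = None) -> List[str]:
        """Without a SessionIndex, return every match; with one, narrow to it."""
        entries = self.by_sp_name_id.get((sp, name_id), [])
        return [sid for idx, sid in entries
                if session_index is None or idx == session_index]
```

A logout without a SessionIndex returns every session for that SP and NameID (the multiple-device case), while supplying one narrows the result to a single session.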

Indexing

With the SP, I don't try and turn the NameID, etc. into a fully unique key, just a mostly unique one. I just index by a possibly truncated version of the NameID value, ignoring the qualifiers. Then I validate the results I get back and check for an exact match using the content of the Session before I include it in the set returned.
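The "mostly unique key, then validate" approach described above might look like this in Python. The 32-character truncation length, the `NameID` and `Session` shapes, and the function names are assumptions for the sake of the sketch, not details taken from the SP code.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_KEY_LENGTH = 32  # hypothetical truncation length


@dataclass(frozen=True)
class NameID:
    value: str
    name_qualifier: Optional[str] = None
    sp_name_qualifier: Optional[str] = None


@dataclass
class Session:
    session_id: str
    name_id: NameID


def index_key(name_id: NameID) -> str:
    """Index only by a possibly truncated value, ignoring the qualifiers."""
    return name_id.value[:MAX_KEY_LENGTH]


def exact_matches(candidates: List[Session], wanted: NameID) -> List[Session]:
    """Validate index hits against the full NameID stored in each session."""
    return [s for s in candidates if s.name_id == wanted]
```

Two NameIDs that differ only in their qualifiers share an index key, so the index may return both sessions; the exact-match pass is what discards the wrong one.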

The StorageService API I have used doesn't accommodate secondary indexing natively. In the SP, I built a secondary index by maintaining a list of the session IDs mapped to the index key. The cleanup problem is dodged there by not doing it. This works to a point under two conditions:

  • there's an upper bound on lifetime of the session, which puts an upper bound on the life of the list of sessions indexed by a given key
  • there aren't a huge number of sessions to be indexed by one key that would make reading and writing the list inefficient

The second of these is a problem for load testing because it can generate a huge number of sessions for a single NameID. This is pretty easy to work around by having an option not to do the indexing, which is not needed when load testing anyway. I have never come across a legitimate scenario outside of load tests where the number would get large enough to be such a problem.

Right now, I don't think we know about the upper bound question. V2 doesn't bound the IdP session lifetime, because the individual login methods do get lifetimes, and that's also true in V3. And adding one doesn't seem to work very well, since it creates a hard stopping point that would dump all existing active results and force new authentication, even if the user literally logged in again seconds before. The mixing of the IdP session and the AuthnResult object lifetimes and policies to handle SSO create a disincentive for having a real "lifetime" on the session.

So the lack of an upper bound seems like the bigger problem. We can't maintain a secondary index in a naive way that lacks cross-index coherency if there's no way to know when to blow away the index as a worst case. About all we could do is implement background cleanup to walk the indexed lists of sessions, and that would take way too many individual storage lookups. So, hmm.

An IdPSession can also be bound during creation and afterward to client addresses, one per address family (e.g. IPv4, IPv6), and offers a method to check for a timeout that also updates the last activity time. This decouples use cases that care about client address or timeout checks from those that don't.
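A rough Python sketch of the address binding and timeout check follows. It is illustrative only: the class and method names are hypothetical, and binding an address the first time a family is seen is an assumption about how "bound during creation and afterward" might work.

```python
import ipaddress
from typing import Dict, Union

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


class SessionSketch:
    """Toy session with per-family address binding (names are assumptions)."""

    def __init__(self, now: float) -> None:
        self.last_activity = now
        self.addresses: Dict[int, IPAddress] = {}  # IP version (4 or 6) -> address

    def check_address(self, addr: str) -> bool:
        """Bind the first address seen per family, then require an exact match."""
        ip = ipaddress.ip_address(addr)
        bound = self.addresses.get(ip.version)
        if bound is None:
            self.addresses[ip.version] = ip
            return True
        return bound == ip

    def check_timeout(self, now: float, timeout: float) -> bool:
        """Fail if idle too long; otherwise record the activity and succeed."""
        if now - self.last_activity > timeout:
            return False
        self.last_activity = now
        return True
```

Because both checks are separate methods, a caller that doesn't care about client addresses or timeouts can simply skip them.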

Only a single AuthenticationResult for a given flow, and only a single SPSession for a given SP are tracked in a session. Of course, multiple such objects might exist in distinct sessions with a Subject (e.g., across different devices).

The AuthenticationResult object is discussed extensively on the Authentication page, and is how the IdP "remembers" an act of authentication for Single Sign-On. Only results stored in an IdPSession are ever reused for SSO, so disabling sessions globally disables SSO.

SPSession

An SPSession is a way to track the authentication interactions the IdP has been involved in during a session (in SAML terms, the SPs it has issued assertions to). There are two main reasons to do this:

  • some kind of distributed logout
  • user interface considerations

The former is as worthless in practice as it's always been, but the design accommodates SAML logout requirements explicitly, at a massive cost in code complexity. The latter is to support scenarios in which a UI component may need to present information about the services a subject may have accessed. Obviously this overlaps with logout, but could also be useful in other cases. It might even substitute for logout by giving the subject information about what's not going to be logged out, but in turn that just gives an attacker sitting at a user's terminal more information about what he/she can access.

This aspect of the Session layer can be disabled independently of the rest, to simplify storage requirements or just save cycles. In particular, to store sessions entirely within a cookie, this tracking must be off, for size reasons.

Within an SPSession, we track:

  • unique name/ID of service
  • creation time
  • expiration time
  • authentication flow ID used to fulfill the service's request for authentication
  • an optional secondary key that may be needed to look up the SPSession by alternative means

The last one is really for SAML; we have to store the NameID and SessionIndex issued to the SP because that's how logout works. By abstracting this behind a generic interface property, the code isn't SAML-specific except where needed.

Because the actual underlying SPSession type is going to be protocol-specific, it's left to the profile web flow to eventually create and attach it, rather than part of the authentication layer.

Functional Interfaces

There are two interfaces used to interact with the Session layer, one for creating/destroying them (net.shibboleth.idp.session.SessionManager) and one for looking them up (net.shibboleth.idp.session.SessionResolver). In practice they are implemented together.

The SessionManager interface is very minimal because actual updates to an IdPSession are done via methods on the IdPSession, not through the SessionManager. This is more elegant for the caller, but generally means a particular SessionManager implementation is also supplying its own custom implementation of IdPSession to manage changes.

The SessionResolver is designed around the Resolver notion used in a lot of the code base, and lookups are done based on custom Criterion objects that provide for the use cases we have for session access, such as:

  • by session ID
  • by implicit session ID found in a servlet request (i.e., a cookie, though this isn't necessarily required)
  • by a secondary lookup of an SP ID plus custom key (this supports the SAML logout case)
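As a rough illustration of criterion-driven lookup, the three cases above could be dispatched like this. This is Python rather than the IdP's Java, and every name here (the criterion classes, `InMemoryResolver`, the `COOKIE_NAME` value) is hypothetical; it only shows the shape of the idea.

```python
from dataclasses import dataclass
from typing import Dict, List, Mapping, Tuple

COOKIE_NAME = "session_cookie"  # hypothetical cookie name


@dataclass(frozen=True)
class SessionIdCriterion:
    session_id: str


@dataclass(frozen=True)
class RequestCriterion:
    cookies: Mapping[str, str]  # stand-in for a servlet request


@dataclass(frozen=True)
class SPSessionCriterion:
    sp_id: str
    key: str


class InMemoryResolver:
    def __init__(self, sessions: Dict[str, object],
                 secondary: Dict[Tuple[str, str], List[str]]) -> None:
        self.sessions = sessions    # session ID -> session
        self.secondary = secondary  # (SP ID, custom key) -> session IDs

    def resolve(self, criterion: object) -> List[object]:
        """Map each criterion type to candidate IDs, then fetch the sessions."""
        if isinstance(criterion, SessionIdCriterion):
            ids = [criterion.session_id]
        elif isinstance(criterion, RequestCriterion):
            sid = criterion.cookies.get(COOKIE_NAME)
            ids = [sid] if sid else []
        elif isinstance(criterion, SPSessionCriterion):
            ids = self.secondary.get((criterion.sp_id, criterion.key), [])
        else:
            raise TypeError(f"unsupported criterion: {criterion!r}")
        return [self.sessions[i] for i in ids if i in self.sessions]
```

Only the secondary lookup can naturally return more than one session; the other two yield at most one.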

Storage

As a technical matter, the session and storage layers are distinct, but in practice a lot of what SessionManager and SessionResolver have to do depends on the way storage is handled. The concrete implementation provided for the session interfaces is built on top of the StorageService abstraction in OpenSAML.

Any storage implementation able to satisfy that contract will work transparently with the session implementation, including one already implemented that stores data in HTML Local Storage or a cookie and reads/writes it on a per-request basis.

The session implementation can be configured to store SPSession data only in the case that the storage implementation can accommodate particular data sizes, to allow client-side use of cookies.

Technical Details

There are a large number of unusual features implemented in the StorageService and the session layer is the reason for that. It's trying to serve two goals: providing a very customizable storage layout for advanced use cases like this one, but limiting the impact of that complexity on the actual storage plugin, which is meant to be highly unspecialized and use very opaque storage formats and layouts.

The current implementation manages the storage of an IdPSession as a set of records under a context matching the session ID. One of those records is a "master" record with a fixed key of "_session" that contains a JSON serialization of the "core" session data attached to the IdPSession interface. The master record also contains a pair of arrays containing the keys to all associated AuthenticationResult or SPSession objects (which are the flow ID or SP name/ID respectively). This is done because the storage API doesn't provide a way to enumerate the keys within a context, so this provides a foreign key lookup from an IdPSession to its content.

The master record is set to expire based on the session timeout value, and the expiration slides forward on every update of the activity time. There is no actual "lifetime" bound because the session itself has no security value, so as long as a session continues to see use, it will stay alive. The size of the session is bounded by making sure the individual records that make up the session all have appropriate expirations.

Attached to the same context as the master record, each individual AuthenticationResult and SPSession added to the IdPSession is serialized and stored under the foreign key stored in the master record (the flow ID or SP name/ID).
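The master record and its foreign keys might be sketched like this in Python. Storage is modelled as a nested dictionary (context, then key, then value), and the JSON field names `flows` and `services` are invented for the example; the actual serialized field names are not given here.

```python
import json
from typing import Dict, Tuple

MASTER_KEY = "_session"  # fixed key of the master record
Storage = Dict[str, Dict[str, str]]  # context -> key -> serialized value


def store_session(storage: Storage, session_id: str, core: dict,
                  results: Dict[str, str], sp_sessions: Dict[str, str]) -> None:
    """Write the master record plus one sub-record per result / SP session."""
    context = storage.setdefault(session_id, {})
    # The key arrays stand in for the enumeration the storage API lacks.
    master = dict(core, flows=sorted(results), services=sorted(sp_sessions))
    context[MASTER_KEY] = json.dumps(master)
    context.update(results)
    context.update(sp_sessions)


def load_session(storage: Storage, session_id: str) -> Tuple[dict, dict, dict]:
    """Read the master record, then follow its key lists to the sub-records."""
    context = storage[session_id]
    master = json.loads(context[MASTER_KEY])
    # A listed key may be missing if that sub-record has already expired.
    results = {k: context[k] for k in master["flows"] if k in context}
    services = {k: context[k] for k in master["services"] if k in context}
    return master, results, services
```

The loader tolerates a key that is listed in the master record but absent from the context, since sub-records expire on their own schedules.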

An AuthenticationResult is serialized by the corresponding AuthenticationFlowDescriptor, which implements the StorageSerializer API and handles all the details. Similarly to the master record, the expiration is timeout-driven, based on the last activity of the result, along with an offset that prevents a result from disappearing from storage too quickly.

An SPSession is serialized using a plugin registered for the underlying SPSession class, which allows subclasses to contain extended information specific to new protocols. To make this reversible, the class name of the SPSession is prefixed to the serialized data so it can be read back to determine the right plugin to use to reconstitute the object. The expirations are set based on the actual expiration of the SPSession (as determined by the profile flow that created it) plus some "slop" that keeps knowledge of the SPSession available for logout purposes.
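The class-name prefix trick can be illustrated with a short Python sketch. The two session classes, their fields, and the `REGISTRY` lookup are hypothetical stand-ins for the real SPSession types and serializer plugins, and JSON is used as the payload format only for convenience.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class BasicSPSession:
    service_id: str
    expiration: int


@dataclass
class SAML2SPSession(BasicSPSession):
    name_id: str
    session_index: str


# Stand-in for the per-class serializer plugins: type name -> class.
REGISTRY = {cls.__name__: cls for cls in (BasicSPSession, SAML2SPSession)}


def serialize(session: BasicSPSession) -> str:
    """Prefix the payload with the type name so it can be read back later."""
    return f"{type(session).__name__}:{json.dumps(asdict(session))}"


def deserialize(data: str) -> BasicSPSession:
    # Split on the first colon only; the payload itself contains colons.
    type_name, payload = data.split(":", 1)
    return REGISTRY[type_name](**json.loads(payload))
```

Splitting on the first colon only matters because the payload (entityIDs, JSON) is full of colons; the type name is the one part known not to contain any.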

Collectively, these records all expire at somewhat varying times, but they are all bounded and are eventually cleaned up by the StorageService without any intervention.

An example layout:

Context | Key | Value | Expiration
<session ID> | _session | <serialized form of IdPSession and subrecord keys> | session last activity + timeout + offset
<session ID> | authn/JAAS | <serialized AuthenticationResult> | result last activity + timeout + offset
<session ID> | authn/SPNEGO | <serialized AuthenticationResult> | result last activity + timeout + offset
<session ID> | https://sp.example.org/shibboleth | net.shibboleth.idp.saml.session.SAML2Session:<serialized SPSession> | SP session expiration + offset
<session ID> | https://sp2.example.org/shibboleth | net.shibboleth.idp.saml.session.SAML2Session:<serialized SPSession> | SP session expiration + offset

A word about versioning: a big reason for manipulating the various record expirations is to be able to update and recover the last activity time without actually touching the record. This is safe because the versioning policy for such a field would be last-update-wins anyway, and it avoids needing to modify the serialized form of a record to perform the most common update.
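The arithmetic behind this is simple enough to show directly. In this Python sketch the function names and the dictionary-shaped record are assumptions; times are plain integers (say, milliseconds).

```python
def expiration_for(last_activity: int, timeout: int, offset: int) -> int:
    """Expiration encodes the activity time: last activity + timeout + offset."""
    return last_activity + timeout + offset


def last_activity_from(expiration: int, timeout: int, offset: int) -> int:
    """Invert the formula to recover the activity time from the expiration."""
    return expiration - timeout - offset


def touch(record: dict, now: int, timeout: int, offset: int) -> None:
    """Record new activity by sliding the expiration; the value is untouched."""
    record["expiration"] = expiration_for(now, timeout, offset)
```

This only round-trips if the timeout and offset used to read the record match the ones used to write it, which is worth keeping in mind if either setting can change while records are live.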

Additionally, AuthenticationResult and SPSession records are otherwise static; they can't change in any other respects, so once serialized they never change. The master record does change any time an address is added, or a sub-record is attached (to maintain the foreign key lists), and the record versioning feature of the StorageService prevents race conditions when updates occur at the same time.
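A versioned update of the master record's key list could work along these lines. The `VersionedStore` class below is a toy in-memory stand-in written for this sketch, not the StorageService API, and the retry limit is an arbitrary assumption.

```python
from typing import Dict, List, Optional, Tuple


class VersionedStore:
    """Toy store where each record carries a version number (hypothetical)."""

    def __init__(self) -> None:
        self._records: Dict[str, Tuple[int, List[str]]] = {}

    def create(self, key: str, value: List[str]) -> bool:
        if key in self._records:
            return False
        self._records[key] = (1, value)
        return True

    def read(self, key: str) -> Optional[Tuple[int, List[str]]]:
        return self._records.get(key)

    def update_with_version(self, version: int, key: str,
                            value: List[str]) -> Optional[int]:
        """Apply the update only if the caller saw the current version."""
        current = self._records.get(key)
        if current is None or current[0] != version:
            return None
        self._records[key] = (version + 1, value)
        return version + 1


def attach_key(store: VersionedStore, key: str, new_item: str,
               max_attempts: int = 5) -> int:
    """Read, modify, and write back, retrying if another writer got in first."""
    for _ in range(max_attempts):
        record = store.read(key)
        if record is None:
            raise KeyError(key)
        version, items = record
        if new_item in items:
            return version
        new_version = store.update_with_version(version, key, items + [new_item])
        if new_version is not None:
            return new_version
    raise RuntimeError("too many concurrent updates")
```

A writer holding a stale version gets a failed update rather than silently overwriting someone else's change, and simply re-reads and tries again.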

Secondary Index

If all of that sounds horrible, the real fun is implementing a secondary index using the SPSession records to be able to track back to an IdPSession based on an SP name and a custom key. Without SQL semantics, the secondary index is implemented using a simple delimited list of session IDs in a record keyed by the SP name and whatever custom key is appropriate for the session type. This is implementation-specific, but in SAML 2 is probably going to be a truncation of the NameID value. It doesn't have to be unique because we're storing a list of sessions in the record, not just one, and we can independently validate which sessions really "match" in other ways.

The index is maintained by a read followed by a create or update to maintain the session ID list in the record. This is guarded using the versioning support, again, to prevent races and maintain coherent data.

Another "trick" is that the expiration of the record is based on the expiration of the SPSession record being indexed on. This ensures that the index record survives at least as long as every SPSession that may trigger a lookup.
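Index maintenance along these lines might be sketched as follows. This Python version assumes a comma delimiter (and therefore session IDs without commas), models storage as a plain dictionary, and takes the larger of the old and new expirations so the record outlives every indexed SPSession; all of those are assumptions for illustration. The version counter is only bumped here; a real implementation would use it to reject stale writes.

```python
from typing import Dict, List, Tuple

DELIMITER = ","  # hypothetical; assumes session IDs never contain a comma

IndexStore = Dict[Tuple[str, str], dict]  # (SP name, custom key) -> record


def add_to_index(store: IndexStore, sp: str, key: str,
                 session_id: str, sp_session_expiration: int) -> None:
    """Read the index record, then create it or append to its ID list."""
    record = store.get((sp, key))
    if record is None:
        store[(sp, key)] = {"value": session_id,
                            "expiration": sp_session_expiration,
                            "version": 1}
        return
    ids = record["value"].split(DELIMITER)
    if session_id not in ids:
        ids.append(session_id)
    record["value"] = DELIMITER.join(ids)
    # Never shorten the record's life below any SPSession it indexes.
    record["expiration"] = max(record["expiration"], sp_session_expiration)
    record["version"] += 1


def lookup(store: IndexStore, sp: str, key: str) -> List[str]:
    record = store.get((sp, key))
    return record["value"].split(DELIMITER) if record else []
```

Every insert rewrites the whole list, which is why the approach degrades when thousands of sessions share one index key.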

The one warning is that this very low-tech solution breaks under load testing scenarios, because that tends to generate a huge number of sessions with the same SP name and custom key, whereas under normal use you never see significant numbers that are identical. With such a large number, reading and writing the list of sessions becomes inefficient. The same problem has been observed with the Shibboleth SP for the same reason. In practice it doesn't matter; one simply disables this secondary indexing when load testing.

Following along with the example above, the secondary records created might be:

Context | Key | Value | Expiration
https://sp.example.org/shibboleth | <NameID value> | <session ID> | SP session expiration + offset
https://sp2.example.org/shibboleth | <NameID value> | <session ID> | SP session expiration + offset