Page Comparison

...

ID of service
maybe a creation time, do we care?
thinking we might track an expiration, otherwise how do we bound the number of these?
flow ID used for service (could be fresh, or a reuse of active result derived from flow)
for SAML at least, we would need the NameID and SessionIndex issued in the transaction to be able to propagate logout

The latter is a bitch, because we don't have it until very late in the flow. Thinking this means we don't want to actually update the session until very late (or at least not write it back to storage, which is what update really means in this context). Or we explicitly support this in the API instead of bundling all this together as a single session object update.

We're storing custom Principals to handle SAML-specific stuff elsewhere, maybe we create custom ServiceSession types with additional data tracked. This might help allow for heterogenous sessions with different protocols used with particular SPs. I know we didn't want hierarchies, but this feels more like data modeling to me, and it's just one deep.

...

Main one obviously is lookup via client delivering session or client cookie with key to session, latter requiring basic index by session ID. Note: validation of a session's use is a separate issue. We are not going to overload lookup with validation, excepting that lookup of an expired session can be blocked by the storage API already.

V2 supports lookups in conjunction with queries. I'm very reluctant to support this because it's ambiguous; a non-transient NameID could map to multiple Sessions and I'm not sure that's helpful, but if we do it, I think we have to expose all of them. Also, the query lookups for a session are done based on the principal name after reversing the NameID. I think we should dump this, it's superfluous if we index by NameID. We didn't used to index by NameID in V2, so that's probably why it was done that way.

...

The StorageService API I have used doesn't accomodate secondary indexing natively. In the SP, I built a secondary index by maintaining a list of the session IDs associated with a SessionIndex, mapped to the index key. The cleanup problem is dodged there by not doing it. This works to a point under two conditions:

...

The second of these is a problem for load testing because it can generate a huge number of sessions for a single NameID. This is pretty easy to workaround by having an option not to do the indexing, which is not needed when load testing anyway. I have never come across a legitimate scenario outside of load tests where the number would get large enough to be such a problem.

Right now, I don't think we know about the upper bound question. V2 doesn't bound the IdP session lifetime, because the individual login methods do get lifetimes, and that's also true in V3. And adding one doesn't seem to work very well, since it creates a hard stopping point that would dump all existing active results and force new authentication, even if the user literally logged in again seconds before. The mixing of the IdP session and the AuthnResult object lifetimes and policies to handle SSO create a disincentive for having a real "lifetime" on the session.

So the lack of an upper bound seems like the bigger problem. We can't maintain a secondary index in a naive way that lacks cross-index coherency if there's no way to know when to blow away the index as a worst case. About all we could do is implement background cleanup to walk the indexed lists of sessions, and that would take way too many individual storage lookups. So, hmm.

But I think the individual "ServiceSesson" records should have a lifetime, based on some kind of upper bound along with the SessionNotOnOrAfter for the relying party, plus some slop. All of that should be known in the SAML profile code, which is another reason why it's clear the ServiceSession work should be pushed later in the flow, not as part of general Session upkeep.

So I'm thinking we could lay this out like so (triples are context, key, record):

The IdPSession would be stored as (sessionId, "_session", serialization of IdPSession data) with expiration based on inactivity timeout + logout slop
Individual AuthenticationResults stored as (sessionId, flowId, serialization of AuthenticationResult) with expiration based on inactivity timeout
ServiceSessions would be stored as (sessionId, RP entityID, serialization of ServiceSession) with expiration based on expected session lifetime at RP + logout slop

That handles the basic data storage and cleanup, and lookup by session key.

The secondary indexing would be stored as (RP entityID, possibly truncated NameID, multimap of SAML SessionIndex to IdPSession keys) and the expiration would be set/updated to the upper bound of the expirations of the ServiceSessions referenced by a given secondary index.

The secondary indexing matches the lookup requirements of both SAML queries and logout directly. I think we could save some work and just serialize the data there by directly serializing a Multimap, but that would be an internal detail, just a tactical choice.

I also don't really think most other protocols will ever need anything like this. It doesn't really work well in SAML as it is. So we can cross that API bridge when we come to it if something really different comes along.

Versions Compared

Old Version 1

New Version 2

Key