
Note
DRAFT - In Progress

Background

To better inform discussion about a next-generation SP design, this is a high-level summary of the current design, its components, and some of the key decisions that have to be made if we go at this again.

To reiterate some of the main problems with the current design:

1. The volume of C/C++ code needs to be greatly reduced to be anything close to sustainable with the skill sets available today from people not near retirement.
2. Reliance on the currently used XML libraries has proven to be too big a risk going forward because they are essentially unmaintained. Alternatives may exist if a rewrite were done, but wouldn't address point 1.
3. The footprint is not amenable to a lot of "modern" approaches to web application deployment.
4. Packaging has been a significant effort, partly due to the sheer number of different libraries involved, and shipping lots of libraries on Windows means lots of maintenance releases.

That said, there are some key requirements we probably have some consensus on that a new design would have to maintain:

Support for Apache 2.4 and IIS 7+, ideally in a form that leads more easily to other options in the future.
Support for the current very general integration strategy for applications that relies on server variables or headers.
Some relatively straightforward options for clustering.
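
To illustrate the server-variable/header integration style referred to above, here is a minimal sketch of how an application might consume identity data from a CGI/WSGI-style environment. The names used (REMOTE_USER, affiliation) and the semicolon-delimited multi-value convention with backslash escaping are assumptions for illustration; the actual names and encoding depend on a deployment's attribute mapping.

```python
import re


def split_values(raw):
    """Split a semicolon-delimited value, honoring backslash-escaped semicolons.

    The delimiter and escaping convention are assumptions for this sketch.
    """
    if not raw:
        return []
    # Split only on semicolons that are not preceded by a backslash.
    parts = re.split(r'(?<!\\);', raw)
    return [p.replace('\\;', ';') for p in parts]


def identity_from_environ(environ):
    """Collect identity data from a dict of server variables (hypothetical names)."""
    return {
        "user": environ.get("REMOTE_USER"),
        "affiliation": split_values(environ.get("affiliation", "")),
    }
```

The point of the sketch is that the application depends only on a flat set of named string values, which is the contract a new design would have to keep regardless of where the protocol work happens.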

The new design being considered would offload the majority of "not every request" processing to a "hub" that would be implemented in Java on top of the existing code base used by the IdP, including its SAML support, XML Security, metadata, etc. Long term it could potentially expand to more generic support for other protocols that fit the same general pattern. The intention would be to maintain at least some degree of scalability by ensuring that post-login "business as usual" requests could be handled without the hub as much as possible, which is where a lot of the interesting design decisions lie.
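
As a rough sketch of that split (not a proposed implementation), the agent below consults the hub only when a login completes and answers later requests from a local session store. Every name here (Agent, handle_request, complete_login, the hub callable, the in-memory dict) is hypothetical, and the real agent would not be written in Python; the sketch only shows where the hub is and is not on the request path.

```python
import time


class Agent:
    """Hypothetical web-server-side agent that serves active sessions locally."""

    def __init__(self, hub, session_lifetime=3600, clock=time.time):
        self.hub = hub                # callable standing in for a remote call to the hub
        self.sessions = {}            # session id -> (expiry time, attributes)
        self.lifetime = session_lifetime
        self.clock = clock

    def handle_request(self, session_id):
        """Post-login "business as usual" path: no hub involvement."""
        entry = self.sessions.get(session_id)
        if entry and entry[0] > self.clock():
            return entry[1]
        return None                   # caller would start a new login via the hub

    def complete_login(self, session_id, login_response):
        """"Not every request" path: protocol processing is offloaded to the hub."""
        attributes = self.hub(login_response)
        self.sessions[session_id] = (self.clock() + self.lifetime, attributes)
        return attributes
```

How much state the agent keeps locally, and how that state is shared across a cluster, is exactly the kind of design decision the paragraph above alludes to; the single in-process dict here sidesteps that question entirely.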

...

There's a lengthy list of components that should be possible to eliminate from the C++ code by offloading the work:

All SAML specifics, including metadata, message handling, policy rules, artifact resolution. I wouldn't anticipate a single SAML reference in the code, ideally, other than perhaps paths for compatibility.
SAML Attribute processing, extraction, decoding, filtering, resolution of additional data, etc.

I would think one big win would be leveraging the IdP's AttributeRegistry, AttributeResolver (and filter engine of course) to do all this work for us, and in particular to instantly add the ability to supplement incoming data with database, LDAP, web service, etc. lookups and transformations, returning all of the results to the SP agent. For some deployments, that alone may be enough to incentivize converting to this approach, though it's possible that many of those deployments would (and have) just proxied SSO already anyway.

Credential handling and trust engines
SOAP client
Protocol and security policy "providers" that supply a lot of low-level configuration details
Replay cache

Most of this code would be either unnecessary to a redesign or already implemented in Java, modulo that configuring it may be very different or would have to be wired up in code based on the existing SP configuration syntax (if viewed as absolutely needed).

...

The rest of the handlers are a grab bag of stuff.

AssertionLookup – this was a back channel hook to pass the full SAML assertion into applications that wanted it. Not clear to me this would be worth keeping.
DiscoveryFeed – the discovery feed, this would clearly go away though might have to migrate into the hub in some form if we intended to maintain the EDS.
AttributeChecker – basically a pre-session authorization tool, probably would need to stay in some form
ExternalAuth – this was a backdoor to create sessions as a trick, I doubt we'd keep it but it would take substantial offloading to do it
MetadataGenerator – gone, obviously, but probably replaced by something else, possibly somewhere else
SAML 2 ArtifactResolution – this is for resolving outbound-issued artifacts, I can't imagine we'd bother keeping it, offloaded or otherwise, but we could
SAML 2 NameIDMgmt – if we kept this, it would probably need to morph into some more generic idea of updating active session data via external input
SessionHandler – debugging, still needed
StatusHandler – debugging, still needed I imagine
AttributeResolver – this was a trick to do post-session acquisition of data using the SP AttributeResolver/etc. machinery and dump it as JSON; if we kept this it would have to be offloaded obviously, and we'd likely have to discuss the actual requirements with someone

Session Cache

This is the most complex piece of code in the SP (and not coincidentally the IdP). Partly this is because it's a component that tends to start life as a "self-contained" component but ends up having to solve so many problems that the final result isn't so modular anymore, and didn't get decomposed into smaller portions. Sessions in general are just the hardest part of implementing one of these modules and in some sense are the only reason to do it. In a web platform that handles sessions, it's going to make more sense to implement identity inside that platform and not generically, because the application is already stuck using that platform and will be an instant legacy debt nightmare regardless of how you do identity.

...
