
...

Another major component that impacts much of the current code is a DOM-based abstraction for accessing configuration settings by mapping XML attributes and elements into string-based properties that the code can access (often by cascading across multiple layers of properties). Since XML is out of the question, this will need to be reduced to a simpler property-driven API, but it may still need to be “scoped” by component, which starts to look more like a Windows “ini” file than a flat property set. I have implemented that sort of thing before for Unix, but no longer have any of that code to hand.
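
To make the “scoped by component” idea concrete, here is a minimal sketch in Python (chosen only for brevity, not as the implementation language). It assumes an ini-style file with one section per component plus a shared section; the section names, property names, and the ScopedConfig class are all hypothetical.

```python
import configparser


class ScopedConfig:
    """Property lookup scoped by component, cascading to a shared section."""

    def __init__(self, text, global_section="global"):
        # Interpolation is disabled so that '%' in values is taken literally.
        self._parser = configparser.ConfigParser(interpolation=None)
        self._parser.read_string(text)
        self._global = global_section

    def get(self, component, name, default=None):
        # The component's own section wins; the shared section is the fallback.
        for section in (component, self._global):
            if self._parser.has_option(section, name):
                return self._parser.get(section, name)
        return default


# Hypothetical settings, for illustration only.
SAMPLE = """
[global]
logLevel = INFO
timeout = 30

[session-cache]
timeout = 3600
"""

config = ScopedConfig(SAMPLE)
print(config.get("session-cache", "timeout"))   # component-level value
print(config.get("session-cache", "logLevel"))  # inherited from [global]
```

The cascade here is only two layers deep; the current code cascades across more layers, which would mean a longer list of sections to consult in order.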

...

Components

...

This is a breakdown of the major pieces of the implementation. These proceed in a very inexact “lowest” to “highest” order, with later pieces typically depending on some of the earlier ones.
Non-XML-based configuration support
Logging abstraction
syslog
Windows event log
Apache error log
DDF (remoted data representation and serialization)
HTTP transport between agent and Java hub
Curl-based implementation
Windows-based implementation
HTTP Request/Response abstraction
Session cache
Session abstraction
File-backed implementation
Chunked cookie implementation
Handler framework
Session initiator handler
Token consumer handler
Logout handlers
Other handlers
Web server bridges to interface to requests and responses
Portable authorization via RequestMap
Native modules
Apache 2.4 module
IIS module
FastCGI authorizer and requester
...

Configuration Support

The current configuration is predominantly XML-based, supplemented/bridged to native Apache commands on that platform (none of the other current agents support a usable configuration mechanism).

...

Windows Event logging (which unfortunately may necessitate an accompanying event DLL; it might be worth trying to embed that in the module DLL)
syslog (on all but Windows)
Apache’s error logging (for Apache only of course)
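
One way to picture a logging abstraction over these back ends is a small sink interface that each platform implements. The sketch below is illustrative only: the class names, level names, and the in-memory sink are assumptions, and the real sinks would wrap syslog, the Windows event log, or Apache's error log.

```python
import sys
from abc import ABC, abstractmethod

# Hypothetical severity levels; higher numbers are more severe.
LEVELS = {"DEBUG": 10, "INFO": 20, "WARN": 30, "ERROR": 40}


class LogSink(ABC):
    """Back end that actually records a message."""

    @abstractmethod
    def write(self, level, category, message):
        ...


class StderrSink(LogSink):
    def write(self, level, category, message):
        print(f"{level} {category}: {message}", file=sys.stderr)


class MemorySink(LogSink):
    """Stand-in sink that keeps records in a list, useful for testing."""

    def __init__(self):
        self.records = []

    def write(self, level, category, message):
        self.records.append((level, category, message))


class Logger:
    """Front end used by agent code; it filters by level, then delegates."""

    def __init__(self, sink, threshold="INFO"):
        self._sink = sink
        self._threshold = LEVELS[threshold]

    def log(self, level, category, message):
        if LEVELS[level] >= self._threshold:
            self._sink.write(level, category, message)
```

Agent code would only ever see Logger, so choosing a back end becomes a configuration decision rather than a code change.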

...

The current SP includes an abstraction geared to SOAP, with a “transport” class implemented on top of libcurl. A simpler version of it should serve as a good basis for this work.

HTTP Request/Response Abstraction

The second HTTP-related component is to model requests and responses (think most web application frameworks, like the servlet API in Java) generically. This uncouples components that need to read requests or manipulate responses from the specific agent environment, which will have a proprietary API used to connect to the surrounding context. There will be less code needing this abstraction than there is now, but still a non-trivial amount of it, including the handler framework that ties the agent to SSO protocols and potentially the session cache.

The current interfaces in xmltooling are likely close to the form they will take here.
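
As a rough illustration of the decoupling described above, the following sketch defines a generic request and response interface and one piece of portable logic written only against it. None of these names come from xmltooling; the interfaces, the in-memory implementations, and the require_session function are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from http.cookies import SimpleCookie


class HTTPRequest(ABC):
    """What portable code may ask of a request, whatever the web server."""

    @abstractmethod
    def path(self):
        ...

    @abstractmethod
    def header(self, name):
        ...

    def cookie(self, name):
        # Built only on header(), so every server bridge gets it for free.
        jar = SimpleCookie()
        jar.load(self.header("Cookie") or "")
        return jar[name].value if name in jar else None


class HTTPResponse(ABC):
    @abstractmethod
    def set_status(self, code):
        ...

    @abstractmethod
    def set_header(self, name, value):
        ...


class InMemoryRequest(HTTPRequest):
    """Test double; a real bridge would wrap the server's native request."""

    def __init__(self, path, headers=None):
        self._path = path
        self._headers = {k.lower(): v for k, v in (headers or {}).items()}

    def path(self):
        return self._path

    def header(self, name):
        return self._headers.get(name.lower())


class InMemoryResponse(HTTPResponse):
    def __init__(self):
        self.status = 200
        self.headers = {}

    def set_status(self, code):
        self.status = code

    def set_header(self, name, value):
        self.headers[name] = value


def require_session(request, response, cookie_name, login_url):
    """Portable logic: redirect to login_url when the cookie is absent."""
    if request.cookie(cookie_name) is not None:
        return True
    response.set_status(302)
    response.set_header("Location", login_url)
    return False
```

An Apache or IIS bridge would supply its own subclasses of the two interfaces, and require_session would run unchanged on either.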

Session Cache

The session layer is the most complicated portion of these kinds of agents, and is needed for most “low-level” agent implementations. The cost of insulating applications from identity is providing this kind of session implementation.

The current design is horrendous for a few reasons. One is that it supports SAML’s logout requirements, which stipulate a sort of reverse indexing of information back to sessions. The other big complication is that a single component combines the in-process caching done by the modules with the storage-backed caching done by shibd, so there are lots of conditional code blocks and inter-process hand-offs within the code. It also handles SAML data natively within sessions and manipulates cookies directly to manage them. Most of that complexity needs to be removed.

We will need a newer Session interface, much simpler than the current one, that largely sticks with policy timestamps, a bag of attributes, and some opaque data supplied by the hub. Using the DDF returned by the hub, and reading and writing that, should be sufficient to represent sessions. The SessionCache interface is probably not too far removed from the current one, hopefully with some minor improvements.
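
A simplified Session along these lines might look like the sketch below. It is only an illustration: a plain dictionary serialized as JSON stands in for the DDF structure returned by the hub, and the field names (id, created, expires, attributes, opaque) are assumptions rather than a defined format.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Read-only view of a session: timestamps, attributes, opaque hub data."""

    session_id: str
    created: int      # seconds since the epoch
    expires: int      # seconds since the epoch
    attributes: dict  # attribute name -> list of string values
    opaque: str = ""  # data the agent stores but never interprets

    @classmethod
    def from_record(cls, record):
        # 'record' stands in for the structure the hub would return.
        return cls(
            session_id=record["id"],
            created=record["created"],
            expires=record["expires"],
            attributes=record.get("attributes", {}),
            opaque=record.get("opaque", ""),
        )

    def to_record(self):
        return {
            "id": self.session_id,
            "created": self.created,
            "expires": self.expires,
            "attributes": self.attributes,
            "opaque": self.opaque,
        }

    def is_expired(self, now):
        return now >= self.expires


def dumps(session):
    """Serialize for storage; JSON is a placeholder for DDF serialization."""
    return json.dumps(session.to_record())


def loads(text):
    return Session.from_record(json.loads(text))
```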

Some requirements:

Sessions need to be buffered in memory with a cleanup/ejection policy to limit size.
A pluggable interface to store sessions persistently will be needed. The proposed implementation initially is to use write-once, read-many files to cache them, allowing for cleanup based on file creation time. If the interface to the cache allows access to the HTTP abstraction as it does now, then a chunked cookie implementation using the DataSealer operations in the hub to encrypt/decrypt data would be a possibility. Another option might be shared memory, which seems to be somewhat common in other modules of this type. Still another is remote use of the Java hub’s StorageService support, allowing for a database, memcache, etc.
Timeout enforcement should be coarse-grained instead of enforced on every request to limit the need to adjust a per-session tracking value that might be expensive to update.
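
The first two requirements can be sketched together: a bounded in-memory layer in front of a pluggable persistent store, here implemented with write-once files. This is a simplified illustration under several assumptions: the class and method names are invented, records are JSON rather than DDF, session keys are assumed to be safe as file names (for example, hex strings), and file modification time stands in for creation time, which is equivalent for files that are never rewritten.

```python
import json
import os
import time
from collections import OrderedDict


class FileSessionStore:
    """Persistent store using one write-once file per session."""

    def __init__(self, directory):
        self._dir = directory

    def _path(self, key):
        return os.path.join(self._dir, key + ".json")

    def write(self, key, record):
        # Mode "x" raises FileExistsError if the file is already present.
        with open(self._path(key), "x", encoding="utf-8") as f:
            json.dump(record, f)

    def read(self, key):
        try:
            with open(self._path(key), encoding="utf-8") as f:
                return json.load(f)
        except FileNotFoundError:
            return None

    def cleanup(self, max_age, now=None):
        """Delete files older than max_age seconds; return how many."""
        now = time.time() if now is None else now
        removed = 0
        for name in os.listdir(self._dir):
            path = os.path.join(self._dir, name)
            if now - os.path.getmtime(path) > max_age:
                os.remove(path)
                removed += 1
        return removed


class SessionCache:
    """In-memory LRU layer in front of a persistent store."""

    def __init__(self, store, max_entries=1000):
        self._store = store
        self._max = max_entries
        self._memory = OrderedDict()

    def insert(self, key, record):
        self._store.write(key, record)
        self._remember(key, record)

    def find(self, key):
        if key in self._memory:
            self._memory.move_to_end(key)  # mark as most recently used
            return self._memory[key]
        record = self._store.read(key)     # fall back to persistent storage
        if record is not None:
            self._remember(key, record)
        return record

    def _remember(self, key, record):
        self._memory[key] = record
        self._memory.move_to_end(key)
        while len(self._memory) > self._max:
            self._memory.popitem(last=False)  # eject least recently used
```

Because the files are never rewritten, any other store (shared memory, chunked cookies, or the hub's StorageService) would only need to offer the same write, read, and cleanup operations.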

In theory, sessions currently can be updated post-creation, but I would like to avoid this requirement in the future.

Handler Framework

The current implementation supports a pluggable map of paths below a special handler prefix (typically /Shibboleth.sso) to functional modules that implement something at a URL. They are essentially just virtual URL logic extensions that resemble native mechanisms in some web servers but were implemented independently of any one server for portability. Mounting them below a fixed prefix helped to optimize the process of determining whether to route a request into the handler layer or not. Historically this was also related to a mechanism within IIS in particular for mapping requests to a DLL, which is no longer relevant for newer versions or for Apache.
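
The routing described above can be sketched as a map keyed by the path below the prefix, with a cheap prefix test done first. The HandlerRegistry class and the handler functions are hypothetical, and the locations shown are just examples of the kind of paths involved.

```python
class HandlerRegistry:
    """Maps locations below a fixed prefix to handler callables."""

    def __init__(self, prefix="/Shibboleth.sso"):
        self._prefix = prefix.rstrip("/")
        self._handlers = {}

    def register(self, location, handler):
        # Normalize so "Login", "/Login" and "/Login/" are the same location.
        self._handlers["/" + location.strip("/")] = handler

    def is_handler_path(self, path):
        # The cheap test that lets most requests skip the handler layer.
        return path == self._prefix or path.startswith(self._prefix + "/")

    def resolve(self, path):
        """Return the handler for a request path, or None if there is none."""
        if not self.is_handler_path(path):
            return None
        location = path[len(self._prefix):].rstrip("/") or "/"
        return self._handlers.get(location)


def login_handler(path):
    return "start a session"


def logout_handler(path):
    return "end the session"


registry = HandlerRegistry()
registry.register("/Login", login_handler)
registry.register("/Logout", logout_handler)

handler = registry.resolve("/Shibboleth.sso/Login")
print(handler("/Shibboleth.sso/Login") if handler else "not a handler request")
```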

It may be possible to “flex away” from requiring a fixed prefix for handler URLs, though for compatibility it is expected that deployments will end up re-using the existing paths.

As there are now, there will be different types of handlers implementing different functions. Some of the existing handlers actually don’t involve a lot of business logic within the agents; rather they remote everything to shibd, and this will map fairly well to remoting a lot to Java.

At least three specialized handlers will exist (as in the current code), but in a protocol-neutral form. Aside from compatibility considerations, the intent is that all protocols can start and finish at a unified set of locations rather than at protocol-specific ones. That is, it will be invisible to an agent whether SAML, OpenID Connect, or something else is involved, so all protocols could process responses at a fixed endpoint (most SSO protocols generally rely on advance distribution of endpoints for validation).

Session Initiators

Session initiators are handlers that generate requests to create a new session. Unlike now, a single session initiator should be built to handle any protocol supported by the hub. There will be a bundle of request settings fetched from the RequestMap and other sources, as now, but instead of using them to generate requests, they will be packaged into the request to the hub to be handled. The response from the hub will be a tunnelled HTTP response to relay back to the client.
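
The flow in this section (gather settings, package them for the hub, relay the tunnelled response) might be sketched as follows. Everything about the message shapes is an assumption made for illustration: the hub protocol is not defined here, dictionaries stand in for DDF structures, and the setting names in the allow-list are examples.

```python
from urllib.parse import parse_qs


def collect_settings(request_map_settings, query_string,
                     allowed=("entityID", "target", "forceAuthn")):
    """Layer permitted query-string overrides on RequestMap-derived settings."""
    settings = dict(request_map_settings)
    for name, values in parse_qs(query_string).items():
        if name in allowed:
            settings[name] = values[-1]  # the last occurrence wins
    return settings


def build_hub_request(settings, resource_url):
    """Package the settings for the hub instead of building a protocol request."""
    # Hypothetical message shape; the hub decides which protocol to use.
    return {
        "operation": "session-initiator",
        "resource": resource_url,
        "settings": settings,
    }


def relay(hub_response):
    """Extract the tunnelled HTTP response to send back to the client."""
    http = hub_response["http"]
    return {
        "status": http["status"],
        "headers": dict(http.get("headers", {})),
        "body": http.get("body", ""),
    }
```

The agent never inspects the settings beyond filtering them, which is what lets a single initiator serve any protocol the hub supports.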

The SP currently supports a number of “fancier” session initiators that relate to IdP discovery. It is TBD exactly how discovery fits in but it’s possible a similar one or two of them may be carried over, particularly in light of the EDS software we provide that probably isn’t going to be deprecated yet. Configuring “chains” of initiators was something mostly hidden in newer versions of the SP, and may be tricky to configure without XML but this is an open question right now.

Initiators also have to be run in a special “in-request” mode that allows their work to be performed “mid-request” for resources to enforce session requirements. This also introduces a requirement to implement POST form preservation, as in the current implementation, and this is going to be remoted to Java for storage.

Token Consumers

The generic term in the new design for what SAML calls an assertion consumer service is “token consumer”. All supported protocols will be handled by a single implementation of this concept with a fixed set of behaviors that results in a session being created (subject to various extra features we may continue to support, such as the “sessionHook” concept).

Token consumers will need to perform POST data recovery as they do now.
