The SP V4 design reduces the complexity of, but does not elimiinate, the need for web server or application framework agents in order to support a target environment. This document outlines the initial requirements for these agents and a proposed design breakdown of the software components needed.
Targets
- Determine C++ language standard to target while supporting the necessary platforms.
The initial target set of agents are those already supported: Apache (2.4 only), IIS 7+, and FastCGI. Of these, Apache and FastCGI need to be in C/C++, while it is in principle possible that IIS could be implemented via .NET, and thus in other languages. Given that the majority of the code needed for IIS already exists (or is portable and necessary to support the others anyway), it’s probably we would include IIS in the initial C/C++ set.
I am not inclined to roll back the code we do have and write a lot of new code in C. Yes, it’s more portable, easier and faster to build, etc. It’s also far more error prone and lacking in basic library facilities that modern C++ has available. That said, the nature of that C++ code needs to change radically, simplifying and modernizing it, dropping the use of templates wherever possible, and sticking to the standard library only. The nominal target would be C++11, but we need to research whether a newer version would be practical. Elimination of various Boost-based tricks is essential, especially non-standard lambda and parameter binding templates.
Dependencies
The only allowable dependencies must be usable on both Windows and non-Windows platforms (or there must be native alternatives to use). The set should be as minimal as possible. At this stage, only the following has been identified as critical needs:
Regular expressions
TLS-capable HTTP 1.1 client that allows basic control of server trust anchors (NOT the advanced metadata-driven trust evaluation support we use today).
C++11 has standard liibrary support for regular expressions.
The most obvious choices to use for HTTP are the native Windows client (for Windows obviously) and libcurl + OpenSSL (for everythiing else). In principle, libcurl with a different TLS implementation would work given that we stick to the higher level TLS options exposed by libcurl. Notably, this will pull libcurl and a TLS library into the in-process agent footprint, something we avoided in the current SP to avoid symbol conflicts between Apache and the agent. As long as the same version of OpenSSL is used between libcurl and Apache’s mod_ssl, that shouldn’t cause problems, and most Linux distributions would typically do that.
Library Structure
The current implementation is spread across 3 layers of libraries, with two of them forked into two variants using conditional macros to control the portions included in each fork.
One of the three is OpenSAML, and that will be almost if not entirely gone. Portions of the lowest layer, xmltooling, will survive in modified form, primarily the HTTP request/response abstraction that lived there, as that ultimately gets bridged up into the web server modules and handlers. There are various utility abstractions that may survive, though some of those should be obviated by newer C++ STL features.
Changes Desired or Needed
The highest layer is probably going to become a self-contained package of code (with the necessary portions of xmltooling copied in) that will be statically compiled into the various server modules. This will simplify the build and avoid the need for multiple binaries to be packaged to deliver a module.
Packaging everything into single binary modules will make it impractical to support plugin extensions via shared library. I think that’s a fair trade off given that most of the likely extension points are better handled via Java, but we’ll want to be able to allow for new implementations of some interfaces to be added to the build conditionally and we will continue to want to support a type-based plugin factory API for instantiating variants of components at runtime based on configuration.
Another major component that impacts much of the current code is a DOM-based abstraction for accessing configuration settings by mapping XML attributes and elements into string-based properties that the code can access (often by cascading across multiple layers of properties). Since XML is out of the question, this will need to be simplified into a simpler property-driven API but may still need to be “scoped” by component, which starts to look more like a Windows “ini” file than a flat property set. I have implemented that sort of thing before for Unix, but don’t have any of that code to hand anymore.
Components
This is a breakdown of the major pieces of the implementation. These proceed in a very inexact “lowest” to “highest” order, with later pieces typically depending on some of the earlier ones.
Non-XML-based configuration support
Logging abstraction
syslog
Windows event log
Apache error log
DDF (remoted data representation and serialization)
HTTP transport between agent and Java hub
Curl-based implementation
Windows-based implementation
HTTP Request/Responose abstraction
Session cache
Session abstraction
File-backed implementation
Chunked cookie implementation
Handler framework
Session initiator handler
Token consumer handler
Logout handlers
Other handlers
Web server bridges to interface to requests and responses
Portable authorization via RequestMap
Native modules
Apache 2.4 module
IIS module
FastCGI authorizer and requester
Configuration Support
The current configuration is predominantly XML-based, supplemented/bridged to native Apache commands on that platform (none of the other current agents support a usable configuration mechanism).
The RequestMap will continue to be supported as XML, while other needs may be possible to address with a simpler format (JSON is not simpler and is much less flexble, but an INI or property file is definitely simpler). Boost has a somewhat odd-looking Property Tree module that appears to be header-only (so easy to use without adding runtime dependencies), and it actually supports reading and writing to INI, JSON, and XML(!) with some limitations in each case.
That may be a viable way to parse the RequestMap without delegating to Java, and in fact the representation is rather similar to what I built in Java using my DDF library. (The whole abstraction is somewhat similar to DDF in fact.)
For now, it suffices to say that we need to identify what the settings needed are and their general structure before settling on an approach, but this Boost option looks like a good fit simply due to flexibility.
Logging
Currently we use a fork of an old C++ logging library. The elimination of dependencies and the fact that having out of band logging streams from within a web server has never worked well demands we drop this in favor of “native” options. Logging is of course non-portable, and some web servers do their own logging.
We will need our own logging API with implementations for:
Windows Event logging (which may necessitate an accompanying event DLL unfortunately, might be worth trying to embed that in the module DLL)
syslog (on all but Windows)
Apache’s error logging (for Apache only of course)
I could envision wanting some tuning of logging levels (likely not at the level of individual categories), and possibly wanting support to write to multiple logging sinks, but we will NOT provide our own full implementation with customizable appenders, file rotation, etc.
Critically, the bulk of the important material will be on the Java side, and all of the audit logging will be there.
DDF
The DDF abstraction is discussed elsewhere. This is the core remoting layer to issue requests to the Java hub (as it has been now to connect to shibd). It is mostly implemented at this point in the form needed, with a few outstanding questions. The current serialization support is limited to a record-oriented format that should be suitable for C++-based agents, and would be a usable representation of data stored in files (e.g., the session cache). JSON serialization will be needed but likely only for future agents implemented in other languages and in Java, so is out of scope for this work.
HTTP Transport to Java Hub
The current SP relies on a basic socket protocol to read and write serialized DDF objects between agents and shibd. That was originally the plan for the new version but it became obvious that using HTTP alone had a lot of advantages, so the plan shifted to using a simple HTTP 1.1 transport wrapper to pass in and out bodies of serialized DDF objects. There’s no benefit to using HTTP 2 or QUIC here, though it probably wouldn’t be precluded on the hub end of things.
Using HTTP implies support for TLS, which was originally going to be handled via stunnel, but should be much better for deployers if supported natively. This will not be the “TLS as implemented nowhere else” that the SP today uses for SAML exchanges, but rather the primitive approach supported by most HTTP clients in which a set of static trust anchors are applied using simple APIs. Client TLS authentication could be an option but isn’t expected to be a primary tool due to the complexity of key management.
Portability demands that we build an interface to this function abstractly, and then implement it on each platform using the most obvious plumbling. On Windows that would likely be whatever the “best” native client is now (used to be WinHTTP, don’t know if it’s still the best choice). Everywhere else the best option is probably libcurl, which we are very familiar with.
The current SP includes an abstraction geared to SOAP, with a “transport” class implemented on top of libcurl. A simpler version of it should serve as a good basis for this work.
HTTP Request/Response Abstraction
The second HTTP-related component is to model requests and responses (think most web application frameworks, like the servlet API in Java) generically. This uncouples components that need to read requests or manipulate responses from the specific agent environment, which will have a proprietary API used to connect to the surrounding context. There will be less code needing this abstraction than there is now, but still a non-trivial amount of it, including the handler framework that ties the agent to SSO protocols and potentially the session cache.
The current interfaces in xmltooling are likely close to the form they will take here.
Session Cache
The session layer is the most complicated portion of these kinds of agents, and is needed for most “low-level” agent implementations. The cost of insulating applications from identity is providing this kind of session implementation.
The current design is horrendous for a few reasons. One is that it supports SAML’s logout requirements, which stipulate a sort of reverse indexing of information back to sessions. The other big complication is the combining of the in-process caching done by the modules with the storage-backed caching done by shibd in a single component, so there are lots of conditional code blocks and inter-process hand-offs within the code. It’s also very natively handling SAML data within sessions, and also does direct manipulation of cookies to manage the sessions. Most of that complexity needs to be removed.
We will need a newer Session interface (much simpler than the current one that largely sticks with policy timestamps, a bag of attributes, and some opaque data supplied by the hub). Using the DDF returned by the hub and reading and writing that should be sufficient to represent them). The SessionCache interface is probably not too far removed from the current one, hopefully with some minor improvements.
Some requirements:
Sessions need to be buffered in memory with a cleanup/ejection policy to limit size.
A pluggable interface to store sessions persistently will be needed. The proposed implementation initially is to use write-once, read-many files to cache them, allowing for cleanup based on file creation time. If the interface to the cache allows access to the HTTP abstraction as it does now, then a chunked cookie implementation using the DataSealer operations in the hub to encrypt/decrypt data would be a possibility. Another option might be shared memory, which seems to be somewhat common in other modules of this type. Still another is remote use of the Java hub’s StorageService support, allowing for a database, memcache, etc.
Timeout enforcement should be coarse-grained instead of enforced on every request to limit the need to adjust a per-session tracking value that might be expensive to update.
In theory, sessions currently can be updated post-creation, but I would like to avoid this requirement in the future.
Handler Framework
The current implementation supports a pluggable map of paths below a special handler prefix (typically /Shiibboleth.sso) to functional modules that implement something at a URL. They are essentially just virtual URL logic extensions that tend to be similar to native mechanisms in some web servers but for portability were implemented in a portable way. Mounting them below a fixed prefix helped to optimize the process of determining whether to route a request into the handler layer or not. Historically this was also related to a mechanism within IIS in particular for mapping requests to a DLL, which are no longer relevant for newer versions or for Apache.
It may be possible to “flex away” from requiring a fixed prefix for handler URLs, though for compatibility it is expected that deployments will end up re-using the existing paths.
As there are now, there will be different types of handlers implementing different functions. Some of the existing handlers actually don’t involve a lot of business logic within the agents; rather they remote everything to shibd, and this will map fairly well to remoting a lot to Java.
At least three specialized handlers exist (as in the current code), but will be in a protocol-neutral form and aside from compatibility considerations it would be intended that all protocols can start and finish at a unified set of locations rather than be specific to a protocol. That is, it will be invisible to an agent whether SAML, OpenID Connect, or something else is involved, so all protocols could process responses at a fixed endpoint (most SSO protocols generally rely on advance distribution of endpoints for validation).
Session Initiators
Session initiators are handlers that generate requests to create a new session. Unlike now, a single session initiator should be built to handle any protocol supported by the hub. There will be a bundle of request settings fetched from the RequestMap and other sources, as now, but instead of using them to generate requests, they will be packaged into the request to the hub to be handled. The response from the hub will be a tunnelled HTTP response to relay back to the client.
The SP currently supports a number of “fancier” session initiators that relate to IdP discovery. It is TBD exactly how discovery fits in but it’s possible a similar one or two of them may be carried over, particularly in light of the EDS software we provide that probably isn’t going to be deprecated yet. Configuring “chains” of initiators was something mostly hidden in newer versions of the SP, and may be tricky to configure without XML but this is an open question right now.
Initiators also have to be run in a special “in-request” mode that allows their work to be peformed “mid-request” for resources to enforce session requirements. This also introduces a requirement to implement POST form preservation, as in the current implementation, and this is going to be remoted to Java for storage.
Token Consumers
The generic term in the new design for what SAML calls an assertion consumer service is “token consumer”. All supported protocols will be supported by a single implementation of this concept with a fixed set of behavior that results in a session being created (subject to various extra features we may continue to support such as the “sessionHook” concept.
Token consumers will need to perform POST data recovery as they do now.