
The SP V4 design reduces the complexity of, but does not eliminate, the need for web server or application framework agents in order to support a target environment. This document outlines the initial requirements for these agents and a proposed design breakdown of the software components needed.

Targets

  • Determine C++ language standard to target while supporting the necessary platforms.

The initial target set of agents are those already supported: Apache (2.4 only), IIS 7+, and FastCGI. Of these, Apache and FastCGI need to be in C/C++, while it is in principle possible that IIS could be implemented via .NET, and thus in other languages. Given that the majority of the code needed for IIS already exists (or is portable and necessary to support the others anyway), it's probable that we would include IIS in the initial C/C++ set.

I am not inclined to roll back the code we do have and write a lot of new code in C. Yes, it’s more portable, easier and faster to build, etc. It’s also far more error prone and lacking in basic library facilities that modern C++ has available. That said, the nature of that C++ code needs to change radically, simplifying and modernizing it, dropping the use of templates wherever possible, and sticking to the standard library only. The nominal target would be C++11, but we need to research whether a newer version would be practical. Elimination of various Boost-based tricks is essential, especially non-standard lambda and parameter binding templates.

Dependencies

The only allowable dependencies must be usable on both Windows and non-Windows platforms (or there must be native alternatives to use). The set should be as minimal as possible. At this stage, only the following have been identified as critical needs:

  • Regular expressions

  • TLS-capable HTTP 1.1 client that allows basic control of server trust anchors (NOT the advanced metadata-driven trust evaluation support we use today).

C++11 has standard library support for regular expressions.

The most obvious choices to use for HTTP are the native Windows client (for Windows, obviously) and libcurl + OpenSSL (for everything else). In principle, libcurl with a different TLS implementation would work, given that we stick to the higher-level TLS options exposed by libcurl. Notably, this will pull libcurl and a TLS library into the in-process agent footprint, something we avoided in the current SP to prevent symbol conflicts between Apache and the agent. As long as the same version of OpenSSL is used between libcurl and Apache's mod_ssl, that shouldn't cause problems, and most Linux distributions would typically do that.

Library Structure

The current implementation is spread across three layers of libraries, with two of them forked into two variants using conditional macros to control the portions included in each fork.

One of the three is OpenSAML, and that will be almost if not entirely gone. Portions of the lowest layer, xmltooling, will survive in modified form, primarily the HTTP request/response abstraction that lived there, as that ultimately gets bridged up into the web server modules and handlers. There are various utility abstractions that may survive, though some of those should be obviated by newer C++ STL features.

Changes Desired or Needed

The highest layer is probably going to become a self-contained package of code (with the necessary portions of xmltooling copied in) that will be statically compiled into the various server modules. This will simplify the build and avoid the need for multiple binaries to be packaged to deliver a module.

Packaging everything into single binary modules will make it impractical to support plugin extensions via shared library. I think that’s a fair trade off given that most of the likely extension points are better handled via Java, but we’ll want to be able to allow for new implementations of some interfaces to be added to the build conditionally and we will continue to want to support a type-based plugin factory API for instantiating variants of components at runtime based on configuration.

Another major component that impacts much of the current code is a DOM-based abstraction for accessing configuration settings by mapping XML attributes and elements into string-based properties that the code can access (often by cascading across multiple layers of properties). Since XML is out of the question, this will need to be reduced to a simpler property-driven API, but one that may still need to be "scoped" by component, which starts to look more like a Windows "ini" file than a flat property set. I have implemented that sort of thing before for Unix, but no longer have any of that code to hand.

Components

This is a breakdown of the major pieces of the implementation. These proceed in a very inexact “lowest” to “highest” order, with later pieces typically depending on some of the earlier ones.

  • Non-XML-based configuration support

  • Logging abstraction

    • syslog

    • Windows event log

    • Apache error log

  • DDF (remoted data representation and serialization)

  • HTTP transport between agent and Java hub

    • Curl-based implementation

    • Windows-based implementation

  • HTTP Request/Response abstraction

  • Session cache

    • Session abstraction

    • File-backed implementation

    • Chunked cookie implementation

  • Handler framework

    • Session initiator handler

    • Token consumer handler

    • Logout handlers

    • Other handlers

  • Portable authorization via RequestMap

  • Web server bridges to interface to requests and responses

  • Native modules

    • Apache 2.4 module

    • IIS module

    • FastCGI authorizer and requester

Configuration Support

The current configuration is predominantly XML-based, supplemented/bridged to native Apache commands on that platform (none of the other current agents support a usable configuration mechanism).

The RequestMap will continue to be supported as XML, while other needs may be addressable with a simpler format (JSON is not simpler and is much less flexible, but an INI or property file is definitely simpler). Boost has a somewhat odd-looking Property Tree module that appears to be header-only (so easy to use without adding runtime dependencies), and it actually supports reading and writing to INI, JSON, and XML(!) with some limitations in each case.

That may be a viable way to parse the RequestMap without delegating to Java, and in fact the representation is rather similar to what I built in Java using my DDF library. (The whole abstraction is somewhat similar to DDF in fact.)

For now, it suffices to say that we need to identify what the settings needed are and their general structure before settling on an approach, but this Boost option looks like a good fit simply due to flexibility.

Logging

Currently we use a fork of an old C++ logging library. The elimination of dependencies, and the fact that out-of-band logging streams from within a web server have never worked well, demand we drop this in favor of "native" options. Logging is of course non-portable, and some web servers do their own logging.

We will need our own logging API with implementations for:

  • Windows Event logging (which may unfortunately necessitate an accompanying event DLL; it might be worth trying to embed that in the module DLL)

  • syslog (on all but Windows)

  • Apache’s error logging (for Apache only of course)

I could envision wanting some tuning of logging levels (likely not at the level of individual categories), and possibly support for writing to multiple logging sinks, but we will NOT provide our own full implementation with customizable appenders, file rotation, etc.

Critically, the bulk of the important material will be on the Java side, and all of the audit logging will be there.

DDF

The DDF abstraction is discussed elsewhere. This is the core remoting layer for issuing requests to the Java hub (as the equivalent layer now connects to shibd). It is mostly implemented at this point in the form needed, with a few outstanding questions. The current serialization support is limited to a record-oriented format that should be suitable for C++-based agents, and would be a usable representation of data stored in files (e.g., the session cache). JSON serialization will be needed but likely only for future agents implemented in other languages and in Java, so is out of scope for this work.

HTTP Transport to Java Hub

The current SP relies on a basic socket protocol to read and write serialized DDF objects between agents and shibd. That was originally the plan for the new version but it became obvious that using HTTP alone had a lot of advantages, so the plan shifted to using a simple HTTP 1.1 transport wrapper to pass in and out bodies of serialized DDF objects. There’s no benefit to using HTTP 2 or QUIC here, though it probably wouldn’t be precluded on the hub end of things.

Using HTTP implies support for TLS, which was originally going to be handled via stunnel, but should be much better for deployers if supported natively. This will not be the “TLS as implemented nowhere else” that the SP today uses for SAML exchanges, but rather the primitive approach supported by most HTTP clients in which a set of static trust anchors are applied using simple APIs. Client TLS authentication could be an option but isn’t expected to be a primary tool due to the complexity of key management.

Portability demands that we build an interface to this function abstractly, and then implement it on each platform using the most obvious plumbing. On Windows that would likely be whatever the "best" native client is now (it used to be WinHTTP; I don't know if it's still the best choice). Everywhere else the best option is probably libcurl, which we are very familiar with.

The current SP includes an abstraction geared to SOAP, with a “transport” class implemented on top of libcurl. A simpler version of it should serve as a good basis for this work.

HTTP Request/Response Abstraction

The second HTTP-related component is to model requests and responses (think most web application frameworks, like the servlet API in Java) generically. This uncouples components that need to read requests or manipulate responses from the specific agent environment, which will have a proprietary API used to connect to the surrounding context. There will be less code needing this abstraction than there is now, but still a non-trivial amount of it, including the handler framework that ties the agent to SSO protocols and potentially the session cache.

The current interfaces in xmltooling are likely close to the form they will take here.

Session Cache

The session layer is the most complicated portion of these kinds of agents, and is needed for most “low-level” agent implementations. The cost of insulating applications from identity is providing this kind of session implementation.

The current design is horrendous for a few reasons. One is that it supports SAML's logout requirements, which stipulate a sort of reverse indexing of information back to sessions. The other big complication is combining the in-process caching done by the modules with the storage-backed caching done by shibd in a single component, so there are lots of conditional code blocks and inter-process hand-offs within the code. It also handles SAML data natively within sessions and manipulates cookies directly to manage them. Most of that complexity needs to be removed.

We will need a newer Session interface, much simpler than the current one, that largely sticks with policy timestamps, a bag of attributes, and some opaque data supplied by the hub. Reading and writing the DDF returned by the hub should be sufficient to represent sessions. The SessionCache interface is probably not too far removed from the current one, hopefully with some minor improvements.

Some requirements:

  • Sessions need to be buffered in memory with a cleanup/ejection policy to limit size.

  • A pluggable interface to store sessions persistently will be needed. The proposed implementation initially is to use write-once, read-many files to cache them, allowing for cleanup based on file creation time. If the interface to the cache allows access to the HTTP abstraction as it does now, then a chunked cookie implementation using the DataSealer operations in the hub to encrypt/decrypt data would be a possibility. Another option might be shared memory, which seems to be somewhat common in other modules of this type. Still another is remote use of the Java hub’s StorageService support, allowing for a database, memcache, etc.

  • Timeout enforcement should be coarse-grained instead of enforced on every request to limit the need to adjust a per-session tracking value that might be expensive to update.

In theory, sessions currently can be updated post-creation, but I would like to avoid this requirement in the future.

Handler Framework

The current implementation supports a pluggable map of paths below a special handler prefix (typically /Shibboleth.sso) to functional modules that implement something at a URL. They are essentially just virtual URL logic extensions that tend to be similar to native mechanisms in some web servers, but for portability were implemented generically. Mounting them below a fixed prefix helped optimize the process of determining whether to route a request into the handler layer or not. Historically this was also related to a mechanism within IIS in particular for mapping requests to a DLL, which is no longer relevant for newer versions or for Apache.

It may be possible to “flex away” from requiring a fixed prefix for handler URLs, though for compatibility it is expected that deployments will end up re-using the existing paths.

As now, there will be different types of handlers implementing different functions. Some of the existing handlers don't actually involve much business logic within the agents; rather, they remote everything to shibd, and this will map fairly well to remoting a lot to Java.

At least three specialized handler types will exist (as in the current code), but in protocol-neutral form; aside from compatibility considerations, the intent is that all protocols can start and finish at a unified set of locations rather than protocol-specific ones. That is, it will be invisible to an agent whether SAML, OpenID Connect, or something else is involved, so all protocols could process responses at a fixed endpoint (most SSO protocols rely on advance distribution of endpoints for validation).

Session Initiators

Session initiators are handlers that generate requests to create a new session. Unlike now, a single session initiator should be built to handle any protocol supported by the hub. There will be a bundle of request settings fetched from the RequestMap and other sources, as now, but instead of using them to generate requests, they will be packaged into the request to the hub to be handled. The response from the hub will be a tunnelled HTTP response to relay back to the client.

The SP currently supports a number of "fancier" session initiators that relate to IdP discovery. It is TBD exactly how discovery fits in, but it's possible one or two similar ones may be carried over, particularly in light of the EDS software we provide, which probably isn't going to be deprecated yet. Configuring "chains" of initiators was something mostly hidden in newer versions of the SP, and it may be tricky to configure without XML, but this is an open question right now.

Initiators also have to run in a special "in-request" mode that allows their work to be performed mid-request for resources that enforce session requirements. This also introduces a requirement to implement POST form preservation, as in the current implementation; this is going to be remoted to Java for storage.

Token Consumers

The generic term in the new design for what SAML calls an assertion consumer service is "token consumer". All supported protocols will be handled by a single implementation of this concept with a fixed set of behavior that results in a session being created (subject to various extra features we may continue to support, such as the "sessionHook" concept).

Token consumers will need to perform POST data recovery as they do now.

Logout Handler(s)

Logout hasn't been looked at heavily in the context of the new design yet, but there will certainly be one, if not more, logout handlers needed. As with the rest, the requirement is for protocol neutrality/unawareness, so the intent is to relax a lot of the fancier requirements in these protocols and streamline the behavior down to "clear session found in client's cookie" while punting subsequent behavior to the hub to manage. If we do back-channel logout at all, which I don't favor, it would be in the form of some kind of storage-backed revocation process that would be enforced with some kind of polling strategy.

Ideally, I would like to combine the current “local vs. full” logout distinction into one handler since all the protocol aspects would be shunted to the hub anyway. This is all very undercooked at this point.

Other Handlers

Some of the miscellaneous handlers in the current code probably will disappear, but a number will likely remain. These are generally just doing various things like status reporting, session debugging, etc. The DiscoFeed handler is probably something we can shunt over to Java, but security/DOS concerns might necessitate keeping a front-end to that local to the agents.

Portable Authorization

Apache has its own authorization framework (particularly in 2.4) but IIS doesn't. We built an XML-based syntax to supplement those mechanisms, supporting static enforcement of rules against attributes and various built-in variables. It currently hangs off the RequestMap structure but was also pluggable in theory. We would likely drop the pluggability but support the XML syntax.

One complication is that the current code supports reloadable use of external files to house rules, which will be additional code to keep/support.

Web Server Bridges

Each web server hosting an agent generally provides an API to interface to the server’s request and response handling, and the various modules today house classes that bridge from those APIs to the shared logic in the SP’s library that does most of the work in a more generic way. The bridges let that shared code read or set headers, issue responses or redirects, process request bodies, etc. That design would be expected to stay and much of that code probably will remain the same.

Most of the complexity there probably derives from the so-called "spoof-checking" logic that activates when populating headers is enabled.

Native Modules

The final built artifact in the new design is intended to be a single shared library module for the supported servers (or pair of executables in the case of FastCGI). The only real difference in terms of “size” would be the FastCGI case since the lack of an underlying library for shared code means both executables get big, but I’m not inclined to optimize around FastCGI for delivery. It’s a huge advantage to package a single file.

If other considerations lead to requiring additional binaries, we may revisit and continue to build a library to use the code that’s shared.

In practice the main piece that’s inside these modules today is the web server bridging logic and native configuration/startup/shutdown layers required for Apache and IIS.
