SP Service Architecture

DRAFT

Overview

My current thinking has shifted away from envisioning the SP service as a stand-alone application and back toward strongly coupling it with the IdP as a plugin. There are a few reasons for this:

  • It facilitates packaging and delivery of the software and reinforces the IdP as our core product.

  • It makes direct reuse of the IdP’s code directly possible. The refactoring we did is in some sense unforunately, but having done it, this direction makes refactoring a choice rather than an obligation in the future.

  • It makes use of Spring Web Flow as an implementation tool much simpler.

  • It allows us to expose a REST API for agents more conveniently.

  • It shifts the focus to agents that will have HTTP client capability, which is likely the majority of them anyway. Remoting HTTP calls via the DDF protocol becomes the special case, even if it turns out to be required for the most common agent today, Apache.

  • It obviates the need to construct an entirely new set of components to manage “message” addressing and delivery since the SWF layer can already handle that for us.

One difference between the SP and IdP is the notion of muliple instances of the metadata and attribute-related services, something the IdP supports in theory but not so much in practice. While most (and preferably nearly all) SPs define default configurations for metadata and attribute handling that are reused even when application overrides are used, there are deployers that insist on isolating these configurations. Right now, we’re leaning toward not explicitly pushing towards that kind of isolation, and instead encouraging a single instance of each service, with behavior tailored through activation conditions that would trigger around which application was active. This should be simpler to configure and more efficient most of the time.

Applications

Applications probably will remain the core abstractioon around which the SP exposes configuration of protocol behavior. To achieve the separation that the “containment” relationship provides now between Applications and components like metadata and attribute handling, the various reloadable services for these functions can be injected into the Application objects with Spring. Applicatons would then be free to reuse (or not) common instances/configurations of these services as required, but normally would do so.

As it stands now, it’s TBD whether a given instance of this service should host only Applications deployed together in a virtual web environment (i.e., not necessarily one physical server, but one logical server). I lean toward “no” and opening this up to simply viewing an Application as the top level concept without regard for where it lives. It’s not clear yet whether some of the logical URL details that are needed for things like endpoint computation and destination enforcement will actually require that the Java service be aware of these mappings. It’s plausible that any URL details might be fed in from the agent and simply trusted by the service as it goes about its work. Of course, sharing many different Applications inside one service instance obviously has implications for performance/scale, and particularly for security.

Addressing Applications

Assuming we lump “many” SP’s Applications into one service instance, identifying them starts to overlap with the problem of porting over the configuration and how entityIDs or client_ids are assigned to systems. Even if we wanted to make that assignment purely a service-side issue, it wouldn’t allow the service to cleanly identify the agents in a shared scenario, and there’s no identifier for the Applications that would be unique in the current model. An obvious thing to do would be to combine the current entityID and applicationId together to build a unique value, but the SP’s multi-hosting features might make that a challenge if there are e.g. entityIDs being calculated based on vhosts. It’s plausible that at least for compatibility with the old configuration we would just use the “default” entityID to name the agent, and come up with something more abstracted (agent ID?) if a new, simpler configuration were used.

Security

The lack of security in the current remoting layer was always a very deliberate choice because security is hard, true networking greatly reduces performance, and it adds dependencies. The redesign’s focus is on eliminating all dependencies, so adding security formally isn’t likely in the cards. However, we envision adding TLS support to the TCP gateway, and the use of stunnel is a plausible solution to secure the traffic if desired, which would be critical if different agents share an instance of the service. Direct REST access would be TLS-protected of course.

Additionally, it’s possible that some kind of simple key or secret authorization strategy could be employed to limit the agents that can send requests targeted at a particular Application.

Agent / Service Interaction

We can’t strictly require the service to remember state about the agents. That is, it’s not going to fly to have the agents connect to the service to supply information about themselves and then “remember” that information so it can be applied later. That won’t work without persistence on the service we don’t want (it starts to look like OpenID Dynamic Client Registration, no thank you) or without creating problems if the service restarts, and the agent doesn’t realize it has to re-initialize things. We could build in some kind of retry model where a remote call fails and signals that initialization is required, but it would be preferable if we can break apart enough of the settings the agent might need to supply to the service so that the ones that matter for a request are just supplied with the request. If the overhead gets too severe, we can consider the retry model.

Regardless, the agent likely will be stateful with respect to the service in some sense, by requiring the agent to initially connect to the service to both verify it’s available and ensure it is recognized/authorized by the service (possibly minimally so, i.e., it can connect so it’s accepted). At the same time, this provides the obvious hook to feed in the legacy configuration, or a new XML format surrounding primarily the RequestMapper, and have the service parse it and return the processed results for the agent. This is clearly the first obvious “not a handler” remoted operation.

Many of the other “not a handler” operations will probably be centered on session handling, asking to recover a session. Plausibly, a request to the agent that does not self-identify a session already in the agent’s local cache (whatever that entails) would lead to a request to either return a session, or (if a flag signaled that one was required) ask the service to respond with a new login request message (i.e., return the HTTP response body or redirect to make one happen). That implies that any configuration settings influencing the generation of that request would have to be known by the service or provided by the agent with the request. That kind of hurts, since it turns a simple “get session” call into a much larger message “just in case” but it seems unavoidable because the SP definitely was steered toward using content settings to influence login requests, but ideally the overhead won’t be too bad and could, I suppose, even be optional. Some kind of “I’m using advanced features like this” flag could change the agent’s behavior in some way.