Note

DRAFT - In Progress


Background

To better inform discussion about a next-generation SP design, this is a high-level summary of the current design, its components, and some of the key decisions that would have to be made if we go at this again.

...

The configuration object is largely a big "scaffold" built by a couple of large, complex classes that store off a lot of information and build lots of plugins that do other work and hang them off the configuration class. It's basically the outside-in approach that dependency injection avoids in other systems. The noteworthy thing is that once this is built, apart from reloading changed configuration, the object tree is (with one recent exception) fully "processed" up front rather than constantly parsed in real time.
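As an illustration of that "fully processed up front" model, here is a hypothetical Java sketch (all names invented, not actual SP code): the whole tree is parsed and built once at (re)load time, and request handling only reads the immutable result rather than re-parsing anything.

```java
import java.util.Map;

// Hypothetical sketch of a configuration object that is fully "processed"
// at load time; request-time callers only read the prebuilt, immutable tree.
public final class ProcessedConfig {
    private final Map<String, String> settings;

    private ProcessedConfig(Map<String, String> settings) {
        this.settings = settings;
    }

    // All parsing and plugin construction would happen here, once,
    // not on every request. (Real code would parse XML, build plugins, etc.)
    public static ProcessedConfig load(Map<String, String> rawSettings) {
        return new ProcessedConfig(Map.copyOf(rawSettings));
    }

    // Request-time lookups are simple reads against the prebuilt result.
    public String setting(String name, String defaultValue) {
        return settings.getOrDefault(name, defaultValue);
    }

    public static void main(String[] args) {
        ProcessedConfig cfg = ProcessedConfig.load(
            Map.of("entityID", "https://sp.example.org/shibboleth"));
        System.out.println(cfg.setting("entityID", "(unset)"));
    }
}
```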

...

  • AssertionLookup – this was a back channel hook to pass the full SAML assertion into applications that wanted it. Not clear to me this would be worth keeping.

  • DiscoveryFeed – generates the discovery feed; this would clearly go away, though it might have to migrate into the hub in some form if we intend to maintain the EDS.

  • AttributeChecker – basically a pre-session authorization tool; it would probably need to stay in some form.

  • ExternalAuth – this was a backdoor trick for creating sessions; I doubt we'd keep it, and doing so would take substantial offloading.

  • MetadataGenerator – gone, obviously, but probably replaced by something else, possibly somewhere else

  • SAML 2 ArtifactResolution – this is for resolving outbound-issued artifacts; I can't imagine we'd bother keeping it, offloaded or otherwise, but we could.

  • SAML 2 NameIDMgmt – if we kept this, it would probably need to morph into a more generic idea of updating active session data via external input.

  • SessionHandler – debugging, still needed

  • StatusHandler – debugging, still needed I imagine

  • AttributeResolver – this was a trick to do post-session acquisition of data using the SP AttributeResolver/etc. machinery and dump it as JSON; if we kept this, it would obviously have to be offloaded, and we'd likely have to discuss the actual requirements with someone.

...

This is the heart of a lot of the tough decisions, in particular whether the actual non-localized session cache should migrate to the hub service. I doubt that's really practical if the hub is a shared service: it's likely too much state in a lot of cases, and it just means the hub has the same state problem to address that the IdP does now. The IdP's solution is HTML local storage; that might work here, but it's tricky because it requires JavaScript to get at it, and the SP web server would have to mediate that (the hub is not a proxy, because we already have proxies). I punted on trying that with the SP to this point and stuck to cookies because I thought the size was manageable. I'm not sure it actually is, but it may be less additional work to figure out a way to get at HTML local storage than other options, or to start with cookies and see how it goes.

The one advantage of moving the second-level cache to the hub would be the potential to implement other storage options in Java rather than C++, which would open up a lot more options. A possible hybrid would be to implement session cache store operations remotely but not implement a real cache in the hub itself: the hub would just pass data through between storage back-ends and the SP, essentially relaying between a REST API we design and the Java libraries that talk to Hazelcast or what have you.
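The passthrough hybrid could look roughly like this minimal Java sketch. All names here are invented for illustration (this is not an existing Shibboleth or Hazelcast API): the hub holds no state of its own and relays each store operation to a pluggable back-end, which a Hazelcast map or similar could satisfy.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical "passthrough" hub component: exposes session store operations
// to agents but keeps no cache itself, delegating every call to a back-end.
public final class SessionStoreRelay {
    // Contract a real back-end (e.g. a Hazelcast IMap adapter) would implement.
    public interface BackEnd {
        void put(String key, String value);
        Optional<String> get(String key);
        void remove(String key);
    }

    private final BackEnd backEnd;

    public SessionStoreRelay(BackEnd backEnd) { this.backEnd = backEnd; }

    // Each REST-level operation is a straight relay: no local state in the hub.
    public void create(String sessionId, String encryptedBlob) { backEnd.put(sessionId, encryptedBlob); }
    public Optional<String> read(String sessionId)             { return backEnd.get(sessionId); }
    public void destroy(String sessionId)                      { backEnd.remove(sessionId); }

    // In-memory stand-in for a real back-end, for demonstration only.
    public static BackEnd inMemory() {
        ConcurrentHashMap<String, String> m = new ConcurrentHashMap<>();
        return new BackEnd() {
            public void put(String k, String v)     { m.put(k, v); }
            public Optional<String> get(String k)   { return Optional.ofNullable(m.get(k)); }
            public void remove(String k)            { m.remove(k); }
        };
    }

    public static void main(String[] args) {
        SessionStoreRelay relay = new SessionStoreRelay(inMemory());
        relay.create("s1", "encrypted-session-blob");
        System.out.println(relay.read("s1").orElse("(missing)"));
    }
}
```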

OTOH, a more local hub design analogous to "shibd" can clearly continue to provide a session store as it does now, with the added benefit that additional storage options would be in Java. Furthermore, a revamped model of updating session use only infrequently, in exchange for less exact timeouts, would greatly reduce the traffic, making remote deployment of a hub for a cluster of agents much more likely to scale.
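The "update session use infrequently" idea can be sketched as a simple tolerance check (names invented, purely illustrative): the last-use timestamp is only written back to the store when it has drifted by more than some tolerance, trading exact idle timeouts for far fewer writes.

```java
// Hypothetical throttled activity tracker: most requests skip the store
// write entirely, so idle timeouts become inexact by up to the tolerance.
public final class LazyActivityTracker {
    private final long toleranceMillis;
    private long storedLastUse;   // what the (possibly remote) store believes
    private int writeCount;

    public LazyActivityTracker(long toleranceMillis, long now) {
        this.toleranceMillis = toleranceMillis;
        this.storedLastUse = now;
    }

    // Called on every request; returns true only when a store write happened.
    public boolean touch(long now) {
        if (now - storedLastUse < toleranceMillis) {
            return false;          // within tolerance: skip the write
        }
        storedLastUse = now;       // real code would update the back-end here
        writeCount++;
        return true;
    }

    public int writes() { return writeCount; }

    public static void main(String[] args) {
        LazyActivityTracker t = new LazyActivityTracker(60_000, 0);
        t.touch(1_000);            // skipped
        t.touch(61_000);           // written
        System.out.println("writes: " + t.writes());
    }
}
```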

All that said, we may need to revisit the idea of a simple file-system-based cache. Lots of Apache modules do this today, and people have indicated to me that it scales. If that's the case, we may want to consider starting with that and bypassing shibd entirely except for stateless processing. The files could just be the encrypted content that would otherwise be in cookies, and creative use of stat and touch on file timestamps could do a lot of things. But that might not be portable to Windows either.
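The "stat and touch" trick could work roughly as in this sketch (names invented; shown in Java only for consistency with the other examples, though the real SP side would be C++): each session is a file whose content is the encrypted blob, and whose modification time doubles as the last-use record. Whether these timestamp semantics hold up on Windows is exactly the portability question raised above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Instant;

// Hypothetical file-system session cache: one file per session, content is
// the encrypted blob, mtime is the activity record ("stat and touch").
public final class FileSessionCache {
    private final Path dir;

    public FileSessionCache(Path dir) {
        try {
            this.dir = Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public void put(String sessionId, byte[] encryptedBlob) {
        try {
            Files.write(dir.resolve(sessionId), encryptedBlob);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the blob and "touches" the file to record activity.
    public byte[] getAndTouch(String sessionId) {
        try {
            Path p = dir.resolve(sessionId);
            byte[] blob = Files.readAllBytes(p);
            Files.setLastModifiedTime(p, FileTime.from(Instant.now()));
            return blob;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Idle timeout enforced purely from the file timestamp (the "stat").
    public boolean isExpired(String sessionId, long idleMillis) {
        try {
            FileTime t = Files.getLastModifiedTime(dir.resolve(sessionId));
            return Instant.now().toEpochMilli() - t.toMillis() > idleMillis;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        FileSessionCache c = new FileSessionCache(
            Path.of(System.getProperty("java.io.tmpdir"), "fsc-demo"));
        c.put("s1", "encrypted-session-blob".getBytes());
        System.out.println(new String(c.getAndTouch("s1")));
    }
}
```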

Library Content

This summarizes the general functionality from each dependent library and its likely future.

...

This is discussed pretty thoroughly in ApplicationModel, but is basically a way of layering segregation of content on top of an undifferentiated web document space. It's tied heavily into the Request Mapper component (which maps requests to the proper application ID). Since most of the settings in this layer are really more SAML-level details, the implication is that maintaining this feature would require that the hub know about it intimately to be able to segregate its own configuration in the same way.
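A toy illustration of what the Request Mapper does, with an invented mapping table and application IDs (not real SP configuration): incoming request paths map to an application ID by longest-prefix match, falling back to "default".

```java
import java.util.Comparator;
import java.util.Map;

// Hypothetical longest-prefix request-to-application mapping, the core idea
// behind the Request Mapper component described above.
public final class RequestMapper {
    private static final Map<String, String> PREFIXES = Map.of(
        "/secure/admin", "admin",
        "/secure", "default",     // nested paths fall through to outer mapping
        "/", "default"
    );

    public static String applicationId(String path) {
        return PREFIXES.entrySet().stream()
            .filter(e -> path.startsWith(e.getKey()))
            .max(Comparator.comparingInt(e -> e.getKey().length()))
            .map(Map.Entry::getValue)
            .orElse("default");
    }

    public static void main(String[] args) {
        System.out.println(applicationId("/secure/admin/console"));
    }
}
```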

...