IdP Clustering Configuration

Actually, we fooled you: this page contains nothing about configuring the IdP to run in a clustered fashion. However, please read all of this page, because it provides critical information for setting up the environment into which you will deploy your clustered IdPs.

Terminology

Many individuals consider high availability, load balancing, and clustering to be the same thing. They are not. Within this document the terms have the following definitions:

...

Finally, be aware that failover and load balancing are completely outside the control of the IdP. They require some mechanism (two are described below) for routing traffic before it even reaches a node.

Application State and its Effect on Clustering

The IdP is a stateful application. That is, it maintains operational information between requests that is required to answer subsequent requests. The IdP keeps this information in memory (for various reasons, cookies cannot be used as they are in some web applications). Therefore, in order to achieve high availability, this information must be shared amongst all IdP nodes. By default, the Shibboleth team recommends (and documents) the use of Terracotta as the mechanism for doing this.

Like most applications that use this approach, each IdP node keeps the state it creates in memory in a form readily usable by that node, but uses a more compact form when making it available to other nodes. Therefore, any load balancing solution used should route all subsequent requests to the same node that serviced the initial request. This prevents the IdP nodes from constantly converting information to and from the more compact form (an expensive process). This is generally known as session affinity load balancing.
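To make the idea of session affinity concrete, here is a minimal sketch of a router that pins each session to the node that served its first request. It is only an illustration: the `StickyRouter` class, the node names, and the use of a plain session identifier as the key are all assumptions for the example, and a real load balancer would typically derive the key from something like a cookie or the client address.

```python
import itertools


class StickyRouter:
    """Toy session-affinity router (hypothetical, for illustration only)."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        # New sessions are handed out to nodes in rotation.
        self._rotation = itertools.cycle(self._nodes)
        # Remembers which node first served each session.
        self._assignments = {}

    def route(self, session_id):
        """Return the node for this session, assigning one on first contact."""
        if session_id not in self._assignments:
            self._assignments[session_id] = next(self._rotation)
        return self._assignments[session_id]


router = StickyRouter(["idp-node-1", "idp-node-2"])  # hypothetical node names
first = router.route("session-a")
second = router.route("session-b")
again = router.route("session-a")  # same node as `first`
```

Because a returning session always lands on the node that already holds its state in memory, that node never has to rebuild the state from the compact shared form.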

Two Common Clustering Approaches

Most environments use one of two methods for creating a cluster of nodes that look like one single service instance to the world at large.

DNS Round Robin

This is done by registering each cluster node under the same hostname. When a DNS lookup is performed for that hostname, the DNS server returns a list of IP addresses (one for each node) and the client chooses which one to contact.
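One way to see this from the client side is to ask the resolver for every address registered under a hostname. The sketch below uses Python's standard `socket.getaddrinfo` call; the helper names `unique_addresses` and `resolve_all` are made up for this example, and the default port of 443 is an assumption.

```python
import socket


def unique_addresses(addrinfo):
    """Collect distinct IP addresses from getaddrinfo-style result tuples."""
    seen = []
    for _family, _socktype, _proto, _canonname, sockaddr in addrinfo:
        ip = sockaddr[0]  # first element of sockaddr is the address string
        if ip not in seen:
            seen.append(ip)
    return seen


def resolve_all(hostname, port=443):
    """Return every address the resolver reports for hostname (needs DNS access)."""
    results = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return unique_addresses(results)
```

Calling `resolve_all` with the cluster's hostname would list one address per registered node. Which of those addresses a given client actually contacts is up to that client.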

...

The Shibboleth team strongly discourages this approach.

Hardware-Based Clustering

This is done by using dedicated hardware to intercept traffic and route it to the various nodes in a cluster (so the hardware basically becomes a switch in front of the nodes). This hardware is then given the hostnames for all the services provided by the clusters behind it.

...

Because of the guaranteed characteristics provided by this solution, the Shibboleth team recommends this approach. Take care, though, to ensure that the load balancing hardware does not itself become a single point of failure (i.e. buy and run two of them).

Configuring the IdP for Clustering

Now, go set up your cluster environment and then proceed to configure the IdP for Clustering.
