LoopDetection

A frequently encountered problem when setting up SAML SPs, perhaps especially Shibboleth SPs, is a failure mode that results in more or less infinite looping between the SP and IdP. It arises when a session cannot be established with the SP for a URL that requires one: every request to the URL issues a login request to the IdP, followed by a failure of some kind, and then a repeat of the login request, ad nauseam. The SP documentation has a topic about this.

From the IdP perspective, there's little or nothing that can be done to stop this without imposing additional requirements to try to limit traffic. In most cases, it's relatively harmless: if it's a real setup mistake, it has to be corrected before the service can function at all. Unfortunately, other cases crop up that are more "invisible" to users and SP operators because they only happen inside inactive tabs or hidden frames, often as a result of session timeouts. Left to their own devices, these loops can run for hours, greatly distorting log metrics and adding IdP overhead.

The usual way to deal with this is with web server throttling of some sort, but this can be somewhat tricky to configure because it's generally operating at layer 4 and is based on traffic seen from a particular address, which can be very misleading in proxied/NAT'd networks. It's possible to disrupt legitimate traffic to heavily used services. It's also harder to do with many non-Apache server environments.

The IdP includes some code to support a more experimental approach that operates entirely within the IdP itself. It uses the new "warning" interceptor flow combined with a built-in Predicate that installs a Meter into the Metrics Registry to track the number of requests for a given service and username, and checks whether the number seen within the last minute exceeds a threshold.

Exceeding the threshold triggers a customized warning view that can ask the user if they're actually present and want to continue. Many users might say yes without understanding that continuing is unlikely to resolve the problem, so the view also doubles as a potential means of directing them to the right support resources. For invisible loops, which are really the problem, it breaks the loop cleanly.

It should be noted that there is clearly some memory overhead, of as yet unknown size, in installing a Meter per service and user, and the intention is that this be enabled only for services known to offend frequently enough to make that worth it. To help with this, the Predicate class is configured with a map whose keys are the SP entityIDs/names that should be monitored and whose values name the Meter. This provides a "safe" label for the Meter instead of the entityID (and even allows for mapping multiple services to one Meter, though that seems unlikely to make sense). Of course, the "warning" flow itself also has to be enabled for the services to monitor, though enabling it for unmapped services will do nothing.
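As a rough illustration of enabling the "warning" flow for one monitored service, the fragment below sketches a relying party override of the kind typically placed in conf/relying-party.xml. It is a sketch under assumptions rather than a definitive configuration: the entityID is the same placeholder used in the example on this page, the bean id is hypothetical, and the fragment assumes the usual shibboleth.RelyingPartyOverrides list along with the c: and p: namespaces already declared in that file.

```xml
<!-- Fragment for conf/relying-party.xml (assumed location), not a complete file. -->
<util:list id="shibboleth.RelyingPartyOverrides">

    <!-- Hypothetical override: run the "warning" intercept flow after authentication
         for the one SP that is also listed in the loop detection relyingPartyMap. -->
    <bean id="custom.LoopMonitoredSP" parent="RelyingPartyByName"
            c:relyingPartyIds="https://sp.example.org/shibboleth">
        <property name="profileConfigurations">
            <list>
                <bean parent="SAML2.SSO" p:postAuthenticationFlows="warning" />
            </list>
        </property>
    </bean>

</util:list>
```

If the service already has an override, or relies on other post-authentication flows, the "warning" flow would need to be merged into that existing setting rather than added as a second override.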

In addition, the example demonstrates a second optimization, adding a precursor condition that only applies the loop detection logic for SSO-based requests (i.e., requests that report out as having not prompted the user for authentication, since loops inherently are an SSO-based phenomenon). The theory is that this will limit the creation of timer metrics for any users that are just “one and done” most of the time, which tends to be a higher proportion of usage than one might expect.

Configuring this is quite simple (note that, per the WarningInterceptConfiguration topic, you need to enable the relevant Module first). In the example below, a single service is monitored, and a view template must be created at views/intercept/loop-detected.vm for the warning to render. The example also applies a warning "duration" of zero, which signals that the warning should appear every time the condition is met rather than at most once per interval. The trigger is set by the threshold property, which defaults to 20 requests in a minute.

conf/intercept/warning-intercept-config.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:util="http://www.springframework.org/schema/util"
       xmlns:p="http://www.springframework.org/schema/p"
       xmlns:c="http://www.springframework.org/schema/c"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
                           http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
                           http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util.xsd"
       default-init-method="initialize"
       default-destroy-method="destroy">

    <!--
    The map keys are the names of view templates to render if the condition evaluates to true.
    The values are of type Pair<Predicate<ProfileRequestContext>,Duration>. The condition determines
    whether the warning is displayed, and the duration is the interval between warnings.
    -->
    <util:map id="shibboleth.warning.ConditionMap">
        <entry key="loop-detected">
            <bean parent="shibboleth.Pair">
                <constructor-arg index="0">
                    <bean parent="shibboleth.Conditions.AND" c:_0-ref="custom.OnlySSO">
                        <constructor-arg index="1">
                            <bean parent="shibboleth.Conditions.LoopDetection" p:threshold="20">
                                <property name="relyingPartyMap">
                                    <map>
                                        <entry key="https://sp.example.org/shibboleth" value="example" />
                                    </map>
                                </property>
                                <property name="usernameLookupStrategy">
                                    <bean parent="shibboleth.Functions.Compose"
                                        c:g-ref="shibboleth.PrincipalNameLookup.Subject"
                                        c:f-ref="shibboleth.ChildLookup.SubjectContext" />
                                </property>
                            </bean>
                        </constructor-arg>
                    </bean>
                </constructor-arg>
                <constructor-arg index="1">
                    <bean class="java.time.Duration" factory-method="parse" c:_0="PT0S" />
                </constructor-arg>
            </bean>
        </entry>
    </util:map>

    <!--
    This adapts an existing function that signals whether SSO applied, and adapts it into a Predicate.
    -->
    <bean id="custom.OnlySSO" class="net.shibboleth.shared.logic.PredicateSupport"
            factory-method="fromFunction" c:_1="false">
        <constructor-arg index="0">
            <bean parent="shibboleth.Functions.Compose">
                <constructor-arg name="g">
                    <bean class="net.shibboleth.idp.authn.context.navigate.PreviousResultLookupFunction" />
                </constructor-arg>
                <constructor-arg name="f">
                    <ref bean="shibboleth.ChildLookup.AuthenticationContext" />
                </constructor-arg>
            </bean>
        </constructor-arg>
    </bean>

</beans>
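As a variation on the loop detection condition in the example, the fragment below sketches how two services might be monitored, each with its own Meter label, using a lower threshold. It uses only the properties shown in the example (threshold, relyingPartyMap, usernameLookupStrategy); the second entityID, both labels, and the threshold value of 10 are hypothetical and chosen purely for illustration.

```xml
<!-- Fragment: would replace the shibboleth.Conditions.LoopDetection bean in the example. -->
<bean parent="shibboleth.Conditions.LoopDetection" p:threshold="10">
    <property name="relyingPartyMap">
        <map>
            <!-- Key: SP entityID to monitor. Value: "safe" label used to name the Meter. -->
            <entry key="https://sp.example.org/shibboleth" value="example" />
            <!-- Hypothetical second service with a distinct label. -->
            <entry key="https://sp2.example.org/shibboleth" value="example2" />
        </map>
    </property>
    <property name="usernameLookupStrategy">
        <bean parent="shibboleth.Functions.Compose"
            c:g-ref="shibboleth.PrincipalNameLookup.Subject"
            c:f-ref="shibboleth.ChildLookup.SubjectContext" />
    </property>
</bean>
```

Because a Meter is installed per service and user, each added entry increases the potential memory overhead, so the map is best kept to services that are known offenders.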

You may also wish to suppress the potentially large number of Meters from appearing via "standard" access to the full set of metrics, which can be done quickly by adding to conf/logback.xml:

<logger name="metrics.net.shibboleth.idp.loopDetection" level="OFF" />

This won't fully prevent access to the information, which would require adjusting access control for the relevant metrics via the MetricsConfiguration.