Date: Thu, 28 Mar 2024 10:44:03 +0000 (UTC) Message-ID: <1017843404.153.1711622643488@09f95ff058ba> Subject: Exported From Confluence MIME-Version: 1.0 Content-Type: multipart/related; boundary="----=_Part_152_1380288330.1711622643488" ------=_Part_152_1380288330.1711622643488 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Content-Location: file:///C:/exported.html
The system provides ample logging to diagnose errors, and your first res= ponsibility is to review that information in any situation. You should also= be familiar with all the logs involved, and what their function is, to und= erstand where to look.
The core logs consist of:
The IdP process log, usually in idp.home/logs/idp-process.log (= warnings and errors will be mirrored to idp.home/logs/idp-warn.log= ), is the primary troubleshooting resource for most problems. It should be = the first place you look.
A servlet container log from Tomcat, Jetty, etc. This is usually NOT all= that useful unless the IdP is actually refusing to initialize and start up= at all, but may be important in those cases. Note that if the Status page is viewable, that likely means the IdP started properly.<= /p>
A web server log in the case that you are proxying traffic from Apache o= r the like. This is also probably not involved unless the problem is very l= ow-level and involves connectivity to the system.
Web access logs, which track requests to the web server. These will be w= herever traffic is handled (servlet container, Apache, etc.) and help with = understanding what clients are asking for. This can help diagnose issues wi= th the advertised locations for IdP functionality.
Back-end logs of authentication and attribute sources may also be necess= ary in some cases, but most of the time the IdP should provide enough in it= s own log to indicate what the problems are.
There are two known reasons for experiencing unusually slow startup: usi= ng Tomcat in certain ways and a blocking PRNG. Taken in order:
If you're using Tomcat, this issue is discussed on the (old) prep page along with some workarounds, primarily forcing it= to expand the warfile, counter to the usual advice we give with Jetty.
With all containers, it's not uncommon for startup to be delayed sometim= es several minutes because Java defaults to using /dev/random as a source of randomness, which is a blocking device. The most common = workaround for this is to switch to /dev/urandom instead, = which can be done by passing a system property to Java during startup (e.g.= on Jetty via start.ini):
-Djava.= security.egd=3Dfile:/dev/urandom
Traditionally this has been viewed as less safe. There are alternate views of this. The Shibboleth Project does not take an of= ficial position on this question.
This is a human translation of=
the EndpointResolutionFailed event in the IdP that triggers when the check=
between the AssertionConsumerServiceURL
in an SP's request is not in the SP's meta=
data, so the IdP fails the request in accordance with the standard's requir=
ement to validate the response location.
Most SAML SPs, and certainly m=
ost or all Shibboleth SPs, will include a full AssertionC=
onsumerServiceURL
attribut=
e in their <AuthnRequest>
message to the IdP. The value of the URL in a Shibb=
oleth SP is determined by the computed request URL that led to the issuance=
of the request and is primarily a function of web server configuration (on=
Apache) or the SP's <
ISAPI
=
>
site mappin=
g configuration (on IIS).
Whatever its set to, that URL =
generally has to be one of the <AssertionConsumerService> endpoints in the SP's me=
tadata used by the IdP or this error message is returned by default in resp=
onse to the error and the IdP log records the problem.
As an IdP operator you either = have bad metadata, or you have a broken SP, and unless you manage the metad= ata yourself (likely in the case of a commercial SP), it's not your problem= to solve. Only the SP operator knows whether the URL is valid, so you may = have to update the metadata or they will have to stop generating requests i= ncluding the bad URL. The cause of a bad URL is generally a failure to prop= erly configure a web server running the SP to account for local virtualizat= ion, load balancing, etc. It could also simply be a failure to update metad= ata to reflect a change that the SP deployer has made.
If you see this in the log, usually accompanied by other problems and re= ports of users unable to login successfully, you are experiencing a problem= somewhere between the IdP and some back-end system you have configured for= authentication, attribute resolution, or some other kind of remotely acces= sed functionality. This layer isn't properly timing out fast enough to prev= ent users from replaying requests, smashing form buttons, etc., and the res= ult is clients stacking up requests that are hung up until the limit on sim= ultaneous requests from one client is reached. Eventually new requests time= out trying to get locks to add new requests to the queue and this starts ge= tting logged.
You can't fix this other than by determining which aspect of your deploy= ment is hung up and adjusting it to either behave properly or at least fail= quickly enough to produce errors and get transactions completed. It may no= t work, but it shouldn't hang.
The XXXX above might be "NameIdentifierGeneration" or "RelyingPartyResol= ver" or "AttributeResolver", or any of the other service names. When you se= e this message while debugging something that's not working, you are missin= g the real problem and focusing on the aftermath.
The log is not just the last message you see; you have = to look at all of it, particularly during startup or reloading a new config= uration for a particular part of the system. The real error is that you've = broken a part of the configuration in a particularly significant way, usual= ly with a syntax error of some sort, and you need to look at the error that= describes that problem to make any progress.
Sometimes that will be a fairly complex Spring message because of a prob= lem creating Java objects, but occasionally it will be fairly obvious what = the error is if the problem mentions an XML syntax error or a property name= that turns out to be misspelled. Regardless, that's the real problem and y= ou should ignore whatever comes after it until you resolve the original iss= ue and get the configuration of the affected service at least loading again= .
If you're really confused, try changing the relevant "failfast" property= in the services.properties file to force the IdP to fail at start= up. At least then you'll know where the real issue is.
The first time the status page is accessed after starting the IdP, the f= ollowing log message will be seen:
ERROR [= org.apache.velocity:96] - ResourceManager : unable to find resource 'status= .vm' in any resource loader.
Although this is presented by the Velocity library as an ERRO=
R
level message, this should be ignored unless you have customised t=
he status page yourself. In fact, the default status page is implemented as=
a JSP and not as a Velocity template, and the message merely indicates tha=
t the fallback from Velocity to JSP is occurring for the status view the fi=
rst time it is invoked. This message is filtered in the logging configurati=
on in newer IdP versions.
If the default login UI is changed from a Velocity template to a JSP, th= e following error will be seen at least once:
ERROR [= org.apache.velocity:96] - ResourceManager : unable to find resource 'login.= vm' in any resource loader.
Although this is presented by the Velocity library as an ERRO=
R
level message, this can be ignored. The message merely indica=
tes that the fallback from Velocity to JSP is occurring. You may wish to fi=
lter it with your logging configuration (it's filtered out in the default l=
ogging file now).
IdPs that consume metadata using the FileBackedHTTPMetadataProvide=
r
regularly download metadata from an external metadata source. The =
IdP has no control over the size of that metadata file, and it may have suf=
ficiently increased in size from one download to the next that the IdP cann=
ot process the later file. Since the IdP stores the previous file, it conti=
nues to use that until the metadata it contains expires (perhaps a couple o=
f weeks after the initial out-of-memory error) at which point the IdP stops=
recognizing SPs. Re-starting the IdP temporarily resolves the issue becaus=
e it allows the IdP to download new metadata until the next time it runs ou=
t of memory.
Symptoms include the idp-warn.log showing entries like this:
2017-01= -16 14:02:16,700 - DEBUG [org.opensaml.saml.metadata.resolver.impl.Abstract= ReloadingMetadataResolver:380] - Metadata Resolver FileBackedHTTPMetadataRe= solver UKfederationMetadata: Unmarshalling metadata from 'http://metadata.u= kfederation.org.uk/ukfederation-metadata.xml' 2017-01-16 14:02:22,323 - ERROR [net.shibboleth.utilities.java.support.serv= ice.AbstractReloadableService:188] - Service 'shibboleth.MetadataResolverSe= rvice': Unexpected error during initial load {} org.springframework.beans.f= actory.BeanCreationException: Error creating bean with name 'ShibbolethMeta= data': Cannot create inner bean
Please see the Installation pages for your parti= cular deployment to determine how to set an appropriate Java heap size, and= what that size should be, or (much more sensiibly), just switch to the DynamicHTTPMetadataProvider via an MDQ = service hosted by your federation.
The default logging provided in logback.xml is for a truncated informati=
on from exceptions. Particularly when debugging a development system =
it may be helpful to turn up the amount of information in the log. This page has more details but replacing this line in&nbs=
p;logback.
<Patt= ern>%date{ISO8601} - %level [%logger:%line] - %msg%n%ex{short}</Patte= rn>
with
<Patt= ern>%date{ISO8601} - %level [%logger:%line] - %msg%n%ex{full}</Patter= n>
Can often provide a useful hint of where the fault has come from. For in= stance:
Caused = by: java.lang.IllegalArgumentException: name =20 at sun.misc.URLClassPath$Loader.findResource(Unknown Source) =20 at sun.misc.URLClassPath.findResource(Unknown Source) =20 at java.net.URLClassLoader$2.run(Unknown Source) =20 at java.net.URLClassLoader$2.run(Unknown Source) =20 at java.security.AccessController.doPrivileged(Native Method) =20 at java.net.URLClassLoader.findResource(Unknown Source) =20 at org.eclipse.jetty.webapp.WebAppClassLoader.getResource(WebAppClassLoader= .java:402) =20 at java.net.URLClassLoader.getResourceAsStream(Unknown Source) =20 at org.springframework.core.io.ClassPathResource.getInputStream(ClassPathRe= source.java:166) =20 at net.shibboleth.idp.profile.spring.factory.BasicX509CredentialFactoryBean= .getCertificates(BasicX509CredentialFactoryBean.java:177)
is more helpful when debugging than
nested = exception is org.springframework.beans.factory.BeanCreationException: Error= creating bean with name '(inner bean)#6a98f353': Cannot create inner bean = '(inner bean)#3b35798' of type [net.shibboleth.idp.profile.spring.factory.B= asicX509CredentialFactoryBean] while setting bean property 'trustCredential= ';=20 nested exception is org.springframework.beans.factory.BeanCreationException= : Error creating bean with name '(inner bean)#3b35798': Invocation of init = method failed;=20 nested exception is java.lang.IllegalArgumentException: name