HttpClient connection pooling problems
Activity
I believe we’ll leave this open for a while pending more evidence, but I’m hopeful the https://shibboleth.atlassian.net/browse/JDUO-74 changes will end up fixing this.
I think we do actually have some response closure bugs here, so I’m moving this over to the Duo plugin project where the remaining code lives. I think this is just going to end up being a duplicate of the one I just filed.
As a sanity check, we’ll want to make sure that responses are handled consistently with our intention to close the objects out, and that the exception isn’t leaving connections orphaned and exhausting the pool. I hope it’s something simple like that, but it probably isn’t.
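To illustrate the kind of bug we’d be checking for, here is a minimal, library-free sketch. It is not the plugin’s actual code: `ToyPool` and `Lease` are hypothetical stand-ins for the HttpClient connection pool and a pooled response, and the 50 ms wait is an arbitrary value playing the role of the connection request timeout. It only shows how releasing a lease on the success path alone strands it when a non-200 status raises an exception, while try-with-resources releases it on every path.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PoolLeakSketch {

    /** Hypothetical stand-in for a pooled response; closing it returns the lease to the pool. */
    public static final class Lease implements AutoCloseable {
        private final Semaphore pool;
        private boolean released;

        Lease(Semaphore pool) {
            this.pool = pool;
        }

        @Override
        public void close() {
            if (!released) {
                released = true;
                pool.release();
            }
        }
    }

    /** Hypothetical fixed-size pool with a bounded wait for a free lease. */
    public static final class ToyPool {
        private final Semaphore permits;
        private final long waitMillis;

        public ToyPool(int size, long waitMillis) {
            this.permits = new Semaphore(size);
            this.waitMillis = waitMillis;
        }

        public Lease lease() {
            try {
                if (!permits.tryAcquire(waitMillis, TimeUnit.MILLISECONDS)) {
                    throw new IllegalStateException("Timeout waiting for connection from pool");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for connection", e);
            }
            return new Lease(permits);
        }

        public int available() {
            return permits.availablePermits();
        }
    }

    /** Leaky variant: the lease is released only on the success path. */
    public static void leakyCall(ToyPool pool, int status) {
        Lease lease = pool.lease();
        if (status != 200) {
            // The exception skips close(), so the lease is never returned.
            throw new RuntimeException("Non-ok status code (" + status + ")");
        }
        lease.close();
    }

    /** Safe variant: try-with-resources releases the lease even when an exception is thrown. */
    public static void safeCall(ToyPool pool, int status) {
        try (Lease lease = pool.lease()) {
            if (status != 200) {
                throw new RuntimeException("Non-ok status code (" + status + ")");
            }
        }
    }

    /** Runs the given number of failing calls and returns how many leases remain available. */
    public static int availableAfterFailures(boolean leaky, int poolSize, int calls) {
        ToyPool pool = new ToyPool(poolSize, 50);
        for (int i = 0; i < calls; i++) {
            try {
                if (leaky) {
                    leakyCall(pool, 500);
                } else {
                    safeCall(pool, 500);
                }
            } catch (RuntimeException e) {
                // Expected: every call fails, either with the simulated 500 or a pool timeout.
            }
        }
        return pool.available();
    }

    public static void main(String[] args) {
        System.out.println("leaky, leases left: " + availableAfterFailures(true, 2, 2));
        System.out.println("safe, leases left:  " + availableAfterFailures(false, 2, 2));
    }
}
```

In the leaky variant, once every lease has been stranded, later calls fail with the pool timeout instead of the original error, and nothing frees the leases until the process restarts.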
I spoke with Unicon, who helped us with this MFA logic, and they asked me to include the stack trace for the error from when they ran into the issue:
2023-08-21 03:04:33,668 - x.x.x.x - ERROR [net.shibboleth.idp.authn:-2] - Uncaught runtime exception
java.lang.RuntimeException: com.duosecurity.duoweb.DuoWebException: Duo AuthAPI preauth request failed: Non-ok status code (500) returned from Duo: Internal Server Error
at jdk.scripting.nashorn/jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:531)
Caused by: com.duosecurity.duoweb.DuoWebException: Duo AuthAPI preauth request failed: Non-ok status code (500) returned from Duo: Internal Server Error
at net.shibboleth.idp.authn.duo.impl.DuoPreauthAuthenticator.authenticate(DuoPreauthAuthenticator.java:81)
Moving into shared project and presuming this will continue to affect the new version.
When Duo had its outage on 8/28, the IdP started logging the following error, and it kept logging it even after Duo had resolved its issue:
MFA Duo PreAuth API - com.duosecurity.duoweb.DuoWebException: Duo AuthAPI preauth request failed: Timeout waiting for connection from pool
We discovered that there were some hanging connections from our IdP servers to Duo in the CLOSE_WAIT state, and only a restart would clear them:
netstat | grep CLOSE_WAIT
tcp 25 0 ourshibserver duo CLOSE_WAIT
tcp 25 0 ourshibserver duo CLOSE_WAIT
Our current logic calls the Duo PreAuth API to confirm enrollment and device status before prompting for the IFrame/Universal Prompt. Below is how we defined the connection in some of our custom Duo handling logic.
<entry key="duoPreAuthenticator">
    <bean lazy-init="true"
          class="net.shibboleth.idp.authn.duo.impl.DuoPreauthAuthenticator"
          p:objectMapper-ref="shibboleth.JSONObjectMapper"
          p:httpClient="#{getObject('shibboleth.authn.Duo.NonBrowser.HttpClient')}"
          p:httpClientSecurityParameters="#{getObject('shibboleth.authn.Duo.NonBrowser.HttpClientSecurityParameters')}" />
</entry>
The service.properties settings are mostly defaults, except for the following:
idp.httpclient.connectionRequestTimeout = PT30S
idp.httpclient.connectionTimeout = PT30S
idp.httpclient.socketTimeout = PT30S
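For anyone reading these values, `PT30S` is ISO 8601 duration syntax for 30 seconds. The small sketch below just converts it with the JDK’s `java.time.Duration`. The class and method names are made up for illustration, and it makes no claim about how the IdP itself parses the property.

```java
import java.time.Duration;

public class TimeoutSketch {

    /** Converts an ISO 8601 duration string such as "PT30S" to milliseconds. */
    public static long toMillis(String isoDuration) {
        return Duration.parse(isoDuration).toMillis();
    }

    public static void main(String[] args) {
        // Each of the three timeouts above is set to the same value.
        System.out.println("PT30S = " + toMillis("PT30S") + " ms");
    }
}
```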
This has only happened during Duo’s outage on that date, and these settings have been on our IdP for the last 4-5 months. Unfortunately, because the Duo Ping checks were still working, it didn’t fail open for us.
Scott and I also had a conversation in the Shibboleth #idp Slack channel on 8/22 if you want to see more details.