Ex: Re: httpclient sockets stuck in CLOSE_WAIT

Paul B. Henson henson at cpp.edu
Wed Jul 1 04:00:24 UTC 2020


On Tue, Jun 30, 2020 at 08:14:17PM +0000, Cantor, Scott wrote:

> I have no issues with sockets exhausting with my HTTP calls hanging
> that I have ever observed. That says to me it's not an inherent issue
> with the code, which leaves the network or other components as a
> trigger for the client to misbehave

I'm not saying there's an endemic problem with the idp :). I'm just
saying there's a difference between "the network is broken, nothing to
do" and "sometimes when the network behaves in particular ways, things
don't work right".

I don't know what the root cause is here, but based on the symptoms the
httpclient in the idp isn't closing the sockets, causing them to get
stuck in CLOSE_WAIT state forever. Given it doesn't happen all the time, and only
started happening to me after updating various things, there certainly
could be an environmental trigger. But it's still broken.

Per a packet capture, the connection starts off normally enough with a
SYN from the idp to the CAS client, a SYN-ACK back, and a final ACK. Then
TLS is negotiated and the content transfer completed. The CAS client
sends a packet with both ACK (acknowledging the last packet from the
idp) and FIN (closing the connection) set. The idp sends an ACK for the
FIN packet, but then never sends a FIN of its own, leaving the socket
in CLOSE_WAIT state.
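
For illustration, here is a minimal sketch of that sequence using plain
java.net sockets on loopback. It is not HttpClient or idp code, and the
class and method names are made up for the example. It only shows the
general TCP behavior: the kernel ACKs the peer's FIN by itself, but the
local FIN is not sent until the application calls close() on the socket.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class; the names are illustrative only.
public class CloseWaitDemo {

    /**
     * Opens a loopback connection, lets the remote side close first, and
     * returns true if the local side saw EOF and then closed its socket.
     */
    public static boolean observePeerCloseThenClose() throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Plays the role of the idp: the side that opened the connection.
            Socket idpSide = new Socket(server.getInetAddress(), server.getLocalPort());

            // Plays the role of the CAS client web server: it closes first (sends FIN).
            Socket remote = server.accept();
            remote.close();

            // The local kernel has ACKed the FIN and the socket is now in CLOSE_WAIT.
            // The application only finds out by reading: -1 means end of stream.
            InputStream in = idpSide.getInputStream();
            boolean peerClosed = (in.read() == -1);

            // Only this call sends our own FIN. If nothing ever calls close(),
            // the socket remains in CLOSE_WAIT.
            idpSide.close();
            return peerClosed && idpSide.isClosed();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("peer close seen, local socket closed: " + observePeerCloseThenClose());
    }
}
```

With a pooled connection nothing is reading from the idle socket, so
something else (a staleness check or an eviction pass) has to notice
the remote close and call close() on it.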

This bug looks similar:

https://issues.apache.org/jira/browse/HTTPCLIENT-1887

It says the pool eviction thread should call Socket.close()? Where in
the idp is this? Or does the idp rely on the default HttpClient check
for staleness executed when a request is made? I see the use of an
httpclient in idp/cas/proxy/impl/HttpClientProxyValidator.java but my
java/spring fu isn't quite up to tracing where the passed in object
comes from <sigh>. I'm using a custom bean now for CAS proxy validation:

<bean id="shibboleth.CASProxyValidatorHttpClient" parent="shibboleth.NonCachingHttpClient"
      p:tLSSocketFactory-ref="shibboleth.SecurityEnhancedTLSSocketFactory"
      p:maxConnectionsTotal="500"
      p:maxConnectionsPerRoute="500" />

Is there some setting I could use here to make the connections close/get
evicted properly?

This situation only occurs when the remote CAS client web server closes
the connection, not when the idp closes the connection. What might cause
that? Is that the expected behavior or an unusual behavior? Perhaps
that's the environmental difference that causes this problem.

Thanks...

-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.cpp.edu/~henson/
Operating Systems and Network Analyst  |  henson at cpp.edu
California State Polytechnic University  |  Pomona CA 91768

