Apache workers stuck in graceful shutdown on RHEL7

Andrew Barnes andrew.barnes at deakin.edu.au
Thu Apr 28 21:42:19 EDT 2016


Hello,

We have observed that a number of the RHEL 7, SP-enabled Apache servers we have rolled out are slowly accumulating Apache workers stuck in graceful shutdown.

Here is an example scoreboard of one of the hosts:

  GGRGGGRGGGGGGG.G.GGGGGGGGG.GGGGGG.GGGGGGGRGG.GGW.G.G..G..G.GGG.G
  .G.RWWGGW....WR.W.W.............................................
  ................................................................
  ................................................................

Same host 2 weeks later:

  GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG_GGGGWGGGGGGGG
  GGG_GGGGG_G_GGGGGGGGGG_GGGGGGGG_GGGG_G_.GGGGG._GGGGGG_GG.GGGGGG_
  G_.G.._G__G_G__...._______.....____.............................
  ................................................................

So it seems to leak approximately 1-2 workers per day.
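
To track the leak over time, the "G" slots in a scoreboard can be counted rather than eyeballed. The following is only a minimal sketch: count_state is a hypothetical helper name, and it assumes the scoreboard has already been copied out of server-status as a plain string.

```shell
# Count how many slots in an Apache scoreboard string are in a given state.
# count_state is a hypothetical helper, not part of httpd.
# Usage: count_state STATE_CHAR SCOREBOARD
count_state() {
    state=$1
    board=$2
    # Delete every character except the requested state, then count what is left.
    printf '%s' "$board" | tr -cd "$state" | wc -c | tr -d ' '
}

# Second row of the first scoreboard above: prints 3
count_state G '.G.RWWGGW....WR.W.W.............................................'
```

Running it once per scoreboard row (or on the rows joined together) and logging the result daily would give a leak rate that does not depend on reading the scoreboard by eye.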

Here is the server-status line for one of the workers:

    Srv    PID    Acc       M  CPU   SS      Req  Conn  Child  Slot  Client  VHost                     Request
    59-19  34461  1/75/677  G  5.34  778556  0    0.3   0.87   9.09  ::1     [local fqdn redacted]:80  OPTIONS * HTTP/1.0
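
For scale, the SS column is the number of seconds since the start of the slot's most recent request, so the value above can be turned into a duration. This is a small sketch using only shell arithmetic; ss_to_duration is a hypothetical helper name.

```shell
# Convert a server-status SS value (seconds) into days/hours/minutes/seconds.
# ss_to_duration is a hypothetical helper, not part of httpd.
ss_to_duration() {
    secs=$1
    printf '%dd %02dh %02dm %02ds\n' \
        $((secs / 86400)) \
        $((secs % 86400 / 3600)) \
        $((secs % 3600 / 60)) \
        $((secs % 60))
}

# SS value from the server-status line above: prints 9d 00h 15m 56s
ss_to_duration 778556
```

In other words, this worker had been sitting in the "G" state for roughly nine days when the status page was captured.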

A stack trace of the worker appears to indicate some sort of deadlock in shib_exit():

    # sudo pstack 34461
    Thread 3 (Thread 0x7f018163f700 (LWP 34463)):
    #0  0x00007f019fd2303e in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
    #1  0x00007f0190f6947f in (anonymous namespace)::XMLConfig::background_load() () from /usr/lib64/libshibsp-lite.so.6
    #2  0x00007f01912634c9 in xmltooling::ReloadableXMLFile::reload_fn(void*) () from /usr/lib64/libxmltooling-lite.so.6
    #3  0x00007f019fd1fdc5 in start_thread () from /lib64/libpthread.so.0
    #4  0x00007f019f84928d in clone () from /lib64/libc.so.6
    Thread 2 (Thread 0x7f0180e3e700 (LWP 34465)):
    #0  0x00007f019fd23a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    #1  0x00007f019126ddd3 in xmltooling::CondWaitImpl::timedwait(xmltooling::Mutex*, int) () from /usr/lib64/libxmltooling-lite.so.6
    #2  0x00007f0190f4d2f4 in (anonymous namespace)::SSCache::cleanup_fn(void*) () from /usr/lib64/libshibsp-lite.so.6
    #3  0x00007f019fd1fdc5 in start_thread () from /lib64/libpthread.so.0
    #4  0x00007f019f84928d in clone () from /lib64/libc.so.6
    Thread 1 (Thread 0x7f01a1241840 (LWP 34461)):
    #0  0x00007f019fd20ef7 in pthread_join () from /lib64/libpthread.so.0
    #1  0x00007f0191264f35 in xmltooling::ReloadableXMLFile::shutdown() () from /usr/lib64/libxmltooling-lite.so.6
    #2  0x00007f0190f5f516 in (anonymous namespace)::XMLConfig::~XMLConfig() () from /usr/lib64/libshibsp-lite.so.6
    #3  0x00007f0190ef985e in shibsp::SPConfig::setServiceProvider(shibsp::ServiceProvider*) () from /usr/lib64/libshibsp-lite.so.6
    #4  0x00007f0190efac45 in shibsp::SPConfig::term() () from /usr/lib64/libshibsp-lite.so.6
    #5  0x00007f0190efb168 in shibsp::SPInternalConfig::term() () from /usr/lib64/libshibsp-lite.so.6
    #6  0x00007f01914c2e4a in shib_exit () from /usr/lib64/shibboleth/mod_shib_24.so
    #7  0x00007f019ff4d1ae in apr_pool_destroy () from /lib64/libapr-1.so.0
    #8  0x00007f0196e7223c in clean_child_exit () from /etc/httpd/modules/mod_mpm_prefork.so
    #9  0x00007f0196e726e7 in child_main () from /etc/httpd/modules/mod_mpm_prefork.so
    #10 0x00007f0196e72a55 in make_child () from /etc/httpd/modules/mod_mpm_prefork.so
    #11 0x00007f0196e736ee in prefork_run () from /etc/httpd/modules/mod_mpm_prefork.so
    #12 0x00007f01a127a5ae in ap_run_mpm ()
    #13 0x00007f01a1273b36 in main ()
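
To check whether other stuck workers show the same pattern, the pstack output can be matched against the four frames that stand out in the trace above: pthread_join under ReloadableXMLFile::shutdown() in the main thread, and pthread_rwlock_wrlock under XMLConfig::background_load() in the reload thread. The sketch below is a rough filter only. looks_like_shib_exit_hang is a hypothetical helper name, and it only tests that all four strings appear somewhere in the output, not which thread they belong to.

```shell
# Read pstack output on stdin; succeed only if all four frames seen in the
# trace above are present. Hypothetical helper; it does not check which
# thread each frame belongs to.
looks_like_shib_exit_hang() {
    trace=$(cat)
    for frame in 'pthread_join' 'ReloadableXMLFile::shutdown' \
                 'pthread_rwlock_wrlock' 'XMLConfig::background_load'; do
        case $trace in
            *"$frame"*) ;;      # frame present, keep checking
            *) return 1 ;;      # a required frame is missing
        esac
    done
    return 0
}

# Possible usage against the worker from the trace (needs root and pstack):
#   sudo pstack 34461 | looks_like_shib_exit_hang && echo "34461 matches"
```

A match only says the worker looks similar to this one; it does not show which thread, if any, holds the lock that background_load() is waiting for.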

Some details of our environment are:

    # cat /etc/redhat-release
    Red Hat Enterprise Linux Server release 7.2 (Maipo)

    # rpm -q httpd
    httpd-2.4.6-40.el7.x86_64

    # rpm -q shibboleth
    shibboleth-2.5.6-3.1.x86_64

    # rpm -qf /usr/lib64/libxmltooling-lite.so.6
    libxmltooling6-1.5.6-1.1.x86_64

Has anyone seen a similar issue, or does anyone have ideas that may help us narrow down the problem further?

Many thanks,

Andrew Barnes
