Memory leak hunting - Rhino script woes leading to stale metadata not being garbage collected
Brian Koehmstedt
bkoehmstedt at ucmerced.edu
Tue Sep 16 23:23:00 EDT 2014
After some extended heap analysis sessions, I think I've narrowed down
my Shib IdP 2.4.0 memory leak to the default Rhino engine. The original
observation of a leak and corresponding Tomcat crashes was recently
discussed in:
http://marc.info/?t=140926984900011&r=1&w=2
"Problem with tomcat hanging on Shib 2.4 "
I'll describe my analysis in a moment, but besides sharing what I've
learned, I'm still scratching my head over why this problem isn't being
more widely felt as it seems to be, on the surface, a generalized
problem with Rhino (as described below). This started happening when we
migrated from Solaris to RedHat, as discussed in previously mentioned
thread, so maybe it's environmental, but I'm not quite yet ready to
blame RedHat or OpenJDK just yet, although that possibility certainly
remains. (And the next step is to try and reproduce this consistently
in a test environment and see if the problem goes away with Oracle
JDK.) The other possibility is we have some odd quirk in our
configuration that is triggering this, possibly introduced when we did
the migration.
At any rate -- on to describing what I found.
First, the obvious stuff: I turned on GC stat logging and took heap
dumps at intervals, and the evidence was clear of heap usage steadily
growing. At the very top of the overall size usage chart in the jhat
histogram were character arrays (189MB at last heap dump and growing).
Further investigations led to the conclusion this was mostly character
arrays containing strings from the Incommon metadata file...leading me
to the early hypothesis that the old Incommon data wasn't entirely being
garbage collected.
Further digging led me to:
In java-shib-common, ScriptMatchFunctor.java (contains script exec
code), in getScriptContext(), it instantiates a SimpleScriptContext and
sets the filterContext attribute to an instance of
ShibbolethFilteringContext. Normally, it seems this context should come
and go with each script execution and thus be garbage collected.
But, when adding the filterContext to the ScriptContext, internally in
the JDK this seems to trigger the Rhino engine to analyze the
filterContext class (ShibbolethFilteringContext) for methods belonging
to that object and cache what it finds.
It seems to cache the field members and methods in an instance of
sun.org.mozilla.javascript.ClassCache (via
sun.org.mozilla.javascript.JavaMembers and
sun.org.mozilla.javascript.NativeJavaMethod).
When it does this, this cache seems to result an in indirect reference
to the original ShibbolethFilteringContext object being kept. (See below
for the nitty gritty reference chain from jhat.)
And that filteringContext holds a ref to a SSORequestContext instance,
which in turn holds a ref to an EntityDescriptorImpl instance, which in
turn holds a ref to an EntitiesDescriptorImpl instance, which holds a
ref to all the other entities in the Incommon metadata file.
This seems to be causing memory leaks for my deployment. It seems that
not just one stale filterContext is not-garbage-collected, but over
time, this builds up to be several stale filterContext objects, all
holding stale refs to old metadata loads (in EntitiesDescriptorImpls).
The jhat Histogram shows the following instance counts:
edu.internet2.middleware.shibboleth.common.attribute.filtering.provider.ShibbolethFilteringContext
11
org.opensaml.saml2.metadata.impl.EntitiesDescriptorImpl 9
The ref chain info was found by using jhat to find the paths, from root,
to a particular object that seemed to be stale:
select heap.livepaths(heap.findObject(0xca99c500));
(This took a long time to run on my 1gb heap dump. I had to let it run
overnight and jhat needed a 5gb heap to run this query, -J-Xmx5000M.)
Relevant part of the chain showing ClassCache holding reference to a
ShibbolethFilterContext.
...
->sun.org.mozilla.javascript.ClassCache at 0xc4fa9560 (field classTable)
->java.util.HashMap at 0xc8d2c558 (field table)
->[Ljava.util.HashMap$Entry;@0xc8d2c588 (Element 12 of
[Ljava.util.HashMap$Entry;@0xc8d2c588)
->java.util.HashMap$Entry at 0xcedcdea8 (field value)
->sun.org.mozilla.javascript.JavaMembers at 0xcedcdec8 (field members)
->java.util.HashMap at 0xcedcdef0 (field table)
->[Ljava.util.HashMap$Entry;@0xcedcdf30 (Element 198 of
[Ljava.util.HashMap$Entry;@0xcedcdf30)
->java.util.HashMap$Entry at 0xcededef0 (field value)
->sun.org.mozilla.javascript.NativeJavaMethod at 0xcededf10 (field
parentScopeObject)
->com.sun.script.javascript.ExternalScriptable at 0xcedce6d8 (field context)
->javax.script.SimpleScriptContext at 0xcedce6f8 (field engineScope)
->javax.script.SimpleBindings at 0xcedcea18 (field map)
->java.util.HashMap at 0xcedcea28 (field table)
->[Ljava.util.HashMap$Entry;@0xcedcea58 (Element 12 of
[Ljava.util.HashMap$Entry;@0xcedcea58)
->java.util.HashMap$Entry at 0xcedceb58 (field value)
->edu.internet2.middleware.shibboleth.common.attribute.filtering.provider.ShibbolethFilteringContext at 0xcedb3af0
(field attributeRequestContext)
->edu.internet2.middleware.shibboleth.idp.profile.saml2.SSOProfileHandler$SSORequestContext at 0xcedb3b10
(field peerEntityMetadata)
->org.opensaml.saml2.metadata.impl.EntityDescriptorImpl at 0xce946bb8
(field parent)
->org.opensaml.saml2.metadata.impl.EntitiesDescriptorImpl at 0xcbaeed78
(field signature)
...
More information about the users
mailing list