Memory leak hunting - Rhino script woes leading to stale metadata not being garbage collected

Brian Koehmstedt bkoehmstedt at ucmerced.edu
Tue Sep 16 23:23:00 EDT 2014


After some extended heap analysis sessions, I think I've narrowed down 
my Shib IdP 2.4.0 memory leak to the default Rhino engine.  The original 
observation of a leak and corresponding Tomcat crashes was recently 
discussed in:
http://marc.info/?t=140926984900011&r=1&w=2
"Problem with tomcat hanging on Shib 2.4 "

I'll describe my analysis in a moment, but besides sharing what I've 
learned, I'm still scratching my head over why this problem isn't being 
more widely felt as it seems to be, on the surface, a generalized 
problem with Rhino (as described below). This started happening when we 
migrated from Solaris to RedHat, as discussed in previously mentioned 
thread, so maybe it's environmental, but I'm not quite yet ready to 
blame RedHat or OpenJDK just yet, although that possibility certainly 
remains.  (And the next step is to try and reproduce this consistently 
in a test environment and see if the problem goes away with Oracle 
JDK.)  The other possibility is we have some odd quirk in our 
configuration that is triggering this, possibly  introduced when we did 
the migration.

At any rate -- on to describing what I found.

First, the obvious stuff: I turned on GC stat logging and took heap 
dumps at intervals, and the evidence was clear of heap usage steadily 
growing.  At the very top of the overall size usage chart in the jhat 
histogram were character arrays (189MB at last heap dump and growing).  
Further investigations led to the conclusion this was mostly character 
arrays containing strings from the Incommon metadata file...leading me 
to the early hypothesis that the old Incommon data wasn't entirely being 
garbage collected.

Further digging led me to:
In java-shib-common, ScriptMatchFunctor.java (contains script exec 
code), in getScriptContext(), it instantiates a SimpleScriptContext and 
sets the filterContext attribute to an instance of 
ShibbolethFilteringContext.  Normally, it seems this context should come 
and go with each script execution and thus be garbage collected.

But, when adding the filterContext to the ScriptContext, internally in 
the JDK this seems to trigger the Rhino engine to analyze the 
filterContext class (ShibbolethFilteringContext) for methods belonging 
to that object and cache what it finds.

It seems to cache the field members and methods in an instance of 
sun.org.mozilla.javascript.ClassCache (via 
sun.org.mozilla.javascript.JavaMembers and 
sun.org.mozilla.javascript.NativeJavaMethod).

When it does this, this cache seems to result an in indirect reference 
to the original ShibbolethFilteringContext object being kept. (See below 
for the nitty gritty reference chain from jhat.)

And that filteringContext holds a ref to a SSORequestContext instance, 
which in turn holds a ref to an EntityDescriptorImpl instance, which in 
turn holds a ref to an EntitiesDescriptorImpl instance, which holds a 
ref to all the other entities in the Incommon metadata file.

This seems to be causing memory leaks for my deployment.  It seems that 
not just one stale filterContext is not-garbage-collected, but over 
time, this builds up to be several stale filterContext objects, all 
holding stale refs to old metadata loads (in EntitiesDescriptorImpls).

The jhat Histogram shows the following instance counts:
edu.internet2.middleware.shibboleth.common.attribute.filtering.provider.ShibbolethFilteringContext 
     11
org.opensaml.saml2.metadata.impl.EntitiesDescriptorImpl     9

The ref chain info was found by using jhat to find the paths, from root, 
to a particular object that seemed to be stale:
select heap.livepaths(heap.findObject(0xca99c500));
(This took a long time to run on my 1gb heap dump.  I had to let it run 
overnight and jhat needed a 5gb heap to run this query, -J-Xmx5000M.)

Relevant part of the chain showing ClassCache holding reference to a 
ShibbolethFilterContext.

...
->sun.org.mozilla.javascript.ClassCache at 0xc4fa9560 (field classTable)
->java.util.HashMap at 0xc8d2c558 (field table)
->[Ljava.util.HashMap$Entry;@0xc8d2c588 (Element 12 of 
[Ljava.util.HashMap$Entry;@0xc8d2c588)
->java.util.HashMap$Entry at 0xcedcdea8 (field value)
->sun.org.mozilla.javascript.JavaMembers at 0xcedcdec8 (field members)
->java.util.HashMap at 0xcedcdef0 (field table)
->[Ljava.util.HashMap$Entry;@0xcedcdf30 (Element 198 of 
[Ljava.util.HashMap$Entry;@0xcedcdf30)
->java.util.HashMap$Entry at 0xcededef0 (field value)
->sun.org.mozilla.javascript.NativeJavaMethod at 0xcededf10 (field 
parentScopeObject)
->com.sun.script.javascript.ExternalScriptable at 0xcedce6d8 (field context)
->javax.script.SimpleScriptContext at 0xcedce6f8 (field engineScope)
->javax.script.SimpleBindings at 0xcedcea18 (field map)
->java.util.HashMap at 0xcedcea28 (field table)
->[Ljava.util.HashMap$Entry;@0xcedcea58 (Element 12 of 
[Ljava.util.HashMap$Entry;@0xcedcea58)
->java.util.HashMap$Entry at 0xcedceb58 (field value)
->edu.internet2.middleware.shibboleth.common.attribute.filtering.provider.ShibbolethFilteringContext at 0xcedb3af0 
(field attributeRequestContext)
->edu.internet2.middleware.shibboleth.idp.profile.saml2.SSOProfileHandler$SSORequestContext at 0xcedb3b10 
(field peerEntityMetadata)
->org.opensaml.saml2.metadata.impl.EntityDescriptorImpl at 0xce946bb8 
(field parent)
->org.opensaml.saml2.metadata.impl.EntitiesDescriptorImpl at 0xcbaeed78 
(field signature)
...



More information about the users mailing list