Strange problem with the MCB and attribute resolving
kwessel at illinois.edu
Thu Apr 24 17:06:51 EDT 2014
I need some help figuring out how to troubleshoot this.
I've just repointed our test IDP cluster to a newer AD forest for attribute resolution. We have two data connectors that were repointed: one that pulls attributes for entries anywhere in the directory, the other that only resolves attributes from under a specific OU.
I have the MCB running with my lovely combination of the remote user and Duo login handlers.
About half the time, when I log in, I encounter an error after the remote user login handler finishes and before Duo presents anything. Looks like the MCB's trying to resolve attributes at that point and is failing:
15:52:04.231 - ERROR [edu.internet2.middleware.shibboleth.common.attribute.resolver.provider.ShibbolethAttributeResolver:379] [session=] - Received the following error from data connector CampusADRootDN, no failover data connector available
edu.internet2.middleware.shibboleth.common.attribute.resolver.AttributeResolutionException: An error occurred when attempting to search the LDAP
That is followed by a Java stack trace, then the Duo login handler complains about not having the data it needs and throws a null pointer exception:
15:51:34.874 - ERROR [edu.internet2.middleware.assurance.mcb.authn.provider.MCBAttributeResolver:96] [session=] - Failed to resolve attributes for kwessel: An error occurred when attempting to search the LDAP
15:51:34.876 - ERROR [edu.internet2.middleware.assurance.mcb.authn.provider.MCBLoginServlet:336] [session=] - Exception calling submodule.
I tried turning up the MCB log level to debug and saw no more details about the specific error from the LDAP search failure. I can't reproduce the error using the IDP's attribute resolver with aacli.
I'm pointing directly at a specific AD node, not a load-balanced or round-robin address. I've tried increasing the read and connect timeouts for my AD data connectors from 2 seconds to 10, and the problem persists.
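One way to check whether the intermittent failure sits at the network level rather than in the MCB would be to probe the AD node repeatedly from the IdP host and count how often the connection fails or is slow. The following is only a minimal sketch, assuming Python 3 is available on that host. The `probe` and `summarize` helpers are hypothetical names made up for illustration, and they test plain TCP connects only, not the LDAP bind or search that the data connector performs.

```python
import socket
import time


def probe(host, port, attempts=20, timeout=2.0, pause=0.0):
    """Attempt a TCP connect `attempts` times.

    Returns a list of (ok, elapsed_seconds, error_text) tuples.
    """
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # create_connection resolves the name and connects with a timeout.
            with socket.create_connection((host, port), timeout=timeout):
                results.append((True, time.monotonic() - start, None))
        except OSError as exc:
            # Covers refused connections, timeouts and name resolution errors.
            results.append((False, time.monotonic() - start, repr(exc)))
        if pause:
            time.sleep(pause)
    return results


def summarize(results):
    """Count attempts and failures and report the slowest attempt in seconds."""
    failures = [r for r in results if not r[0]]
    slowest = max((r[1] for r in results), default=0.0)
    return {
        "attempts": len(results),
        "failures": len(failures),
        "slowest_s": slowest,
    }
```

For example, `summarize(probe(host, port, attempts=50, timeout=2.0, pause=0.5))`, with `host` set to the AD node the data connectors point at and `port` set to whichever LDAP port they use (389 and 636 are the usual defaults, which is an assumption here), would show whether roughly half of the connects fail. If none fail, the problem is more likely in the bind, the search itself, or how the MCB invokes the resolver.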
It could be just luck, but when I switch back to our old AD forest, I don't get this error at all. That would fit the timeline, though: I hadn't seen this problem until today, and I only made the changes to our attribute resolver configuration this week.
Any thoughts on how I might track down the cause of this problem?