A question about load balancers and Terracotta clustering

Wessel, Keith William kwessel at illinois.edu
Mon Jul 9 22:35:40 BST 2012


Thanks, Chris. Can you tell me which version of Terracotta you found to be successful, aside from the scalability issues, with that combination of Tomcat and Java?

Keith


-----Original Message-----
From: users-bounces at shibboleth.net [mailto:users-bounces at shibboleth.net] On Behalf Of Christopher Bongaarts
Sent: Monday, July 09, 2012 4:31 PM
To: users at shibboleth.net
Subject: Re: A question about load balancers and Terracotta clustering

On 7/9/2012 4:08 PM, Wessel, Keith William wrote:
> I'm still fighting with this, and not having easy console access to run a gui from these boxes, I haven't explored that route yet.

You can run the Terracotta dev console remotely, so long as you open the proper ports (9510 or 9520 by default, I think) to wherever it's running.  It was an invaluable tool for keeping tabs on what was going on.
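
A minimal sketch for confirming that those ports are reachable from the machine where the console will run, assuming Python 3 is available there. The function names and the hostname in the usage note are placeholders for illustration and are not taken from this thread; the port numbers are only the defaults guessed at above.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host, ports=(9510, 9520)):
    """Map each port to whether a TCP connection could be opened to it."""
    return {port: port_open(host, port) for port in ports}
```

Calling check_ports("tc-server.example.edu") (a hypothetical hostname) returns a dict such as {9510: True, 9520: False}, which would point at a firewall rule or a non-default port. It only tests TCP reachability, not whether the Terracotta server is healthy.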

> One question, as it seems our move to RHEL6 might have broken this.
> Can Russ or others tell me what Tomcat, Java and Terracotta version 
> combinations they've found to work? I'm wondering if it's time to 
> update one or more of these.

We had some success (but scalability issues that are now leading us away from Terracotta) on RHEL6, Sun JVM 1.6.0_26, Tomcat 6.0.32, and whatever Apache comes with RHEL6.  We are running with two VMs as Terracotta primary and failover, separate from our IdP VMs (6 in production).

> Thanks,
> Keith
>
>
> -----Original Message-----
> From: Wessel, Keith William
> Sent: Tuesday, July 03, 2012 3:30 PM
> To: Shib Users
> Subject: RE: A question about load balancers and Terracotta clustering
>
> Thank you to both of you gentlemen. I'll give that a try, but it definitely seems like I'm not connecting fully to Terracotta. Even taking the slb/gslb out of the loop completely and pointing directly at the server IPs again using DNS round robin, I'm getting the exact same unhelpful generic Shib error page.  And my idp-process log still ends with the debug-level message "Beginning user authentication process."
>
> Are there any facilities I can turn further up to see what's causing this error?
>
> Thanks,
> Keith
>
>
> -----Original Message-----
> From: users-bounces at shibboleth.net 
> [mailto:users-bounces at shibboleth.net] On Behalf Of Kevin P. Foote
> Sent: Tuesday, July 03, 2012 11:06 AM
> To: Shib Users
> Subject: RE: A question about load balancers and Terracotta clustering
>
>
> Keith - you might find some help from the included Java GUI dev-console. You can run this and connect to your TC cluster to see who is actually connected (clients and servers). You can also peek into the DSO cache.
>
> It's usually found in bin/dev-console.sh or bin/dev-console.bat, whichever you prefer.
>
> I've found this tool helpful in troubleshooting TC.
>
> ------
> thanks
>    kevin.foote
>
> On Mon, 2 Jul 2012, Wessel, Keith William wrote:
>
> -> Thank you, Sir. My feeling is the same as yours. Even though the terracotta-server.log seems to indicate both clients are connecting, I'm suspecting one's having trouble communicating for some reason.
> ->
> -> I do know that both nodes work fine independently. I also know that, if the SP session's destroyed then I hit the same IDP node a second time, it works fine, and that's even with the Terracotta DSO boot jar loaded. So, my individual two IDP nodes both seem to be working fine.
> ->
> -> Looking at the terracotta-server.log and terracotta-client.log files, nothing jumps out at me other than this warning message:
> ->
> ->
> -> 2012-07-02 11:13:02,803 [L2_L1:TCComm Main Selector Thread_R 
> -> (listen 0:0:0:0:0:0:0:0:9510)] WARN com.tc.net.core.CoreNIOServices 
> -> - java.nio.channels.CancelledKeyException occured
> ->
> -> Does that mean anything to anyone on this list?
> ->
> -> I'm not sure what I'm looking for in these log files to see that a session was successfully written or queried by a client. Can anyone give me any clues there?
> ->
> -> I fear that this is something super trivial, and I know we had it working before and can't figure out for the life of me what broke it.
> ->
> -> Thanks for any further advice.
> ->
> -> Keith
> ->
> ->
> -> From: users-bounces at shibboleth.net
> -> [mailto:users-bounces at shibboleth.net] On Behalf Of Russell Beall
> -> Sent: Friday, June 29, 2012 4:42 PM
> -> To: Shib Users
> -> Subject: Re: A question about load balancers and Terracotta 
> -> clustering
> ->
> -> Hi Keith,
> ->
> -> There are a lot of factors involved in getting a cluster functional.  It seems like you know what you are doing and are part of the way there.
> ->
> -> Before you get into complicated functionality testing you should be sure that the basic functions are working.
> ->
> -> The simplest functionality test is swapping nodes in the load balancer without shutting off servers.  You shouldn't even have to kill the SP server if you access its logout link before trying to log in to it again.  If this works, then you have clustered sessions.
> ->
> -> I have no clue as to what might be going wrong from your log snippet.  My vague guess is that one of the Tomcat servers is not completing its connection to the TC server and cannot access its session data.  Each node should be usable for login individually, while also being connected to TC, before you try seeing if the login handler on one node can access a session created by another node.
> ->
> -> Good luck.  :-)
> ->
> -> Regards,
> -> Russ.
> ->
> -> On Jun 29, 2012, at 1:03 PM, Wessel, Keith William wrote:
> ->
> ->
> -> One other note: if I get the login screen from, say, node 1 then stop node 1's Tomcat instance before clicking the login button, node 2 takes the login form submission and successfully authenticates me. So, if I take a node down mid-authentication, the other picks up the session. This didn't seem to work yesterday when node 1 wasn't participating in the Terracotta cluster. So, I assume this is progress.
> ->
> -> Thoughts? Suggestions?
> ->
> -> Thanks,
> -> Keith
> ->
> ->
> -> From: Wessel, Keith William
> -> Sent: Friday, June 29, 2012 2:00 PM
> -> To: 'Shib Users'
> -> Subject: RE: A question about load balancers and Terracotta 
> -> clustering
> ->
> -> Russ,
> ->
> -> You got me past the first hurdle, and it was a pretty stupid one. Looks like, if the path to the TC boot jar in JAVA_OPTS points to a non-existent jar, no errors are logged, and life goes on as if the jar was never mentioned. My experiments with a different Java version and forgetting to undo my changes in JAVA_OPTS got me here. I fixed that, regenerated the jar for good measure, and now we're connecting.
> ->
> -> Now, if I log into node 2, shut down Tomcat on that node and destroy the SP session on the SP, then refresh in the browser, I get the generic error page from node 1. I turned up logging in the IDP to debug for both edu.internet2.middleware.shibboleth and org.opensaml. It gets this far, then no more logs:
> ->
> -> 13:18:48.429 - DEBUG [edu.internet2.middleware.shibboleth.idp.profile.saml2.SSOProfileHandler:216] [session=f820cbb71ff0d31ba6b2f08493feba312e75c6c090fe80afd9ec3894b43e382f] - Redirecting user to authentication engine at https://[hostname removed]:443/idp/AuthnEngine
> -> 13:18:48.500 - DEBUG [edu.internet2.middleware.shibboleth.idp.authn.AuthenticationEngine:209] [session=f820cbb71ff0d31ba6b2f08493feba312e75c6c090fe80afd9ec3894b43e382f] - Processing incoming request
> -> 13:18:48.508 - DEBUG [edu.internet2.middleware.shibboleth.idp.authn.AuthenticationEngine:240] [session=f820cbb71ff0d31ba6b2f08493feba312e75c6c090fe80afd9ec3894b43e382f] - Beginning user authentication process.
> ->
> -> And that's where it stops. Is there another log facility I might consider turning up to see what happens next?
> ->
> -> Any other thoughts on what I might look at here?
> ->
> -> Keith
> ->
> -> From: users-bounces at shibboleth.net
> -> [mailto:users-bounces at shibboleth.net] On Behalf Of Russell Beall
> -> Sent: Thursday, June 28, 2012 5:08 PM
> -> To: Shib Users
> -> Subject: Re: A question about load balancers and Terracotta 
> -> clustering
> ->
> -> Theoretically, what you are trying to do should work just fine.  This type of thing in my clusters does work.  I'd be inclined to suspect that the Terracotta server nodes are not correctly sharing their session data.
> ->
> -> Instead of shutting down both the Terracotta server and the IdP Tomcat server, try shutting down only the Tomcat server, or simply disabling it in your load balancer and then hitting the other node with a login request.
> ->
> -> If this doesn't work either, then your sessions are not being shared.
> ->
> -> Do the Terracotta server logs recognize when a Tomcat server client is connecting as a participant in the cluster?  Are these connection logs all showing up on a single Terracotta server node?
> ->
> -> Regards,
> -> Russ.
> ->
> -> On Jun 28, 2012, at 1:58 PM, Wessel, Keith William wrote:
> ->
> -> I know quite a few of you out there have Shib clustered using Terracotta behind a load balancer. We're in the process of moving our production cluster from round-robin DNS to actual, real hardware load balancing (much to my and many other folks' relief).
> ->
> -> I'm doing some testing with our test cluster, and I'm noticing something odd. As recommended, we're running a Terracotta cluster node on each of our two IDP nodes. I'm logging into a test SP then restarting shibd and Apache to force reauthentication when I refresh the SP webpage in the browser. Before refreshing, I'll shut down the IDP node that I used to log in the first time. I'll then wait for Terracotta output on the other IDP to say it's taken over as the active node. In theory, if my persistent store and the Terracotta cluster are working, it'd seem my IDP session would still be valid. Session lifetime on the IDP has been left at the default 30 minutes. However, I'm getting the login screen from the other IDP node at this point.
> ->
> -> So, if I lost anyone:
> -> 1.      Log into SP via Shib node 1
> -> 2.      Once logged in and back at the SP, shut down Tomcat and Terracotta on Shib node 1.
> -> 3.      Restart shibd and Apache on the SP.
> -> 4.      Click refresh in the browser, which takes me to Shib node 2 via the load balancer.
> -> 5.      Instead of having my existing session honored, I'm getting a login screen.
> -> FYI, I don't get the login screen if I just restart shibd and Apache then hit the same IDP node again that I hit the first time, so I know the SP isn't doing forced re-auth.
> ->
> -> Am I overlooking something here? Is this working as designed? Or does it seem I have a Terracotta misconfiguration somewhere?
> ->
> -> Thanks for any help that anyone can offer.
> ->
> -> Keith
> ->
> -> --
> -> To unsubscribe from this list send an email to 
> -> users-unsubscribe at shibboleth.net
> ->
> ->
> ->
> --
> To unsubscribe from this list send an email to 
> users-unsubscribe at shibboleth.net
>
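
The silent boot jar failure Keith describes in the quoted thread (a JAVA_OPTS path pointing at a non-existent jar, with nothing logged) can be caught before Tomcat starts. The following is a minimal sketch, assuming the boot jar is passed to the JVM with an -Xbootclasspath option such as -Xbootclasspath/p:; the thread does not say which option was used, and the function name is hypothetical.

```python
import os
import shlex


def missing_bootclasspath_entries(java_opts):
    """Return boot classpath entries named in java_opts that are not existing files."""
    missing = []
    for arg in shlex.split(java_opts):
        # Matches -Xbootclasspath:, -Xbootclasspath/p: and -Xbootclasspath/a:
        if arg.startswith("-Xbootclasspath") and ":" in arg:
            _, _, paths = arg.partition(":")
            for entry in paths.split(os.pathsep):
                if entry and not os.path.isfile(entry):
                    missing.append(entry)
    return missing


if __name__ == "__main__":
    # Reads JAVA_OPTS from the environment; prints nothing if every entry exists.
    for entry in missing_bootclasspath_entries(os.environ.get("JAVA_OPTS", "")):
        print(f"boot classpath entry not found: {entry}")
```

Running it in the same environment that starts Tomcat shows whether the configured jar path actually exists. It does not verify that the jar was generated for the JVM version in use.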


-- 
%%  Christopher A. Bongaarts   %%  cab at umn.edu          %%
%%  OIT - Identity Management  %%  http://umn.edu/~cab  %%
%%  University of Minnesota    %%  +1 (612) 625-1809    %%

