Questions about setting up a clustered Shib IdP

Christopher Bongaarts cab at umn.edu
Wed Mar 20 13:41:12 EDT 2013


To expand on a couple of the other responses...

On 3/19/2013 5:32 PM, Yaowen Tu wrote:
> 1. If several IdP nodes act as a cluster, does it mean that they have to
> use the same keystore? Should we put the keystore file into a shared
> location, or just copy the same keystore file to all the nodes?

While it is not strictly necessary to have identical SAML signing keys 
across all the nodes (you just have to edit the metadata you give to the 
SPs so that all of the certificates are present), in practice using the 
same key everywhere is usually less confusing.

If you support encrypted SAML requests, you can only use one (shared) 
encryption key for the cluster.  But I think most sites don't use them.

We copy the same keystore to all nodes (using our rdist-based software 
maintenance tool).

> 2. Similar to the keystore file, should all the IdPs have the same
> configuration? What if we need to update the config file, e.g. adding
> an SP? Do we need to go to each IdP to update the file and restart it?
> During this process, some weird things may happen since not all the
> nodes in the cluster have the same configuration, right? How do we
> usually handle it?

In general, yes, they should have the same configuration, but it is not 
strictly necessary depending on how you route requests.

For many of the configuration files, the IdP can be configured to check 
for changes periodically and reload them when they change.  We do this 
for metadata files and the attribute-filter.xml file, for example (and 
we would do it for attribute-resolver.xml and relying-party.xml except 
there's a recently fixed bug that prevents the latter from working). 
This allows us to onboard new SPs on the fly without a restart in most 
cases.  Our rdist-based tool lets us copy the files out to all of our 
nodes with a single command; a simple shell script or batch file could 
do the same thing.
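
As a rough sketch of that last point (host names and paths here are 
made up; our real tooling is rdist-based), something like this pushes 
the reloadable files to every node and lets each IdP pick them up on 
its next poll:

  #!/bin/sh
  # Push reloadable IdP config to all nodes; no restart needed for
  # files the IdP is already polling for changes.
  NODES="idp1.example.edu idp2.example.edu idp3.example.edu"
  IDP_HOME=/opt/shibboleth-idp
  FILES="conf/attribute-filter.xml metadata/some-sp-metadata.xml"

  for host in $NODES; do
      for f in $FILES; do
          scp -p "$IDP_HOME/$f" "$host:$IDP_HOME/$f"
      done
  done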

For other changes (including IdP upgrades) we typically take a couple of 
nodes out of the load balancer rotation, wait for them to idle down, 
then make the changes and restart the container (we use Tomcat).  Then 
we activate the upgraded nodes one at a time, watching the idp-process 
and other logs for anomalies as we go.
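
In script form, the per-node piece of that procedure looks roughly like 
the following. "lb-disable" and "lb-enable" are placeholders for 
whatever management interface your load balancer provides, the Tomcat 
service name depends on your installation, and the paths assume a 
default /opt/shibboleth-idp install:

  #!/bin/sh
  # Rolling restart of a single IdP node (placeholders as noted above).
  HOST=idp1.example.edu

  lb-disable "$HOST"        # hypothetical: pull the node from rotation
  sleep 600                 # let in-flight activity idle down
  # ...apply the IdP upgrade or config change on $HOST here...
  ssh "$HOST" "sudo service tomcat restart"
  ssh "$HOST" "tail -n 100 /opt/shibboleth-idp/logs/idp-process.log"
  lb-enable "$HOST"         # return it to rotation once the log looks clean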

Back when we used to cluster using Terracotta, this was a bit more 
complicated, as you had to essentially bring down the entire cluster in 
order to upgrade the IdP itself.  We now use a cookie-based stateless 
clustering setup like the one Scott describes in the wiki, and because 
of the lack of state, it's a lot easier to manage.

-- 
%%  Christopher A. Bongaarts   %%  cab at umn.edu          %%
%%  OIT - Identity Management  %%  http://umn.edu/~cab  %%
%%  University of Minnesota    %%  +1 (612) 625-1809    %%

