Recommended clustering configuration

Cantor, Scott cantor.2 at
Thu May 12 16:54:57 EDT 2016

> I've read the docs on the wiki about clustering and storage, but there
> are some things that I'm not clear on.  They go over different ways
> one *might* set it up, but the specifics of what's recommended is not
> spelled out.

Clustering is more a local decision than anything else, so the only recommendation is to review the options and choose what fits best.

If there's an overarching one, I guess it's to use the defaults and avoid state unless you really need it. People put huge stock in defaults, and the default is what it is for a reason: it makes clustering much simpler than the other options.

But I can't tell you what your use cases are. All I can do is list the features impacted, which are spelled out on the clustering page where it discusses the default config and its limitations (see Feature Limitations).

> I'm interested to know what is the specific
> recommendation of the developers/those with experience and what is
> working for them, instead of generalities about the many different
> options one could use.

OSU has run stateless for about 12 years, going back to V1.3. I'd have to be seriously arm-twisted to do anything else.

> How I generally approach clustering for a web app is to have multiple
> nodes running an application, with nodes behind an haproxy load
> balancer that handles status checking and sticky sessions to ensure
> users go back to the same node.  The database (LDAP in this case) is
> somewhere else and has its own clustering, so not a concern here.
> This provides a resilient setup that is still relatively simple.

You need some degree of stickiness, or some kind of advanced proxying to simulate it behind the scenes. But if you want to support the various features that are not usable out of the box, you need more state, and that means configuring something else; stickiness alone is not enough. CAS, artifacts, etc. all involve cross-system calls between clients and RPs.
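To make the point about cross-system calls concrete, here is a minimal sketch in Python. It is a toy model, not the IdP's code: the `Node` and `StickyBalancer` classes, the client names, and the round-robin assignment are all assumptions made up for illustration. It shows a load balancer that is perfectly sticky per client, and an artifact that still cannot be resolved because the RP is a different client from the browser and lands on a different node.

```python
class Node:
    """Hypothetical IdP node that keeps artifacts only in its own memory."""

    def __init__(self, name):
        self.name = name
        self.artifacts = {}  # artifact -> message, local to this node

    def issue_artifact(self, artifact, message):
        self.artifacts[artifact] = message

    def resolve_artifact(self, artifact):
        # One-time use: remove on lookup; None if this node never issued it.
        return self.artifacts.pop(artifact, None)


class StickyBalancer:
    """Hypothetical load balancer: each client always returns to one node."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.assignments = {}  # client id -> node

    def route(self, client_id):
        if client_id not in self.assignments:
            # Assign new clients round-robin, then remember the choice.
            index = len(self.assignments) % len(self.nodes)
            self.assignments[client_id] = self.nodes[index]
        return self.assignments[client_id]


lb = StickyBalancer([Node("idp1"), Node("idp2")])

# The browser logs in and the node it is stuck to issues an artifact.
browser_node = lb.route("browser")
browser_node.issue_artifact("artifact-1", "assertion for user")

# The RP resolves the artifact with its own call, as a separate client.
rp_node = lb.route("rp-backchannel")
resolved = rp_node.resolve_artifact("artifact-1")

print(browser_node.name, rp_node.name, resolved)
```

In this model the browser's stickiness holds, yet the RP's lookup fails because the artifact lives only in the other node's memory. Making it work means giving the nodes a shared place to keep that state, which is the "configuring something else" part.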

> What I still have questions about:
> - What kind of storage on the node is required, if any?

By default? None.

> - Clustering wiki mentions that the IdP defaults are "easily
> clusterable", but the nodes have replay cache and artifacts stored in
> memory, so what is the impact if a node goes down?  Failed queries?

Artifacts don't work by default; the replay cache is admittedly handled per-node and imperfectly, and most people are fine with that.
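As a rough illustration of what "per-node and imperfectly" means, here is a small Python sketch. It is an assumption-laden model rather than the IdP's actual replay cache: the `ReplayCache` class, the 300-second lifetime, and the message ID are invented for the example.

```python
import time


class ReplayCache:
    """Hypothetical in-memory replay cache owned by a single node."""

    def __init__(self):
        self.seen = {}  # message id -> expiry timestamp

    def check(self, message_id, ttl_seconds=300, now=None):
        """Return True if the ID is new to this node, False if it is a replay."""
        now = time.time() if now is None else now
        # Drop entries whose lifetime has passed.
        self.seen = {m: exp for m, exp in self.seen.items() if exp > now}
        if message_id in self.seen:
            return False
        self.seen[message_id] = now + ttl_seconds
        return True


node_a = ReplayCache()
node_b = ReplayCache()

first = node_a.check("msg-123", now=0)          # first delivery, node A
replay_same = node_a.check("msg-123", now=10)   # replayed to node A
replay_other = node_b.check("msg-123", now=10)  # replayed to node B

print(first, replay_same, replay_other)
```

A replay sent back to the same node is caught, while the same message replayed to the other node is accepted because that node has never seen it. That gap is the imperfection, and it is the trade-off accepted in exchange for not sharing state between nodes.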

> Or cache misses resulting in longer response time for a request?  Is
> this type of failure a big issue that could cause problems?  Or would
> a user "just" need to relogin on the new node?

There's nothing out of the box that can "miss" unless you blow up mid-transaction. They don't need to re-login across nodes, no.

> I'm not a java developer and I feel that the docs sometimes assume a
> detailed level of java framework knowledge, so any help in
> understanding this at an administrator level is appreciated.

I don't think the clustering topic has much Java material in it, really, but that's a bit of an exception. A lot of the other material does involve knowing how to read Javadocs to really do things with it.

-- Scott
