clustering with HazelcastStorageService
Paul B. Henson
henson at cpp.edu
Thu Apr 28 17:12:56 EDT 2016
> From: Cantor, Scott
> Sent: Thursday, April 28, 2016 6:50 AM
>
> I don't really know what you mean by separate instance.
I apologize. I'm mostly speaking from a position of ignorance about the specific details of the idp implementation; I wish I had the time to dig through the idp code and understand it better, rather than doing the occasional cursory overview and wasting your time talking from a higher-level, abstract position.
So the unicon implementation is:
public class HazelcastMapBackedStorageService extends AbstractStorageService {
Somewhere the idp does a "storage_service = new StorageService" or something of that nature; what I mean by instance is each of those occurrences. Right now I am using the HazelcastStorageService for the idp.session.StorageService, the idp.replayCache.StorageService, and the idp.cas.StorageService. It's not clear to me whether those usages all share a single instance of the back-end object or each one creates its own. I was poking through the code, saw a piece that said "new StorageBackedIdPSession" in the session handling code, and perhaps misinterpreted it.
> I don't know what you mean, actually. I said "context", not storage engine.
> The storage plugins implement a Java interface, and so a bean of that type is
> the "engine" in the IdP. The context/key split is just a two part key to allow
> for co-habitation within one storage instance by different storage clients, and
> the context is just a string.
Based on this, it sounds like there is one instance of the storage engine that is shared by everything in the idp that needs storage that is configured to use that type of storage engine?
If I understand that correctly, then the Hazelcast storage backend uses the context as the name of the map in which to store the data. A Hazelcast map is basically a Java map that is transparently distributed across a cluster of nodes. If I understood Jonathan correctly, if I knew the name of the context the idp was going to use, I could configure the corresponding Hazelcast map, which has a variety of options.
In particular, one option is how much memory the map is allowed to use. For example, I could configure the map for the replay cache to use a maximum of 5% of the available memory, and to automatically evict the oldest entries if it hit that limit. You can also configure how much redundancy you want. By default, there is one node of redundancy, so you could lose one cluster node and not lose any data. If the session context had a static name, I would be able to configure it to have two nodes of redundancy if I wanted to, and be able to lose two nodes of the cluster without losing any data.
There's also the question of efficiency. As I understand it now, every session, given that it uses a random context name, creates a new Hazelcast map, which I believe is much less efficient and involves considerably more overhead than if all sessions shared a single map. That is of course an implementation detail specific to the hazelcast storage backend, but it might be easier to resolve with a minor change in how the session contexts are named. I will need to consult with Jonathan to make sure I understand how the code currently works, whether the inefficiencies of multiple maps justify redesigning the current implementation, and what a better approach might be.
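To illustrate what I mean by one map per context, here is a rough sketch in plain Java. It is not the unicon or idp code; ContextStoreSketch and its methods are hypothetical names, and an ordinary HashMap stands in for each Hazelcast map. It only shows how a static context such as "_replay" keeps reusing one map, while a random context per session adds a new map every time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for a storage backend that treats each context
// string as the name of a separate map (as I understand the Hazelcast one does).
class ContextStoreSketch {
    // context name -> (key -> value); each inner map plays the role of one named map
    private final Map<String, Map<String, String>> maps = new HashMap<>();

    // Store a record, creating the map for this context on first use.
    void create(String context, String key, String value) {
        maps.computeIfAbsent(context, c -> new HashMap<>()).put(key, value);
    }

    // Look up a record; returns null if the context or key is unknown.
    String read(String context, String key) {
        Map<String, String> m = maps.get(context);
        return m == null ? null : m.get(key);
    }

    // Number of distinct maps created so far.
    int mapCount() {
        return maps.size();
    }

    public static void main(String[] args) {
        ContextStoreSketch store = new ContextStoreSketch();
        // Static context: every replay record lands in the same map.
        store.create("_replay", "msg-1", "seen");
        store.create("_replay", "msg-2", "seen");
        // Random per-session contexts: each session gets its own map.
        for (int i = 0; i < 3; i++) {
            store.create(UUID.randomUUID().toString(), "session", "data");
        }
        System.out.println("maps in use: " + store.mapCount());
    }
}
```

In this toy version the count is one map for "_replay" plus one per session, which is the growth pattern I am worried about with the real backend.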
> The session cache happens to store its data by using random context names
> that are derived from the session ID it's storing.
>
> Most of the other cases tend to pick a context like "_replay" to store records
> under.
Is the session cache the only one that uses random context names? If so, a slightly kludgy heuristic could be that if the context name is random, it belongs to the session map. When you have the time, I would appreciate it if you could confirm the context names for the idp.replayCache.StorageService and the idp.cas.StorageService; if those are static, I could at least test configuring the hazelcast maps for those.
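A minimal sketch of that heuristic, assuming the static context names are known up front: anything not in the known list is treated as a session context and routed to one shared map. MapNameHeuristic, the "idp-sessions" map name, and the length-prefixed key format are all my own hypothetical choices, and "_replay" is the only static name mentioned so far in this thread.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: decides which map a (context, key) pair should live in.
class MapNameHeuristic {
    // Known static contexts; "_replay" is the only one named so far,
    // others would be added once their names are confirmed.
    private static final Set<String> STATIC_CONTEXTS =
            new HashSet<>(Arrays.asList("_replay"));

    // Hypothetical name for a single map shared by all sessions.
    static final String SESSION_MAP = "idp-sessions";

    // Static contexts keep their own map; everything else is assumed to be
    // a random per-session context and shares the session map.
    static String mapNameFor(String context) {
        return STATIC_CONTEXTS.contains(context) ? context : SESSION_MAP;
    }

    // Within the shared session map, fold the original context into the key
    // (length-prefixed) so records from different sessions cannot collide.
    static String keyFor(String context, String key) {
        if (STATIC_CONTEXTS.contains(context)) {
            return key;
        }
        return context.length() + ":" + context + key;
    }
}
```

With something like this, the session map would have a fixed name that could be given its own memory and redundancy settings, at the cost of the backend having to guess which contexts are sessions.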
Thanks much...
--
Paul B. Henson | (909) 979-6361 | http://www.cpp.edu/~henson/
Operating Systems and Network Analyst | henson at cpp.edu
California State Polytechnic University | Pomona CA 91768