Key rollover docs/procedure

Cantor, Scott cantor.2 at osu.edu
Fri Sep 13 10:17:36 EDT 2019


Key rollover is a practical impossibility unless you have few remote services and/or are me. I'm not saying that to be boastful. It's taken me over 6 months, and we still have two stragglers, and a variety of cleanup tasks to perform. And I know more than anybody alive about this topic.

And in the process I've found half a dozen completely broken vendors that weren't even verifying signatures. The point being, if you go down this road, your testing will find things you won't like and will have to deal with, at great cost in time and stress.

So you do it when you absolutely have to, not because you want to. That said, MD5 is going to keep biting you, and...umm, yuck. That sucks.

Unless your key is compromised, or there are individuals with access to it, now or in the past, that you don't feel good about, you don't need to change the key at all. You just need a new certificate for the existing key.

Shibboleth SPs couldn't care less what the certificate is, only what key it carries, so you immediately avoid any risk with all those SPs, and perhaps some others. Admittedly that doesn't solve the problem, but it at least helps some. The rest is a tremendously time-consuming exercise in planning, arguing, cajoling, and drinking heavily.
 
Or you just break stuff and don't care, which is in fact what most organizations seem to do. Maybe I'm naïve in believing that's not a professional way to do my job.

Above all, you need a deep understanding of how it all works, how to debug remote systems you don't have access to, careful planning, and a lot of time. And while you're doing it, SPs will start doing dumb things to their keys and you'll be juggling balls like you wouldn't believe.

Honestly, you may just want to deploy a second certificate or key for the cases that can't handle MD5 and treat it as an exception.

The one thing I advise is to use it as an opportunity to take careful stock of everything, document all the systems you left undocumented, and tag all the metadata of all the systems that you find can't handle a key rollover. That allows future changes to be much more confidently executed. When in doubt, lock systems to a key you control at the IdP so that flipping keys in federation or campus metadata in the future won't impact them until you deal with them.
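
To make the tagging idea concrete, here is a rough sketch of what such an inventory could look like. Everything in it is an assumption for illustration: the entity IDs, the status names, and the helper functions are hypothetical, and it does not touch any real IdP or metadata configuration. It only shows the "when in doubt, lock it" rule applied to a list of SPs.

```python
from dataclasses import dataclass
from enum import Enum


class RolloverStatus(Enum):
    OK = "ok"            # tested: picks up a new key from metadata
    BROKEN = "broken"    # tested: cannot handle a key rollover
    UNKNOWN = "unknown"  # never tested, or never documented


@dataclass(frozen=True)
class ServiceProvider:
    entity_id: str
    status: RolloverStatus
    notes: str = ""


def needs_pinned_key(sp):
    """When in doubt, lock: anything not proven OK stays on a pinned key."""
    return sp.status is not RolloverStatus.OK


def split_inventory(inventory):
    """Return (pinned, free) lists of entity IDs, each sorted."""
    pinned = sorted(sp.entity_id for sp in inventory if needs_pinned_key(sp))
    free = sorted(sp.entity_id for sp in inventory if not needs_pinned_key(sp))
    return pinned, free


# Hypothetical entries; a real inventory would come from your own records.
INVENTORY = [
    ServiceProvider("https://sp.example.edu/shibboleth", RolloverStatus.OK),
    ServiceProvider("https://vendor-a.example.com/saml", RolloverStatus.BROKEN,
                    "certificate uploaded by hand"),
    ServiceProvider("https://vendor-b.example.com/sp", RolloverStatus.UNKNOWN),
]

if __name__ == "__main__":
    pinned, free = split_inventory(INVENTORY)
    print("lock to a key you control:", pinned)
    print("safe to follow metadata:", free)
```

Treating UNKNOWN the same as BROKEN is deliberate here: an untested SP gets locked until somebody has actually dealt with it.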

For context, I have over 100 cloud services with all manner of broken SAML support in use, aside from the 400+ campus SPs and all the ad hoc InCommon/eduGAIN usage. I'm down to 2 using our old key now.

-- Scott



