Cool technology--which I'm not letting anywhere near any production server I hav...

danudey · on July 28, 2010

My company has just purchased 150-200 ksplice licenses, and it's really a fantastic technology. It loads up a kernel module that does all the runtime patching for you, but if you don't run it, it doesn't update.

The benefit I see here is that it provides you with a kernel that has all the latest security updates without having to recompile, upgrade, etc. When you reboot next, you have the same kernel you had last time you rebooted (unless you've done an upgrade in the meantime), so you know it's going to boot.

ksplice doesn't modify the kernel on disk, only in memory. In the unlikely (so far, for us) event that one of those patches is incompatible, causes problems, or crashes your system, and you want to reboot to a known-good configuration, just log into their system and deactivate the server; it then won't be able to apply any updates.

Your post seems to assume that ksplice changes your booting kernel, and thus who knows if it will boot next time. This doesn't happen. Unless you upgrade your kernel yourself (e.g. through yum or apt), you're always booting from the same (insecure) kernel, and the ksplice daemon reapplies the patches after you boot to bring you up to date.

It's really a marvellous technology, and it's worked very well for us so far.

wzdd · on July 28, 2010

Exactly. I saw this back when it was relatively new, at EuroSys 09, and it seemed very smooth, very neat, and very much not what I'd want on anything mission-critical.

The most valuable thing I took away from it was a good lesson in marketing a product. I mean, the tech itself obviously took a lot of work, but it doesn't really attack what people doing dynamic upgrades consider the hard problems -- in particular, it doesn't handle changes to data very well.

However, look at the way it was presented. First, we get good stats showing that the vast majority of kernel updates can be applied without solving difficult data update problems. This was a particularly important point to make at a research-related conference.

But the marketing doesn't stop there -- they also run a service which you can use to generate updates for you, so you can track from kernel to kernel automatically.

So, the end result is a strong argument for something which a) works most of the time; b) will work for you without much effort on your part; and (most importantly) c) gives you fantastic bragging rights: "My OS doesn't ever need rebooting!"

In the wrong hands, this would have been a mediocre research project: "Sure, we can upgrade the kernel, but we have to interpose functions and create shadow data structures, the result isn't anything like what a 'real' kernel would look like after a reboot, so you have no guarantee of anything, and sometimes it doesn't work." Instead we get something that everybody is talking about and that is rapidly emerging as a strong selling point for Linux. My respect to the Ksplice team for doing three jobs well: research, implementation, and marketing.

chwahoo · on July 27, 2010

This idea that OS (or application) startup provides a valuable upgrade sanity check under controlled conditions is an interesting point and one that hadn't occurred to me in my work with runtime upgrades.

However, Ksplice mainly supports security patches, which tend to be localized and less risky than large semantic changes or feature additions. I suspect that such changes are extremely unlikely to produce an unbootable system.

bnoordhuis · on July 27, 2010

Same goes for rebooting after upgrading libc and other core libraries. Running processes will keep on using the old version of the library and that's problematic with long-running processes like web servers and databases. You don't want to find out months after the fact that your business-critical application is incompatible with the updated library.
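
One way to see which processes are still running against an old library is to look for mappings that Linux marks as "(deleted)" in /proc/<pid>/maps. Below is a minimal sketch, assuming a Linux /proc filesystem and a package manager that replaces library files rather than overwriting them in place; the function names are made up for illustration.

```python
import os


def stale_libraries(maps_text):
    """Return shared-object paths that are mapped but deleted on disk.

    Each line of /proc/<pid>/maps has the form:
        address perms offset dev inode [pathname]
    and Linux appends " (deleted)" to the pathname of an unlinked file.
    """
    suffix = " (deleted)"
    stale = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5].strip()
        if path.endswith(suffix) and ".so" in path:
            stale.add(path[: -len(suffix)])
    return stale


def scan_processes(proc_root="/proc"):
    """Map each readable PID to the set of stale libraries it still uses."""
    results = {}
    if not os.path.isdir(proc_root):
        return results
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "maps")) as f:
                stale = stale_libraries(f.read())
        except OSError:
            continue  # process exited or we lack permission
        if stale:
            results[int(entry)] = stale
    return results
```

Running `scan_processes()` as root after a library upgrade lists the long-running processes that would need a restart to pick up the new version.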

nitefly · on July 27, 2010

If you'd still like to do a scheduled reboot a few times per year, you can--but at least now you can do so at your leisure, without being insecure for days or weeks because your system is not up to date.