It's been ages since I wrote my last post. I guess now it's time to make up for that with a post about Plone deployment. In the last few months I've had the chance to invest part of my time at work in developing our Plone deployment model, and since there haven't been that many posts about the subject I thought I'd share my experiences with the community.
Before I get to the details, here's some background information about the environment I've been dealing with. I'm working at the University of Jyväskylä, Finland. Our university is a heavyweight Plone adopter here in Finland. Plone is in use by every faculty and department. We have about 500 - 700 content managers and about 50 - 60 separate Plone instances. Our front page gets about 2 - 3 million page views / month. The important figure here is the number of instances. 50 - 60 is just my estimate of our current status, and that number is increasing every year. I'm sure you get the idea, so I'll continue to our previous deployment model, which I'm sure is familiar to anyone who has had the pleasure of deploying Plone sites.
Current setup
We've got a fairly normal setup of RHEL servers, an Apache + Varnish combination and loads of Plone sites. Our sites mainly use Plone 3.3.5, but we have an increasing number of Plone 4 sites and also a few Plone 2.1 and 2.5 sites. We've used buildout to deploy new sites and update the old ones, and as I've mentioned in my previous posts (Managing multiple plone buildouts, problems with plone version pinning) it hasn't always been an enjoyable experience. Don't get me wrong - buildout is an awesome tool if you need to put up a few sites every now and then and don't need to look after them, but running about 30 buildout scripts within a scheduled update window of a few hours isn't fun. I know many of you are now thinking: why didn't they script that? We did. We also developed tools to update packages automatically to avoid all the manual steps - yet we still hit problems in updates. Some of them were plain human errors (e.g. wrong version pinnings) and some were caused by unpredictable behavior of the software.
Reinventing the wheel?
Every now and then we had been thinking about a better deployment story for Plone sites, until we realized that the things we wanted to achieve sounded just like package management software. As we're using Red Hat on our servers, I held my breath and jumped head first into the RPM world. There isn't much information on the web about deploying Plone sites as RPM packages. I knew the folks at Weblion had developed an environment for Debian-based distros, but that didn't help much. Luckily Google revealed that Nikolay Kim from Enfoldsystems had been in talks with Fedora packagers about packaging Plone as RPM. As far as I understood from the mailing list messages, this didn't work out well and the Enfoldsystems packages never ended up in Fedora's repository due to disagreements over packaging policy. Nevertheless, Enfoldsystems have kindly published their Plone buildouts and RPM specs in their Subversion repository.
RPM setup
I used Nikolay Kim's work as a base structure and fitted it to our environment. Nikolay had divided Plone into two separate RPMs. The first one (plone-base) contains all the common packages Plone needs. The other package contains the Plone instance and its eggs. Nikolay's packages should work out of the box as they are, but we needed to modify them a bit to suit our needs. I created separate RPM packages for the Python virtualenvs - one for Python 2.4.6 and one for Python 2.6.5. I also created RPM packages for a few Python libraries (python-imaging, python-lxml, python-ldap) so that they install into the virtualenvs instead of the system Python. With these ready, I modified Nikolay's spec files to use my virtualenv packages and modified the buildout.cfg file so that it extends our shared configuration, and the initial setup was done.
There was a lot of manual work in adding and testing RPM specs for our buildouts, but now that it's done life seems to smile at me. To make my workload even lighter and to avoid dull copy-paste work, I created a custom paster template which creates a buildout folder with an RPM spec ready for rpmbuild to do its magic. I attached the rpmbuild process to Hudson by creating parametrized builds - when one of our developers includes a certain commit message and pushes the changes to Mercurial, this launches a Hudson build, runs the tests and, if everything went OK, finally runs rpmbuild. This is done with the help of the hghudson package, which I customized a bit to allow parametrized Hudson builds.
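To illustrate the commit-message trigger described above, here is a minimal sketch of the idea. It is not the hghudson code: the `[rpm: ...]` marker, the job name and the `BUILDOUT` / `REVISION` parameter names are all made up for the example, and it assumes the job is reachable through Hudson's `buildWithParameters` URL.

```python
import re
from urllib.parse import urlencode

# Hypothetical marker a developer would put in the commit message,
# e.g. "Fix theme [rpm: mysite]". The real convention may differ.
BUILD_MARKER = re.compile(r"\[rpm:\s*([\w.-]+)\]")


def requested_builds(commit_message):
    """Return the buildout names found in build markers, in order."""
    return BUILD_MARKER.findall(commit_message)


def build_url(hudson_base, job, buildout, revision):
    """Build the URL that requests one parametrized build.

    BUILDOUT and REVISION are invented parameter names; they would have
    to match the parameters defined in the Hudson job itself.
    """
    query = urlencode({"BUILDOUT": buildout, "REVISION": revision})
    return "%s/job/%s/buildWithParameters?%s" % (
        hudson_base.rstrip("/"), job, query)
```

A Mercurial hook could call `requested_builds()` on each pushed changeset and request `build_url(...)` for every name it returns. Commits without the marker would simply not trigger anything.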
What did we achieve?
You may wonder what makes RPM deployment better than buildout.
Here are some pros, listed in no particular order:
- It fits well with the regular operating system maintenance.
- Our system administrators can now update servers/packages automatically without any knowledge about Plone.
- If something goes wrong we can easily roll back to the previous package version.
- Deploying a new site happens just like installing any other package with package management software.
- We can be sure there isn't going to be any trouble with PyPI or some other critical site being down when we're updating or creating a new instance.
- We still have buildout in the background to make developers' lives easy.
- It saves time, nerves and makes it possible for me to once again enjoy Wednesdays (guess which day our scheduled maintenance window is?) :)
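To give an idea of what deploying and rolling back look like from the administrator's side, here is a minimal sketch that only builds the yum command lines. The package name `plone-mysite` and the version numbers are made up, and it assumes a yum version that supports the `downgrade` command.

```python
def yum_command(action, package, version=None):
    """Return the yum command line (as a list) for one package action."""
    if action not in ("install", "update", "downgrade"):
        raise ValueError("unsupported action: %s" % action)
    # yum accepts "name-version" to select a specific package version.
    target = "%s-%s" % (package, version) if version else package
    return ["yum", "-y", action, target]


# Deploy a new site, then roll back to an older build if an update goes wrong.
deploy = yum_command("install", "plone-mysite")
rollback = yum_command("downgrade", "plone-mysite", "1.0")
```

The resulting lists can be handed to `subprocess.check_call()` on the server, which is no different from how any other package would be handled.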
I know some of these can be achieved by using buildout as well - e.g. by mirroring PyPI, using eggproxy etc. - but nevertheless my experience is that running buildouts on a production server always has a bigger chance of failing than updating a tested RPM package. I still have to admit that at the moment this deployment model is new to us and I probably don't know all the cons yet. Here are a few, which are mostly related to the setup process of the RPM deployment model:
- You'll need a way to create the RPM spec automatically.
- RPM deployment isn't as straightforward as using buildout.
- You'll get the most out of the pros only when you're serving several instances.
- You'll need to know at least the basics of how to create RPM packages to set up this kind of environment.
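The first point, creating the RPM spec automatically, can be sketched roughly like this. It is a heavily simplified illustration and not our paster template: the `/opt/plone` path, the `plone-<site>` naming and the virtualenv package name are assumptions, and a real spec would also need the steps that actually run the buildout.

```python
from string import Template

# Heavily simplified spec skeleton. All names and paths are made up;
# only the plone-base dependency comes from the setup described above.
SPEC_TEMPLATE = Template("""\
Name:           plone-$site
Version:        $version
Release:        1
Summary:        Plone instance $site
License:        GPLv2
Requires:       plone-base, $virtualenv

%description
Plone instance $site built from a tested buildout.

%install
mkdir -p %{buildroot}/opt/plone/$site
cp -a . %{buildroot}/opt/plone/$site

%files
/opt/plone/$site
""")


def render_spec(site, version, virtualenv):
    """Fill in the spec skeleton for one buildout."""
    return SPEC_TEMPLATE.substitute(
        site=site, version=version, virtualenv=virtualenv)
```

Calling `render_spec("mysite", "1.0", "python26-virtualenv")` returns the spec text, which could then be written next to the buildout for rpmbuild to pick up. `string.Template` is used here because its `$name` placeholders don't clash with the `%{...}` macros in spec files.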
I hope this post will be useful to someone struggling with similar problems. I will be posting updates about the subject when our new deployment model has seen more use. I'd also like to thank Nikolay Kim and Enfoldsystems for the excellent work they have done to make Plone RPM packaging as easy as it is now. Thank you!
I was mocked openly when I did this about two years ago:
http://rudd-o.com/new-projects/the-missing-rpms/zope
http://rudd-o.com/new-projects/python-improvements
It can be done. It works well. Enjoy.
I remember reading the conversation in the #plone channel back then and also wondering what the point of RPM deployment was when you had buildout. I could say I've seen enough to realize it now :) In defense of the Plone community, I think buildout is still the best way to deploy Plone sites on a small scale. On a larger scale, where server updates are done by other people, I'd choose the RPM model over buildout without any hesitation.
I still wonder why you ended up packaging every egg as an RPM? Many of them are only usable by Plone instances, and I see little or no additional value in them being separately installable as RPMs. Please enlighten me if I've missed something obvious.