Since we splitted our one huge Plone site to several smaller ones we've been looking for the correct procedure to keep things running smoothly and to make updates as painless as possible. So far we haven't found the right way and every update has had some problems. Some of the problems are related to the fact that we have about 24 sites to maintain and adding a new content type to all of them means lots of work. Below are all the steps needed for update (and solutions we've founded so far).
1. Add your new product to buildout.cfg eggs and zcml list.
Do this 24 times and you'll find yourself thinking there has to be better way. Luckily there is; you can extend your buildout.cfg with some general config file (eg. deployement.cfg). Most tutorials describe a way where you have those two files living side by side in filesystem and you run buildout with -c deployement.cfg option, but this doesn't remove the actual problem.
We have our deployement.cfg living in different server on the web, but this brings us some new challenges. Actual extending of buildout.cfg:s parts won't work, or at least I haven't found the way for that). So if your buildout.cfg includes something like this
and you want to extend that eggs list from deployement.cfg you'd think following would work:
Well - for our disappointment - it won't. Instead of extending you'll need to remove eggs attribute listing from original buildout.cfg and add it to your deployement.cfg with all the eggs the original buildout.cfg had. That'll do what you want and now you can share that deployement.cfg by adding url to it in your buildout.cfg:s extends attribute.
2. Run the actual buildout command
Once again - repeat this 24 times and if you didn't swear editing all those buildout.cfg:s I mentioned in step I'm sure you'll do that by now. We didn't find any ready solution for this so we made our own called jyu.bomanager. Our tool has buildout.cfg where we can configure all the sites we have. By convention we've named our buildout directorys with the same name our sites are. What jyu.bomanager does is that it runs buildouts automatically for all our sites and restarts them. All seems to be good here, but wait - there are problems too. I'll come to that later.
3. Update/install the new addon.
Now when we have our 24 sites up and running we just need to update or install our new product. I guess you already see what's the problem so I won't repeat myself anymore. To solve this problem we developed a tool similar with jyu.bomanager called jyu.sitemanager. What jyu.sitemanager does is that - depending our mission - it can log in to all of our sites as admin and either install specified new product(s) or upgrade those which needs updating. If everything has gone ok this far we don't have any problems with this step either.. unless we have failed testing our new addon and there are bugs in it, but that'll never happen to us :)
Weakest link...
...is step 2. So far running buildout is usually the part where things fall apart - even when on testing server all went well few minutes ago. It won't help much if we've automated all the steps so that you'd need only few commands to watch the process just happen if buildout fails. There are many points where we can fail - plone.org could be down, PyPI could be down or there could be some nasty surprise where we have some unpinned version updated - maybe brought to our buildout by some addon requirements - or just forgotten to pin.
Today we had scheduled 2 hours timeframe to update all our sites and we hoped that this time everything would go smoothly. Maybe next time... it seems that even though we've mirrored http://dist.plone.org/release/3.2.2 to our own server and pinned versions we still had some loose version out there. Plone 3.2.2 pins zc.buildout to version 1.1.1, but running buildout got us new version of plone.recipe.zope2zeoserver which expected to find 'relative_paths' attribute from zc.buildout. That attribute has appeared on zc.buildout since 1.2 versions, so you see the problem. We modified versions.cfg so that we'll get zc.buildout 1.4.1 version and things went smoothly. Another problem came from LDAPUserFolder. We have python-ldap 2.3.8 installed in our production environment, but LDAPUserFolder had version requirement for anything bigger that 2.3.0 (I'm not sure about exact version number and I'm too lazy to find out) so our buildout happily fetched 2.3.10 which needs libraries our production server doesn't have. Crash. Another version pinning and we were good to go.
Although buildout is blessing sometimes it feels like wild wild west. It isn't funny when we're supposed to update our site and our buildout hangs when it's supposed to fetch data from pypi or plone.org. Now we're thinking setting up eggproxy and/or own pypi mirrors so maybe next time we have smooth experience.
1. Add your new product to buildout.cfg eggs and zcml list.
Do this 24 times and you'll find yourself thinking there has to be better way. Luckily there is; you can extend your buildout.cfg with some general config file (eg. deployement.cfg). Most tutorials describe a way where you have those two files living side by side in filesystem and you run buildout with -c deployement.cfg option, but this doesn't remove the actual problem.
We have our deployement.cfg living in different server on the web, but this brings us some new challenges. Actual extending of buildout.cfg:s parts won't work, or at least I haven't found the way for that). So if your buildout.cfg includes something like this
[instance]
...
eggs =
Plone
...
and you want to extend that eggs list from deployement.cfg you'd think following would work:
[instance]
eggs +=
collective.yourproduct
Well - for our disappointment - it won't. Instead of extending you'll need to remove eggs attribute listing from original buildout.cfg and add it to your deployement.cfg with all the eggs the original buildout.cfg had. That'll do what you want and now you can share that deployement.cfg by adding url to it in your buildout.cfg:s extends attribute.
2. Run the actual buildout command
Once again - repeat this 24 times and if you didn't swear editing all those buildout.cfg:s I mentioned in step I'm sure you'll do that by now. We didn't find any ready solution for this so we made our own called jyu.bomanager. Our tool has buildout.cfg where we can configure all the sites we have. By convention we've named our buildout directorys with the same name our sites are. What jyu.bomanager does is that it runs buildouts automatically for all our sites and restarts them. All seems to be good here, but wait - there are problems too. I'll come to that later.
3. Update/install the new addon.
Now when we have our 24 sites up and running we just need to update or install our new product. I guess you already see what's the problem so I won't repeat myself anymore. To solve this problem we developed a tool similar with jyu.bomanager called jyu.sitemanager. What jyu.sitemanager does is that - depending our mission - it can log in to all of our sites as admin and either install specified new product(s) or upgrade those which needs updating. If everything has gone ok this far we don't have any problems with this step either.. unless we have failed testing our new addon and there are bugs in it, but that'll never happen to us :)
Weakest link...
...is step 2. So far running buildout is usually the part where things fall apart - even when on testing server all went well few minutes ago. It won't help much if we've automated all the steps so that you'd need only few commands to watch the process just happen if buildout fails. There are many points where we can fail - plone.org could be down, PyPI could be down or there could be some nasty surprise where we have some unpinned version updated - maybe brought to our buildout by some addon requirements - or just forgotten to pin.
Today we had scheduled 2 hours timeframe to update all our sites and we hoped that this time everything would go smoothly. Maybe next time... it seems that even though we've mirrored http://dist.plone.org/release/3.2.2 to our own server and pinned versions we still had some loose version out there. Plone 3.2.2 pins zc.buildout to version 1.1.1, but running buildout got us new version of plone.recipe.zope2zeoserver which expected to find 'relative_paths' attribute from zc.buildout. That attribute has appeared on zc.buildout since 1.2 versions, so you see the problem. We modified versions.cfg so that we'll get zc.buildout 1.4.1 version and things went smoothly. Another problem came from LDAPUserFolder. We have python-ldap 2.3.8 installed in our production environment, but LDAPUserFolder had version requirement for anything bigger that 2.3.0 (I'm not sure about exact version number and I'm too lazy to find out) so our buildout happily fetched 2.3.10 which needs libraries our production server doesn't have. Crash. Another version pinning and we were good to go.
Although buildout is blessing sometimes it feels like wild wild west. It isn't funny when we're supposed to update our site and our buildout hangs when it's supposed to fetch data from pypi or plone.org. Now we're thinking setting up eggproxy and/or own pypi mirrors so maybe next time we have smooth experience.
In our deployments we're using ``collective.eggproxy`` to cache packages on our local server and the ``buildout.dumppickedversions`` buildout extension to track down any unpinned eggs.
ReplyDeleteThese save us of most trouble regarding PyPI outage and accidental version changes and I can recommend both tools (available on PyPI).
You say the following doesn't work:
ReplyDeleteIn buildout.cfg:
[instance]
...
eggs =
Plone
...
In deployment.cfg
[instance]
eggs +=
collective.yourproduct
Well, I know that an old version of zc.buildout have an issue with this.
I currently use zc.buildout = 1.4.3 and this works, I use it!
You can verify what's going on with:
bin/buildout annotate
Vincent Fretin
This comes a bit late, but thanks to both of you for your comments. We had recently a nice sprint where we tried to reorganize our buildout environment again.
ReplyDeleteI can confirm the Vincents advice to update zc.buildout worked like a charm. We also put up eggproxy, but still have some random problems with it. As we changed things with a pretty rough hand I think I'm going to write another blog post when I have some spare time.