Part 7a: Setting up a flexible Lenya environment
21 May 05
So you say you have a good Lenya setup going, eh? Well, are you running into memory usage issues? What about when you want to change some things in your Lenya setup and you don't want to affect the live site at all? Looks like we need to expand our setup to be a little more flexible, and so that's what I'll take you through next.
I can't take credit for the actual setup here. This setup was discussed by Michael Wechner and Gregor Rothfuss while we met in Boston this past weekend. But, hopefully the love can be passed on to others getting a little more serious about using Lenya full-time.
Our Default Setup
To recap, our initial setup has been setting up Tomcat (or Jetty, whichever you prefer) and hosting the live and authoring site all on one installation of Tomcat. OK, for small simple sites, there's nothing wrong with this. You put together your sites, you may restart during early morning hours when you make changes, but you're happy.
What if your installation gets pretty large, with multiple users editing and more and more people visiting your site? Notice how memory usage tends to go up? Well, if you're at the point where the load is too much, you might want to first try moving the live and authoring environments onto separate servers.
Splitting up Authoring and Live environments
So this actually isn't too hard. You'll need to only worry about one thing: how the two servers share the same publications. You can tackle this in multiple ways:
- Create an NFS share between the two servers
- Use your ultra-powerful NAS to share disk space
- Run a cron job that does a secure copy between the servers every few minutes (using rsync)
- Attach a process to the publishing of a page that secure copies the updated files to the live server
There are pluses and minuses to each method. NFS is pretty darn simple to set up, but it has quite a few drawbacks: file locking can be wacky on NFS, and tuning performance for NFS is, um, difficult. Most importantly, should your authoring server go down, your live server will too, since the files are only being hosted from the authoring server's NFS share.
Sharing space on a NAS over a SAN is probably a fantastic way to go. Speed is fabulous over a fiber connection to the NAS, and the NAS should have a way to fail over to a replicated environment within seconds. The big downer: cost. If you don't already have a NAS or a SAN, this solution definitely ain't for you.
Running a cron job every 5 minutes or so to rsync folders between multiple servers is relatively easy to set up. rsync works very fast, so when run at a decent interval, it can easily do the job. It is also great on performance because each server is using files from its local disk rather than a network share like NFS. There are two downers to running rsync as a cron job. First, a published change takes about 2.5 minutes on average (and up to 5) to show up on the live site. Second, if you have huge amounts of editing happening in the span of 5 minutes and one rsync isn't finished when the next one is supposed to start, the runs begin to overlap and load on the server can climb quickly.
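One way to keep rsync runs from piling up is to wrap the command in a small lock check, so a new run simply skips itself if the previous one is still going. This is only a sketch of the idea, not something from the Boston discussion: the function name run_exclusive and the lock location /tmp/lenya-rsync.lock are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command only if no earlier run holds the lock.
# The lock path is an assumption; put it anywhere the sync user can write.
LOCK_DIR="${LOCK_DIR:-/tmp/lenya-rsync.lock}"

run_exclusive() {
    # mkdir is atomic, so only one caller at a time can create the directory.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "previous sync still running, skipping"
        return 0
    fi
    status=0
    "$@" || status=$?
    # Release the lock so the next scheduled run can go ahead.
    rmdir "$LOCK_DIR"
    return $status
}
```

You would call it as run_exclusive followed by the full rsync command line. One caveat: if the machine dies mid-sync, the lock directory stays behind and has to be removed by hand before syncing resumes.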
Attaching your own process to the publish part of the workflow is an unknown to me. I've never attempted this myself, and I only faintly remember it being mentioned that someone had done this. If anyone has, please do leave a comment and share.
My recommendation: use rsync as a cron job every 5 minutes. If nothing's changed in those 5 minutes, rsync takes up very little load (it doesn't take that much load to begin with). And if you're changing that much content in a 5 minute window, you probably have enough money to buy a NAS and do it the expensive way!
Setting up rsync between your authoring and live servers
So to set this up, you'll want to be using SSH. Since I'm assuming you're using RedHat Linux, SSH should be turned on by default and telnet turned off (you aren't really using telnet, are you?). You'll want to run rsync over SSH so that you aren't passing your documents over in plain text between the servers. To make this easy so that it doesn't prompt for a password every time you try to rsync, you can create client keys between the two servers in SSH. This document is a great tutorial on how to do this. Be sure that you set up the client keys for the user that has permission to access your pubs folder!
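For the impatient, the key setup boils down to roughly the following, run on the authoring server as the user that owns the pubs folder. Treat it as a sketch under assumptions: the function names are mine, the key lands in the default ~/.ssh/id_rsa location, and the same username is assumed to exist on both servers. Note that the key has no passphrase (that's what lets cron use it unattended), so guard that account accordingly.

```shell
#!/bin/sh
# Assumed values: default key path, and live.server.com as the live host.
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_rsa}"
LIVE_HOST="${LIVE_HOST:-live.server.com}"

make_sync_key() {
    # Create a passphrase-less RSA key pair unless one is already there.
    mkdir -p "$(dirname "$KEY_FILE")"
    [ -f "$KEY_FILE" ] || ssh-keygen -q -t rsa -N "" -f "$KEY_FILE"
}

install_sync_key() {
    # Append the public key to authorized_keys on the live server.
    # This asks for the account password one last time.
    ssh "$LIVE_HOST" 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys' < "$KEY_FILE.pub"
}
```

Run make_sync_key, then install_sync_key, and after that "ssh live.server.com" should log you in without a password prompt.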
Now you'll need to set up Lenya on both servers. See my tutorial on installation, or check out the Lenya website for their documentation. This time, though, you'll want to remove all the contents of the pubs directory on the live server. You'll be using the authoring server as the master server, and syncing files in the pubs folder to the live server.
Try a simple test on the authoring server (remember to be logged in as the user that has permissions to the pubs directory):
/usr/bin/rsync -avz -e ssh --delete /usr/local/tomcat/webapps/lenya/lenya/pubs/ live.server.com:/usr/local/tomcat/webapps/lenya/lenya/pubs/
I'm assuming above that your Lenya installation is done with Tomcat and that Tomcat is installed in /usr/local/tomcat/. I also made up the server name live.server.com, so replace it with the name of your live server. The -a switch copies everything recursively while preserving permissions and timestamps, and -z compresses the data in transit. This setup will also show you what happens as it is working (-v switch) and delete files on the live server that are no longer on the authoring server (--delete switch). A bit of warning: please do back up your publications before doing this, should something go wrong and they get deleted. I can't be responsible should something catastrophic happen. :)
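If you'd like a safety net before the first real sync, something along these lines would do: one helper to tar up the pubs folder and one to preview what rsync would change. The function names and the /var/backups location are placeholders of mine; the rsync -n switch (same as --dry-run) only lists what would be copied or deleted without touching anything.

```shell
#!/bin/sh
# Assumed paths: the Tomcat layout used in this article, plus a backup folder.
PUBS_DIR="${PUBS_DIR:-/usr/local/tomcat/webapps/lenya/lenya/pubs}"
BACKUP_DIR="${BACKUP_DIR:-/var/backups}"

backup_pubs() {
    # Archive the whole pubs directory under a timestamped name
    # and print the archive path on success.
    archive="$BACKUP_DIR/pubs-$(date +%Y%m%d-%H%M%S).tar.gz"
    tar -czf "$archive" -C "$(dirname "$PUBS_DIR")" "$(basename "$PUBS_DIR")" \
        && echo "$archive"
}

preview_sync() {
    # Dry run against the live host given as the first argument.
    /usr/bin/rsync -avzn -e ssh --delete "$PUBS_DIR/" "$1:$PUBS_DIR/"
}
```

Calling backup_pubs and then preview_sync live.server.com gives you an archive to fall back on and a list of exactly what the real run would do.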
Satisfied? If so, add this to your crontab file to run every five minutes and you're off! Test it out by publishing your changes on the authoring server and see how they take effect on the live server within a few minutes.
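For reference, a crontab entry along these lines should do it (edit the file with crontab -e as the user that owns the pubs folder). This is a sketch using the same paths and made-up host name as above. I've dropped the -v switch here on the assumption that nobody is watching the output; that way cron only mails you when rsync actually complains about something.

```
# min  hour  day  month  weekday  command
*/5    *     *    *      *        /usr/bin/rsync -az -e ssh --delete /usr/local/tomcat/webapps/lenya/lenya/pubs/ live.server.com:/usr/local/tomcat/webapps/lenya/lenya/pubs/
```

The */5 step syntax is understood by the Vixie cron that ships with Red Hat; on a cron without it, you'd list the minutes out as 0,5,10 and so on.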
That's it! You're off to the races with two servers running, one specifically for authoring, and the other for live. Now you can tune the live server for caching and the works. (Hopefully I can put together an article on this too.)
Oh but there's more
But what if this isn't enough? How so? Well, what if you are active in developing new items within Lenya or within your publication and you want to test those in a pre-production state before going live, and you don't want to mess around with your production site's downtime? We've got a solution for that too, and the next article will focus on its details. Stay tuned.