Saturday, 23 November 2013

Backups? What backups? (fit the second)

In my last post I chatted about the sort of backup approach that I use for my own computers, but what about website backups?

We are responsible for managing dozens of client websites, on a number of different servers. All our server and hosting suppliers provide backups (obviously), but those are mainly there to cope with server failures. Most of our sites are developed using content management tools, which means that the site owner could be adding or editing content every day.

So how do we guard against:

1) Client wants to revert to an older version of a page
2) Content is corrupted and not noticed for a few days
3) Hosting provider suddenly shuts down (it's not happened so far, fingers crossed)

We've set up a number of systems to protect against these situations.

The first is fairly straightforward: our CMS stores all previous versions of a page in the database and a client can view all of these and revert to whatever version they want.

The others are slightly more complicated.

We originally set up a system that backed up the latest version of our production sites to storage in our office. The procedure was simple: copies of the databases were downloaded each night and kept for a couple of weeks, and a copy of the current state of the static files (pictures, PDFs etc) was also kept. The main drawback was doing this every night over a broadband line - some of those databases were quite large, and were downloaded every day even if nothing had changed. So how could we improve things?

Initially we set up a system using an 'unlimited' (hah!) hosting package which did something similar, but to a hosted server, so that we didn't have to worry about bandwidth and storage. It then turned out that the unlimited package we had bought wasn't quite as unlimited as all that. So on to plan C.

Plan C is our current version and now makes use of 'the Cloud'. We are using an Amazon EC2 server to run the backup processes, which now do a daily backup of all the static files and databases on all our production servers and then store them in Amazon S3 storage. The costs are pleasingly low. We also make use of the Amazon 'Glacier' storage for older backups. This way we have a complete snapshot of all our site data for every day of the last two weeks, which is immediately available (so that we can restore individual files or database records), and we have further daily backups for three months which can be recovered in a few hours. (Obviously all the backup files are password protected and aren't directly web accessible.)
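For anyone curious how a two-tier retention rule like this might look, here is a minimal Python sketch. It is purely illustrative and not the actual backup code: the function name, the tier labels and the reading of "three months" as 90 days of total age are all assumptions.

```python
from datetime import date

# Retention windows. The 14 days come from "the last two weeks" above;
# treating "three months" as 90 days of total age is an assumption.
IMMEDIATE_DAYS = 14
ARCHIVE_DAYS = 90


def storage_tier(backup_date: date, today: date) -> str:
    """Return a (hypothetical) tier label for a daily backup, based on its age."""
    age = (today - backup_date).days
    if age < 0:
        raise ValueError("backup date is in the future")
    if age <= IMMEDIATE_DAYS:
        return "s3"        # immediately available
    if age <= ARCHIVE_DAYS:
        return "glacier"   # recoverable in a few hours
    return "expired"       # old enough to delete


if __name__ == "__main__":
    today = date(2013, 11, 23)
    for d in (date(2013, 11, 20), date(2013, 10, 1), date(2013, 6, 1)):
        print(d, storage_tier(d, today))
```

In practice a rule like this can often be left to the storage service's own lifecycle settings rather than hand-written code, which is one less thing to maintain.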

We've also developed a backup management system that warns us if a backup is overdue for some reason.
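The core of an overdue check can be quite small. The sketch below is a hypothetical illustration in Python, not the management system itself: the site names, the `last_success` mapping and the 26-hour threshold (a day plus a little slack for a nightly job) are all assumptions.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Assumed threshold: a nightly backup is overdue if the last successful
# run finished more than 26 hours ago.
MAX_AGE = timedelta(hours=26)


def overdue_backups(
    last_success: Dict[str, datetime],
    now: datetime,
    max_age: timedelta = MAX_AGE,
) -> List[str]:
    """Return the names of sites whose last successful backup is too old."""
    return sorted(
        site for site, finished in last_success.items()
        if now - finished > max_age
    )


if __name__ == "__main__":
    now = datetime(2013, 11, 23, 9, 0)
    # Hypothetical record of when each site's backup last completed.
    last_success = {
        "site-a": datetime(2013, 11, 23, 2, 0),
        "site-b": datetime(2013, 11, 21, 2, 0),
    }
    for site in overdue_backups(last_success, now):
        print("WARNING: backup overdue for", site)
```

The important design point is that the check looks at the last successful backup rather than the last attempt, so a job that runs every night but fails every night still raises a warning.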

So now we can sleep easy in our beds!

The whole process of developing the backup strategy to the position we're now in has taken many weeks of development time. Apart from the obvious advantage of having a reliable backup system, it's also served as a useful opportunity to experiment with the Amazon Cloud services - which are pretty impressive.

But there are times when I wonder if we're very good at "business". We've done all this work to ensure our customers' data is safe, but do we charge them an arm and a leg for the extra security? Do we heck! All part of the Technoleg Taliesin service.