10/04/2010

Improve Availability and Performance Using Database Replication

Database replication is the capability of a database to automatically maintain a copy of itself at a separate location. That other location could be another server in the rack, or a server on the other side of the globe.

Regardless of the location of the replicated database, most databases attempt to replicate the data in real time, or close to it. (Real time means at the same time.) Since you can count on the replicated data to be the same or almost the same as the "real" data at any given time, you can leverage your "hot spare" in a variety of ways. For this discussion, the master server is the database containing the primary copy of the data, and the slave server is the database containing the replicated copy.

How does it work?

Some databases provide replication functionality and some do not. The very popular MySQL database does a very nice job in its implementation of replication. It's easy to configure and it works well. It uses a binary update log, which means SQL statements which update data in the database are stored in a log on the master server. The slave server keeps its own pointer, which represents its current position in the binary log of the master. If it's not up to date, it reads the master's binary log for the next SQL statement, runs the statement on its own data, and updates its pointer. This design makes it easy to have multiple slaves since each slave maintains its own pointer to the master's binary update log.

Why do we need replication?

Database replication can be used to improve system availability and performance. Depending on your application, you may employ replication to address either or both of these issues. Here's how:

Using replication to improve system availability.

Replication enhances availability in two ways. Having an exact (or nearly exact) copy of your data in another location provides the obvious benefit of allowing you to switch to a backup system quickly since you don't have to wait for a restore from tape, which could potentially take hours. Even if you do the restore, the tape won't include data that was added to the system since the last backup.

Less obviously, replication provides a great way to do system backups to tape without having to bring the system down. Just leave the master database running, temporarily stop replication, shut down the slave database and make a clean backup of your data. Re-start the slave and replication, and the slave will "catch up" to the master in short order - your users will greatly appreciate the increased uptime of your system.

And think about this: You will be more likely to make good backups often, since doing so won't impact the availability of the system to your users. If you use UNIX or LINUX, you might consider setting up a CRON job to stop your slave(s), backup the data, and re-start the slaves - this gives you a hands-off, regular backup system, with no system downtime - what a deal!

You may be asking, "Why do I still need to do backups if I have replication in place?" A lot of reasons, but one quick example should put that issue to rest: If you mean to enter the SQL statement "drop table tempLog;", but instead type "drop table allMyCustomerDataSinceTheDawnOfTime;" by mistake, your eager-to-please database will happily replicate the command on the slave, and you can say bye-bye to all that critical data. Unless you're using transactions and can "roll it back", your only recourse will most likely be your last tape backup.

For added Web site availability, use replication to implement an automatic failover system. Set up your HTTP server on its own server and have it direct queries to the master database server. If the master stops responding, the HTTP server can start sending queries to the slave instead.

Using replication to improve system performance.

If your application is query intensive, it may be possible to dramatically improve performance by using replication to implement rudimentary load-balancing: just send some queries to another server. Set up your Web site or other application so that all queries that change data (INSERT, UPDATE, DELETE) execute on the master server, and all read-only queries (SELECT) run on the slave server. If necessary, set up multiple slave servers all replicating from the same master. Design your application to distribute SELECT queries across all available slaves in some type of rotating or random fashion. Once this basic infrastructure is in place, you can simply add more slaves as query volume increases.