Thursday, June 4, 2009

Backup SQLite database with rdiff

We want to regularly take backup of running Web applications' SQLite databases (you can't just copy the file... what if the app modifies the database while copying is taking place?)

A cron job like
sqlite3 $DB .dump >$DUMP
just do the job, at least for small databases.
However, when the database grows upto some 100's of megabytes, the dump process takes more than 1 minute, while holding lock, resulting in timeout error in application.

If you are on LVM you can use snapshot to do a lock-free backup. Unfortunately we are on normal partition so locking database is inevitable.

After some research and experiments, I decided to use rdiff to make the locking period as short as possible.

rdiff is a tool to make and apply deltas of binary files. Instead of just comparing two files on the fly, rdiff works in two steps:
  1. compute signature of old file
  2. compute delta from signatgure and new file
To use this to backup an SQLite database,
  1. compute signature of previous backup
  2. lock the database
  3. compute delta from the signature and database
  4. unlock database
  5. applying delta to the previous backup makes a new backup
As you can see, reading the database is the only massive I/O which happens while holding the lock. Signature and delta are small. Thus, this method can be faster than others, provided I/O is the bottleneck.

In our case, rdiff delta only takes 4-5 seconds while cp'ing the whole database takes 25 seconds.

I wrote a Python script, sqlite-lock (attached at the end), to lock the SQLite database, and taking backup like below:
rdiff signature $LATEST $SIG
sqlite-lock rdiff delta $SIG $DB $DELTA
rdiff patch $DELTA $LATEST $NEW
ln -f $NEW $LATEST
Actually the database is at a datacenter and backup is on a local server, so rdiff delta is executed on remote side. Network traffic is minimum because only signature and delta are transmitted. As I understand this is basically what rsync does.

import sys, os
import sqlite3
except ImportError:
from pysqlite2 import dbapi2 as sqlite3

if len(sys.argv) < 3:
print >>sys.stderr, "Usage: %s DB CMDLINE..." % sys.argv[0]
print >>sys.stderr, "Executes CMDLINE while locking DB"

src = sys.argv[1]
cmdline = sys.argv[2:]

db = sqlite3.connect(src, timeout=60.0, isolation_level=None)
db.execute("BEGIN DEFERRED")
db.execute("SELECT COUNT(*) FROM sqlite_master") # This creates SHARED lock
retval = os.spawnvp(os.P_WAIT, cmdline[0], cmdline)
if retval != 0:
print >>sys.stderr, "Command returned %d" % retval

No comments: