We want to regularly take backup of running Web applications' SQLite databases (you can't just copy the file... what if the app modifies the database while copying is taking place?)
A cron job like
sqlite3 $DB .dump >$DUMP
just do the job, at least for small databases.
However, when the database grows upto some 100's of megabytes, the dump process takes more than 1 minute, while holding lock, resulting in timeout error in application.
If you are on LVM you can use snapshot to do a lock-free backup. Unfortunately we are on normal partition so locking database is inevitable.
After some research and experiments, I decided to use
rdiff to make the locking period as short as possible.
rdiff is a tool to make and apply deltas of binary files. Instead of just comparing two files on the fly, rdiff works in two steps:
- compute signature of old file
- compute delta from signatgure and new file
To use this to backup an SQLite database,
- compute signature of previous backup
- lock the database
- compute delta from the signature and database
- unlock database
- applying delta to the previous backup makes a new backup
As you can see, reading the database is the only massive I/O which happens while holding the lock. Signature and delta are small. Thus, this method can be faster than others, provided I/O is the bottleneck.
In our case, rdiff delta only takes 4-5 seconds while cp'ing the whole database takes 25 seconds.
I wrote a Python script, sqlite-lock (attached at the end), to lock the SQLite database, and taking backup like below:
rdiff signature $LATEST $SIG
sqlite-lock rdiff delta $SIG $DB $DELTA
rdiff patch $DELTA $LATEST $NEW
ln -f $NEW $LATEST
Actually the database is at a datacenter and backup is on a local server, so rdiff delta is executed on remote side. Network traffic is minimum because only signature and delta are transmitted. As I understand this is basically what rsync does.
sqlite-lock:
#!/usr/bin/python
import sys, os
try:
import sqlite3
except ImportError:
from pysqlite2 import dbapi2 as sqlite3
if len(sys.argv) < 3:
print >>sys.stderr, "Usage: %s DB CMDLINE..." % sys.argv[0]
print >>sys.stderr, "Executes CMDLINE while locking DB"
sys.exit(1)
src = sys.argv[1]
cmdline = sys.argv[2:]
db = sqlite3.connect(src, timeout=60.0, isolation_level=None)
db.execute("BEGIN DEFERRED")
db.execute("SELECT COUNT(*) FROM sqlite_master") # This creates SHARED lock
retval = os.spawnvp(os.P_WAIT, cmdline[0], cmdline)
db.execute("COMMIT")
if retval != 0:
print >>sys.stderr, "Command returned %d" % retval
sys.exit(retval)