Litestream: Continuous SQLite Replication Guide
Litestream brings continuous, real‑time replication to SQLite, turning a lightweight file‑based database into a resilient, cloud‑ready datastore. In this guide we’ll walk through installing Litestream, configuring it for production, and building a few hands‑on examples that demonstrate live replication, point‑in‑time restores, and multi‑region failover. By the end, you’ll be able to add fault tolerance to any SQLite‑backed app without rewriting a single line of SQL.
Why Replicate SQLite?
SQLite shines because it’s serverless, zero‑config, and incredibly fast for read‑heavy workloads. However, its single‑file nature means a disk failure or accidental delete can wipe out your data. Replication solves that problem by keeping a copy of the database in a remote object store (AWS S3, GCS, Azure Blob, etc.) and streaming changes as they happen.
Traditional primary/replica setups require a dedicated database server, complex networking, and significant operational overhead. Litestream sidesteps all of that: it runs as a lightweight sidecar process, watches the SQLite file for changes, and uploads WAL (Write‑Ahead Log) segments to the cloud. The result is an always‑on backup that can be restored in seconds.
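Litestream's design hinges on SQLite's WAL mode: committed transactions land in a separate -wal side file before being checkpointed into the main database, and those WAL frames are exactly what Litestream ships to the replica. A minimal sketch of what that looks like from Python, using only the standard library (the file names here are illustrative):

```python
import os
import sqlite3

# Open a database and switch it to WAL journaling
# (Litestream requires WAL mode and enables it on databases it monitors)
con = sqlite3.connect("demo.db")
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # prints "wal"

# Committed writes are appended to demo.db-wal before being
# checkpointed into demo.db; these frames are what gets replicated
con.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
con.execute("INSERT INTO t VALUES (1)")
con.commit()

wal_exists = os.path.exists("demo.db-wal")
print(wal_exists)  # prints "True" while the connection is open
con.close()
```

Because replication happens at the WAL-frame level, the application never has to know Litestream is running.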
Getting Started: Installation
Litestream is distributed as a single binary for Linux, macOS, and Windows. Choose the method that matches your environment.
Linux/macOS (Homebrew)
brew install litestream
Linux (apt)
curl -L https://github.com/benbjohnson/litestream/releases/download/v0.4.0/litestream_0.4.0_linux_amd64.tar.gz -o litestream.tar.gz
tar -xzf litestream.tar.gz
sudo mv litestream /usr/local/bin/
Windows (PowerShell)
Invoke-WebRequest -Uri "https://github.com/benbjohnson/litestream/releases/download/v0.4.0/litestream_0.4.0_windows_amd64.zip" -OutFile "litestream.zip"
Expand-Archive litestream.zip -DestinationPath $Env:ProgramFiles\Litestream
# Add to PATH
$Env:Path += ";$Env:ProgramFiles\Litestream"
After installation, verify the version:
litestream version
Pro tip: Pin the binary version in your CI/CD pipeline to avoid accidental upgrades that could change default behavior.
Basic Configuration
Litestream reads a YAML configuration file (by default /etc/litestream.yml; in this guide we keep it at ~/.litestream.yml and pass it explicitly with -config) that defines replication destinations, retention policies, and which SQLite databases to watch. Below is a minimal config that replicates app.db to an S3 bucket.
dbs:
  - path: /var/www/app/data/app.db
    replicas:
      - url: s3://my-litestream-backups/app.db
        # Optional: keep the last 30 days of WAL files
        retention: 30d
Key fields:
- path: absolute path to the SQLite file.
- url: destination URL; Litestream supports s3://, gs://, azure://, and local filesystem paths.
- retention: how long to keep old WAL segments; set to 0 for indefinite.
Save the file, then confirm Litestream can parse it by listing the configured databases:
litestream databases -config ~/.litestream.yml
Running Litestream as a Service
In production you’ll want Litestream to start automatically and stay alive alongside your application. The easiest way on Linux is to use a systemd unit file.
[Unit]
Description=Litestream replication for SQLite
After=network-online.target
[Service]
ExecStart=/usr/local/bin/litestream replicate -config /home/ubuntu/.litestream.yml
Restart=always
User=ubuntu
Group=ubuntu
Environment="AWS_ACCESS_KEY_ID=YOUR_KEY"
Environment="AWS_SECRET_ACCESS_KEY=YOUR_SECRET"
Environment="AWS_DEFAULT_REGION=us-east-1"
[Install]
WantedBy=multi-user.target
Enable and start the service:
sudo systemctl enable litestream
sudo systemctl start litestream
sudo systemctl status litestream
Pro tip: Use IAM roles (EC2 instance profiles or ECS task roles) instead of embedding credentials in the unit file. This eliminates secret leakage and simplifies rotation.
Live Replication in Action
Let’s build a tiny Flask app that stores notes in SQLite. We'll see how Litestream replicates changes without any extra code.
Step 1: Project Setup
mkdir notes-app
cd notes-app
python -m venv venv
source venv/bin/activate
pip install flask sqlalchemy
Step 2: Define the Model
# models.py
from sqlalchemy import create_engine, Column, Integer, Text
from sqlalchemy.orm import declarative_base, sessionmaker

engine = create_engine('sqlite:///notes.db', echo=False, future=True)
Base = declarative_base()

class Note(Base):
    __tablename__ = 'notes'
    id = Column(Integer, primary_key=True)
    content = Column(Text, nullable=False)

Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
Step 3: Simple Flask Routes
# app.py
from flask import Flask, request, jsonify
from models import Session, Note

app = Flask(__name__)

@app.route('/notes', methods=['POST'])
def create_note():
    data = request.get_json()
    session = Session()
    note = Note(content=data['content'])
    session.add(note)
    session.commit()
    return jsonify({'id': note.id, 'content': note.content}), 201

@app.route('/notes', methods=['GET'])
def list_notes():
    session = Session()
    notes = session.query(Note).all()
    return jsonify([{'id': n.id, 'content': n.content} for n in notes])
Run the app:
FLASK_APP=app.py flask run
Now every POST request writes a new row to notes.db. If Litestream is configured to watch notes.db, it uploads the generated WAL segments to your S3 bucket within its sync interval (one second by default). No changes to the Flask code are required.
Point‑in‑Time Restores
One of Litestream’s most powerful features is the ability to restore the database to any moment within the retention window. This is invaluable for recovering from accidental deletes or data corruption.
Performing a Restore
- Stop the application (or lock the database) to avoid new writes.
- Identify the timestamp you want to revert to. Litestream stores WAL files with timestamps in their names.
- Run the litestream restore command, specifying the target time.
Example: restore to 2024‑12‑01 03:15 UTC.
litestream restore -config ~/.litestream.yml \
  -timestamp "2024-12-01T03:15:00Z" \
  /var/www/app/data/app.db
Litestream will download the base snapshot and all WAL segments up to the given timestamp, replay them, and produce a fresh app.db file.
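After any restore, it is worth confirming the recovered file is internally consistent before pointing the application at it. A small sketch using Python's built-in sqlite3 module (the file name here is illustrative):

```python
import sqlite3

def verify(path: str) -> bool:
    # PRAGMA integrity_check returns a single row "ok" on a healthy database
    con = sqlite3.connect(path)
    try:
        result = con.execute("PRAGMA integrity_check").fetchone()[0]
    finally:
        con.close()
    return result == "ok"

# Demonstration against a freshly created database
con = sqlite3.connect("restored.db")
con.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, content TEXT)")
con.commit()
con.close()
print(verify("restored.db"))  # prints "True"
```

Running the same check against the restored app.db before restarting the service catches a corrupted or truncated download early.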
Pro tip: Configure periodic snapshots by setting snapshot-interval: 24h on the replica in your config file. Snapshots dramatically speed up restores because Litestream can start from a recent full copy instead of replaying every WAL segment.
Multi‑Region Failover
For truly global applications you might replicate to multiple cloud providers. Litestream’s configuration can list several replicas for the same database, each pointing to a different region or provider.
dbs:
  - path: /var/www/app/data/app.db
    replicas:
      - url: s3://us-east-1-backups/app.db
        region: us-east-1
      - url: s3://eu-west-1-backups/app.db
        region: eu-west-1
      - url: gs://asia-east1-backups/app.db
        region: asia-east1
If the primary region suffers an outage, you can spin up a new instance in another region and restore the most recent replica.
Failover Script Example
#!/usr/bin/env python3
import datetime
import os
import subprocess
import sys

DB_PATH = "/var/www/app/data/app.db"
CONFIG = os.path.expanduser("~/.litestream.yml")
TARGET_REGION = os.getenv("TARGET_REGION", "eu-west-1")

def latest_snapshot():
    # List objects in the chosen bucket and pick the newest snapshot
    cmd = [
        "aws", "s3", "ls",
        f"s3://{TARGET_REGION}-backups/app.db/",
        "--recursive",
    ]
    out = subprocess.check_output(cmd).decode()
    snapshots = [line for line in out.splitlines() if line.endswith(".snap")]
    if not snapshots:
        sys.exit("No snapshots found")
    # Each listing line is "DATE TIME SIZE KEY"; sort by date and time
    latest = max(snapshots, key=lambda l: tuple(l.split()[:2]))
    return latest.split()[-1]  # object key

def restore():
    snapshot = latest_snapshot()
    url = f"s3://{TARGET_REGION}-backups/app.db/{snapshot}"
    cmd = [
        "litestream", "restore",
        "-config", CONFIG,
        "-url", url,
        DB_PATH,
    ]
    subprocess.check_call(cmd)

if __name__ == "__main__":
    restore()
    print(f"Restored {DB_PATH} from {TARGET_REGION} at {datetime.datetime.utcnow()} UTC")
Deploy this script as part of your disaster‑recovery runbook. When a region goes down, run the script on a fresh instance, and your app will be back online within minutes.
Monitoring & Alerting
Litestream emits structured JSON logs to stdout, which makes it easy to ship to log aggregators (Datadog, Splunk, Loki). You can also enable the built‑in metrics endpoint for Prometheus.
Enable JSON Logging
litestream replicate -config ~/.litestream.yml -log-format json
Typical log entry for a successful WAL upload:
{
"time":"2024-03-28T12:45:01Z",
"level":"info",
"msg":"replicated WAL segment",
"db":"/var/www/app/data/app.db",
"replica":"s3://my-litestream-backups/app.db",
"segment":"20240328-1245-01.wal",
"size_bytes":65536,
"duration_ms":124
}
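Because each log line is a self-contained JSON object, filtering and alerting on them is straightforward. A sketch that counts error-level entries in a captured log stream (the field names follow the sample entry above; the error message text is hypothetical):

```python
import json

# Two captured log lines: one success, one hypothetical failure
sample_logs = """\
{"time":"2024-03-28T12:45:01Z","level":"info","msg":"replicated WAL segment","db":"/var/www/app/data/app.db"}
{"time":"2024-03-28T12:45:09Z","level":"error","msg":"replica upload failed","db":"/var/www/app/data/app.db"}"""

def count_errors(stream: str) -> int:
    # Parse each line as JSON and count entries at error level
    errors = 0
    for line in stream.splitlines():
        entry = json.loads(line)
        if entry.get("level") == "error":
            errors += 1
    return errors

print(count_errors(sample_logs))  # prints 1
```

The same filter expressed in your log aggregator's query language makes a simple, reliable alert condition.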
Prometheus Metrics
Start Litestream with the -metrics-addr flag to expose a /metrics endpoint.
litestream replicate -config ~/.litestream.yml -metrics-addr :9090
Key metrics include:
- litestream_wal_segments_total – total number of WAL segments uploaded.
- litestream_replication_errors_total – cumulative replication failures.
- litestream_snapshot_duration_seconds – time taken to create a snapshot.
Pro tip: Set an alert on litestream_replication_errors_total to fire if the counter increments within a 5‑minute window. A single error often indicates network or permission problems that need immediate attention.
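One way to express that alert as a Prometheus rule (the metric name comes from the list above; the group and alert names are placeholders):

```yaml
groups:
  - name: litestream
    rules:
      - alert: LitestreamReplicationErrors
        # Fire if the error counter increments within a 5-minute window
        expr: increase(litestream_replication_errors_total[5m]) > 0
        labels:
          severity: critical
        annotations:
          summary: "Litestream replication errors on {{ $labels.instance }}"
```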
Real‑World Use Cases
Edge Devices & IoT – Many edge applications store telemetry in SQLite because of its low footprint. Litestream can push each device’s WAL to a central S3 bucket, allowing you to aggregate data across thousands of devices without running a full database server on the edge.
Serverless Functions – AWS Lambda or Cloudflare Workers can use SQLite for temporary caching. By configuring Litestream inside the function’s init phase, every write is instantly backed up, making the function’s state durable across invocations.
Monolithic Legacy Apps – Companies with decades‑old desktop software often rely on SQLite files stored on shared drives. Adding Litestream as a side‑car on the file server provides continuous off‑site backups without altering the legacy codebase.
Advanced Configuration Patterns
Selective Replication with Include/Exclude Filters
Sometimes you only want to replicate a subset of tables (e.g., exclude large binary blobs). While Litestream replicates at the file level, you can combine it with SQLite’s PRAGMA wal_checkpoint(TRUNCATE) and VACUUM INTO to create a trimmed copy for replication.
import os
import sqlite3

DB_PATH = "app.db"
TMP_PATH = "replica.db"

def create_replica():
    con = sqlite3.connect(DB_PATH)
    # Flush and truncate the WAL so the copy reflects all committed writes
    con.execute("PRAGMA wal_checkpoint(TRUNCATE)")
    # VACUUM INTO writes a compact copy of the whole database; drop or
    # clear blob-heavy tables in the copy afterwards if desired
    if os.path.exists(TMP_PATH):
        os.remove(TMP_PATH)  # VACUUM INTO requires a non-existent target
    con.execute("VACUUM INTO ?", (TMP_PATH,))
    con.close()
    # Point Litestream at replica.db; never overwrite the live app.db

# Run before each nightly replication
if __name__ == "__main__":
    create_replica()
Encrypting Replicas at Rest
Litestream itself does not encrypt data, but you can enable server‑side encryption on your storage bucket. For S3, set the bucket’s default encryption to AES‑256 or AWS‑KMS. For GCS, enable CMEK (Customer‑Managed Encryption Keys).
Pro tip: Combine bucket policies with IAM conditions that require TLS (HTTPS) for all upload requests. This prevents accidental plaintext transmission.
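A bucket policy statement implementing that TLS requirement looks like this (the bucket name is the one from the earlier config; adapt to your own):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyInsecureTransport",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::my-litestream-backups",
        "arn:aws:s3:::my-litestream-backups/*"
      ],
      "Condition": {
        "Bool": { "aws:SecureTransport": "false" }
      }
    }
  ]
}
```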
Testing Replication Locally
Before you push changes to production, spin up a local MinIO container to simulate S3. This lets you verify the entire replication pipeline without incurring cloud costs.
docker run -d -p 9000:9000 \
-e MINIO_ROOT_USER=admin \
-e MINIO_ROOT_PASSWORD=secret123 \
minio/minio server /data
# Add a bucket
mc alias set local http://localhost:9000 admin secret123
mc mb local/litestream-test
Update your ~/.litestream.yml to point to the local endpoint:
dbs:
  - path: ./test.db
    replicas:
      - url: s3://litestream-test/test.db
        endpoint: http://localhost:9000
        access-key-id: admin
        secret-access-key: secret123
Run Litestream and make a few writes to test.db. Then inspect the bucket with mc ls local/litestream-test to confirm WAL files appear.
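To generate replication traffic for this local test, a few committed writes are enough; each commit below appends frames to test.db-wal that Litestream can ship to the MinIO bucket:

```python
import sqlite3

con = sqlite3.connect("test.db")
con.execute("PRAGMA journal_mode=WAL")  # Litestream requires WAL mode
con.execute(
    "CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, body TEXT)"
)

# Commit in small batches so each one produces replicable WAL frames
for i in range(5):
    con.execute("INSERT INTO events (body) VALUES (?)", (f"event-{i}",))
    con.commit()

count = con.execute("SELECT count(*) FROM events").fetchone()[0]
print(count)  # prints 5 on a fresh database
con.close()
```

After restoring from the MinIO replica into a second file, comparing row counts between the two databases is a quick end-to-end sanity check.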
Security Considerations
- Least‑Privilege IAM: Grant the Litestream IAM user only s3:PutObject and s3:GetObject for the specific bucket.
- Network Isolation: Run Litestream in a private subnet with outbound internet access via a NAT gateway. This prevents accidental exposure of your database files.
- File Permissions: Ensure the SQLite file is owned by the Litestream process user (e.g., chmod 600 app.db) to avoid other processes reading raw data.
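A minimal IAM policy implementing the first point might look like this (bucket name from the earlier examples; you may also need s3:ListBucket and s3:DeleteObject if you rely on retention enforcement):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "LitestreamReplicaAccess",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject"
      ],
      "Resource": "arn:aws:s3:::my-litestream-backups/*"
    }
  ]
}
```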
Pro tip: Enable S3 Object Lock in compliance mode with a 30‑day retention. This makes replicated segments immutable for that window, so even a compromised credential cannot delete or overwrite your backups.