pgBackRest setup — Backblaze B2 archive + PITR

engineering-docs-operations-pgbackrest-setup · in engineering/docs/operations · org-wide · updated 2026-06-01 10:19

Frontmatter

lang: en
imported_at: 2026-06-01T10:19:43.334Z
source_path: productgalaxy/docs/operations/pgbackrest-setup.md
source_repo: productgalaxy

The pgBackRest sidecar runs alongside Postgres on the same host. It takes a full backup nightly and a differential backup weekly (Sundays), and archives WAL continuously. Retention is 30 full backups, which is about 30 days at one full per night. Point-in-time restore is tested quarterly against a scratch DB.

0. Backblaze B2 setup (one-time, ~10 min)

  1. Backblaze B2 console → create a private bucket galaxy-pgbackrest-prod
  2. Buckets → bucket → Lifecycle settings → "Keep only the last 30 days of versions"
  3. App Keys → create an app key restricted to that bucket; scope = Read and Write
  4. Note: keyID + applicationKey + endpoint (e.g. s3.us-west-002.backblazeb2.com)

Add to the VPS's /etc/galaxy/.env:

PGBACKREST_S3_BUCKET=galaxy-pgbackrest-prod
PGBACKREST_S3_ENDPOINT=s3.us-west-002.backblazeb2.com
PGBACKREST_S3_REGION=us-west-002
PGBACKREST_S3_KEY=<keyID>
PGBACKREST_S3_KEY_SECRET=<applicationKey>
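A missing or empty value here only surfaces later as an S3 error from the sidecar, so it can help to check the file before starting anything. The following is a minimal sketch, assuming the env file uses plain KEY=value lines; check_pgbackrest_env is a hypothetical helper name, not part of pgBackRest.

```shell
#!/bin/sh
# Sketch: report which pgBackRest S3 settings are missing or empty in an env file.
# check_pgbackrest_env is a hypothetical helper, not a pgBackRest command.
check_pgbackrest_env() {
    env_file="$1"
    missing=""
    for key in PGBACKREST_S3_BUCKET PGBACKREST_S3_ENDPOINT PGBACKREST_S3_REGION \
               PGBACKREST_S3_KEY PGBACKREST_S3_KEY_SECRET; do
        # A key counts as set only if its line is KEY=<at least one character>.
        if ! grep -Eq "^${key}=.+" "$env_file"; then
            missing="$missing $key"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

Usage: check_pgbackrest_env /etc/galaxy/.env prints ok, or a missing: list and a non-zero exit status. It does not catch placeholders such as <keyID> that were left in place.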

1. Enable WAL archiving on Postgres

docker/postgres/postgresql-prod.conf:

# pgBackRest archive command — runs on every WAL segment switch.
archive_mode = on
archive_command = 'pgbackrest --stanza=galaxy archive-push %p'
archive_timeout = 60       # force a switch every 60s even on idle (small WALs ≠ bad)

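# WAL/replication settings. max_wal_senders only matters for streaming
# replication clients; it has no effect on pgBackRest's parallelism.
max_wal_senders = 5
wal_level = replica        # minimum level for archive_mode = on; required by pgBackRest backups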
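To put archive_timeout = 60 in perspective, the arithmetic below gives an upper bound on the number of WAL segments pushed per day. It is a sketch that assumes the Postgres default segment size of 16 MiB; the helper names are hypothetical. Real volume is usually lower, because Postgres skips the forced switch when nothing was written and pgBackRest compresses segments before upload.

```shell
#!/bin/sh
# Upper bound on WAL segments per day when a switch is forced every
# archive_timeout seconds, plus their uncompressed size.
# wal_segments_per_day and wal_mib_per_day are hypothetical helper names.
wal_segments_per_day() {
    archive_timeout="$1"
    echo $(( 86400 / archive_timeout ))
}

wal_mib_per_day() {
    archive_timeout="$1"
    segment_mib="${2:-16}"    # assumption: default 16 MiB WAL segment size
    echo $(( $(wal_segments_per_day "$archive_timeout") * segment_mib ))
}

wal_segments_per_day 60   # prints 1440
wal_mib_per_day 60        # prints 23040 (22.5 GiB before compression)
```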

Mount this file in docker-compose.yml:

postgres:
  command: postgres -c config_file=/etc/postgresql/postgresql.conf
  volumes:
    - ./docker/postgres/postgresql-prod.conf:/etc/postgresql/postgresql.conf:ro
    - /data/postgres:/var/lib/postgresql/data

Restart Postgres: docker compose restart postgres. Verify: docker exec galaxy_postgres psql -U galaxy -c "SHOW archive_mode" should return on.

2. pgBackRest sidecar in docker-compose.yml

pgbackrest:
  image: pgbackrest/pgbackrest:latest
  container_name: galaxy_pgbackrest
  restart: unless-stopped
  depends_on:
    postgres:
      condition: service_healthy
  environment:
    PGBACKREST_REPO1_S3_BUCKET: ${PGBACKREST_S3_BUCKET}
    PGBACKREST_REPO1_S3_ENDPOINT: ${PGBACKREST_S3_ENDPOINT}
    PGBACKREST_REPO1_S3_REGION: ${PGBACKREST_S3_REGION}
    PGBACKREST_REPO1_S3_KEY: ${PGBACKREST_S3_KEY}
    PGBACKREST_REPO1_S3_KEY_SECRET: ${PGBACKREST_S3_KEY_SECRET}
    PGBACKREST_REPO1_TYPE: s3
    PGBACKREST_REPO1_PATH: /
    PGBACKREST_REPO1_RETENTION_FULL: 30
    PGBACKREST_REPO1_RETENTION_DIFF: 4
    PGBACKREST_STANZA: galaxy
    PGBACKREST_PROCESS_MAX: 4
    PGBACKREST_LOG_LEVEL_CONSOLE: info
    PGBACKREST_PG1_HOST: postgres
    PGBACKREST_PG1_PORT: 5432
    PGBACKREST_PG1_USER: galaxy
    PGBACKREST_PG1_DATABASE: galaxy
    PGBACKREST_PG1_PATH: /var/lib/postgresql/data
  volumes:
    - /data/pgbackrest:/var/lib/pgbackrest
    - /data/postgres:/var/lib/postgresql/data:ro

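Note: pgBackRest needs read access to the Postgres data directory for stanza operations and backups; that is the :ro mount. A restore writes to its target directory, so a restore target must be mounted read-write. The :ro flag also protects the live data directory from an accidental restore.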

3. Bootstrap the stanza (one-time, ~3 min)

docker compose up -d pgbackrest
docker compose exec pgbackrest pgbackrest --stanza=galaxy stanza-create
docker compose exec pgbackrest pgbackrest --stanza=galaxy check

check must print INFO: switch wal not performed because no primary then INFO: check command end: completed successfully. If it fails, see Troubleshooting below.

4. First full backup

docker compose exec pgbackrest pgbackrest --stanza=galaxy --type=full backup
# wait ~5-10 min for 1 GB DB; bigger DB scales linearly
docker compose exec pgbackrest pgbackrest --stanza=galaxy info

Output should show full backup: <timestamp>F with status ok.
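The same check can be scripted so that a timer or monitor can run it. The sketch below assumes the default text output of pgbackrest info, where the stanza block contains a status: ok line and full backups are labelled YYYYMMDD-HHMMSSF; backup_info_ok is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: succeed only if `pgbackrest info` text (read from stdin) reports an
# ok stanza status and lists at least one full backup.
# backup_info_ok is a hypothetical helper, not a pgBackRest command.
backup_info_ok() {
    info="$(cat)"
    printf '%s\n' "$info" | grep -Eq '^[[:space:]]*status: ok' || return 1
    printf '%s\n' "$info" | grep -Eq '^[[:space:]]*full backup: [0-9]{8}-[0-9]{6}F' || return 1
}

# Usage (run from the compose project directory):
#   docker compose exec -T pgbackrest pgbackrest --stanza=galaxy info | backup_info_ok \
#     && echo "full backup present"
```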

5. Scheduled backups (systemd timers)

/etc/systemd/system/galaxy-pgbackrest-full.service:

[Unit]
Description=galaxy pgBackRest full backup
After=docker.service

[Service]
Type=oneshot
ExecStart=/usr/bin/docker compose -f /home/galaxy/productgalaxy/docker-compose.yml exec -T pgbackrest pgbackrest --stanza=galaxy --type=full backup
WorkingDirectory=/home/galaxy/productgalaxy
User=galaxy

/etc/systemd/system/galaxy-pgbackrest-full.timer:

[Unit]
Description=galaxy pgBackRest full backup (nightly 03:00 UTC)

[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true

[Install]
WantedBy=timers.target

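Same shape for diff (Sundays 04:00): copy both units to galaxy-pgbackrest-diff.service and galaxy-pgbackrest-diff.timer, change --type=full to --type=diff, and set OnCalendar=Sun *-*-* 04:00:00.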
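Spelled out, the diff pair could be generated as follows. This is a sketch derived from the full-backup units above, not a file taken from the repo; write_diff_units is a hypothetical helper, and the "UTC" in the description assumes the host clock runs in UTC, since OnCalendar without a timezone uses local time.

```shell
#!/bin/sh
# Sketch: write the diff service and timer next to the full-backup units.
# write_diff_units is a hypothetical helper; pass a directory to override the
# default, for example a scratch directory to review the output first.
write_diff_units() {
    unit_dir="${1:-/etc/systemd/system}"
    cat > "$unit_dir/galaxy-pgbackrest-diff.service" <<'EOF'
[Unit]
Description=galaxy pgBackRest diff backup
After=docker.service

[Service]
Type=oneshot
ExecStart=/usr/bin/docker compose -f /home/galaxy/productgalaxy/docker-compose.yml exec -T pgbackrest pgbackrest --stanza=galaxy --type=diff backup
WorkingDirectory=/home/galaxy/productgalaxy
User=galaxy
EOF
    cat > "$unit_dir/galaxy-pgbackrest-diff.timer" <<'EOF'
[Unit]
Description=galaxy pgBackRest diff backup (Sundays 04:00 UTC)

[Timer]
OnCalendar=Sun *-*-* 04:00:00
Persistent=true

[Install]
WantedBy=timers.target
EOF
}
```

Writing to /etc/systemd/system needs root, so one option is to generate into a temporary directory, review the two files, and copy them over with sudo.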

sudo systemctl daemon-reload
sudo systemctl enable --now galaxy-pgbackrest-full.timer galaxy-pgbackrest-diff.timer
systemctl list-timers 'galaxy-*'

6. Quarterly restore drill (no overwrite)

# Spin a scratch postgres container alongside production.
docker run -d --name galaxy_restore_scratch \
  -e POSTGRES_PASSWORD=scratch \
  -v /data/postgres-scratch:/var/lib/postgresql/data \
  pgvector/pgvector:pg17

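# Stop it; restore from B2 over the scratch data dir.
# --pg1-path is resolved inside the pgbackrest container, so /data/postgres-scratch
# must be bind-mounted there read-write at /var/lib/postgresql/data-scratch first
# (step 2 mounts only the live data dir, read-only).
# --delta lets restore overwrite the cluster that initdb created on first start;
# without it pgBackRest refuses to restore into a non-empty directory.
docker stop galaxy_restore_scratch
docker compose exec pgbackrest pgbackrest \
  --stanza=galaxy --type=time \
  --target='<YYYY-MM-DD HH:MM:SS+00>' \
  --target-action=promote \
  --pg1-path=/var/lib/postgresql/data-scratch \
  --delta \
  restore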

# Start scratch, query row counts, compare to last verify snapshot
docker start galaxy_restore_scratch
docker exec galaxy_restore_scratch psql -U galaxy -d galaxy -c "SELECT count(*) FROM comments;"

# Tear down — DO NOT leave running.
docker rm -f galaxy_restore_scratch
rm -rf /data/postgres-scratch

Record the drill timestamp + row counts in docs/operations/restore-drill-log.md. Drift > 0 rows = file an incident.
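The drift rule can be checked mechanically. A minimal sketch, assuming the expected count comes from wherever the last verify snapshot is recorded (that source is not specified here); row_drift is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: compare a restored row count against the expected one.
# row_drift is a hypothetical helper; it prints the absolute difference and
# returns non-zero when the counts differ.
row_drift() {
    expected="$1"
    restored="$2"
    drift=$(( expected - restored ))
    if [ "$drift" -lt 0 ]; then
        drift=$(( -drift ))
    fi
    echo "$drift"
    [ "$drift" -eq 0 ]
}

# Usage, with the scratch container from the drill still running and
# $expected set by hand from the last verify snapshot:
#   restored=$(docker exec galaxy_restore_scratch \
#     psql -U galaxy -d galaxy -Atc "SELECT count(*) FROM comments;")
#   row_drift "$expected" "$restored" || echo "drift detected: file an incident"
```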

Troubleshooting

  • archive command failed: exit code 78: pgBackRest can't reach the S3 endpoint. Check PGBACKREST_REPO1_S3_* env vars; test s3cmd ls s3://<bucket> from the host with the same creds.
  • stanza-create error: missing parameter pg1-host: usually a Docker network problem. Make sure the pgbackrest service is on the same network as postgres; add an explicit networks: block if it is not.
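  • WAL piling up on disk: archive_command is failing. It runs inside the Postgres container, so check docker logs galaxy_postgres 2>&1 | grep -i archive first, then docker logs galaxy_pgbackrest 2>&1 | grep -i error. SELECT failed_count, last_failed_wal FROM pg_stat_archiver; shows how many pushes have failed. Until fixed, Postgres WILL fill the disk (it won't recycle WALs that haven't been archived).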
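  • Restore says chunk … not found in repo: the backup file was deleted by the B2 lifecycle policy. If you need older points, bump PGBACKREST_REPO1_RETENTION_FULL and relax the bucket lifecycle rule from step 0 to match; raising retention alone does not stop B2 from deleting files.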
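For the WAL pile-up case, the backlog can be measured directly: Postgres marks each segment that still awaits archiving with a .ready file under pg_wal/archive_status. The sketch below counts them; count_unarchived_wal is a hypothetical helper name, and /data/postgres is assumed to be the data directory itself, as in the compose mount above.

```shell
#!/bin/sh
# Sketch: count WAL segments still waiting to be archived. Each pending
# segment has a matching .ready file in pg_wal/archive_status.
# count_unarchived_wal is a hypothetical helper name.
count_unarchived_wal() {
    status_dir="$1/pg_wal/archive_status"
    find "$status_dir" -maxdepth 1 -type f -name '*.ready' | wc -l | tr -d ' '
}

# Usage on the host (the data dir is usually readable only by root or postgres):
#   sudo sh -c '. ./wal-backlog.sh; count_unarchived_wal /data/postgres'
```

A count that keeps growing between two runs means archive-push is still failing; a small, stable count is normal under write load. The wal-backlog.sh filename in the usage line is only a placeholder for wherever the function is saved.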

Version history (1)

  • v1 · 2026-06-01 10:19 · "galaxy-docs importer: initial import"