VPS bootstrap — Hetzner ax-line
engineering-docs-operations-vps-bootstrap · in engineering/docs/operations · org-wide · updated 2026-06-01 10:19
Frontmatter
- lang: en
- imported_at: 2026-06-01T10:19:43.642Z
- source_path: productgalaxy/docs/operations/vps-bootstrap.md
- source_repo: productgalaxy
Provision the production host for productgalaxy. Run once per host. Total time: ~45 minutes, including waits for apt installs and Docker image pulls.
0. Order the box
- Hetzner: AX42 (Ryzen 7700, 64 GB ECC, 2× 512 GB NVMe) — ~€39/mo
- Or equivalent: 8+ vCPU, 32+ GB RAM, 2 NVMe drives (one for OS+Docker, one for Postgres + pgBackRest)
- OS: Debian 12 minimal
- SSH key uploaded during order
1. First SSH + harden
ssh root@<vps-ip>
adduser galaxy --disabled-password --gecos ""
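usermod -aG sudo galaxy
# galaxy has no password, so sudo needs a NOPASSWD rule (alternative: set one with passwd galaxy)
echo 'galaxy ALL=(ALL) NOPASSWD:ALL' > /etc/sudoers.d/galaxy
chmod 440 /etc/sudoers.d/galaxy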
mkdir -p /home/galaxy/.ssh
cp /root/.ssh/authorized_keys /home/galaxy/.ssh/
chown -R galaxy:galaxy /home/galaxy/.ssh
chmod 700 /home/galaxy/.ssh
chmod 600 /home/galaxy/.ssh/authorized_keys
# disable root SSH + password auth
sed -i 's/^#*PermitRootLogin.*/PermitRootLogin no/; s/^#*PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sshd -t && systemctl restart sshd # validate the config first; keep this root session open until the galaxy login works
# from now on: ssh galaxy@<vps-ip>
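Before closing the root session, it can be worth confirming that the hardening actually took effect, since a drop-in file under /etc/ssh/sshd_config.d/ may override the main config. The sketch below assumes OpenSSH's sshd -T, which prints the effective configuration with lowercase keywords; the helper name sshd_is_hardened is hypothetical and not part of the repo.

```shell
# Reads `sshd -T` output on stdin.
# Succeeds only if root login and password auth are both reported as "no".
sshd_is_hardened() {
  local cfg
  cfg="$(cat)"
  printf '%s\n' "$cfg" | grep -qix 'permitrootlogin no' &&
    printf '%s\n' "$cfg" | grep -qix 'passwordauthentication no'
}

# Usage on the host (as root, before logging out):
#   sshd -T | sshd_is_hardened && echo "hardened" || echo "NOT hardened"
```

If the check fails, look for a conflicting PermitRootLogin or PasswordAuthentication line in the drop-in directory.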
2. Firewall
sudo apt update && sudo apt install -y ufw
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 22/tcp # SSH (consider moving to a non-22 port if pinged by bots)
# DO NOT open 80/443 — Cloudflare Tunnel handles those (per cloudflare-tunnel-setup.md).
# If skipping Cloudflare Tunnel, instead: sudo ufw allow 443/tcp
sudo ufw --force enable
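To confirm that only SSH is reachable after enabling the firewall, the rule list can be filtered for anything other than 22/tcp. This is a rough sketch that assumes the usual tabular layout of ufw status (target in the first column, ALLOW in the action column); unexpected_allows is a hypothetical helper name.

```shell
# Reads `sudo ufw status` output on stdin and prints every ALLOW target
# other than 22/tcp. Empty output means SSH is the only open port.
unexpected_allows() {
  awk '/ALLOW/ && $1 != "22/tcp" { print $1 }' | sort -u
}

# Usage on the host:
#   sudo ufw status | unexpected_allows
```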
3. Docker + Compose
sudo apt install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/debian $(. /etc/os-release && echo "$VERSION_CODENAME") stable" \
| sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo usermod -aG docker galaxy
# log out + back in so the group takes effect
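Group membership only applies to new login sessions, so a quick check after reconnecting can confirm that the docker group is active before continuing. A minimal sketch; has_group is a hypothetical helper name.

```shell
# Succeeds if the space-separated group list in $1 contains exactly the group in $2.
has_group() {
  printf '%s\n' "$1" | tr ' ' '\n' | grep -qx -- "$2"
}

# Usage after logging back in as galaxy:
#   has_group "$(id -nG)" docker && docker run --rm hello-world
```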
4. Mount the second NVMe for Postgres + pgBackRest
lsblk # find the second drive (e.g. nvme1n1)
sudo mkfs.ext4 -L galaxy-data /dev/nvme1n1
sudo mkdir -p /data
sudo blkid /dev/nvme1n1 # copy the UUID
echo "UUID=<paste-uuid> /data ext4 defaults,noatime 0 2" | sudo tee -a /etc/fstab
sudo mount /data
sudo mkdir -p /data/postgres /data/pgbackrest
sudo chown -R galaxy:galaxy /data
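Pasting the UUID by hand is easy to get wrong, and a bad /etc/fstab entry can leave the host unbootable. The sketch below builds the same fstab line as the step above and refuses anything that does not look like an ext4 UUID, such as a leftover placeholder. The helper name fstab_line is hypothetical; blkid -s UUID -o value and findmnt --verify are standard util-linux commands.

```shell
# Prints an /etc/fstab line for the given filesystem UUID ($1) and mount point ($2),
# using the same ext4 options as the manual step.
fstab_line() {
  # ext4 UUIDs are 8-4-4-4-12 hex digits; reject everything else.
  if ! printf '%s\n' "$1" | grep -Eqx '[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}'; then
    echo "not a filesystem UUID: $1" >&2
    return 1
  fi
  printf 'UUID=%s %s ext4 defaults,noatime 0 2\n' "$1" "$2"
}

# Usage on the host:
#   uuid="$(sudo blkid -s UUID -o value /dev/nvme1n1)"
#   fstab_line "$uuid" /data | sudo tee -a /etc/fstab
#   sudo findmnt --verify   # sanity-check fstab before the next reboot
```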
5. Clone repo + secrets
sudo mkdir -p /etc/galaxy
sudo chown galaxy:galaxy /etc/galaxy
cd ~
git clone git@github.com:parhumm/productgalaxy.git
cd productgalaxy
# Fetch the .env from SOPS-encrypted blob (see sops-secrets-setup.md)
sops -d .env.production.enc > /etc/galaxy/.env
chmod 600 /etc/galaxy/.env
ln -sf /etc/galaxy/.env .env
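Redirecting the decrypted output and running chmod afterwards leaves a short window in which the file exists with the default umask. One way to close that window, sketched here with a hypothetical helper named write_secret, is to create the file under a restrictive umask inside a subshell.

```shell
# Writes stdin to the path in $1 so that the file is owner-only from creation.
# The function body is a subshell, so the umask change does not leak to the caller.
write_secret() (
  umask 077
  # chmod also covers the case where the file already existed with looser permissions.
  cat > "$1" && chmod 600 "$1"
)

# Usage on the host (check that sops itself exited successfully):
#   sops -d .env.production.enc | write_secret /etc/galaxy/.env
```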
6. Mount /data into docker-compose.yml (one-time edit)
In docker-compose.yml, point Postgres data + pgBackRest at /data/... instead of named volumes:
postgres:
  volumes:
    - /data/postgres:/var/lib/postgresql/data
    - ./docker/postgres/tsearch_data:/usr/share/postgresql/17/tsearch_data/galaxy-extra:ro
pgbackrest:
  volumes:
    - /data/pgbackrest:/var/lib/pgbackrest
7. Bring up
docker compose pull
docker compose up -d postgres
sleep 10
pnpm --filter @galaxy/db run migrate # apply 0000 + 9001-9006 (needs Node.js + pnpm on the host; the steps above do not install them)
docker compose up -d # app + mcp + caddy + pgbackrest
docker compose ps # all healthy?
# needs jq (if missing: sudo apt install -y jq)
curl -fsSL http://localhost:3000/api/v1/openapi.json | jq '.paths | length'
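A fixed sleep can be too short on a cold first start and needlessly long otherwise. As an alternative, a small retry loop can poll until Postgres accepts connections. This is a sketch: wait_for is a hypothetical helper, and the usage line assumes the postgres service image ships pg_isready, as the official postgres image does.

```shell
# Retries a command once per second until it succeeds or the attempt limit is reached.
# Usage: wait_for <max-attempts> <command> [args...]
wait_for() {
  local tries="$1" i=1
  shift
  while ! "$@" >/dev/null 2>&1; do
    [ "$i" -ge "$tries" ] && return 1
    i=$((i + 1))
    sleep 1
  done
  return 0
}

# Usage in place of the fixed sleep:
#   docker compose up -d postgres
#   wait_for 30 docker compose exec -T postgres pg_isready -q || echo "postgres not ready" >&2
```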
8. Set up monitoring + alerts
See observability-setup.md — OpenTelemetry + Grafana/Loki/Tempo bring-up.
9. First backup + restore drill
docker compose exec pgbackrest pgbackrest --stanza=galaxy stanza-create
docker compose exec pgbackrest pgbackrest --stanza=galaxy --type=full backup
docker compose exec pgbackrest pgbackrest info
Restore drill (against a scratch container, NEVER overwrite prod):
# Spin up a scratch postgres container; restore there; query row counts; tear down.
# See pgbackrest-setup.md §"Quarterly restore drill".
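For the row-count step of the drill, one simple approach is to dump table/count pairs from both databases into text files and compare them. The sketch below covers only that comparison; how the counts are collected and how the scratch restore is run are left to pgbackrest-setup.md. The helper name row_counts_match and the tab-separated file layout are assumptions.

```shell
# Compares two listings of "table<TAB>row_count" lines, ignoring line order.
# Prints the differing lines and returns non-zero when the listings differ.
row_counts_match() {
  local a b rc=0
  a="$(mktemp)"
  b="$(mktemp)"
  sort "$1" > "$a"
  sort "$2" > "$b"
  diff "$a" "$b" || rc=1
  rm -f "$a" "$b"
  return "$rc"
}

# Usage, with counts captured from prod at backup time and from the scratch restore:
#   row_counts_match prod-counts.tsv scratch-counts.tsv && echo "restore drill OK"
```

An exact match is only expected when the prod counts were captured at the moment of the backup; rows written afterwards will show up as differences.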
Troubleshooting
- Postgres won't start after /data mount: check chown -R 999:999 /data/postgres (the postgres user inside the container is uid 999, not your host galaxy)
- pgBackRest stanza-create fails with "archive_mode is off": enable WAL archiving in postgresql.conf (see pgbackrest-setup.md)
- Cloudflare Tunnel can't reach :3000: tunnel runs on the host; container ports are bound to localhost, so the tunnel must use --url http://localhost:3000, not the public IP
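The ownership problem from the first item can be checked without starting the container. A minimal sketch, using a hypothetical helper named owned_by_uid that compares the numeric owner of a path against an expected uid:

```shell
# Succeeds if the path in $1 is owned by the numeric uid in $2.
owned_by_uid() {
  [ "$(ls -ldn -- "$1" | awk '{ print $3 }')" = "$2" ]
}

# Usage on the host:
#   owned_by_uid /data/postgres 999 || sudo chown -R 999:999 /data/postgres
```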