SOPS + age secrets setup (deploy-time decryption)
engineering-docs-operations-sops-secrets-setup · in engineering/docs/operations · org-wide · updated 2026-06-01 10:19
Frontmatter
- lang: en
- imported_at: 2026-06-01T10:19:43.525Z
- source_path: productgalaxy/docs/operations/sops-secrets-setup.md
- source_repo: productgalaxy
Status: required for deploy-staging.yml, deploy-prod.yml, deploy-prod-rollback.yml.
Owner: ops.
Last reviewed: 2026-05-25.
Why this exists
CLAUDE.md §14 and ADR-003 §"Secrets" together ban two patterns:
- `env_file:` for prod secrets in Compose. `docker inspect` exposes every env var to anyone with Docker-socket access (read: any container in the engine, depending on socket exposure). Bug bounty reports from 2025-26 repeatedly catch teams here.
- Secrets in plain `environment:` blocks in prod Compose. Same problem, plus the values land in shell history, in `ps` output, and in `docker compose config` dumps.
The only safe pattern on a single-host Compose stack is Compose `secrets:`
file mounts, where each secret is a file on tmpfs inside the container,
readable only by the configured uid. To get the values onto the host in the
first place, without committing plaintext to git, we use SOPS + age:
- `secrets/{staging,prod}/galaxy.enc.yaml` is encrypted-at-rest in git. Anyone with read access to the repo can see structure but not values.
- At deploy time, the GitHub Actions runner decrypts the file using `SOPS_AGE_KEY_STAGING` or `SOPS_AGE_KEY_PROD` (GitHub Encrypted Secrets) and `scp`s the plaintext yaml to `/etc/galaxy/secrets/galaxy.yaml` on the VPS (mode 600), then immediately shreds the runner-side copy.
- Compose mounts that yaml as Docker Secrets via the `secrets:` block in `docker-compose.prod.yml`. Containers read the file path (e.g. `/run/secrets/database_url`), not an env var.
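The deploy-time flow above might look like the following on the runner. This is a minimal sketch, not the actual workflow code: `push_secrets`, `run`, the `DRY_RUN` switch, and the `deploy@vps.example` target are hypothetical names, and it assumes the workflow has already exported the environment's age key as `SOPS_AGE_KEY`, the variable sops reads.

```shell
# Sketch only: push_secrets, run, and DRY_RUN are illustrative names, not repo code.
# With DRY_RUN=1 each command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# push_secrets <env> <ssh-target>
# The body is a subshell so the EXIT trap always removes the runner-side plaintext.
push_secrets() (
  set -eu
  env_name="$1"
  target="$2"
  umask 077                      # anything created below is owner-only
  plain="$(mktemp)"
  trap 'shred -u "$plain" 2>/dev/null || rm -f "$plain"' EXIT

  # 1. Decrypt with the age key sops picks up from SOPS_AGE_KEY.
  run sops --decrypt --output "$plain" "secrets/$env_name/galaxy.enc.yaml"
  # 2. Copy to the VPS and pin the mode.
  run scp "$plain" "$target:/etc/galaxy/secrets/galaxy.yaml"
  run ssh "$target" chmod 600 /etc/galaxy/secrets/galaxy.yaml
  # 3. The trap shreds (or removes) the plaintext when the subshell exits.
)
```

Running `DRY_RUN=1 push_secrets staging deploy@vps.example` prints the three commands without touching the VPS, which is a cheap way to review the plan before a real deploy.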
What you'll set up
| Component | Where it lives |
|---|---|
| `sops` + `age` binaries | local dev box + each VPS + GitHub Actions runners |
| Age key pairs (one per env) | dev box (private) + GitHub Secrets (private) + repo (public recipient in `.sops.yaml`) |
| `.sops.yaml` | repo root; maps `secrets/<env>/*.enc.yaml` → recipient |
| `secrets/staging/galaxy.enc.yaml` | repo, encrypted |
| `secrets/prod/galaxy.enc.yaml` | repo, encrypted |
| `/etc/galaxy/secrets/galaxy.yaml` | each VPS; decrypted at deploy time, mode 600 |
Prerequisites
- Mac (or Linux) dev box with Homebrew or apt.
- Repo write access.
- A safe place to store private age keys (1Password / Bitwarden vault).
Steps — one-time on your dev box
1. Install sops + age
# macOS
brew install sops age
# Debian / Ubuntu (writing to /usr/local/bin needs root, hence sudo)
SOPS_VER=3.9.1
sudo curl -fsSL "https://github.com/getsops/sops/releases/download/v${SOPS_VER}/sops-v${SOPS_VER}.linux.amd64" \
  -o /usr/local/bin/sops && sudo chmod +x /usr/local/bin/sops
AGE_VER=1.2.0
curl -fsSL "https://github.com/FiloSottile/age/releases/download/v${AGE_VER}/age-v${AGE_VER}-linux-amd64.tar.gz" \
  | tar -xz -C /tmp
sudo mv /tmp/age/age /tmp/age/age-keygen /usr/local/bin/
# Both platforms: confirm the binaries are on PATH
sops --version && age --version
2. Generate one age key pair per environment
mkdir -p ~/.config/sops/age
cd ~/.config/sops/age
age-keygen -o galaxy-dev.key # local dev (optional but useful)
age-keygen -o galaxy-staging.key
age-keygen -o galaxy-prod.key
# The output looks like:
# # created: 2026-05-25T...
# # public key: age1abcdef...
# AGE-SECRET-KEY-1XYZ...
#
# - The PUBLIC key (age1...) is the recipient — safe to commit.
# - The SECRET key (AGE-SECRET-KEY-1...) MUST NEVER leave this directory
# without ending up in (a) the team password manager AND (b) GitHub Secrets.
chmod 600 galaxy-*.key
Stash the secret keys in the team password manager (vault entry per
environment) BEFORE moving to step 3. A lost age key means rotating every
secret in galaxy.enc.yaml, which is painful.
3. Commit .sops.yaml to the repo root
See .sops.yaml — already created by Phase 6b. It tells
sops which recipient to use for each file:
# Encrypt every value except keys that start with sops_. SOPS uses Go (RE2)
# regexes, which have no lookahead, so the exclusion is an unencrypted_regex.
creation_rules:
  - path_regex: ^secrets/dev/.*\.enc\.yaml$
    unencrypted_regex: ^sops_
    age: <age public key for dev>
  - path_regex: ^secrets/staging/.*\.enc\.yaml$
    unencrypted_regex: ^sops_
    age: <age public key for staging>
  - path_regex: ^secrets/prod/.*\.enc\.yaml$
    unencrypted_regex: ^sops_
    age: <age public key for prod>
Once the placeholders are replaced with the age public keys (recipients) you generated in step 2, commit + push.
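To see which rule a given path falls under, the `path_regex` values can be exercised outside sops. A small sketch, assuming sops tries the rules top to bottom and uses the first match; `rule_for_path` is a hypothetical helper, and `grep -E` stands in for sops's own regex engine, which behaves the same for these simple patterns.

```shell
# rule_for_path <path>: print the environment whose path_regex matches, or "none".
# Hypothetical helper mirroring the three creation_rules; first match wins.
rule_for_path() {
  for env_name in dev staging prod; do
    if printf '%s\n' "$1" | grep -Eq "^secrets/${env_name}/.*\.enc\.yaml\$"; then
      printf '%s\n' "$env_name"
      return 0
    fi
  done
  printf 'none\n'
}

rule_for_path secrets/staging/galaxy.enc.yaml   # staging
rule_for_path secrets/prod/galaxy.yaml          # none: missing the .enc. infix
rule_for_path /tmp/scratch/galaxy.yaml          # none: outside secrets/
```

A path that prints `none` has no recipient from `.sops.yaml`, so sops refuses to encrypt it unless recipients are passed on the command line.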
4. Create the encrypted secret files
mkdir -p secrets/{staging,prod}
# Build a plain YAML in a private scratch dir, encrypt with sops, delete plain.
SCRATCH=$(mktemp -d)   # not named TMPDIR: mktemp and other tools read that variable
cat > "$SCRATCH/galaxy.yaml" <<'YAML'
# productgalaxy staging secrets: DECRYPTED FORM, never committed.
database_url: "postgres://galaxy_app:CHANGEME@postgres:5432/galaxy"
mcp_db_url: "postgres://mcp_app:CHANGEME@postgres:5432/galaxy"
better_auth_secret: "REPLACE_WITH_RANDOM_64_CHAR"
oauth_signing_key: "REPLACE_WITH_RANDOM_64_CHAR"
docs_api_token: "REPLACE_WITH_RANDOM_32_CHAR"
b2_key_id: "B2_KEY_ID"
b2_application_key: "B2_APPLICATION_KEY"
pgbackrest_cipher_pass: "REPLACE_WITH_RANDOM_64_CHAR"
smoke_principal_jwt: "smoke principal long-lived JWT (rotate quarterly)"
YAML
# The scratch path matches no path_regex in .sops.yaml, so pass the staging
# recipient explicitly (age-keygen -y prints the public key of a key file).
sops --encrypt --age "$(age-keygen -y ~/.config/sops/age/galaxy-staging.key)" \
  "$SCRATCH/galaxy.yaml" > secrets/staging/galaxy.enc.yaml
# shred is GNU coreutils; fall back to rm where it is missing (stock macOS).
shred -u "$SCRATCH/galaxy.yaml" 2>/dev/null || rm -f "$SCRATCH/galaxy.yaml"
rm -rf "$SCRATCH"
# Repeat for prod with prod values and the prod key.
Verify the encrypted file looks like ciphertext: head secrets/staging/galaxy.enc.yaml
should show ENC[AES256_GCM,data:...] blobs in place of the values above. Commit + push.
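The eyeball check can be backed by a scripted one. SOPS writes each encrypted value as an `ENC[AES256_GCM,...]` blob and appends a top-level `sops:` metadata block, so a file that lacks either is suspect. The sketch below is an assumption-laden helper, not repo code: `looks_encrypted` is a hypothetical name, and it only understands a flat `key: value` file like `galaxy.yaml`.

```shell
# looks_encrypted <file>: succeed only if every top-level value before the
# "sops:" metadata block is an ENC[AES256_GCM,...] blob and that block exists.
# Hypothetical helper; handles flat "key: value" YAML only.
looks_encrypted() {
  awk '
    /^sops:/                                   { meta = 1; exit }  # metadata block reached
    /^[ \t]*(#|$)/                             { next }            # comments, blank lines
    /^[A-Za-z0-9_]+:[ \t]*ENC\[AES256_GCM,/    { next }            # encrypted value
                                               { bad = 1; exit }   # anything else: plaintext
    END { exit (meta && !bad) ? 0 : 1 }
  ' "$1"
}
```

For example, `looks_encrypted secrets/staging/galaxy.enc.yaml && git add secrets/staging/galaxy.enc.yaml` stages the file only when the check passes.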
Example decrypted secret shape
When sops --decrypt secrets/staging/galaxy.enc.yaml runs (either locally
or in the deploy workflow), the result is:
database_url: "postgres://galaxy_app:s3cret@postgres:5432/galaxy"
mcp_db_url: "postgres://mcp_app:m3cret@postgres:5432/galaxy"
better_auth_secret: "f1c7d3...e9" # 64 random chars
oauth_signing_key: "9a8b7c...41"
docs_api_token: "0e6d2a...c4"
b2_key_id: "K005abcdef..."
b2_application_key: "K005ghijklmn..."
pgbackrest_cipher_pass: "f3a7b1...d2"
smoke_principal_jwt: "eyJhbGc..." # smoke principal long-lived JWT (rotate quarterly)
docker-compose.prod.yml mounts the WHOLE FILE as a single Docker secret
(/run/secrets/galaxy_yaml) and each container reads the keys it needs via
a tiny entrypoint that exports them as env vars inside the container's own
namespace (not via Compose environment:, which would leak to
docker inspect).
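The entrypoint described above could be as small as the following. This is a sketch under stated assumptions, since the real entrypoint is not shown here: `secret_value` and `export_secrets` are hypothetical names, the upper-casing of YAML keys into env var names (`database_url` → `DATABASE_URL`) is an assumption, and the parser only handles flat, double-quoted values with no embedded quotes.

```shell
# secret_value <key> <file>: print the double-quoted value of a top-level key.
secret_value() {
  sed -n "s/^$1:[[:space:]]*\"\([^\"]*\)\".*\$/\1/p" "$2" | head -n 1
}

# export_secrets <file> <key>...: export each key as an upper-cased env var.
# Fails on the first key that is missing or empty.
export_secrets() {
  secrets_file="$1"
  shift
  for key in "$@"; do
    value="$(secret_value "$key" "$secrets_file")"
    if [ -z "$value" ]; then
      echo "missing secret: $key" >&2
      return 1
    fi
    name="$(printf '%s' "$key" | tr '[:lower:]' '[:upper:]')"
    export "$name=$value"
  done
}

# A real entrypoint would end with something like:
#   export_secrets /run/secrets/galaxy_yaml database_url better_auth_secret
#   exec "$@"
```

Because the variables are exported by the process itself rather than declared in Compose, they are not part of the container configuration that `docker inspect` reports.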
5. Stash the age private keys in GitHub Secrets
# from your dev box
gh secret set SOPS_AGE_KEY_STAGING < ~/.config/sops/age/galaxy-staging.key
gh secret set SOPS_AGE_KEY_PROD < ~/.config/sops/age/galaxy-prod.key
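GitHub Encrypted Secrets are sealed with libsodium before they reach GitHub and are exposed only to workflow runs that reference them. GitHub redacts exact matches of a secret's value in logs, but that masking is best-effort, so never echo one deliberately.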
6. (Optional but recommended) install sops + age on each VPS
The deploy workflows decrypt on the runner and ship plaintext to the VPS, so
the VPS doesn't strictly need sops. But install it anyway — it's used by
the operator's emergency rollback runbook and by /galaxy:rollback:
# on each VPS: rerun the Debian / Ubuntu install commands from step 1.
# If your distro's repositories package both tools, apt works too:
sudo apt-get install -y sops age
Rotation cadence
- Age keys: rotate yearly minimum (calendar reminder in ops cycle).
Rotation procedure:
- Generate a new key (
age-keygen -o galaxy-prod-2027.key). - Add the new recipient to
.sops.yaml(keep the old one for grace). sops updatekeys secrets/prod/galaxy.enc.yamlre-encrypts to both recipients. Commit + push.- Update
SOPS_AGE_KEY_PRODin GitHub Secrets to the new private key. - After 7 days, remove the old recipient from
.sops.yaml; re-runsops updatekeys; commit + push. Old key is now ignored.
- Generate a new key (
- Individual secret values (DB passwords, B2 keys, JWT signing keys):
rotate quarterly or immediately on personnel change. Use
sops secrets/prod/galaxy.enc.yamlto open an editor with the decrypted file, change values, save, sops re-encrypts on close.
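The age-key rotation procedure can be written out as a reviewable command list. The sketch below only prints commands and executes nothing, because editing `.sops.yaml` stays a manual step. `rotation_plan` is a hypothetical helper, the commit messages are placeholders, and it assumes the comma-separated form `.sops.yaml` accepts for several age recipients.

```shell
# rotation_plan <env> <year>: print the age-key rotation commands for review.
# Hypothetical helper; key naming follows the galaxy-prod-2027.key example.
rotation_plan() {
  env_name="$1"
  year="$2"
  key="\$HOME/.config/sops/age/galaxy-${env_name}-${year}.key"
  secret="SOPS_AGE_KEY_$(printf '%s' "$env_name" | tr '[:lower:]' '[:upper:]')"
  enc="secrets/${env_name}/galaxy.enc.yaml"
  cat <<EOF
age-keygen -o $key
age-keygen -y $key   # prints the new recipient; set "age: <old>,<new>" in .sops.yaml
sops updatekeys --yes $enc
git commit -am "secrets($env_name): add ${year} age recipient" && git push
gh secret set $secret < $key
# after the 7-day grace period: drop <old> from .sops.yaml, then
sops updatekeys --yes $enc
git commit -am "secrets($env_name): remove old age recipient" && git push
EOF
}

rotation_plan prod 2027
```

Keeping both recipients during the grace period means a deploy that still carries the old `SOPS_AGE_KEY_PROD` can decrypt until the GitHub secret is switched.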
Troubleshooting
| Symptom | Likely cause |
|---|---|
| `sops --decrypt` returns "no key found" | `SOPS_AGE_KEY` env var not set, or key doesn't match the recipient in the file |
| sops succeeds but yaml parse fails inside container | the key-matching regex in `.sops.yaml` caught a structural key by mistake; check `.sops.yaml` |
| permission denied reading `/etc/galaxy/secrets/galaxy.yaml` | deploy step didn't chmod 600 the destination; re-deploy |
| Workflow log shows secret value in plain | someone added `echo $secret` to a step; rotate immediately |
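For the "no key found" row, one quick diagnostic is to compare the key's public recipient with the recipients recorded in the encrypted file's `sops:` metadata. A sketch, assuming the key file still carries the `# public key:` comment that `age-keygen` writes; all three function names are hypothetical.

```shell
# key_recipient <key-file>: public key of an age identity file, read from the
# "# public key:" comment written by age-keygen.
key_recipient() {
  sed -n 's/^# public key: \(age1[a-z0-9]*\).*$/\1/p' "$1" | head -n 1
}

# file_recipients <enc-file>: age recipients listed in the file's sops metadata.
file_recipients() {
  sed -n 's/^[[:space:]]*-\{0,1\}[[:space:]]*recipient:[[:space:]]*\(age1[a-z0-9]*\).*$/\1/p' "$1"
}

# key_matches_file <key-file> <enc-file>: succeed if the key's recipient is
# one of the recipients the file was encrypted to.
key_matches_file() {
  pub="$(key_recipient "$1")"
  [ -n "$pub" ] && file_recipients "$2" | grep -qxF "$pub"
}
```

If `key_matches_file ~/.config/sops/age/galaxy-staging.key secrets/staging/galaxy.enc.yaml` fails, the file was encrypted to a different key, and setting `SOPS_AGE_KEY` to this one will not help.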
Hard rules (do NOT bypass)
- NEVER `git commit` an unencrypted `*.yaml` under `secrets/`. The `.gitignore` already excludes `secrets/**/*.yaml` (without `.enc.`); the `pre-commit` hook (Phase 6c) re-checks. If you bypass and push: rotate every value the file contained, then purge the file from the commit history with `git filter-repo`.
- NEVER `cat /etc/galaxy/secrets/galaxy.yaml` in a Claude Code session. Settings.json denies `Bash(cat:*galaxy.yaml*)`; if you find a way around the deny rule, file a bug: that's the three-lock pattern leaking.
- NEVER paste a secret value into chat / a PR comment / a Slack message. Anything that ends up in a SaaS log is effectively compromised.
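The first rule lends itself to a mechanical check. The following is a guess at what such a pre-commit check could look like, not the actual Phase 6c hook: `is_plaintext_secret` and `check_staged` are hypothetical names, and only the `.yaml` extension is covered.

```shell
# is_plaintext_secret <path>: succeed if the path is a YAML file under secrets/
# that lacks the .enc. infix. Hypothetical; the real hook may differ.
is_plaintext_secret() {
  case "$1" in
    secrets/*.enc.yaml) return 1 ;;   # encrypted form, allowed
    secrets/*.yaml)     return 0 ;;   # plain YAML under secrets/, blocked
    *)                  return 1 ;;   # everything else is out of scope
  esac
}

# check_staged: read staged paths on stdin, fail if any is a plaintext secret.
check_staged() {
  status=0
  while IFS= read -r path; do
    if is_plaintext_secret "$path"; then
      echo "refusing to commit plaintext secret: $path" >&2
      status=1
    fi
  done
  return "$status"
}

# In .git/hooks/pre-commit this would be fed by:
#   git diff --cached --name-only --diff-filter=ACM | check_staged
```

A non-zero exit from the hook aborts the commit, so the plaintext never reaches history and no rotation is needed.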