project-pal-e-backup (updated 2026-03-16)
pal-e-backup
Vision
Off-site backup and disaster recovery for the entire Pal-E platform. If the PC dies, everything can be rebuilt from cloud backups — databases, git repos, identity, secrets, object storage. One unified pipeline, one cloud destination, managed as IaC.
User Stories
| Role | Key | Story | Success Metric |
|---|---|---|---|
| Platform owner | sleep-at-night | I want to know that a disk failure won't destroy my platform | Full restore from cloud backups tested and documented |
| Platform owner | backup-confidence | I want to be alerted if any backup fails | Alert fires within 1 hour of a missed backup |
| Platform owner | restore-speed | I want to restore the full platform in under 2 hours | Restore time documented from DR test |
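The backup-confidence metric can be read as a freshness rule: a backup is "missed" once its last success is older than the schedule interval plus the 1-hour alert window. The sketch below is one possible reading, not the project's verification job. The 24-hour interval, the function name `stale_backups`, and the sample timestamps are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumptions (not from the source): backups run every 24 hours, and a backup
# counts as "missed" once it is older than that interval plus the 1-hour
# alert window taken from the backup-confidence story.
BACKUP_INTERVAL = timedelta(hours=24)
ALERT_WINDOW = timedelta(hours=1)


def stale_backups(last_success, now):
    """Return the sorted names of backups whose last success is too old.

    last_success maps a backup name to the datetime of its most recent
    successful run, or None if it has never succeeded.
    """
    deadline = BACKUP_INTERVAL + ALERT_WINDOW
    stale = []
    for name, finished_at in last_success.items():
        if finished_at is None or now - finished_at > deadline:
            stale.append(name)
    return sorted(stale)


# Hypothetical sample data.
now = datetime(2026, 3, 16, 12, 0, tzinfo=timezone.utc)
last_success = {
    "postgres": now - timedelta(hours=3),   # fresh
    "forgejo": now - timedelta(hours=26),   # older than 25 hours
    "keycloak": None,                       # never succeeded
}
print(stale_backups(last_success, now))  # ['forgejo', 'keycloak']
```

Treating a backup that has never succeeded as stale keeps a newly added job from passing silently before its first run.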
Plan
Active: plan-pal-e-backup — Off-Site Platform Backup
7 phases: foundation (S3 + repo), database backups, Forgejo backup, MinIO mirror, identity/secrets, monitoring/verification, disaster recovery test.
Board
board-pal-e-backup — Pal-E Backup Board. Continuous kanban.
Status
- Current backup coverage: pal-e-docs DB and woodpecker DB have CNPG WAL archiving to local MinIO. Terraform state backed up to local MinIO daily. Everything else has zero backup.
- Off-site backup: None. All backups are on the same disk as the data they protect.
- Backup verification: cnpg-backup-verify CronJob exists but is currently failing.
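The coverage described in the list above can be written down as a small inventory, which makes the gaps easy to count. This is a sketch only: the mapping of "everything else" onto the nodes of the architecture diagram is an assumption, and the `DataStore` type and helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStore:
    name: str
    local_backup: bool    # has a backup on local MinIO today
    offsite_backup: bool  # has a copy off the node today


# Coverage per the Status list. Store names follow the architecture diagram;
# treating each remaining diagram node as "zero backup" is an assumption.
STORES = [
    DataStore("pal-e-docs DB", local_backup=True, offsite_backup=False),
    DataStore("woodpecker DB", local_backup=True, offsite_backup=False),
    DataStore("Terraform state", local_backup=True, offsite_backup=False),
    DataStore("basketball-api DB", local_backup=False, offsite_backup=False),
    DataStore("mcd-tracker DB", local_backup=False, offsite_backup=False),
    DataStore("Forgejo", local_backup=False, offsite_backup=False),
    DataStore("Keycloak", local_backup=False, offsite_backup=False),
    DataStore("MinIO", local_backup=False, offsite_backup=False),
    DataStore("k8s Secrets", local_backup=False, offsite_backup=False),
]


def without_any_backup(stores):
    """Names of stores with neither a local nor an off-site backup."""
    return [s.name for s in stores if not (s.local_backup or s.offsite_backup)]


def lost_on_disk_failure(stores):
    """Names of stores with no off-site copy, so a dead disk loses them."""
    return [s.name for s in stores if not s.offsite_backup]


print(len(without_any_backup(STORES)))    # 6
print(len(lost_on_disk_failure(STORES)))  # 9
```

Under these assumptions every store is lost on a disk failure, because the three local backups sit on the same disk as the data they protect.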
Milestones
None yet. First milestone will be defined when Phase 7 (DR test) completes — "Platform Protected."
Architecture
flowchart TD
    subgraph k3s["k3s Node (archbox)"]
        PG_DOCS["pal-e-docs DB<br/>CNPG"]
        PG_WP["woodpecker DB<br/>CNPG"]
        PG_BBALL["basketball-api DB<br/>plain pod"]
        PG_MCD["mcd-tracker DB<br/>plain pod"]
        FORGEJO["Forgejo<br/>git repos + SQLite"]
        KEYCLOAK["Keycloak<br/>H2 file DB"]
        MINIO["MinIO<br/>WAL + TF state + assets"]
        K8S_SECRETS["k8s Secrets"]
    end
    subgraph cron["Backup CronJobs"]
        PGDUMP["pg_dump<br/>daily"]
        GDUMP["gitea dump<br/>daily"]
        MIRROR["mc mirror<br/>daily"]
        KEXPORT["keycloak export<br/>daily"]
        SDUMP["secrets export<br/>daily (encrypted)"]
    end
    subgraph cloud["External S3 (Backblaze B2)"]
        S3["s3://pal-e-backups/"]
        S3_PG["postgres/"]
        S3_FG["forgejo/"]
        S3_MM["minio-mirror/"]
        S3_KC["keycloak/"]
        S3_SEC["k8s-secrets/"]
    end
    PG_DOCS --> PGDUMP
    PG_WP --> PGDUMP
    PG_BBALL --> PGDUMP
    PG_MCD --> PGDUMP
    FORGEJO --> GDUMP
    MINIO --> MIRROR
    KEYCLOAK --> KEXPORT
    K8S_SECRETS --> SDUMP
    PGDUMP --> S3_PG
    GDUMP --> S3_FG
    MIRROR --> S3_MM
    KEXPORT --> S3_KC
    SDUMP --> S3_SEC
    S3_PG --> S3
    S3_FG --> S3
    S3_MM --> S3
    S3_KC --> S3
    S3_SEC --> S3
Backup Flow. Five CronJobs run daily in the k3s cluster. Each targets a specific data category, compresses/encrypts as appropriate, and uploads to a single external S3 bucket organized by directory. A verification job checks freshness and alerts on failure.
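The "single bucket organized by directory" layout can be sketched as a function that builds an object URI per backup category. The bucket name and the five prefixes come from the diagram. The category keys, the per-day subdirectory, the file name, and the function name `backup_uri` are hypothetical choices for illustration, not a confirmed layout.

```python
from datetime import date

# Bucket and prefixes as shown in the architecture diagram.
BUCKET = "pal-e-backups"
PREFIXES = {
    "postgres": "postgres/",
    "forgejo": "forgejo/",
    "minio": "minio-mirror/",
    "keycloak": "keycloak/",
    "secrets": "k8s-secrets/",
}


def backup_uri(category, filename, day):
    """Build the S3 URI for one backup artifact.

    Assumption: artifacts are grouped under a per-day subdirectory
    (YYYY-MM-DD). The source does not specify naming below the prefix.
    Raises KeyError for a category that has no prefix.
    """
    prefix = PREFIXES[category]
    return f"s3://{BUCKET}/{prefix}{day.isoformat()}/{filename}"


# Hypothetical file name for a compressed pg_dump of the pal-e-docs DB.
print(backup_uri("postgres", "pal-e-docs.sql.gz", date(2026, 3, 16)))
# s3://pal-e-backups/postgres/2026-03-16/pal-e-docs.sql.gz
```

Keeping the prefix table in one place means a verification job and the five CronJobs can agree on where each category lands.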
Repos
| Repo | Platform | Role | Status |
|---|---|---|---|
| pal-e-backup | Forgejo | Terraform + backup scripts + CronJob manifests | planned |
Inbox
No untriaged items.