fix(deploy): make LXC deploys atomic and fail-fast

Rebuild the deployment flow to prepare releases remotely, validate env/sudo prerequisites, run migrations in-release, and auto-rollback on health failures. Consolidate deployment docs and add a manual CI workflow so laptop and CI use the same push-based deploy path.

parent d228b44209 · commit 2efdb2b785
8 changed files with 1057 additions and 319 deletions
# Deployment Guide (LXC + systemd + nginx)

This project deploys from an external machine (developer laptop or CI runner) to a Debian LXC host over SSH.

Deployments are push-based, release-based, and atomic:

- Build and validate locally
- Upload to `/opt/innercontext/releases/<timestamp>`
- Run backend dependency sync and migrations in that release directory
- Promote once by switching `/opt/innercontext/current`
- Restart services and run health checks
- Auto-rollback on failure

Environment files have exactly two persistent locations on the server:

- `/opt/innercontext/shared/backend/.env`
- `/opt/innercontext/shared/frontend/.env.production`

Each release links to those files from:

- `/opt/innercontext/current/backend/.env` -> `../../../shared/backend/.env`
- `/opt/innercontext/current/frontend/.env.production` -> `../../../shared/frontend/.env.production`
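The promote step can be sketched as a symlink swap: point a temporary symlink at the new release, then rename it over `current`. Because `rename(2)` is atomic, services never observe a half-switched `current`. This is a sketch with stand-in paths, not `deploy.sh`'s literal code:

```bash
#!/usr/bin/env bash
# Sketch: atomic release promotion via symlink swap (stand-in paths).
set -euo pipefail

root=$(mktemp -d)                      # stand-in for /opt/innercontext
release="$root/releases/20240101120000"
mkdir -p "$release"

# Point a *temporary* symlink at the new release, then rename it over
# "current". The rename is atomic, so readers never see a missing link.
ln -sfn "$release" "$root/current.tmp"
mv -T "$root/current.tmp" "$root/current"

readlink "$root/current"               # prints the new release path
```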
## Architecture

```
external machine (manual now, CI later)
        |
        | ssh + rsync
        v
LXC host
  /opt/innercontext/
    current -> releases/<timestamp>
    releases/<timestamp>/
    shared/backend/.env
    shared/frontend/.env.production
    scripts/
```

Runtime traffic:

```
Reverse proxy (existing)              innercontext LXC (Debian 13)
┌──────────────────────┐             ┌────────────────────────────────────┐
│ reverse proxy        │────────────▶│ nginx :80                          │
│ innercontext.lan → * │             │   /api/* → uvicorn :8000/*         │
└──────────────────────┘             │   /*     → SvelteKit Node :3000    │
                                     └────────────────────────────────────┘
                                             │               │
                                          FastAPI      SvelteKit Node
```

> **Frontend is never built on the server.** The `vite build` + `adapter-node`
> esbuild step is CPU/RAM-intensive and will hang on a small LXC. Build locally,
> deploy the `build/` artifact via `deploy.sh`.

Services:

- `innercontext` (FastAPI, localhost:8000)
- `innercontext-node` (SvelteKit Node, localhost:3000)
- `innercontext-pricing-worker` (background worker)

nginx routes:

- `/api/*` -> `http://127.0.0.1:8000/*`
- `/*` -> `http://127.0.0.1:3000/*`

---

## Prerequisites

- Proxmox VE host with an existing PostgreSQL LXC and a reverse proxy
- LAN hostname `innercontext.lan` resolvable on the network (via router DNS or `/etc/hosts`)
- The PostgreSQL LXC must accept connections from the innercontext LXC IP
---

## Run Model

- Manual deploy: run `./deploy.sh ...` from the repo root on your laptop.
- Optional CI deploy: run the same script from a manual workflow (`workflow_dispatch`).
- The server never builds frontend assets.

---

## Create the LXC container

In the Proxmox UI (or via CLI):

```bash
# CLI example — adjust storage, bridge, and IP to your environment
pct create 200 local:vztmpl/debian-13-standard_13.0-1_amd64.tar.zst \
  --hostname innercontext \
  --cores 2 \
  --memory 1024 \
  --swap 512 \
  --rootfs local-lvm:8 \
  --net0 name=eth0,bridge=vmbr0,ip=dhcp \
  --unprivileged 1 \
  --start 1
```

Note the container's IP address after it starts (`pct exec 200 -- ip -4 a`).

---

## One-Time Server Setup

Run on the LXC host as root.
Enter the container:

```bash
pct enter 200  # or SSH into the container
```

### 1) Install runtime dependencies

```bash
apt update && apt upgrade -y
apt install -y git nginx curl ca-certificates libpq5 rsync python3 python3-venv
```

**uv:**

```bash
curl -LsSf https://astral.sh/uv/install.sh | UV_INSTALL_DIR=/usr/local/bin sh
```

Installing to `/usr/local/bin` makes `uv` available system-wide (required for `sudo -u innercontext uv sync`).

**Node.js 24 LTS + pnpm:**

The server needs Node.js to **run** the pre-built frontend bundle, and pnpm to
**install production runtime dependencies** (`clsx`, `bits-ui`, etc. —
`adapter-node` bundles the SvelteKit framework but leaves these external).
The frontend is never **built** on the server.

```bash
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.4/install.sh | bash
. "$HOME/.nvm/nvm.sh"
nvm install 24
```

Copy `node` to `/usr/local/bin` so it is accessible system-wide
(required for `sudo -u innercontext` and for systemd).
Use `--remove-destination` to replace any existing symlink with a real file:

```bash
cp --remove-destination "$(nvm which current)" /usr/local/bin/node
```

Install pnpm as a standalone binary — self-contained, no wrapper scripts,
works system-wide:

```bash
curl -fsSL "https://github.com/pnpm/pnpm/releases/latest/download/pnpm-linux-x64" \
  -o /usr/local/bin/pnpm
chmod 755 /usr/local/bin/pnpm
```

### 2) Create the app user

```bash
useradd --system --create-home --shell /bin/bash innercontext
```
---

## Create the database on the PostgreSQL LXC

Run on the **PostgreSQL LXC**:

```bash
psql -U postgres <<'SQL'
CREATE USER innercontext WITH PASSWORD 'change-me';
CREATE DATABASE innercontext OWNER innercontext;
SQL
```

Edit `/etc/postgresql/18/main/pg_hba.conf` and add (replace `<lxc-ip>` with the innercontext container IP):

```
host    innercontext    innercontext    <lxc-ip>/32    scram-sha-256
```

Then reload:

```bash
systemctl reload postgresql
```

---
## Deploy directory layout

Back on the innercontext LXC, create the release skeleton. Code arrives via `deploy.sh` (rsync), never via a server-side clone:

```bash
mkdir -p /opt/innercontext/releases
mkdir -p /opt/innercontext/shared/backend
mkdir -p /opt/innercontext/shared/frontend
mkdir -p /opt/innercontext/scripts
chown -R innercontext:innercontext /opt/innercontext
```

---
### 3) Create shared env files

Backend:

```bash
cat > /opt/innercontext/shared/backend/.env <<'EOF'
DATABASE_URL=postgresql+psycopg://innercontext:change-me@<pg-ip>/innercontext
GEMINI_API_KEY=your-gemini-api-key
# GEMINI_MODEL=gemini-flash-latest  # optional, this is the default
EOF
```
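At deploy time each release gets relative symlinks back to these shared files. The linking step can be sketched as follows (stand-in paths; `deploy.sh` may implement this differently):

```bash
#!/usr/bin/env bash
# Sketch: link shared env files into a fresh release directory.
set -euo pipefail

root=$(mktemp -d)                          # stand-in for /opt/innercontext
release="$root/releases/20240101120000"
mkdir -p "$release/backend" "$release/frontend" \
         "$root/shared/backend" "$root/shared/frontend"
echo 'DATABASE_URL=postgresql+psycopg://u:p@host/db' > "$root/shared/backend/.env"
touch "$root/shared/frontend/.env.production"

# Relative links keep working after releases/<ts> is promoted to current.
ln -sfn ../../../shared/backend/.env "$release/backend/.env"
ln -sfn ../../../shared/frontend/.env.production "$release/frontend/.env.production"

grep -q DATABASE_URL "$release/backend/.env" && echo "backend env linked"
```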
Migrations run automatically during each deploy (`alembic upgrade head` inside the release directory). On first run this creates all tables; on subsequent deploys it applies only the new migrations.

> **Existing database (tables already created by `create_db_and_tables`):**
> run `uv run alembic stamp head` once from `/opt/innercontext/current/backend`
> to mark the current schema as migrated without re-running DDL.
Frontend, then lock down both shared files:

```bash
cat > /opt/innercontext/shared/frontend/.env.production <<'EOF'
PUBLIC_API_BASE=http://127.0.0.1:8000
ORIGIN=http://innercontext.lan
EOF

chmod 600 /opt/innercontext/shared/backend/.env
chmod 600 /opt/innercontext/shared/frontend/.env.production
chown innercontext:innercontext /opt/innercontext/shared/backend/.env
chown innercontext:innercontext /opt/innercontext/shared/frontend/.env.production
```
### 4) Grant deploy sudo permissions

```bash
cat > /etc/sudoers.d/innercontext-deploy << 'EOF'
innercontext ALL=(root) NOPASSWD: \
    /usr/bin/systemctl restart innercontext, \
    /usr/bin/systemctl restart innercontext-node, \
    /usr/bin/systemctl restart innercontext-pricing-worker, \
    /usr/bin/systemctl is-active innercontext, \
    /usr/bin/systemctl is-active innercontext-node, \
    /usr/bin/systemctl is-active innercontext-pricing-worker
EOF

chmod 440 /etc/sudoers.d/innercontext-deploy
visudo -c -f /etc/sudoers.d/innercontext-deploy

# Must work without password or TTY prompt:
sudo -u innercontext sudo -n -l
```

If `sudo -n -l` fails, deployments will fail during restart/rollback with
`sudo: a terminal is required` or `sudo: a password is required`.
### 5) Install systemd and nginx configs

After the first deploy (or after copying repo content to `/opt/innercontext/current`), install the unit files:

```bash
cp /opt/innercontext/current/systemd/innercontext.service /etc/systemd/system/
cp /opt/innercontext/current/systemd/innercontext-node.service /etc/systemd/system/
cp /opt/innercontext/current/systemd/innercontext-pricing-worker.service /etc/systemd/system/
systemctl daemon-reload
systemctl enable innercontext
systemctl enable innercontext-node
systemctl enable innercontext-pricing-worker
# Do NOT start yet — build/ is empty until the first deploy.sh run
```
nginx:

```bash
cp /opt/innercontext/current/nginx/innercontext.conf /etc/nginx/sites-available/innercontext
ln -sf /etc/nginx/sites-available/innercontext /etc/nginx/sites-enabled/innercontext
rm -f /etc/nginx/sites-enabled/default
nginx -t && systemctl reload nginx
```

---
## Reverse proxy configuration

Point your existing reverse proxy at the innercontext LXC's nginx (`<innercontext-lxc-ip>:80`).

Example — Caddy:

```
innercontext.lan {
    reverse_proxy <innercontext-lxc-ip>:80
}
```

Example — nginx upstream:

```nginx
server {
    listen 80;
    server_name innercontext.lan;
    location / {
        proxy_pass http://<innercontext-lxc-ip>:80;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

Reload your reverse proxy after applying the change.

---

## Local Machine Setup
All deploys (including the first one) run `deploy.sh` from your local machine.

### SSH config

Add to `~/.ssh/config`:

```
Host innercontext
    HostName <innercontext-lxc-ip>
    User innercontext
```

Make sure your SSH public key is in `/home/innercontext/.ssh/authorized_keys` on the server.
## Deploy Commands

```bash
# From the repo root on your local machine:
./deploy.sh            # full deploy (default = all)
./deploy.sh all
./deploy.sh backend
./deploy.sh frontend
./deploy.sh list
./deploy.sh rollback
```
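`list` plausibly just enumerates release directories and flags the promoted one; a sketch with stand-in paths (not `deploy.sh`'s literal code):

```bash
#!/usr/bin/env bash
# Sketch: list releases newest-first and mark the one "current" points at.
set -euo pipefail

root=$(mktemp -d)                      # stand-in for /opt/innercontext
mkdir -p "$root/releases/20240101120000" "$root/releases/20240102120000"
ln -sfn "$root/releases/20240102120000" "$root/current"

current=$(readlink "$root/current")
for rel in $(ls -1 "$root/releases" | sort -r); do
    marker=""
    [ "$root/releases/$rel" = "$current" ] && marker=" <- current"
    echo "$rel$marker"
done
```

Timestamped directory names sort lexically, so `sort -r` yields newest-first without parsing dates.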
Optional overrides:

```bash
DEPLOY_SERVER=innercontext ./deploy.sh all
DEPLOY_ROOT=/opt/innercontext ./deploy.sh backend
DEPLOY_ALLOW_DIRTY=1 ./deploy.sh frontend
```
## What `deploy.sh` Does

For `backend` / `frontend` / `all`:

1. Local checks (strict, fail-fast)
2. Acquire `/opt/innercontext/.deploy.lock`
3. Create a `<timestamp>` release directory
4. Upload the selected component(s)
5. Link shared env files in the release directory
6. `uv sync` + `alembic upgrade head` (backend scope)
7. Upload `scripts/`, `systemd/`, `nginx/`
8. Switch `current` to the prepared release
9. Restart affected services
10. Run health checks
11. Remove old releases (keep the last 5)
12. Write a deploy entry to `/opt/innercontext/deploy.log`

If anything fails after promotion, the script automatically rolls back to the previous release.
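The promote-then-verify-then-rollback shape can be sketched as follows (stand-in paths, stubbed health check; not `deploy.sh`'s literal code):

```bash
#!/usr/bin/env bash
# Sketch: promote a release, then roll back if health checks fail.
set -euo pipefail

root=$(mktemp -d)                      # stand-in for /opt/innercontext
mkdir -p "$root/releases/old" "$root/releases/new"
ln -sfn "$root/releases/old" "$root/current"

health_check() { return 1; }           # stub: pretend the new release is unhealthy

previous=$(readlink "$root/current")   # remember the rollback target first
ln -sfn "$root/releases/new" "$root/current.tmp"
mv -T "$root/current.tmp" "$root/current"

if ! health_check; then
    ln -sfn "$previous" "$root/current.tmp"    # flip back atomically
    mv -T "$root/current.tmp" "$root/current"
    echo "rolled back to $previous"
fi
```

Recording `previous` before the switch is the key move: rollback is then the same atomic symlink swap, just pointed backwards.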
## Health Checks

- Backend: `http://127.0.0.1:8000/health-check`
- Frontend: `http://127.0.0.1:3000/`
- Worker: `systemctl is-active innercontext-pricing-worker`

Manual checks on the server:

```bash
curl -sf http://127.0.0.1:8000/health-check
curl -sf http://127.0.0.1:3000/
systemctl is-active innercontext
systemctl is-active innercontext-node
systemctl is-active innercontext-pricing-worker
```

From any machine on the LAN:

```bash
curl http://innercontext.lan/api/health-check   # {"status":"ok"}
curl http://innercontext.lan/api/products       # []
curl http://innercontext.lan/                   # SvelteKit HTML shell
```
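A deploy-side health gate is usually a bounded retry loop rather than a single probe; a generic sketch (the probe command, attempt count, and 1-second spacing are assumptions):

```bash
#!/usr/bin/env bash
# Sketch: retry a health probe for a bounded number of attempts.
set -euo pipefail

wait_healthy() {
    # $1 = probe command string, $2 = attempts (1s apart)
    local tries=$2
    for _ in $(seq "$tries"); do
        if eval "$1" > /dev/null 2>&1; then
            return 0               # healthy
        fi
        sleep 1
    done
    return 1                       # still unhealthy after all attempts
}

# In deploy.sh the probe would be something like:
#   curl -sf --max-time 2 http://127.0.0.1:8000/health-check
wait_healthy "false" 2 || echo "unhealthy: rolling back"
```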
---

## Troubleshooting

### Lock exists

```bash
cat /opt/innercontext/.deploy.lock
rm -f /opt/innercontext/.deploy.lock
```

Only remove the lock if no deployment is running.

### 502 Bad Gateway on `/api/*`

```bash
systemctl status innercontext
journalctl -u innercontext -n 50
# Check that DATABASE_URL in the shared .env is correct and the PG LXC accepts connections
```
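A lock file like this is typically taken race-free by relying on exclusive file creation; a sketch of the pattern (assumption — `deploy.sh` may use a different mechanism, such as `flock`):

```bash
#!/usr/bin/env bash
# Sketch: race-free lock file via noclobber (open with O_CREAT|O_EXCL).
set -euo pipefail

lock=$(mktemp -d)/.deploy.lock    # stand-in for /opt/innercontext/.deploy.lock

if ( set -o noclobber; echo "$$ $(date -Is)" > "$lock" ) 2>/dev/null; then
    echo "lock acquired"
    # ... deploy steps would run here ...
    rm -f "$lock"
else
    echo "another deploy is running: $(cat "$lock")" >&2
    exit 1
fi
```

Writing the PID and timestamp into the lock is what makes the `cat` in the troubleshooting step informative.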
### Sudo password prompt during deploy

Re-check `/etc/sudoers.d/innercontext-deploy` and run:

```bash
visudo -c -f /etc/sudoers.d/innercontext-deploy
sudo -u innercontext sudo -n systemctl is-active innercontext
```

### Product prices stay empty / stale

```bash
systemctl status innercontext-pricing-worker
journalctl -u innercontext-pricing-worker -n 50
# Ensure the worker is running and can connect to PostgreSQL
```
### Backend migration failure

Validate the env file and DB connectivity:

```bash
ls -la /opt/innercontext/shared/backend/.env
grep '^DATABASE_URL=' /opt/innercontext/shared/backend/.env
```

### 502 Bad Gateway on `/`

```bash
systemctl status innercontext-node
journalctl -u innercontext-node -n 50
# Verify /opt/innercontext/current/frontend/build/index.js exists (deploy.sh ran successfully)
```
### Database connection refused

```bash
# From the innercontext LXC (psql takes a plain postgresql:// URL,
# not the SQLAlchemy postgresql+psycopg:// form):
psql "postgresql://innercontext:change-me@<pg-lxc-ip>/innercontext" -c "SELECT 1"
# If it fails, check pg_hba.conf on the PG LXC and verify the IP matches
```

### Service fails after deploy

```bash
journalctl -u innercontext -n 100
journalctl -u innercontext-node -n 100
journalctl -u innercontext-pricing-worker -n 100
```
## Manual CI Deploy (Optional)

Use the manual Forgejo workflow (`workflow_dispatch`) to run the same `./deploy.sh all` path from CI once server secrets and SSH trust are configured.