innercontext/docs/DEPLOYMENT.md
Piotr Oleszczyk 2efdb2b785 fix(deploy): make LXC deploys atomic and fail-fast
Rebuild the deployment flow to prepare releases remotely, validate env/sudo prerequisites, run migrations in-release, and auto-rollback on health failures. Consolidate deployment docs and add a manual CI workflow so laptop and CI use the same push-based deploy path.
2026-03-07 01:14:30 +01:00


# Deployment Guide (LXC + systemd + nginx)
This project deploys from an external machine (developer laptop or CI runner) to a Debian LXC host over SSH.
Deployments are push-based, release-based, and atomic:
- Build and validate locally
- Upload to `/opt/innercontext/releases/<timestamp>`
- Run backend dependency sync and migrations in that release directory
- Promote once by switching `/opt/innercontext/current`
- Restart services and run health checks
- Auto-rollback on failure
Environment files have exactly two persistent locations on the server:
- `/opt/innercontext/shared/backend/.env`
- `/opt/innercontext/shared/frontend/.env.production`
Each release directory symlinks to those files. Seen through `current`, the links are:
- `/opt/innercontext/current/backend/.env` -> `../../../shared/backend/.env`
- `/opt/innercontext/current/frontend/.env.production` -> `../../../shared/frontend/.env.production`
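
The relative targets resolve against the real release directory, not against the `current` symlink. The following is a minimal sketch of the linking step, run in a scratch directory rather than `/opt/innercontext`. The `link_shared_env` helper and the `demo` release name are hypothetical and may not match what `deploy.sh` does.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: link the shared env files into one release directory.
link_shared_env() {
  local release_dir="$1"
  mkdir -p "$release_dir/backend" "$release_dir/frontend"
  # Symlink targets are relative to the directory that contains the link.
  ln -sfn ../../../shared/backend/.env "$release_dir/backend/.env"
  ln -sfn ../../../shared/frontend/.env.production "$release_dir/frontend/.env.production"
}

# Demo layout in a scratch root (stand-in for /opt/innercontext).
root="$(mktemp -d)"
mkdir -p "$root/shared/backend" "$root/shared/frontend" "$root/releases/demo"
echo "DATABASE_URL=example" > "$root/shared/backend/.env"
echo "ORIGIN=http://example.invalid" > "$root/shared/frontend/.env.production"

link_shared_env "$root/releases/demo"
ln -sfn releases/demo "$root/current"

# Reading through `current` reaches the shared file.
cat "$root/current/backend/.env"
```

Because the links are relative, they keep working if the whole tree is moved or mounted elsewhere.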
## Architecture
```
external machine (laptop or CI runner)
        |
        | ssh + rsync
        v
LXC host
  /opt/innercontext/
    current -> releases/<timestamp>
    releases/<timestamp>
    shared/backend/.env
    shared/frontend/.env.production
    scripts/
```
Services:
- `innercontext` (FastAPI, localhost:8000)
- `innercontext-node` (SvelteKit Node, localhost:3000)
- `innercontext-pricing-worker` (background worker)
nginx routes:
- `/api/*` -> `http://127.0.0.1:8000/*`
- `/*` -> `http://127.0.0.1:3000/*`
## Run Model
- Manual deploy: run `./deploy.sh ...` from repo root on your laptop.
- Optional CI deploy: run the same script from a manual workflow (`workflow_dispatch`).
- The server never builds frontend assets.
## One-Time Server Setup
Run on the LXC host as root.
### 1) Install runtime dependencies
```bash
apt update && apt upgrade -y
apt install -y git nginx curl ca-certificates libpq5 rsync python3 python3-venv
curl -LsSf https://astral.sh/uv/install.sh | UV_INSTALL_DIR=/usr/local/bin sh
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.4/install.sh | bash
. "$HOME/.nvm/nvm.sh"
nvm install 24
cp --remove-destination "$(nvm which current)" /usr/local/bin/node
curl -fsSL "https://github.com/pnpm/pnpm/releases/latest/download/pnpm-linux-x64" \
  -o /usr/local/bin/pnpm
chmod 755 /usr/local/bin/pnpm
```
### 2) Create app user and directories
```bash
useradd --system --create-home --shell /bin/bash innercontext
mkdir -p /opt/innercontext/releases
mkdir -p /opt/innercontext/shared/backend
mkdir -p /opt/innercontext/shared/frontend
mkdir -p /opt/innercontext/scripts
chown -R innercontext:innercontext /opt/innercontext
```
### 3) Create shared env files
```bash
cat > /opt/innercontext/shared/backend/.env <<'EOF'
DATABASE_URL=postgresql+psycopg://innercontext:change-me@<pg-ip>/innercontext
GEMINI_API_KEY=your-key
EOF
cat > /opt/innercontext/shared/frontend/.env.production <<'EOF'
PUBLIC_API_BASE=http://127.0.0.1:8000
ORIGIN=http://innercontext.lan
EOF
chmod 600 /opt/innercontext/shared/backend/.env
chmod 600 /opt/innercontext/shared/frontend/.env.production
chown innercontext:innercontext /opt/innercontext/shared/backend/.env
chown innercontext:innercontext /opt/innercontext/shared/frontend/.env.production
```
### 4) Grant deploy sudo permissions
```bash
cat > /etc/sudoers.d/innercontext-deploy << 'EOF'
innercontext ALL=(root) NOPASSWD: \
    /usr/bin/systemctl restart innercontext, \
    /usr/bin/systemctl restart innercontext-node, \
    /usr/bin/systemctl restart innercontext-pricing-worker, \
    /usr/bin/systemctl is-active innercontext, \
    /usr/bin/systemctl is-active innercontext-node, \
    /usr/bin/systemctl is-active innercontext-pricing-worker
EOF
chmod 440 /etc/sudoers.d/innercontext-deploy
visudo -c -f /etc/sudoers.d/innercontext-deploy
# Must work without password or TTY prompt:
sudo -u innercontext sudo -n -l
```
If `sudo -n -l` fails, deployments will fail during restart/rollback with:
`sudo: a terminal is required` or `sudo: a password is required`.
### 5) Install systemd and nginx configs
After the first deploy (or after copying the repository content to `/opt/innercontext/current`), install the configs:
```bash
cp /opt/innercontext/current/systemd/innercontext.service /etc/systemd/system/
cp /opt/innercontext/current/systemd/innercontext-node.service /etc/systemd/system/
cp /opt/innercontext/current/systemd/innercontext-pricing-worker.service /etc/systemd/system/
systemctl daemon-reload
systemctl enable innercontext
systemctl enable innercontext-node
systemctl enable innercontext-pricing-worker
cp /opt/innercontext/current/nginx/innercontext.conf /etc/nginx/sites-available/innercontext
ln -sf /etc/nginx/sites-available/innercontext /etc/nginx/sites-enabled/innercontext
rm -f /etc/nginx/sites-enabled/default
nginx -t && systemctl reload nginx
```
## Local Machine Setup
`~/.ssh/config`:
```
Host innercontext
HostName <lxc-ip>
User innercontext
```
Ensure your public key is in `/home/innercontext/.ssh/authorized_keys`.
## Deploy Commands
From the repository root on the external machine:
```bash
./deploy.sh # full deploy (default = all)
./deploy.sh all
./deploy.sh backend
./deploy.sh frontend
./deploy.sh list
./deploy.sh rollback
```
Optional overrides:
```bash
DEPLOY_SERVER=innercontext ./deploy.sh all
DEPLOY_ROOT=/opt/innercontext ./deploy.sh backend
DEPLOY_ALLOW_DIRTY=1 ./deploy.sh frontend
```
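Overrides like these are commonly read with shell default expansion. The sketch below works under assumptions: the fallback values (`innercontext`, `/opt/innercontext`, `0`) are inferred from the examples in this guide, and the `resolve_deploy_config` helper is hypothetical. The actual defaults in `deploy.sh` may differ.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: resolve settings from the environment,
# falling back to assumed defaults when a variable is unset or empty.
resolve_deploy_config() {
  local server="${DEPLOY_SERVER:-innercontext}"
  local root="${DEPLOY_ROOT:-/opt/innercontext}"
  local allow_dirty="${DEPLOY_ALLOW_DIRTY:-0}"
  printf 'server=%s root=%s allow_dirty=%s\n' "$server" "$root" "$allow_dirty"
}

# Without overrides the assumed defaults apply.
resolve_deploy_config

# A prefix assignment overrides a value for a single invocation only.
DEPLOY_ALLOW_DIRTY=1 resolve_deploy_config
```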
## What `deploy.sh` Does
For `backend` / `frontend` / `all`:
1. Local checks (strict, fail-fast)
2. Acquire `/opt/innercontext/.deploy.lock`
3. Create `<timestamp>` release directory
4. Upload selected component(s)
5. Link shared env files in the release directory
6. `uv sync` + `alembic upgrade head` (backend scope)
7. Upload `scripts/`, `systemd/`, `nginx/`
8. Switch `current` to the prepared release
9. Restart affected services
10. Run health checks
11. Remove old releases (keep last 5)
12. Write deploy entry to `/opt/innercontext/deploy.log`
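
Step 11 could look roughly like the sketch below. It assumes release directory names are timestamps that sort lexicographically in chronological order (for example `20260307T011430`; the exact format is not documented here). The `prune_releases` helper is hypothetical, and the demo runs in a scratch directory.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: delete all but the newest $2 release directories under $1.
prune_releases() {
  local releases_dir="$1" keep="$2"
  local -a names=()
  local name
  # Glob expansion is sorted, so the oldest timestamps come first.
  for name in "$releases_dir"/*/; do
    [ -d "$name" ] || continue
    names+=("$(basename "$name")")
  done
  local excess=$(( ${#names[@]} - keep ))
  local i
  for (( i = 0; i < excess; i++ )); do
    rm -rf -- "${releases_dir:?}/${names[$i]}"
  done
}

# Demo: seven fake releases, keep the newest five.
releases="$(mktemp -d)"
for ts in 20260301T000000 20260302T000000 20260303T000000 20260304T000000 \
          20260305T000000 20260306T000000 20260307T000000; do
  mkdir "$releases/$ts"
done

prune_releases "$releases" 5
ls "$releases"
```

A production version would also need to skip whichever release `current` points at, which matters after a rollback to an older release.
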
If anything fails after promotion, the script automatically rolls back to the previous release.
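
Promotion with rollback can be sketched as follows, assuming `current` is a relative symlink that is replaced by a rename so readers never observe a missing link. The `promote` and `deploy_release` helpers and the check command passed to them are hypothetical, service restarts are left out, and `mv -T` is a GNU coreutils option (available on the Debian host).

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: atomically point <root>/current at releases/<name>.
promote() {
  local root="$1" name="$2"
  ln -sfn "releases/$name" "$root/current.tmp"
  # The rename replaces the old link in one step; -T stops mv from
  # descending into the directory that `current` points at.
  mv -T "$root/current.tmp" "$root/current"
}

# Hypothetical helper: promote, run a check command, roll back if it fails.
deploy_release() {
  local root="$1" name="$2"; shift 2
  local previous
  previous="$(basename "$(readlink "$root/current")")"
  promote "$root" "$name"
  if "$@"; then
    echo "promoted $name"
  else
    promote "$root" "$previous"
    echo "rolled back to $previous"
    return 1
  fi
}

# Demo in a scratch root with two fake releases.
root="$(mktemp -d)"
mkdir -p "$root/releases/old" "$root/releases/new"
promote "$root" old

deploy_release "$root" new false || true   # simulated failing health check
readlink "$root/current"
deploy_release "$root" new true            # simulated passing health check
readlink "$root/current"
```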
## Health Checks
- Backend: `http://127.0.0.1:8000/health-check`
- Frontend: `http://127.0.0.1:3000/`
- Worker: `systemctl is-active innercontext-pricing-worker`
Manual checks:
```bash
curl -sf http://127.0.0.1:8000/health-check
curl -sf http://127.0.0.1:3000/
systemctl is-active innercontext
systemctl is-active innercontext-node
systemctl is-active innercontext-pricing-worker
```
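Services may need a moment after a restart before they respond. The sketch below shows a retrying check built around a hypothetical `wait_healthy` helper. The attempt counts and delays are illustrative and are not values taken from `deploy.sh`.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: run a check command up to $1 times,
# sleeping $2 seconds between attempts.
wait_healthy() {
  local attempts="$1" delay="$2"; shift 2
  local i
  for (( i = 1; i <= attempts; i++ )); do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Do not sleep after the final attempt.
    if (( i < attempts )); then
      sleep "$delay"
    fi
  done
  return 1
}

# On the server, the checks from this section could be wrapped like so:
#   wait_healthy 10 2 curl -sf http://127.0.0.1:8000/health-check
#   wait_healthy 10 2 curl -sf http://127.0.0.1:3000/
#   wait_healthy 10 2 systemctl is-active innercontext-pricing-worker

# Local demo with stand-in commands.
wait_healthy 3 0 true && echo "healthy"
wait_healthy 2 0 false || echo "unhealthy"
```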
## Troubleshooting
### Lock exists
```bash
cat /opt/innercontext/.deploy.lock
rm -f /opt/innercontext/.deploy.lock
```
Only remove the lock if no deployment is running.
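
A stale lock usually means an earlier run exited without cleaning up. One common way to make acquisition atomic and release automatic is `noclobber` plus an `EXIT` trap. The sketch below shows that pattern; it is not necessarily how `deploy.sh` implements its lock, and the `acquire_lock` helper and the lock file contents are hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: create the lock file only if it does not exist yet.
acquire_lock() {
  local lock_file="$1"
  # With noclobber, `>` fails instead of overwriting an existing file.
  # The subshell keeps the option from leaking into the caller.
  if ( set -o noclobber; echo "pid=$$ started=$(date -u +%FT%TZ)" > "$lock_file" ) 2>/dev/null; then
    return 0
  fi
  echo "lock held: $(cat "$lock_file")" >&2
  return 1
}

# Demo with a scratch lock path (stand-in for /opt/innercontext/.deploy.lock).
lock="$(mktemp -d)/.deploy.lock"
acquire_lock "$lock"
# Remove the lock on any exit, including failures.
trap 'rm -f "$lock"' EXIT

# A second attempt while the lock exists is refused.
acquire_lock "$lock" || echo "second acquire refused"
```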
### Sudo password prompt during deploy
Re-check `/etc/sudoers.d/innercontext-deploy` and run:
```bash
visudo -c -f /etc/sudoers.d/innercontext-deploy
sudo -u innercontext sudo -n systemctl is-active innercontext
```
### Backend migration failure
Validate the env file first. If it looks correct, verify that the database named in `DATABASE_URL` is reachable from the LXC host.
```bash
ls -la /opt/innercontext/shared/backend/.env
grep '^DATABASE_URL=' /opt/innercontext/shared/backend/.env
```
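To go one step further than inspecting the file, the host and port can be extracted from `DATABASE_URL` and probed. The sketch below covers the parsing step with a hypothetical `db_host_port` helper. It assumes the password contains no `@` or `/` and that the host is not an IPv6 literal; the address `10.0.0.5` is a placeholder.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print "host port" parsed from a DATABASE_URL value.
# Assumes no '@' or '/' in the password and no IPv6 literal host.
db_host_port() {
  local url="$1"
  local authority="${url#*://}"      # drop the scheme
  authority="${authority%%/*}"       # drop the database path
  local hostport="${authority##*@}"  # drop user:password
  local host="${hostport%%:*}"
  local port=5432                    # PostgreSQL default port
  if [[ "$hostport" == *:* ]]; then
    port="${hostport##*:}"
  fi
  printf '%s %s\n' "$host" "$port"
}

# Demo with a scratch env file; on the server, read the shared backend .env instead.
env_file="$(mktemp)"
echo 'DATABASE_URL=postgresql+psycopg://innercontext:change-me@10.0.0.5/innercontext' > "$env_file"
url="$(grep '^DATABASE_URL=' "$env_file" | cut -d= -f2-)"
read -r host port <<< "$(db_host_port "$url")"
echo "would probe $host:$port"

# A TCP probe could then be (bash-only /dev/tcp, not run here):
#   timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" && echo reachable
```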
### Service fails after deploy
```bash
journalctl -u innercontext -n 100
journalctl -u innercontext-node -n 100
journalctl -u innercontext-pricing-worker -n 100
```
## Manual CI Deploy (Optional)
Use the manual Forgejo workflow (`workflow_dispatch`) to run the same `./deploy.sh all` path from CI once server secrets and SSH trust are configured.