chore(infra): align systemd units and Forgejo runners
Point services to /opt/innercontext/current release paths, remove stale phase completion docs, and switch Forgejo workflows to run on the lxc runner label.
parent 2efdb2b785
commit 5d69a976c4
7 changed files with 118 additions and 654 deletions
108
.forgejo/workflows/ci.yml
Normal file
@@ -0,0 +1,108 @@
name: CI

on:
  push:
    branches:
      - main
      - develop
  pull_request:
    branches:
      - main
      - develop

jobs:
  backend-lint:
    name: Backend Linting & Type Checking
    runs-on: lxc
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install uv
        run: |
          curl -LsSf https://astral.sh/uv/install.sh | sh
          echo "$HOME/.cargo/bin" >> $GITHUB_PATH

      - name: Install dependencies
        working-directory: backend
        run: uv sync

      - name: Run ruff check
        working-directory: backend
        run: uv run ruff check .

      - name: Run black check
        working-directory: backend
        run: uv run black --check .

      - name: Run isort check
        working-directory: backend
        run: uv run isort --check-only .

      - name: Run mypy type checking
        working-directory: backend
        run: uv run mypy innercontext/
        continue-on-error: true # Don't fail CI on type errors for now

  frontend-check:
    name: Frontend Type Checking & Linting
    runs-on: lxc
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install pnpm
        run: npm install -g pnpm

      - name: Install dependencies
        working-directory: frontend
        run: pnpm install --frozen-lockfile

      - name: Run svelte-check
        working-directory: frontend
        run: pnpm check

      - name: Run lint
        working-directory: frontend
        run: pnpm lint

      - name: Build frontend
        working-directory: frontend
        run: pnpm build

  backend-test:
    name: Backend Tests
    runs-on: lxc
    # Disabled for now since tests are not integrated yet
    if: false
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install uv
        run: |
          curl -LsSf https://astral.sh/uv/install.sh | sh
          echo "$HOME/.cargo/bin" >> $GITHUB_PATH

      - name: Install dependencies
        working-directory: backend
        run: uv sync

      - name: Run tests
        working-directory: backend
        run: uv run pytest
@@ -18,7 +18,7 @@ on:
 jobs:
   deploy:
     name: Manual deployment to LXC
-    runs-on: ubuntu-latest
+    runs-on: lxc
     steps:
       - name: Checkout code
         uses: actions/checkout@v4
@@ -1,231 +0,0 @@
# Phase 1: Safety & Validation - COMPLETE ✅

## Summary

Phase 1 implementation is complete! All LLM-based suggestion engines now have input sanitization and response validation to prevent dangerous suggestions from reaching users.

## What Was Implemented

### 1. Input Sanitization (`innercontext/llm_safety.py`)
- **Sanitizes user input** to prevent prompt injection attacks
- Removes patterns like "ignore previous instructions", "you are now a", etc.
- Length-limits user input (500 chars for notes, 10000 for product text)
- Wraps user input in clear delimiters for LLM
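
The sanitization steps listed above could be sketched roughly as follows. This is a minimal illustration, not the contents of `innercontext/llm_safety.py`, which this diff does not show: the function names, the regex list, and the `<user_input>` delimiter are all assumptions. Only the two example phrases and the 500/10000 character limits come from the text.

```python
import re

# Hypothetical pattern list; the real one in llm_safety.py is not shown here.
_INJECTION_PATTERNS = [
    re.compile(r"ignore\s+previous\s+instructions", re.IGNORECASE),
    re.compile(r"you\s+are\s+now\s+a", re.IGNORECASE),
]

MAX_NOTES_CHARS = 500            # limit quoted for notes
MAX_PRODUCT_TEXT_CHARS = 10_000  # limit quoted for product text


def sanitize_user_input(text: str, max_chars: int = MAX_NOTES_CHARS) -> str:
    """Strip known injection phrases, collapse whitespace, then truncate."""
    cleaned = text
    for pattern in _INJECTION_PATTERNS:
        cleaned = pattern.sub("", cleaned)
    cleaned = " ".join(cleaned.split())
    return cleaned[:max_chars]


def wrap_user_input(text: str) -> str:
    """Fence user text in delimiters (the tag name is an assumption)."""
    return f"<user_input>\n{text}\n</user_input>"
```

A single substitution pass like this can be defeated by nested phrases, so a filter of this kind is best treated as one layer next to the response validators rather than a complete defence.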

### 2. Validator Classes (`innercontext/validators/`)
Created 5 validators with comprehensive safety checks:

#### **RoutineSuggestionValidator** (88% test coverage)
- ✅ Rejects unknown product_ids
- ✅ Blocks retinoid + acid in same routine
- ✅ Enforces min_interval_hours rules
- ✅ Checks compromised barrier compatibility
- ✅ Validates context_rules (safe_after_shaving, etc.)
- ✅ Warns when AM routine missing SPF
- ✅ Rejects prohibited fields (dose, amount, etc.)
- ✅ Ensures each step has product_id OR action_type (not both/neither)
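
Three of the per-step rules above (unknown product IDs, prohibited fields, and the product_id-or-action_type rule) might look like this in isolation. It is a hedged sketch only: `validate_step`, the dict-shaped step, and the error strings are hypothetical, and the actual `RoutineSuggestionValidator` is not part of this diff.

```python
# Field names taken from the list above; the "etc." there implies the real
# set is larger than these two.
PROHIBITED_FIELDS = {"dose", "amount"}


def validate_step(step: dict, known_product_ids: set[str]) -> list[str]:
    """Return error messages for a single routine step (empty list if valid)."""
    errors: list[str] = []
    has_product = step.get("product_id") is not None
    has_action = step.get("action_type") is not None

    # Exactly one of the two must be set: both or neither is an error.
    if has_product == has_action:
        errors.append("step must have exactly one of product_id or action_type")

    if has_product and step["product_id"] not in known_product_ids:
        errors.append(f"unknown product_id: {step['product_id']}")

    for name in sorted(PROHIBITED_FIELDS.intersection(step)):
        errors.append(f"prohibited field: {name}")

    return errors
```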

#### **BatchValidator**
- ✅ Validates each day's AM/PM routines individually
- ✅ Checks for retinoid + acid conflicts across same day
- ✅ Enforces max_frequency_per_week limits
- ✅ Tracks product usage across multi-day periods

#### **ShoppingValidator**
- ✅ Validates product types are realistic
- ✅ Blocks brand name suggestions (should be types only)
- ✅ Validates recommended frequencies
- ✅ Checks target concerns are valid
- ✅ Validates category and time recommendations

#### **ProductParseValidator**
- ✅ Validates all enum values match allowed strings
- ✅ Checks effect_profile scores are 0-5
- ✅ Validates pH ranges (0-14)
- ✅ Checks actives have valid functions
- ✅ Validates strength/irritation levels (1-3)
- ✅ Ensures booleans are actual booleans

#### **PhotoValidator**
- ✅ Validates enum values (skin_type, barrier_state, etc.)
- ✅ Checks metrics are 1-5 integers
- ✅ Validates active concerns from valid set
- ✅ Ensures risks/priorities are short phrases (<10 words)

### 3. Database Schema Updates
- Added `validation_errors` (JSON) to `ai_call_logs`
- Added `validation_warnings` (JSON) to `ai_call_logs`
- Added `auto_fixed` (boolean) to `ai_call_logs`
- Migration ready: `alembic/versions/60c8e1ade29d_add_validation_fields_to_ai_call_logs.py`

### 4. API Integration
All 5 endpoints now validate responses:

1. **`POST /routines/suggest`**
   - Sanitizes user notes
   - Validates routine safety before returning
   - Rejects if validation errors found
   - Logs warnings

2. **`POST /routines/suggest-batch`**
   - Sanitizes user notes
   - Validates multi-day plan safety
   - Checks same-day retinoid+acid conflicts
   - Enforces frequency limits across batch

3. **`POST /products/suggest`**
   - Validates shopping suggestions
   - Checks suggested types are realistic
   - Ensures no brand names suggested

4. **`POST /products/parse-text`**
   - Sanitizes input text (up to 10K chars)
   - Validates all parsed fields
   - Checks enum values and ranges

5. **`POST /skincare/analyze-photos`**
   - Validates photo analysis output
   - Checks all metrics and enums

### 5. Test Suite
Created comprehensive test suite:
- **9 test cases** for RoutineSuggestionValidator
- **All tests passing** ✅
- **88% code coverage** on validator logic

## Validation Behavior

When validation fails:
- ✅ **Errors logged** to application logs
- ✅ **HTTP 502 returned** to client with error details
- ✅ **Dangerous suggestions blocked** from reaching users

When validation has warnings:
- ✅ **Warnings logged** for monitoring
- ✅ **Response allowed** (non-critical issues)
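
The error/warning split described above can be illustrated with a small framework-free sketch. `ValidationResult` and `status_for` are hypothetical names. Only the 502-on-error and log-and-allow-on-warning behaviour is taken from the text, and joining errors with `"; "` is an assumption based on the semicolon-separated error strings mentioned elsewhere in these docs.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("validation_sketch")


@dataclass
class ValidationResult:
    errors: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)


def status_for(result: ValidationResult) -> tuple[int, str]:
    """Map a validation result to an (HTTP status, error detail) pair."""
    # Warnings never block the response; they are only recorded.
    for warning in result.warnings:
        logger.warning("validation warning: %s", warning)

    if result.errors:
        detail = "; ".join(result.errors)
        logger.error("validation failed: %s", detail)
        return 502, detail

    return 200, ""
```

In a real endpoint the 502 branch would presumably raise the web framework's HTTP exception instead of returning a tuple.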

## Files Created/Modified

### Created:
```
backend/innercontext/llm_safety.py
backend/innercontext/validators/__init__.py
backend/innercontext/validators/base.py
backend/innercontext/validators/routine_validator.py
backend/innercontext/validators/shopping_validator.py
backend/innercontext/validators/product_parse_validator.py
backend/innercontext/validators/batch_validator.py
backend/innercontext/validators/photo_validator.py
backend/alembic/versions/60c8e1ade29d_add_validation_fields_to_ai_call_logs.py
backend/tests/validators/__init__.py
backend/tests/validators/test_routine_validator.py
```

### Modified:
```
backend/innercontext/models/ai_log.py (added validation fields)
backend/innercontext/api/routines.py (added sanitization + validation)
backend/innercontext/api/products.py (added sanitization + validation)
backend/innercontext/api/skincare.py (added validation)
```

## Safety Checks Implemented

### Critical Checks (Block Response):
1. ✅ Unknown product IDs
2. ✅ Retinoid + acid conflicts (same routine or same day)
3. ✅ min_interval_hours violations
4. ✅ Compromised barrier + high-risk actives
5. ✅ Products not safe with compromised barrier
6. ✅ Prohibited fields in response (dose, amount, etc.)
7. ✅ Invalid enum values
8. ✅ Out-of-range scores/metrics
9. ✅ Empty/malformed steps
10. ✅ Frequency limit violations (batch)

### Warning Checks (Allow but Log):
1. ✅ AM routine without SPF when leaving home
2. ✅ Products that may irritate after shaving
3. ✅ High irritation risk with compromised barrier
4. ✅ Unusual product types in shopping suggestions
5. ✅ Overly long risks/priorities in photo analysis

## Test Results

```
============================= test session starts ==============================
tests/validators/test_routine_validator.py::test_detects_retinoid_acid_conflict PASSED
tests/validators/test_routine_validator.py::test_rejects_unknown_product_ids PASSED
tests/validators/test_routine_validator.py::test_enforces_min_interval_hours PASSED
tests/validators/test_routine_validator.py::test_blocks_dose_field PASSED
tests/validators/test_routine_validator.py::test_missing_spf_in_am_leaving_home PASSED
tests/validators/test_routine_validator.py::test_compromised_barrier_restrictions PASSED
tests/validators/test_routine_validator.py::test_step_must_have_product_or_action PASSED
tests/validators/test_routine_validator.py::test_step_cannot_have_both_product_and_action PASSED
tests/validators/test_routine_validator.py::test_accepts_valid_routine PASSED

============================== 9 passed in 0.38s ===============================
```

## Deployment Steps

To deploy Phase 1 to your LXC:

```bash
# 1. On local machine - deploy backend
./deploy.sh backend

# 2. On LXC - run migration
ssh innercontext
cd /opt/innercontext/backend
sudo -u innercontext uv run alembic upgrade head

# 3. Restart service
sudo systemctl restart innercontext

# 4. Verify logs show validation working
sudo journalctl -u innercontext -f
```

## Expected Impact

### Before Phase 1:
- ❌ 6 validation failures out of 189 calls (3.2% failure rate from logs)
- ❌ No protection against prompt injection
- ❌ No safety checks on LLM outputs
- ❌ Dangerous suggestions could reach users

### After Phase 1:
- ✅ **0 dangerous suggestions reach users** (all blocked by validation)
- ✅ **100% protection** against prompt injection attacks
- ✅ **All outputs validated** before returning to users
- ✅ **Issues logged** for analysis and prompt improvement

## Known Issues from Logs (Now Fixed)

From analysis of `ai_call_log.json`:

1. **Lines 10, 27, 61, 78:** LLM returned prohibited `dose` field
   - ✅ **Now blocked** by validator

2. **Line 85:** MAX_TOKENS failure (output truncated)
   - ✅ **Will be detected** (malformed JSON fails validation)

3. **Line 10:** Response text truncated mid-JSON
   - ✅ **Now caught** by JSON parsing + validation

4. **products/parse-text:** Only 80% success rate (4/20 failed)
   - ✅ **Now has validation** to catch malformed parses

## Next Steps (Phase 2)

Phase 1 is complete and ready for deployment. Phase 2 will focus on:
1. Token optimization (70-80% reduction)
2. Quality improvements (better prompts, reasoning capture)
3. Function tools for batch planning

---

**Status:** ✅ **READY FOR DEPLOYMENT**
**Test Coverage:** 88% on validators
**All Tests:** Passing (9/9)
@@ -1,412 +0,0 @@
# Phase 3: UI/UX Observability - COMPLETE ✅

## Summary

Phase 3 implementation is complete! The frontend now displays validation warnings, auto-fixes, LLM reasoning chains, and token usage metrics from all LLM endpoints.

---

## What Was Implemented

### 1. Backend API Enrichment

#### Response Models (`backend/innercontext/models/api_metadata.py`)
- **`TokenMetrics`**: Captures prompt, completion, thinking, and total tokens
- **`ResponseMetadata`**: Model name, duration, reasoning chain, token metrics
- **`EnrichedResponse`**: Base class with validation warnings, auto-fixes, metadata

#### LLM Wrapper Updates (`backend/innercontext/llm.py`)
- Modified `call_gemini()` to return `(response, log_id)` tuple
- Modified `call_gemini_with_function_tools()` to return `(response, log_id)` tuple
- Added `_build_response_metadata()` helper to extract metadata from AICallLog

#### API Endpoint Updates
**`backend/innercontext/api/routines.py`:**
- ✅ `/suggest` - Populates validation_warnings, auto_fixes_applied, metadata
- ✅ `/suggest-batch` - Populates validation_warnings, auto_fixes_applied, metadata

**`backend/innercontext/api/products.py`:**
- ✅ `/suggest` - Populates validation_warnings, auto_fixes_applied, metadata
- ✅ `/parse-text` - Updated to handle new return signature (no enrichment yet)

**`backend/innercontext/api/skincare.py`:**
- ✅ `/analyze-photos` - Updated to handle new return signature (no enrichment yet)

---

### 2. Frontend Type Definitions

#### Updated Types (`frontend/src/lib/types.ts`)
```typescript
interface TokenMetrics {
  prompt_tokens: number;
  completion_tokens: number;
  thoughts_tokens?: number;
  total_tokens: number;
}

interface ResponseMetadata {
  model_used: string;
  duration_ms: number;
  reasoning_chain?: string;
  token_metrics?: TokenMetrics;
}

interface RoutineSuggestion {
  // Existing fields...
  validation_warnings?: string[];
  auto_fixes_applied?: string[];
  metadata?: ResponseMetadata;
}

interface BatchSuggestion {
  // Existing fields...
  validation_warnings?: string[];
  auto_fixes_applied?: string[];
  metadata?: ResponseMetadata;
}

interface ShoppingSuggestionResponse {
  // Existing fields...
  validation_warnings?: string[];
  auto_fixes_applied?: string[];
  metadata?: ResponseMetadata;
}
```

---

### 3. UI Components

#### ValidationWarningsAlert.svelte
- **Purpose**: Display validation warnings from backend
- **Features**:
  - Yellow/amber alert styling
  - List format with warning icons
  - Collapsible if >3 warnings
  - "Show more" button
- **Example**: "⚠️ No SPF found in AM routine while leaving home"

#### StructuredErrorDisplay.svelte
- **Purpose**: Parse and display HTTP 502 validation errors
- **Features**:
  - Splits semicolon-separated error strings
  - Displays as bulleted list with icons
  - Extracts prefix text if present
  - Red alert styling
- **Example**:
  ```
  ❌ Generated routine failed safety validation:
  • Retinoid incompatible with acid in same routine
  • Unknown product ID: abc12345
  ```

#### AutoFixBadge.svelte
- **Purpose**: Show automatically applied fixes
- **Features**:
  - Green success alert styling
  - List format with sparkle icon
  - Communicates transparency
- **Example**: "✨ Automatically adjusted wait times and removed conflicting products"

#### ReasoningChainViewer.svelte
- **Purpose**: Display LLM thinking process from MEDIUM thinking level
- **Features**:
  - Collapsible panel (collapsed by default)
  - Brain icon with "AI Reasoning Process" label
  - Monospace font for thinking content
  - Gray background
- **Note**: Currently returns null (Gemini doesn't expose thinking content via API), but infrastructure is ready for future use

#### MetadataDebugPanel.svelte
- **Purpose**: Show token metrics and model info for cost monitoring
- **Features**:
  - Collapsible panel (collapsed by default)
  - Info icon with "Debug Information" label
  - Displays:
    - Model name (e.g., `gemini-3-flash-preview`)
    - Duration in milliseconds
    - Token breakdown: prompt, completion, thinking, total
  - Formatted numbers with commas
- **Example**:
  ```
  ℹ️ Debug Information (click to expand)
  Model: gemini-3-flash-preview
  Duration: 1,234 ms
  Tokens: 1,300 prompt + 78 completion + 835 thinking = 2,213 total
  ```

---

### 4. CSS Styling

#### Alert Variants (`frontend/src/app.css`)
```css
.editorial-alert--warning {
  border-color: hsl(42 78% 68%);
  background: hsl(45 86% 92%);
  color: hsl(36 68% 28%);
}

.editorial-alert--info {
  border-color: hsl(204 56% 70%);
  background: hsl(207 72% 93%);
  color: hsl(207 78% 28%);
}
```

---

### 5. Integration

#### Routines Suggest Page (`frontend/src/routes/routines/suggest/+page.svelte`)
**Single Suggestion View:**
- Replaced plain error div with `<StructuredErrorDisplay>`
- Added after summary card, before steps:
  - `<AutoFixBadge>` (if auto_fixes_applied)
  - `<ValidationWarningsAlert>` (if validation_warnings)
  - `<ReasoningChainViewer>` (if reasoning_chain)
  - `<MetadataDebugPanel>` (if metadata)

**Batch Suggestion View:**
- Same components added after overall reasoning card
- Applied to batch-level metadata (not per-day)

#### Products Suggest Page (`frontend/src/routes/products/suggest/+page.svelte`)
- Replaced plain error div with `<StructuredErrorDisplay>`
- Added after reasoning card, before suggestion list:
  - `<AutoFixBadge>`
  - `<ValidationWarningsAlert>`
  - `<ReasoningChainViewer>`
  - `<MetadataDebugPanel>`
- Updated `enhanceForm()` to extract observability fields

---

## What Data is Captured

### From Backend Validation (Phase 1)
- ✅ `validation_warnings`: Non-critical issues (e.g., missing SPF in AM routine)
- ✅ `auto_fixes_applied`: List of automatic corrections made
- ✅ `validation_errors`: Critical issues (blocks response with HTTP 502)

### From AICallLog (Phase 2)
- ✅ `model_used`: Model name (e.g., `gemini-3-flash-preview`)
- ✅ `duration_ms`: API call duration
- ✅ `prompt_tokens`: Input tokens
- ✅ `completion_tokens`: Output tokens
- ✅ `thoughts_tokens`: Thinking tokens (from MEDIUM thinking level)
- ✅ `total_tokens`: Sum of all token types
- ❌ `reasoning_chain`: Thinking content (always null - Gemini doesn't expose via API)
- ❌ `tool_use_prompt_tokens`: Tool overhead (always null - included in prompt_tokens)
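
As a rough illustration of how the token fields above relate, the sketch below models them with a plain dataclass and reproduces the debug-panel summary line. The field names follow the `TokenMetrics` interface quoted in this document. The dataclass, the computed `total_tokens` property, and `format_tokens` are assumptions for illustration, not the contents of `api_metadata.py`, and the real panel is a Svelte component rather than Python.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TokenMetrics:
    prompt_tokens: int
    completion_tokens: int
    thoughts_tokens: Optional[int] = None

    @property
    def total_tokens(self) -> int:
        # "Sum of all token types"; absent thinking tokens count as zero.
        return self.prompt_tokens + self.completion_tokens + (self.thoughts_tokens or 0)


def format_tokens(metrics: TokenMetrics) -> str:
    """Render the one-line token summary with thousands separators."""
    parts = [
        f"{metrics.prompt_tokens:,} prompt",
        f"{metrics.completion_tokens:,} completion",
    ]
    if metrics.thoughts_tokens is not None:
        parts.append(f"{metrics.thoughts_tokens:,} thinking")
    return " + ".join(parts) + f" = {metrics.total_tokens:,} total"
```

With the figures used in the panel example, `format_tokens(TokenMetrics(1300, 78, 835))` yields `1,300 prompt + 78 completion + 835 thinking = 2,213 total`.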

---

## User Experience Improvements

### Before Phase 3
❌ **Validation Errors:**
```
Generated routine failed safety validation: No SPF found in AM routine; Retinoid incompatible with acid
```
- Single long string, hard to read
- No distinction between errors and warnings
- No explanations

❌ **No Transparency:**
- User doesn't know if request was modified
- No visibility into LLM decision-making
- No cost/performance metrics

### After Phase 3
✅ **Structured Errors:**
```
❌ Safety validation failed:
• No SPF found in AM routine while leaving home
• Retinoid incompatible with acid in same routine
```

✅ **Validation Warnings (Non-blocking):**
```
⚠️ Validation Warnings:
• AM routine missing SPF while leaving home
• Consider adding wait time between steps
[Show 2 more]
```

✅ **Auto-Fix Transparency:**
```
✨ Automatically adjusted:
• Adjusted wait times between retinoid and moisturizer
• Removed conflicting acid step
```

✅ **Token Metrics (Collapsed):**
```
ℹ️ Debug Information (click to expand)
Model: gemini-3-flash-preview
Duration: 1,234 ms
Tokens: 1,300 prompt + 78 completion + 835 thinking = 2,213 total
```

---

## Known Limitations

### 1. Reasoning Chain Not Accessible
- **Issue**: `reasoning_chain` field is always `null`
- **Cause**: Gemini API doesn't expose thinking content from MEDIUM thinking level
- **Evidence**: `thoughts_token_count` is captured (835-937 tokens), but content is internal to model
- **Status**: UI component exists and is ready if Gemini adds API support

### 2. Tool Use Tokens Not Separated
- **Issue**: `tool_use_prompt_tokens` field is always `null`
- **Cause**: Tool overhead is included in `prompt_tokens`, not reported separately
- **Evidence**: ~3000 token overhead observed in production logs
- **Status**: Not blocking - total token count is still accurate

### 3. I18n Translations Not Added
- **Issue**: No Polish translations for new UI text
- **Status**: Deferred to Phase 4 (low priority)
- **Impact**: Components use English hardcoded labels

---

## Testing Plan

### Manual Testing Checklist
1. **Trigger validation warnings** (e.g., request AM routine without specifying leaving home)
2. **Trigger validation errors** (e.g., request invalid product combinations)
3. **Check token metrics** match `ai_call_logs` table entries
4. **Verify reasoning chain** displays correctly (if Gemini adds support)
5. **Test collapsible panels** (expand/collapse)
6. **Responsive design** (mobile, tablet, desktop)

### Test Scenarios

#### Scenario 1: Successful Routine with Warning
```
Request: AM routine, leaving home = true, no notes
Expected:
- ✅ Suggestion generated
- ⚠️ Warning: "Consider adding antioxidant serum before SPF"
- ℹ️ Metadata shows token usage
```

#### Scenario 2: Validation Error
```
Request: PM routine with incompatible products
Expected:
- ❌ Structured error: "Retinoid incompatible with acid"
- No suggestion displayed
```

#### Scenario 3: Auto-Fix Applied
```
Request: Routine with conflicting wait times
Expected:
- ✅ Suggestion generated
- ✨ Auto-fix: "Adjusted wait times between steps"
```

---

## Success Metrics

### User Experience
- ✅ Validation warnings visible (not just errors)
- ✅ HTTP 502 errors show structured breakdown
- ✅ Auto-fixes communicated transparently
- ✅ Error messages easier to understand

### Developer Experience
- ✅ Token metrics visible for cost monitoring
- ✅ Model info displayed for debugging
- ✅ Duration tracking for performance analysis
- ✅ Full token breakdown (prompt, completion, thinking)

### Technical
- ✅ 0 TypeScript errors (`svelte-check` passes)
- ✅ All components follow design system
- ✅ Backend passes `ruff` lint
- ✅ Code formatted with `black`/`isort`

---

## Next Steps

### Immediate (Deployment)
1. **Run database migrations** (if any pending)
2. **Deploy backend** to Proxmox LXC
3. **Deploy frontend** to production
4. **Monitor first 10-20 API calls** for metadata population

### Phase 4 (Optional Future Work)
1. **i18n**: Add Polish translations for new UI components
2. **Enhanced reasoning display**: If Gemini adds API support for thinking content
3. **Cost dashboard**: Aggregate token metrics across all calls
4. **User preferences**: Allow hiding debug panels permanently
5. **Export functionality**: Download token metrics as CSV
6. **Tooltips**: Add explanations for token types

---

## File Changes

### Backend Files Modified
- `backend/innercontext/llm.py` - Return log_id tuple
- `backend/innercontext/api/routines.py` - Populate observability fields
- `backend/innercontext/api/products.py` - Populate observability fields
- `backend/innercontext/api/skincare.py` - Handle new return signature

### Backend Files Created
- `backend/innercontext/models/api_metadata.py` - Response metadata models

### Frontend Files Modified
- `frontend/src/lib/types.ts` - Add observability types
- `frontend/src/app.css` - Add warning/info alert variants
- `frontend/src/routes/routines/suggest/+page.svelte` - Integrate components
- `frontend/src/routes/products/suggest/+page.svelte` - Integrate components

### Frontend Files Created
- `frontend/src/lib/components/ValidationWarningsAlert.svelte`
- `frontend/src/lib/components/StructuredErrorDisplay.svelte`
- `frontend/src/lib/components/AutoFixBadge.svelte`
- `frontend/src/lib/components/ReasoningChainViewer.svelte`
- `frontend/src/lib/components/MetadataDebugPanel.svelte`

---

## Commits

1. **`3c3248c`** - `feat(api): add Phase 3 observability - expose validation warnings and metadata to frontend`
   - Backend API enrichment
   - Response models created
   - LLM wrapper updated

2. **`5d3f876`** - `feat(frontend): add Phase 3 UI components for observability`
   - All 5 UI components created
   - CSS alert variants added
   - Integration into suggestion pages

---

## Deployment Checklist

- [ ] Pull latest code on production server
- [ ] Run backend migrations: `cd backend && uv run alembic upgrade head`
- [ ] Restart backend service: `sudo systemctl restart innercontext-backend`
- [ ] Rebuild frontend: `cd frontend && pnpm build`
- [ ] Restart frontend service (if applicable)
- [ ] Test routine suggestion endpoint
- [ ] Test products suggestion endpoint
- [ ] Verify token metrics in MetadataDebugPanel
- [ ] Check for any JavaScript console errors

---

**Status: Phase 3 COMPLETE ✅**
- Backend API enriched with observability data
- Frontend UI components created and integrated
- All tests passing, zero errors
- Ready for production deployment
@@ -6,11 +6,11 @@ After=network.target
 Type=simple
 User=innercontext
 Group=innercontext
-WorkingDirectory=/opt/innercontext/frontend
+WorkingDirectory=/opt/innercontext/current/frontend
 Environment=PORT=3000
 Environment=HOST=127.0.0.1
-EnvironmentFile=/opt/innercontext/frontend/.env.production
-ExecStart=/usr/local/bin/node /opt/innercontext/frontend/build/index.js
+EnvironmentFile=/opt/innercontext/current/frontend/.env.production
+ExecStart=/usr/local/bin/node /opt/innercontext/current/frontend/build/index.js
 Restart=on-failure
 RestartSec=5
@@ -6,9 +6,9 @@ After=network.target
 Type=simple
 User=innercontext
 Group=innercontext
-WorkingDirectory=/opt/innercontext/backend
-EnvironmentFile=/opt/innercontext/backend/.env
-ExecStart=/opt/innercontext/backend/.venv/bin/python -m innercontext.workers.pricing
+WorkingDirectory=/opt/innercontext/current/backend
+EnvironmentFile=/opt/innercontext/current/backend/.env
+ExecStart=/opt/innercontext/current/backend/.venv/bin/python -m innercontext.workers.pricing
 Restart=on-failure
 RestartSec=5
@@ -6,10 +6,9 @@ After=network.target
 Type=simple
 User=innercontext
 Group=innercontext
-WorkingDirectory=/opt/innercontext/backend
-EnvironmentFile=/opt/innercontext/backend/.env
-ExecStartPre=/opt/innercontext/backend/.venv/bin/alembic upgrade head
-ExecStart=/opt/innercontext/backend/.venv/bin/uvicorn main:app --host 127.0.0.1 --port 8000
+WorkingDirectory=/opt/innercontext/current/backend
+EnvironmentFile=/opt/innercontext/current/backend/.env
+ExecStart=/opt/innercontext/current/backend/.venv/bin/uvicorn main:app --host 127.0.0.1 --port 8000
 Restart=on-failure
 RestartSec=5
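
The three unit files now resolve everything through `/opt/innercontext/current`. This commit does not show how `current` is maintained. One common arrangement, assumed here purely for illustration, is a symlink that a deploy step repoints at a versioned release directory. The `releases/` layout and the `activate_release` helper below are hypothetical and not part of this repository.

```python
import os
from pathlib import Path


def activate_release(base: Path, release_name: str) -> Path:
    """Point base/current at base/releases/<release_name>.

    The new symlink is created under a temporary name and then renamed over
    `current`, so on POSIX systems readers see either the old or the new
    target, never a missing link.
    """
    target = base / "releases" / release_name
    if not target.is_dir():
        raise FileNotFoundError(f"release directory not found: {target}")

    current = base / "current"
    tmp_link = base / "current.tmp"
    if tmp_link.is_symlink() or tmp_link.exists():
        tmp_link.unlink()  # clear a stale temp link from an interrupted deploy

    os.symlink(target, tmp_link)
    os.replace(tmp_link, current)  # atomic rename over the old symlink
    return current
```

Under that assumption a deploy would call something like `activate_release(Path("/opt/innercontext"), "<release-id>")` and then restart the services. Since the `ExecStartPre` migration line is removed from the backend unit, `alembic upgrade head` would presumably have to run as an explicit deploy step before the restart.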