121 lines
7.2 KiB
Markdown
121 lines
7.2 KiB
Markdown
# Backend
|
|
|
|
Python 3.12 FastAPI backend. Entry: `main.py` → `db.py` → routers in `innercontext/api/`.
|
|
|
|
## Structure
|
|
|
|
```
|
|
backend/
|
|
├── main.py # FastAPI app, lifespan, CORS, router registration
|
|
├── db.py # Engine, get_session() dependency, create_db_and_tables()
|
|
├── innercontext/
|
|
│ ├── api/ # 7 FastAPI routers
|
|
│ │ ├── products.py # CRUD + LLM parse/suggest + pricing
|
|
│ │ ├── routines.py # CRUD + LLM suggest/batch + grooming schedule
|
|
│ │ ├── health.py # Medications + lab results CRUD
|
|
│ │ ├── skincare.py # Snapshots + photo analysis (Gemini vision)
|
|
│ │ ├── inventory.py # Product inventory CRUD
|
|
│ │ ├── profile.py # User profile upsert
|
|
│ │ ├── ai_logs.py # LLM call log viewer
|
|
│ │ ├── llm_context.py # Context builders (Tier 1 summary / Tier 2 detailed)
|
|
│ │ ├── product_llm_tools.py # Gemini function tool declarations + handlers
|
|
│ │ └── utils.py # get_or_404()
|
|
│ ├── models/ # SQLModel tables + Pydantic types
|
|
│ │ ├── product.py # Product, ProductInventory, _ev(), to_llm_context()
|
|
│ │ ├── health.py # MedicationEntry, MedicationUsage, LabResult
|
|
│ │ ├── routine.py # Routine, RoutineStep, GroomingSchedule
|
|
│ │ ├── skincare.py # SkinConditionSnapshot (JSON: concerns, risks, priorities)
|
|
│ │ ├── profile.py # UserProfile
|
|
│ │ ├── pricing.py # PricingRecalcJob (async tier calculation)
|
|
│ │ ├── ai_log.py # AICallLog (token metrics, reasoning chain, tool trace)
|
|
│ │ ├── enums.py # 20+ enums (ProductCategory, SkinType, SkinConcern, etc.)
|
|
│ │ ├── base.py # utc_now() helper
|
|
│ │ ├── domain.py # Domain enum (HEALTH, SKINCARE)
|
|
│ │ └── api_metadata.py # ResponseMetadata, TokenMetrics (Phase 3 observability)
|
|
│ ├── validators/ # LLM response validators (non-blocking)
|
|
│ │ ├── base.py # ValidationResult, BaseValidator abstract
|
|
│ │ ├── routine_validator.py # Retinoid+acid, intervals, SPF, barrier safety
|
|
│ │ ├── batch_validator.py # Multi-day frequency + same-day conflicts
|
|
│ │ ├── product_parse_validator.py # Enum checks, effect_profile, pH, actives
|
|
│ │ ├── shopping_validator.py # Category, priority, text quality
|
|
│ │ └── photo_validator.py # Skin metrics 1-5, enum checks
|
|
│ ├── services/
|
|
│ │ ├── fx.py # NBP API currency conversion (24h cache, thread-safe)
|
|
│ │ └── pricing_jobs.py # Job queue (enqueue, claim with FOR UPDATE SKIP LOCKED)
|
|
│ ├── workers/
|
|
│ │ └── pricing.py # Background pricing worker
|
|
│ ├── llm.py # Gemini client, call_gemini(), call_gemini_with_function_tools()
|
|
│ └── llm_safety.py # Prompt injection prevention (sanitize + isolate)
|
|
├── tests/ # 171 pytest tests (SQLite in-memory, isolated per test)
|
|
├── alembic/ # 17 migration versions
|
|
└── pyproject.toml # uv, pytest (--cov), ruff, black, isort (black profile)
|
|
```
|
|
|
|
## Model Conventions
|
|
|
|
- **JSON columns**: `sa_column=Column(JSON, nullable=...)` on `table=True` models only. DB-agnostic (not JSONB).
|
|
- **`updated_at`**: MUST use `sa_column=Column(DateTime(timezone=True), onupdate=utc_now)`. Never plain `Field(default_factory=...)`.
|
|
- **`_ev()` helper** (`product.py`): Normalises enum values — returns `.value` if enum, `str()` otherwise. Required when fields may be raw dicts (from DB) or Python enum instances.
|
|
- **`model_validator(mode="after")`**: Does NOT fire on `table=True` instances (SQLModel 0.0.37 + Pydantic v2 bug). Product validators are documentation only.
|
|
- **`to_llm_context()`**: Returns token-optimised dict. Filters `effect_profile` to nonzero values (≥2). Handles both dict and object forms.
|
|
- **`short_id`**: 8-char UUID prefix on Product. Used in LLM context for token efficiency → expanded to full UUID before DB queries.
|
|
|
|
## LLM Integration
|
|
|
|
Two config patterns in `llm.py`:
|
|
- `get_extraction_config()`: temp=0.0, MINIMAL thinking. Deterministic data parsing.
|
|
- `get_creative_config()`: temp=0.4, MEDIUM thinking. Suggestions with reasoning chain capture.
|
|
|
|
Three context tiers in `llm_context.py`:
|
|
- **Tier 1** (~15-20 tokens/product): One-line summary with status, key effects, safety flags.
|
|
- **Tier 2** (~40-50 tokens/product): Top 5 actives + effect_profile + context_rules. Used in function tool responses.
|
|
- **Tier 3**: Full `to_llm_context()`. Token-heavy, rarely used.
|
|
|
|
Function calling (`product_llm_tools.py`):
|
|
- `call_gemini_with_function_tools()`: Iterative tool loop, max 2 roundtrips.
|
|
- `PRODUCT_DETAILS_FUNCTION_DECLARATION`: Gemini function schema for product lookups.
|
|
- INCI lists excluded from LLM context by default (~12-15KB per product saved).
|
|
|
|
All calls logged to `AICallLog` with: token metrics, reasoning_chain, tool_trace, validation results.
|
|
|
|
Safety (`llm_safety.py`):
|
|
- `sanitize_user_input()`: Removes prompt injection patterns, limits length.
|
|
- `isolate_user_input()`: Wraps with boundary markers, treats as data not instructions.
|
|
|
|
## Validators
|
|
|
|
All extend `BaseValidator`, return `ValidationResult` (errors, warnings, auto_fixes). Validation is **non-blocking** — errors returned in response body as `validation_warnings`, not as HTTP 4xx.
|
|
|
|
Key safety checks in `routine_validator.py`:
|
|
- No retinoid + acid in same routine (detects via `effect_profile.retinoid_strength > 0` and exfoliant functions in actives)
|
|
- Respect `min_interval_hours` and `max_frequency_per_week`
|
|
- Check `context_rules`: `safe_after_shaving`, `safe_with_compromised_barrier`
|
|
- AM routines need SPF when `leaving_home=True`
|
|
- No high `irritation_risk` or `barrier_disruption_risk` with compromised barrier
|
|
|
|
## API Patterns
|
|
|
|
- All routers use `Depends(get_session)` for DB access.
|
|
- `get_or_404(session, Model, id)` for 404 responses.
|
|
- LLM endpoints: build context → call Gemini → validate → log to AICallLog → return data + `ResponseMetadata`.
|
|
- Product pricing: enqueues `PricingRecalcJob` on create/update. Worker claims with `FOR UPDATE SKIP LOCKED`.
|
|
- Gemini API rejects int-enum in `response_schema` — `AIActiveIngredient` overrides fields with plain `int` + `# type: ignore[assignment]`.
|
|
|
|
## Environment
|
|
|
|
| Variable | Default | Required |
|
|
|----------|---------|----------|
|
|
| `DATABASE_URL` | `postgresql+psycopg://localhost/innercontext` | Yes |
|
|
| `GEMINI_API_KEY` | — | For LLM features |
|
|
| `GEMINI_MODEL` | `gemini-3-flash-preview` | No |
|
|
|
|
`main.py` calls `load_dotenv()` before importing `db.py` to ensure `DATABASE_URL` is read from `.env`.
|
|
|
|
## Testing
|
|
|
|
- `cd backend && uv run pytest`
|
|
- SQLite in-memory per test — fully isolated, no cleanup needed.
|
|
- `conftest.py` fixtures: `session`, `client` (TestClient with patched engine), `product_data`, `created_product`, `medication_data`, `created_medication`, `created_routine`.
|
|
- LLM calls mocked with `unittest.mock.patch` and `monkeypatch`.
|
|
- Coverage: `--cov=innercontext --cov-report=term-missing`.
|
|
- No test markers or parametrize — explicit test functions only.
|