Commit graph

15 commits

Author SHA1 Message Date
3bf19d8acb feat(api): add enhanced token metrics logging for Gemini API
Add comprehensive token breakdown logging to understand MAX_TOKENS behavior
and verify documentation claims about thinking tokens.

New Fields Added to ai_call_logs:
- thoughts_tokens: Thinking tokens (thoughtsTokenCount) - documented as
  separate from output budget
- tool_use_prompt_tokens: Tool use overhead (toolUsePromptTokenCount)
- cached_content_tokens: Cached content tokens (cachedContentTokenCount)

Purpose:
Investigate token counting mystery from production logs where:
  prompt_tokens: 4400
  completion_tokens: 589
  total_tokens: 8489  ← Should be 4400 + 589 = 4989, missing 3500!

According to Gemini API docs (Polish translation):
  totalTokenCount = promptTokenCount + candidatesTokenCount
  (thoughts NOT included in total)

But production logs show 3500 token gap. New logging will reveal:
1. Are thinking tokens actually separate from max_output_tokens limit?
2. Where did the 3500 missing tokens go?
3. Does MEDIUM thinking level consume output budget despite docs?
4. Are tool use tokens included in total but not shown separately?

Changes:
- Added 3 new integer columns to ai_call_logs (nullable)
- Enhanced llm.py to capture all usage_metadata fields
- Used getattr() for safe access (fields may not exist in all responses)
- Database migration: 7e6f73d1cc95

This will provide complete data for future LLM calls to diagnose:
- MAX_TOKENS failures
- Token budget behavior
- Thinking token costs
- Tool use overhead
2026-03-06 12:17:13 +01:00
5bb2ea5f08 feat(api): add short_id column for consistent LLM UUID handling
Resolves validation failures where LLM fabricated full UUIDs from 8-char
prefixes shown in context, causing 'unknown product_id' errors.

Root Cause Analysis:
- Context showed 8-char short IDs: '77cbf37c' (Phase 2 optimization)
- Function tool returned full UUIDs: '77cbf37c-3830-4927-...'
- LLM saw BOTH formats, got confused, invented UUIDs for final response
- Validators rejected fabricated UUIDs as unknown products

Solution: Consistent 8-char short_id across LLM boundary:
1. Database: New short_id column (8 chars, unique, indexed)
2. Context: Shows short_id (was: str(id)[:8])
3. Function tools: Return short_id (was: full UUID)
4. Translation layer: Expands short_id → UUID before validation
5. Database: Stores full UUIDs (no schema change for existing data)

Changes:
- Added products.short_id column with unique constraint + index
- Migration populates from UUID prefix, handles collisions via regeneration
- Product model auto-generates short_id for new products
- LLM contexts use product.short_id consistently
- Function tools return product.short_id
- Added _expand_product_id() translation layer in routines.py
- Integrated expansion in suggest_routine() and suggest_batch()
- Validators work with full UUIDs (no changes needed)

Benefits:
 LLM never sees full UUIDs, no format confusion
 Maintains Phase 2 token optimization (~85% reduction)
 O(1) indexed short_id lookups vs O(n) pattern matching
 Unique constraint prevents collisions at DB level
 Clean separation: 8-char for LLM, 36-char for application

From production error:
  Step 1: unknown product_id 77cbf37c-3830-4927-9669-07447206689d
  (LLM invented the last 28 characters)

Now resolved: LLM uses '77cbf37c' consistently, translation layer
expands to real UUID before validation.
2026-03-06 10:58:26 +01:00
c87d1b8581 feat(api): implement Phase 2 token optimization and reasoning capture
- Add tiered context system (summary/detailed/full) to reduce token usage by 70-80%
- Replace old _build_products_context with build_products_context_summary_list (Tier 1: ~15 tokens/product vs 150)
- Optimize function tool responses: exclude INCI list by default (saves ~15KB/product)
- Reduce actives from 24 to top 5 in function tools
- Add reasoning_chain field to AICallLog model for observability
- Implement _extract_thinking_content to capture LLM reasoning (MEDIUM thinking level)
- Strengthen prompt enforcement for prohibited fields (dose, amount, quantity)
- Update get_creative_config to use MEDIUM thinking level instead of LOW

Token Savings:
- Routine suggestions: 9,613 → ~1,300 tokens (-86%)
- Batch planning: 12,580 → ~1,800 tokens (-86%)
- Function tool responses: ~15KB → ~2KB per product (-87%)

Breaks discovered in log analysis (ai_call_log.json):
- Lines 10, 27, 61, 78: LLM returned prohibited dose field
- Line 85: MAX_TOKENS failure (output truncated)

Phase 2 complete. Next: two-phase batch planning with safety verification.
2026-03-06 10:26:29 +01:00
2a9391ad32 feat(api): add LLM response validation and input sanitization
Implement Phase 1: Safety & Validation for all LLM-based suggestion engines.

- Add input sanitization module to prevent prompt injection attacks
- Implement 5 comprehensive validators (routine, batch, shopping, product parse, photo)
- Add 10+ critical safety checks (retinoid+acid conflicts, barrier compatibility, etc.)
- Integrate validation into all 5 API endpoints (routines, products, skincare)
- Add validation fields to ai_call_logs table (validation_errors, validation_warnings, auto_fixed)
- Create database migration for validation fields
- Add comprehensive test suite (9/9 tests passing, 88% coverage on validators)

Safety improvements:
- Blocks retinoid + acid conflicts in same routine/day
- Rejects unknown product IDs
- Enforces min_interval_hours rules
- Protects compromised skin barriers
- Prevents prohibited fields (dose, amount) in responses
- Validates all enum values and score ranges

All validation failures are logged and responses are rejected with HTTP 502.
2026-03-06 10:16:47 +01:00
b99b9ed68e feat(profile): add profile settings and LLM user context 2026-03-05 15:57:21 +01:00
013492ec2b refactor(products): remove usage notes and contraindications fields 2026-03-05 10:11:24 +01:00
30315fdf56 fix(backend): create pricetier enum before migration 2026-03-04 23:16:55 +01:00
0e439b4ca7 feat(backend): move product pricing to async persisted jobs 2026-03-04 22:46:16 +01:00
83ba4cc5c0 feat(products): compute price tiers from objective price/use 2026-03-04 14:47:18 +01:00
c5ea38880c refactor(products): remove obsolete interaction fields across stack 2026-03-04 12:42:12 +01:00
cfd2485b7e feat(api): add INCI tool-calling with normalized tool traces
Enable on-demand INCI retrieval in /routines/suggest through Gemini function calling so detailed ingredient data is fetched only when needed. Persist and normalize tool_trace data in AI logs to make function-call behavior directly inspectable via /ai-logs endpoints.
2026-03-04 11:35:19 +01:00
092fd87606 fix(llm): log and handle non-STOP finish_reason from Gemini
When Gemini stops generation early (e.g. due to safety filters or
thinking-model quirks), finish_reason != STOP but no exception is raised,
causing the caller to receive truncated JSON and a confusing 502 "invalid
JSON" error. Now:
- finish_reason is extracted from candidates[0] and stored in ai_call_logs
- any non-STOP finish_reason raises HTTP 502 with a clear message
- Alembic migration adds the finish_reason column to ai_call_logs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 20:08:22 +01:00
75ef1bca56 feat(routines): add minoxidil beard/mustache option to routine suggestions
- Add include_minoxidil_beard flag to SuggestRoutineRequest and SuggestBatchRequest
- Detect minoxidil products by scanning name, brand, INCI and actives; pass them
  to the LLM even though they are medications
- Inject CELE UŻYTKOWNIKA context block into prompts when flag is enabled
- Add _build_objectives_context() returning empty string when flag is off
- Add call_gemini() helper that centralises Gemini API calls and logs every
  request/response to a new ai_call_logs table (AICallLog model + /ai-logs router)
- Nginx: raise client_max_body_size to 16 MB for photo uploads

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 19:46:07 +01:00
5cb44b2c65 fix(backend): apply black/isort formatting and fix ruff noqa annotations
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 17:27:07 +01:00
3c1dcbeb06 feat(backend): add Alembic migrations
- Add alembic 1.14 to dependencies (uv sync → 1.18.4 installed)
- Configure alembic/env.py: loads DATABASE_URL from env, imports all
  SQLModel models so metadata is fully populated for autogenerate
- Generate initial migration (c2d626a2b36c) covering all 9 tables:
  products, product_inventory, medication_entries, medication_usages,
  lab_results, routines, routine_steps, grooming_schedule,
  skin_condition_snapshots — with all indexes and constraints
- Add ExecStartPre to innercontext.service: runs alembic upgrade head
  before uvicorn starts (idempotent, safe on every restart)
- Update DEPLOYMENT.md: add migration step to backend setup and update
  flow; document alembic stamp head for existing installations

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 20:14:57 +01:00