Add comprehensive token breakdown logging to understand MAX_TOKENS behavior
and verify documentation claims about thinking tokens.
New Fields Added to ai_call_logs:
- thoughts_tokens: Thinking tokens (thoughtsTokenCount) - documented as
separate from output budget
- tool_use_prompt_tokens: Tool use overhead (toolUsePromptTokenCount)
- cached_content_tokens: Cached content tokens (cachedContentTokenCount)
Purpose:
Investigate a token-counting discrepancy seen in production logs, where:
prompt_tokens: 4400
completion_tokens: 589
total_tokens: 8489 ← Should be 4400 + 589 = 4989, missing 3500!
According to Gemini API docs (Polish translation):
totalTokenCount = promptTokenCount + candidatesTokenCount
(thoughts NOT included in total)
But production logs show a 3,500-token gap. The new logging will reveal:
1. Are thinking tokens actually separate from max_output_tokens limit?
2. Where did the 3500 missing tokens go?
3. Does MEDIUM thinking level consume output budget despite docs?
4. Are tool use tokens included in total but not shown separately?
Changes:
- Added 3 new integer columns to ai_call_logs (nullable)
- Enhanced llm.py to capture all usage_metadata fields
- Used getattr() for safe access (fields may not exist in all responses)
- Database migration: 7e6f73d1cc95
This will provide complete data for future LLM calls to diagnose:
- MAX_TOKENS failures
- Token budget behavior
- Thinking token costs
- Tool use overhead
Resolves validation failures where the LLM fabricated full UUIDs from the
8-char prefixes shown in context, causing 'unknown product_id' errors.
Root Cause Analysis:
- Context showed 8-char short IDs: '77cbf37c' (Phase 2 optimization)
- Function tool returned full UUIDs: '77cbf37c-3830-4927-...'
- LLM saw BOTH formats, got confused, invented UUIDs for final response
- Validators rejected fabricated UUIDs as unknown products
Solution: Consistent 8-char short_id across LLM boundary:
1. Database: New short_id column (8 chars, unique, indexed)
2. Context: Shows short_id (was: str(id)[:8])
3. Function tools: Return short_id (was: full UUID)
4. Translation layer: Expands short_id → UUID before validation
5. Database: Stores full UUIDs (no schema change for existing data)
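A minimal sketch of the translation layer (the real _expand_product_id lives in routines.py and queries the indexed short_id column; the lookup dict here is a stand-in for that query):

```python
def expand_product_id(candidate: str, short_to_full: dict[str, str]) -> str:
    """Map an 8-char short_id back to the full UUID before validation.

    Full 36-char UUIDs pass through unchanged, so validators keep working
    on real UUIDs. An unknown short_id is returned as-is and fails
    validation downstream with a clear 'unknown product_id' error.
    """
    if len(candidate) == 36:  # already a full UUID, nothing to expand
        return candidate
    return short_to_full.get(candidate, candidate)
```

Because expansion happens before validation, the validators never see the 8-char form and need no changes.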
Changes:
- Added products.short_id column with unique constraint + index
- Migration populates from UUID prefix, handles collisions via regeneration
- Product model auto-generates short_id for new products
- LLM contexts use product.short_id consistently
- Function tools return product.short_id
- Added _expand_product_id() translation layer in routines.py
- Integrated expansion in suggest_routine() and suggest_batch()
- Validators work with full UUIDs (no changes needed)
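The migration's collision handling can be sketched like this (the helper name is hypothetical; the real code writes into products.short_id under the unique constraint):

```python
import uuid

def assign_short_ids(full_ids: list[str]) -> dict[str, str]:
    """Derive an 8-char short_id from each UUID's prefix; on a prefix
    collision, regenerate from a fresh random UUID until unique,
    mirroring the migration's collision handling."""
    taken: set[str] = set()
    mapping: dict[str, str] = {}
    for fid in full_ids:
        short = fid[:8]
        while short in taken:  # prefix collision: regenerate until unique
            short = uuid.uuid4().hex[:8]
        taken.add(short)
        mapping[fid] = short
    return mapping
```

The DB-level unique constraint is the real safety net; the regeneration loop just avoids tripping it during the backfill.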
Benefits:
✅ LLM never sees full UUIDs, no format confusion
✅ Maintains Phase 2 token optimization (~85% reduction)
✅ O(1) indexed short_id lookups vs O(n) pattern matching
✅ Unique constraint prevents collisions at DB level
✅ Clean separation: 8-char for LLM, 36-char for application
From production error:
Step 1: unknown product_id 77cbf37c-3830-4927-9669-07447206689d
(LLM invented the last 28 characters)
Now resolved: LLM uses '77cbf37c' consistently, translation layer
expands to real UUID before validation.
- Add tiered context system (summary/detailed/full) to reduce token usage by 70-80%
- Replace old _build_products_context with build_products_context_summary_list (Tier 1: ~15 tokens/product vs 150)
- Optimize function tool responses: exclude INCI list by default (saves ~15KB/product)
- Reduce actives from 24 to top 5 in function tools
- Add reasoning_chain field to AICallLog model for observability
- Implement _extract_thinking_content to capture LLM reasoning (MEDIUM thinking level)
- Strengthen prompt enforcement for prohibited fields (dose, amount, quantity)
- Update get_creative_config to use MEDIUM thinking level instead of LOW
Token Savings:
- Routine suggestions: 9,613 → ~1,300 tokens (-86%)
- Batch planning: 12,580 → ~1,800 tokens (-86%)
- Function tool responses: ~15KB → ~2KB per product (-87%)
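A sketch of the Tier 1 summary builder; the field names are assumptions about the product model, and the real build_products_context_summary_list may format differently:

```python
def build_products_context_summary_list(products: list[dict]) -> str:
    """Tier 1 context: one compact line per product (~15 tokens) instead
    of the full ~150-token record. INCI is omitted entirely and fetched
    on demand via function tools; actives are capped at the top 5."""
    lines = []
    for p in products:
        actives = ", ".join(p.get("actives", [])[:5])  # top 5 actives only
        lines.append(f"{p['short_id']} | {p['name']} | {actives}")
    return "\n".join(lines)
```

Tiers 2 and 3 (detailed/full) would add fields back on the same per-product line structure when the LLM requests them.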
Failures discovered in log analysis (ai_call_log.json):
- Lines 10, 27, 61, 78: LLM returned prohibited dose field
- Line 85: MAX_TOKENS failure (output truncated)
Phase 2 complete. Next: two-phase batch planning with safety verification.
Enable on-demand INCI retrieval in /routines/suggest through Gemini function calling so detailed ingredient data is fetched only when needed. Persist and normalize tool_trace data in AI logs to make function-call behavior directly inspectable via /ai-logs endpoints.
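Normalizing tool_trace entries before persisting them might look like the sketch below. The record shape is an assumption; the actual fields come from the Gemini function-call parts:

```python
import json

def normalize_tool_trace(raw_calls: list[dict]) -> str:
    """Flatten heterogeneous function-call records into one stable JSON
    shape so the /ai-logs endpoints can render them uniformly."""
    normalized = [
        {
            "name": call.get("name", "unknown"),
            "args": call.get("args") or {},
            # store response size, not the payload, to keep log rows small
            "response_bytes": len(json.dumps(call.get("response") or {})),
        }
        for call in raw_calls
    ]
    return json.dumps(normalized, ensure_ascii=False)
```

Storing a normalized trace rather than raw SDK objects keeps ai_call_logs stable across SDK versions.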
When Gemini stops generation early (e.g. due to safety filters or
thinking-model quirks), finish_reason != STOP but no exception is raised,
so the caller receives truncated JSON and a confusing 502 "invalid
JSON" error. Now:
- finish_reason is extracted from candidates[0] and stored in ai_call_logs
- any non-STOP finish_reason raises HTTP 502 with a clear message
- Alembic migration adds the finish_reason column to ai_call_logs
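The guard can be sketched as follows. A plain exception stands in for the HTTP 502 raised in the real endpoint; finish_reason is the string extracted from candidates[0] as described above:

```python
class UpstreamGenerationError(RuntimeError):
    """Stand-in for the HTTP 502 the real endpoint raises."""

def check_finish_reason(finish_reason: str) -> str:
    """Raise instead of silently returning truncated output when the
    model stopped for any reason other than STOP (safety filters,
    MAX_TOKENS, thinking-model quirks, ...)."""
    if finish_reason != "STOP":
        raise UpstreamGenerationError(
            f"Gemini stopped early: finish_reason={finish_reason}"
        )
    return finish_reason
```

Logging finish_reason even on the success path is what makes the stored ai_call_logs rows useful for the MAX_TOKENS investigation above.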
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add include_minoxidil_beard flag to SuggestRoutineRequest and SuggestBatchRequest
- Detect minoxidil products by scanning name, brand, INCI and actives; pass them
to the LLM even though they are medications
- Inject a CELE UŻYTKOWNIKA ("user objectives") context block into prompts when the flag is enabled
- Add _build_objectives_context() returning empty string when flag is off
- Add call_gemini() helper that centralises Gemini API calls and logs every
request/response to a new ai_call_logs table (AICallLog model + /ai-logs router)
- Nginx: raise client_max_body_size to 16 MB for photo uploads
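The detection heuristic might be sketched like this (field names are assumptions about the product model; the real scan runs over Product rows):

```python
def is_minoxidil_product(product: dict) -> bool:
    """Scan name, brand, INCI and actives for 'minoxidil'; matching
    products are passed to the LLM even though they are medications."""
    haystacks = [
        product.get("name", ""),
        product.get("brand", ""),
        *product.get("inci", []),
        *product.get("actives", []),
    ]
    return any("minoxidil" in h.lower() for h in haystacks)
```

A simple substring match suffices here because "minoxidil" is the INCI name as well as the common name.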
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add alembic 1.14 to dependencies (uv sync → 1.18.4 installed)
- Configure alembic/env.py: loads DATABASE_URL from env, imports all
SQLModel models so metadata is fully populated for autogenerate
- Generate initial migration (c2d626a2b36c) covering all 9 tables:
products, product_inventory, medication_entries, medication_usages,
lab_results, routines, routine_steps, grooming_schedule,
skin_condition_snapshots — with all indexes and constraints
- Add ExecStartPre to innercontext.service: runs alembic upgrade head
before uvicorn starts (idempotent, safe on every restart)
- Update DEPLOYMENT.md: add migration step to backend setup and update
flow; document alembic stamp head for existing installations
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>