Commit graph

83 commits

c8fa80be99 fix(api): rename 'metadata' to 'response_metadata' to avoid Pydantic conflict
The field name 'metadata' conflicts with the reserved 'metadata' ClassVar
on SQLModel models (SQLAlchemy's declarative MetaData).
Renamed to 'response_metadata' throughout:
- Backend: RoutineSuggestion, BatchSuggestion, ShoppingSuggestionResponse
- Frontend: TypeScript types and component usages

This fixes the AttributeError when setting metadata on SQLModel instances.
2026-03-06 16:16:35 +01:00
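A minimal sketch of the rename described above, on one of the affected models (the field type is illustrative; the real models carry full suggestion payloads):

```python
from typing import Any, Optional

from sqlmodel import SQLModel


class RoutineSuggestion(SQLModel):
    # Before: metadata: Optional[dict[str, Any]] = None
    # 'metadata' shadows the MetaData ClassVar that SQLModel models inherit
    # from SQLAlchemy's declarative base, so the field is renamed:
    response_metadata: Optional[dict[str, Any]] = None
```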
3c3248c2ea feat(api): add Phase 3 observability - expose validation warnings and metadata to frontend
Backend changes:
- Create ResponseMetadata and TokenMetrics models for API responses
- Modify call_gemini() and call_gemini_with_function_tools() to return (response, log_id) tuple
- Add _build_response_metadata() helper to extract metadata from AICallLog
- Update routines API (/suggest, /suggest-batch) to populate validation_warnings, auto_fixes_applied, and metadata
- Update products API (/suggest) to populate observability fields
- Update skincare API to handle new return signature

Frontend changes:
- Add TypeScript types: TokenMetrics, ResponseMetadata
- Update RoutineSuggestion, BatchSuggestion, ShoppingSuggestionResponse with observability fields

Next: Create UI components to display warnings, reasoning chains, and token metrics
2026-03-06 15:50:28 +01:00
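A plausible shape for the observability fields added here, with names taken from the message and types assumed:

```python
from typing import Optional

from pydantic import BaseModel


class TokenMetrics(BaseModel):
    prompt_tokens: Optional[int] = None
    completion_tokens: Optional[int] = None
    total_tokens: Optional[int] = None


class ResponseMetadata(BaseModel):
    log_id: Optional[str] = None          # id of the matching AICallLog row
    model: Optional[str] = None
    tokens: Optional[TokenMetrics] = None


class RoutineSuggestion(BaseModel):
    # ...suggestion payload...
    validation_warnings: list[str] = []
    auto_fixes_applied: list[str] = []
    response_metadata: Optional[ResponseMetadata] = None
```

Returning (response, log_id) from call_gemini() is what lets _build_response_metadata() find the AICallLog row for the call that just completed.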
3bf19d8acb feat(api): add enhanced token metrics logging for Gemini API
Add comprehensive token breakdown logging to understand MAX_TOKENS behavior
and verify documentation claims about thinking tokens.

New Fields Added to ai_call_logs:
- thoughts_tokens: Thinking tokens (thoughtsTokenCount) - documented as
  separate from output budget
- tool_use_prompt_tokens: Tool use overhead (toolUsePromptTokenCount)
- cached_content_tokens: Cached content tokens (cachedContentTokenCount)

Purpose:
Investigate token counting mystery from production logs where:
  prompt_tokens: 4400
  completion_tokens: 589
  total_tokens: 8489  ← Should be 4400 + 589 = 4989, missing 3500!

According to Gemini API docs (Polish translation):
  totalTokenCount = promptTokenCount + candidatesTokenCount
  (thoughts NOT included in total)

But production logs show a 3500-token gap. The new logging will reveal:
1. Are thinking tokens actually separate from max_output_tokens limit?
2. Where did the 3500 missing tokens go?
3. Does MEDIUM thinking level consume output budget despite docs?
4. Are tool use tokens included in total but not shown separately?

Changes:
- Added 3 new integer columns to ai_call_logs (nullable)
- Enhanced llm.py to capture all usage_metadata fields
- Used getattr() for safe access (fields may not exist in all responses)
- Database migration: 7e6f73d1cc95

This will provide complete data for future LLM calls to diagnose:
- MAX_TOKENS failures
- Token budget behavior
- Thinking token costs
- Tool use overhead
2026-03-06 12:17:13 +01:00
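A sketch of the capture logic, assuming the google-genai usage_metadata attribute names; each read is guarded with getattr() since not every response carries every field:

```python
def extract_token_metrics(usage) -> dict:
    """Map response.usage_metadata onto the ai_call_logs columns."""
    return {
        "prompt_tokens": getattr(usage, "prompt_token_count", None),
        "completion_tokens": getattr(usage, "candidates_token_count", None),
        "total_tokens": getattr(usage, "total_token_count", None),
        # The three new nullable columns for the missing-token investigation:
        "thoughts_tokens": getattr(usage, "thoughts_token_count", None),
        "tool_use_prompt_tokens": getattr(usage, "tool_use_prompt_token_count", None),
        "cached_content_tokens": getattr(usage, "cached_content_token_count", None),
    }
```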
5bb2ea5f08 feat(api): add short_id column for consistent LLM UUID handling
Resolves validation failures where the LLM fabricated full UUIDs from the
8-char prefixes shown in context, causing 'unknown product_id' errors.

Root Cause Analysis:
- Context showed 8-char short IDs: '77cbf37c' (Phase 2 optimization)
- Function tool returned full UUIDs: '77cbf37c-3830-4927-...'
- LLM saw BOTH formats, got confused, invented UUIDs for final response
- Validators rejected fabricated UUIDs as unknown products

Solution: Consistent 8-char short_id across LLM boundary:
1. Database: New short_id column (8 chars, unique, indexed)
2. Context: Shows short_id (was: str(id)[:8])
3. Function tools: Return short_id (was: full UUID)
4. Translation layer: Expands short_id → UUID before validation
5. Database: Stores full UUIDs (no schema change for existing data)

Changes:
- Added products.short_id column with unique constraint + index
- Migration populates from UUID prefix, handles collisions via regeneration
- Product model auto-generates short_id for new products
- LLM contexts use product.short_id consistently
- Function tools return product.short_id
- Added _expand_product_id() translation layer in routines.py
- Integrated expansion in suggest_routine() and suggest_batch()
- Validators work with full UUIDs (no changes needed)

Benefits:
- LLM never sees full UUIDs, no format confusion
- Maintains Phase 2 token optimization (~85% reduction)
- O(1) indexed short_id lookups vs O(n) pattern matching
- Unique constraint prevents collisions at DB level
- Clean separation: 8-char for LLM, 36-char for application

From production error:
  Step 1: unknown product_id 77cbf37c-3830-4927-9669-07447206689d
  (LLM invented the last 28 characters)

Now resolved: LLM uses '77cbf37c' consistently, translation layer
expands to real UUID before validation.
2026-03-06 10:58:26 +01:00
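A sketch of the two sides of the boundary; apart from _expand_product_id (named in the message), the helper names and lookup structure are hypothetical:

```python
import secrets
from uuid import UUID


def generate_short_id(product_id: UUID, taken: set[str]) -> str:
    # Default to the UUID's first 8 hex chars (what the context used to
    # show); regenerate on collision, as the migration does.
    candidate = product_id.hex[:8]
    while candidate in taken:
        candidate = secrets.token_hex(4)  # another 8-char hex id
    return candidate


def _expand_product_id(short_id: str, by_short_id: dict[str, UUID]) -> UUID:
    # Translation layer: the LLM only ever sees 8-char ids; expand to the
    # stored full UUID before the validators run.
    return by_short_id[short_id]
```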
710b53e471 fix(api): resolve function tool UUID mismatch and MAX_TOKENS errors
Two critical bugs identified from production logs:

1. UUID Mismatch Bug (0 products returned from function tools):
   - Context shows 8-char short IDs: '63278801'
   - Function handler expected full UUIDs: '63278801-xxxx-...'
   - LLM requested short IDs, handler couldn't match → 0 products

   Fix: Index products by BOTH full UUID and short ID (first 8 chars)
   in build_product_details_tool_handler. Accept either format.
   Added deduplication to handle duplicate requests.
   Maintains Phase 2 token optimization (no context changes).

2. MAX_TOKENS Error (response truncation):
   - max_output_tokens=4096 includes thinking tokens (~3500)
   - Only ~500 tokens left for JSON response
   - MEDIUM thinking level (Phase 2) consumed budget

   Fix: Increase max_output_tokens from 4096 → 8192 across all
   creative endpoints (routines/suggest, routines/suggest-batch,
   products/suggest). Updated default in get_creative_config().

   Gives headroom: ~3500 thinking + ~4500 response = ~8000 total

From production logs (ai_call_logs):
- Log 71699654: Success but response_text null (function call only)
- Log 2db37c0f: MAX_TOKENS failure, tool returned 0 products

Both issues now resolved.
2026-03-06 10:44:12 +01:00
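A sketch of the dual-keyed index fix for bug 1 (the product shape and helper names are assumed):

```python
def build_product_index(products) -> dict:
    """Index every product under both its full UUID string and its 8-char
    prefix so the tool handler matches whichever format the LLM sends."""
    index = {}
    for product in products:
        full_id = str(product.id)
        index[full_id] = product
        index[full_id[:8]] = product
    return index


def lookup_products(requested_ids, index) -> list:
    """Accept either id format and deduplicate repeated requests."""
    seen, found = set(), []
    for pid in requested_ids:
        product = index.get(pid)
        if product is not None and product.id not in seen:
            seen.add(product.id)
            found.append(product)
    return found
```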
3ef1f249b6 fix(api): handle dict vs object in build_product_context_summary
When products are loaded from PostgreSQL, JSON columns (effect_profile,
context_rules) are deserialized as plain dicts, not Pydantic models.

The build_product_context_summary function was accessing these fields
as object attributes (.safe_with_compromised_barrier) which caused:
AttributeError: 'dict' object has no attribute 'safe_with_compromised_barrier'

Fix: Add isinstance(dict) checks like build_product_context_detailed already does.
Handle both dict (from DB) and object (from Pydantic) cases.

Traceback from production:
  File "llm_context.py", line 91, in build_product_context_summary
    if product.context_rules.safe_with_compromised_barrier:
  AttributeError: 'dict' object has no attribute...
2026-03-06 10:34:51 +01:00
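The fix in miniature, using the field name from the traceback:

```python
def _barrier_safe(context_rules) -> bool:
    # context_rules is a plain dict when loaded from a PostgreSQL JSON
    # column, and a Pydantic model when built in memory; support both.
    if isinstance(context_rules, dict):
        return bool(context_rules.get("safe_with_compromised_barrier"))
    return bool(context_rules.safe_with_compromised_barrier)
```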
594dae474b refactor(api): remove redundant field ban language from prompts
Schema enforcement already prevents LLM from returning fields outside
the defined response_schema (_SingleStepOut, _BatchStepOut). Explicit
field bans (dose, amount, quantity, application_amount) are redundant
and add unnecessary token cost.

Removed:
- 'KRYTYCZNE' ('CRITICAL') warning about schema violations
- 'ZABRONIONE POLA' ('FORBIDDEN FIELDS') explicit field list
- 4-line 'ABSOLUTNIE ZABRONIONE' ('ABSOLUTELY FORBIDDEN') dose prohibition section

Token savings: ~80 tokens per prompt (system instruction overhead)

Trust the schema - cleaner prompts, same enforcement.
2026-03-06 10:30:36 +01:00
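The enforcement being trusted here is Gemini's structured output: with a response_schema set, the model cannot emit fields the schema does not define. A minimal sketch assuming the google-genai SDK, with illustrative schema fields:

```python
from google import genai
from google.genai import types
from pydantic import BaseModel


class _SingleStepOut(BaseModel):
    product_id: str
    reason: str
    # No dose/amount/quantity fields exist here, so constrained decoding
    # cannot produce them; the prompt-level ban was redundant.


client = genai.Client()
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Suggest the next routine step.",
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=_SingleStepOut,
    ),
)
```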
c87d1b8581 feat(api): implement Phase 2 token optimization and reasoning capture
- Add tiered context system (summary/detailed/full) to reduce token usage by 70-80%
- Replace old _build_products_context with build_products_context_summary_list (Tier 1: ~15 tokens/product vs 150)
- Optimize function tool responses: exclude INCI list by default (saves ~15KB/product)
- Reduce actives from 24 to top 5 in function tools
- Add reasoning_chain field to AICallLog model for observability
- Implement _extract_thinking_content to capture LLM reasoning (MEDIUM thinking level)
- Strengthen prompt enforcement for prohibited fields (dose, amount, quantity)
- Update get_creative_config to use MEDIUM thinking level instead of LOW

Token Savings:
- Routine suggestions: 9,613 → ~1,300 tokens (-86%)
- Batch planning: 12,580 → ~1,800 tokens (-86%)
- Function tool responses: ~15KB → ~2KB per product (-87%)

Issues discovered in log analysis (ai_call_log.json):
- Lines 10, 27, 61, 78: LLM returned prohibited dose field
- Line 85: MAX_TOKENS failure (output truncated)

Phase 2 complete. Next: two-phase batch planning with safety verification.
2026-03-06 10:26:29 +01:00
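A sketch of the Tier 1 summary line (field names are assumed; the detailed tiers stay behind function tools):

```python
def build_product_context_summary(product) -> str:
    # Tier 1: ~15 tokens per product instead of ~150; just enough for the
    # LLM to decide which products to request details for via tools.
    top_actives = ", ".join(a.name for a in (product.actives or [])[:5])
    return f"- {product.short_id} {product.name} ({product.category}): {top_actives}"


def build_products_context_summary_list(products) -> str:
    return "\n".join(build_product_context_summary(p) for p in products)
```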
e239f61408 style: apply black and isort formatting
Run formatting tools on Phase 1 changes:
- black (code formatter)
- isort (import sorter)
- ruff (linter)

All linting checks pass.
2026-03-06 10:17:00 +01:00
2a9391ad32 feat(api): add LLM response validation and input sanitization
Implement Phase 1: Safety & Validation for all LLM-based suggestion engines.

- Add input sanitization module to prevent prompt injection attacks
- Implement 5 comprehensive validators (routine, batch, shopping, product parse, photo)
- Add 10+ critical safety checks (retinoid+acid conflicts, barrier compatibility, etc.)
- Integrate validation into all 5 API endpoints (routines, products, skincare)
- Add validation fields to ai_call_logs table (validation_errors, validation_warnings, auto_fixed)
- Create database migration for validation fields
- Add comprehensive test suite (9/9 tests passing, 88% coverage on validators)

Safety improvements:
- Blocks retinoid + acid conflicts in same routine/day
- Rejects unknown product IDs
- Enforces min_interval_hours rules
- Protects compromised skin barriers
- Prevents prohibited fields (dose, amount) in responses
- Validates all enum values and score ranges

All validation failures are logged and responses are rejected with HTTP 502.
2026-03-06 10:16:47 +01:00
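A sketch of one of the critical checks; the ingredient sets and product fields are illustrative, and the real validators also cover intervals, barrier state, enums, and score ranges:

```python
RETINOIDS = {"retinol", "tretinoin", "adapalene"}
EXFOLIATING_ACIDS = {"glycolic acid", "salicylic acid", "lactic acid"}


def check_retinoid_acid_conflict(steps, products_by_id) -> list[str]:
    """Return validation errors; any error causes the response to be
    logged and rejected with HTTP 502."""
    errors, actives = [], set()
    for step in steps:
        product = products_by_id.get(step.product_id)
        if product is None:
            errors.append(f"unknown product_id: {step.product_id}")
            continue
        actives.update(name.lower() for name in product.active_names)
    if actives & RETINOIDS and actives & EXFOLIATING_ACIDS:
        errors.append("retinoid and exfoliating acid in the same routine")
    return errors
```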
e3ed0dd3a3 fix(routines): enforce min_interval_hours and minoxidil flag server-side
Two bugs in /routines/suggest where the LLM could override hard constraints:

1. Products with min_interval_hours (e.g. retinol at 72h) were passed to
   the LLM even if used too recently. The LLM reasoned away the constraint
   in at least one observed case. Fix: added _filter_products_by_interval()
   which removes ineligible products before the prompt is built, so they
   don't appear in AVAILABLE PRODUCTS at all.

2. Minoxidil was included in the available products list regardless of the
   include_minoxidil_beard flag. Only the objectives context was gated,
   leaving the product visible to the LLM which would include it based on
   recent usage history. Fix: added include_minoxidil param to
   _get_available_products() and threaded it through suggest_routine and
   suggest_batch.

Also refactored _build_products_context() to accept a pre-supplied
products list instead of calling _get_available_products() internally,
ensuring the tool handler and context text always use the same filtered set.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 23:36:15 +01:00
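A sketch of the interval filter from fix 1 (the function name is from the message; the last-used lookup is assumed):

```python
from datetime import datetime, timedelta, timezone


def _filter_products_by_interval(products, last_used_at: dict):
    """Remove products whose min_interval_hours has not elapsed, so they
    never appear in AVAILABLE PRODUCTS and the LLM cannot reason the
    constraint away."""
    now = datetime.now(timezone.utc)
    eligible = []
    for product in products:
        interval = getattr(product, "min_interval_hours", None)
        last = last_used_at.get(product.id)
        if interval and last is not None and now - last < timedelta(hours=interval):
            continue  # e.g. retinol at 72h used two days ago: excluded
        eligible.append(product)
    return eligible
```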
7a66a7911d feat(backend): include last-used date in product LLM details 2026-03-05 16:48:49 +01:00
40d26514a1 refactor(backend): consolidate product LLM function tools 2026-03-05 16:44:03 +01:00
b99b9ed68e feat(profile): add profile settings and LLM user context 2026-03-05 15:57:21 +01:00
db3d9514d5 fix(routines): remove dose from AI routine suggestions 2026-03-05 14:19:18 +01:00
0a4ccefe28 feat(repo): expand lab results workflows across backend and frontend 2026-03-05 12:46:49 +01:00
013492ec2b refactor(products): remove usage notes and contraindications fields 2026-03-05 10:11:24 +01:00
30315fdf56 fix(backend): create pricetier enum before migration 2026-03-04 23:16:55 +01:00
0e439b4ca7 feat(backend): move product pricing to async persisted jobs 2026-03-04 22:46:16 +01:00
c869f88db2 chore(backend): enable psycopg binary dependency 2026-03-04 21:46:38 +01:00
83ba4cc5c0 feat(products): compute price tiers from objective price/use 2026-03-04 14:47:18 +01:00
c5ea38880c refactor(products): remove obsolete interaction fields across stack 2026-03-04 12:42:12 +01:00
1d8a8eafb8 refactor(api): remove MCP server integration and docs references 2026-03-04 12:28:30 +01:00
5dd8242985 fix(routines): simplify inventory preference in system prompt 2026-03-04 12:18:07 +01:00
b58fcb1440 feat(api): add tool-calling flow for shopping suggestions
Keep /products/suggest lean by exposing product UUIDs and fetching INCI, safety rules, actives, and usage notes on demand through Gemini function tools. Add conservative fallback behavior for tool roundtrip limits and expand helper tests to cover tool wiring and payload handlers.
2026-03-04 12:05:33 +01:00
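A sketch of how such a function tool might be declared, assuming the google-genai SDK (the tool name and parameters are illustrative):

```python
from google.genai import types

get_product_details = types.FunctionDeclaration(
    name="get_product_details",
    description=(
        "Fetch INCI list, safety rules, actives, and usage notes for the "
        "given product ids."
    ),
    parameters=types.Schema(
        type=types.Type.OBJECT,
        properties={
            "product_ids": types.Schema(
                type=types.Type.ARRAY,
                items=types.Schema(type=types.Type.STRING),
            ),
        },
        required=["product_ids"],
    ),
)

tools = [types.Tool(function_declarations=[get_product_details])]
```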
558708653c feat(api): expand routines tool-calling to reduce prompt load
Keep the /routines/suggest base context lean by sending only active names and fetching detailed safety, actives, usage notes, and INCI on demand. Add a conservative fallback when tool roundtrip limits are hit to preserve safe outputs instead of failing the request.
2026-03-04 11:52:07 +01:00
cfd2485b7e feat(api): add INCI tool-calling with normalized tool traces
Enable on-demand INCI retrieval in /routines/suggest through Gemini function calling so detailed ingredient data is fetched only when needed. Persist and normalize tool_trace data in AI logs to make function-call behavior directly inspectable via /ai-logs endpoints.
2026-03-04 11:35:19 +01:00
c0eeb0425d fix(routines): include product safety and usage signals in prompts
Expose leave-on behavior, contraindications, safety alerts, and compact usage notes in AVAILABLE PRODUCTS so Gemini can make safer routine decisions with real-world product constraints.
2026-03-04 02:42:16 +01:00
9bbc34ffd2 test(api): fix ruff issues in routine tests 2026-03-04 02:23:19 +01:00
472a3034a0 feat(routines): refine therapeutic and travel-mode prompt rules 2026-03-04 02:22:39 +01:00
820d58ea37 feat(routines): enrich single AI suggestions with concise context 2026-03-04 01:22:57 +01:00
88f3642387 test(api): add tests for ai suggestion endpoints and helpers 2026-03-03 22:06:33 +01:00
5ad9b66a21 build(backend): add pytest-cov configuration and report generation 2026-03-03 22:06:24 +01:00
ba1f10d99f refactor(llm): optimize Gemini config profiles for extraction and creativity
Introduces `get_extraction_config` and `get_creative_config` to standardize Gemini API calls.

* Defines explicit config profiles with appropriate `temperature` and `thinking_level` for Gemini 3 Flash.
* Extraction tasks use minimal thinking and temp=0.0 to reduce latency and token usage.
* Creative tasks use low thinking, temp=0.4, and top_p=0.8 to balance naturalness and safety.
* Applies these helpers across products, routines, and skincare endpoints.
* Also updates default model to `gemini-3-flash-preview`.
2026-03-03 21:24:23 +01:00
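A sketch of the two profiles, assuming the google-genai SDK; the exact SDK field used for thinking_level is not shown in the log, so it is left as a comment:

```python
from google.genai import types


def get_extraction_config(**overrides) -> types.GenerateContentConfig:
    # Deterministic parsing: temp=0.0, minimal thinking to cut latency and tokens.
    return types.GenerateContentConfig(temperature=0.0, **overrides)


def get_creative_config(**overrides) -> types.GenerateContentConfig:
    # Suggestions: temp=0.4 and top_p=0.8 balance naturalness and safety;
    # a low thinking level is also set here (field omitted in this sketch).
    return types.GenerateContentConfig(temperature=0.4, top_p=0.8, **overrides)
```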
78df7322a9 refactor(api): remove shopping assistant logic from mcp_server 2026-03-03 20:51:42 +01:00
0e7a39836f refactor(routines): use category and short UUID for recent history representation 2026-03-03 20:29:36 +01:00
28fb74b9bf refactor(routines): translate prompt input keys to English to reduce the language-switch penalty 2026-03-03 20:24:56 +01:00
9574c91be1 refactor(routines): remove hardcoded grooming actions from system prompt 2026-03-03 20:22:59 +01:00
4627ec70bf refactor(routines): remove examples from inventory management rule to avoid bias 2026-03-03 20:07:13 +01:00
30ebc093bf feat(routines): adjust inventory management prompt to allow opening better suited sealed products 2026-03-03 20:06:38 +01:00
877051cfaf feat(routines): add actives and recent usage tracking to product context 2026-03-03 20:01:39 +01:00
1109d9f397 fix(products): only suggest when real need exists 2026-03-03 19:51:49 +01:00
609995732b feat(routines): add minimize_products option for batch suggestions 2026-03-03 00:50:49 +01:00
40f9a353bb feat(products): add shopping suggestions feature
- Add POST /api/products/suggest endpoint that analyzes skin condition
  and inventory to suggest product types (e.g., 'Salicylic Acid 2% Masque')
- Add MCP tool get_shopping_suggestions() for MCP clients
- Add 'Suggest' button to Products page in frontend
- Add /products/suggest page with suggestion cards
- Include product type, key ingredients, target concerns, why_needed,
  recommended_time, and frequency in suggestions
- Fix stock logic: sealed products now count as available inventory
- Add legend to clarify ✓ (in stock) vs ✗ (not in stock) markers
2026-03-02 22:38:08 +01:00
389ca5ffdc fix(backend): resolve ty check errors across api, mcp, and lifespan typing 2026-03-02 15:51:14 +01:00
c85ca355df refactor(routines): streamline suggest prompt — merge inventory context, add leaving_home SPF hint
- Remove _build_inventory_context; fold pao_months into the DOSTĘPNE PRODUKTY (AVAILABLE PRODUCTS) entries
- Remove the duplicate "Otwarte równolegle" ("open in parallel") section from the prompt
- Rename OSTATNIE RUTYNY (7 dni) → OSTATNIE RUTYNY (RECENT ROUTINES, dropping the "7 days" qualifier)
- Add _build_day_context and SuggestRoutineRequest.leaving_home (optional bool)
- System prompt: replace the unconditional PAO rule with a conditional one; add SPF factor
  selection logic based on the KONTEKST DNIA (DAY CONTEXT) leaving_home value
- Frontend: leaving_home checkbox (AM only) + i18n keys pl/en

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 23:47:54 +01:00
258b8c4330 refactor(routines): use SQLAlchemy is_(False) for product filters 2026-03-01 23:23:04 +01:00
d3bd2ff30d feat(skincare): allow HEIC/HEIF uploads in skin analysis 2026-03-01 23:23:04 +01:00
f1acfa21fc feat(routines): add inventory-aware product selection rules 2026-03-01 22:15:47 +01:00
914c6087bd fix(products): work around Gemini int-enum schema rejection in parse-text
Gemini API rejects int-valued enums (StrengthLevel) in response_schema,
raising a validation error before any request is sent. Fix by introducing
AIActiveIngredient (inherits ActiveIngredient, overrides strength_level and
irritation_potential as Optional[int]) and ProductParseLLMResponse used only
as the Gemini schema. The two-step validation converts ints back to StrengthLevel
via Pydantic coercion. Adds a test covering the numeric strength level path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 22:00:48 +01:00
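The workaround in miniature; class names are from the message, enum values are illustrative:

```python
from enum import IntEnum
from typing import Optional

from pydantic import BaseModel


class StrengthLevel(IntEnum):  # the int-valued enum Gemini's schema rejects
    LOW = 1
    MEDIUM = 2
    HIGH = 3


class ActiveIngredient(BaseModel):
    name: str
    strength_level: StrengthLevel
    irritation_potential: StrengthLevel


class AIActiveIngredient(ActiveIngredient):
    # Gemini-facing schema only: plain ints instead of the enum.
    strength_level: Optional[int] = None
    irritation_potential: Optional[int] = None


# Step two of the validation: Pydantic coerces the ints back to the enum.
step_two = ActiveIngredient.model_validate(
    {"name": "retinol", "strength_level": 2, "irritation_potential": 3}
)
```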