fix(backend): restore response_mime_type=json, raise max_output_tokens to 16384

Regular generation was hitting MAX_TOKENS at 8192. Constrained decoding with
16384 should be a viable middle ground between the truncation at 8192 and the
timeout at 65536.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
commit a3753d0929 (parent 3fbf6d7041)
Author: Piotr Oleszczyk
Date: 2026-02-28 22:26:41 +01:00

@@ -359,7 +359,8 @@ def parse_product_text(data: ProductParseRequest) -> ProductParseResponse:
         contents=f"Extract product data from this text:\n\n{data.text}",
         config=genai_types.GenerateContentConfig(
             system_instruction=_product_parse_system_prompt(),
-            max_output_tokens=8192,
+            response_mime_type="application/json",
+            max_output_tokens=16384,
             temperature=0.0,
         ),
     )
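The truncation symptom the commit message describes shows up as a MAX_TOKENS finish reason on the response candidate. A minimal sketch of detecting it, assuming a response shaped like the google-genai SDK's GenerateContentResponse (the stub objects below stand in for a real API call; `is_truncated` is a hypothetical helper, not part of the SDK):

```python
# Sketch: flag generations that stopped because they ran out of output tokens.
# The SimpleNamespace stubs stand in for real GenerateContentResponse objects.
from types import SimpleNamespace


def is_truncated(response) -> bool:
    """Return True when the first candidate stopped due to the token limit."""
    candidate = response.candidates[0]
    # The SDK exposes finish_reason as an enum; compare by name so this
    # sketch also works with the plain-string stubs used below.
    reason = getattr(candidate.finish_reason, "name", str(candidate.finish_reason))
    return reason == "MAX_TOKENS"


# Stub responses: one truncated at the output-token limit, one completed normally.
truncated = SimpleNamespace(candidates=[SimpleNamespace(finish_reason="MAX_TOKENS")])
complete = SimpleNamespace(candidates=[SimpleNamespace(finish_reason="STOP")])

print(is_truncated(truncated))  # True
print(is_truncated(complete))   # False
```

A check like this is what would distinguish the 8192-token truncations from responses that simply stopped, and could drive a retry with a larger `max_output_tokens`.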