fix(backend): drop response_mime_type=application/json to avoid constrained decoding

Constrained decoding is ~10x slower and consumes hidden tokens for constraint processing, causing truncation at ~1000 chars even with 8192 max_output_tokens. The system prompt already instructs the model to output raw minified JSON; our NaN/markdown-fence sanitisation handles edge cases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
parent 26069f5d66
commit 3fbf6d7041
1 changed file with 1 addition and 2 deletions
@@ -359,8 +359,7 @@ def parse_product_text(data: ProductParseRequest) -> ProductParseResponse:
         contents=f"Extract product data from this text:\n\n{data.text}",
         config=genai_types.GenerateContentConfig(
             system_instruction=_product_parse_system_prompt(),
-            response_mime_type="application/json",
-            max_output_tokens=65536,
+            max_output_tokens=8192,
             temperature=0.0,
         ),
     )
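
The commit message relies on a NaN/markdown-fence sanitisation step that this diff does not show. As a rough illustration only, such a step might look like the sketch below; `parse_model_json` and `_FENCE_RE` are hypothetical names and not taken from the repository, and the real sanitiser may work differently.

```python
import json
import re
from typing import Any

# Hypothetical sketch; the repository's actual sanitiser is not part of this diff.
# Matches a reply that is wrapped entirely in a multi-line markdown fence,
# with an optional language tag such as "json" after the opening backticks.
_FENCE_RE = re.compile(r"^```[a-zA-Z]*\s*\n(.*?)\n?```$", re.DOTALL)


def parse_model_json(raw: str) -> Any:
    """Parse model output that should be raw JSON but may be fenced or contain NaN."""
    text = raw.strip()
    match = _FENCE_RE.match(text)
    if match:
        # Keep only the payload between the opening and closing fences.
        text = match.group(1).strip()
    # json.loads accepts the bare tokens NaN, Infinity and -Infinity and passes
    # them to parse_constant; mapping them to None yields plain JSON-safe values.
    return json.loads(text, parse_constant=lambda _name: None)
```

Because `parse_constant` only fires for bare tokens, a string value that happens to contain the text "NaN" is left untouched, which a plain text substitution would not guarantee.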