fix(backend): drop response_mime_type=application/json to avoid constrained decoding
Constrained decoding is ~10x slower and consumes hidden tokens for constraint processing, causing truncation at ~1000 chars even with 8192 max_output_tokens. The system prompt already instructs the model to output raw minified JSON; our NaN/markdown-fence sanitisation handles edge cases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
parent 26069f5d66
commit 3fbf6d7041
1 changed file with 1 addition and 2 deletions
@@ -359,8 +359,7 @@ def parse_product_text(data: ProductParseRequest) -> ProductParseResponse:
         contents=f"Extract product data from this text:\n\n{data.text}",
         config=genai_types.GenerateContentConfig(
             system_instruction=_product_parse_system_prompt(),
-            response_mime_type="application/json",
-            max_output_tokens=65536,
+            max_output_tokens=8192,
             temperature=0.0,
         ),
     )
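
With `response_mime_type` removed, the response is plain text and the caller has to turn it into JSON itself. The commit message mentions NaN and markdown-fence sanitisation but does not show it, so the following is only a minimal sketch of what such a step might look like. The function name `sanitise_model_json` and the exact rules are assumptions for illustration, not the project's actual code.

```python
import json
import re
from typing import Any

# Matches a reply wrapped in a markdown fence, with or without a "json" tag.
# Hypothetical pattern; the real project may use different rules.
_FENCE_RE = re.compile(r"^```(?:json)?\s*(.*?)\s*```$", re.DOTALL | re.IGNORECASE)


def sanitise_model_json(raw: str) -> Any:
    """Parse model output that should be raw JSON but may not be strictly valid.

    Handles two edge cases: a surrounding markdown code fence, and the
    non-standard constants NaN / Infinity / -Infinity, which become None.
    """
    text = raw.strip()
    match = _FENCE_RE.match(text)
    if match:
        # Keep only the payload between the opening and closing fence.
        text = match.group(1)
    # json.loads calls parse_constant for NaN, Infinity and -Infinity only,
    # so string values that merely contain "NaN" are left untouched.
    return json.loads(text, parse_constant=lambda _name: None)
```

Using `parse_constant` rather than a regex substitution avoids rewriting text inside string values. A reply that is cut off mid-object still raises `json.JSONDecodeError`, so truncation stays visible to the caller.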