"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "推荐一首刘德华的歌曲"
}
],
"max_tokens": 9,
"n": 3,
"best_of": 10,
"temperature": 0,
"top_k": 50,
"top_p": 0.95,
"logprobs": true,
"ignore_eos": false,
"stream": false
两个问题:
1,qwen3的模型,请求时候怎么配,能没有think的过程,直接输出结果?
2,Best-of-N 请求的时候,为什么没有输出?
-d '{
"model": "Qwen3-4B",