DeepSeek Thinking Mode, JSON, and Tool Calling

This document describes the extended capabilities of DeepSeek models on Tencent TokenHub: thinking mode, structured JSON output, and Function Calling.

1. Thinking Mode

DeepSeek models let you control thinking mode through the `thinking` field.

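| Value | Description |
| --- | --- |
| `{"type": "enabled"}` | Enables thinking mode. Suited to complex reasoning, math, code analysis, and solution design. |
| `{"type": "disabled"}` | Disables thinking mode. Suited to summarization, classification, and simple Q&A; reduces token usage and latency. |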

Some DeepSeek calls also support configuring the reasoning depth, for example:

    {
      "thinking": {
        "type": "enabled",
        "reasoning_effort": "high"
      }
    }

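Raise the reasoning depth for complex reasoning tasks; for ordinary Q&A, summarization, and classification, disabling thinking mode is recommended.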
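The guidance above could be captured in a small helper that picks the `thinking` setting per task type. This is a minimal sketch under stated assumptions: the names `thinking_config` and `build_request`, the task labels, and the choice of `"high"` for every complex task are illustrative only, and `reasoning_effort` may not be accepted by every model.

```python
# Hypothetical helper: function names and task labels are illustrative,
# not part of the gateway API.
COMPLEX_TASKS = {"reasoning", "math", "code_analysis", "design"}


def thinking_config(task_kind: str) -> dict:
    """Return a `thinking` request field for the given task type."""
    if task_kind in COMPLEX_TASKS:
        # Assumption: the target model accepts reasoning_effort.
        return {"type": "enabled", "reasoning_effort": "high"}
    return {"type": "disabled"}


def build_request(model: str, prompt: str, task_kind: str) -> dict:
    """Assemble a chat completions request body as a plain dict."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "thinking": thinking_config(task_kind),
    }
```

Keeping the decision in one function makes it easy to adjust the policy later without touching every call site.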

Example:

    curl -X POST "$AI_GATEWAY_BASE_URL/chat/completions" \
      -H "Authorization: Bearer $API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "deepseek-v4-pro",
        "messages": [
          {
            "role": "user",
            "content": "Analyze the likely performance bottlenecks in this system architecture and propose optimizations."
          }
        ],
        "max_tokens": 2048,
        "thinking": { "type": "enabled", "reasoning_effort": "high" }
      }'

With thinking mode enabled, the `message` in the response may contain:

    {
      "role": "assistant",
      "reasoning_content": "The model's reasoning process goes here.",
      "content": "The final answer goes here."
    }

Recommendations:

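- When serving end users, show only `content` in most cases. `reasoning_content` can be used for debugging, internal analysis, or an advanced display mode.
- In ordinary multi-turn conversations, the next turn usually passes back only the previous turn's `content`; `reasoning_content` does not need to be passed back.
- If you use complex tool calling with interleaved thinking, handle historical thinking content as described on the model detail page.
- Responses may take longer with thinking mode enabled, so pair it with `stream=true` to reduce the risk of timeouts.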
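For the ordinary multi-turn case described above, one way to keep `reasoning_content` out of the next request is to trim it when appending the reply to the history. The helper names below are hypothetical, and the sketch deliberately does not cover the interleaved-thinking case, which follows the model detail page instead.

```python
def to_history_message(assistant_message: dict) -> dict:
    """Copy an assistant message for the next turn, dropping reasoning_content."""
    return {k: v for k, v in assistant_message.items() if k != "reasoning_content"}


def next_turn_messages(history: list, assistant_message: dict, user_text: str) -> list:
    """Return a new message list: prior history, trimmed reply, new user turn."""
    return history + [
        to_history_message(assistant_message),
        {"role": "user", "content": user_text},
    ]
```

Both functions return new objects rather than mutating their inputs, so the full reply (including the reasoning) stays available for logging or debugging.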

2. Streaming Thinking Output

With streaming enabled, the thinking process and the final answer may arrive in separate incremental (delta) fields.

    from openai import OpenAI

    client = OpenAI(
        api_key="<your-api-key>",
        base_url="https://cn-shanghai-alicloud-aimesh.api.clickzetta.com/gateway/v1",
    )

    stream = client.chat.completions.create(
        model="deepseek-v4-pro",
        messages=[
            {"role": "user", "content": "Analyze the strengths and challenges of vector retrieval systems."}
        ],
        max_tokens=2048,
        stream=True,
        extra_body={"thinking": {"type": "enabled"}},
    )

    answer_started = False
    for chunk in stream:
        # Some chunks carry no choices; skip them.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        # reasoning_content is not a standard SDK attribute, so read it defensively.
        reasoning_delta = getattr(delta, "reasoning_content", None)
        if reasoning_delta:
            # In production you may choose not to display the thinking process.
            print(reasoning_delta, end="", flush=True)
        if delta.content:
            if not answer_started:
                print("\n\n=== Final answer ===")
                answer_started = True
            print(delta.content, end="", flush=True)
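When the application needs the two streams as separate strings rather than printed output, the same loop can accumulate them instead. The sketch below assumes chunks expose `choices[0].delta` with optional `reasoning_content` and `content` attributes, as in the example above; `collect_stream` and `fake_chunk` are hypothetical names, and `fake_chunk` exists only to exercise the function without a network call.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Accumulate streamed deltas into (reasoning, answer) strings."""
    reasoning_parts, answer_parts = [], []
    for chunk in chunks:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        reasoning_delta = getattr(delta, "reasoning_content", None)
        if reasoning_delta:
            reasoning_parts.append(reasoning_delta)
        content_delta = getattr(delta, "content", None)
        if content_delta:
            answer_parts.append(content_delta)
    return "".join(reasoning_parts), "".join(answer_parts)


def fake_chunk(**delta_fields):
    """Build a stand-in chunk with the attribute shape used above (test data only)."""
    delta = SimpleNamespace(**delta_fields)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])
```

Because `collect_stream` only relies on attribute access, it can be passed the `stream` object from the example directly or a list of stand-in chunks in unit tests.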

3. JSON Mode

When your application needs to parse model output reliably, use `response_format` to require the model to return JSON.

    curl -X POST "$AI_GATEWAY_BASE_URL/chat/completions" \
      -H "Authorization: Bearer $API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "deepseek-v4-flash",
        "messages": [
          {
            "role": "system",
            "content": "Output valid JSON only; do not output Markdown."
          },
          {
            "role": "user",
            "content": "Extract the ticket information: the customer reports API calls returning 429, affecting production reports. Fields: priority, category, summary."
          }
        ],
        "max_tokens": 512,
        "response_format": { "type": "json_object" },
        "thinking": { "type": "disabled" }
      }'

Notes:

- In JSON mode, you must explicitly ask for JSON output in the `system` or `user` message.
- Enabling `thinking.type=enabled` together with `response_format.type=json_object` is not recommended.
- The application should still parse the JSON and validate it against a schema.
- If thinking mode is enabled, make sure the application parses only the JSON in the final `content`; do not treat `reasoning_content` as the business result.
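The parsing and validation step mentioned above might look like the following for the ticket example. This is a sketch using only the standard library; the function name `parse_ticket` and the assumption that all three fields are strings are illustrative, and a real service may prefer a full JSON Schema validator.

```python
import json

# Assumption: all three fields from the ticket example are strings.
REQUIRED_FIELDS = {"priority": str, "category": str, "summary": str}


def parse_ticket(content: str) -> dict:
    """Parse the model's final `content` and check the expected fields."""
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field} should be {expected_type.__name__}")
    return data
```

Only the final `content` string should be passed in; raising `ValueError` on any problem gives the caller one place to decide whether to retry the request or fall back.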

4. Function Calling

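Function Calling lets the model decide whether to call an external tool, for example to check the weather, look up an order, query a knowledge base, or run SQL.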

    curl -X POST "$AI_GATEWAY_BASE_URL/chat/completions" \
      -H "Authorization: Bearer $API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "deepseek-v4-flash",
        "messages": [
          { "role": "user", "content": "What is the weather like in Beijing today?" }
        ],
        "tools": [
          {
            "type": "function",
            "function": {
              "name": "get_weather",
              "description": "Look up the weather for a given city",
              "parameters": {
                "type": "object",
                "properties": {
                  "city": { "type": "string", "description": "City name" }
                },
                "required": ["city"]
              }
            }
          }
        ],
        "tool_choice": "auto",
        "thinking": { "type": "disabled" }
      }'

If the model decides that a tool call is needed, the response contains `tool_calls`:

    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "id": "call_abc123",
          "type": "function",
          "function": {
            "name": "get_weather",
            "arguments": "{\"city\":\"Beijing\"}"
          }
        }
      ]
    }

After the application executes the tool, it passes the result back to the model as a `role: "tool"` message:

    {
      "model": "deepseek-v4-flash",
      "messages": [
        { "role": "user", "content": "What is the weather like in Beijing today?" },
        {
          "role": "assistant",
          "content": null,
          "tool_calls": [
            {
              "id": "call_abc123",
              "type": "function",
              "function": {
                "name": "get_weather",
                "arguments": "{\"city\":\"Beijing\"}"
              }
            }
          ]
        },
        {
          "role": "tool",
          "tool_call_id": "call_abc123",
          "content": "{\"temperature\":22,\"weather\":\"sunny\",\"humidity\":45}"
        }
      ],
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "get_weather",
            "description": "Look up the weather for a given city",
            "parameters": {
              "type": "object",
              "properties": { "city": { "type": "string" } },
              "required": ["city"]
            }
          }
        }
      ],
      "thinking": { "type": "disabled" }
    }
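The round trip above can be sketched as a small dispatcher that turns an assistant message with `tool_calls` into the matching `role: "tool"` messages. Everything named here is an assumption for illustration: `TOOL_REGISTRY`, `run_tool_calls`, and the stand-in `get_weather`, which returns fixed data rather than calling a real weather service.

```python
import json


def get_weather(city: str) -> dict:
    """Stand-in tool: returns fixed data instead of calling a real service."""
    return {"temperature": 22, "weather": "sunny", "humidity": 45}


# Hypothetical registry mapping tool names to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each tool call and return the matching role="tool" messages."""
    tool_messages = []
    for call in assistant_message.get("tool_calls") or []:
        name = call["function"]["name"]
        try:
            # `arguments` arrives as a JSON string, not as an object.
            args = json.loads(call["function"]["arguments"])
            result = TOOL_REGISTRY[name](**args)
        except (KeyError, TypeError, json.JSONDecodeError) as exc:
            # Report the failure back to the model instead of crashing.
            result = {"error": f"{type(exc).__name__}: {exc}"}
        tool_messages.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must echo the ID the model returned
            "content": json.dumps(result, ensure_ascii=False),
        })
    return tool_messages
```

The returned messages are appended after the assistant message in the next request. Unknown tool names and malformed arguments are returned to the model as an error payload, which gives it a chance to correct the call.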

5. Tool Calling Fields

| Field | Type | Description |
| --- | --- | --- |
| `tools` | array | List of tool definitions. |
| `tools[].type` | string | Tool type, usually `function`. |
| `tools[].function.name` | string | Function name. |
| `tools[].function.description` | string | Function description; helps the model decide when to call it. |
| `tools[].function.parameters` | object | Parameter definition as JSON Schema. |
| `tool_choice` | string / object | Tool selection strategy. |
| `parallel_tool_calls` | boolean | Whether a single response may call multiple tools in parallel. |

Common values of `tool_choice`:

| Value | Description |
| --- | --- |
| `auto` | The model decides whether to call a tool. |
| `none` | Tool calls are disabled. |
| `required` | The model must call a tool. |
| `{"type":"function","function":{"name":"xxx"}}` | The model must call the specified tool. |
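A small builder can keep `tool_choice` values consistent with the declared tools, so that a typo in a forced tool name is caught before the request is sent. The helper name `make_tool_choice` is hypothetical, and the sketch assumes no tool is itself named `auto`, `none`, or `required`.

```python
ALLOWED_MODES = {"auto", "none", "required"}


def make_tool_choice(value: str, tools: list):
    """Return a tool_choice value: a mode string, or an object naming one declared tool."""
    if value in ALLOWED_MODES:
        return value
    declared = {tool["function"]["name"] for tool in tools}
    if value not in declared:
        raise ValueError(f"unknown tool or mode: {value}")
    return {"type": "function", "function": {"name": value}}
```

Passing `"get_weather"` with the weather tool declared yields the object form from the last table row, while any undeclared name raises `ValueError`.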

6. Usage Recommendations

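- For simple Q&A, classification, and summarization, disable thinking mode to reduce cost and latency.
- For complex reasoning, code analysis, and solution design, enable thinking mode and raise `max_tokens` accordingly.
- For tool calling, keep tool descriptions and parameter schemas as clear as possible so the model does not generate invalid arguments.
- When passing tool results back, `tool_call_id` must match the tool call ID returned by the model.
- Do not let the model execute production write operations directly; high-risk tools should go through permission checks and manual confirmation in the application.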
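The last recommendation could be enforced with a gate that runs before any model-requested tool call is executed. This is a sketch under assumptions: the tool name `execute_sql`, the set-based permission model, and both function names are hypothetical, and `missing_required` checks only the `required` list of a parameter schema rather than performing full JSON Schema validation.

```python
# Hypothetical list of tools that need a human in the loop.
HIGH_RISK_TOOLS = {"execute_sql"}


def authorize_tool_call(name: str, user_permissions: set, confirmed_by_human: bool = False):
    """Decide whether a model-requested tool call may run; returns (allowed, reason)."""
    if name not in user_permissions:
        return False, "caller lacks permission for this tool"
    if name in HIGH_RISK_TOOLS and not confirmed_by_human:
        return False, "high-risk tool requires manual confirmation"
    return True, "ok"


def missing_required(args: dict, parameters_schema: dict) -> list:
    """List required parameters that the model's arguments left out."""
    return [key for key in parameters_schema.get("required", []) if key not in args]
```

Running both checks before dispatch keeps the decision in the application rather than in the prompt, so a persuasive model output cannot bypass it.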
