Tencent TokenHub DeepSeek: Calling Overview

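This document explains how to call Tencent Cloud TokenHub DeepSeek models through AI Gateway. AI Gateway currently supports only the DeepSeek model family under Tencent Cloud TokenHub, so this chapter focuses on `deepseek-v4-flash` and `deepseek-v4-pro` and does not cover TokenHub's other model families.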

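TokenHub is officially compatible with two protocols: OpenAI Chat Completions and Anthropic Messages. When calling through AI Gateway, use the AI Gateway address and an AI Gateway API Key; you do not need to use the original Tencent TokenHub API Key directly.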

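1. Supported Models

AI Gateway currently supports the following DeepSeek models on Tencent TokenHub:

Model	Type	Recommended scenarios
`deepseek-v4-flash`	Text / reasoning	High-frequency Q&A, summarization, classification, code explanation, low-latency inference.
`deepseek-v4-pro`	Text / reasoning	Complex reasoning, code analysis, solution design, long-document processing.

The authoritative model names are those shown on the Model Plaza (模型广场) detail page and in the sample code.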

2. Endpoint Overview

Protocol	Endpoint	Authentication	When to use
OpenAI Chat Completions	`https://cn-shanghai-alicloud-aimesh.api.clickzetta.com/gateway/v1/chat/completions`	`Authorization: Bearer <API_KEY>`	Recommended default; covers the vast majority of DeepSeek text, reasoning, JSON, and tool-calling scenarios.
Anthropic Messages	`https://cn-shanghai-alicloud-aimesh.api.clickzetta.com/gateway/v1/messages`	`x-api-key: <API_KEY>`	For workloads that already use the Anthropic SDK, Claude Code, or the Messages protocol.

Notes:

Tencent TokenHub text models use `/gateway/v1/...`.
Do not write the path for Tencent text models as `/gateway/api/v1/chat/completions`.
Volcano Engine models use `/gateway/api/v3/...`; do not mix the two.
Use an API Key created in AI Gateway, not the original Tencent TokenHub Key.
If you need to control DeepSeek thinking mode, read `reasoning_content`, or use JSON mode or tool calling, prefer OpenAI Chat Completions.
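
The endpoint table and notes above can be condensed into a small client-side helper. This is a minimal sketch, not part of AI Gateway: the names `build_endpoint` and `check_tencent_text_path` are hypothetical, and only the base URL, the two paths, and the header names are taken from the table. Any additional headers a particular SDK may send are not covered here.

```python
from urllib.parse import urlparse

# Base URL taken from the endpoint table above.
GATEWAY_BASE_URL = "https://cn-shanghai-alicloud-aimesh.api.clickzetta.com/gateway/v1"


def build_endpoint(protocol: str, api_key: str) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for one of the two protocols (hypothetical helper)."""
    if protocol == "openai":
        return (
            f"{GATEWAY_BASE_URL}/chat/completions",
            {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        )
    if protocol == "anthropic":
        return (
            f"{GATEWAY_BASE_URL}/messages",
            {"x-api-key": api_key, "Content-Type": "application/json"},
        )
    raise ValueError(f"unknown protocol: {protocol!r}")


def check_tencent_text_path(url: str) -> None:
    """Raise ValueError unless the URL path starts with /gateway/v1/."""
    path = urlparse(url).path
    if not path.startswith("/gateway/v1/"):
        raise ValueError(
            f"Tencent TokenHub text models expect /gateway/v1/..., got {path}"
        )
```

Calling `check_tencent_text_path` on a URL that contains `/gateway/api/v1/` or `/gateway/api/v3/` raises an error, which catches the path mix-ups described in the notes before a request is sent.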

3. Environment Variables

export AI_GATEWAY_BASE_URL="https://cn-shanghai-alicloud-aimesh.api.clickzetta.com/gateway/v1"
export API_KEY="<your-api-key>"

4. Minimal Request Example

curl -X POST "$AI_GATEWAY_BASE_URL/chat/completions" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      { "role": "user", "content": "Introduce AI Gateway in one sentence." }
    ],
    "max_tokens": 512,
    "thinking": { "type": "disabled" }
  }'
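
For readers who prefer Python over curl, the same request can be sketched with the standard library alone. This is an illustrative equivalent, not an official SDK example: the function names are hypothetical, and the final line of `main` assumes the response follows the usual OpenAI Chat Completions shape (`choices[0].message.content`), which this overview does not spell out.

```python
import json
import os
import urllib.request


def build_chat_request(base_url: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    payload = {
        "model": "deepseek-v4-flash",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
        "thinking": {"type": "disabled"},
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def main() -> None:
    # Reads the two environment variables defined in the previous section.
    request = build_chat_request(
        os.environ["AI_GATEWAY_BASE_URL"],
        os.environ["API_KEY"],
        "Introduce AI Gateway in one sentence.",
    )
    body = send(request)
    # Assumption: OpenAI-compatible response shape.
    print(body["choices"][0]["message"]["content"])
```

Call `main()` after exporting `AI_GATEWAY_BASE_URL` and `API_KEY`. Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any network traffic occurs.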

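5. Model Selection Guidance

Scenario	Recommended model	Notes
High-frequency Q&A, summarization, classification	`deepseek-v4-flash`	Lighter responses; better suited to high-concurrency, low-latency workloads.
Complex reasoning, code analysis, solution design	`deepseek-v4-pro`	Better suited to tasks that need high-quality reasoning.
Cost-sensitive scenarios	Test `deepseek-v4-flash` first	If the results meet business requirements, prefer flash.
Quality-first scenarios	Test `deepseek-v4-pro` first	Suited to complex tasks, but evaluate cost and latency.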

6. DeepSeek Usage Notes

Capability	Recommendation
General Q&A	Use `deepseek-v4-flash` and set `thinking: {"type": "disabled"}`.
Complex reasoning	Use `deepseek-v4-pro`, enable `thinking: {"type": "enabled"}`, and set `reasoning_effort` if needed.
Long output / reasoning tasks	Enable `stream: true` to avoid timeouts caused by long waits.
JSON output	Use `response_format: {"type": "json_object"}` and disable thinking mode.
Tool calling	Use the `tools` / `tool_choice` fields of OpenAI Chat Completions.
Multi-turn conversations	When writing the previous `assistant` message back into the history, you usually only need to include `content`; `reasoning_content` does not need to be written back.
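
The recommendations in the table above can be expressed as a request-body builder. The sketch below is one possible arrangement, with hypothetical function names; it only uses the fields named in the table (`thinking`, `reasoning_effort`, `stream`, `response_format`). The accepted values for `reasoning_effort` are not listed in this overview, so the helper passes the caller's value through unchanged.

```python
from typing import Any, Optional


def build_payload(
    messages: list[dict[str, Any]],
    *,
    complex_reasoning: bool = False,
    json_output: bool = False,
    stream: bool = False,
    reasoning_effort: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a Chat Completions body following the table above."""
    if json_output and complex_reasoning:
        # The table recommends disabling thinking mode for JSON output.
        raise ValueError("JSON output should be requested with thinking disabled")
    payload: dict[str, Any] = {
        "model": "deepseek-v4-pro" if complex_reasoning else "deepseek-v4-flash",
        "messages": messages,
        "thinking": {"type": "enabled" if complex_reasoning else "disabled"},
    }
    if complex_reasoning and reasoning_effort is not None:
        payload["reasoning_effort"] = reasoning_effort
    if json_output:
        payload["response_format"] = {"type": "json_object"}
    if stream:
        payload["stream"] = True
    return payload


def strip_reasoning(assistant_message: dict[str, Any]) -> dict[str, Any]:
    """Drop reasoning_content before writing an assistant turn back into history."""
    return {k: v for k, v in assistant_message.items() if k != "reasoning_content"}
```

`strip_reasoning` removes only `reasoning_content` and leaves every other key in place, so an assistant message that carries tool calls is written back intact.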

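7. BYOK and Routing

If you bind a Tencent TokenHub Key through BYOK:

API requests still use the AI Gateway API Key.
The original TokenHub Key is used only by the AI Gateway backend to call the upstream.
In default mode and specified-provider mode, the user's BYOK key takes priority; if BYOK is unavailable, the request can fall back to the platform's built-in provider.
In BYOK-only mode, requests do not fall back to the platform's built-in provider.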
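
The fallback rules above amount to a small decision function. The sketch below is a mental model only, not the gateway's actual implementation: the enum, its member names, and `choose_upstream_key` are all hypothetical, and it assumes fallback always happens in the modes where the text says it can.

```python
from enum import Enum


class RoutingMode(Enum):
    # Hypothetical names for the three modes mentioned above.
    DEFAULT = "default"
    SPECIFIED_PROVIDER = "specified_provider"
    BYOK_ONLY = "byok_only"


def choose_upstream_key(mode: RoutingMode, byok_available: bool) -> str:
    """Return "byok" or "platform", mirroring the fallback rules above."""
    if byok_available:
        # BYOK takes priority in every mode.
        return "byok"
    if mode is RoutingMode.BYOK_ONLY:
        # BYOK-only mode never falls back to the built-in provider.
        raise RuntimeError("BYOK key unavailable and BYOK-only mode forbids fallback")
    return "platform"
```

In every case the caller authenticates with the AI Gateway API Key; the choice modeled here concerns only which upstream credential the gateway backend uses.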

8. Chapter Guide

Chapter	Contents
Calling DeepSeek via OpenAI Chat Completions	Request fields, streaming output, Python and Node.js SDK examples.
DeepSeek thinking mode, JSON, and tool calling	`thinking`, `reasoning_content`, JSON mode, Function Calling.
DeepSeek Anthropic Messages compatibility	Anthropic request headers, field differences, tool calling, and SDK examples.
