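TokenHub Large Model Service Platform: MiniMax Invocation Guide

Overview
The MiniMax model family is available on the TokenHub large model service platform through the OpenAI Chat Completions protocol, so developers can integrate it without switching SDKs. This guide covers general-purpose call examples and core capabilities such as the MiniMax-specific thinking mode and Function Calling.
Supported Models
TokenHub currently supports the following MiniMax models (the Model List is authoritative):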
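Model ID	Type	Thinking	Context window	Max input	Max output
`minimax-m3`	General chat model (multimodal: text, image, and video)	Supported (adaptive, on by default)	1M	1M	512K
`minimax-m2.7`	General chat model	Supported	200K	200K	128K
`minimax-m2.5`	General chat model	Supported	200K	200K	128K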
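Prerequisites
You have registered a Tencent Cloud account and activated the TokenHub service.
You have obtained an API Key from the TokenHub console.
You have installed the SDK for your language, or can send HTTP requests directly.
Quick Start
The following examples show the simplest single-turn call. Replace YOUR_API_KEY with the API Key you created.
The examples appear in this order: cURL, Python, Node.js, Java, Go.
curl https://tokenhub.tencentmaas.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \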
  -d '{
    "model": "minimax-m3",
    "messages": [
      {"role": "user", "content": "你好，请介绍一下你自己"}
    ],
    "max_tokens": 1024
  }'
# pip install openai
from openai import OpenAI
﻿
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub.tencentmaas.com/v1",
)
﻿
response = client.chat.completions.create(
    model="minimax-m3",
    messages=[
        {"role": "user", "content": "你好，请介绍一下你自己"}
    ],
    max_tokens=1024,
)
print(response.choices[0].message.content)
// npm install openai
import OpenAI from "openai";
﻿
const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://tokenhub.tencentmaas.com/v1",
});
﻿
const response = await client.chat.completions.create({
  model: "minimax-m3",
  messages: [
    { role: "user", content: "你好，请介绍一下你自己" }
  ],
  max_tokens: 1024,
});
console.log(response.choices[0].message.content);
// Uses OkHttp; add the dependency: implementation("com.squareup.okhttp3:okhttp:4.12.0")
import okhttp3.*;
import org.json.*;
﻿
OkHttpClient httpClient = new OkHttpClient();
﻿
JSONObject body = new JSONObject();
body.put("model", "minimax-m3");
body.put("max_tokens", 1024);
JSONArray messages = new JSONArray();
JSONObject userMsg = new JSONObject();
userMsg.put("role", "user");
userMsg.put("content", "你好，请介绍一下你自己");
messages.put(userMsg);
body.put("messages", messages);
﻿
Request request = new Request.Builder()
    .url("https://tokenhub.tencentmaas.com/v1/chat/completions")
    .addHeader("Authorization", "Bearer YOUR_API_KEY")
    .addHeader("Content-Type", "application/json")
    .post(RequestBody.create(body.toString(), MediaType.get("application/json")))
    .build();
﻿
try (Response response = httpClient.newCall(request).execute()) {
    JSONObject result = new JSONObject(response.body().string());
    System.out.println(result.getJSONArray("choices")
        .getJSONObject(0).getJSONObject("message").getString("content"));
}
package main
﻿
import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)
﻿
func main() {
    body := map[string]interface{}{
        "model": "minimax-m3",
        "messages": []map[string]string{
            {"role": "user", "content": "你好，请介绍一下你自己"},
        },
        "max_tokens": 1024,
    }
    data, err := json.Marshal(body)
    if err != nil {
        panic(err)
    }

    req, err := http.NewRequest("POST",
        "https://tokenhub.tencentmaas.com/v1/chat/completions",
        bytes.NewBuffer(data))
    if err != nil {
        panic(err)
    }
    req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")

    // Check the error before touching resp; resp is nil when the request fails.
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    respBody, err := io.ReadAll(resp.Body)
    if err != nil {
        panic(err)
    }

    var result map[string]interface{}
    if err := json.Unmarshal(respBody, &result); err != nil {
        panic(err)
    }
    choices := result["choices"].([]interface{})
    msg := choices[0].(map[string]interface{})["message"].(map[string]interface{})
    fmt.Println(msg["content"])
}
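The examples above hard-code YOUR_API_KEY for brevity. The following is a minimal sketch, not a TokenHub requirement, of building the same two request headers from an environment variable so the key stays out of source code. The variable name TOKENHUB_API_KEY and the helper build_headers are hypothetical names chosen for this illustration.

```python
import os


def build_headers(env_var: str = "TOKENHUB_API_KEY") -> dict:
    """Build the two HTTP headers used in the examples above.

    The environment variable name is an assumption made for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API Key")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

The returned dictionary can be passed to any HTTP client; with the OpenAI SDK, the same environment lookup can feed the api_key argument instead.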
General Call Examples
Basic Chat
Send a single-turn request and read the model's reply.
The examples appear in this order: cURL, Python, Node.js, Java, Go.
curl https://tokenhub.tencentmaas.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "minimax-m3",
    "messages": [
      {"role": "user", "content": "介绍一下大语言模型"}
    ],
    "max_tokens": 1024
  }'
from openai import OpenAI
﻿
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub.tencentmaas.com/v1",
)
﻿
response = client.chat.completions.create(
    model="minimax-m3",
    messages=[
        {"role": "user", "content": "介绍一下大语言模型"}
    ],
    max_tokens=1024,
)
print(response.choices[0].message.content)
import OpenAI from "openai";
﻿
const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://tokenhub.tencentmaas.com/v1",
});
﻿
const response = await client.chat.completions.create({
  model: "minimax-m3",
  messages: [
    { role: "user", content: "介绍一下大语言模型" }
  ],
  max_tokens: 1024,
});
console.log(response.choices[0].message.content);
import okhttp3.*;
import org.json.*;
﻿
OkHttpClient httpClient = new OkHttpClient();
﻿
JSONObject body = new JSONObject();
body.put("model", "minimax-m3");
body.put("max_tokens", 1024);
﻿
JSONArray messages = new JSONArray();
messages.put(new JSONObject().put("role", "user").put("content", "介绍一下大语言模型"));
body.put("messages", messages);
﻿
Request request = new Request.Builder()
    .url("https://tokenhub.tencentmaas.com/v1/chat/completions")
    .addHeader("Authorization", "Bearer YOUR_API_KEY")
    .addHeader("Content-Type", "application/json")
    .post(RequestBody.create(body.toString(), MediaType.get("application/json")))
    .build();
﻿
try (Response response = httpClient.newCall(request).execute()) {
    JSONObject result = new JSONObject(response.body().string());
    System.out.println(result.getJSONArray("choices")
        .getJSONObject(0).getJSONObject("message").getString("content"));
}
body := map[string]interface{}{
    "model": "minimax-m3",
    "messages": []map[string]string{
        {"role": "user", "content": "介绍一下大语言模型"},
    },
    "max_tokens": 1024,
}
// ... the rest of the request code is the same as in the Quick Start example
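Streaming Output
Set stream to true to enable SSE streaming output. Streaming suits long-text generation: it helps avoid request timeouts and lets users see the output as it is produced.
The examples appear in this order: cURL, Python, Node.js, Java, Go.
curl https://tokenhub.tencentmaas.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \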
  -d '{
    "model": "minimax-m3",
    "messages": [
      {"role": "user", "content": "写一首关于春天的短诗"}
    ],
    "max_tokens": 512,
    "stream": true
  }'
from openai import OpenAI
﻿
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub.tencentmaas.com/v1",
)
﻿
stream = client.chat.completions.create(
    model="minimax-m3",
    messages=[
        {"role": "user", "content": "写一首关于春天的短诗"}
    ],
    max_tokens=512,
    stream=True,
)
﻿
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
import OpenAI from "openai";
﻿
const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://tokenhub.tencentmaas.com/v1",
});
﻿
const stream = await client.chat.completions.create({
  model: "minimax-m3",
  messages: [
    { role: "user", content: "写一首关于春天的短诗" }
  ],
  max_tokens: 512,
  stream: true,
});
﻿
for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
import okhttp3.*;
import okhttp3.sse.*;
import org.json.*;
﻿
OkHttpClient httpClient = new OkHttpClient();
﻿
JSONObject body = new JSONObject();
body.put("model", "minimax-m3");
body.put("max_tokens", 512);
body.put("stream", true);
body.put("messages", new JSONArray()
    .put(new JSONObject().put("role", "user").put("content", "写一首关于春天的短诗")));
﻿
Request request = new Request.Builder()
    .url("https://tokenhub.tencentmaas.com/v1/chat/completions")
    .addHeader("Authorization", "Bearer YOUR_API_KEY")
    .addHeader("Content-Type", "application/json")
    .post(RequestBody.create(body.toString(), MediaType.get("application/json")))
    .build();
﻿
EventSources.createFactory(httpClient).newEventSource(request, new EventSourceListener() {
    @Override
    public void onEvent(EventSource source, String id, String type, String data) {
        if ("[DONE]".equals(data)) return;
        try {
            JSONObject json = new JSONObject(data);
            String content = json.getJSONArray("choices").getJSONObject(0)
                .getJSONObject("delta").optString("content", "");
            if (!content.isEmpty()) System.out.print(content);
        } catch (JSONException ignored) {}
    }
});
import (
    "bufio"
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
    "strings"
)
﻿
body := map[string]interface{}{
    "model":      "minimax-m3",
    "messages":   []map[string]string{{"role": "user", "content": "写一首关于春天的短诗"}},
    "max_tokens": 512,
    "stream":     true,
}
data, _ := json.Marshal(body)
﻿
req, _ := http.NewRequest("POST",
    "https://tokenhub.tencentmaas.com/v1/chat/completions",
    bytes.NewBuffer(data))
req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
req.Header.Set("Content-Type", "application/json")
﻿
resp, err := http.DefaultClient.Do(req)
if err != nil {
    panic(err) // resp is nil on failure, so stop before using it
}
defer resp.Body.Close()
﻿
scanner := bufio.NewScanner(resp.Body)
for scanner.Scan() {
    line := scanner.Text()
    if !strings.HasPrefix(line, "data: ") || line == "data: [DONE]" {
        continue
    }
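    var chunk map[string]interface{}
    // Skip lines that are not valid JSON instead of panicking below.
    if err := json.Unmarshal([]byte(strings.TrimPrefix(line, "data: ")), &chunk); err != nil {
        continue
    }
    // A chunk may carry no choices; guard before indexing.
    choices, ok := chunk["choices"].([]interface{})
    if !ok || len(choices) == 0 {
        continue
    }
    delta, _ := choices[0].(map[string]interface{})["delta"].(map[string]interface{})
    if content, ok := delta["content"].(string); ok {
        fmt.Print(content)
    }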
}
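For clients that read the SSE stream without an SDK, the per-line handling shown in the Go example can be sketched in Python as follows. This is an illustrative sketch only: the helper name iter_content_deltas is hypothetical, and the sample lines are hand-written to match the chunk shape the examples above parse (choices[0].delta.content, terminated by data: [DONE]), not captured from the service.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the content fragments carried by SSE `data:` lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel used in the examples above
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # a chunk without choices carries no text
        content = (choices[0].get("delta") or {}).get("content")
        if content:
            yield content


# Hand-written sample lines (an assumption, for illustration only).
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Spring "}}]}',
    'data: {"choices": [{"delta": {"content": "rain"}}]}',
    "data: [DONE]",
]
print("".join(iter_content_deltas(sample)))
```

Writing the parser as a generator keeps it independent of the transport: the same function accepts lines from a file, a socket, or an HTTP client's line iterator.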
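System Prompt
Use a message with the system role to set the model's behavioral instructions and background context.
The examples appear in this order: cURL, Python, Node.js, Java, Go.
curl https://tokenhub.tencentmaas.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \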
  -d '{
    "model": "minimax-m3",
    "messages": [
      {"role": "system", "content": "你是一位专业的 Python 编程助手，只回答与 Python 相关的问题，回答简洁明了。"},
      {"role": "user", "content": "如何读取一个 CSV 文件？"}
    ],
    "max_tokens": 512
  }'
from openai import OpenAI
﻿
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub.tencentmaas.com/v1",
)
﻿
response = client.chat.completions.create(
    model="minimax-m3",
    messages=[
        {
            "role": "system",
            "content": "你是一位专业的 Python 编程助手，只回答与 Python 相关的问题，回答简洁明了。",
        },
        {"role": "user", "content": "如何读取一个 CSV 文件？"},
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)
import OpenAI from "openai";
﻿
const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://tokenhub.tencentmaas.com/v1",
});
﻿
const response = await client.chat.completions.create({
  model: "minimax-m3",
  messages: [
    {
      role: "system",
      content: "你是一位专业的 Python 编程助手，只回答与 Python 相关的问题，回答简洁明了。",
    },
    { role: "user", content: "如何读取一个 CSV 文件？" },
  ],
  max_tokens: 512,
});
console.log(response.choices[0].message.content);
JSONObject body = new JSONObject();
body.put("model", "minimax-m3");
body.put("max_tokens", 512);
body.put("messages", new JSONArray()
    .put(new JSONObject().put("role", "system")
        .put("content", "你是一位专业的 Python 编程助手，只回答与 Python 相关的问题，回答简洁明了。"))
    .put(new JSONObject().put("role", "user")
        .put("content", "如何读取一个 CSV 文件？")));
// ... the request-sending code is the same as above
body := map[string]interface{}{
    "model": "minimax-m3",
    "messages": []map[string]string{
        {"role": "system", "content": "你是一位专业的 Python 编程助手，只回答与 Python 相关的问题，回答简洁明了。"},
        {"role": "user", "content": "如何读取一个 CSV 文件？"},
    },
    "max_tokens": 512,
}
// ... the request-sending code is the same as in Quick Start
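Multi-turn Conversation
Pass the earlier messages in the messages array together with the new user message to give the model conversational context across turns.
The examples appear in this order: cURL, Python, Node.js, Java, Go.
curl https://tokenhub.tencentmaas.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \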
  -d '{
    "model": "minimax-m3",
    "messages": [
      {"role": "user", "content": "我叫小明，我喜欢打篮球"},
      {"role": "assistant", "content": "你好，小明！打篮球是一项很棒的运动。"},
      {"role": "user", "content": "你还记得我的名字和爱好吗？"}
    ],
    "max_tokens": 256
  }'
from openai import OpenAI
﻿
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub.tencentmaas.com/v1",
)
﻿
# Maintain the conversation history
conversation = [
    {"role": "system", "content": "你是一个友好的 AI 助手。"},
]
﻿
def chat(user_input):
    conversation.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        model="minimax-m3",
        messages=conversation,
        max_tokens=1024,
    )
    reply = response.choices[0].message.content
    conversation.append({"role": "assistant", "content": reply})
    return reply
﻿
print(chat("我叫小明，我喜欢打篮球"))
print(chat("你还记得我的名字和爱好吗？"))
import OpenAI from "openai";
﻿
const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://tokenhub.tencentmaas.com/v1",
});
﻿
const conversation = [
  { role: "system", content: "你是一个友好的 AI 助手。" },
];
﻿
async function chat(userInput) {
  conversation.push({ role: "user", content: userInput });
  const response = await client.chat.completions.create({
    model: "minimax-m3",
    messages: conversation,
    max_tokens: 1024,
  });
  const reply = response.choices[0].message.content;
  conversation.push({ role: "assistant", content: reply });
  return reply;
}
﻿
console.log(await chat("我叫小明，我喜欢打篮球"));
console.log(await chat("你还记得我的名字和爱好吗？"));
JSONArray messages = new JSONArray();
messages.put(new JSONObject().put("role", "system").put("content", "你是一个友好的 AI 助手。"));
messages.put(new JSONObject().put("role", "user").put("content", "我叫小明，我喜欢打篮球"));
messages.put(new JSONObject().put("role", "assistant").put("content", "你好，小明！打篮球是一项很棒的运动。"));
messages.put(new JSONObject().put("role", "user").put("content", "你还记得我的名字和爱好吗？"));
﻿
JSONObject body = new JSONObject();
body.put("model", "minimax-m3");
body.put("messages", messages);
body.put("max_tokens", 1024);
// ... the request-sending code is the same as above
body := map[string]interface{}{
    "model": "minimax-m3",
    "messages": []map[string]string{
        {"role": "system", "content": "你是一个友好的 AI 助手。"},
        {"role": "user", "content": "我叫小明，我喜欢打篮球"},
        {"role": "assistant", "content": "你好，小明！打篮球是一项很棒的运动。"},
        {"role": "user", "content": "你还记得我的名字和爱好吗？"},
    },
    "max_tokens": 1024,
}
// ... the request-sending code is the same as in Quick Start
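Function Calling (Tool Calls)
Function Calling lets the model call external tools to obtain real-time data. The model does not execute functions itself; it returns the name and arguments of the function to call. Your code executes the function and passes the result back to the model, which then produces the final natural-language answer.
Call flow:
1. The user asks a question → the model returns tool_calls (containing the function name and arguments).
2. Your code executes the function → the result is passed back as a role: tool message.
3. The model generates the final natural-language answer from the function result.
The examples appear in this order: cURL, Python, Node.js, Java, Go.
# Round 1: send the question plus the tool definitions
curl https://tokenhub.tencentmaas.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \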
  -d '{
    "model": "minimax-m3",
    "messages": [
      {"role": "user", "content": "北京今天天气怎么样？"}
    ],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "获取指定城市的天气信息",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string", "description": "城市名称，如北京"}
          },
          "required": ["city"]
        }
      }
    }]
  }'
﻿
# Round 2: pass the tool result back (replace tool_call_id with the id actually returned)
curl https://tokenhub.tencentmaas.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "minimax-m3",
    "messages": [
      {"role": "user", "content": "北京今天天气怎么样？"},
      {"role": "assistant", "tool_calls": [{"id": "call_xxx", "type": "function", "function": {"name": "get_weather", "arguments": "{\"city\": \"北京\"}"}}]},
      {"role": "tool", "tool_call_id": "call_xxx", "content": "晴，气温28℃，湿度50%"}
    ],
    "tools": [{"type": "function", "function": {"name": "get_weather", "description": "获取指定城市的天气信息", "parameters": {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]}}}]
  }'
from openai import OpenAI
﻿
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub.tencentmaas.com/v1",
)
﻿
# Define the tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "获取指定城市的天气信息",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "城市名称，如北京"}
                },
                "required": ["city"],
            },
        },
    }
]
﻿
# Round 1: send the question
messages = [{"role": "user", "content": "北京今天天气怎么样？"}]
response = client.chat.completions.create(
    model="minimax-m3",
    messages=messages,
    tools=tools,
)
assistant_message = response.choices[0].message
﻿
# The model requested a tool call
if response.choices[0].finish_reason == "tool_calls":
    tool_call = assistant_message.tool_calls[0]
    print(f"Model called tool: {tool_call.function.name}, arguments: {tool_call.function.arguments}")

    # Execute the tool (a simulated result is used here)
    tool_result = "晴，气温28℃，湿度50%"

    # Round 2: pass the tool result back to the model
    messages.append(assistant_message)
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": tool_result,
    })
﻿
    final_response = client.chat.completions.create(
        model="minimax-m3",
        messages=messages,
        tools=tools,
    )
    print(final_response.choices[0].message.content)
import OpenAI from "openai";
﻿
const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://tokenhub.tencentmaas.com/v1",
});
﻿
const tools = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "获取指定城市的天气信息",
      parameters: {
        type: "object",
        properties: {
          city: { type: "string", description: "城市名称，如北京" },
        },
        required: ["city"],
      },
    },
  },
];
﻿
// Round 1
const messages = [{ role: "user", content: "北京今天天气怎么样？" }];
const response1 = await client.chat.completions.create({
  model: "minimax-m3",
  messages,
  tools,
});
﻿
const assistantMsg = response1.choices[0].message;
if (response1.choices[0].finish_reason === "tool_calls") {
  const toolCall = assistantMsg.tool_calls[0];
  console.log(`Tool call: ${toolCall.function.name}, arguments: ${toolCall.function.arguments}`);
﻿
  const toolResult = "晴，气温28℃，湿度50%";
  messages.push(assistantMsg);
  messages.push({ role: "tool", tool_call_id: toolCall.id, content: toolResult });
﻿
  const response2 = await client.chat.completions.create({
    model: "minimax-m3",
    messages,
    tools,
  });
  console.log(response2.choices[0].message.content);
}
JSONObject toolFunc = new JSONObject()
    .put("name", "get_weather")
    .put("description", "获取指定城市的天气信息")
    .put("parameters", new JSONObject()
        .put("type", "object")
        .put("properties", new JSONObject()
            .put("city", new JSONObject().put("type", "string").put("description", "城市名称")))
        .put("required", new JSONArray().put("city")));
﻿
JSONArray tools = new JSONArray()
    .put(new JSONObject().put("type", "function").put("function", toolFunc));
﻿
JSONObject body = new JSONObject();
body.put("model", "minimax-m3");
body.put("messages", new JSONArray()
    .put(new JSONObject().put("role", "user").put("content", "北京今天天气怎么样？")));
body.put("tools", tools);
// ... send the request, parse tool_calls, execute the tool, and build the round-2 request
body := map[string]interface{}{
    "model": "minimax-m3",
    "messages": []map[string]string{
        {"role": "user", "content": "北京今天天气怎么样？"},
    },
    "tools": []map[string]interface{}{{
        "type": "function",
        "function": map[string]interface{}{
            "name":        "get_weather",
            "description": "获取指定城市的天气信息",
            "parameters": map[string]interface{}{
                "type": "object",
                "properties": map[string]interface{}{
                    "city": map[string]string{"type": "string", "description": "城市名称"},
                },
                "required": []string{"city"},
            },
        },
    }},
}
// ... send the request, parse tool_calls, and build the round-2 request
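Step 2 of the call flow, executing the requested function and packaging the result, is left as a comment in the Java and Go snippets. The following is a minimal sketch of that step working on plain dictionaries, as they appear in the raw JSON response. The names TOOL_REGISTRY and run_tool_calls are hypothetical, and get_weather is a stand-in that returns fixed data; objects returned by the OpenAI SDK would need to be converted to dictionaries first.

```python
import json


def get_weather(city: str) -> str:
    """Stand-in tool; a real implementation would query a weather service."""
    return f"{city}: sunny, 28°C, humidity 50%"


# Maps the tool names declared in `tools` to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(tool_calls: list) -> list:
    """Execute each tool call and return the matching role: tool messages."""
    results = []
    for call in tool_calls:
        name = call["function"]["name"]
        # `arguments` arrives as a JSON-encoded string, not as a dict.
        args = json.loads(call["function"]["arguments"] or "{}")
        handler = TOOL_REGISTRY.get(name)
        content = handler(**args) if handler else f"unknown tool: {name}"
        results.append(
            {"role": "tool", "tool_call_id": call["id"], "content": content}
        )
    return results


calls = [{
    "id": "call_xxx",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"Beijing\"}"},
}]
print(run_tool_calls(calls))
```

Each returned message echoes the id of the call it answers, which is how the model matches results to requests when several tools are called in one turn.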
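Thinking Mode
MiniMax models have built-in thinking capability. When the thinking parameter is omitted, adaptive thinking is on by default and the response includes the thinking content. On M2.x models, thinking cannot be turned off. By default the thinking content is embedded in the content field of the response inside <think>...</think> tags; to have it returned in separate fields, add "reasoning_split": true to the request body.
thinking Parameter Reference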
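Field	Type	Default	Allowed values	Description
`type`	string	`"adaptive"`	`"adaptive"` / `"disabled"`	`adaptive`: enables adaptive thinking on M3 (the default); `disabled`: makes M3 skip thinking and answer directly. M2.x models keep thinking on even when `disabled` is passed.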
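This guide does not show a request that sets the thinking parameter, so the following sketch illustrates one plausible shape. It assumes the parameter is sent as an object with a single type field, which is inferred from the thinking.type notation used in this guide; confirm the exact request shape against the API reference. The helper name build_chat_payload is hypothetical.

```python
def build_chat_payload(model: str, prompt: str, thinking_type: str = "adaptive") -> dict:
    """Build a Chat Completions request body with an explicit thinking setting.

    Assumption: `thinking` is an object of the form {"type": ...}.
    """
    if thinking_type not in ("adaptive", "disabled"):
        raise ValueError("thinking_type must be 'adaptive' or 'disabled'")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "thinking": {"type": thinking_type},
    }


payload = build_chat_payload("minimax-m3", "What is 2 + 2?", thinking_type="disabled")
print(payload["thinking"])
```

With the OpenAI Python SDK, a non-standard field like this would typically be passed through extra_body, the same mechanism the SDK examples in this guide use for reasoning_split.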
Splitting Out Thinking Content with reasoning_split
The examples appear in this order: cURL, Python, Node.js, Java, Go.
curl https://tokenhub.tencentmaas.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "minimax-m3",
    "messages": [
      {"role": "user", "content": "解方程 x^2 - 5x + 6 = 0"}
    ],
    "max_tokens": 2048,
    "reasoning_split": true
  }'
from openai import OpenAI
﻿
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub.tencentmaas.com/v1",
)
﻿
response = client.chat.completions.create(
    model="minimax-m3",
    messages=[{"role": "user", "content": "解方程 x^2 - 5x + 6 = 0"}],
    max_tokens=2048,
    extra_body={"reasoning_split": True},  # return the thinking content in separate fields
)
﻿
msg = response.choices[0].message
﻿
# With reasoning_split=True, the reasoning is returned in a separate field
reasoning = getattr(msg, "reasoning_content", None)
if reasoning:
    print("=== Reasoning ===")
    print(reasoning)

print("=== Final answer ===")
print(msg.content)
import OpenAI from "openai";
﻿
const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://tokenhub.tencentmaas.com/v1",
});
﻿
const response = await client.chat.completions.create({
  model: "minimax-m3",
  messages: [{ role: "user", content: "解方程 x^2 - 5x + 6 = 0" }],
  max_tokens: 2048,
  // @ts-ignore - reasoning_split is an extension field
  reasoning_split: true,
});
﻿
const msg = response.choices[0].message;
const reasoning = (msg as any).reasoning_content;
if (reasoning) {
  console.log("=== Reasoning ===");
  console.log(reasoning);
}
console.log("=== Final answer ===");
console.log(msg.content);
JSONObject body = new JSONObject();
body.put("model", "minimax-m3");
body.put("max_tokens", 2048);
body.put("reasoning_split", true);
body.put("messages", new JSONArray()
    .put(new JSONObject().put("role", "user").put("content", "解方程 x^2 - 5x + 6 = 0")));
﻿
// ... build and send the request as in the earlier examples
try (Response response = httpClient.newCall(request).execute()) {
    JSONObject result = new JSONObject(response.body().string());
    JSONObject message = result.getJSONArray("choices")
        .getJSONObject(0).getJSONObject("message");
    String reasoning = message.optString("reasoning_content", "");
    String content = message.getString("content");
    System.out.println("Reasoning: " + reasoning);
    System.out.println("Final answer: " + content);
}
body := map[string]interface{}{
    "model":           "minimax-m3",
    "max_tokens":      2048,
    "reasoning_split": true,
    "messages": []map[string]string{
        {"role": "user", "content": "解方程 x^2 - 5x + 6 = 0"},
    },
}
// ... send the request, then read the reasoning_content and content fields from the response
Response Structure
Without reasoning_split (the default), the thinking content is embedded in content inside <think> tags:
{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "<think>\n因式分解：(x-2)(x-3) = 0，所以 x = 2 或 x = 3。\n</think>\n方程 x² - 5x + 6 = 0 的解为：**x = 2** 或 **x = 3**"
    }
  }]
}
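When the thinking content arrives embedded in content like this, it can be separated on the client. The following is a minimal sketch, assuming the tags appear exactly as <think>...</think> and are not nested; the helper name split_thinking is hypothetical.

```python
import re

# Non-greedy match so several <think> blocks are captured separately;
# DOTALL lets "." span the newlines inside a block.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(content: str) -> tuple:
    """Return (reasoning, answer) from a content string with <think> tags."""
    reasoning = "\n".join(m.strip() for m in _THINK_RE.findall(content))
    answer = _THINK_RE.sub("", content).strip()
    return reasoning, answer


content = (
    "<think>\nFactor: (x-2)(x-3) = 0, so x = 2 or x = 3.\n</think>\n"
    "The solutions are x = 2 or x = 3."
)
reasoning, answer = split_thinking(content)
print(reasoning)
print(answer)
```

A string without any tags comes back unchanged as the answer, with an empty reasoning string.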
With reasoning_split: true, the thinking content is returned in the separate reasoning_content and reasoning_details fields:
{
  "choices": [{
    "message": {
      "role": "assistant",
      "reasoning_content": "因式分解：(x-2)(x-3) = 0，所以 x = 2 或 x = 3。",
      "reasoning_details": [...],
      "content": "方程 x² - 5x + 6 = 0 的解为：**x = 2** 或 **x = 3**"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "completion_tokens": 120,
    "completion_tokens_details": {
      "reasoning_tokens": 80
    }
  }
}
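The usage block can be used to see how much of the output budget went to thinking. The sketch below assumes that reasoning_tokens is counted inside completion_tokens, as in the OpenAI usage format; this guide does not state that explicitly, so treat the subtraction as an assumption. The helper name split_completion_tokens is hypothetical.

```python
def split_completion_tokens(usage: dict) -> tuple:
    """Return (reasoning_tokens, answer_tokens) from a usage object.

    Assumption: reasoning tokens are included in completion_tokens.
    """
    total = usage.get("completion_tokens", 0)
    details = usage.get("completion_tokens_details") or {}
    reasoning = details.get("reasoning_tokens", 0)
    return reasoning, max(total - reasoning, 0)


usage = {
    "completion_tokens": 120,
    "completion_tokens_details": {"reasoning_tokens": 80},
}
print(split_completion_tokens(usage))  # (80, 40)
```

Under that assumption, the sample response above spent 80 of its 120 completion tokens on thinking, which is why a larger max_tokens is advisable when thinking is on.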
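Streaming Thinking Output
With streaming enabled, both reasoning_content and content arrive as incremental deltas and must be handled separately.
The examples appear in this order: Python, Node.js.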
from openai import OpenAI
﻿
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub.tencentmaas.com/v1",
)
﻿
stream = client.chat.completions.create(
    model="minimax-m3",
    messages=[{"role": "user", "content": "分析一下量子计算的优势和挑战"}],
    max_tokens=2048,
    stream=True,
    extra_body={"reasoning_split": True},
)
﻿
print("=== Reasoning (live) ===")
answer_started = False
﻿
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
﻿
    reasoning_delta = getattr(delta, "reasoning_content", None)
    if reasoning_delta:
        print(reasoning_delta, end="", flush=True)
﻿
    if delta.content:
        if not answer_started:
            print("\n\n=== Final answer (live) ===")
            answer_started = True
        print(delta.content, end="", flush=True)
import OpenAI from "openai";
﻿
const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://tokenhub.tencentmaas.com/v1",
});
﻿
const stream = await client.chat.completions.create({
  model: "minimax-m3",
  messages: [{ role: "user", content: "分析一下量子计算的优势和挑战" }],
  max_tokens: 2048,
  stream: true,
  // @ts-ignore
  reasoning_split: true,
});
﻿
let answerStarted = false;
process.stdout.write("=== Reasoning (live) ===\n");
﻿
for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta;
  if (!delta) continue;
﻿
  const reasoning = (delta as any).reasoning_content;
  if (reasoning) process.stdout.write(reasoning);
﻿
  if (delta.content) {
    if (!answerStarted) {
      process.stdout.write("\n\n=== Final answer (live) ===\n");
      answerStarted = true;
    }
    process.stdout.write(delta.content);
  }
}
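JSON Mode
Setting response_format to json_object makes the model output a valid JSON string, which suits scenarios that need structured data.
Note:
When using JSON mode, you must explicitly ask for JSON output in the system or user message; otherwise the model may keep producing empty content.
The examples appear in this order: cURL, Python, Node.js, Java, Go.
curl https://tokenhub.tencentmaas.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \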
  -d '{
    "model": "minimax-m3",
    "messages": [
      {"role": "system", "content": "请以 JSON 格式返回结果。"},
      {"role": "user", "content": "返回三座中国城市的信息，每个包含 name、province、population 字段"}
    ],
    "max_tokens": 512,
    "response_format": {"type": "json_object"}
  }'
import json
from openai import OpenAI
﻿
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub.tencentmaas.com/v1",
)
﻿
response = client.chat.completions.create(
    model="minimax-m3",
    messages=[
        {"role": "system", "content": "请以 JSON 格式返回结果。"},
        {
            "role": "user",
            "content": "返回三座中国城市的信息，每个包含 name、province、population 字段",
        },
    ],
    max_tokens=512,
    response_format={"type": "json_object"},
)
﻿
result = json.loads(response.choices[0].message.content)
print(json.dumps(result, ensure_ascii=False, indent=2))
import OpenAI from "openai";
﻿
const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://tokenhub.tencentmaas.com/v1",
});
﻿
const response = await client.chat.completions.create({
  model: "minimax-m3",
  messages: [
    { role: "system", content: "请以 JSON 格式返回结果。" },
    {
      role: "user",
      content: "返回三座中国城市的信息，每个包含 name、province、population 字段",
    },
  ],
  max_tokens: 512,
  response_format: { type: "json_object" },
});
﻿
const result = JSON.parse(response.choices[0].message.content);
console.log(JSON.stringify(result, null, 2));
JSONObject body = new JSONObject();
body.put("model", "minimax-m3");
body.put("max_tokens", 512);
body.put("response_format", new JSONObject().put("type", "json_object"));
body.put("messages", new JSONArray()
    .put(new JSONObject().put("role", "system").put("content", "请以 JSON 格式返回结果。"))
    .put(new JSONObject().put("role", "user").put("content",
        "返回三座中国城市的信息，每个包含 name、province、population 字段")));
// ... send the request and parse the returned JSON string
body := map[string]interface{}{
    "model":           "minimax-m3",
    "max_tokens":      512,
    "response_format": map[string]string{"type": "json_object"},
    "messages": []map[string]string{
        {"role": "system", "content": "请以 JSON 格式返回结果。"},
        {"role": "user", "content": "返回三座中国城市的信息，每个包含 name、province、population 字段"},
    },
}
// ... send the request
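Because the note above warns that the model may return empty content when the prompt does not ask for JSON, it can help to validate the reply before using it. The following is a minimal sketch of such a check; the helper name parse_json_reply is hypothetical, and the error messages are illustrative.

```python
import json


def parse_json_reply(content):
    """Parse a JSON-mode reply, failing loudly on empty or invalid content."""
    if content is None or not content.strip():
        raise ValueError(
            "model returned empty content; check that the prompt asks for JSON"
        )
    try:
        return json.loads(content)
    except json.JSONDecodeError as exc:
        # One possible cause is a reply cut off by max_tokens.
        raise ValueError(f"reply is not valid JSON: {exc}") from exc


print(parse_json_reply('{"cities": [{"name": "Beijing"}]}'))
```

Raising a single exception type for both failure modes lets the caller decide in one place whether to retry with a clearer prompt or a larger max_tokens.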
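Key Differences from Other Models
Dimension	MiniMax M3/M2.7/M2.5	OpenAI / Claude / GLM and others
Thinking switch	M3 is controlled through `thinking.type` (`adaptive`/`disabled`); thinking cannot be turned off on M2.x	Usually controlled by switching models or through a separate reasoning parameter
Reasoning field	Embedded in `content` inside `<think>` tags by default; split into `reasoning_content` when `reasoning_split=true` is passed	Most models do not expose the reasoning process
Accessing reasoning fields through the OpenAI SDK	In Python, must use `hasattr` / `getattr`	-
`temperature` range	0-2, default 1	Usually 0-2
Recommended `max_tokens`	General tasks: 1024-4096; thinking mode: ≥ 2048 recommended	Usually 1024-4096 is enough
Context window	M3: 1M tokens; M2.x: 200K tokens	Usually 128K tokens
Max output	M3: 512K tokens; M2.x: 128K tokens	Usually 16K tokens
Writing messages back in multi-turn conversations	Write back only `content`; `reasoning_content` is not needed	Usually only `content` needs to be written back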
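The last row above can be sketched as a small helper that strips an assistant reply down to what should go back into messages. This is an illustration for plain text replies only: the name to_history_entry is hypothetical, and a reply that carries tool_calls must keep them, as shown in the Function Calling examples.

```python
def to_history_entry(message: dict) -> dict:
    """Keep only role and content when writing an assistant reply back."""
    return {
        "role": message.get("role", "assistant"),
        "content": message.get("content") or "",
    }


reply = {
    "role": "assistant",
    "reasoning_content": "Factor: (x-2)(x-3) = 0, so x = 2 or x = 3.",
    "reasoning_details": [],
    "content": "x = 2 or x = 3",
}
history = [{"role": "user", "content": "Solve x^2 - 5x + 6 = 0"}]
history.append(to_history_entry(reply))
print(history[-1])
```

Dropping the reasoning fields keeps the next request smaller, since the reasoning text would otherwise be billed again as input.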
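Recommended Parameters and Best Practices
Parameter / practice	Recommendation	Notes
`max_tokens`	General tasks: 1024-4096; thinking mode: ≥ 2048 recommended	Thinking content and the answer share the same token budget
`reasoning_split`	Pass `true` when the reasoning needs to be handled separately	By default the thinking content is embedded in `content` inside `<think>` tags
`stream`	Enable for long-text generation	Avoids request timeouts and improves responsiveness
`temperature`	Use the default of 1; raise to 1.3-1.5 for creative writing; lower to 0.2-0.5 for code generation	The MiniMax temperature range is 0-2, the same as OpenAI
Multi-turn conversation	Pass back only `content`, not `reasoning_content`	Reduces token consumption
Accessing reasoning fields through the SDK	Python: `getattr(msg, "reasoning_content", None)`; Node.js: `(msg as any).reasoning_content`	The OpenAI SDK type definitions do not include this field
Model selection	`minimax-m3` is the latest flagship and supports a 1M context; `minimax-m2.7` offers strong all-round capability; `minimax-m2.5` suits cost-sensitive scenarios	-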
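Usage Limits
Limit	Description
`temperature` range	The MiniMax temperature range is 0-2 with a default of 1, the same as OpenAI.
Timeout risk	Responses take longer when thinking mode is on; use `stream=true` to avoid timeouts.
Thinking mode and JSON mode	Enabling `thinking.type=adaptive` together with `response_format.type=json_object` is not recommended.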
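The limits above can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: the thinking setting is read from a thinking object with a type field (inferred from the thinking.type notation in this guide), and M2.x models are treated as always thinking, which extends the JSON-mode advice to them. The function name check_request and the message constants are hypothetical.

```python
TEMPERATURE_MSG = "temperature should be within 0-2"
JSON_THINKING_MSG = "thinking together with response_format json_object is not recommended"
STREAM_MSG = "thinking is on; consider stream=true to avoid timeouts"


def check_request(payload: dict) -> list:
    """Return warnings for a request body; an empty list means no findings."""
    warnings = []
    if not 0 <= payload.get("temperature", 1) <= 2:
        warnings.append(TEMPERATURE_MSG)

    thinking_type = (payload.get("thinking") or {}).get("type", "adaptive")
    # Per this guide, M2.x models keep thinking on even if "disabled" is passed.
    is_m2 = str(payload.get("model", "")).startswith("minimax-m2")
    thinking_on = is_m2 or thinking_type == "adaptive"

    json_mode = (payload.get("response_format") or {}).get("type") == "json_object"
    if thinking_on and json_mode:
        warnings.append(JSON_THINKING_MSG)
    if thinking_on and not payload.get("stream", False):
        warnings.append(STREAM_MSG)
    return warnings


print(check_request({
    "model": "minimax-m3",
    "response_format": {"type": "json_object"},
    "stream": True,
}))
```

The function returns warnings instead of raising, because two of the three limits are recommendations and the caller may choose to proceed anyway.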
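Related Documentation
Language Model Invocation Overview: the general TokenHub guide to calling language models, covering BaseURL, API Key, multi-turn conversations, Function Calling, the Anthropic protocol, and other common topics.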