文档中心 API 中心 语音合成 实时语音合成

实时语音合成

最近更新时间:2019-10-15 11:25:52

1. 接口描述

接口请求域名:tts.cloud.tencent.com/stream
接口请求频率限制:20次/每秒。
腾讯云语音合成技术(TTS)可以将任意文本转化为语音,实现让机器和应用张口说话。 腾讯 TTS 技术可以应用到很多场景,例如,移动 App 语音播报新闻;智能设备语音提醒;依靠网上现有节目或少量录音,快速合成明星语音,降低邀约成本;支持车载导航语音合成的个性化语音播报。本接口内测期间免费使用

2. 输入参数

参数名称 必选 类型 描述
Action String 本接口取值:TextToStreamAudio
AppId Integer 账号 AppId
SecretId String 官网 SecretId
Timestamp Integer 当前 UNIX 时间戳,可记录发起 API 请求的时间。例如1529223702,如果与当前时间相差过大,会引起签名过期错误。
Expired Integer 签名的有效期,是一个符合 UNIX Epoch 时间戳规范的数值,单位为秒;Expired 必须大于 Timestamp 且 Expired-Timestamp 小于90天。
Text String 合成语音的源文本。中文最大支持600个汉字(全角标点符号算一个汉字),英文最大支持1800个字母(半角标点符号算一个字母)。包含空格等字符时需要 URL encode 再传输。
SessionId String 一次请求对应一个 SessionId,会原样返回,建议传入类似于 uuid 的字符串防止重复。
ModelType Integer 模型类型,1:默认模型
Volume Float 音量大小,范围:[0,10],分别对应11个等级的音量,默认值为0,代表正常音量。没有静音选项。
输入除以上整数之外的其他参数不生效,按默认值处理。
Speed Integer 语速,范围:[-2,2],分别对应不同语速:
-2代表0.6倍
-1代表0.8倍
0代表1.0倍(默认)
1代表1.2倍
2代表1.5倍
输入除以上整数之外的其他参数不生效,按默认值处理。
ProjectId Integer 项目 ID,用户自定义,默认为0。
VoiceType Integer 音色:
0:亲和女声(默认)
1:亲和男声
2:成熟男声
4:温暖女声
5:情感女声
6:情感男声
PrimaryLanguage Integer 主语言类型:
1:中文(默认)
2:英文
SampleRate Integer 音频采样率:
16000:16k(默认)
8000:8k
Codec String 返回音频格式:
opus:返回多段含 opus 压缩分片音频,数据量小,建议使用(默认)。
pcm:返回二进制 pcm 音频,使用简单,但数据量大。

请求 headers

{
    "Content-Type":"application/json",
    "Authorization":"HRCKlbwPhWtVvfGn914qE5O1rwc="
}

鉴权说明请参考 接口鉴权 文档。

注:仅需要生成签名串,签名串不需要编码。

3. 输出参数

协议说明:使用 HTTP Chunk 协议,一次 HTTP 请求内按序返回多段分片直到音频结束。

Opus

Codec 参数为 opus 返回,默认返回。返回多个语音分片,分片大小不一,单个分片格式说明如下:

标识头“opus”H
(4字节字符串)
分片序号 S
(4字节二进制数据)
分片音频长度 L
(4字节二进制数据)
分片音频 D
(长度为分片音频长度 L)
标识新的分片的开始 序号从0开始,-1结束 按分片音频长度读取音频数据 base64后的 Opus 压缩音频

其中最后一片音频(序号 S = -1)数据固定为“AAAA”,该段数据无效。

Pcm

Codec 参数为 pcm 返回,同等条件下返回数据量约为 Opus 格式的 10 倍。
返回格式:二进制
返回内容:pcm 音频流

4. 示例

Opus

Codec 参数为 opus 返回,Java 示例如下:

//填写请求参数
mRequestBody = "{"+
        "\"AppId\":1255824371,"+
        "\"SecretId\":\"AKID3wb2D8EFRN3UgbwengJ5h5E8nERPabqq\","+
        "\"PrimaryLanguage\": 1,"+
        "\"SampleRate\": 16000,"+
        "\"Action\":\"TextToStreamAudio\","+
        "\"VoiceType\":1,"+
        "\"Text\":\"我只是拿来测试的文本\","+
        "\"SessionId\":\"session-1234\","+
        "\"Timestamp\":1535362116,"+
        "\"Expired\":1535365716,"+
        "\"Speed\":0,"+
        "\"Volume\":0"+
"}";

//http post请求
URL url = new URL(SERVER_URL);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setRequestMethod("POST");
conn.setRequestProperty("Content-Type", "application/json");
conn.setRequestProperty("Authorization", "VVerIEkz2OAuSBdNtH6W2M1xal0=");
conn.connect();
OutputStream out = conn.getOutputStream();
out.write(postJsonBody.getBytes("UTF-8"));
out.flush();
out.close();
InputStream inputStream = conn.getInputStream();

//解析返回数据
while (!Thread.currentThread().isInterrupted()) {
    //read header
    byte[] headBuffer = new byte[4];
    fill(inputStream, headBuffer);

    //read seq
    byte[] seqBuffer = new byte[4];
    fill(inputStream, seqBuffer);
    int seq = bytesToInt(seqBuffer);

    //read pkg size
    byte[] pbPkgSizeHeader = new byte[4];
    fill(inputStream, pbPkgSizeHeader);
    int pbPkgSize = bytesToInt(pbPkgSizeHeader);

    //read pkg
    byte[] pbPkg = new byte[pbPkgSize];
    fill(inputStream, pbPkg);

    //init decoder
    if (decoder == null) {
        decoder = new OpusDecoder();
    }
    //decode
    short[] pcm = decoder.decodeTTSData(seq, pbPkg);

    //enqueue
    //save pcm to a list or a buffer

    //end
    if (seq == -1) {
        break;
    }
}


private static boolean fill(InputStream in, byte[] buffer) throws IOException {
        int length = buffer.length;
        int hasRead = 0;
        while (true) {
            int offset = hasRead;
            int count = length - hasRead;
            int currentRead = in.read(buffer, offset, count);
            if (currentRead >= 0) {
                hasRead += currentRead;
                if (hasRead == length) {
                    return true;
                }
            }
            if (currentRead == -1) {
                return false;
            }
        }
    }

Pcm

Codec 参数为 pcm 返回,Python 示例如下:

def tts_wav_demo():
    header = {
        "Content-Type": "application/json",
        "Authorization": "VVerIEkz2OAuSBdNtH6W2M1xal0="
    }
    request_data = {
        "AppId": 1255824371,
        "SecretId": "AKID3wb2D8EFRN3UgbwengJ5h5E8nERPabqq",
        "PrimaryLanguage": 1,
        "SampleRate": 16000,
        "Action": "TextToStreamAudio",
        "VoiceType": 1,
        "Text": "我只是拿来测试的文本",
        "SessionId": "session-1234",
        "Timestamp": 1535362116,
        "Expired": 1535365716,
        "Speed": 0,
        "Volume": 0
    }
    r = requests.post(req_url, headers=header, data=request_data)
    file_object = open("./test.pcm", "w")
    for chunk in r.iter_content(1000):
        file_object.write(chunk)
    file_object.close()

Codec 参数为 pcm 返回,Java 示例如下:


import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.security.InvalidKeyException;
import java.security.NoSuchAlgorithmException;
import java.util.*;

public class TtsController {
    private static final String SERVER_URL = "https://aai.cloud.tencent.com/tts";
    private static final String DOMAIN_NAME ="aai.cloud.tencent.com/tts";
    private static final String SECRET_KEY = "*******************************";
    private static final String SECRET_ID = "*******************************";

    public static void main(String[] args){
        TreeMap<String, String> requestMap = new TreeMap<>();
        requestMap.put("Action","TextToStreamAudio");
        requestMap.put("Codec", "pcm");
        requestMap.put("Text", "我只是拿来测试的文本");
        requestMap.put("SessionId", "session-1234");
        requestMap.put("AppId", "1255824371");
        requestMap.put("Timestamp", ""+ System.currentTimeMillis()/1000L);
        requestMap.put("Expired", ""+ (System.currentTimeMillis()/1000L + 600));
        requestMap.put("SecretId", SECRET_ID);
        new WorkingThread(requestMap).start();
    }

    private static final class WorkingThread extends Thread {

        private TreeMap<String, String> requestMap;
        private String mRequestBody;

        WorkingThread(TreeMap<String, String> requestMap) {
            this.requestMap = requestMap;
            mRequestBody = "{\n" +
                    "\t\"Action\":\""+ requestMap.get("Action") +"\",\n" +
                    "\t\"Codec\":\""+ requestMap.get("Codec")+"\",\n" +
                    "\t\"Text\":\"" + requestMap.get("Text") + "\",\n" +
                    "\t\"SessionId\":\""+requestMap.get("SessionId")+"\",\n" +
                    "\t\"Timestamp\":"+requestMap.get("Timestamp")+",\n" +
                    "\t\"Expired\":"+requestMap.get("Expired")+",\n" +
                    "\t\"SecretId\":\""+requestMap.get("SecretId")+"\",\n" +
                    "\t\"AppId\":"+requestMap.get("AppId") +"\n" +
                    "}";
            System.out.println(mRequestBody);
        }

        @Override
        public void run() {
            InputStream inputStream = null;
            try {
                inputStream = obtainResponseStreamWithJava(mRequestBody, requestMap);
                processProtocolBufferStream(inputStream);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        private void processProtocolBufferStream(final InputStream inputStream) throws FileNotFoundException {
            FileOutputStream fos = new FileOutputStream("test.pcm",true);
            boolean fillSuccess;

            while (true) {
                byte[] pcmData = new byte[1024];
                try {
                    fillSuccess = fill(inputStream, pcmData);
                    fos.write(pcmData);
                    if(!fillSuccess){
                        fos.flush();
                        fos.close();
                        break;
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }

    }

    private  static InputStream obtainResponseStreamWithJava(String postJsonBody, TreeMap<String,String> requestMap) throws IOException {
        //发送POST请求
        URL url = new URL(SERVER_URL);
        HttpURLConnection conn = (HttpURLConnection)url.openConnection();
        String authorization = generateSign(requestMap);
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setRequestProperty("Authorization", authorization);
        conn.connect();
        OutputStream out = conn.getOutputStream();
        out.write(postJsonBody.getBytes(StandardCharsets.UTF_8));
        out.flush();
        out.close();

        return conn.getInputStream();
    }

    private static String generateSign(TreeMap<String, String> params)  {
        String paramStr = "POST"+ DOMAIN_NAME + "?";
        StringBuilder builder = new StringBuilder(paramStr);
        for (Map.Entry<String, String> entry: params.entrySet()) {
            builder.append(String.format(Locale.CHINESE,"%s=%s", entry.getKey(), String.valueOf(entry.getValue())))
                    .append("&");
        }

        //去掉最后一个&
        builder.deleteCharAt(builder.lastIndexOf("&"));

        String sign = "";
        String source = builder.toString();
        System.out.println(source);
        Mac mac = null;
        try {
            mac = Mac.getInstance("HmacSHA1");
            SecretKeySpec keySpec = new SecretKeySpec(SECRET_KEY.getBytes(), "HmacSHA1");
            mac.init(keySpec);
            mac.update(source.getBytes());
            sign = Base64.getEncoder().encodeToString(mac.doFinal());
        } catch (NoSuchAlgorithmException | InvalidKeyException e) {
            e.printStackTrace();
        }

        System.out.println("生成签名串:" + sign);
        return sign;
    }

    /**
     * 从 InputStream 读取内容到 buffer, 直到 buffer 填满
     * @return 如果 InputStream 内容不足以填满 buffer, 则返回 false.
     * @throws IOException 可能抛出的异常
     */
    private static boolean fill(InputStream in, byte[] buffer) throws IOException {
        int length = buffer.length;
        int hasRead = 0;
        while (true) {
            int offset = hasRead;
            int count = length - hasRead;
            int currentRead = in.read(buffer, offset, count);
            if (currentRead >= 0) {
                hasRead += currentRead;
                if (hasRead == length) {
                    return true;
                }
            }
            if (currentRead == -1) {
                return false;
            }
        }
    }

}