SP Module 6 Speech Synthesis – Waveform Generation and Connected Speech

杨丝儿
Published 2022-11-24 17:16:50
This article is part of the column: 杨丝儿的小站

Diphone

Phones are not a suitable unit for waveform concatenation, so we use diphones, which capture co-articulation.

A diphone starts at the middle of one phone and ends at the middle of the next.

Coarticulation is the overlapping of adjacent articulations, or the influence of the target phoneme on surrounding phonemes. Because of coarticulation, the middles of phones are more stable in their spectral properties than the edges. Concatenating diphones places the joins in these stable regions, so it should lead to smoother joins.
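As a minimal sketch of the unit inventory this implies, the snippet below turns a phone sequence into the diphone sequence needed to synthesise it. The function name `phones_to_diphones`, the `sil` silence label, and the `left-right` naming scheme are assumptions made for illustration, not part of the original notes.

```python
def phones_to_diphones(phones, silence="sil"):
    """Return the diphone names needed for a phone sequence.

    The sequence is padded with silence at both ends, so an utterance
    of N phones needs N + 1 diphones.
    """
    padded = [silence] + list(phones) + [silence]
    # Each diphone runs from the middle of one phone to the middle of the next.
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


print(phones_to_diphones(["k", "ae", "t"]))
# ['sil-k', 'k-ae', 'ae-t', 't-sil']
```

Each name stands for one recorded unit whose boundaries fall in the stable middles of the two phones.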

Waveform concatenation

Concatenation of waveforms is a simple way of making synthetic speech, but we need to take care about how we do it.

  1. A discontinuity in the waveform at the join causes an audible pop.
  2. Misaligned pitch periods at the join break the periodicity and cause glitches.

Overlap-add

Cross-fading between two waveforms is an effective way to avoid some of the artefacts of concatenation.
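The following is a minimal sketch of a cross-fade join, assuming waveforms are plain Python lists of samples and using a raised-cosine fade. The function name `crossfade_concatenate` and the choice of fade shape are illustrative assumptions; in practice a numerical library would normally be used.

```python
import math


def crossfade_concatenate(a, b, overlap):
    """Join waveforms a and b, cross-fading over `overlap` samples."""
    if not 0 < overlap <= min(len(a), len(b)):
        raise ValueError("overlap must be between 1 and the shorter length")
    out = list(a[:-overlap])  # part of a that is left untouched
    for i in range(overlap):
        # Raised-cosine fade: fade_in rises from near 0 to near 1,
        # and the two weights always sum to 1.
        fade_in = 0.5 - 0.5 * math.cos(math.pi * (i + 0.5) / overlap)
        out.append((1.0 - fade_in) * a[len(a) - overlap + i] + fade_in * b[i])
    out.extend(b[overlap:])  # part of b that is left untouched
    return out


a = [1.0] * 10
b = [-1.0] * 10
joined = crossfade_concatenate(a, b, overlap=4)
print(len(joined))  # 16
```

A hard join of `a` and `b` would jump by 2.0 in a single sample; with the cross-fade the same change is spread over the overlap region.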

Pitch period

This fundamental building block of speech waveforms offers a route to source-filter separation in the time domain.

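In a voiced speech waveform, each pitch period looks like the response to one impulse, and consecutive responses overlap because one has not fully died away before the next begins.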

Extract one pitch period unit for each pitch mark, using a tapered window centred on the mark. Each extracted unit has a duration of twice the fundamental period $T_0$.

Overlap-add the extracted units at their original pitch mark positions to obtain a reconstructed signal that is very similar to the original one. The whole process is called copy synthesis.
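The extract-then-overlap-add procedure can be sketched as below. This assumes the pitch marks are already known (as sample indices) and uses a triangular taper, chosen here only because adjacent triangular windows sum to one; the notes do not say which window is used. The function names `extract_units` and `overlap_add` are hypothetical.

```python
import math


def extract_units(signal, pitch_marks):
    """Cut one tapered unit per interior pitch mark.

    Each unit spans from the previous mark to the next mark, i.e. about
    two pitch periods. Returns (offset_of_centre, samples) pairs.
    """
    units = []
    for k in range(1, len(pitch_marks) - 1):
        left, centre, right = pitch_marks[k - 1], pitch_marks[k], pitch_marks[k + 1]
        unit = []
        for n in range(left, right):
            # Triangular taper: 0 at the neighbouring marks, 1 at the centre.
            if n < centre:
                w = (n - left) / (centre - left)
            else:
                w = (right - n) / (right - centre)
            unit.append(w * signal[n])
        units.append((centre - left, unit))
    return units


def overlap_add(units, centres, length):
    """Sum the units into an output of `length` samples at the given centres."""
    out = [0.0] * length
    for (offset, unit), centre in zip(units, centres):
        start = centre - offset
        for i, value in enumerate(unit):
            if 0 <= start + i < length:
                out[start + i] += value
    return out


# Toy periodic signal with a period of 50 samples and one mark per period.
signal = [math.sin(2 * math.pi * n / 50) for n in range(500)]
marks = list(range(0, 500, 50))
units = extract_units(signal, marks)
copy = overlap_add(units, marks[1:-1], len(signal))
```

Placing every unit back at its original centre reproduces the input between the first and last interior marks; the edges are not covered by two windows, so they are attenuated.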

TD-PSOLA

Applying overlap-add techniques to pitch period waveforms allows the modification of F0 and duration without changing the phone identity.

TD-PSOLA stands for time-domain pitch-synchronous overlap-and-add.

To raise F0, move the pitch periods closer to each other.

To lower F0, move the pitch periods further apart from each other.

To increase duration, make a copy of one pitch period and insert it into the sequence.

To decrease duration, delete one pitch period.
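The four operations above can be combined into a single placement plan. The sketch below only decides which original pitch period unit goes where; the units themselves would then be overlap-added at the new centres. The function name `psola_plan`, its parameters, and the nearest-mark selection rule are assumptions made for illustration, not a description of any particular implementation.

```python
def psola_plan(pitch_marks, f0_scale=1.0, duration_scale=1.0):
    """Return (source_index, new_centre) pairs for the modified signal.

    f0_scale > 1 places units closer together (higher F0), < 1 further apart.
    duration_scale > 1 repeats units (longer), < 1 skips units (shorter).
    """
    if len(pitch_marks) < 2:
        raise ValueError("need at least two pitch marks")
    if f0_scale <= 0 or duration_scale <= 0:
        raise ValueError("scale factors must be positive")
    start, end = pitch_marks[0], pitch_marks[-1]
    new_end = start + (end - start) * duration_scale
    plan = []
    t = float(start)
    while t <= new_end:
        # Map the output time back to the input and pick the nearest mark.
        src_time = start + (t - start) / duration_scale
        src = min(range(len(pitch_marks)),
                  key=lambda i: abs(pitch_marks[i] - src_time))
        plan.append((src, round(t)))
        # Local pitch period of the chosen unit, rescaled to set the new spacing.
        if src + 1 < len(pitch_marks):
            period = pitch_marks[src + 1] - pitch_marks[src]
        else:
            period = pitch_marks[src] - pitch_marks[src - 1]
        t += period / f0_scale
    return plan


marks = [0, 100, 200, 300, 400]
print(psola_plan(marks))                      # unchanged: one unit per mark
print(psola_plan(marks, f0_scale=2.0))        # units 50 samples apart
print(psola_plan(marks, duration_scale=2.0))  # each unit used twice
print(psola_plan(marks, duration_scale=0.5))  # every other unit dropped
```

Because the units are moved, copied, or dropped whole, the shape of each pitch period is kept, which is why the phone identity is unchanged.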

Diphone synthesis:

  • One recording of every diphone (small database)
  • Use signal processing methods to change F0, duration, and smooth joins to match linguistic specification
  • e.g. TD-PSOLA

Unit Selection

Unit selection:

  • Record a large naturalistic database
  • Select diphone units based on closeness to the linguistic specification
  • If the database has enough variation, don’t worry (too much) about signal processing!

Choice of units to concatenate depends on:

  • Target cost: how well the unit matches the linguistic specification
  • Join cost: how well edges of the units match
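A minimal sketch of how the two costs can be combined is shown below: a dynamic programming search that picks one candidate per target position so that the sum of target costs and join costs is smallest. The toy units, the F0-based cost functions, and the name `select_units` are all hypothetical; real systems use richer features and cost definitions.

```python
def select_units(candidates, target_cost, join_cost):
    """Pick one unit per position, minimising total target + join cost.

    candidates: list (one entry per target position) of candidate lists.
    Returns (total_cost, chosen_units).
    """
    # best holds, per candidate at the current position,
    # (cost of the cheapest path ending there, that path).
    best = [(target_cost(0, u), [u]) for u in candidates[0]]
    for pos in range(1, len(candidates)):
        new_best = []
        for u in candidates[pos]:
            cost, path = min(
                ((c + join_cost(p[-1], u), p) for c, p in best),
                key=lambda item: item[0],
            )
            new_best.append((cost + target_cost(pos, u), path + [u]))
        best = new_best
    return min(best, key=lambda item: item[0])


# Toy units: (name, F0 at the start of the unit, F0 at the end of the unit).
targets = [120, 130]  # desired mean F0 per position (made-up values)
candidates = [
    [("a1", 118, 119), ("a2", 120, 150)],
    [("b1", 129, 131), ("b2", 121, 140)],
]


def target_cost(pos, unit):
    # How far the unit's mean F0 is from the specification.
    return abs((unit[1] + unit[2]) / 2 - targets[pos])


def join_cost(prev, unit):
    # How badly F0 mismatches across the join.
    return abs(prev[2] - unit[1])


total, chosen = select_units(candidates, target_cost, join_cost)
print(total, [u[0] for u in chosen])  # 4.0 ['a1', 'b2']
```

In this toy case `b1` matches the specification best, but `b2` is selected because it joins much more smoothly to `a1`.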

Convolution

A non-mathematical illustration of the equivalence of convolution (in the time domain), multiplication of magnitude spectra, and addition of log magnitude spectra.
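The equivalence can also be checked numerically. The sketch below convolves a toy source (two impulses) with a toy filter (a short decaying impulse response), then compares magnitude spectra computed with a naive DFT. The signals and function names are made up for illustration, and the naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def convolve(x, h):
    """Direct time-domain convolution of two sample lists."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            out[i + j] += xv * hv
    return out


def magnitude_spectrum(x, n):
    """Magnitudes of the n-point DFT of x, zero-padded to n samples."""
    padded = list(x) + [0.0] * (n - len(x))
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * t / n)
                for t, v in enumerate(padded)))
        for k in range(n)
    ]


source = [1.0, 0.0, 0.0, 0.0, 1.0]  # toy impulse train
filt = [1.0, 0.5, 0.25]             # toy impulse response
y = convolve(source, filt)

n = len(y)
Y = magnitude_spectrum(y, n)
S = magnitude_spectrum(source, n)
H = magnitude_spectrum(filt, n)

for k in range(n):
    # Convolution in time <-> multiplication of magnitude spectra
    # <-> addition of log magnitude spectra.
    print(k, round(Y[k], 6), round(S[k] * H[k], 6),
          round(math.log(S[k]) + math.log(H[k]), 6))
```

Zero-padding all three signals to the length of the convolved output is what makes the spectra line up bin by bin.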

[Figure: convolution in the time domain and the corresponding magnitude and log magnitude spectra]

Summary

[Figure: summary]
Originally published on 2022-10-22 on the author's personal site/blog.
