Paper reading: "Transform Mapping Using Shared Decision Tree Context Clustering for HMM-based Cross-Lingual Speech Synthesis" (2)
3 Cross-lingual speaker adaptation using STC with a bilingual corpus
First paragraph:
- In the state mapping technique described in the previous section, the mismatch of language characteristics affects the mapping performance of transformation matrices because only the acoustic features are taken into account in the KLD-based mapping construction. To improve the mapping performance, we use not only acoustic features but also contextual factors when constructing the transform mapping.
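- I already read these two sentences yesterday.
- For now, don't get bogged down in the details. Aim for a reasonably deep understanding of each sentence, but don't over-analyze.
- Put simply, both this paper and Yijian Wu's paper do state mapping.
- Yijian Wu's method is KLD-based and considers only acoustic features.
- This paper's method is STC-based and considers both acoustic features and contextual factors.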
- By using contextual factors, we can also take articulation manners and suprasegmental features into account for the mapping construction.
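- Articulation manners
- Suprasegmental features
- What exactly are suprasegmental features?
- I have been working on speech for a while now, and I am still not quite clear on what suprasegmental features are.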
- In this section, we propose a novel transform mapping technique based on shared decision tree context clustering (STC) for cross-lingual speaker adaptation, which can reduce the influences of the speaker and language mismatches between average voice models of input and output languages.
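- This paper proposes a novel state mapping technique based on the STC tree.
- Wait, now I remember: the STC tree is not the authors' own invention. It had already been proposed for use in SAT. When the adaptation data comes from several speakers who differ considerably from one another, STC can train a better average voice model.
- Still, the authors deserve credit: earlier state mapping work was all KLD-based, and taking STC, a technique from SAT, and applying it to state mapping is impressive.
- First, they understand state mapping very well.
- Second, they clearly understand what the weaknesses of KLD-based state mapping are.
- They clearly understand the whole STC procedure.
- They clearly understand that the strengths of STC can overcome the weaknesses of KLD.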
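
To make the KLD-based baseline concrete, here is a minimal sketch of an acoustic-only state mapping. It is my own illustration, not code from either paper. It assumes each state is a single diagonal-covariance Gaussian, uses the symmetric KL divergence as the distance, and maps each output-language state to its nearest input-language state. The function names, the mapping direction, and the toy numbers are all hypothetical. Note that nothing in it looks at phonetic or prosodic context, which is the weakness the STC-based method targets.

```python
import numpy as np


def kld_diag(mu_p, var_p, mu_q, var_q):
    """KL(p || q) between two diagonal-covariance Gaussians."""
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )


def symmetric_kld(mu_p, var_p, mu_q, var_q):
    """Symmetric KL divergence: KL(p || q) + KL(q || p)."""
    return kld_diag(mu_p, var_p, mu_q, var_q) + kld_diag(mu_q, var_q, mu_p, var_p)


def kld_state_mapping(in_means, in_vars, out_means, out_vars):
    """For each output-language state, return the index of the
    input-language state with the smallest symmetric KLD.
    Only acoustic statistics are used (assumption: one Gaussian per state)."""
    mapping = []
    for mu_o, var_o in zip(out_means, out_vars):
        dists = [
            symmetric_kld(mu_o, var_o, mu_i, var_i)
            for mu_i, var_i in zip(in_means, in_vars)
        ]
        mapping.append(int(np.argmin(dists)))
    return mapping


# Toy example with made-up 2-dimensional state statistics.
in_means = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
in_vars = np.ones((3, 2))
out_means = np.array([[4.5, 5.2], [0.3, -0.1], [9.0, 1.0]])
out_vars = np.ones((3, 2))

mapping = kld_state_mapping(in_means, in_vars, out_means, out_vars)
print(mapping)
```

Because the distance depends only on means and variances, two states that are acoustically close but belong to very different contexts can still be mapped to each other. As I understand the quoted passage, the STC-based approach avoids this by also using contextual factors when building the mapping.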