Lip contour description based on orthogonal transform in visual driven speech synthesis system

LI Gang; WANG Meng-jun; LIN Ling; ZENG Rui-li


Updated: 2020-08-12

- Optics and Precision Engineering, Vol. 15, Issue 7, Pages 1117-1123 (2007)
- Author affiliations:
  1. School of Precision Instrument and Opto-Electronics Engineering, Tianjin University, Tianjin 300072
  2. Academy of Military Transportation, Tianjin 300161
- CLC: TN912.34
- Received: 18 October 2006; Revised: 25 December 2006; Published Online: 30 July 2007; Published: 30 July 2007
LI Gang, WANG Meng-jun, LIN Ling, et al. Lip contour description based on orthogonal transform in visual driven speech synthesis system[J]. Optics and precision engineering, 2007, 15(7): 1117-1123.

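Abstract (Chinese)

To obtain the lip contour features required by a lip-reading system automatically and quickly, an orthogonal compression transform is proposed for lip contour feature extraction, and the resulting lip contour curves are analyzed. Fourier descriptors and discrete cosine transform descriptors of the lip contour are obtained through the Discrete Fourier Transform (DFT) and the Discrete Cosine Transform (DCT), respectively. Both kinds of descriptors are then used as feature vectors of the lip contour, and a Hidden Markov Model (HMM) is used for training and recognition. Experiments on the pronunciation of isolated Chinese characters show that, to reach a recognition rate of 40%, 15 DCT descriptors are needed to characterize the lip contour, compared with 20 Fourier descriptors. For the same recognition performance, fewer DCT descriptors than Fourier descriptors are required, which reduces the amount of computation and the time it takes.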

Abstract

To describe lip contours in a lip-reading system automatically and quickly, an orthogonal compression transform was applied to lip contour feature extraction. The Discrete Fourier Transform (DFT) and the Discrete Cosine Transform (DCT) were used to obtain descriptors of the lip contours in the asymmetrical lip contour model. A Hidden Markov Model (HMM) was then trained using the two kinds of descriptors as feature vectors of the lip contours. Experiments on isolated Chinese words show that, at the same recognition rate of 40%, 15 DCT descriptors are needed compared with 20 DFT descriptors. The experiments also show that, at the same recognition rate, the DCT noticeably reduces the amount of computation and the time required.
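
The descriptor extraction step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the contour is parameterized, so the sketch assumes the contour is reduced to a radial signature (distance from the lip center sampled at equally spaced angles), and the synthetic shape in `radial_signature` is purely hypothetical. The function names are illustrative as well. The sketch computes the first K DFT magnitudes and the first K orthonormal DCT-II coefficients, and measures how well a truncated DCT set reconstructs the signature.

```python
import math

def radial_signature(n=64):
    """Hypothetical asymmetric lip-like contour: radius sampled at n angles."""
    sig = []
    for i in range(n):
        t = 2 * math.pi * i / n
        # Wider than tall, with upper and lower halves that differ (illustrative only).
        r = 1.0 + 0.5 * math.cos(2 * t) - 0.15 * math.sin(t) + 0.05 * math.cos(3 * t)
        sig.append(r)
    return sig

def dft_descriptors(x, k):
    """Magnitudes of the first k DFT coefficients, scaled by 1/N."""
    n = len(x)
    out = []
    for u in range(k):
        re = sum(x[i] * math.cos(2 * math.pi * u * i / n) for i in range(n))
        im = -sum(x[i] * math.sin(2 * math.pi * u * i / n) for i in range(n))
        out.append(math.hypot(re, im) / n)
    return out

def _dct_scale(u, n):
    """Normalization factor that makes the DCT-II basis orthonormal."""
    return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)

def dct_descriptors(x, k):
    """First k orthonormal DCT-II coefficients."""
    n = len(x)
    return [
        _dct_scale(u, n)
        * sum(x[i] * math.cos(math.pi * (2 * i + 1) * u / (2 * n)) for i in range(n))
        for u in range(k)
    ]

def idct_truncated(c, n):
    """Rebuild n samples from the DCT coefficients in c (missing ones count as zero)."""
    return [
        sum(_dct_scale(u, n) * cu * math.cos(math.pi * (2 * i + 1) * u / (2 * n))
            for u, cu in enumerate(c))
        for i in range(n)
    ]

def rms_error(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

if __name__ == "__main__":
    sig = radial_signature(64)
    dct15 = dct_descriptors(sig, 15)   # descriptor counts taken from the abstract
    dft20 = dft_descriptors(sig, 20)
    print(len(dct15), len(dft20))
    print(rms_error(sig, idct_truncated(dct15, len(sig))))
```

In a recognition pipeline of the kind the abstract describes, one such descriptor vector would be computed per video frame and the resulting sequence passed to the HMM; that stage is not shown here. The descriptor counts 15 and 20 come from the abstract, but this toy signature says nothing about the recognition rates reported there.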
