Developing Predictor Surfaces for Vowels and Voiced Fricatives for Lip Synchronization

Date

2001-06-11

Abstract

This paper describes a method for constructing predictor surfaces for mouth parameters using Delaunay triangulation. The first and second moments of the input speech signal are mapped to the shape of the mouth. Predictor surfaces are built for four external shape parameters of the mouth and cover vowels and some voiced fricatives. Also described is a method for producing real-time animation synchronized with sound for vowel and voiced fricative utterances, with or without silence, and for sequences of such utterances. The content or kind of speech is not known in advance; spectral analysis is used to classify the type of sound. The system is trained on voiced samples, each containing a single sound uttered without any mouth movement. All extreme mouth positions and frequently occurring mouth shapes are taken into consideration during training. Relative values of the mouth parameters are set for these training sounds, and interpolatory surfaces built from this known data are used to predict the parameter values for future recordings. Hermite cubic polynomials are used to generate the shapes necessary to depict a human mouth and jaw. The voices of three speakers are recorded, and the surfaces obtained for these speakers are compared. The result is a speaker-dependent lip synchronization system that generates animation for vowel and voiced fricative utterances.
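
The mapping from spectral moments to mouth parameters can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the moments are taken over the power spectrum of a frame, uses SciPy's Delaunay-based piecewise-linear interpolator as the predictor surface, and the training values and the four unnamed parameters are hypothetical placeholders.

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator


def spectral_moments(signal, sample_rate):
    """Return the first moment and second central moment of the power spectrum.

    Assumption: the abstract does not say how the moments are defined; taking
    them over the power spectrum of one frame is one plausible reading.
    """
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = power.sum()
    m1 = (freqs * power).sum() / total
    m2 = (((freqs - m1) ** 2) * power).sum() / total
    return m1, m2


def build_predictor(train_moments, train_params):
    """Build a piecewise-linear surface over a Delaunay triangulation.

    train_moments: (n, 2) array of (first, second) moments per training sound.
    train_params:  (n, 4) array of relative mouth parameters set by hand.
    """
    triangulation = Delaunay(train_moments)
    return LinearNDInterpolator(triangulation, train_params)


# Hypothetical training set in normalized moment units (not data from the thesis).
train_moments = np.array(
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
)
# Four placeholder mouth parameters per training sound, chosen to vary
# linearly with the moments so the interpolated values are easy to check.
m1, m2 = train_moments[:, 0], train_moments[:, 1]
train_params = np.column_stack([m1, m2, m1 + m2, 1.0 - m1])

surface = build_predictor(train_moments, train_params)

# Predict the four parameters for a new frame whose moments fall inside
# the triangulated region; queries outside the convex hull return NaN.
predicted = surface(np.array([[0.25, 0.5]]))
```

In a real system the moments would come from `spectral_moments` applied to each incoming audio frame and would need rescaling to a common range before triangulation, since the two moments have different units. Frames that fall outside the convex hull of the training sounds would need separate handling, for example by falling back to the nearest trained shape.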

Degree

MS

Discipline

Computer Science
