Exploiting Speech Production Knowledge for Speech Technology Applications



In the overview article cited below, we review the potential for incorporating direct or inferred speech production knowledge into speech technology development. We survey the technologies that can be used to capture speech articulation information, discuss how meaningful speech and speaker representations can be derived from the articulatory data thus captured, and describe how such representations can be estimated from the acoustics alone when direct measurements are unavailable. We present applications that have used speech production information to advance the state of the art in automatic speech and speaker recognition, and we offer an outlook on how such knowledge and applications can in turn inform scientific understanding of the human speech communication process.

  • Vikram Ramanarayanan, Prasanta Ghosh, Adam Lammert and Shrikanth S. Narayanan (2012), Exploiting speech production information for automatic speech and speaker modeling and recognition – possibilities and new opportunities, in proceedings of: APSIPA 2012, Los Angeles, CA, Dec 2012 [pdf].


Extracting Discriminative Information from Speech


The speech signal at the acoustic level has a much higher bit rate (e.g., 64 kbits/sec assuming an 8 kHz sampling rate and 8 bits/sample encoding) than the underlying sound patterns, which carry an information rate of less than 100 bits/sec (Atal, 1999). This large redundancy in the speech signal means that we first need to extract a lower-dimensional representation of the signal that best captures the discriminative information required for the task at hand. For example, for a phone discrimination task, we would want a representation that captures the differences between the various sounds of a language.
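
As a rough illustration of this bit-rate gap and of what a conventional lower-dimensional acoustic representation looks like, here is a minimal Python sketch; the use of librosa, MFCCs, and the synthetic stand-in signal are our own illustrative choices, not part of the cited work.

```python
import numpy as np
import librosa  # assumed here purely for illustration

# Bit-rate arithmetic from the text: 8 kHz sampling x 8 bits/sample = 64 kbits/sec,
# versus a phonemic information rate of under ~100 bits/sec (Atal, 1999).
acoustic_rate = 8_000 * 8          # 64,000 bits/sec
phonemic_rate = 100                # < 100 bits/sec
print(f"redundancy factor ~ {acoustic_rate / phonemic_rate:.0f}x")  # ~640x

# One conventional lower-dimensional representation: 13 MFCCs per 10 ms frame.
sr = 8_000
y = np.random.randn(2 * sr).astype(np.float32)   # 2 s stand-in waveform
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, hop_length=80)  # 10 ms hop at 8 kHz
print(mfcc.shape)  # (13, n_frames): far fewer numbers per second than the raw samples
```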

In this work we explore the nature of low-dimensional representations derived directly from articulatory signals using sparsity constraints. Specifically, we present a method to examine how well the derived representations of “primitive movements” of speech articulation can be used to classify broad phone categories. We show that features derived entirely from the activations of these primitive movements achieve an accuracy of about 80% on an interval-based phone classification task.
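
The sketch below is only a schematic stand-in for the sparsity-constrained formulation used in the papers: it substitutes scikit-learn's DictionaryLearning for the actual factorization, uses synthetic data in place of real articulatory trajectories, and simply illustrates the pipeline of learning primitives, keeping their activations, and classifying broad classes from those activations alone.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning   # generic sparsity-constrained stand-in
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in data: rows are short windows of articulator trajectories (e.g. stacked
# tongue/lip/jaw coordinates over a few frames); labels are broad phone classes.
X = rng.standard_normal((500, 60))     # 500 windows x 60 articulatory dimensions
y = rng.integers(0, 5, size=500)       # 5 hypothetical broad classes

# Learn a small dictionary of "primitive movements" with a sparsity penalty,
# then represent each window purely by its activation (sparse code) vector.
dico = DictionaryLearning(n_components=20, alpha=1.0, max_iter=50, random_state=0)
activations = dico.fit_transform(X)

# Classify broad phone categories from the activations alone.
A_tr, A_te, y_tr, y_te = train_test_split(activations, y, test_size=0.3, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(A_tr, y_tr)
print("held-out accuracy:", clf.score(A_te, y_te))  # ~chance here, since the data are random
```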

For more details, please refer to:

  • Vikram Ramanarayanan, Maarten Van Segbroeck, and Shrikanth S. Narayanan (2015), Directly data-derived articulatory gesture-like representations retain discriminatory information about phone categories, in: Computer Speech and Language. [link]

  • Vikram Ramanarayanan, Maarten Van Segbroeck and Shrikanth Narayanan (2013). On the nature of data-driven primitive representations of speech articulation, in proceedings of: Interspeech 2013 Workshop on Speech Production in Automatic Speech Recognition (SPASR), Lyon, France, Aug 2013 [pdf].

Speaker Recognition


We proposed a practical, feature-level fusion approach for speaker verification using information from both acoustic and articulatory signals. We found that concatenating articulation features obtained from actual speech production data with conventional Mel-frequency cepstral coefficients (MFCCs) improved the overall speaker verification performance.

However, since access to actual speech production data is impractical for real-world speaker verification applications, we also experimented with acoustic-to-articulatory inversion techniques for estimating articulatory information. Specifically, we showed that augmenting MFCCs with features obtained from subject-independent acoustic-to-articulatory inversion also significantly enhanced speaker verification performance. Experimental results on the Wisconsin X-Ray Microbeam database showed that the proposed speech-and-inverted-articulation fusion approach significantly outperformed the traditional speech-only baseline, providing up to a 10% relative reduction in Equal Error Rate (EER).
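
A minimal sketch of the feature-level fusion step, assuming frame-aligned streams; the helper name, feature dimensions, and random stand-in data below are ours, and in the actual system the articulatory stream would come from measured pellet trajectories or from a subject-independent acoustic-to-articulatory inversion model rather than random numbers.

```python
import numpy as np

def fuse_features(mfcc, articulatory):
    """Frame-level fusion by simple concatenation.

    mfcc:          (n_frames, n_mfcc) acoustic features
    articulatory:  (n_frames, n_artic) measured or inverted articulatory features
    Returns a (n_frames, n_mfcc + n_artic) fused feature matrix.
    """
    assert mfcc.shape[0] == articulatory.shape[0], "streams must be frame-aligned"
    return np.hstack([mfcc, articulatory])

# Illustrative shapes only: 13 MFCCs and, say, 14 articulator coordinates
# (x/y positions of several tongue, lip, and jaw points) per frame.
n_frames = 300
mfcc = np.random.randn(n_frames, 13)
artic_hat = np.random.randn(n_frames, 14)   # would come from an inversion model in practice

fused = fuse_features(mfcc, artic_hat)
print(fused.shape)   # (300, 27) -> fed to a standard speaker-verification back end
```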

For more details, please see:

  • Ming Li, Jangwon Kim, Prasanta Ghosh, Vikram Ramanarayanan and Shrikanth Narayanan (2013). Speaker verification based on fusion of acoustic and articulatory information, in proceedings of: Interspeech 2013, Lyon, France, Aug 2013 [pdf].

Articulatory Recognition


Real-time MRI is a rich source of information about the entire vocal tract, not just certain articulatory landmarks, and rt-MRI corpora have the potential to keep growing in size, covering a large variety of speakers and speaking styles. In this work, we investigated an articulatory representation based on full vocal tract shapes and employed an articulatory recognition framework to analyze its merits and drawbacks. We argue that articulatory recognition can serve as a general validation tool for real-time MRI based articulatory representations.
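
The cited work uses a full articulatory recognition framework; the sketch below is a much-simplified, hypothetical proxy for the underlying validation idea: reduce each rt-MRI frame's vocal tract shape to a compact feature and check how well articulatory classes can be recognized from it, so that higher recognition accuracy indicates a more informative representation. The data, class inventory, and classifier choices here are all assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Stand-in for rt-MRI derived vocal tract shapes: each row is one video frame's
# flattened shape descriptor; labels are hypothetical articulatory classes.
X = rng.standard_normal((1000, 200))   # 1000 frames x 200 shape dimensions
y = rng.integers(0, 8, size=1000)      # 8 hypothetical articulatory classes

# "Recognition accuracy as validation": a representation that preserves more
# articulatory detail should support higher cross-validated recognition scores.
model = make_pipeline(PCA(n_components=20), LinearSVC(max_iter=5000))
scores = cross_val_score(model, X, y, cv=5)
print("mean cross-validated accuracy:", scores.mean())
```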

For more details, please refer to:

  • Athanasios Katsamanis, Erik Bresch, Vikram Ramanarayanan and Shrikanth Narayanan (2011). Validating rt-MRI based articulatory representations via articulatory recognition, in proceedings of: Interspeech 2011, Florence, Italy, Aug 2011 [pdf].