RMIT University

Automatic visual speech segmentation and recognition using directional motion history images and Zernike moments

journal contribution
posted on 2024-11-01, 12:31 authored by Ayaz Ahmed Shaikh, Dinesh KumarDinesh Kumar, Jayavardhana Gubbi
An appearance-based visual speech recognition technique that uses only video signals is presented. The proposed technique is based on directional motion history images (DMHIs), an extension of the popular optical-flow method for object tracking. Zernike moments of each DMHI are computed in order to perform the classification. The technique incorporates automatic temporal segmentation of isolated utterances, achieved using pair-wise pixel comparison. A support vector machine is used for classification, and the results are based on the leave-one-out paradigm. Experimental results show that the proposed technique achieves better viseme recognition performance than other methods reported in the literature. A benefit of the proposed method is that it is suitable for real-time applications, owing to its fast motion tracking and classification. It has applications in command and control through lip-movement-to-text conversion, can be used in noisy environments, and can also assist speech-impaired persons.
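The abstract's temporal segmentation step (pair-wise pixel comparison between consecutive frames) and the motion history image underlying DMHIs can be sketched as follows. This is an illustrative simplification, not the authors' implementation: the thresholds are arbitrary assumptions, and a plain (non-directional) MHI is shown, whereas the paper decomposes optical flow into directional components and then computes Zernike moments on each DMHI.

```python
import numpy as np

def pairwise_pixel_change(prev, curr, diff_thresh=15):
    """Fraction of pixels whose intensity changed by more than diff_thresh
    between two consecutive grayscale frames (pair-wise pixel comparison)."""
    return np.mean(np.abs(curr.astype(int) - prev.astype(int)) > diff_thresh)

def segment_utterance(frames, activity_thresh=0.01, diff_thresh=15):
    """Return (start, end) frame indices bounding the span where inter-frame
    motion exceeds activity_thresh, or None if no motion is detected.
    Thresholds are illustrative, not taken from the paper."""
    activity = [pairwise_pixel_change(a, b, diff_thresh)
                for a, b in zip(frames, frames[1:])]
    active = [i for i, a in enumerate(activity) if a > activity_thresh]
    if not active:
        return None
    return active[0], active[-1] + 1

def motion_history_image(frames, tau=10, diff_thresh=15):
    """Simplified (non-directional) motion history image: pixels that moved
    in the most recent frame pair get value tau; older motion decays by 1
    per frame. The paper's DMHIs instead accumulate optical flow split into
    four directions."""
    mhi = np.zeros(frames[0].shape, dtype=float)
    for prev, curr in zip(frames, frames[1:]):
        moving = np.abs(curr.astype(int) - prev.astype(int)) > diff_thresh
        mhi = np.where(moving, float(tau), np.maximum(mhi - 1, 0))
    return mhi

# Synthetic demo: a bright square moves across frames 3-5 of an 8-frame clip.
frames = [np.zeros((32, 32), dtype=np.uint8) for _ in range(8)]
for t, x in zip([3, 4, 5], [5, 10, 15]):
    frames[t][10:20, x:x + 5] = 255

print(segment_utterance(frames))           # motion spans frame pairs 2..5
print(motion_history_image(frames).max())  # recent motion, decayed once
```

On the synthetic clip, the segmenter brackets the frames where the lips (here, the square) move, and the MHI encodes how recently each pixel moved, which is what makes moment-based descriptors such as Zernike moments informative about the motion pattern.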

History

Related Materials

  1. DOI: 10.1007/s00371-012-0751-7
  2. ISSN: 0178-2789

Journal

The Visual Computer: International Journal of Computer Graphics

Volume

Online

Start page

1

End page

14

Total pages

14

Publisher

Springer

Place published

Germany

Language

English

Copyright

© 2012 Springer-Verlag

Former Identifier

2006038302

Esploro creation date

2020-06-22

Fedora creation date

2013-04-29
