RMIT University

Study of wavelet packet energy entropy for emotion classification in speech and glottal signals

conference contribution
posted on 2024-10-31, 17:13 authored by Ling He, Margaret Lech, Jing Zhang, Xiaomei Ren, Lihua Deng
Automatic speech emotion recognition has important applications in human-machine communication. The majority of current research in this area focuses on finding optimal feature parameters. Recent studies have examined several glottal features as potential cues for emotion differentiation. This study proposes a new type of feature parameter that calculates energy entropy on values within selected wavelet packet frequency bands. Modeling and classification are performed using the classical Gaussian mixture model (GMM) algorithm. The experiments use two data sets: the Speech Under Simulated Emotion (SUSE) data set, annotated with three emotions (angry, neutral and soft), and the Berlin Emotional Speech (BES) database, annotated with seven emotions (angry, bored, disgust, fear, happy, sad and neutral). The average classification accuracy achieved for the SUSE data (74%-76%) is significantly higher than that achieved for the BES data (51%-54%). In both cases, the accuracy was significantly higher than the respective random guessing levels (33% for SUSE and 14.3% for BES).
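The feature described in the abstract can be illustrated with a minimal sketch. The abstract does not state the wavelet family, the decomposition depth, the band selection, or the exact entropy formula, so everything below is an assumption for illustration only: a Haar wavelet, a full three-level packet tree, all leaf bands kept, and Shannon entropy of the normalized coefficient energies within each band. The function names are hypothetical and are not taken from the paper.

```python
import math


def haar_step(x):
    """One Haar analysis step: (approximation, detail), each half the input length."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_bands(x, levels):
    """Full wavelet packet tree: every node is split at every level.

    Returns 2**levels leaf bands in natural (filter-bank) order, which is
    not the same as increasing-frequency order. The signal length is
    assumed to be divisible by 2**levels.
    """
    bands = [list(x)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            approx, detail = haar_step(band)
            next_bands.extend([approx, detail])
        bands = next_bands
    return bands


def energy_entropy(coeffs):
    """Shannon entropy of the normalized coefficient energies in one band.

    Assumed definition: p_i = c_i**2 / sum_j c_j**2, H = -sum_i p_i * ln(p_i).
    A band with zero total energy is given entropy 0.
    """
    energies = [c * c for c in coeffs]
    total = sum(energies)
    if total == 0.0:
        return 0.0
    return -sum((e / total) * math.log(e / total) for e in energies if e > 0.0)


def wp_energy_entropy_features(frame, levels=3):
    """One entropy value per leaf band: a feature vector of length 2**levels."""
    return [energy_entropy(band) for band in wavelet_packet_bands(frame, levels)]


# Example: a synthetic 64-sample frame standing in for a speech or glottal frame.
frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.7 * n) for n in range(64)]
features = wp_energy_entropy_features(frame, levels=3)
```

Each frame yields one feature vector; in a pipeline like the one the abstract outlines, such vectors would be modeled per emotion class with a GMM and a test utterance assigned to the class with the highest likelihood.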

History

Start page

1

End page

6

Total pages

6

Outlet

Proceedings of the 5th International Conference on Digital Image Processing (ICDIP 2013, SPIE 8878)

Editors

Yulin Wang, Xie Yi

Name of conference

ICDIP 2013

Publisher

SPIE

Place published

USA

Start date

2013-04-21

End date

2013-04-22

Language

English

Copyright

© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).

Former Identifier

2006043702

Esploro creation date

2020-06-22

Fedora creation date

2014-02-18
