Emotion recognition in natural speech using empirical mode decomposition and Renyi entropy
conference contribution
posted on 2024-10-31, 09:57 authored by Ling He, Margaret Lech, Namunu Maddage, Nicholas Allen
A new approach to feature extraction for automatic emotion classification in speech is presented and tested. The proposed feature extraction is based on empirical mode decomposition (EMD) combined with the calculation of Renyi entropy. The method was tested on natural speech data subjectively annotated with five different emotions: angry, anxious, dysphoric, happy and neutral. The data represented 44 male and 27 female speakers. Each emotion was represented by 200 utterances with an average duration of 1.5 s. The modeling and classification were based on the Gaussian mixture model (GMM). Classification using the Renyi entropy of order 2 produced an average correct classification rate of 48%, varying only slightly across different emotions (std=7.5).
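For readers unfamiliar with the entropy measure named above, the following is a minimal sketch of the Renyi entropy of order alpha for a discrete distribution, H_alpha = log(sum_i p_i^alpha) / (1 - alpha), which for alpha = 2 reduces to -log(sum_i p_i^2). The abstract does not state how the probabilities are estimated from the EMD components, so the histogram estimate below, the bin count, and the function names `renyi_entropy` and `histogram_probs` are illustrative assumptions rather than the authors' procedure. The EMD step itself is not shown.

```python
import math


def renyi_entropy(probs, alpha=2.0):
    """Renyi entropy (in nats) of a discrete distribution.

    `probs` may be unnormalized; zero entries are ignored.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    p = [x / total for x in probs if x > 0]
    if abs(alpha - 1.0) < 1e-12:
        # The limit alpha -> 1 is the Shannon entropy.
        return -sum(x * math.log(x) for x in p)
    return math.log(sum(x ** alpha for x in p)) / (1.0 - alpha)


def histogram_probs(samples, n_bins=16):
    """Assumed estimator: relative frequencies of sample amplitudes.

    `samples` stands in for the values of one decomposed component;
    the bin count is an arbitrary illustrative choice.
    """
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return [1.0]  # constant signal: all mass in a single bin
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in samples:
        # Clamp so the maximum value falls into the last bin.
        idx = min(int((s - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(samples) for c in counts]


if __name__ == "__main__":
    # Toy stand-in for one component of a short speech frame.
    component = [math.sin(0.1 * n) for n in range(400)]
    print(renyi_entropy(histogram_probs(component), alpha=2.0))
```

Under these assumptions, one order-2 entropy value per component would form a feature vector for an utterance, which could then be modeled per emotion with a GMM as the abstract describes.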