Separation of speech sources using an Acoustic Vector Sensor
conference contribution
posted on 2024-10-31, 15:45, authored by M. Shujau, Christian Ritz, Ian Burnett
This paper investigates how the directional characteristics of an Acoustic Vector Sensor (AVS) can be used to separate speech sources. The technique uses frequency-domain direction of arrival (DOA) estimates to identify the location, relative to the AVS array, of each speaker in a group of speakers, and then separates the recording into individual speech signals accordingly. The results show that the technique can be used for real-time separation of speech sources using a single 20 ms frame of speech. Compared with the unprocessed recordings, the proposed algorithm improves the Signal to Interference Ratio (SIR) by 15.1 dB on average and the Signal to Distortion Ratio (SDR) by 5.4 dB on average. In addition to the SIR and SDR results, Perceptual Evaluation of Speech Quality (PESQ) and listening tests both show an improvement in perceptual quality of 1 Mean Opinion Score (MOS) point over the unprocessed recordings.
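The idea of separating speakers by per-frequency DOA can be illustrated with a minimal sketch. It is not taken from the paper: it assumes an idealized two-dimensional AVS (one pressure channel and two orthogonal velocity channels, plane-wave sources), estimates the azimuth in each frequency bin from the active intensity, and applies a simple binary mask around a target direction. The function names, the 16 kHz sample rate, the test tones and the 15 degree tolerance are all hypothetical choices made for illustration.

```python
import cmath
import math


def frame_spectrum(frame):
    """One-sided DFT of a Hann-windowed frame (naive O(N^2), fine for a sketch)."""
    n = len(frame)
    windowed = [x * 0.5 * (1 - math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed))
        for k in range(n // 2 + 1)
    ]


def doa_per_bin(p, vx, vy):
    """Azimuth estimate (radians) per frequency bin from the active intensity.

    Assumption: idealized AVS, so Re(conj(P) * V) points toward the source.
    """
    return [
        math.atan2((pk.conjugate() * yk).real, (pk.conjugate() * xk).real)
        for pk, xk, yk in zip(p, vx, vy)
    ]


def separate_by_doa(p, vx, vy, target_az, tol):
    """Keep only the pressure bins whose DOA lies within tol of target_az."""
    out = []
    for pk, theta in zip(p, doa_per_bin(p, vx, vy)):
        # Wrap the angular difference into (-pi, pi] before comparing.
        diff = math.atan2(math.sin(theta - target_az),
                          math.cos(theta - target_az))
        out.append(pk if abs(diff) <= tol else 0j)
    return out


# Synthetic example: one 20 ms frame at an assumed 16 kHz sample rate,
# with two tones standing in for two speakers at 30 and 120 degrees.
fs, n = 16000, 320
az1, az2 = math.radians(30), math.radians(120)
s1 = [math.sin(2 * math.pi * 500 * i / fs) for i in range(n)]
s2 = [math.sin(2 * math.pi * 1500 * i / fs) for i in range(n)]

pressure = [a + b for a, b in zip(s1, s2)]
vel_x = [a * math.cos(az1) + b * math.cos(az2) for a, b in zip(s1, s2)]
vel_y = [a * math.sin(az1) + b * math.sin(az2) for a, b in zip(s1, s2)]

P, VX, VY = (frame_spectrum(ch) for ch in (pressure, vel_x, vel_y))
speaker1 = separate_by_doa(P, VX, VY, az1, math.radians(15))
```

Here `speaker1` holds the masked pressure spectrum for the source near 30 degrees; an inverse transform with overlap-add across frames would be needed to recover a time-domain signal. Real speech overlaps in frequency far more than these test tones, so a practical system would need a more careful DOA clustering and masking strategy than this binary mask.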
History
Start page
1
End page
6
Total pages
6
Outlet
The 2011 IEEE International Workshop on Multimedia Signal Processing
Editors
Wen Gao, Anthony Vetro, Zhengyou Zhang
Name of conference
The 2011 IEEE International Workshop on Multimedia Signal Processing