RMIT University

Separation of speech sources using an Acoustic Vector Sensor

conference contribution
posted on 2024-10-31, 15:45 authored by M. Shujau, Christian Ritz, Ian Burnett
This paper investigates how the directional characteristics of an Acoustic Vector Sensor (AVS) can be used to separate speech sources. The technique uses frequency-domain direction of arrival (DOA) estimates to locate each speaker in a group relative to the AVS array, and then separates the mixture into individual speech signals according to those locations. The results show that the technique can separate speech sources in real time using a single 20 ms frame of speech. Compared with the unprocessed recordings, the proposed algorithm achieves an average improvement of 15.1 dB in Signal to Interference Ratio (SIR) and 5.4 dB in Signal to Distortion Ratio (SDR). In addition, Perceptual Evaluation of Speech Quality (PESQ) scores and listening tests both show an improvement in perceptual quality of 1 Mean Opinion Score (MOS) point over the unprocessed recordings.
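The abstract does not give the algorithm in detail, so the following is only a minimal sketch of the general idea of separating sources by per-frequency direction of arrival, not the authors' implementation. It assumes a two-dimensional AVS with one pressure channel `p` and two orthogonal velocity channels `vx` and `vy`, an intensity-based azimuth estimate per FFT bin, source azimuths that are already known, and a simple binary mask. The function names and parameters are hypothetical.

```python
import numpy as np

def doa_per_bin(p, vx, vy):
    """Return the pressure spectrum and an azimuth (radians) per FFT bin.

    Assumption: for a plane wave from azimuth theta, vx ~ p*cos(theta) and
    vy ~ p*sin(theta), so the active intensity vector points at the source.
    """
    window = np.hanning(len(p))
    P, VX, VY = (np.fft.rfft(window * ch) for ch in (p, vx, vy))
    ix = np.real(np.conj(P) * VX)  # x component of active intensity
    iy = np.real(np.conj(P) * VY)  # y component of active intensity
    return P, np.arctan2(iy, ix)

def separate_frame(p, vx, vy, source_angles):
    """Split one frame into one time-domain signal per assumed source angle.

    Each FFT bin of the pressure channel is given to the source whose
    azimuth is closest to that bin's estimated azimuth (binary masking).
    """
    P, doa = doa_per_bin(p, vx, vy)
    angles = np.asarray(source_angles, dtype=float)
    # Wrapped angular difference, shape (num_sources, num_bins).
    diff = np.angle(np.exp(1j * (doa[None, :] - angles[:, None])))
    owner = np.argmin(np.abs(diff), axis=0)
    return [np.fft.irfft(P * (owner == k), n=len(p))
            for k in range(len(angles))]
```

With a 16 kHz sampling rate, a 20 ms frame is 320 samples, so `separate_frame` would be called once per 320-sample frame of the three channels. Because the masks partition the bins, the separated outputs of a frame sum back to the windowed pressure signal.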

History

Start page

1

End page

6

Total pages

6

Outlet

The 2011 IEEE International Workshop on Multimedia Signal Processing

Editors

Wen Gao, Anthony Vetro, Zhengyou Zhang

Name of conference

The 2011 IEEE International Workshop on Multimedia Signal Processing

Publisher

IEEE Signal Processing Society

Place published

Hangzhou, China

Start date

2011-10-17

End date

2011-10-19

Language

English

Copyright

© 2011 IEEE.

Former Identifier

2006030252

Esploro creation date

2020-06-22

Fedora creation date

2012-02-24
