Learning Acoustic Word Embeddings with Dynamic Time Warping Triplet Networks

journal contribution
posted on 2024-11-02, 13:49 authored by Denis Shitov, Elena Pirogova, Tadeusz Wysocki, Margaret Lech
In recent years, acoustic word embeddings (AWEs) have attracted significant interest in the research community, particularly for Query-by-Example Spoken Term Detection (QbE-STD) search and related word discrimination tasks. It has been shown that AWEs learned for word or phone classification in one or several languages can outperform approaches that use dynamic time warping (DTW). In this paper, a new method of learning AWEs within the DTW framework is proposed. It employs a multitask triplet neural network to generate the AWEs. The triplet network learns acoustic representations of words by comparing DTW distances. In addition, a multitask objective is proposed that combines a conventional word classification component with a triplet loss component. The triplet loss component applies the DTW distance to the word discrimination task. The multitask objective ensures that the embeddings can be used with DTW directly. Experimental validation shows that the proposed approach is well suited, but not necessarily restricted, to QbE-STD search. A comparison with several baseline methods shows that the new method significantly improves results on the word discrimination task. An evaluation of word clustering in the learned embedding space is also presented.
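The idea of a triplet loss computed over DTW distances, combined with a word classification term, can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names, the Euclidean frame distance, the hinge form with `margin`, and the weighted sum controlled by `alpha` are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np

def dtw_distance(x, y):
    """DTW alignment cost between two sequences of shape (T_x, D) and (T_y, D).

    Assumption: Euclidean distance between frames and the standard
    three-way step pattern (insertion, deletion, match).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise frame distances, shape (T_x, T_y).
    cost = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    tx, ty = cost.shape
    acc = np.full((tx + 1, ty + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, tx + 1):
        for j in range(1, ty + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return float(acc[tx, ty])

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss on DTW distances (margin is a hypothetical value)."""
    d_pos = dtw_distance(anchor, positive)
    d_neg = dtw_distance(anchor, negative)
    return max(0.0, margin + d_pos - d_neg)

def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max()  # subtract the max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return float(-log_probs[label])

def multitask_loss(logits, label, anchor, positive, negative,
                   alpha=0.5, margin=1.0):
    """Weighted sum of classification and triplet terms (alpha is assumed)."""
    return (alpha * cross_entropy(logits, label)
            + (1.0 - alpha) * triplet_loss(anchor, positive, negative, margin))

# Toy one-dimensional "embedding sequences" for illustration only.
anchor = [[0.0], [1.0], [2.0]]
positive = [[0.0], [1.0], [1.0], [2.0]]  # same shape in time, stretched
negative = [[5.0], [5.0], [5.0]]
loss = multitask_loss([0.0, 0.0], 0, anchor, positive, negative)
```

In this toy case the positive sequence aligns to the anchor with zero DTW cost while the negative is far away, so the triplet term vanishes and only the classification term contributes to `loss`.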

History

Journal

IEEE Access

Volume

8

Number

9104974

Start page

103327

End page

103338

Total pages

12

Publisher

IEEE

Place published

United States

Language

English

Copyright

© 2020 IEEE. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

Former Identifier

2006100390

Esploro creation date

2020-09-08
