RMIT University

Prediction of Inter-Personal Trust and Team Familiarity From Speech: A Double Transfer Learning Approach

journal contribution
posted on 2024-11-02, 16:59 authored by Catherine Sandoval Rodriguez, April Panganiban, Melissa Stolar, Robert Bolia, Margaret Lech
Speech classification is one of the most convenient objective measures of the internal state exhibited during a problem-solving task that requires verbal communication. This study investigates the hypothesis that the acoustic characteristics of speech indicate the trust between team members and their familiarity with each other. Speech was recorded from 27 dyadic teams (26 males and 28 females) during a distributed threat perception task in which the teams determined safe points along a route through a town to be visited by a VIP. Before the threat detection mission, 26 team members knew each other, and the remaining 28 had no prior knowledge of their partners. Two levels (Low Trust and High Trust) of two trust constructs, TTP (Trust, Trustworthiness, Propensity to trust) and RIS (Reliance Intentions Scale), were estimated from numerical responses to pre- and post-mission surveys. The recordings of each speaker were divided into 1-second intervals and converted into RGB images of amplitude spectrograms. The images were classified using a pre-trained convolutional neural network, ResNet-18, fine-tuned to recognize either the trust level or familiarity. In the baseline scenario, single transfer learning was used to classify the speech into Low/High-trust categories, separately for the RIS and TTP constructs and before and after the mission, yielding average classification accuracies of 82%-86%. Single transfer learning classification into Known/Unknown-partner categories led to 85% accuracy. Double transfer learning, i.e., tuning ResNet-18 first on Known/Unknown labels and then on Low/High-trust labels, increased the trust classification accuracy to as much as 89%. Tuning ResNet-18 first on Low/High-trust labels and then on Known/Unknown labels likewise increased the accuracy of partner familiarity recognition to as much as 89%.
These results support the hypothesis that speech acoustics indicate trust and familiarity between team members, and they show that adding prior related knowledge to the model makes learning more efficient without increasing the training data size.
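
The segmentation and image-conversion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the frame length, hop size, Hann window, dB floor, blue-to-red colour ramp, and the function names (`split_into_segments`, `amplitude_spectrogram`, `to_rgb`) are all assumptions chosen for illustration. The plain DFT uses only the standard library, which favours portability over speed.

```python
import cmath
import math


def split_into_segments(samples, sample_rate, seconds=1.0):
    """Split a mono signal into non-overlapping fixed-length segments.

    Assumption: a trailing remainder shorter than one segment is dropped.
    """
    size = int(sample_rate * seconds)
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]


def amplitude_spectrogram(segment, frame_len=64, hop=32):
    """Short-time amplitude spectrum, indexed as spec[frequency_bin][frame].

    frame_len and hop are illustrative values, not taken from the paper.
    """
    # Symmetric Hann window to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    bins = frame_len // 2 + 1  # non-negative frequencies only
    # Precomputed DFT basis: one row of complex exponentials per bin.
    basis = [[cmath.exp(-2j * math.pi * k * n / frame_len)
              for n in range(frame_len)] for k in range(bins)]
    columns = []
    for start in range(0, len(segment) - frame_len + 1, hop):
        frame = [segment[start + n] * window[n] for n in range(frame_len)]
        columns.append([abs(sum(x * w for x, w in zip(frame, row)))
                        for row in basis])
    # Transpose from [frame][bin] to [bin][frame].
    return [list(row) for row in zip(*columns)]


def to_rgb(spec, floor_db=-80.0):
    """Map amplitudes to 8-bit RGB pixels on a dB scale.

    Assumption: a simple blue (quiet) to red (loud) ramp stands in for
    whatever colour map the original study used.
    """
    peak = max(max(row) for row in spec) or 1.0
    image = []
    for row in reversed(spec):  # highest frequency on the top image row
        pixels = []
        for amplitude in row:
            db = 20 * math.log10(max(amplitude / peak, 1e-12))
            level = min(max((db - floor_db) / -floor_db, 0.0), 1.0)
            pixels.append((round(255 * level), 0, round(255 * (1 - level))))
        image.append(pixels)
    return image
```

Each resulting image corresponds to one 1-second interval, the unit that the abstract describes as the input to the fine-tuned ResNet-18. The network stage is omitted here.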

History

Journal

IEEE Access

Volume

8

Start page

225437

End page

225447

Total pages

11

Publisher

IEEE

Place published

United States

Language

English

Copyright

© 2020 IEEE

Former Identifier

2006107279

Esploro creation date

2021-05-27
