RMIT University

A categorical analysis of coreference resolution errors in biomedical texts

journal contribution
posted on 2024-11-02, 19:45 authored by Miji Choi, Justin Zobel, Cornelia Verspoor
Background: Coreference resolution is an essential task in information extraction from the published biomedical literature. It supports the discovery of complex information by linking referring expressions such as pronouns and appositives to their referents, which are typically entities that play a central role in biomedical events. Correctly establishing these links allows a detailed understanding of all the participants in events, and allows events to be connected through their shared participants.

Results: As an initial step towards the development of a novel coreference resolution system for the biomedical domain, we have categorised the characteristics of coreference relations by type of anaphor as well as by broader syntactic and semantic characteristics, and have compared the performance of a domain adaptation of a state-of-the-art general system with published results from domain-specific systems in terms of this categorisation. We also develop a rule-based system for anaphoric coreference resolution in the biomedical domain with simple modules derived from available systems. Our results show that the domain-specific systems outperform the general system overall. Whilst this result is unsurprising, our proposed categorisation enables a detailed quantitative analysis of system performance. We identify limitations of each system and find that important gaps remain in the state-of-the-art systems, which are clearly identifiable with respect to the categorisation.

Conclusion: We have analysed in detail the performance of existing coreference resolution systems for the biomedical literature and have demonstrated that there are clear gaps in their coverage. The approach developed in the general domain needs to be tailored for portability to the biomedical domain. The framework that we propose for class-based error analysis of existing systems helps to identify specific limitations of those systems. This in turn provides insights for further system development.

Related Materials

  1. DOI - Is published in 10.1016/j.jbi.2016.02.015
  2. ISSN - Is published in 1532-0464

Journal

Journal of Biomedical Informatics

Volume

60

Start page

309

End page

318

Total pages

10

Publisher

Academic Press

Place published

United States

Language

English

Copyright

© 2016 Elsevier Inc. All rights reserved.

Former Identifier

2006114762

Esploro creation date

2022-05-17
