RMIT University

Cross-modal Clinical Graph Transformer For Ophthalmic Report Generation

conference contribution
posted on 2024-11-03, 14:43 authored by Mingjie Li, Wenjia Cai, Cornelia Verspoor, Shirui Pan, Xiaodan Liang, Xiaojun Chang
Automatic generation of ophthalmic reports using data-driven neural networks has great potential in clinical practice. When writing a report, ophthalmologists make inferences with prior clinical knowledge. This knowledge has been neglected in prior medical report generation methods. To endow models with the capability of incorporating expert knowledge, we propose a Cross-modal clinical Graph Transformer (CGT) for ophthalmic report generation (ORG), in which clinical relation triples are injected into the visual features as prior knowledge to drive the decoding procedure. However, two common Knowledge Noise (KN) issues may affect models' effectiveness. 1) Existing general biomedical knowledge bases such as the UMLS may not align meaningfully to the specific context and language of the report, limiting their utility for knowledge injection. 2) Incorporating too much knowledge may divert the visual features from their correct meaning. To overcome these limitations, we design an automatic information extraction scheme based on natural language processing to obtain clinical entities and relations directly from in-domain training reports. Given a set of ophthalmic images, our CGT first restores a sub-graph from the clinical graph and injects the restored triples into the visual features. Then, a visible matrix is employed during the encoding procedure to limit the impact of the injected knowledge. Finally, a Transformer decoder predicts the report from the encoded cross-modal features. Extensive experiments on the large-scale FFA-IR benchmark demonstrate that the proposed CGT outperforms previous benchmark methods and achieves state-of-the-art performance.
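
The abstract does not say how the visible matrix is built, so the following is only a minimal sketch of one plausible masking rule, not the authors' implementation. It assumes the encoder input is a run of visual tokens followed by the tokens of each injected triple, that every triple is attached to a single visual token (its "anchor"), and that triple tokens may exchange attention only with their own triple and that anchor. The function names `build_visible_matrix` and `masked_attention_weights`, and the `(anchor_index, n_tokens)` triple layout, are hypothetical.

```python
import numpy as np


def build_visible_matrix(n_visual, triples):
    """Return a boolean matrix where entry [i, j] says token i may attend to token j.

    n_visual: number of visual tokens at the start of the sequence.
    triples:  list of (anchor_index, n_tokens) pairs (hypothetical layout);
              each triple contributes n_tokens tokens appended after the
              visual tokens and is attached to the visual token anchor_index.
    """
    total = n_visual + sum(n_tokens for _, n_tokens in triples)
    visible = np.zeros((total, total), dtype=bool)

    # Assumption: visual tokens always see each other.
    visible[:n_visual, :n_visual] = True

    start = n_visual
    for anchor, n_tokens in triples:
        if not 0 <= anchor < n_visual:
            raise ValueError("anchor must index a visual token")
        end = start + n_tokens
        visible[start:end, start:end] = True  # tokens inside one triple see each other
        visible[start:end, anchor] = True     # triple tokens see their anchor
        visible[anchor, start:end] = True     # the anchor sees its own triple
        start = end
    return visible


def masked_attention_weights(scores, visible):
    """Row-wise softmax over attention scores, with invisible pairs forced to zero."""
    masked = np.where(visible, scores, -np.inf)
    masked = masked - masked.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(masked)
    return weights / weights.sum(axis=-1, keepdims=True)


# Three visual tokens; a 2-token triple on token 0 and a 1-token triple on token 2.
visible = build_visible_matrix(3, [(0, 2), (2, 1)])
weights = masked_attention_weights(np.zeros((6, 6)), visible)
```

Under this rule, a visual token that is not an anchor receives no attention weight from any triple token, which is one way a mask could keep injected knowledge from diverting unrelated visual features.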

Related Materials

  1. DOI - Is published in 10.1109/CVPR52688.2022.02000
  2. ISBN - Is published in 9781665469463 (urn:isbn:9781665469463)

Start page

1

End page

1

Total pages

1

Outlet

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Name of conference

IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR)

Publisher

IEEE

Place published

United States of America

Start date

2022-06-19

End date

2022-06-24

Language

English

Copyright

©IEEE 2022

Former Identifier

2006114779

Esploro creation date

2022-11-19
