RMIT University

FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark

conference contribution
posted on 2024-11-03, 14:36 authored by Mingjie Li, Wenjia Cai, Rui Liu, Flora Salim, Cornelia Verspoor, Xiaojun Chang
The automatic generation of long and coherent medical reports given medical images (e.g. Chest X-ray and Fundus Fluorescein Angiography (FFA)) has great potential to support clinical practice. Researchers have explored advanced methods from computer vision and natural language processing to incorporate medical domain knowledge for the generation of readable medical reports. However, existing medical report generation (MRG) benchmarks lack both explainable annotations and reliable evaluation tools, hindering the current research advances from two aspects: firstly, existing methods can only predict reports without accurate explanation, undermining the trustworthiness of the diagnostic methods; secondly, the comparison among the predicted reports from different MRG methods is unreliable using the evaluation metrics of natural-language generation (NLG). To address these issues, in this paper, we propose an explainable and reliable MRG benchmark based on FFA Images and Reports (FFA-IR). Specifically, FFA-IR is large, with 10,790 reports along with 1,048,584 FFA images from clinical practice; it includes explainable annotations, based on a schema of 46 categories of lesions; and it is bilingual, providing both English and Chinese reports for each case. Besides using the widely used NLG metrics, we propose a set of nine human evaluation criteria to evaluate the generated reports. We envision FFA-IR as a testbed for explainable and reliable medical report generation. We also hope that it can broadly accelerate medical imaging research and facilitate interaction between the fields of medical imaging, computer vision, and natural language processing.

History

Start page

1

End page

14

Total pages

14

Outlet

Advances in Neural Information Processing Systems 34 pre-proceedings (NeurIPS Datasets and Benchmarks 2021)

Name of conference

35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks.

Publisher

PhysioNet

Place published

United States

Start date

2021-12-06

End date

2021-12-14

Language

English

Former Identifier

2006110435

Esploro creation date

2022-01-21
