RMIT University

Analysis of predictive performance and reliability of classifiers for quality assessment of medical evidence revealed important variation by medical area

journal contribution
posted on 2024-11-03, 09:52 authored by Simon Suster, Timothy Baldwin, Cornelia Verspoor
Objectives: A major obstacle to the deployment of models for automated quality assessment is their reliability. We analyze their calibration and selective classification performance.

Study Design and Setting: We examine two systems for assessing the quality of medical evidence, EvidenceGRADEr and RobotReviewer, both developed from the Cochrane Database of Systematic Reviews (CDSR), which measure the strength of bodies of evidence and the risk of bias (RoB) of individual studies, respectively. We report their calibration error and Brier scores, present their reliability diagrams, and analyze the risk–coverage trade-off in selective classification.

Results: The models are reasonably well calibrated on most quality criteria (expected calibration error [ECE] 0.04–0.09 for EvidenceGRADEr, 0.03–0.10 for RobotReviewer). However, both calibration and predictive performance vary significantly by medical area. This has ramifications for the application of such models in practice, as average performance is a poor indicator of group-level performance: for example, health and safety at work, allergy and intolerance, and public health see much worse performance than cancer, pain and anesthesia, and neurology. We explore the reasons behind this disparity.

Conclusion: Practitioners adopting automated quality assessment should expect large fluctuations in system reliability and predictive performance depending on the medical area. Prospective indicators of such behavior should be further researched.
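The abstract reports expected calibration error (ECE), Brier scores, and a risk–coverage analysis for selective classification. As a rough illustration of what these metrics measure (this is not the authors' implementation; the function names, binary-classification framing, and equal-width binning are illustrative assumptions), a minimal sketch in Python:

```python
import numpy as np

def brier_score(probs, labels):
    # Mean squared difference between the predicted probability of the
    # positive class and the binary outcome (0 or 1); lower is better.
    return np.mean((probs - labels) ** 2)

def expected_calibration_error(probs, labels, n_bins=10):
    # Partition predictions into equal-width confidence bins and take the
    # coverage-weighted average gap between the empirical positive rate
    # and the mean predicted probability in each bin.
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (probs > lo) & (probs <= hi)
        if mask.any():
            gap = abs(labels[mask].mean() - probs[mask].mean())
            ece += mask.mean() * gap
    return ece

def risk_coverage(confidence, correct):
    # Selective classification: rank predictions by descending confidence;
    # the risk at coverage k/n is the error rate over the k most confident
    # predictions, tracing out the risk-coverage curve.
    order = np.argsort(-confidence)
    errors = 1 - correct[order]
    n = len(errors)
    coverage = np.arange(1, n + 1) / n
    risk = np.cumsum(errors) / np.arange(1, n + 1)
    return coverage, risk
```

A well-calibrated model has small per-bin gaps (low ECE), and a model with useful confidence estimates shows risk rising as coverage grows, so abstaining on low-confidence cases reduces error.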

History

Related Materials

  1. DOI - Is published in 10.1016/j.jclinepi.2023.04.006
  2. ISSN - Is published in 0895-4356

Journal

Journal of Clinical Epidemiology

Volume

159

Start page

58

End page

69

Total pages

12

Publisher

Elsevier

Place published

United States

Language

English

Copyright

© 2023 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Former Identifier

2006124276

Esploro creation date

2023-08-23