posted on 2024-11-24, 06:22, authored by Mani SULEIMAN
In jurisdictions that implement Diagnosis-Related Group (DRG) based payment systems, DRGs are a major determinant of the funding received for inpatient care: in activity-based funding systems, admissions with similar diagnoses and treatments are grouped into the same DRG. Mis-classification of inpatient episode DRGs therefore has a significant revenue impact on health care providers, and service providers often dedicate auditing staff to check that episodes have been assigned the correct DRG.<br><br>
This study first implements Bayesian logistic regression models with weakly informative prior distributions to estimate the probability that a given episode requires a DRG revision, comparing these models with each other and with classical maximum likelihood estimation. The best performing weakly informative Bayesian model improved overall classification performance by 6% compared to maximum likelihood, and by 34% compared to random classification. This model has been operationalised and, within a period of 6 months, led to dramatically improved audit efficiency at a major metropolitan health care provider in Melbourne, Australia. One metric for the efficiency of clinical coding audits is the revenue recovered per audited episode, which rose by 40% over this period.<br><br>
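As an illustration of the modelling idea above, the following is a minimal sketch (not the thesis's actual model) of maximum a posteriori estimation for a logistic regression of P(DRG revision required) under an independent Normal(0, prior_scale²) weakly informative prior on each coefficient. A very large `prior_scale` makes the penalty negligible, approximating maximum likelihood; the toy data and the single audit-flag feature are hypothetical.

```python
import numpy as np

def fit_logistic_map(X, y, prior_scale=2.5, lr=0.5, n_iter=5000):
    """MAP estimate of logistic regression coefficients under an
    independent Normal(0, prior_scale^2) prior on each coefficient."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))  # P(revision | x)
        # gradient of the average log-posterior: log-likelihood + prior
        grad = X.T @ (y - p) / n - beta / (n * prior_scale**2)
        beta += lr * grad
    return beta

# Toy data: column 0 is an intercept, column 1 a single (hypothetical)
# audit flag that is positively associated with needing a DRG revision.
X = [[1, 0]] * 4 + [[1, 1]] * 4
y = [0, 0, 0, 1, 0, 1, 1, 1]
beta_weak = fit_logistic_map(X, y, prior_scale=100.0)  # ~ maximum likelihood
beta_tight = fit_logistic_map(X, y, prior_scale=0.5)   # stronger shrinkage
```

The tighter prior shrinks the flag's coefficient toward zero relative to the near-maximum-likelihood fit, which is the regularising effect that weakly informative priors trade off against the data.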
While weakly informative Bayesian models can be used effectively to predict DRG mis-classification, they do not sufficiently exploit a priori expert information. A new, 'Hybrid' prior approach is introduced which utilises guesses elicited from a clinical coding auditor. The Hybrid prior switches to non-informative priors where elicited information is unreliable. This method is compared to weakly informative Bayesian models and maximum likelihood estimates. Based on repeated cross-validation, performance was greatest for the Hybrid prior model, which significantly outperformed benchmark models: the Hybrid prior achieved an average AUC (area under the curve) rank of 1.6, compared with 2.6 for the next best approach. This observed difference in average rank exceeds the critical difference (CD) for statistical significance as per the Nemenyi pairwise comparison test.<br><br>
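The switching behaviour of the Hybrid prior can be sketched as follows. This is a hypothetical illustration only: the reliability flags, prior scales, and the elicitation procedure are assumptions, not the thesis's actual method.

```python
def hybrid_prior(elicited_guesses, reliable, informative_scale=0.5,
                 vague_scale=100.0):
    """Return a (mean, scale) Normal prior per coefficient: centred on
    the auditor's elicited guess where that guess is judged reliable,
    otherwise a non-informative (very wide, zero-centred) prior."""
    return [
        (guess, informative_scale) if is_reliable else (0.0, vague_scale)
        for guess, is_reliable in zip(elicited_guesses, reliable)
    ]

# The auditor guesses effects for three predictors but only the first
# two guesses are judged reliable; the third falls back to a vague prior.
priors = hybrid_prior([1.2, -0.8, 0.3], [True, True, False])
```

The design point is that unreliable elicitations degrade gracefully to the weakly/non-informative case rather than biasing the model.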
Clinical coding of hospital admissions can erroneously omit diagnosis and procedure codes. A consequence of these omissions is that the condition and treatment of the patient are not fully captured, which can impact DRG assignment and hospital revenue. A real-time recommender system based on Bayesian networks is developed, and its performance is evaluated against baseline approaches using a testing strategy which simulates coding errors by removing codes from episodes and counting how many of the removed codes are recommended for addition. Performance is also assessed by how many recommended codes had not been removed (superfluous recommendations), which should be minimised because they are a nuisance to the coder.<br><br>
A standard approach to such a problem may involve analysis of historical patterns of associations between clinical codes to determine how often codes go together in an episode. Using association analysis on its own, without any domain expert filtering, offers one baseline approach to this problem. Another baseline is to apply manual expert validation over the association analysis, which can produce a higher performing recommender (referred to as the expert-validated list) but is time-consuming and labour-intensive for subject matter experts. The new methodology developed in this research project provides a high-performing recommender system while reducing the dependence on labour-intensive effort by clinical coding experts. The proposed AI recommender, built using a combination of Bayesian networks, proved to be the best performing recommender, generating 96% of the number of correct recommendations produced by the expert-validated recommender while producing 68% fewer superfluous recommendations.<br><br>
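The evaluation strategy described above can be sketched as a small harness: remove codes from each episode, ask a recommender for additions, and count recovered versus superfluous recommendations. The `recommend` callable and the toy codes below are hypothetical stand-ins for the Bayesian-network recommender and real clinical codes.

```python
import random

def evaluate_recommender(episodes, recommend, n_remove=1, seed=0):
    """Simulate coding errors by removing codes from each episode,
    then count how many removed codes the recommender restores
    (recovered) and how many extra codes it suggests (superfluous)."""
    rng = random.Random(seed)
    recovered = superfluous = 0
    for codes in episodes:
        removed = set(rng.sample(list(codes), min(n_remove, len(codes) - 1)))
        kept = [c for c in codes if c not in removed]
        recs = set(recommend(kept))
        recovered += len(recs & removed)    # removed codes put back
        superfluous += len(recs - removed)  # nuisance recommendations
    return recovered, superfluous

# Toy recommender: suggest every known code the episode is missing.
known = {"A", "B", "C"}
recommend = lambda kept: known - set(kept)
hits, noise = evaluate_recommender([["A", "B"], ["B", "C"]], recommend)
```

A good recommender maximises the first count while minimising the second, which is exactly the trade-off the 96% / 68% figures above summarise.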
Increasingly, many hospitals are attempting to provide more accurate information about Emergency Department (ED) wait time to their patients. Estimation of ED wait time, which is an important problem in health information management, usually depends on what is known about the patient and also the status of the ED at the time of presentation. A new model is developed to estimate ED wait time for prospective low acuity patients accessing information online prior to arrival. In this scenario, the key challenge is that little is known or assumed about the prospective patient and their condition. An informative Bayesian quantile regression framework has been developed to provide an estimated wait time range for prospective patients, using the median and 90th percentiles. This methodology incorporates government statistics on ED wait times and elicited expert opinion. The use of informative priors offers a novel approach to ED wait time prediction.<br><br>
This informative prior approach was compared on prediction accuracy to two baseline models: classical quantile regression and Bayesian quantile regression with weakly informative priors. On the pinball loss metric, the proposed method performs best for median prediction (L(Q<sub>50</sub>, y) = 13,161.35) compared to weakly informative Bayesian quantile regression (L(Q<sub>50</sub>, y) = 13,170.57) and frequentist quantile regression (L(Q<sub>50</sub>, y) = 13,170.70). The proposed model also performs better on 90th percentile prediction (L(Q<sub>90</sub>, y) = 7,388.59) than weakly informative Bayesian quantile regression (L(Q<sub>90</sub>, y) = 7,425.74) and frequentist quantile regression (L(Q<sub>90</sub>, y) = 7,436.62). These results provide evidence of the value contributed by elicited expert guesses that are guided by government wait time statistics.
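For reference, the pinball (quantile) loss used in the comparison above can be sketched as follows, here as a per-observation average; the exact aggregation behind the reported figures (e.g. summed versus averaged over episodes) is not stated in this abstract.

```python
def pinball_loss(tau, y_true, y_pred):
    """Average pinball loss at quantile level tau: under-prediction is
    penalised by tau, over-prediction by (1 - tau), so minimising the
    loss targets the tau-th conditional quantile."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        total += tau * (y - q) if y >= q else (1.0 - tau) * (q - y)
    return total / len(y_true)

# At tau = 0.9, under-predicting a 10-minute wait costs nine times more
# than over-predicting it by the same amount.
under = pinball_loss(0.9, [10.0], [0.0])  # 0.9 * 10
over = pinball_loss(0.9, [0.0], [10.0])   # 0.1 * 10
```

This asymmetry is why the 90th-percentile model is judged on a different loss surface than the median model: large under-estimates of wait time are penalised much more heavily at Q<sub>90</sub>.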