Causal inference and decision-making in complex, evolving systems pose significant challenges, particularly when the analysis involves irregular data: data with varying intervals, scales and fidelities, non-equidistant data points, geographic data, or heterogeneous data comprising a mix of count, continuous, binary and categorical variables. This research aims to develop techniques for understanding causal relationships and supporting decision-making while handling the complexities of irregular data. It also aims to understand the possible trade-off between model performance and interpretability for decision-making in a fast-evolving system.
To achieve these aims, the research proposes a framework that imposes an additive structure on the heterogeneous multi-output Gaussian process model. The proposed approach enhances model interpretability by modelling the high-dimensional latent functions that influence the observed heterogeneous responses as a sum of more interpretable single-dimensional latent functions. This provides further insight into complex non-linear causal relationships that might otherwise have gone undiscovered.
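The additive structure can be sketched as follows. The notation is assumed here for illustration and is not taken from the source: $\mathbf{x} = (x_1, \dots, x_p)$ denotes the $p$ inputs, $f_d$ a latent function driving the $d$-th response, and $k_{d,j}$ a covariance function acting on the single input $x_j$. Under these assumptions, a first-order additive decomposition would read
\[
  f_d(\mathbf{x}) = \sum_{j=1}^{p} f_{d,j}(x_j),
  \qquad
  f_{d,j} \sim \mathcal{GP}\bigl(0,\; k_{d,j}(x_j, x_j')\bigr).
\]
If the components are taken to be independent a priori, the implied covariance of $f_d$ is the sum $k_d(\mathbf{x}, \mathbf{x}') = \sum_{j=1}^{p} k_{d,j}(x_j, x_j')$. Each component $f_{d,j}$ can then be inspected as a one-dimensional curve in $x_j$, which is the sense in which the decomposition aids interpretation. The exact form used in the framework may differ, for example in how the components are shared across responses.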
The research demonstrates the proposed framework's effectiveness in explaining causality and supporting decision-making in an experimental study using air combat simulations with multiple heterogeneous responses. While it achieves performance comparable to that of more flexible but less interpretable frameworks across various air combat scenarios, the proposed model considerably enhances the interpretability of the model outputs.
The contributions of this research span causal inference and decision analysis, extending both to handle irregular data and multiple heterogeneous response variables, including heteroskedastic responses. The research highlights the importance of jointly modelling multiple heterogeneous responses with possible associations, since joint modelling improves model performance by incorporating shared information from the other response variables. It presents techniques for identifying an overfit model involving heteroskedastic responses, which could easily have gone unnoticed using other techniques. It proposes a simplified alternative to the model's structure through horseshoe priors, reducing the model's dependence on the number of latent functions required. It contributes to understanding the trade-off between model performance and model interpretability, and assesses how the measured performance is affected by the choice of model evaluation method. Finally, the research may benefit the analysis of irregular data in several applications and domains, including complex and fast-evolving systems, where decision-makers require interpretable and reliable insights.