Runtime verification of scientific codes using statistics
journal contribution
posted on 2024-11-02, 06:57 authored by Minh Dinh, David Abramson, Chao Jin
Runtime verification of large-scale scientific codes is difficult because they often involve thousands of processes and generate very large data structures. Further, the programs often embody complex algorithms, making them difficult for non-experts to follow. Notably, typical scientific codes implement mathematical models that often possess predictable statistical features. Therefore, incorporating statistical analysis techniques in the verification process allows a program's state to be used to reveal unusual details of the computation at runtime. In our earlier work, we proposed a statistical framework for debugging large-scale applications. In this paper, we argue that such a framework can be useful in the runtime verification of scientific codes. We demonstrate how two production simulation programs are verified using statistics. The system is evaluated on a 20,000-core Cray XE6.
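To illustrate the general idea of checking program state against expected statistical features, the following is a minimal sketch and not the authors' framework. It assumes a hypothetical state array whose values should follow a standard normal distribution; the function name `check_moments`, the tolerance, and the injected fault are all assumptions made for illustration.

```python
import random
import statistics


def check_moments(values, expected_mean, expected_std, tol):
    """Return True if the sample mean and standard deviation are both
    within tol of the expected values (a simple statistical assertion)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return abs(mean - expected_mean) <= tol and abs(std - expected_std) <= tol


rng = random.Random(0)

# Hypothetical program state: values the model predicts to follow N(0, 1).
state = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
healthy = check_moments(state, 0.0, 1.0, tol=0.1)

# Simulated fault (assumption): every tenth entry overwritten with a constant.
faulty_state = state[:]
for i in range(0, len(faulty_state), 10):
    faulty_state[i] = 50.0
faulty = check_moments(faulty_state, 0.0, 1.0, tol=0.1)

print("healthy state passes:", healthy)
print("faulty state passes:", faulty)
```

A check of this kind reduces a very large data structure to a few summary values, so a deviation can be flagged without inspecting individual elements.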