RMIT University

The randomized information coefficient: assessing dependencies in noisy data

journal contribution
posted on 2024-11-02, 19:57 authored by Simone Romano, Nguyen Vinh, Cornelia VerspoorCornelia Verspoor, James Bailey
When differentiating between strong and weak relationships using information-theoretic measures, variance plays an important role: the higher the variance of a measure, the lower the chance of correctly ranking the relationships. We propose the randomized information coefficient (RIC), a mutual-information-based measure with low variance, to quantify the dependency between two sets of numerical variables. We first formally establish the importance of achieving low variance when comparing relationships using mutual information estimated with grids. Second, we experimentally demonstrate the effectiveness of RIC for (i) detecting noisy dependencies and (ii) ranking dependencies in the applications of genetic network inference and feature selection for regression. Across these tasks, RIC is highly competitive with 16 other state-of-the-art measures. Other prominent features of RIC include its simplicity and efficiency, making it a promising new method for dependency assessment.
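The core idea the abstract describes — reducing estimator variance by averaging grid-based mutual information over many random discretizations — can be sketched as follows. This is a minimal illustrative sketch only, not the paper's exact definition of RIC; the function name, parameter names (`n_grids`, `max_bins`), and the normalization by the larger marginal entropy are all assumptions made for the example.

```python
import numpy as np

def randomized_information_coefficient(x, y, n_grids=100, max_bins=10, seed=0):
    """Illustrative sketch (not the paper's exact RIC): average a
    normalized plug-in mutual information over many random grids."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    total = 0.0
    for _ in range(n_grids):
        # Draw a random number of bins and random quantile cut points per axis.
        kx = int(rng.integers(2, max_bins + 1))
        ky = int(rng.integers(2, max_bins + 1))
        cx = np.quantile(x, np.sort(rng.random(kx - 1)))
        cy = np.quantile(y, np.sort(rng.random(ky - 1)))
        xi = np.digitize(x, cx)  # bin indices in 0..kx-1
        yi = np.digitize(y, cy)  # bin indices in 0..ky-1
        # Joint contingency table and plug-in mutual information estimate.
        joint = np.zeros((kx, ky))
        np.add.at(joint, (xi, yi), 1.0)
        p = joint / n
        px = p.sum(axis=1, keepdims=True)
        py = p.sum(axis=0, keepdims=True)
        nz = p > 0
        mi = np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz]))
        # Normalize by the larger marginal entropy so grids are comparable
        # (an assumed normalization choice for this sketch).
        hx = -np.sum(px[px > 0] * np.log(px[px > 0]))
        hy = -np.sum(py[py > 0] * np.log(py[py > 0]))
        denom = max(hx, hy)
        total += mi / denom if denom > 0 else 0.0
    # Averaging over many random grids is what drives the variance down.
    return total / n_grids
```

On strongly dependent data this averaged score is markedly higher than on independent noise, which is the ranking behavior the abstract targets; the averaging step, rather than any single grid choice, is what lowers the estimator's variance.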

History

Journal

Machine Learning

Volume

107

Issue

3

Start page

509

End page

549

Total pages

41

Publisher

Springer New York LLC

Place published

United States

Language

English

Copyright

© The Author(s) 2017

Former Identifier

2006114689

Esploro creation date

2022-07-13
