RMIT University

Extending information-theoretic validity indices for fuzzy clustering

journal contribution
posted on 2024-11-02, 01:46 authored by Yang Lei, James Bezdek, Jeffrey Chan, Vinh Nguyen, Simone Romano, James Bailey
Previously, eight popular information-theoretic cluster validity indices were generalized and tested on probabilistic partitions built by the expectation-maximization (EM) algorithm for the Gaussian mixture model. That analysis, however, was limited to probabilistic clusters and offered few explanations for the differences in the indices' performance. In this paper, we extend the tests to partitions found by fuzzy c-means (FCM) and provide further explanations of, and insights into, the performance of these indices. Of the eight generalized indices, we advocate a normalized version of the soft mutual information cluster validity index (NMIsM) as the best overall choice, as it outperforms the other seven indices for both FCM and EM in our tests on synthetic and real data. The superiority of NMIsM is most pronounced for datasets with overlapping clusters and/or clusters of varying size. Finally, we provide a theoretical analysis that helps explain the superior performance of NMIsM compared to the other three normalizations of soft mutual information.

Related Materials

  1. DOI - Is published in 10.1109/TFUZZ.2016.2584644
  2. ISSN - Is published in 10636706

Journal

IEEE Transactions on Fuzzy Systems

Volume

25

Issue

4

Start page

1013

End page

1018

Total pages

6

Publisher

IEEE

Place published

United States

Language

English

Copyright

© 2016 IEEE

Former Identifier

2006069049

Esploro creation date

2020-06-22

Fedora creation date

2018-09-21
