RMIT University
Feature selection for high dimensional imbalanced class data using harmony search

journal contribution
posted on 2024-11-02, 01:52 authored by Alireza Moayedikia, Kok-Leong Ong, Yee Ling Boo, William Yeoh, Richard Jensen
Misclassification costs of minority class data in real-world applications can be very high. This is a challenging problem, especially when the data is also high in dimensionality, because high dimensionality increases overfitting and lowers model interpretability. Feature selection has recently become a popular way to address this problem by identifying the features that best predict a minority class. This paper introduces a novel feature selection method called SYMON, which uses symmetrical uncertainty and harmony search. Unlike existing methods, SYMON uses symmetrical uncertainty to weigh features with respect to their dependency on class labels. This helps to identify features that are powerful in retrieving the least frequent class labels. SYMON also uses harmony search to formulate the feature selection phase as an optimisation problem, so as to select the best possible combination of features. The proposed algorithm is able to deal with situations where a set of features has the same weight, by incorporating two vector tuning operations embedded in the harmony search process. In this paper, SYMON is compared against various benchmark feature selection algorithms that were developed to address the same issue. Our empirical evaluation on different micro-array data sets using the G-Mean and AUC measures confirms that SYMON is comparable to or better than current benchmarks.
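The two quantities named in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) for symmetrical uncertainty and the geometric mean of sensitivity and specificity for G-Mean. It is not the authors' implementation: the function names are hypothetical, features are assumed to be already discretised (micro-array values are typically continuous), and the harmony search step and its vector tuning operations are not reproduced because the abstract does not specify them.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(feature, labels):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1].

    Assumes `feature` is discrete; continuous data must be binned first.
    """
    h_x = entropy(feature)
    h_y = entropy(labels)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so there is no dependency to measure
    h_xy = entropy(list(zip(feature, labels)))  # joint entropy H(X, Y)
    mutual_information = h_x + h_y - h_xy
    return 2.0 * mutual_information / (h_x + h_y)


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    print(symmetrical_uncertainty([0, 0, 1, 1], labels))  # feature that mirrors the labels
    print(symmetrical_uncertainty([0, 1, 0, 1], labels))  # feature independent of the labels
    print(g_mean([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1]))
```

A feature that determines the class label scores close to 1 and an independent feature scores close to 0, which is what makes SU usable as a feature weight. G-Mean drops to 0 whenever a classifier ignores the minority class entirely, which is why it suits imbalanced evaluation better than plain accuracy.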

History

Related Materials

  1. DOI - Is published in 10.1016/j.engappai.2016.10.008
  2. ISSN - Is published in 09521976

Journal

Engineering Applications of Artificial Intelligence

Volume

57

Start page

38

End page

49

Total pages

12

Publisher

Elsevier

Place published

United Kingdom

Language

English

Copyright

© 2016 Elsevier Ltd. All rights reserved

Former Identifier

2006067706

Esploro creation date

2020-06-22

Fedora creation date

2016-11-09
