RMIT University

Missing Value Imputation via Clusterwise Linear Regression

journal contribution
posted on 2024-11-23, 11:22 authored by Napsu Karmitsa, Sona Taheri, Adil Baghirov, Pauliina Makinen
This paper introduces a new method for preprocessing incomplete data. The method is based on clusterwise linear regression and combines two well-known approaches to missing value imputation: linear regression and clustering. The idea is to approximate missing values using only those data points that are similar to the incomplete data point, as clustering-based imputation methods also do. Here, however, linear regression is applied within each cluster to predict the missing values accurately, and the regression models are fitted simultaneously with the clustering. The proposed method is tested on synthetic and real-world data sets and compared with other missing value imputation algorithms. Numerical results demonstrate that the proposed method produces the most accurate imputations on MCAR (missing completely at random) and MAR (missing at random) data sets that have a clear structure and no more than 25% missing data.
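
The general idea described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it uses a simple alternating heuristic (refit one least-squares line per cluster, then reassign each point to the line with the smallest residual) in place of whatever optimization the paper applies, and it assumes that only one column (the target y) has missing values. The function names fit_clr and impute_with_clr, the parameters k, n_iter and seed, and the rule of assigning an incomplete point to the cluster with the nearest feature-space centroid are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_clr(X, y, k=2, n_iter=50, seed=0):
    """Alternating heuristic for clusterwise linear regression (illustrative only).

    X: (n, d) complete feature rows; y: (n,) observed target values.
    Returns per-cluster coefficients (k, d + 1) and cluster labels (n,).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = np.hstack([X, np.ones((n, 1))])  # append an intercept column
    labels = rng.integers(0, k, size=n)  # random initial partition
    coefs = np.zeros((k, A.shape[1]))
    for _ in range(n_iter):
        # Regression step: refit each cluster that has enough points.
        for j in range(k):
            mask = labels == j
            if mask.sum() >= A.shape[1]:
                coefs[j], *_ = np.linalg.lstsq(A[mask], y[mask], rcond=None)
        # Clustering step: move each point to its best-fitting regression.
        residuals = (A @ coefs.T - y[:, None]) ** 2
        new_labels = residuals.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return coefs, labels


def impute_with_clr(X_obs, y_obs, X_new, k=2, seed=0):
    """Predict missing targets for rows X_new from the fitted cluster models.

    Assumption: an incomplete row joins the cluster whose feature-space
    centroid is nearest, because its residual cannot be computed.
    """
    coefs, labels = fit_clr(X_obs, y_obs, k=k, seed=seed)
    dists = np.full((X_new.shape[0], k), np.inf)  # empty clusters stay at inf
    for j in range(k):
        mask = labels == j
        if mask.any():
            centroid = X_obs[mask].mean(axis=0)
            dists[:, j] = np.linalg.norm(X_new - centroid, axis=1)
    nearest = dists.argmin(axis=1)
    A_new = np.hstack([X_new, np.ones((X_new.shape[0], 1))])
    # Row-wise dot product of each incomplete row with its cluster's model.
    return np.einsum("ij,ij->i", A_new, coefs[nearest])
```

With k=1 the sketch reduces to ordinary linear-regression imputation. For larger k the heuristic depends on the random initial partition and may stop at a local optimum, so any comparison with the results reported in the paper would require the authors' own implementation.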

Related Materials

  1. DOI - Is published in 10.1109/TKDE.2020.3001694
  2. ISSN - Is published in 10414347

Journal

IEEE Transactions on Knowledge & Data Engineering

Volume

34

Issue

4

Start page

1889

End page

1901

Total pages

13

Publisher

IEEE

Place published

United States

Language

English

Copyright

© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Former Identifier

2006101884

Esploro creation date

2022-08-07

Open access

  • Yes
