Does Removing/Replacing Missing Values Improve The Models' Classification Performances?
Keywords
Classification Models, Credit Scoring Context, Missing Values Replacement/Removal, Improved Predictive Accuracy
Abstract
This paper explores the effect of removing or replacing missing values on the classification performance of several models. The original data set, which comes from the credit scoring context, contains a relatively large number of missing values. It was not used directly to build the models; instead, five data sets were derived from it, one with missing values removed and four with missing values replaced using different techniques. The models were built and tested on these five data sets. A preliminary computer simulation showed that the models built and tested on the four data sets with replaced missing values exhibited significantly better predictive performance than the models built and tested on the data set with missing values removed.
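To make the distinction between removal and replacement concrete, the following is a minimal sketch in Python. The abstract does not name the replacement techniques that were used, so mean and median replacement are assumptions chosen only for illustration; the toy records, the column names, and the helper functions `remove_missing` and `replace_missing` are likewise hypothetical and are not taken from the paper.

```python
from statistics import mean, median

# Toy records (hypothetical); None marks a missing value.
rows = [
    {"income": 40.0, "age": 30.0},
    {"income": None, "age": 50.0},
    {"income": 60.0, "age": None},
    {"income": 80.0, "age": 40.0},
]


def remove_missing(rows):
    """Removal (listwise deletion): keep only rows with no missing value."""
    return [dict(r) for r in rows if all(v is not None for v in r.values())]


def replace_missing(rows, strategy=mean):
    """Replacement: fill each missing value with a statistic of the
    observed values in the same column (an assumed technique)."""
    columns = list(rows[0].keys())
    fill = {
        c: strategy([r[c] for r in rows if r[c] is not None])
        for c in columns
    }
    return [
        {c: (r[c] if r[c] is not None else fill[c]) for c in columns}
        for r in rows
    ]


removed = remove_missing(rows)                  # fewer rows, no gaps
mean_filled = replace_missing(rows, mean)       # all rows kept
median_filled = replace_missing(rows, median)   # all rows kept
```

Removal shrinks the data set to the complete rows only, whereas replacement keeps every row. That difference in the amount of training data is one plausible reason the two approaches could lead to different classification performance.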