Comparison Of The Performance Of Several Data Mining Methods For Bad Debt Recovery In The Healthcare Industry


Jozef Zurada
Subhash Lonial


healthcare, data mining


The healthcare industry, specifically hospitals and clinical organizations, is often plagued by unpaid bills and collection agency fees. These unpaid bills contribute significantly to the rising cost of healthcare. Unlike financial institutions, healthcare providers typically do not collect financial information about their patients. This lack of information makes it difficult to evaluate whether a particular patient-debtor is likely to pay his or her bill. In recent years, the industry has started to apply data mining tools to reduce its bad-debt balance. This paper compares the effectiveness of five such tools (neural networks, decision trees, logistic regression, memory-based reasoning, and an ensemble model) in evaluating whether a debt is likely to be repaid. The data analysis and the evaluation of model performance are based on a fairly large, unbalanced data sample provided by a healthcare company, in which cases with recovered bad debts are underrepresented. Computer simulation shows that the neural network, logistic regression, and the combined model produced the best classification accuracy. A more thorough interpretation of the results is obtained by analyzing the lift and receiver operating characteristic (ROC) charts. We used the models to score all “unknown” cases, which were not pursued by the company. The best model classified about 34.8% of these cases as “good” cases. To collect bad debts more effectively, we recommend that a company first deploy and use the models before it refers unrecovered cases to a collection agency.
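The abstract evaluates the models with lift and ROC charts because plain accuracy can mislead when recovered debts are rare. As a rough illustration only, the following sketch computes the two quantities behind those charts from a list of true outcomes and model scores. The function names `roc_auc` and `lift_at` and the toy data are hypothetical and are not taken from the paper, which does not describe its implementation.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed by pairwise comparison.

    This equals the probability that a randomly chosen "good" case
    (label 1) receives a higher score than a randomly chosen "bad"
    case (label 0); tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def lift_at(labels: Sequence[int], scores: Sequence[float], fraction: float) -> float:
    """Lift in the top `fraction` of cases when ranked by score.

    Lift is the rate of "good" cases among the top-ranked cases divided
    by the rate of "good" cases in the whole sample.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("no positive cases in the sample")
    # Rank cases from highest to lowest score and keep the top slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(fraction * len(ranked)))
    top_rate = sum(y for _, y in ranked[:k]) / k
    return top_rate / base_rate


if __name__ == "__main__":
    # Toy, made-up data: 1 = debt recovered ("good"), 0 = not recovered.
    toy_labels = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    print("AUC:", roc_auc(toy_labels, toy_scores))
    print("Lift in top 20%:", lift_at(toy_labels, toy_scores, 0.2))
```

A lift above 1 in the top-ranked slice means that pursuing the highest-scored cases first would recover more debts than pursuing cases at random, which is the practical use the abstract recommends.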

