Two Stage Comparison of Classifier Performances for Highly Imbalanced Datasets

Goran Oreški; Stjepan Oreški

2015

During knowledge discovery in data, imbalanced learning data often arise and pose a significant challenge for data mining methods. In this paper, we investigate the influence of class-imbalanced data on the classification results of artificial intelligence methods, namely neural networks and support vector machines (SVM), and of classical classification methods, represented by RIPPER and the Naïve Bayes classifier. All experiments are conducted on 30 different imbalanced datasets obtained from the KEEL (Knowledge Extraction based on Evolutionary Learning) repository. Classification quality is measured by accuracy and by the area under the ROC curve (AUC). The results indicate that the neural network and the support vector machine improve in AUC when applied to balanced data, but at the same time their classification accuracy deteriorates. RIPPER behaves similarly, although the changes are smaller in magnitude, while the Naïve Bayes classifier shows an overall deterioration on balanced distributions. The number of instances in the highly imbalanced datasets has a significant additional impact on the classification performance of the SVM classifier. The results show the potential of the SVM classifier for ensemble creation on imbalanced datasets.
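The contrast between accuracy and AUC that the abstract reports can be illustrated with a minimal sketch. It is not the authors' experimental setup: the toy labels, the scores, and the two classifiers below are hypothetical and were chosen only to show how the two measures can move in opposite directions on imbalanced data. The AUC is computed here by pairwise comparison, which equals the area under the ROC curve.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random positive is scored above
    a random negative; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 95 negatives and 5 positives (imbalance ratio 19:1).
y_true = [0] * 95 + [1] * 5

# Classifier A (hypothetical): always predicts the majority class.
scores_a = [0.0] * 100
pred_a = [0] * 100

# Classifier B (hypothetical): scores every positive above every negative,
# but at a 0.5 threshold it also flags 20 negatives as positive.
scores_b = [0.6] * 20 + [0.1] * 75 + [0.9] * 5
pred_b = [1 if s >= 0.5 else 0 for s in scores_b]

acc_a, auc_a = accuracy(y_true, pred_a), auc(y_true, scores_a)
acc_b, auc_b = accuracy(y_true, pred_b), auc(y_true, scores_b)

print(f"A: accuracy={acc_a:.2f}, AUC={auc_a:.2f}")
print(f"B: accuracy={acc_b:.2f}, AUC={auc_b:.2f}")
```

In this toy case, classifier A reaches the higher accuracy without detecting a single minority instance, while classifier B separates the classes perfectly by rank yet scores lower on accuracy. This is one reason both measures are usually reported together for imbalanced datasets.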

Reference Key	oreki2015journaltwo
Authors	Goran Oreški; Stjepan Oreški
Journal	Journal of Information and Organizational Sciences
Year	2015
DOI	DOI not found
URL	http://jios.foi.hr/index.php/jios/article/view/937
Keywords	classification algorithm; imbalanced data; reduction of class imbalance; information theory