CIVILICA We Respect the Science
(Specialized publisher of national conference proceedings / Publishing license number from the Ministry of Culture and Islamic Guidance: 8971)

A Hybrid optimized ensemble algorithm based on Genetic Programming for imbalanced data classification

Paper title: A Hybrid optimized ensemble algorithm based on Genetic Programming for imbalanced data classification
National paper ID: ITCT12_024
Published in: 12th International Conference on Information Technology, Computer and Telecommunications, 1400 (Solar Hijri)
Authors:

Maliheh Roknizadeh - Faculty of Computer and Information Technology Engineering, Neyshabur Branch, Islamic Azad University, Neyshabur, Iran
Hossein Monshizadeh Naeen - Faculty of Computer and Information Technology Engineering, Neyshabur Branch, Islamic Azad University, Neyshabur, Iran

Abstract:
The classification of imbalanced data is one of the most actively discussed topics in data mining. Several approaches have been proposed in recent years, including algorithm-level (internal) approaches, data-level (external) techniques, and cost-sensitive methods. Despite extensive research, some challenges remain unsolved: the importance of individual samples is ignored during balancing, the appropriate number of classifiers is hard to determine, and the classifiers are not optimized when they are combined. This paper aims to improve the efficiency of an ensemble method in sampling the training data, especially for minority classes, and to select better base classifiers for the combination. We propose a hybrid ensemble algorithm based on Genetic Programming (GP) for two-class imbalanced data classification. The study uses historical data from the UCI Machine Learning Repository to assess minority-class prediction on imbalanced data sets. The proposed algorithm is evaluated in RapidMiner Studio v. 7.5. Experimental results show the performance of the proposed method on the specified data sets. Training-set sizes of 40% and 50% gave better minority-class prediction accuracy than the other sizes tested.

Keywords:
Bagging, Boosting, Hybrid ensemble method, Genetic programming algorithm, combination classifier, SMOTE
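
As an illustration of the data-level balancing that the SMOTE keyword refers to, the following is a minimal sketch of SMOTE-style oversampling for a two-class problem. It is not the authors' algorithm: the GP-based ensemble is not reproduced here, and the function names (`smote_like`, `balance`), the neighbour count `k`, and the fixed `seed` are assumptions made only for this example.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (hypothetical helper)."""
    if n_new == 0:
        return []
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def balance(majority, minority, k=3, seed=0):
    """Oversample the minority class until it matches the majority size."""
    deficit = max(0, len(majority) - len(minority))
    return list(minority) + smote_like(minority, deficit, k=k, seed=seed)


# Usage: 10 majority points versus 3 minority points.
majority = [(float(i), float(i % 3)) for i in range(10)]
minority = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5)]
balanced = balance(majority, minority)
print(len(balanced))
```

Each synthetic point lies on a line segment between two existing minority samples, so the minority class grows without exact duplicates. In a bagging or boosting ensemble, a step like this would typically be applied to each training subset before the base classifier is fitted.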

Dedicated paper page and full-text download: https://civilica.com/doc/1261189/