CIVILICA We Respect the Science
(Specialized publisher of the country's conferences / Publishing license number from the Ministry of Culture and Islamic Guidance: 8971)

Improvement of learning from concept drifting data streams with unlabeled and mixed data

Article title: Improvement of learning from concept drifting data streams with unlabeled and mixed data
National article ID: JR_IJMEC-4-12_028
Published in: issue 12, volume 4, Jul, year 1393 (Solar Hijri)
Authors:

Farzaneh Azimi - Islamic Azad University of Qazvin, Iran
Karim Faez - Amirkabir University of Technology, Tehran, Iran

Abstract:
In data stream analysis, detecting concept drift is an important problem for real-time decision making. Most existing work on data stream classification assumes that all streaming data are labeled and that class labels are immediately available. However, in real-world applications such as credit fraud and intrusion detection, this assumption is not always valid. In addition, most existing clustering approaches are applicable to purely numerical or purely categorical data, but not to both. With this motivation, we propose a semi-supervised classification algorithm for data streams with unlabeled and mixed numerical and categorical data (SUNM), in which a decision tree is adopted as the classification model. When growing a tree, a new clustering algorithm is applied to produce concept clusters and to label unlabeled data at the leaves. Based on the deviations between historical concept clusters and new ones, potential concept drifts are distinguished from noise. The experimental results show the efficacy of the proposed approach.
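
The abstract does not give the details of the SUNM clustering step, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a k-prototypes-style distance (squared Euclidean on numeric attributes plus a weighted mismatch count on categorical ones), labels an unlabeled instance with the majority label of its nearest cluster, and flags a possible drift when a new cluster prototype deviates from the historical one by more than a threshold. All function names, the weight `gamma`, and the `threshold` parameter are hypothetical.

```python
from collections import Counter


def mixed_distance(a, b, numeric_idx, categorical_idx, gamma=1.0):
    """Squared Euclidean distance on numeric attributes plus gamma times
    the number of mismatching categorical attributes (assumed measure)."""
    numeric_part = sum((a[i] - b[i]) ** 2 for i in numeric_idx)
    categorical_part = sum(1 for i in categorical_idx if a[i] != b[i])
    return numeric_part + gamma * categorical_part


def prototype(points, numeric_idx, categorical_idx):
    """Cluster representative: mean of each numeric attribute and
    most frequent value of each categorical attribute."""
    proto = list(points[0])
    for i in numeric_idx:
        proto[i] = sum(p[i] for p in points) / len(points)
    for i in categorical_idx:
        proto[i] = Counter(p[i] for p in points).most_common(1)[0][0]
    return tuple(proto)


def label_unlabeled(instance, clusters, numeric_idx, categorical_idx):
    """Return the label of the nearest cluster.
    `clusters` is a list of (prototype, majority_label) pairs."""
    nearest = min(
        clusters,
        key=lambda c: mixed_distance(instance, c[0], numeric_idx, categorical_idx),
    )
    return nearest[1]


def looks_like_drift(old_proto, new_proto, numeric_idx, categorical_idx, threshold):
    """Flag a potential drift when the new prototype deviates from the
    historical one by more than `threshold` (hypothetical criterion)."""
    return mixed_distance(old_proto, new_proto, numeric_idx, categorical_idx) > threshold
```

For instance, with one numeric attribute (a transaction amount) and one categorical attribute (a channel), clusters built from `[(10.0, "web"), (12.0, "web"), (14.0, "pos")]` labeled "legit" and `[(900.0, "atm"), (1100.0, "atm")]` labeled "fraud" would assign the unlabeled instance `(950.0, "atm")` to "fraud". A real stream learner would also need a rule for separating a persistent deviation (drift) from a transient one (noise), which this sketch leaves out.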

Keywords:
Data stream, Concept drift, Semi-supervised classification, Clustering

Article page and full-text download: https://civilica.com/doc/443382/