Comparing classification algorithms of data mining in diagnosis of diabetes and assessing the effectiveness of k-fold cross validation in the accuracy of the constructed model
Published in: International Conference on Engineering and Computer Science
Publication year: 1395
Document type: Conference paper
Language: English
Pages: 6
National scientific document ID: ICCSE01_030
Indexing date: 14 Shahrivar 1396
Abstract:
One application of data mining in medicine is the construction of models for disease diagnosis. The more a model learns from previous data, the more accurately it performs. The essential issue is that the training and testing data must be selected so that the model learns as efficiently as possible from previous data and achieves the highest accuracy in diagnosing the disease. In this study, the Pima diabetes dataset is used; models for predicting and diagnosing diabetes are developed with the KNN, SVM, Naive Bayes, and Decision Tree classification methods, and the accuracy of each model is evaluated. The effect of k-fold cross-validation on the accuracy of each model is then assessed. According to the findings, k-fold cross-validation increases model accuracy, and no single classification technique always delivers the best performance and accuracy; the outcome depends on the nature and complexity of the dataset. The simulations are carried out in RapidMiner.
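The comparison described in the abstract can be illustrated with a minimal sketch. The paper itself uses RapidMiner; the code below swaps in scikit-learn purely for illustration and is not the authors' setup. The data is a synthetic stand-in with the same shape as the Pima dataset (768 rows, 8 features), and the hyperparameters, the 70/30 hold-out split, and k = 10 are all assumptions, so the printed accuracies say nothing about the paper's results.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic binary-classification data standing in for the Pima
# dataset (same shape: 768 samples, 8 features). Replace with the real data.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)

# ASSUMPTION: default-style hyperparameters; the paper does not list its settings.
models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}


def evaluate(models, X, y, k=10, seed=0):
    """Return hold-out and mean k-fold accuracy for each model."""
    # Single 70/30 stratified split, used as the baseline evaluation.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    # Stratified k-fold keeps the class ratio similar in every fold.
    folds = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    results = {}
    for name, model in models.items():
        holdout = model.fit(X_tr, y_tr).score(X_te, y_te)
        # cross_val_score clones the model and refits it on each fold.
        kfold = cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
        results[name] = {"holdout": holdout, "kfold": kfold}
    return results


results = evaluate(models, X, y)
for name, acc in results.items():
    print(f"{name:14s} hold-out={acc['hold' + 'out']:.3f}  {10}-fold={acc['kfold']:.3f}")
```

Averaging over k folds uses every sample for both training and testing, so the estimate depends less on one particular split than a single hold-out score does. Which classifier ranks first will vary with the dataset, which is the point the abstract makes.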
Authors
Nasim Nikbakhsh
Department of Computer, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran
GholamReza Dehghani
Department of Computer, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran
Dr. Farsad Zamani
Department of Computer, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran