Comparison of result of machine learning algorithms in predicting heart disease

Sajad Yousefi; Maryam Poornajaf

Published in: Frontiers in Health Informatics, Vol. 12, Issue 1

Publication year: 1402 (Solar Hijri calendar)

Document type: Journal article

Language: English

This paper is 10 pages long and is available for download in PDF format.

Permanent link to this paper:

https://civilica.com/doc/1841793

National scientific document ID:

JR_IJIMI-12-1_005

Indexing date: 14 Azar 1402

Abstract:

Introduction: Heart disease generally refers to conditions involving narrowed or blocked blood vessels that can lead to a heart attack, chest pain, or stroke. Earlier identification of heart disease may reduce the death rate. The cost of medical diagnosis makes early treatment impractical for a large number of people. One alternative is to apply machine learning models to a dataset. This article aims to find the most efficient and accurate machine learning models for disease prediction.

Material and Methods: Several supervised machine learning algorithms were used for the diagnosis and prediction of heart disease, including logistic regression, decision tree, random forest, and k-nearest neighbors (KNN). The algorithms were applied to a dataset of 70,000 samples taken from Kaggle. Techniques such as feature importance, hold-out validation, 10-fold cross-validation, stratified 10-fold cross-validation, and leave-one-out cross-validation were used to obtain effective performance and increase accuracy. In addition, feature importance scores were estimated for each feature in some algorithms, and the features were ranked by these scores. All work was done in the Anaconda environment using the Python programming language and the scikit-learn library.

Results: The algorithms were compared with one another based on the ROC curve and on criteria such as accuracy, precision, sensitivity, and F1 score, which were evaluated for each model. In this evaluation, the random forest algorithm, with an F1 score of 92%, accuracy of 92%, and AUC-ROC of 95%, performed better than the other algorithms.

Conclusion: The area under the ROC curve and the evaluation criteria of several machine learning classification algorithms for the diagnosis and prediction of heart disease were compared to determine the most appropriate classifier.
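As an illustration only, the comparison described in the abstract could be sketched roughly as follows with scikit-learn. This is not the authors' code: the data are a small synthetic stand-in generated with make_classification rather than the Kaggle dataset, and the hyperparameters, feature scaling step, sample size, and variable names (models, results, ranking) are assumptions chosen to keep the example short.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic binary-classification data, NOT the paper's Kaggle dataset.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4, random_state=0
)

# The four algorithm families named in the abstract; all settings are illustrative.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Stratified 10-fold cross-validation keeps the class ratio similar in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# "recall" is scikit-learn's name for sensitivity.
scoring = ["accuracy", "precision", "recall", "f1", "roc_auc"]

results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    # Average each metric over the 10 test folds.
    results[name] = {m: scores[f"test_{m}"].mean() for m in scoring}

for name, metrics in results.items():
    print(name, {m: round(v, 3) for m, v in metrics.items()})

# Rank features by the random forest's impurity-based importance scores.
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
ranking = sorted(
    enumerate(forest.feature_importances_), key=lambda pair: pair[1], reverse=True
)
for index, score in ranking:
    print(f"feature_{index}: {score:.3f}")
```

Hold-out validation and leave-one-out cross-validation, also mentioned in the abstract, would correspond to train_test_split and LeaveOneOut in sklearn.model_selection. Leave-one-out is costly on a dataset of 70,000 samples because it fits one model per sample.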

Keywords:

F1-Score, Machine Learning, Heart Disease, Classification, Importance Score, Accuracy

Authors

Sajad Yousefi

Faculty Member, Department of Electrical Engineering, Technical and Vocational University (TVU), Tehran, Iran

Maryam Poornajaf

Faculty Member, Department of Computer Engineering, Technical and Vocational University (TVU), Tehran, Iran