CIVILICA We Respect the Science
(Specialized publisher of the country's conferences / Publishing license number from the Ministry of Culture and Islamic Guidance: 8971)

Determination of the Distribution Pattern of Mortality Using Data Mining Technique in Golestan Province since 2007 to 2009

Article title: Determination of the Distribution Pattern of Mortality Using Data Mining Technique in Golestan Province since 2007 to 2009
National article ID: JR_JOBJ-3-2_006
Published in: 1394 (Solar Hijri calendar)
Authors:

Fatemeh Bagheri - Computer Engineering Department, Golestan University, Gorgan, Iran.
Fatemeh Ahangari - Computer Engineering Department, Golestan University, Gorgan, Iran.
Naser Behnampour - Health Department, Golestan Medical University, Gorgan, Iran.

Abstract:
Background and objectives: Investigating mortality in a population is considered an appropriate method for assessing its health status. However, problems remain, such as a lack of confidence in measurement accuracy and in the quality of data collection. The establishment of death registration systems, the use of International Classification of Diseases codes, and the integration of mortality data by the responsible organizations have resolved a large part of these problems. In this study, based on a set of parameters, the study population was divided into two groups: those who died under one year of age (infants) and those who died over one year of age (adults). Both groups were then clustered using the K-means method to identify distinct groups. Hidden models and useful patterns were also discovered using decision tree algorithms. Finally, a neural network algorithm was used to rank the attributes in order of importance.
Methods: In this research, data on 12,865 deceased individuals in Golestan province from 2007 to 2009 were studied. The data were obtained from the Health Center of Golestan province. The main attributes used were the age of the deceased, gender, cause of death, place of residence and place of death. The K-means algorithm was used to cluster the data. Decision tree algorithms and a neural network algorithm were also used for classification. Finally, results and rules were extracted. Because the causes of death differ in nature between infants and adults, the two groups were studied separately.
Results: In the clustering phase, the optimal number of clusters was determined by the Dunn index: eight clusters for infants and seven for adults. Among four decision tree algorithms (C5.0, QUEST, CHAID and CART), C5.0 was the best classifier, with correct classification rates of 77.37% on the infant data and 96.86% on the adult data. Age, gender and place of death were identified by the neural network algorithm as the most important variables.
Conclusion: In the present study, the collected mortality data were clustered by considering the effective factors and the International Classification of Diseases standard. The hidden patterns of mortality for infants and adults were extracted. Because decision tree algorithms are explicit and intelligible, the results and extracted rules are very useful for specialists in this field.
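
The abstract states that the number of clusters was chosen with the Dunn index but does not give the computation. The following is a minimal sketch, not the authors' implementation: it assumes records already encoded as numeric tuples and Euclidean distance, and it uses hand-written toy points and labelings (all hypothetical, not study data) in place of real K-means output. The Dunn index is the smallest distance between points of different clusters divided by the largest distance between points of the same cluster, so larger values indicate compact, well-separated clusters.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the Dunn index needs at least two clusters")

    # Largest distance between two points that share a cluster (diameter).
    max_diameter = max(
        (dist(p, q)
         for members in clusters.values()
         for p, q in combinations(members, 2)),
        default=0.0,
    )
    # Smallest distance between two points that lie in different clusters.
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters.values(), 2)
        for p in a
        for q in b
    )
    if max_diameter == 0.0:
        return float("inf")  # every cluster is a single point
    return min_separation / max_diameter


# Toy data (hypothetical): two obvious groups in the plane.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Candidate labelings for k = 2 and k = 3, standing in for K-means results.
candidates = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 2, 1, 1, 1],
}

# Keep the k whose labeling gives the highest Dunn index.
best_k = max(candidates, key=lambda k: dunn_index(points, candidates[k]))
```

In practice the candidate labelings would come from running K-means once per candidate k on the encoded mortality records; how the study encoded categorical attributes such as cause of death is not described in the abstract.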

Keywords:
Data Mining, Clustering, Classification, Decision Tree, Mortality

Article page and full-text download: https://civilica.com/doc/1412820/