Risk Classification of Imbalanced Data for Car Insurance Companies: Machine Learning Approaches

Farzan Khamesian; Maryam Esna-Ashari; Eric Dei Ofosu-Hene; Farbod Khanizadeh

Article title: Risk Classification of Imbalanced Data for Car Insurance Companies: Machine Learning Approaches
National article ID: JR_IJMAC-12-3_001
Published in the year 1401 (Solar Hijri calendar)

Author details:

Farzan Khamesian - Insurance Research Center, Tehran, Iran
Maryam Esna-Ashari - Insurance Research Center, Tehran, Iran
Eric Dei Ofosu-Hene - Department of Accounting and Finance, Faculty of Business and Law, De Montfort University, Leicester, UK
Farbod Khanizadeh - Insurance Research Center, Tehran, Iran

Abstract:

This paper presents a mechanism for insurance companies to identify the most effective features for classifying the risk of their customers in third-party liability (TPL) car insurance. In practice, underwriting is carried out based on expert experience, and the industry lacks a systematic method for categorizing policyholders by risk level. We analyzed 13,388 observations from an insurance claim dataset of bodily injury reports provided by an Iranian insurance company. The main challenge is that the dataset is imbalanced. We employ logistic regression and random forest with different resampling schemes applied to the original data in order to improve model performance. Results indicate that random forest with hybrid resampling methods is the best classifier, and that victim age, premium, car age, and insured age are the most important factors for claim prediction.
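The abstract does not say which hybrid resampling methods were used, so the following is only a minimal sketch of one common reading: randomly undersample the majority class and randomly oversample the minority class until both reach a shared target size. The function name `hybrid_resample`, the `target_per_class` parameter, the two toy features, and the 950/50 class split are all hypothetical and are not taken from the paper or its dataset.

```python
import random
from collections import Counter


def hybrid_resample(rows, labels, target_per_class, seed=0):
    """Hypothetical helper: bring every class to target_per_class rows.

    Classes larger than the target are randomly undersampled (without
    replacement); classes smaller than the target are randomly
    oversampled (with replacement).
    """
    rng = random.Random(seed)

    # Group the feature rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target_per_class:
            # Majority side: keep a random subset.
            chosen = rng.sample(members, target_per_class)
        else:
            # Minority side: keep all rows and add random duplicates.
            extra = rng.choices(members, k=target_per_class - len(members))
            chosen = members + extra
        out_rows.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_rows, out_labels


# Synthetic toy data (not the paper's data): two made-up numeric features
# per policy, with 50 claim rows (label 1) and 950 no-claim rows (label 0).
data_rng = random.Random(42)
rows = [(data_rng.randint(18, 80), data_rng.randint(0, 25)) for _ in range(1000)]
labels = [1] * 50 + [0] * 950

balanced_rows, balanced_labels = hybrid_resample(rows, labels, target_per_class=300)
print(Counter(labels))           # class counts before resampling
print(Counter(balanced_labels))  # class counts after resampling
```

In a setup like this, the resampling would normally be applied to the training split only, so that the evaluation data keeps the original class imbalance; the balanced rows could then be passed to a classifier such as logistic regression or random forest.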

Keywords:

Machine Learning, Supervised Learning, Imbalanced Data, Claim Risk, Classification

Article page and full-text download: https://civilica.com/doc/1628665/