Risk Classification of Imbalanced Data for Car Insurance Companies: Machine Learning Approaches
Publication year: 1401 (Solar Hijri)
Document type: Journal article
Language: English
This paper is 10 pages long and is available for download in PDF format.
National scientific document ID:
JR_IJMAC-12-3_001
Indexing date: 22 Farvardin 1402
Abstract:
This paper presents a mechanism for insurance companies to identify the most effective features for classifying the risk of their customers in third-party liability (TPL) car insurance. The underwriting process is carried out on the basis of expert experience, and the industry lacks a systematic method for categorizing policyholders by risk level. We analyzed 13,388 observations from an insurance claim dataset of bodily injury reports provided by an Iranian insurance company. The main challenge is that the dataset is imbalanced. We employ logistic regression and random forest with different resamplings of the original data in order to improve model performance. The results indicate that random forest with hybrid resampling methods is the best classifier and that victim age, premium, car age, and insured age are the most important factors for claims prediction.
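The abstract does not say which hybrid resampling methods were used, so the following is only a minimal sketch of the general idea, assuming the simplest possible hybrid: random undersampling of the majority class combined with random oversampling of the minority class. The function name `hybrid_resample`, the parameter `target_per_class`, and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def hybrid_resample(rows, labels, target_per_class, seed=0):
    """Bring every class to exactly target_per_class rows.

    Assumption (not from the paper): classes larger than the target are
    randomly undersampled without replacement, and classes smaller than
    the target are randomly oversampled by duplicating existing rows.
    """
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    new_rows, new_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target_per_class:
            # Undersample: draw a subset without replacement.
            chosen = rng.sample(members, target_per_class)
        else:
            # Oversample: keep all originals, then add random duplicates.
            extra = rng.choices(members, k=target_per_class - len(members))
            chosen = members + extra
        new_rows.extend(chosen)
        new_labels.extend([label] * target_per_class)
    return new_rows, new_labels


# Toy imbalanced data: 95 "no claim" rows (label 0), 5 "claim" rows (label 1).
rows = [[i] for i in range(100)]
labels = [0] * 95 + [1] * 5

balanced_rows, balanced_labels = hybrid_resample(rows, labels, target_per_class=30)
print(Counter(balanced_labels))
```

In practice, resampling of this kind would be applied to the training split only, so that the evaluation data keeps the original class proportions.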
Authors
Farzan Khamesian
Insurance Research Center, Tehran, Iran
Maryam Esna-Ashari
Insurance Research Center, Tehran, Iran
Eric Dei Ofosu-Hene
Department of Accounting and Finance, Faculty of Business and Law, De Montfort University, Leicester, UK
Farbod Khanizadeh
Insurance Research Center, Tehran, Iran