Diagnosis and Classifying Cancer Subtypes Based On Gene Expression Data Using Cost-Sensitive Hybrid Deep Learning

Rashed, Akbari; Mahbobeh, Shamsi; Razieh, Hasehmi alam; Abdolreza, Rasouli Kenari

Article title: Diagnosis and Classifying Cancer Subtypes Based On Gene Expression Data Using Cost-Sensitive Hybrid Deep Learning
National article ID: AIMS01_227
Published in the First International Congress on Artificial Intelligence in Medical Sciences, 1402 (Solar Hijri calendar)

Authors:

Rashed Akbari - Qom University of Technology
Mahbobeh Shamsi - Qom University of Technology
Razieh Hasehmi alam - Qom University of Technology
Abdolreza Rasouli Kenari - Qom University of Technology

Abstract:

Accurately identifying cancer subgroups is crucial for effective diagnosis and prediction of cancer outcomes. Deep-learning methods have gained popularity in recent years for this purpose. However, the performance of a deep neural network depends heavily on its architecture, which makes it challenging to determine the best structure. Moreover, the large number of genes in gene expression datasets and the imbalanced data distribution among classes pose significant obstacles to achieving high accuracy in cancer subgroup classification models.

To address these issues, we propose a novel convolutional neural network model that uses a cost-sensitive learning approach to improve the accuracy of minority class identification. We also use three techniques, namely the Fisher ratio, anomaly sets, and a combination of both, to reduce the number of genes in the preprocessing stage. Our cost-sensitive approach creates a cost matrix based on the class distribution and uses it in the loss function of the convolutional neural network to calculate the error rate.

We evaluated the proposed method on two cancer datasets and compared the results with those of a convolutional model without feature selection and cost-sensitive learning. We used four standard metrics, namely accuracy, recall, precision, and F1-score, to measure performance. Our results demonstrate that selecting appropriate genes and using a cost-sensitive learning approach significantly improve the performance of the proposed method, achieving increases of 11%, 10%, 18%, and 21% in accuracy, recall, precision, and F1-score, respectively.

Overall, our approach demonstrates the effectiveness of combining cost-sensitive learning and feature selection techniques to address the challenges of imbalanced data and a large number of genes in cancer subgroup classification. This study has important implications for improving the accuracy and efficiency of cancer diagnosis and prediction using deep learning methods.
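
The two ingredients named in the abstract, Fisher-ratio gene selection and a class-distribution-based cost in the loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`fisher_scores`, `select_top_genes`, `class_costs`, `weighted_cross_entropy`) are hypothetical, the Fisher ratio is taken in its common multi-class form (between-class over within-class variance per gene), and the cost matrix is simplified to one inverse-frequency cost per true class, which is an assumption since the abstract does not give the exact formula.

```python
import numpy as np

def fisher_scores(X, y):
    """Per-gene Fisher ratio: between-class scatter over within-class scatter.
    X has shape (samples, genes); y holds integer class labels."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        n_c = Xc.shape[0]
        between += n_c * (Xc.mean(axis=0) - overall_mean) ** 2
        within += n_c * Xc.var(axis=0)
    # Small constant avoids division by zero for constant genes.
    return between / (within + 1e-12)

def select_top_genes(X, y, k):
    """Indices of the k genes with the highest Fisher ratio."""
    return np.argsort(fisher_scores(X, y))[::-1][:k]

def class_costs(y, n_classes):
    """Assumed cost scheme: inverse class frequency, so rare classes cost more."""
    counts = np.bincount(y, minlength=n_classes).astype(float)
    return counts.sum() / (n_classes * np.maximum(counts, 1.0))

def weighted_cross_entropy(probs, y, costs):
    """Cross-entropy where each sample is weighted by the cost of its true class."""
    per_sample = -np.log(probs[np.arange(len(y)), y] + 1e-12)
    w = costs[y]
    return float(np.sum(w * per_sample) / np.sum(w))
```

In a setup like the one described, the selected gene indices would be used to slice the expression matrix before training, and the weighted loss would replace the plain cross-entropy of the network so that errors on minority subtypes contribute more to the gradient.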

Article page and full-text download: https://civilica.com/doc/1703175/