Performance evaluation of different machine learning classification models on expression profiles of tumor educated platelets data

Publication year: 1400 (Solar Hijri calendar)
Document type: Conference paper
Language: English

The full text of this paper has not been provided and is not available.

National scientific document ID:

IBIS10_032

Indexing date: 5 Tir 1401 (26 June 2022)

Abstract:

Because liquid biopsy is less invasive than tissue biopsy, liquid biopsy biomarkers for the early detection of cancer are under active consideration. Expression profiles of tumor-educated platelets (TEPs) obtained by liquid biopsy can serve as one such biomarker. Machine learning classification models, trained on a feature space derived from TEP expression data, make it possible to predict sample categories. Here, the aim is to evaluate the performance of different classification models for the diagnosis of cancer based on platelet expression profiles. First, expression profiles of TEPs from 230 patients with breast, liver, colorectal, brain, pancreatic, and lung cancers, together with profiles of 55 healthy individuals, were downloaded from the GEO database (GSE68086). The data were then normalized using the edgeR package (R software version 4.1.0), and the 2000 genes with the highest variance were selected. Next, several classification models, namely SVM, LDA, logistic regression, boosting, classification tree, and random forest, were evaluated on the feature-selected data using 10-fold cross-validation. In addition, the variable importance of the selected genes was obtained using the polynomial SVM. Pathway enrichment analysis was then performed on the H, C6, and C7 gene sets of the MSigDB database using the preranked GSEA method. The results showed that the polynomial SVM had the highest performance on the validation set (accuracy ~ 95%, mean AUC ~ 0.994, sd AUC ~ 0.0093). The linear SVM model had the second-best performance on the validation set (mean AUC ~ 0.9917). In the pathway enrichment analysis, 10 immunological pathways were enriched in cancer samples compared to healthy donors. Overall, the results showed that the polynomial SVM can be a well-performing model for classifying TEP data, and they indicate that the expression profile of TEPs can be considered a candidate biomarker in liquid biopsy.
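
The feature selection and model comparison steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the study used R with edgeR, whereas the sketch below uses Python with scikit-learn, a small synthetic matrix in place of the normalized GSE68086 data, a binary cancer versus healthy label, and default hyperparameters (including a degree-3 polynomial kernel). The helper `select_top_variance` and all sizes are hypothetical choices made for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def select_top_variance(X, k):
    """Return the k highest-variance columns of X and their column indices."""
    variances = X.var(axis=0)
    top_idx = np.argsort(variances)[::-1][:k]
    return X[:, top_idx], top_idx


# Synthetic stand-in for a normalized expression matrix (samples x genes).
# Assumption: class 0 = healthy (about 20%), class 1 = cancer (about 80%).
X, y = make_classification(
    n_samples=120, n_features=300, n_informative=10,
    weights=[0.2, 0.8], random_state=0,
)

# Keep the highest-variance features (the abstract keeps 2000 genes;
# 50 is used here only to keep the example fast).
X_top, top_idx = select_top_variance(X, k=50)

# Model families named in the abstract, with illustrative default settings.
models = {
    "polynomial SVM": SVC(kernel="poly", degree=3),
    "linear SVM": SVC(kernel="linear"),
    "LDA": LinearDiscriminantAnalysis(),
    "logistic regression": LogisticRegression(max_iter=1000),
    "boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "classification tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Stratified 10-fold cross-validation, scored by AUC on each held-out fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    pipeline = make_pipeline(StandardScaler(), model)
    aucs = cross_val_score(pipeline, X_top, y, cv=cv, scoring="roc_auc")
    results[name] = (aucs.mean(), aucs.std(ddof=1))  # mean and sample sd

# Report models from highest to lowest mean AUC.
for name, (mean_auc, sd_auc) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{name:20s} mean AUC = {mean_auc:.3f}, sd = {sd_auc:.3f}")
```

On synthetic data the ranking of models carries no biological meaning; the sketch only shows how a mean and standard deviation of AUC across folds, as reported in the abstract, could be computed for each model.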

Authors

Sajedeh Bahonar

Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran

Fahimeh Palizban

Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran

Hesam Montazeri

Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran