CIVILICA We Respect the Science
(Specialized publisher of national conference proceedings / Publishing license number from the Ministry of Culture and Islamic Guidance: 8971)

ResidualConv1D: A Deep Learning Approach for Enhancing Splice Site Prediction across Genomic Contexts

Paper title: ResidualConv1D: A Deep Learning Approach for Enhancing Splice Site Prediction across Genomic Contexts
National paper ID: ICAIFT01_003
Published in the First Conference on "Artificial Intelligence and Future-Oriented Technologies", 1402 (Solar Hijri calendar)
Authors:

Mohammad Reza Rezvan - Department of Electrical and Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
Ali Ghanbari Sorkhi - Department of Electrical and Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
Jamshid Pirgaz - Department of Electrical and Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
Mohammad Mehdi Pourhashem Kallehbasti - Department of Electrical and Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran

Abstract:
This study addresses the challenge of accurately predicting splice sites, a crucial element in understanding gene expression and protein synthesis. We hypothesize that conventional prediction methods may lack the specificity and adaptability required for diverse genomic contexts. To address this, we present a method that combines two-gram features and one-hot encoding with a deep convolutional neural network, ResidualConv1D. The approach first applies the two-gram technique to capture nucleotide dependencies at splice sites. The sequences are then enriched with these two-gram features using one-hot encoding. The core of the methodology is the ResidualConv1D model, which uses convolutional blocks with residual connections to detect complex sequence patterns. The results indicate a clear improvement in splice site prediction accuracy. The model performs best on the HS3D acceptor and Arabidopsis thaliana donor datasets, where it outperforms the established Ensemble Splice algorithm. On the HS3D acceptor dataset, it achieved an accuracy of 94.18% and an F1-score of 94.24%. It is also competitive on a range of metrics across the other datasets, which indicates robustness in different genomic environments. In conclusion, the combination of two-gram features, one-hot encoding, and the ResidualConv1D model substantially improves the accuracy of splice site prediction across diverse species. This improvement could help advance the understanding of gene splicing mechanisms.
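The abstract does not give the encoding details or the layer configuration, so the following is only a minimal sketch of the two ideas it names, under stated assumptions: two-grams are taken as overlapping dinucleotides over the alphabet A, C, G, T and one-hot encoded into 16 channels, and a residual block is a single "same"-padded 1D convolution with ReLU plus an identity skip connection. The names `two_gram_one_hot`, `conv1d_same`, and `residual_block` are hypothetical and are not taken from the paper.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
# All 16 possible dinucleotides (two-grams), in a fixed order.
TWO_GRAMS = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]
TWO_GRAM_INDEX = {gram: i for i, gram in enumerate(TWO_GRAMS)}


def two_gram_one_hot(seq):
    """Encode a DNA string as a (len(seq) - 1) x 16 one-hot matrix
    of overlapping two-grams (assumed scheme, not the paper's)."""
    seq = seq.upper()
    rows = []
    for i in range(len(seq) - 1):
        row = [0.0] * len(TWO_GRAMS)
        idx = TWO_GRAM_INDEX.get(seq[i:i + 2])
        if idx is not None:  # two-grams with ambiguous bases (e.g. N) stay all-zero
            row[idx] = 1.0
        rows.append(row)
    return rows


def conv1d_same(x, kernels, bias):
    """1D convolution with zero padding so the output length equals the input length.

    x:       L x C_in       (positions x channels)
    kernels: C_out x K x C_in, with odd kernel width K
    bias:    C_out
    """
    length = len(x)
    width = len(kernels[0])
    pad = width // 2
    out = []
    for t in range(length):
        row = []
        for kernel, b in zip(kernels, bias):
            acc = b
            for j in range(width):
                pos = t + j - pad
                if 0 <= pos < length:  # positions outside the sequence count as zeros
                    acc += sum(w * v for w, v in zip(kernel[j], x[pos]))
            row.append(acc)
        out.append(row)
    return out


def residual_block(x, kernels, bias):
    """Compute ReLU(conv(x)) + x; requires C_out == C_in for the identity skip."""
    h = conv1d_same(x, kernels, bias)
    return [
        [max(0.0, hv) + xv for hv, xv in zip(h_row, x_row)]
        for h_row, x_row in zip(h, x)
    ]


# Example: encode a short window and pass it through one untrained block.
features = two_gram_one_hot("CAGGTAAGT")
zero_kernels = [[[0.0] * 16 for _ in range(3)] for _ in range(16)]
output = residual_block(features, zero_kernels, [0.0] * 16)
```

With all-zero weights the block reduces to the identity, which illustrates why residual connections make deep stacks easier to train: each block only has to learn a correction on top of its input. A real model would use a deep learning framework, trained weights, several stacked blocks, and a classification head, none of which are specified in the abstract.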

Keywords:
splice site prediction, Two-Gram features, ResidualConv1D, genomic contexts, accuracy

Dedicated paper page and full-text download: https://civilica.com/doc/1902222/