Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data
Article title: Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data
National article ID: JR_JMSI-11-1_005
Published in 1400 (Solar Hijri calendar)
Author details:
Hannane Mohammadi Nodeh - Department of Bioelectric and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences
Mohammad Amin Tabatabaiefar - Department of Medical Genetics, School of Medicine, Isfahan University of Medical Sciences- Department of Bioinformatics, Medical Image and Signal Processing Research Center, School of Advanced Technologies in Medicine, Isfahan University of Medical Scien
Mohammadreza Sehhati - Department of Bioelectric and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences
Abstract:
Background: Careful design in the primary steps of a next-generation sequencing study is critical
for obtaining successful results in downstream analysis. Methods: In this study, a framework is
proposed to evaluate and improve sequence mapping in targeted regions of the reference genome.
Simulated short reads were produced from the coding regions of the human genome and mapped to a
Customized Target-Based Reference (CTBR) using recently introduced alignment tools. Short reads
simulated for different sequencing technologies, with and without well-defined mutation types,
were aligned to the standard genome and to the CTBR, and the numbers of unmapped and misaligned
reads and the runtime were measured for comparison. Results: The mapping accuracy for reads
generated from the Illumina HiSeq 2500, aligned with Stampy against the CTBR, was significantly
better than that of the other evaluated pipelines. Using the CTBR for alignment significantly
decreased the mapping error in comparison with broader or more limited references. When
intentional mutations were introduced into the reads, Stampy showed the minimum error of 1.67%
using the CTBR. By comparison, the lowest errors with the whole genome and a single chromosome as
references, also obtained with Stampy, were 3.78% and 20%, respectively. The maximum and minimum
misalignment errors were observed on chromosomes Y and 20, respectively. Conclusion: Using the
proposed framework in a clinical targeted sequencing study may help predict the error and improve
the performance of variant calling for the genomic regions targeted in that study.
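The error measurement described in the abstract can be illustrated with a minimal sketch. It assumes that each simulated read carries its true origin and that a read counts as misaligned when it maps to the wrong chromosome or more than a few bases from its true position. The abstract does not give the exact error definition, so the `ReadResult` type, the `mapping_error` function, and the `tolerance` parameter are hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ReadResult:
    """One simulated read: where it came from and where the aligner put it."""
    true_chrom: str
    true_pos: int
    aligned_chrom: Optional[str]  # None when the aligner left the read unmapped
    aligned_pos: Optional[int]


def mapping_error(results: List[ReadResult], tolerance: int = 5) -> Dict[str, float]:
    """Return unmapped, misaligned, and combined error as percentages.

    Assumption: a mapped read is misaligned if the chromosome differs or the
    position is more than `tolerance` bases from the true position.
    """
    total = len(results)
    if total == 0:
        raise ValueError("no reads to evaluate")

    unmapped = 0
    misaligned = 0
    for r in results:
        if r.aligned_chrom is None or r.aligned_pos is None:
            unmapped += 1
        elif (r.aligned_chrom != r.true_chrom
              or abs(r.aligned_pos - r.true_pos) > tolerance):
            misaligned += 1

    return {
        "unmapped_pct": 100.0 * unmapped / total,
        "misaligned_pct": 100.0 * misaligned / total,
        "error_pct": 100.0 * (unmapped + misaligned) / total,
    }


# Toy data, not taken from the study.
example = [
    ReadResult("chr20", 1000, "chr20", 1000),  # correct
    ReadResult("chr20", 2000, "chr20", 2003),  # within tolerance, counted as correct
    ReadResult("chrY", 3000, None, None),      # unmapped
    ReadResult("chrY", 4000, "chrX", 4000),    # wrong chromosome
]
print(mapping_error(example))
```

Running the same function on the output of each aligner and reference combination would give one comparable error figure per pipeline, which is the kind of comparison the abstract reports.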
Keywords: Chromosomes, high-throughput nucleotide sequencing, sequence analysis
Article page and full-text download: https://civilica.com/doc/1700119/