Application of statistical methods in quantitative chemistry

K. Khorshidian

Published in: 8th Iranian Statistics Conference

Publication year: 1385 (Solar Hijri calendar)

Document type: Conference paper

Language: English

This paper is 14 pages long and is available in PDF format.

This paper is classified under the following subject areas:

Artificial intelligence > Genetic algorithm

Permanent link to this paper:

https://civilica.com/doc/85391

National scientific document ID:

ISC08_016

Indexing date: 26 Dey 1388

Abstract:

In this article we review quantitative structure-activity relationships (QSARs) as a sound approach to statistical data analysis in chemistry. We address the practical problems encountered when statistical methods are applied to real data from chemical research. We try to handle noisy problems by categorizing the statistical methods and building a well-defined sequence of procedures that can be applied as an algorithm, the QSAR. QSARs, one of the most important areas in chemometrics, provide information that is useful for drug design and medicinal chemistry [17]. The objective in constructing QSAR models is to find one or more molecular descriptors that represent variation in the structural properties of the molecules. The main problem in applying the usual statistical techniques to chemical data is that in almost all cases the sample size is too small relative to the number of variables (descriptors) in the model. The sample size is typically below one hundred, whereas the descriptors number in the hundreds and in many cases exceed a thousand, the reverse of what we would like. Multiple linear regression (MLR), principal component regression (PCR), partial least squares (PLS) regression and singular value decomposition are the most commonly used modeling techniques in QSARs, with different types of cross-validation (CV) and bootstrapping procedures applied to them iteratively in order to converge to the optimal model. Applying these techniques requires careful variable selection in order to build well-fitted models. In addition, the genetic algorithm (GA) is now a well-known and increasingly widely used variable selection method. A GA is a stochastic method for solving optimization problems defined by a fitness criterion; it applies Darwin's theory of evolution through genetic operators such as crossover and mutation.
A real study in drug chemistry using QSAR modeling is reviewed as an example, in which the sample size is 735 and the variable pool contains 1355 descriptors.
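
As an illustration of GA-based variable selection with a cross-validated fitness criterion, the following minimal Python sketch may help. It is not the procedure used in the paper: the function names `cv_rmse` and `ga_select`, the binary-mask encoding, tournament selection, uniform crossover, the parameter defaults, and the synthetic data are all assumptions made for this example, and the fitness is the k-fold cross-validated RMSE of an ordinary MLR fit.

```python
import numpy as np

def cv_rmse(X, y, n_folds=5):
    """K-fold cross-validated RMSE of a least-squares (MLR) fit with intercept."""
    n = len(y)
    folds = np.array_split(np.arange(n), n_folds)
    sq_err = 0.0
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        A_train = np.column_stack([np.ones(train.sum()), X[train]])
        A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        sq_err += np.sum((y[test_idx] - A_test @ coef) ** 2)
    return np.sqrt(sq_err / n)

def ga_select(X, y, pop_size=30, n_gen=30, init_frac=0.1, p_mut=0.02, seed=0):
    """Hypothetical GA: each chromosome is a boolean mask over the descriptors."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    # Start from sparse masks so that early models use few descriptors.
    pop = rng.random((pop_size, p)) < init_frac
    scores = np.array([cv_rmse(X[:, m], y) for m in pop])
    for _ in range(n_gen):
        children = []
        for _ in range(pop_size):
            # Tournament selection: the better of two random individuals, twice.
            a, b, c, d = rng.integers(pop_size, size=4)
            p1 = pop[a] if scores[a] < scores[b] else pop[b]
            p2 = pop[c] if scores[c] < scores[d] else pop[d]
            # Uniform crossover, then bit-flip mutation.
            child = np.where(rng.random(p) < 0.5, p1, p2)
            child ^= rng.random(p) < p_mut
            children.append(child)
        children = np.array(children)
        child_scores = np.array([cv_rmse(X[:, m], y) for m in children])
        # Elitism: keep the best pop_size individuals among parents and children.
        all_pop = np.vstack([pop, children])
        all_scores = np.concatenate([scores, child_scores])
        order = np.argsort(all_scores)[:pop_size]
        pop, scores = all_pop[order], all_scores[order]
    return pop[0], scores[0]

# Synthetic demonstration (made-up data): only descriptors 2 and 7 carry signal.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(40, 20))
y_demo = 3.0 * X_demo[:, 2] - 2.0 * X_demo[:, 7] + 0.1 * rng.normal(size=40)
best_mask, best_rmse = ga_select(X_demo, y_demo)
print("selected descriptors:", np.flatnonzero(best_mask))
print("cross-validated RMSE:", best_rmse)
```

Because the fitness is a cross-validated error rather than the training error, masks that switch on many irrelevant descriptors tend to score worse, which is one common way to counter the small-sample, many-descriptor setting described above.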

Keywords:

Principal Component Regression (PCR), Partial Least Squares (PLS), Factor Selection, Correlation Ranking, Cross Validation, Genetic Algorithm.

Authors

K. Khorshidian

Stat. Dept., Shiraz Univ., Shiraz, Iran.

References:

Malinowski ER. Factor Analysis in Chemistry. Wiley: New York, 2002; ...
Rius A, Callao MP, Ferré J, Rius FX. Assessing ...
Silva APD. Discarding variables in a principal component analysis: ...
[22] Wold S, Sjostrom M, ... basic tool of chemometrics. Chemometrics ...
Martens H. Reliable and relevant modeling of real world data: ...
Roy K, Ghosh G. QSTR with extended topochemical atom indices-2: ...
Sutter JM, Kalivas J. Which principal components to utilize for ...