Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data
Publish place: Iranian Journal of Health Sciences, Vol: 8, Issue: 2
Publish Year: 1399
Document type: Journal article
Language: English
National scientific document ID: JR_JHES-8-2_002
Indexing date: 8 Azar 1402
Abstract:
Background and purpose: With advances in science and technology, we increasingly deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with such data are estimating the coefficients and interpreting the fitted model. Classical methods are unreliable in high-dimensional problems because of the large number of predictor variables, and they are also affected by outliers and collinearity.
Methods: Many real-world data sets now have a high-dimensional structure. To handle this, we used the least absolute shrinkage and selection operator (LASSO). Because the semiparametric model is flexible and widely applicable to medical data, it can also be used to model genomic data. Motivated by these points, we developed an improved robust approach for analyzing gene expression and making predictions in a high-dimensional data set in the presence of outliers.
Results: Outliers are among the common problems in regression analysis. In regression, an outlier is a point that fails to follow the main linear pattern of the data. The ordinary least-squares estimator was found to be potentially sensitive to outliers, which motivated the investigation of robust estimators. Robust regression is among the most widely studied topics in the statistics community. In the present study, least trimmed squares (LTS) estimation was applied to overcome the outlier problem.
Conclusions: We proposed an optimization approach for semiparametric models to combat outliers in the data set. In particular, based on a LASSO penalization scheme, we formulated the semiparametric model as a nonlinear integer programming problem that can be solved effectively by any evolutionary algorithm. We also studied a real-world application related to riboflavin production. The results showed that the proposed method was reasonably efficient compared with the LTS method.
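As a rough illustration of the outlier sensitivity of ordinary least squares and of the LTS baseline mentioned in the abstract, the following minimal sketch compares the two on simulated low-dimensional data. It is not the authors' proposed method: it omits the LASSO penalty, the semiparametric differencing step, and the integer programming formulation. The function names `ols` and `lts`, the simulated data, and the trimming size `h` are all assumptions chosen for illustration, and the LTS fit is only approximated by random starts followed by concentration steps.

```python
import numpy as np


def ols(X, y):
    """Ordinary least squares via a least-squares solve."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def lts(X, y, h, n_starts=50, n_csteps=20, seed=0):
    """Approximate least trimmed squares (illustrative helper).

    Looks for coefficients that minimize the sum of the h smallest
    squared residuals, using random elemental starts refined by
    concentration steps (refit on the h best-fitting points).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_beta, best_obj = None, np.inf
    for _ in range(n_starts):
        # Start from an exact fit through p randomly chosen points.
        subset = rng.choice(n, size=p, replace=False)
        beta = ols(X[subset], y[subset])
        for _ in range(n_csteps):
            r2 = (y - X @ beta) ** 2
            keep = np.argsort(r2)[:h]       # h points with smallest residuals
            beta = ols(X[keep], y[keep])    # refit on those points only
        obj = np.sort((y - X @ beta) ** 2)[:h].sum()
        if obj < best_obj:
            best_beta, best_obj = beta, obj
    return best_beta


# Simulated data (assumption): true intercept 1.0, true slope 2.0.
rng = np.random.default_rng(42)
n = 100
x = rng.uniform(0, 10, n)
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, n)

# Contaminate 15% of the points: large x, but y near zero.
x[:15] = rng.uniform(9, 10, 15)
y[:15] = rng.normal(0, 0.5, 15)

X = np.column_stack([np.ones(n), x])
beta_ols = ols(X, y)
beta_lts = lts(X, y, h=75)

print("OLS intercept, slope:", beta_ols)
print("LTS intercept, slope:", beta_lts)
```

In this setup the contaminated points pull the OLS slope away from the true value, while the trimmed fit ignores the 25 worst-fitting observations and stays close to it. The choice `h=75` assumes at most a quarter of the data is contaminated; a smaller `h` tolerates more outliers at the cost of efficiency.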
Authors
Mahdi Roozbeh
Faculty of Mathematics, Statistics & Computer Science, Semnan University, Semnan, Iran
Monireh Maanavi
Faculty of Mathematics, Statistics and Computer Science, Semnan University, Semnan, Iran
Saman Babaie-Kafaki
Faculty of Mathematics, Statistics & Computer Science, Semnan University, Semnan, Iran