Regression modelling of count outcomes in presence of over-dispersion due to excess zeros over the follow-up time

Publication year: 1397 (Solar Hijri)
Document type: Conference paper
Language: English

The full text of this paper has not been provided and is not available.

National scientific document ID:

AMSMED19_027

Indexing date: 1 Dey 1397

Abstract:

Background and Objective: Count outcomes arise in several research fields, such as epidemiological studies, survival time studies and studies of healthcare services utilization. Poisson regression, within the generalized linear model framework, is commonly used to investigate the effect of covariates on this type of outcome. Count outcomes are, however, often subject to excess variability (over-dispersion), which in practice arises from unobserved heterogeneity, clustering of outcomes or excess zeros. Two types of zeros may be found in count data: structural zeros, which occur when the subject is not at risk of the event, and sampling zeros, which occur when the subject is at risk but no event is observed. When over-dispersion is due to excess zeros, zero hurdle or zero inflated models are better alternatives. Moreover, it should be noted that when a count outcome accumulates over a time period (such as the number of involved lymph nodes during the follow-up time in patients with breast cancer, or the number of attacks during a time period in multiple sclerosis patients), it is more relevant to model the rate of occurrence than the raw count.

Materials and Methods: We propose to model the effect of covariates on this type of observed count using a zero hurdle negative binomial model with follow-up time included as an offset term. Data from a retrospective study of patients with breast cancer were used to fit the zero hurdle negative binomial model. The outcome variable was the number of involved lymph nodes in the follow-up interval, and the main covariates were patient's age, tumor size (≤ 2 / 2-5 / > 5 cm), estrogen receptor status (positive / negative), progesterone receptor status (positive / negative), human epidermal growth factor receptor 2 status (positive / negative) and tumor grade (I / II / III). Statistical analysis was performed using R software.

Findings: Data were available for 165 patients with breast cancer. The zero hurdle negative binomial model predicted the number of involved lymph nodes with an Akaike information criterion (AIC) of 605.0507. The model indicated that tumor grade and tumor size were significantly associated with negative nodes.

Conclusion: The zero hurdle negative binomial model can be useful for modelling count outcomes in the presence of over-dispersion due to excess zeros in medical studies.
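The hurdle structure with a follow-up offset described under Materials and Methods can be illustrated with a minimal sketch. This is not the authors' code (the abstract states only that R was used). It is plain Python written under several assumptions: a log link for the count part, the logarithm of follow-up time entering as the offset in the count part only, and a logistic model for the probability of a non-zero count. All function names, covariate vectors and parameter values are hypothetical and chosen only for illustration.

```python
import math

def nb_pmf(y, mu, theta):
    """Negative binomial pmf with mean mu and dispersion parameter theta."""
    log_p = (math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
             + theta * math.log(theta / (theta + mu))
             + y * math.log(mu / (theta + mu)))
    return math.exp(log_p)

def count_mean(follow_up, x, beta):
    """Log link with a log(follow_up) offset: mu = follow_up * exp(x . beta)."""
    return follow_up * math.exp(sum(b * v for b, v in zip(beta, x)))

def hurdle_nb_pmf(y, p_zero, mu, theta):
    """Hurdle model: zeros have probability p_zero; positive counts
    follow a zero-truncated negative binomial distribution."""
    if y == 0:
        return p_zero
    truncated = nb_pmf(y, mu, theta) / (1.0 - nb_pmf(0, mu, theta))
    return (1.0 - p_zero) * truncated

def log_likelihood(rows, beta, gamma, theta):
    """rows: iterable of (count, follow_up, covariates) tuples."""
    total = 0.0
    for y, follow_up, x in rows:
        # Zero part (assumed logistic): probability of crossing the hurdle.
        p_pos = 1.0 / (1.0 + math.exp(-sum(g * v for g, v in zip(gamma, x))))
        # Count part: expected count scales with follow-up time via the offset.
        mu = count_mean(follow_up, x, beta)
        total += math.log(hurdle_nb_pmf(y, 1.0 - p_pos, mu, theta))
    return total

# Hypothetical data: (involved nodes, follow-up time, [intercept, covariate])
rows = [(0, 2.0, [1.0, 0.0]), (3, 1.5, [1.0, 1.0]), (0, 4.0, [1.0, 1.0])]
print(log_likelihood(rows, beta=[0.1, 0.5], gamma=[-0.2, 0.8], theta=1.5))
```

Because the offset has a fixed coefficient of one on the log scale, exp(x . beta) in this sketch is read as an event rate per unit of follow-up time, which is the quantity the abstract argues is more relevant than the raw count. In practice the parameters would be estimated by maximizing such a log-likelihood with statistical software rather than supplied by hand.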

Keywords:

Over-dispersion, Excess zeros, Zero hurdle negative binomial, Lymph nodes

Authors

Elham Maraghi

Assistant professor, Department of Biostatistics and Epidemiology, Faculty of Public Health, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran

Mina Jahangiri

M.Sc. in Biostatistics, Department of Biostatistics and Epidemiology, Faculty of Public Health, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran

Amal Saki Malehi

M.Sc. student in Biostatistics, Department of Biostatistics and Epidemiology, Faculty of Public Health, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran

Soraya Moradi

Student Research Committee, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran