Imbalanced Learning Techniques for Land Subsidence Prediction: Ensemble Methods and Data Balancing Strategies

Publication year: 1404
Document type: Conference paper
Language: English

This paper is 15 pages long and is available for download in PDF format.

National scientific document ID: EMGBC09_056

Indexing date: 1 Azar 1404

Abstract:

Land subsidence poses significant threats to infrastructure, the environment, and public safety, making accurate prediction essential for disaster risk reduction and sustainable resource management. This study addresses the critical challenge of class imbalance in land subsidence prediction datasets, where subsidence events are rare compared to stable ground conditions, leading to biased models that poorly detect actual subsidence occurrences. We propose and evaluate several imbalanced learning approaches, including random under-sampling, cost-sensitive algorithms, and ensemble methods (bagging and boosting), for predicting land subsidence in Chaharmahal and Bakhtiari province, Iran. The study uses a dataset of 516 subsidence locations identified through InSAR analysis, along with 13 conditioning factors covering geological, hydrological, environmental, and anthropogenic variables. The imbalanced learning techniques are systematically compared using precision, recall, F1-score, and ROC-AUC. Results demonstrate that random under-sampling followed by Random Forest achieves the most balanced performance, with precision, recall, and F1-score all reaching 94% and a ROC-AUC of 98.4%. Although the bagging method applied directly to the imbalanced data achieves high recall (96%) and ROC-AUC (99%), it suffers from lower precision due to false positives. The fine-tuned models are used to generate land subsidence susceptibility maps for the entire study area, revealing that the eastern and southeastern regions exhibit the highest susceptibility. Risk analysis shows that random under-sampling is a more conservative method, producing the most balanced risk distribution, with 9.1% and 8.1% of the area classified as high and very high risk, respectively. The findings highlight the critical importance of addressing class imbalance for achieving reliable subsidence prediction. This research provides valuable insights for improving early warning systems and supporting informed decision-making for land subsidence risk management.
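
The abstract does not describe the authors' implementation, so the following is only a minimal sketch of two ingredients it names: random under-sampling of the majority class, and the precision, recall, and F1-score used for comparison. It uses only the Python standard library. The function names `random_under_sample` and `precision_recall_f1`, the toy data, and the seed are all assumptions made for illustration. The classifier step (Random Forest in the paper) is omitted, and the predictions below are invented rather than taken from the study.

```python
import random
from collections import Counter


def random_under_sample(X, y, seed=0):
    """Randomly discard samples so every class is as small as the rarest one."""
    rng = random.Random(seed)
    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(y):
        by_class.setdefault(label, []).append(idx)
    # Size of the smallest (minority) class.
    n_min = min(len(indices) for indices in by_class.values())
    kept = []
    for indices in by_class.values():
        # Sampling without replacement keeps exactly n_min items per class.
        kept.extend(rng.sample(indices, n_min))
    kept.sort()  # preserve the original sample order
    return [X[i] for i in kept], [y[i] for i in kept]


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data: 10 "subsidence" samples (label 1) and 90 "stable" ones (label 0).
X = [[float(i)] for i in range(100)]
y = [1 if i < 10 else 0 for i in range(100)]

X_bal, y_bal = random_under_sample(X, y, seed=42)
class_counts = Counter(y_bal)  # both classes now hold 10 samples each

# Hypothetical predictions, used only to exercise the metric function.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

In practice, under-sampling would be applied to the training split only, so that the test set keeps the original class distribution and the reported metrics reflect real-world imbalance.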

Authors

Khayyam Salehi

Department of Computer Science, Faculty of Mathematical Sciences, Shahrekord University, Iran

Maryam Karimi

Department of Computer Science, Faculty of Mathematical Sciences, Shahrekord University, Iran

Khosro Keyani

Department of Civil Engineering, Shahrekord Branch, Islamic Azad University, Shahrekord, Iran