Task failure prediction in cloud computing systems
Publication Year: 1404 (Solar Hijri calendar)
Document type: Conference paper
Language: English
This paper is 9 pages long and is available for download in PDF format.
National scientific document ID:
ICIRT01_021
Indexing date: 9 Azar 1404
Abstract:
As cloud data centers grow in scale and complexity, ensuring high service reliability and minimizing failures have become critical challenges. Despite technological advances, failures due to hardware and software issues persist, disrupting tasks, wasting resources, and impacting service reliability. Accurately predicting task or job failures before they occur is essential to reducing downtime and unnecessary resource usage. Traditional fault-tolerance methods like checkpointing and replication are insufficient for the complexity of modern systems. Consequently, machine learning and deep learning techniques have been adopted to analyze system logs and predict failures more accurately. Federated learning further enhances this by enabling decentralized data analysis across nodes, preserving privacy while improving prediction accuracy through collaborative learning. In this paper, we propose a fault prediction mechanism based on federated learning and a deep neural network to identify patterns leading to task failures. Our model achieved a high prediction accuracy of 95.3%, making it a robust solution for failure prediction in cloud computing environments.
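To make the federated idea in the abstract concrete, the following is a minimal sketch of federated averaging, not the authors' implementation. It assumes a plain logistic-regression model as a stand-in for the deep neural network, and synthetic per-node "logs" with two hypothetical features (CPU load and memory load); the function names `local_update`, `fed_avg`, and `make_node_data` are illustrative only. Each node trains on its own data, and only the weight vectors are sent to the server for averaging.

```python
import math
import random

def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few epochs of SGD for logistic regression on one node's local data."""
    w = list(weights)
    for _ in range(epochs):
        for features, label in data:
            z = sum(wi * xi for wi, xi in zip(w, features))
            pred = 1.0 / (1.0 + math.exp(-z))
            # Per-sample log-loss gradient is (pred - label) * features.
            w = [wi - lr * (pred - label) * xi for wi, xi in zip(w, features)]
    return w

def fed_avg(client_weights, client_sizes):
    """Average client weight vectors, weighted by each client's sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

def make_node_data(rng, n):
    """Hypothetical synthetic records: ([bias, cpu_load, mem_load], failed)."""
    rows = []
    for _ in range(n):
        cpu, mem = rng.random(), rng.random()
        label = 1 if cpu + mem > 1.0 else 0  # assumed failure rule, for illustration
        rows.append(([1.0, cpu, mem], label))
    return rows

rng = random.Random(0)
nodes = [make_node_data(rng, n) for n in (40, 60, 100)]  # three nodes, raw data stays local

global_w = [0.0, 0.0, 0.0]
for _ in range(20):  # communication rounds
    updates = [local_update(global_w, data) for data in nodes]
    global_w = fed_avg(updates, [len(d) for d in nodes])
```

In this sketch the server never sees a node's raw records, only its updated weights, which is the privacy property the abstract attributes to federated learning. Weighting by sample count means larger nodes contribute proportionally more to the shared model.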
Keywords:
Authors
Milad Mahdudi
School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, IRAN
Pooya Jamshidi
School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, IRAN
Shahpour Rahmani
School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, IRAN
Nasser Yazdani
School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, IRAN