A Review on Fault Tolerance Techniques for High Performance Computing

Ahmad Fadaei Tehrani; Framarz Safi

Published in: National Conference on Computer Engineering and Information Technology Management

Publication year: 1393 (Solar Hijri)

Document type: Conference paper

Language: English

Length: 7 pages (PDF)

Permanent link to this paper:

https://civilica.com/doc/282626

National scientific document ID:

CSITM01_085

Indexing date: 10 Shahrivar 1393

Abstract:

Cloud computing is the next generation of computing. Running computationally intensive applications on large numbers of virtual machines brings new capacity and flexibility to HPC (High Performance Computing) applications. Today's high performance computing systems are typically managed and operated privately by individual organizations. A cloud-based Infrastructure-as-a-Service (IaaS) approach to high performance computing applications promises cost savings and more flexibility. HPC systems may fail because of large workloads and the large number of servers involved. Fault tolerance techniques allow HPC systems in the cloud to execute computationally intensive applications across multiple nodes. Fault tolerance can provide the best task performance in the presence of hardware and software faults; however, most failures are hardware based. System availability is also very important, and fault tolerance techniques are used to detect and predict faults. This paper gives an overview of the most popular fault tolerance techniques in HPC, together with the prediction models and tools used in HPC.

Keywords:

High Performance Computing, Reactive Fault Tolerance, Proactive Fault Tolerance, Prediction Models, Artificial Intelligent Computing, Time Series Models

Authors

Ahmad Fadaei Tehrani

Dept. of Computer, Najafabad Branch, Islamic Azad University of Najafabad

Framarz Safi

Dept. of Computer, Najafabad Branch, Islamic Azad University of Najafabad
