Optimal Preventive Maintenance Policy for Non-Identical Components: Traditional Renewal Theory vs Modern Reinforcement Learning

Shaghayegh Eidi; Abdollah Safari; Firoozeh Haghighi

Article title: Optimal Preventive Maintenance Policy for Non-Identical Components: Traditional Renewal Theory vs Modern Reinforcement Learning
National article ID: JR_IJRRS-6-1_009
Published in 1402 (Solar Hijri calendar)

Author details:

Shaghayegh Eidi - School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
Abdollah Safari - School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
Firoozeh Haghighi - School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran

Abstract:

This paper compares the traditional approach with reinforcement learning algorithms for finding the optimal preventive maintenance policy for equipment composed of multiple non-identical components with different time-to-failure distributions. As an application, we used data from military trucks, which consist of multiple components with very different failure behavior, such as tires, transmissions, wheel rims, couplings, motors, brakes, steering wheels, and shifting gears. The literature proposes four different strategies for preventive maintenance of these components. To find the optimal preventive maintenance policy, we used the traditional (renewal theory-based) approach and conventional reinforcement learning algorithms and compared their performance. The main advantage of the latter is that, unlike the traditional approach, they do not require estimating the model parameters (e.g., transition probabilities), and they converge to the optimal solution without any explicit mathematical formula. Our results showed that the traditional approach works best when the component time-to-failure distributions are available. However, the reinforcement learning approach performs better when no such information is available or the distributions are misspecified.
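
To illustrate the claim that reinforcement learning does not need estimated transition probabilities, the following is a minimal sketch of tabular Q-learning on a toy single-component replacement problem. It is not the authors' implementation: the failure probabilities, costs, discount factor, and all function names are hypothetical, and the paper's multi-component setting is reduced to one component whose state is its age. The learner only samples costs and next states from the simulator and never reads the failure probabilities directly.

```python
import random

# Hypothetical toy numbers, not taken from the paper's truck data.
FAIL_PROB = (0.0, 0.1, 0.2, 0.4, 0.7)  # per-step failure probability at ages 0..4
COST_FAILURE = 10.0     # assumed cost of a corrective replacement after failure
COST_PREVENTIVE = 2.0   # assumed cost of a planned preventive replacement
KEEP, REPLACE = 0, 1


def step(age, action, rng):
    """Simulator returning (cost, next_age). The learner never reads FAIL_PROB."""
    if action == REPLACE:
        return COST_PREVENTIVE, 0
    if rng.random() < FAIL_PROB[age]:
        return COST_FAILURE, 0
    # Component survives one more step; age is capped at the last index.
    return 0.0, min(age + 1, len(FAIL_PROB) - 1)


def q_learning(n_steps=200_000, alpha=0.02, gamma=0.9, seed=0):
    """Tabular Q-learning that minimizes expected discounted cost."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in FAIL_PROB]
    age = 0
    for _ in range(n_steps):
        # Uniform random exploration; Q-learning is off-policy, so the
        # greedy policy can still be read from the table afterwards.
        action = rng.randrange(2)
        cost, next_age = step(age, action, rng)
        target = cost + gamma * min(q[next_age])
        q[age][action] += alpha * (target - q[age][action])
        age = next_age
    return q


q_table = q_learning()
# Greedy policy: pick the action with the lower estimated cost at each age.
policy = [KEEP if row[KEEP] <= row[REPLACE] else REPLACE for row in q_table]
print(policy)
```

Under these assumed numbers, the learned policy keeps a new component and replaces the oldest one; where the switch happens in between depends on the chosen costs and failure probabilities. A renewal theory-based approach would instead compute the policy from the known time-to-failure distribution, which is the information this sketch deliberately withholds from the learner.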

Keywords:

Opportunistic maintenance, preventive maintenance, Markov decision process, Monte Carlo, Q-learning, reinforcement learning

Article page and full-text download: https://civilica.com/doc/1832048/