Investigating the Hyperparameters of Reinforcement Learning Effects on Orbit Raising Maneuver

Hamed Soleymani; Majid Bakhtiari; Kamran Daneshjou

Published in: The 22nd International Conference of Iranian Aerospace Society

Publication year: 1402 (Solar Hijri)

Document type: Conference paper

Language: English

This paper is 7 pages long and is available for download in PDF format.


Permanent link to this paper:

https://civilica.com/doc/2058606

National scientific document ID:

AEROSPACE22_075

Indexing date: 31 Mordad 1403 (21 August 2024)

Abstract:

In recent years, significant advancements in the field of artificial intelligence have prompted space research, particularly in orbital missions, to increasingly embrace these methods, with a specific focus on machine learning. In this research, the dynamics of circular in-plane low-thrust orbit transfer, based on the equinoctial differential equations, serve as the environment with which the agent interacts. The problem variables, which are the six equinoctial orbital elements of a spacecraft, form a continuous space, and a model-free Actor-Critic algorithm is implemented. The action space, which is defined as a thrust vector, is applied to the environment under a policy, and the agent is trained by the Actor-Critic algorithm to perform the LEO to GEO low-thrust transfer. The effects of hyperparameters such as the discount factor, the learning rate, and the number of nodes in the actor and critic networks are investigated in this scenario. It is shown that increasing the discount factor and the learning rate helps the trained agent operate accurately in the environment of the orbital transfer problem. An increase in the number of nodes in the neural network causes an increase in the learning time of the agent. As the discount factor is increased toward 1, the agent performs further searches in the environment to find other possible optimal policies. After two training processes, the trained agent can be used in different cases with dynamics similar to the main problem, and there is no need to adjust or re-simulate the parameters and dynamics of the problem.
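As a rough illustration of why a discount factor close to 1 makes an agent weigh distant outcomes more heavily, the sketch below computes the standard discounted return and the commonly quoted effective horizon 1/(1 - gamma). It is not taken from the paper: the function names and the sparse reward sequence are hypothetical and chosen only for illustration.

```python
def discounted_return(rewards, gamma):
    """Return sum_t gamma**t * rewards[t], accumulated from the last step backwards."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def effective_horizon(gamma):
    """Approximate number of future steps that matter for a given gamma (requires gamma < 1)."""
    return 1.0 / (1.0 - gamma)


if __name__ == "__main__":
    # Hypothetical episode: no reward until a terminal reward of 1 at step 100.
    rewards = [0.0] * 99 + [1.0]
    for gamma in (0.9, 0.99, 0.999):
        print(gamma, discounted_return(rewards, gamma), effective_horizon(gamma))
```

With a small gamma the late terminal reward contributes almost nothing to the return seen at the start of the episode, while a gamma near 1 preserves most of it, which is consistent with the abstract's observation that the agent searches further as the discount factor approaches 1.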

Keywords:

Low-Thrust – Equinoctial orbital elements – Reinforcement Learning – Actor-Critic networks – Agent

Authors

Hamed Soleymani

Ph.D. student, Iran University of Science and Technology, School of New Technologies

Majid Bakhtiari

Assistant Professor, Iran University of Science and Technology, School of New Technologies

Kamran Daneshjou

Professor, Iran University of Science and Technology, School of Mechanical Engineering