Employing Chaos Theory for Exploration-Exploitation Balance in Reinforcement Learning

Publication year: 1404 (Solar Hijri)
Document type: Journal article
Language: English

This paper is 14 pages long and is available for download in PDF format.

National scientific document ID:

JR_JADM-13-2_003

Indexing date: 12 Shahrivar 1404

Abstract:

The exploration-exploitation trade-off is a central challenge in reinforcement learning. For this reason, action selection methods such as ε-greedy and softmax are used instead of the purely greedy method. These methods rely on random numbers to select actions in a way that balances exploration and exploitation. Chaos is widely used across scientific disciplines because of its properties, including non-periodicity, unpredictability, ergodicity, and pseudorandom behavior. In this paper, we use numbers generated by different chaotic systems to select actions, and we identify which chaotic maps perform better across different state spaces and numbers of actions. In experiments on several environments, including the Multi-Armed Bandit (MAB), taxi-domain, and cliff-walking, many of the chaotic methods increased learning speed and achieved higher rewards.
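
As a rough illustration of the idea described in the abstract, the sketch below swaps the random number source in ε-greedy action selection for a chaotic sequence on a small multi-armed bandit. It is not the authors' implementation: the choice of the logistic map, its parameters (`x0`, `r`), the arm means, and all function names are assumptions made here for illustration only.

```python
import random


def logistic_map(x0=0.7, r=4.0):
    """Yield an endless sequence in [0, 1] from the logistic map x <- r*x*(1-x).

    The map and its parameters are illustrative assumptions, not taken from the paper.
    """
    x = x0
    while True:
        x = r * x * (1.0 - x)
        yield x


def epsilon_greedy(q_values, epsilon, draw):
    """Choose an action index; draw() supplies numbers in [0, 1]."""
    n = len(q_values)
    if draw() < epsilon:
        # Explore: map a second draw onto an action index
        # (clamped in case draw() returns exactly 1.0).
        return min(int(draw() * n), n - 1)
    # Exploit: first action with the highest estimated value.
    return max(range(n), key=lambda a: q_values[a])


def run_bandit(draw, arm_means, steps=2000, epsilon=0.1, reward_seed=0):
    """Run sample-average value learning on a Gaussian bandit (hypothetical setup)."""
    reward_rng = random.Random(reward_seed)  # rewards stay ordinary pseudorandom
    n = len(arm_means)
    q = [0.0] * n
    counts = [0] * n
    total = 0.0
    for _ in range(steps):
        a = epsilon_greedy(q, epsilon, draw)
        reward = reward_rng.gauss(arm_means[a], 1.0)
        counts[a] += 1
        q[a] += (reward - q[a]) / counts[a]  # incremental mean update
        total += reward
    return total / steps, counts


if __name__ == "__main__":
    arm_means = [0.2, 0.5, 0.8]
    chaotic = logistic_map()
    uniform = random.Random(1)
    print("chaotic:", run_bandit(lambda: next(chaotic), arm_means))
    print("uniform:", run_bandit(uniform.random, arm_means))
```

One design point to keep in mind: with `r = 4` the logistic map is not uniformly distributed on [0, 1] (its values cluster near 0 and 1), so a nominal `epsilon` of 0.1 explores more often than a uniform generator would. Any comparison between chaotic and uniform sources in a sketch like this should account for that difference.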

Authors

Habib Khodadadi

Department of Computer Engineering, Minab Branch, Islamic Azad University, Minab, Iran.

Vali Derhami

Computer Engineering Department, Yazd University, Yazd, Iran.
