A reinforcement learning approach to coordinate exploration with limited communication in continuous action games

Authors: Abdel Rodríguez, Peter Vrancx, Ricardo Grau, Ann Nowé
Dates: 2026-03-22; 2026-03-22
Year: 2016
DOI: 10.1017/s026988891500020x
URL: https://doi.org/10.1017/s026988891500020x
Handle: https://andeanlibrary.org/handle/123456789/52210
Citations: 5

Abstract

Learning automata are reinforcement learners belonging to the class of policy iterators. They have already been shown to exhibit nice convergence properties in a wide range of discrete action game settings. Recently, a new formulation for a continuous action reinforcement learning automata (CARLA) was proposed. In this paper, we study the behavior of these CARLA in continuous action games and propose a novel method for coordinated exploration of the joint-action space. Our method allows a team of independent learners, using CARLA, to find the optimal joint action in common interest settings. We first show that independent agents using CARLA will converge to a local optimum of the continuous action game. We then introduce a method for coordinated exploration which allows the team of agents to find the global optimum of the game. We validate our approach in a number of experiments.

Language: en
Subjects: Reinforcement learning; Computer science; Action (physics); Action selection; Convergence (economics); Artificial intelligence; Learning automata; Automaton; Class (philosophy); Reinforcement
Type: article