Browsing by Author "Alvaro Prado"
Now showing 1 - 3 of 3
Item: A Double Deep Q-Learning Network-Based Path Planning Approach for Autonomous Mobile Robots in Mining Environments (2024)
Authors: Carmen Morales; Kevin Marlon Soza Mamani; Alvaro Prado

Motion planning is one of the main cornerstones of autonomous mobile robotics, where obstacle avoidance and path-planning efficiency are quintessential for successful maneuvering applications. However, real-time implementation of path planning is limited by adaptive scenarios, high-dimensional maps, and time constraints. This paper proposes a Double Deep Q-Network (DDQN) approach for path planning and obstacle avoidance of skid-steer mobile robots, chosen for its ability to explore an extended navigation workspace and to reduce the overestimation bias produced by sparse rewards. The proposed DDQN approach was compared to Q-learning (QL) and Deep Q-Network (DQN) algorithms to examine path-planning performance in changing simulation environments intended to resemble those found in mining. Results from several exploration trials show that DDQN shortens the path length and significantly outperforms QL and DQN in path-following time, reducing it by about 26% and 17%, respectively. Ongoing research is expected to have an impact on the energy resources of the robot in mining scenarios.

Item: Deep Reinforcement Learning Via Nonlinear Model Predictive Control for Thermal Process with Variable Longtime Delay (2024)
Authors: Kevin Marlon Soza Mamani; Óscar Camacho; Alvaro Prado

The main concern in thermal process control revolves around uncertainties and disturbances arising from external processes, unmodeled dynamics, or simplified model characteristics, to name a few. For instance, a primary source of uncertainty involves disturbances and long time delays, which typically lead to a loss of robust control performance.
This paper develops a robust control technique based on Reinforcement Learning (RL) via the Deep Deterministic Policy Gradient (DDPG), integrated with Nonlinear Model Predictive Control (NMPC). The NMPC works as a policy generator, while the DDPG strategy evaluates the learning process. While NMPC alone approached the desired tracking performance, the combined scheme with DDPG provided further robustness in adapting to changing thermal process conditions such as external disturbances and variations in internal model parameters. Indeed, the combined NMPC-based DDPG strategy made the offline design of a terminal cost and constraints, typically required in traditional robustified NMPC strategies, unnecessary. The RL agent was trained, tested, and validated in a simulation environment using a thermal process with a long time delay. Results demonstrate that the proposed NMPC-based DDPG technique achieved tracking performance comparable to traditional NMPC strategies while maintaining the control objectives. However, the proposed strategy exhibited enhanced adaptivity compared to NMPC in the presence of disturbances and model parameter variations. These findings are expected to have an impact on the energy resources of real thermal processes in industry.

Item: Integrating Model Predictive Control with Deep Reinforcement Learning for Robust Control of Thermal Processes with Long Time Delays (Multidisciplinary Digital Publishing Institute, 2025)
Authors: Kevin Marlon Soza Mamani; Alvaro Prado

Thermal processes with prolonged and variable delays pose considerable difficulties due to unpredictable system dynamics and external disturbances, often resulting in diminished control effectiveness.
This work presents a hybrid control strategy that combines deep reinforcement learning (DRL) with nonlinear model predictive control (NMPC) to improve the robust control performance of a thermal process with a long time delay. In this approach, NMPC cost functions are formulated as learning functions to achieve the control objectives of thermal tracking and disturbance rejection, while an actor–critic (AC) reinforcement learning agent dynamically adjusts the control actions through an adaptive policy based on exploration and exploitation of real-time data from the thermal process. Unlike conventional NMPC approaches, the proposed framework removes the need for predefined terminal-cost tuning and strict constraint formulations at runtime, which are typically required to ensure robust stability. To assess performance, a comparative study evaluated NMPC against AC-based controllers built on policy-gradient algorithms such as the deep deterministic policy gradient (DDPG) and the twin delayed deep deterministic policy gradient (TD3). The proposed method was experimentally validated on a temperature control laboratory (TCLab) testbed featuring long and varying delays. Results demonstrate that while the NMPC–AC hybrid approach maintains tracking performance comparable to NMPC, it gains adaptability during tracking and further strengthens robustness against uncertainties and disturbances under dynamic system conditions. These findings highlight the benefits of integrating DRL with NMPC to enhance reliability in thermal process control and optimize resource efficiency in thermal applications.
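The first and third items above both hinge on curbing value overestimation: Double DQN decouples action selection from action evaluation, and TD3 takes the minimum of twin target critics. A minimal sketch of the two target computations (illustrative only, not the authors' code; the function and argument names are invented, and the value functions stand in for trained networks):

```python
import numpy as np

def ddqn_target(q_online, q_target, next_state, reward, gamma, done):
    """Double DQN target: the online network selects the greedy action,
    the target network evaluates it, reducing overestimation bias."""
    a_star = int(np.argmax(q_online(next_state)))   # selection: online net
    q_eval = float(q_target(next_state)[a_star])    # evaluation: target net
    return reward + (0.0 if done else gamma * q_eval)

def td3_target(q1_target, q2_target, next_action, reward, gamma, done):
    """TD3 target: twin target critics evaluate the next action and the
    minimum of the two estimates is used, again curbing overestimation."""
    q_min = min(float(q1_target(next_action)), float(q2_target(next_action)))
    return reward + (0.0 if done else gamma * q_min)
```

In the papers, `q_online`/`q_target` would be the DQN-style value networks and `next_action` would come from TD3's target actor with clipped exploration noise; here plain callables keep the sketch self-contained.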
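The two thermal-control items describe NMPC acting as a policy generator: at each step, minimize a predicted tracking-plus-effort cost over a horizon and apply only the first input. A toy receding-horizon sketch, assuming an invented first-order thermal model and a crude grid search in place of a real nonlinear solver (the model parameters, delay handling, and search are all simplifications, not the papers' formulation):

```python
def nmpc_policy(T, T_ref, a=0.9, b=0.1, horizon=10, r=0.001, u_grid=None):
    """Receding-horizon sketch: pick the constant heater input u that
    minimizes the predicted cost for the toy model T[k+1] = a*T[k] + b*u
    (time delay omitted for brevity)."""
    if u_grid is None:
        u_grid = [0.5 * i for i in range(161)]  # candidate inputs in [0, 80]
    best_u, best_cost = 0.0, float("inf")
    for u in u_grid:
        Tk, cost = T, 0.0
        for _ in range(horizon):
            Tk = a * Tk + b * u                      # one-step prediction
            cost += (Tk - T_ref) ** 2 + r * u ** 2   # tracking + effort
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u  # apply this first input, then re-plan at the next step
```

In the hybrid schemes above, the RL agent would adapt this policy online (e.g., by reshaping the cost or correcting the action), which is what removes the offline terminal-cost and constraint design the abstracts mention.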