Browsing by Author "Kevin Marlon Soza Mamani"
Now showing 1 - 7 of 7
Item: A Double Deep Q-Learning Network-Based Path Planning Approach for Autonomous Mobile Robots in Mining Environments (2024)
Authors: Carmen Morales; Kevin Marlon Soza Mamani; Alvaro Prado
Motion planning is one of the main cornerstones of autonomous mobile robotics, where obstacle avoidance and path-planning efficiency are quintessential for successful maneuverability applications. However, real-time implementation of path planning is limited by adaptive scenarios, high-dimensional maps, and time constraints. This paper proposes a Double Deep Q-Network (DDQN) approach for path planning and obstacle avoidance of skid-steer mobile robots, owing to its ability to explore an extended navigation workspace and to reduce the overestimation bias produced by sparse rewards. The proposed DDQN approach was compared to Q-learning (QL) and Deep Q-Network (DQN) algorithms to examine path-planning performance under changing simulation environments intended to resemble those found in mining. Results from several exploration trials show that DDQN shortens the path length and significantly outperforms QL and DQN in path-following time, reducing it by about 26% and 17%, respectively. Ongoing research is expected to have an impact on the energy resources of the robot in mining scenarios.

Item: Deep Reinforcement Learning Via Nonlinear Model Predictive Control for Thermal Process with Variable Longtime Delay (2024)
Authors: Kevin Marlon Soza Mamani; Óscar Camacho; Alvaro Prado
The main concern in thermal process control revolves around uncertainties and disturbances arising from external processes, unmodeled dynamics, or simplified characteristics, to name a few. A primary source of uncertainty involves disturbances and long time delays, which typically lead to a loss of robust control performance.
This paper develops a robust control technique based on Reinforcement Learning (RL) via the Deep Deterministic Policy Gradient (DDPG), integrated with Nonlinear Model Predictive Control (NMPC). The NMPC works as a policy generator, while the DDPG strategy evaluates the learning process. While NMPC alone approached the desired tracking performance, the combined scheme with DDPG achieved further robustness in adapting to changing thermal process conditions such as external disturbances and variations in internal model parameters. Indeed, combining the strategies (NMPC-based DDPG) rendered unnecessary the offline design of a terminal cost and constraints typically required in traditional robustified NMPC strategies. The RL agent was trained, tested, and validated in a simulation environment using a thermal process with a long time delay. Results demonstrated that the proposed NMPC-based DDPG technique achieved tracking performance nearly identical to traditional NMPC strategies while maintaining the control objectives. Moreover, the proposed control strategy exhibited enhanced adaptability relative to NMPC in the presence of disturbances and model parameter variations. These findings are expected to have an impact on the energy resources of real thermal processes in industry.

Item: Flocking Model for Self-Organized Swarms (2019)
Authors: Kevin Marlon Soza Mamani; Fabio Richard Diaz Palacios
Self-organized swarm control algorithms build on two basic behaviors: aggregation and flocking. The present work focuses on coordinated movement behavior, defined as the ability of a group of individuals (usually hundreds or thousands) to move and maneuver in a coordinated manner as if they were a single structure. Such behavior draws on studies of trajectory control and aggregation behavior, both of which are key to developing a coordinated movement control algorithm.
Therefore, control of the system starts from the combination of these studies. The control follows a leader-robot model, in which any unit can be designated as the leader according to the assigned task. All robots except the leader keep the group cohesive while maintaining a safe distance. Simulations of the system were developed first for three units and later for twelve, observing the cohesion and uniformity of the swarm in motion.

Item: Integrating Model Predictive Control with Deep Reinforcement Learning for Robust Control of Thermal Processes with Long Time Delays (Multidisciplinary Digital Publishing Institute, 2025)
Authors: Kevin Marlon Soza Mamani; Alvaro Prado
Thermal processes with prolonged and variable delays pose considerable difficulties due to unpredictable system dynamics and external disturbances, often resulting in diminished control effectiveness. This work presents a hybrid control strategy that synthesizes deep reinforcement learning (DRL) with nonlinear model predictive control (NMPC) to improve the robust control performance of a thermal process with a long time delay. In this approach, NMPC cost functions are formulated as learning functions to achieve control objectives in terms of thermal tracking and disturbance rejection, while an actor–critic (AC) reinforcement learning agent dynamically adjusts control actions through an adaptive policy based on the exploration and exploitation of real-time data about the thermal process. Unlike conventional NMPC approaches, the proposed framework removes the need for predefined terminal-cost tuning and strict constraint formulations during control execution at runtime, which are typically required to ensure robust stability. To assess performance, a comparative study was conducted evaluating NMPC against AC-based controllers built upon policy-gradient algorithms such as the deep deterministic policy gradient (DDPG) and the twin delayed deep deterministic policy gradient (TD3).
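The actor–critic machinery behind DDPG and TD3 can be illustrated with a minimal sketch of TD3's clipped double-Q target, the device it uses to curb critic overestimation; the function and variable names here are illustrative, not taken from the paper:

```python
def td3_target(reward, next_q1, next_q2, gamma=0.99, done=False):
    """TD3-style clipped double-Q bootstrap target.

    Uses the smaller of two critic estimates of the next state-action
    value, which counteracts the overestimation bias a single critic
    tends to accumulate.
    """
    next_q = min(next_q1, next_q2)  # pessimistic of the two critics
    return reward + (0.0 if done else gamma * next_q)
```

In the full algorithm this target trains both critics, while the actor is updated less frequently against one of them (the "delayed" part of TD3).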
The proposed method was experimentally validated using a temperature control laboratory (TCLab) testbed featuring long and varying delays. Results demonstrate that while the NMPC–AC hybrid approach maintains tracking performance comparable to NMPC, the proposed technique gains adaptability during tracking and further strengthens robustness in the presence of uncertainties and disturbances under dynamic system conditions. These findings highlight the benefits of integrating DRL with NMPC to enhance reliability in thermal process control and optimize resource efficiency in thermal applications.

Item: Low-Computational-Load Real-time Path Planning and Trajectory Control based on Artificial Potential Fields (2023)
Authors: Kevin Marlon Soza Mamani; Marcelo Saavedra Alcoba
This project presents the theoretical development, design, and subsequent implementation of a low-computational-load path-planning system and the corresponding motion control for differential-type robots. The model uses Artificial Potential Fields for both control and planning. The theoretical development is based on combining the Local Minimum Method with a kinematic motion-control model. Model simulations are performed in a Python programming environment (for generating the optimal path based on image processing) and in Matlab (for the initial trajectory-tracking simulation). The implementation is carried out on the GLADIUS ME32A robot in a controlled environment, supported by peripheral artificial vision and Wi-Fi communication.

Item: MIMC-VADOC Model for Autonomous Multi-robot Formation Control Applied to Differential Robots (2022)
Authors: Kevin Marlon Soza Mamani; Jhon Ordoñez
The present work describes the development of a decentralized formation-control model for mobile differential-type robots.
The study focuses mainly on control design, subsequent simulation, and implementation of the control systems. The control theory starts from two different robot motion models: the first relates to trajectory and position control, while the second covers multi-robot control systems, especially those associated with robotic swarms and potential fields. A list of formation-control requirements is then proposed, and on this basis the potential-field-based multi-robot formation-control model is developed. After simulations, the global model's feedback and communication systems are implemented on real differential mobile robots. Finally, the complete control system is tested and compared with other models in a controlled indoor environment.

Item: Postprocessing Optimization of RRT* Using Machine Learning and Information Theory for Robotic Navigation (2025)
Authors: Marcelo Saavedra Alcoba; Edgar Salazar Florez; Brayan G. Duran Toconas; Kevin Marlon Soza Mamani
This work presents an alternative post-processing approach for optimizing mobile-robot trajectories by combining vector-quantization techniques with information theory. We developed an algorithm based on Vector Quantization (VQ) and Kullback-Leibler Divergence (VQKL) that maintains the original RRT*'s obstacle-avoidance capabilities. Compared with the Ramer-Douglas-Peucker (RDP) algorithm, our methods demonstrate significant superiority: VQ achieves a 13% reduction in path length (versus RDP's 10%) while VQKL achieves 14%, along with an 83% (VQ) and 84% (VQKL) reduction in node count compared to the original RRT* output. These results are obtained through an adaptive optimization process that iteratively adjusts centroids using a progressive annealing scheme. To ensure trajectory feasibility, we implemented a validation system that verifies both geometric deviation from the original path and collision-free operation with obstacles.
Extensive simulations across 20 different environments with 100 trials each confirm that our method generates significantly shorter, more efficient, and safer trajectories, establishing a viable alternative for robotic path optimization.
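The vector-quantization step at the core of the VQ/VQKL post-processing above can be illustrated with a minimal Lloyd-style sketch that compresses a dense 2-D waypoint list to k representative nodes. It omits the progressive annealing, the Kullback-Leibler term, and the collision validation described in the abstract, and all names are illustrative (assumes k >= 2):

```python
def quantize_path(path, k, iters=50):
    """Compress a dense list of (x, y) waypoints to k centroids
    using plain Lloyd iterations (a minimal VQ sketch)."""
    # Seed centroids evenly spaced along the original path.
    centroids = [path[i * (len(path) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        # Assignment step: each waypoint joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for x, y in path:
            j = min(range(k), key=lambda c: (x - centroids[c][0]) ** 2
                                            + (y - centroids[c][1]) ** 2)
            clusters[j].append((x, y))
        # Update step: move each centroid to its cluster mean.
        for j, pts in enumerate(clusters):
            if pts:
                centroids[j] = (sum(p[0] for p in pts) / len(pts),
                                sum(p[1] for p in pts) / len(pts))
    return centroids
```

The full method would additionally anneal the number of centroids, penalize distributional drift from the original path with a KL term, and reject any centroid sequence that deviates geometrically or collides with obstacles.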