Collaborative twin actors framework using deep deterministic policy gradient for flexible batch processes.
Journal:
Neural networks : the official journal of the International Neural Network Society
Published Date:
Apr 12, 2025
Abstract
Batch processing is widely used in the process industry owing to its inherent efficiency in achieving desired products, and its repetitive nature makes it well suited to batch-to-batch learning control, which has traditionally been regarded as a robust strategy for batch process control. However, flexible operating conditions in practical batch systems often leave such learning control without sufficient prior learning information, hindering performance optimization. This article presents a novel approach to flexible batch process control using deep reinforcement learning (DRL) with twin actors. Specifically, a collaborative twin-actor-based deep deterministic policy gradient (CTA-DDPG) method is proposed to generate control policies and ensure safe operation across varying trial lengths and initial conditions. The approach sequentially constructs two sets of actor-critic networks that share a single critic: the first actor explores a meta-policy during an offline stage, while the second acts as a supplementary agent that enhances control performance during an online stage. To ensure robust policy transfer and efficient learning, a policy integration mechanism and a spatial-temporal experience replay strategy are incorporated, improving transfer stability and learning efficiency. The performance of CTA-DDPG is evaluated on both numerical examples and a nonlinear injection molding process for tracking control. The results demonstrate the effectiveness and superiority of the proposed method in achieving the desired control outcomes.
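To make the twin-actor arrangement described above more concrete, the following is a minimal, hypothetical PyTorch sketch of two deterministic actors sharing one critic, with a simple weighted policy integration. The class names, network sizes, and the blending weight `beta` are illustrative assumptions, not the authors' implementation; the offline/online training stages and the spatial-temporal experience replay are omitted.

```python
# Hypothetical sketch of a twin-actor, shared-critic DDPG-style structure.
# All hyperparameters and the integration rule are assumptions for illustration.
import torch
import torch.nn as nn


class Actor(nn.Module):
    """Deterministic policy network mapping states to bounded actions."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


class Critic(nn.Module):
    """Shared Q-network evaluating state-action pairs for both actors."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, action], dim=-1))


class CollaborativeTwinActors:
    """One actor holds the offline meta-policy, the other acts as an online
    supplementary agent; both are evaluated by a single shared critic."""
    def __init__(self, state_dim: int, action_dim: int, beta: float = 0.5):
        self.offline_actor = Actor(state_dim, action_dim)   # meta-policy (offline stage)
        self.online_actor = Actor(state_dim, action_dim)    # supplementary agent (online stage)
        self.shared_critic = Critic(state_dim, action_dim)  # critic shared by both actors
        self.beta = beta  # hypothetical policy-integration weight

    @torch.no_grad()
    def act(self, state: torch.Tensor) -> torch.Tensor:
        # Policy integration: blend the meta-policy with the online correction.
        return (1.0 - self.beta) * self.offline_actor(state) + self.beta * self.online_actor(state)


if __name__ == "__main__":
    agent = CollaborativeTwinActors(state_dim=4, action_dim=1)
    state = torch.randn(1, 4)
    action = agent.act(state)
    q_value = agent.shared_critic(state, action)
    print(action.shape, q_value.shape)
```

In this sketch, sharing one critic means both actors are updated against a common value estimate, which is one plausible way to read the "shared critic" design; how the paper actually schedules the offline and online updates is not specified in the abstract.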