Effects of Structural Reward Shaping on Biophysical Properties in RL-Trained Plasmid Generators

Journal: bioRxiv

Published Date: May 19, 2026

Abstract

We compare the efficacy and distributional effects of supervised fine-tuning (SFT) and reinforcement learning (RL) post-training for PlasmidGPT, a foundation model for whole-plasmid generation, using Group Relative Policy Optimization (GRPO) for the RL model. Using a biologically motivated reward function encoding functional annotations, length constraints, and repeat penalties, the RL model achieves a 71.6% quality control pass rate across 8 prompts on 4,000 sequences, compared to 4.3% for the pretrained baseline and 11.0% for SFT. A five-model reward ablation identifies the cassette arrangement bonus, which rewards correct promoter→CDS→terminator ordering, as the critical reward component. Rejection-sampling baselines indicate that the gain is not recovered by sampling more heavily from the base model. Beyond directly optimized features, RL-generated sequences converge toward real plasmid distributions in 3-mer composition, ORF length, and thermodynamic stability, properties we categorize as reward-correlated or indirectly shaped by the structural reward signal. Minimum free energy density independently converges to the real-plasmid regime under both SFT and RL despite these being parallel post-training paths. On a small curated hold-out set, RL improves continuation log-likelihood over the pretrained baseline on every sequence (mean Δ = +0.83 nats), with no degradation in next-token prediction.
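As a rough illustration of two ideas named in the abstract, the sketch below shows one way a cassette arrangement bonus could check promoter→CDS→terminator ordering, and how GRPO-style training standardizes rewards within a group of samples drawn for the same prompt. It is not the authors' reward function: the `(feature_type, start)` annotation format, the function names, the bonus value of 1.0, and the use of the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev


def cassette_bonus(features, bonus=1.0):
    """Return `bonus` if a promoter, a CDS, and a terminator occur in that order.

    `features` is a list of (feature_type, start) tuples. Both this format and
    the bonus value are hypothetical, chosen only for illustration.
    """
    # Order annotations by start coordinate along the sequence.
    ordered = [ftype for ftype, _ in sorted(features, key=lambda f: f[1])]
    wanted = iter(("promoter", "CDS", "terminator"))
    target = next(wanted)
    for ftype in ordered:
        if ftype == target:
            # Advance to the next required element; None means all were found.
            target = next(wanted, None)
            if target is None:
                return bonus
    return 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one prompt's sample group (GRPO-style).

    Each sample's advantage is its reward minus the group mean, divided by
    the group standard deviation. `eps` guards against a zero-variance group.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Two hypothetical annotated samples for the same prompt.
good = [("CDS", 500), ("promoter", 100), ("terminator", 1500)]
bad = [("terminator", 10), ("promoter", 100), ("CDS", 500)]
rewards = [cassette_bonus(good), cassette_bonus(bad)]
advantages = group_relative_advantages(rewards)
```

In this toy group, the correctly ordered sample receives a positive advantage and the misordered one a negative advantage, which is the mechanism by which a structural bonus can steer the policy. A full reward would also combine the annotation, length, and repeat terms the abstract lists.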

Authors

Thiel, M.; Cunningham, A.; Barnes, C. P.

