Convergent genome- and gene-level constraints shape repeated environmental adaptation in grasses
Journal: bioRxiv
Published Date: Jun 3, 2026
Abstract
Grasses (Poaceae) dominate terrestrial ecosystems and sustain global food security, yet the genomic principles enabling their repeated adaptation to extreme environments remain unresolved. Combining dense phylogenomic sampling, global environmental data, and genomic large language models (gLLMs), we characterize the mutational targets underlying environmental adaptation across 707 genomes from 569 species spanning 17 climate zones. We identify 19–30 phylogenetically independent transitions into extreme temperature, water, and soil environments, accompanied by convergent shifts in genome-scale molecular properties, including the nitrogen-to-carbon balance and the biosynthetic cost of the proteome. Our gLLM-informed phylogenetic mixed modeling framework identifies 330 genes that repeatedly underlie distinct axes of environmental adaptation, highlighting the importance of protein modification and localization within extracellular and organellar compartments. Overlaying independent tests of convergent adaptation identifies 17 high-confidence candidates for further characterization. Together, our results show that grass adaptation is canalized by layered constraints at genome-wide and gene-specific scales, producing predictable evolutionary trajectories.