scPlantAnnotate: an accurate and robust transformer-based model for plant cell type annotation.
Journal:
Journal of Advanced Research
Published Date:
Jan 17, 2026
Abstract
INTRODUCTION: Accurate cell type annotation remains a major bottleneck in plant single-cell RNA sequencing (scRNA-seq), where existing tools are often adapted from animal studies and perform sub-optimally on plant data. The lack of plant-specific computational frameworks limits the construction of plant cell atlases and downstream biological discovery.
OBJECTIVES: We develop and evaluate scPlantAnnotate, a Transformer-based reference annotation framework tailored for plant scRNA-seq data, and benchmark it against state-of-the-art deep learning and conventional methods across multiple plant species.
METHODS: Species-specific scPlantAnnotate models were trained using curated datasets from Arabidopsis thaliana, Zea mays, Oryza sativa, and Glycine max. We compared scPlantAnnotate with leading baselines under both standard random-split evaluation and a more stringent leave-one-dataset-out setting, which tests robustness to completely unseen datasets and tissue types.
RESULTS: scPlantAnnotate consistently outperforms existing approaches across all four species under random-split evaluation. In the leave-one-dataset-out setting for A. thaliana, where performance drops markedly for all methods due to strong batch effects and dataset heterogeneity, scPlantAnnotate nonetheless achieves the highest Accuracy, Macro-F1, Balanced Accuracy, and Macro-AUROC on average and ranks first on most held-out datasets. These results demonstrate improved robustness to dataset shifts, a critical yet underexplored challenge in plant scRNA-seq analysis. A freely accessible web server enables users to annotate their own datasets using pretrained models.
CONCLUSION: scPlantAnnotate provides a plant-specific, Transformer-based framework for single-cell annotation that delivers state-of-the-art performance and enhanced robustness to unseen datasets. By addressing limitations of existing tools and enabling scalable reference-based annotation, scPlantAnnotate supports the development of comprehensive plant cell atlases and facilitates broader use of single-cell genomics in plant biology.
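The leave-one-dataset-out protocol and three of the reported metrics can be illustrated with a minimal sketch. This is not the authors' code: the function names, the dataset IDs, and the toy labels are all hypothetical, and Macro-AUROC is omitted because it requires per-class prediction scores rather than hard labels.

```python
from collections import defaultdict


def leave_one_dataset_out(dataset_ids):
    """Yield (held_out, train_idx, test_idx), holding out one dataset at a time."""
    for held_out in sorted(set(dataset_ids)):
        train_idx = [i for i, d in enumerate(dataset_ids) if d != held_out]
        test_idx = [i for i, d in enumerate(dataset_ids) if d == held_out]
        yield held_out, train_idx, test_idx


def per_class_counts(y_true, y_pred):
    """Count true positives, false positives, and false negatives per class."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1
            fp[p] += 1
    return tp, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of cells whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall over the classes present in y_true."""
    tp, _, fn = per_class_counts(y_true, y_pred)
    classes = sorted(set(y_true))
    return sum(tp[c] / (tp[c] + fn[c]) for c in classes) / len(classes)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over classes seen in y_true or y_pred."""
    tp, fp, fn = per_class_counts(y_true, y_pred)
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical dataset of origin for five cells.
    dataset_ids = ["A", "A", "B", "B", "C"]
    for held_out, train_idx, test_idx in leave_one_dataset_out(dataset_ids):
        print(held_out, train_idx, test_idx)

    # Hypothetical true and predicted cell type labels for one held-out dataset.
    y_true = ["root", "root", "leaf", "leaf"]
    y_pred = ["root", "leaf", "leaf", "leaf"]
    print(accuracy(y_true, y_pred), balanced_accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

Unlike a random split, every cell from the held-out dataset is unseen during training, so batch effects specific to that dataset cannot be learned. That is consistent with the marked performance drop the abstract reports for all methods in this setting.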