Integration of single-cell and bulk RNA sequencing data using machine learning identifies oxidative stress-related genes LUM and PCOLCE2 as potential biomarkers for heart failure.
Journal:
International journal of biological macromolecules
PMID:
39929468
Abstract
Oxidative stress (OS) is a pivotal mechanism driving the progression of cardiovascular diseases, particularly heart failure (HF). However, the comprehensive characterisation of OS-related genes in HF remains largely unexplored. In the present study, we analysed single-cell RNA sequencing datasets from the Gene Expression Omnibus and OS gene sets from GeneCards. We identified 167 OS-related genes potentially linked to HF by applying gene-set scoring algorithms such as AUCell, UCell, singscore, ssGSEA, and AddModuleScore, combined with correlation analysis. Subsequently, we used feature selection algorithms, including least absolute shrinkage and selection operator (LASSO), XGBoost, Boruta, random forest, gradient boosting machines, decision trees, and support vector machine recursive feature elimination (SVM-RFE), to identify lumican (LUM) and procollagen C-endopeptidase enhancer 2 (PCOLCE2) as key biomarker candidates with significant diagnostic potential. Bulk RNA sequencing confirmed their elevated expression in patients with HF, highlighting their predictive utility. Single-cell analysis further revealed their upregulation primarily in fibroblasts, emphasising their cell-specific role in HF. To validate these findings, we developed a transverse aortic constriction-induced HF mouse model that showed enhanced cardiac OS activity and significant PCOLCE2 upregulation in the HF group. These results provide strong evidence of the involvement of OS-related mechanisms in HF. Herein, we propose a diagnostic strategy that provides novel insights into the molecular mechanisms underlying HF. However, further studies are required to validate its clinical utility and support its application in the diagnosis of HF.
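The abstract does not state how the outputs of the seven feature selection algorithms were combined. One common approach, assumed here purely for illustration, is to keep only the genes that several selectors agree on. The minimal Python sketch below shows that consensus step. The function name `consensus_features`, the `min_votes` parameter, and the placeholder gene names are all hypothetical and are not taken from the study.

```python
from collections import Counter


def consensus_features(selections, min_votes=None):
    """Return features picked by at least `min_votes` selectors, sorted by name.

    `selections` maps a selector name to the collection of features it kept.
    When `min_votes` is None, a feature must be picked by every selector
    (a strict intersection). This is an assumed rule, not the paper's.
    """
    if min_votes is None:
        min_votes = len(selections)
    votes = Counter()
    for genes in selections.values():
        # set() guards against a selector listing the same gene twice
        votes.update(set(genes))
    return sorted(gene for gene, n in votes.items() if n >= min_votes)


# Toy input with placeholder gene names (not real results).
toy_selections = {
    "lasso": {"GENE_A", "GENE_B", "GENE_C"},
    "random_forest": {"GENE_A", "GENE_B", "GENE_D"},
    "svm_rfe": {"GENE_A", "GENE_B", "GENE_E"},
}

strict = consensus_features(toy_selections)                # kept by all three
majority = consensus_features(toy_selections, min_votes=2)  # kept by two or more
print(strict, majority)
```

In practice each entry in `toy_selections` would hold the genes retained by one fitted model. Lowering `min_votes` trades stringency for a larger candidate list.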