Machine learning-based identification of cuproptosis-related lncRNA biomarkers in diffuse large B-cell lymphoma.

Journal: Cell biology and toxicology
PMID:

Abstract

Multiple machine learning techniques were employed to identify key long non-coding RNA (lncRNA) biomarkers associated with cuproptosis in diffuse large B-cell lymphoma (DLBCL). Analysis of data from the TCGA and GEO databases identified 126 significant cuproptosis-related lncRNAs. Several feature selection methods (univariate filtering, Lasso, Boruta, and Random Forest) were combined with a Transformer-based model to build a prognostic tool for predicting risk scores, and the model was validated by fivefold cross-validation, in which it showed high accuracy and robustness. Permutation feature importance singled out MALAT1, which was then validated experimentally in DLBCL cell lines: MALAT1 knockdown reduced cell proliferation, supporting its role in proliferation and its potential as a therapeutic target. This integrated approach improves the precision of biomarker identification, yields a prognostic model for DLBCL, and indicates that these lncRNAs could inform personalized treatment strategies. The study highlights the value of combining diverse machine learning methods to advance DLBCL research and the development of targeted cancer therapies.
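
The abstract does not describe how permutation feature importance was computed. As an illustration only, the following minimal Python sketch shows the general idea under fivefold cross-validation: shuffle one feature at a time in the held-out fold and record how much the accuracy drops. Everything here is an assumption made for the example and not the authors' pipeline: the synthetic data (in which only feature 0 drives the label), the nearest-centroid classifier standing in for the Transformer-based model, and all function names.

```python
import random
import statistics


def make_data(n=200, n_features=5, seed=0):
    """Synthetic data (hypothetical): only feature 0 determines the label."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n)]
    y = [1 if row[0] + 0.1 * rng.gauss(0, 1) > 0 else 0 for row in X]
    return X, y


def fit_centroids(X, y):
    """Nearest-centroid classifier: store the per-class mean of each feature."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == t for x, t in zip(X, y)) / len(y)


def permutation_importance(centroids, X, y, n_repeats=10, seed=0):
    """Mean accuracy drop when one feature column is shuffled, per feature."""
    rng = random.Random(seed)
    base = accuracy(centroids, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            drops.append(base - accuracy(centroids, X_perm, y))
        importances.append(statistics.fmean(drops))
    return importances


def cross_validated_importance(X, y, k=5, seed=0):
    """Average held-out permutation importance over k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    per_fold = []
    for fold in folds:
        held_out = set(fold)
        X_train = [X[i] for i in idx if i not in held_out]
        y_train = [y[i] for i in idx if i not in held_out]
        X_test = [X[i] for i in fold]
        y_test = [y[i] for i in fold]
        model = fit_centroids(X_train, y_train)
        per_fold.append(permutation_importance(model, X_test, y_test, seed=seed))
    return [statistics.fmean(col) for col in zip(*per_fold)]


X, y = make_data()
importances = cross_validated_importance(X, y)
top_feature = max(range(len(importances)), key=importances.__getitem__)
```

In this toy setting the informative feature receives by far the largest importance, which is the same logic by which a single lncRNA could be ranked above the other candidates; a real analysis would use expression data and a survival-appropriate metric rather than classification accuracy.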

Authors

  • Wenhao Ouyang
    Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Department of Medical Oncology, Breast Tumor Centre, Phase I Clinical Trial Centre, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China.
  • Zijia Lai
    Breast Tumor Center, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510120, Guangdong, China.
  • Hong Huang
    Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland, SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland, Department of Microbiology and Immunology and Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore MD, USA, SIB Swiss Institute of Bioinformatics, 1 Rue Michel Servet, 1211 Geneva, Switzerland, Department of Medicine and Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore MD, USA, Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158, USA, School of Information, University of South Florida, Tampa, FL, 33647, USA, Genomics Division, Lawrence Berkeley National Lab, 1 Cyclotron Rd., Berkeley, 94720 CA USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva, Switzerland, ETH Zurich, Department of Computer Science, Universitätstr. 19, 8092 Zürich, Switzerland, SIB Swiss Institute of Bioinformatics, Universitätstr. 6, 8092 Zürich, Switzerland and University College London, Gower St, London WC1E 6BT, UK.
  • Li Ling
    Key Laboratory of Food Quality and Safety of Guangdong Province, College of Food Science, South China Agricultural University, Guangzhou, 510642, China.