Machine learning-based prediction of distant metastasis risk in invasive ductal carcinoma of the breast.

Journal: PloS one
PMID:

Abstract

More than 90% of breast cancer (BC) deaths are caused by metastasis-related complications. Invasive ductal carcinoma (IDC) of the breast is the most common pathologic type of breast cancer and is highly prone to metastasizing to distant organs. BC patients who develop metastases are more likely to have a poor prognosis and poor quality of life, so it is important to identify distant metastasis in IDC as early as possible. In this study, we developed a non-invasive breast cancer classification system for detecting cancer metastasis. We used Jupyter notebooks (Anaconda) to develop Python modules for text mining, data processing, and machine learning (ML). Risk prediction models were constructed with four algorithms: Random Forest, XGBoost, Logistic Regression, and support vector machine (SVM). Additionally, we developed a hybrid model that combines these four algorithms as base models through a voting mechanism. The models were compared on accuracy, precision, recall, F1-score, and area under the ROC curve (AUC). The hybrid voting model showed the best prediction performance (accuracy: 0.867, precision: 0.929, recall: 0.805, F1-score: 0.856, AUC: 0.94). This stable risk prediction model can serve as a reference for doctors in assessing and diagnosing the risk of hematogenous metastasis in IDC. It may also improve doctors' work efficiency and help increase patients' chances of survival.
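
As an illustration of the voting mechanism and of the evaluation metrics named in the abstract, the following is a minimal sketch in plain Python. It assumes soft voting (averaging the four base models' predicted probabilities and thresholding at 0.5); the abstract does not state whether soft or hard voting was used. The probabilities and labels are invented toy values, not the study's data, and the function names `soft_vote` and `binary_metrics` are hypothetical. AUC is omitted for brevity.

```python
from statistics import mean


def soft_vote(prob_rows, threshold=0.5):
    """Average each patient's per-model probabilities and threshold the mean.

    prob_rows: one list per patient, holding the positive-class (metastasis)
    probability from each base model. Soft voting is an assumption here.
    """
    return [1 if mean(row) >= threshold else 0 for row in prob_rows]


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy example: 8 hypothetical patients, 1 = distant metastasis, 0 = none.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical probabilities in the order
# (Random Forest, XGBoost, Logistic Regression, SVM).
probs = [
    [0.9, 0.8, 0.7, 0.6],
    [0.6, 0.7, 0.4, 0.5],
    [0.3, 0.4, 0.6, 0.2],  # missed by the ensemble (false negative)
    [0.8, 0.9, 0.9, 0.8],
    [0.2, 0.1, 0.3, 0.2],
    [0.4, 0.3, 0.6, 0.3],
    [0.1, 0.2, 0.1, 0.2],
    [0.7, 0.6, 0.4, 0.5],  # flagged by the ensemble (false positive)
]

y_pred = soft_vote(probs)
metrics = binary_metrics(y_true, y_pred)
print(y_pred)
print(metrics)
```

In practice the base models would be fitted estimators (for example via a library ensemble wrapper) rather than hard-coded probabilities; the sketch only shows how the votes are combined and scored.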

Authors

  • Jingru Dong
    Shihezi University Medical College School of Medical College, Shihezi University, Shihezi, Xinjiang, China.
  • Ruijiao Lei
    Shihezi University Medical College School of Medical College, Shihezi University, Shihezi, Xinjiang, China.
  • Feiyang Ma
    Shihezi University Medical College School of Medical College, Shihezi University, Shihezi, Xinjiang, China.
  • Lu Yu
    State Key Laboratory Breeding Base of Green Pesticide and Agricultural Bioengineering, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Guizhou University, Huaxi District, Guiyang 550025, China.
  • Lanlan Wang
  • Shangzhi Xu
    Shihezi University Medical College School of Medical College, Shihezi University, Shihezi, Xinjiang, China.
  • Yunhua Hu
    Department of Public Health, Shihezi University School of Medicine, 832000, China.
  • Jialin Sun
    Shihezi University Medical College School of Medical College, Shihezi University, Shihezi, Xinjiang, China.
  • Wenwen Zhang
    Rutgers, the State University of New Jersey, New Brunswick, NJ, USA.
  • Haixia Wang
    Department of Ultrasound, Luohu People's Hospital, Shenzhen, China.
  • Li Zhang
    Department of Animal Nutrition and Feed Science, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, China.