Serum peptidomics by MALDI-TOF MS coupled with machine learning approaches for diagnosis of primary liver cancer.

Journal: Analytical and bioanalytical chemistry
Published Date:

Abstract

Primary liver cancer (PLC) is one of the most common malignant tumors worldwide. Due to its insidious onset, 70 to 80% of patients are diagnosed at an advanced stage. Only about 20% of patients present detectable symptoms in the early stages, and this lack of obvious early signs portends a poor prognosis. In this study, a diagnostic strategy for PLC was developed by integrating high-throughput serum peptidomics, chemometric tools, and machine learning (ML) algorithms. A total of 433 serum samples from the PLC group and 145 samples from the healthy control (HC) group were collected and analyzed by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). The raw MS data were first preprocessed, and the global similarities and differences between the spectral patterns of the PLC and HC groups were investigated from a fingerprinting perspective. Feature selection was performed using partial least squares discriminant analysis (PLS-DA), random forest (RF), and least absolute shrinkage and selection operator (LASSO) algorithms. Integrating the results of the three algorithms identified 12 key features with high representativeness and discriminatory power for PLC diagnosis. These features were then used to construct and evaluate models based on nine ML algorithms. Among them, the support vector machine (SVM)-based model demonstrated the best discriminant performance, with sensitivity, accuracy, and specificity of 99%, 95%, and 95%, respectively. The Shapley additive explanation (SHAP) method was applied to rank feature importance and to further interpret the SVM-based diagnostic model. This approach shows potential for large-scale screening and detection of PLC and may serve as a clinical reference.
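
The sensitivity, specificity, and accuracy figures quoted above are standard ratios over a binary confusion matrix. The following is a minimal sketch of how such figures are computed, not the authors' evaluation code. It assumes PLC is the positive class and HC the negative class; the names `ConfusionCounts` and `diagnostic_metrics` and the sample counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary classifier (assumed: PLC = positive, HC = negative)."""
    tp: int  # PLC samples classified as PLC
    fn: int  # PLC samples classified as HC
    tn: int  # HC samples classified as HC
    fp: int  # HC samples classified as PLC


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, and accuracy as fractions in [0, 1]."""
    positives = c.tp + c.fn
    negatives = c.tn + c.fp
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must contain at least one sample")
    return {
        "sensitivity": c.tp / positives,               # true positive rate
        "specificity": c.tn / negatives,               # true negative rate
        "accuracy": (c.tp + c.tn) / (positives + negatives),
    }


# Hypothetical counts for illustration only; these are not data from the study.
counts = ConfusionCounts(tp=90, fn=10, tn=40, fp=10)
metrics = diagnostic_metrics(counts)
```

Because accuracy pools both classes, it depends on the ratio of PLC to HC samples in the evaluation set, whereas sensitivity and specificity are each computed within a single class.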
