Breathomics Analysis for Early Diagnosis of Lung Cancer Based on PTR-TOF MS: A Large Sample Size Cross-Sectional Study.

Journal: Respirology (Carlton, Vic.)

Abstract

INTRODUCTION: Lung cancer remains a leading cause of cancer mortality globally, emphasising the critical need for non-invasive and cost-effective early screening methods. Breath analysis, which detects disease-specific volatile organic compounds (VOCs), presents a promising diagnostic avenue.

METHODS: This cross-sectional study enrolled 4515 participants, including 4099 nonmalignant controls and 416 lung cancer patients. Exhaled breath samples were analysed using proton transfer reaction time-of-flight mass spectrometry (PTR-TOF MS). Machine learning algorithms, particularly Light Gradient Boosting Machine (LGBM), were employed to construct classification models for distinguishing lung cancer from healthy controls and early-stage lung cancer from benign pulmonary nodules. Model interpretability was assessed using SHAP values.

RESULTS: The LGBM model demonstrated superior performance, achieving 95% sensitivity, 98% specificity, and 98% accuracy for discriminating lung cancer from healthy controls. For the clinically challenging task of distinguishing early-stage lung cancer from benign nodules, LGBM achieved 97% sensitivity, 98% specificity, and 98% accuracy. SHAP analysis identified alpha-pinene (m/z 137) and methyl methacrylate (m/z 101) as the most significant VOCs.

CONCLUSION: This large-scale study validates PTR-TOF MS-based breath analysis combined with machine learning as a robust, non-invasive tool for early lung cancer detection. The LGBM model, supported by SHAP interpretability, offers high diagnostic accuracy in large cohorts. Future work will expand to diverse histological subtypes and multicenter validation.

TRIAL REGISTRATION: ChiCTR2500101879.
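
The sensitivity, specificity, and accuracy figures reported in the results follow from a confusion matrix of predicted versus reference diagnoses. The sketch below shows how the three metrics are conventionally computed. The class name `ConfusionCounts`, the helper functions, and the counts in `example` are hypothetical and chosen only for illustration; they are not data from the study, whose per-task confusion matrices are not given in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing model predictions with the reference diagnosis."""
    tp: int  # cancer cases classified as cancer
    fn: int  # cancer cases classified as control
    tn: int  # controls classified as control
    fp: int  # controls classified as cancer


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true cancer cases that the model detects."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of true controls that the model correctly clears."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all participants classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Hypothetical counts (NOT study data): 100 cases and 1000 controls.
example = ConfusionCounts(tp=95, fn=5, tn=980, fp=20)

if __name__ == "__main__":
    print(
        f"sensitivity={sensitivity(example):.2f} "
        f"specificity={specificity(example):.2f} "
        f"accuracy={accuracy(example):.2f}"
    )
```

With these illustrative counts, sensitivity is 0.95, specificity is 0.98, and accuracy is about 0.977. Because controls outnumber cases, accuracy sits close to specificity, which is worth keeping in mind when reading accuracy figures from an imbalanced cohort.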
