Revolutionizing Lung Cancer Detection: Evaluating AI Models for VOC Analysis and Unveiling Key Exhaled Biomarkers
Journal:
medRxiv
Published Date:
Jan 1, 2025
Abstract
Volatile Organic Compounds (VOCs) are organic chemicals that readily vaporize at room temperature and are emitted from diverse sources, including paints, building materials, and biological processes [1]. While the understanding of VOC production within the human body remains limited, it is well established that VOCs arising from cellular metabolic activities accumulate in significant concentrations in the bloodstream [2]. When cellular processes are disrupted by bacterial or viral infections or by the onset of cancer, these metabolic pathways undergo substantial alterations, leading to distinct changes in the VOC profiles generated by lymphocytes and other cell types. These alterations manifest in the VOC composition of the blood and exhaled breath [3]. This study seeks to advance the field by identifying optimal AI-based analytical methods for lung cancer diagnosis, assessing the accuracy of non-invasive VOC-based diagnostic approaches, and characterizing the key VOCs influenced by the metabolic shifts associated with cancer cells. The analysis was conducted using a dataset of 427 anonymized cases, including patients with benign lung nodules, patients with malignant lung cancers, and healthy controls. Using this dataset, we trained a neural network, a support vector machine (SVM), a convolutional neural network (CNN), a recurrent neural network (RNN), and an XGBoost model so that the predictive accuracy of each model could be compared. To generate multiple comparisons and evaluation metrics, the dataset was split into individual “case” comparisons, each comparing patients without tumors against those with benign or malignant tumors, plus one case comparing all three classes together. 
This, combined with train-test splits of 70/30 (where 70% of the dataset is randomly selected to train the model and the remaining 30% is used to evaluate it), 80/20, and 90/10, allows for a thorough evaluation of the ROC score, precision, and F1 score of each model across multiple conditions. In addition to evaluating the viability of VOC-based diagnosis using artificial intelligence models, this study aimed to identify the key VOCs that contribute to a successful lung cancer diagnosis. To determine which VOCs are important for diagnosis, the dataset was analyzed using a Random Forest (RF) model, which is an ensemble of decision trees, and an XGBoost model (using SHAP values), each independently ranking and outputting the 5 most important VOCs for distinguishing lung cancer, benign lung nodules, and healthy patients. Using the RF model, we identified the VOCs most critical for distinguishing among all three classes (benign lung nodules, malignant lung cancers, and controls). The top VOCs were C13H22O, C4H8O2, C4H8O, C7H11O, and C5H10O, followed by C2H4O2, C6H10O2, and C6H12O. When differentiating between the cancer and control classes, the most important VOCs were C4H8O, C4H8O2, C13H22O, C11H22O, and C7H6O. For distinguishing the benign and control classes, the key VOCs were C13H22O, C7H11O, C4H8O2, C6H10O2, and C2H4O2. For both the benign-versus-cancerous and the benign-versus-control comparisons, XGBoost was the most accurate and effective model. However, when distinguishing cancerous nodules from control patients, the neural network outperformed the XGBoost model, with a weighted F1 score of 0.96 and an ROC score of 0.96 compared to an F1 score of 0.94 and an ROC score of 0.94 for XGBoost. CNNs and RNNs also performed adequately, though below their decision-tree-based and hyperplane-based counterparts, and each had its own strengths. 
CNNs provided higher “top end” performance, with disproportionately higher accuracies on the control and cancer classes, while RNNs had a higher average, with similar accuracies across all three classes (i.e., they performed better on the benign class). In their best comparisons, the CNN reached an overall accuracy of 0.77 (using case 4 and a 75/25 split), while the RNN reached an accuracy of 0.79 (using the multiclass dataset and an 80/20 split). The VOCs identified in this study provide a foundation for future research that aims to diagnose lung cancer accurately using AI-based analysis and to elucidate the metabolic changes that enable effective classification (Table 1.3). Each of the machine learning methods evaluated in this paper has distinct strengths, and this study attempts to quantify the benefits of each through comparisons that mirror what these models would have to analyze in a screening environment. By highlighting these advantages, this research aims to guide future efforts to combine models and further enhance the accuracy and reliability of VOC-based lung cancer diagnostic methods.
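As an illustration of the evaluation protocol summarized above, the following minimal Python sketch enumerates the comparison cases and train-test splits and computes a support-weighted F1 score. It is not the authors' code. The case names, the assumption of four cases, the `rows` layout of `(features, label)` pairs, and every function name are hypothetical, and a plain shuffled hold-out stands in for whatever splitting procedure the study actually used. Only the Python standard library is required.

```python
import random
from collections import Counter

# Hypothetical case definitions: three pairwise comparisons plus one multiclass case.
CASES = {
    "cancer_vs_control": {"cancer", "control"},
    "benign_vs_control": {"benign", "control"},
    "benign_vs_cancer": {"benign", "cancer"},
    "multiclass": {"benign", "cancer", "control"},
}
# Held-out fractions for the 70/30, 80/20, and 90/10 splits.
TEST_FRACTIONS = [0.30, 0.20, 0.10]


def train_test_split(rows, test_fraction, seed=0):
    """Shuffle a copy of rows and hold out test_fraction of them for evaluation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its share of y_true."""
    support = Counter(y_true)
    score = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = count - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * count / len(y_true)
    return score


def evaluation_grid(rows):
    """Yield (case name, test fraction, train rows, test rows) for every condition."""
    for name, labels in CASES.items():
        subset = [row for row in rows if row[1] in labels]
        for fraction in TEST_FRACTIONS:
            train, test = train_test_split(subset, fraction)
            yield name, fraction, train, test
```

Each condition yielded by `evaluation_grid` would be passed to a classifier of choice, with `weighted_f1` applied to the held-out labels and predictions. Weighting by class support matters here because the three patient groups need not be equally represented in the dataset.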