A comparative study on TB incidence and HIV-TB coinfection using machine learning models on WHO global TB dataset.
Journal:
Scientific Reports
PMID:
40258881
Abstract
Tuberculosis (TB), a deadly and contagious disease caused by Mycobacterium tuberculosis, remains a significant global public health threat. HIV co-infection significantly increases the risk of active TB recurrence and prolongs TB treatment. This study uses advanced machine learning (ML) techniques to predict TB incidence and HIV-TB co-infection from the 2023 World Health Organization (WHO) Global TB burden database. The two main target variables in the dataset are the estimated incidence rate of all forms of TB per 100,000 people (e_inc_100k) and the estimated incidence rate of HIV-positive TB per 100,000 people (e_inc_tbhiv_100k). Model performance was evaluated using accuracy, precision, recall, F1 score, and the Receiver Operating Characteristic-Area Under the Curve (ROC-AUC). For e_inc_100k, the Extreme Gradient Boosting (XGB) model outperformed the other models, with 99.7% accuracy, 99.80% precision, 99.6% recall, a 99.7% F1 score, and a 99.7% ROC-AUC score. For e_inc_tbhiv_100k, the Gradient Boosting (GB) model performed best, with 98.58% accuracy, 98.32% precision, 98.73% recall, a 98.53% F1 score, and a 98.58% ROC-AUC score. Finally, the study aligns with the UNAIDS and WHO End TB Strategy, indicating progress in combating TB and TB-HIV co-infection in public health workflows.
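The evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. It assumes a binary classification setup (labels 0 and 1), which the abstract does not spell out; how the incidence rates were turned into classes is not stated here. The function names `classification_metrics` and `roc_auc` are hypothetical helpers written for illustration, not the authors' code.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (0 or 1).

    Assumption: label 1 is the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Here `y_pred` holds hard class predictions, while `scores` holds the model's predicted probabilities or decision scores; ROC-AUC is computed from the scores rather than from thresholded labels. The pairwise comparison is quadratic in the number of examples, which is acceptable for a sketch but would likely be replaced by a library routine on larger data.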