Constructing a predictive model for acute mastitis in lactating women based on machine learning.
Journal:
Scientific Reports
Published Date:
Aug 22, 2025
Abstract
Acute lactational mastitis is a common complication in lactating women that affects their physical condition, breastfeeding, mental health, and daily life. Its etiology is complex, and its early symptoms are nonspecific. Diagnosis is therefore often delayed, allowing the disease to progress to abscess or more severe infection, which reduces the effectiveness of treatment, prolongs recovery, and can trigger further complications. Nevertheless, research on the risk factors for acute mastitis during lactation remains incomplete. This retrospective case-control study collected data from 369 patients with acute mastitis and 447 healthy controls, covering indicators such as age, parity, and history of breast surgery. Four machine learning (ML) algorithms, Logistic Regression (LR), Naive Bayes (NB), XGBoost, and Multilayer Perceptron (MLP), were trained and validated on these data to construct a model predicting the occurrence of acute mastitis in lactating women and to analyze how the individual factors influence the disease. The ML models showed high accuracy in differentiating patients with acute mastitis from non-patients. We evaluated twelve indicators to determine their influence on the occurrence of acute mastitis in lactating women: age, primiparity, history of breast surgery, cracked nipples, external breast trauma, puerperium, gestational diabetes, C-reactive protein (CRP), procalcitonin (PCT), neutrophils (NE), white blood cells (WBC), and abnormal nipple discharge.
On the test set, the MLP model performed best of the four ML models across the evaluation metrics, with the highest area under the receiver operating characteristic (ROC) curve (AUROC, 0.898), sensitivity (0.820), specificity (0.863), and F1 score (0.849), and an accuracy of 0.840. Decision Curve Analysis (DCA) indicated that the MLP achieved the highest net benefit across most of the threshold range. Of the twelve indicators, five were significantly associated with the occurrence of acute mastitis: age, cracked nipples, CRP, NE, and WBC. We developed a prediction model for acute lactational mastitis and identified five key indicators closely related to its occurrence. The model predicts the occurrence of acute lactational mastitis and provides a reference for timely, targeted clinical intervention as well as accurate diagnosis and treatment.
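The evaluation metrics named in the abstract (AUROC, sensitivity, specificity, F1 score, accuracy, and the net benefit used in DCA) can be illustrated with a minimal Python sketch. This is not the study's code: the function names, the 0.5 threshold, and the ten toy labels and probabilities are assumptions made purely for illustration, and the net benefit uses the standard definition TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int
    fp: int
    tn: int
    fn: int

    @property
    def n(self) -> int:
        return self.tp + self.fp + self.tn + self.fn


def confusion_at(y_true, y_prob, threshold):
    """Count outcomes when predicting 'mastitis' for probability >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    return Confusion(tp, fp, tn, fn)


def summary_metrics(c):
    """Sensitivity, specificity, F1, and accuracy from a confusion matrix.

    Assumes every denominator is non-zero (both classes present and at
    least one positive prediction).
    """
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    precision = c.tp / (c.tp + c.fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (c.tp + c.tn) / c.n
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }


def auroc(y_true, y_prob):
    """AUROC as the share of (case, control) pairs ranked correctly; ties count half."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    score = 0.0
    for pc in cases:
        for pn in controls:
            if pc > pn:
                score += 1.0
            elif pc == pn:
                score += 0.5
    return score / (len(cases) * len(controls))


def net_benefit(c, threshold):
    """Net benefit at threshold probability pt, as plotted in a decision curve."""
    return c.tp / c.n - (c.fp / c.n) * threshold / (1 - threshold)


if __name__ == "__main__":
    # Hypothetical toy data, NOT taken from the study.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
    y_prob = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05, 0.35]
    c = confusion_at(y_true, y_prob, threshold=0.5)
    print(summary_metrics(c))
    print("AUROC:", auroc(y_true, y_prob))
    print("Net benefit at 0.5:", net_benefit(c, 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds for each model and comparing the curves with the "treat all" and "treat none" strategies.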