Artificial intelligence predicts sex-specific risk of metabolic dysfunction-associated steatotic liver disease.

Journal: Biology of sex differences
Published Date:

Abstract

BACKGROUND & AIMS: Metabolic dysfunction-associated steatotic liver disease (MASLD) exhibits well-established sex differences across its risk factors, disease progression, and liver-related and extrahepatic outcomes. We trained sex-specific machine learning (ML) algorithms using routine clinical data to evaluate sex-specific learning patterns and diagnostic performance.

METHODS: In this cross-sectional study conducted at a cardiology referral center, 446 adults were enrolled. Participants were divided into training (127 men, 185 women) and test sets (55 men, 79 women). Eight ML classifiers were trained on the overall dataset and separately for men and women to predict MASLD presence and steatosis severity (no/mild/moderate-to-severe) as assessed by ultrasonography. Hyperparameters were tuned using grid search cross-validation, and model performance was evaluated on an unseen test set.

RESULTS: The prevalence of MASLD was 63.6% among participants; 41.2% had mild, and 22.4% had moderate-to-severe steatosis. Compared to models trained on the overall dataset, sex-specific modeling improved diagnostic performance in men but remained suboptimal in women. For MASLD presence, top-performing models achieved AUC/F1 scores of 0.769/0.856 overall, 0.793/0.897 in men, and 0.681/0.794 in women. For steatosis severity, respective AUC/F1 scores were 0.761/0.671 overall, 0.723/0.608 in men, and 0.718/0.571 in women. Sensitivity analyses using stratified cross-validation confirmed the performance gap between men and women. Threshold analyses showed acceptable rule-in and rule-out performance in men but suboptimal performance in women. Feature-importance rankings differed substantially between sexes, indicating distinct sex-specific learning patterns.

CONCLUSIONS: Artificial intelligence-based algorithms identify sex-specific learning patterns in MASLD steatosis risk prediction. Routine clinical variables appear more informative for men, while showing weaker algorithmic performance in women. This finding suggests that failure to train MASLD risk algorithms with sex-specific risk factors may increase the risk of misclassification in women. Therefore, achieving more equitable and clinically reliable models will require integrating women-specific risk factors and adopting sex-stratified data processing strategies within MASLD prediction frameworks to reduce diagnostic inequities and support more personalized care.
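
The sex-stratified AUC/F1 comparison reported above can be illustrated with a minimal sketch. It is not the study's pipeline: the function names (`auc_score`, `f1_score`, `metrics_by_group`), the 0.5 decision threshold, and the toy records are all assumptions made for illustration, and the numbers bear no relation to the study data. The sketch only shows how a single set of predicted probabilities can yield different discrimination metrics when scored separately for men and women.

```python
from collections import defaultdict

def auc_score(y_true, y_score):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def metrics_by_group(records, threshold=0.5):
    """Score (group, label, probability) records separately per group.
    The threshold is an illustrative assumption, not a study parameter."""
    groups = defaultdict(list)
    for group, label, score in records:
        groups[group].append((label, score))
    results = {}
    for group, rows in groups.items():
        y_true = [label for label, _ in rows]
        scores = [score for _, score in rows]
        y_pred = [1 if s >= threshold else 0 for s in scores]
        results[group] = {
            "auc": auc_score(y_true, scores),
            "f1": f1_score(y_true, y_pred),
        }
    return results

# Hypothetical toy records: (sex, MASLD present, predicted probability).
toy = [
    ("M", 1, 0.9), ("M", 1, 0.8), ("M", 0, 0.3), ("M", 0, 0.2),
    ("F", 1, 0.7), ("F", 1, 0.4), ("F", 0, 0.6), ("F", 0, 0.1),
]
results = metrics_by_group(toy)
for sex, m in sorted(results.items()):
    print(sex, round(m["auc"], 3), round(m["f1"], 3))
```

In practice the probabilities would come from a classifier fitted on a training set and the metrics would be computed on a held-out test set, as the abstract describes; a pooled score can mask a gap that only appears once the test set is split by sex.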

Authors
