Analysis of the Relationship Between NF-κB1 and Cytokine Gene Expression in Hematological Malignancy: Leveraging Explained Artificial Intelligence and Machine Learning for Small Dataset Insights.
Journal:
International journal of medical sciences
PMID:
40303498
Abstract
This study measures expression of NF-κB1 and related cytokine genes in bone marrow mononuclear cells from patients with hematological malignancies and analyzes the relationship between them using an integrated framework of statistical analyses, machine learning (ML), and explainable artificial intelligence (XAI). While traditional dimensionality reduction techniques (principal component analysis, linear discriminant analysis, and t-distributed stochastic neighbor embedding) showed limited differentiation in the resulting embeddings, ML classifiers (k-Nearest Neighbors, Naïve Bayes Classifier, Random Forest, and XGBoost) successfully identified critical patterns. Notably, normalized caspase-1 counts consistently emerged as the most influential feature associated with NF-κB1 activity across disease groups, as highlighted by SHapley Additive exPlanations analyses. Systematic evaluation of ML performance on small datasets revealed that a minimum sample size of 15-24 is necessary for reliable classification outcomes, particularly in cohorts of acute myeloid leukemia and myelodysplastic syndrome. These findings underscore the pivotal role of caspase-1 in NF-κB1 gene expression in hematologic malignancies. Furthermore, this study demonstrates the feasibility of leveraging ML and XAI to derive meaningful insights from limited data, offering a robust strategy for biomarker discovery and precision medicine in rare hematological malignancies.
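The small-dataset evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the classifier is a plain k-nearest-neighbors vote scored by leave-one-out cross-validation, and all function names, feature counts, and sample sizes are hypothetical choices made for illustration only. The idea is to draw class-balanced subsamples of increasing size and watch how the average accuracy changes.

```python
import random
from collections import Counter


def make_synthetic(n_per_class, n_features=4, shift=2.0, seed=0):
    """Toy two-class data (assumption): only feature 0 separates the classes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[0] += shift * label  # informative feature
            X.append(row)
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training rows closest to x (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, k=3):
    """Leave-one-out accuracy: each sample is predicted from all the others."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, X[i], k) == y[i]
    return hits / len(X)


def accuracy_by_sample_size(X, y, sizes, repeats=30, k=3, seed=1):
    """Mean leave-one-out accuracy over repeated class-balanced subsamples."""
    rng = random.Random(seed)
    by_class = {c: [i for i, lab in enumerate(y) if lab == c] for c in set(y)}
    results = {}
    for n in sizes:
        scores = []
        for _ in range(repeats):
            idx = []
            for members in by_class.values():
                # equal number of samples from each class
                idx += rng.sample(members, n // len(by_class))
            scores.append(
                loo_accuracy([X[i] for i in idx], [y[i] for i in idx], k)
            )
        results[n] = sum(scores) / len(scores)
    return results


if __name__ == "__main__":
    X, y = make_synthetic(n_per_class=40)
    for n, acc in accuracy_by_sample_size(X, y, sizes=[6, 10, 16, 24, 40]).items():
        print(n, round(acc, 3))
```

On real expression data the same loop would be run per disease cohort and per classifier; the sample size at which accuracy stops improving materially is one informal way to read off a minimum usable cohort size.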