Enlightened prognosis: Hepatitis prediction with an explainable machine learning approach.

Journal: PloS one
PMID:

Abstract

Hepatitis is a widespread inflammatory condition of the liver and a formidable global health challenge. Accurate and timely detection of hepatitis is crucial for effective patient management, yet existing methods have limitations that underscore the need for innovative approaches. The recent adoption of machine learning and deep learning approaches has made early-stage detection of hepatitis possible. With this in mind, this study investigates traditional machine learning classifiers, including logistic regression, support vector machines (SVM), decision trees, random forest, and multilayer perceptron (MLP), among other models, for predicting hepatitis infection. After extensive data preprocessing, including outlier detection, dataset balancing, and feature engineering, we evaluated the performance of these models. We explored three modeling approaches: models with default hyperparameters, models with hyperparameters tuned using GridSearchCV, and ensemble modeling techniques. The SVM model performed best, achieving 99.25% accuracy and a perfect AUC of 1.00, with consistent results on the other metrics: 99.27% precision and 99.24% for both recall and F1-measure. The MLP and random forest models came close to the SVM, each reaching an accuracy of 99.00%. To ensure robustness, we employed 5-fold cross-validation. To gain deeper insight into model interpretability and validation, we applied an explainability analysis to our best-performing models to identify the most effective feature for hepatitis detection. Our proposed approach, particularly the SVM, achieves better prediction performance across different performance metrics than the existing literature.
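For readers unfamiliar with the metrics reported above, the following is a minimal sketch of how accuracy, precision, recall, and F1-measure are conventionally computed for a binary classifier. It is not the authors' code: the function name `binary_metrics` and the toy labels are hypothetical, and the abstract does not state how the study averaged its metrics across classes or folds.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for one positive class.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count each (true label, predicted label) pair once.
    counts = Counter(zip(y_true, y_pred))
    tp = sum(n for (t, p), n in counts.items() if t == positive and p == positive)
    fp = sum(n for (t, p), n in counts.items() if t != positive and p == positive)
    fn = sum(n for (t, p), n in counts.items() if t == positive and p != positive)
    tn = sum(n for (t, p), n in counts.items() if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels invented for illustration only (not the study's data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Under k-fold cross-validation, one common convention is to compute these values on each held-out fold and report their mean; whether the study did so is an assumption, not something the abstract confirms.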

Authors

  • Niloy Das
    Department of Information and Communication Engineering, Noakhali Science and Technology University, Noakhali, Chittagong, Bangladesh.
  • Md Bipul Hossain
    Department of Information and Communication Engineering, Noakhali Science and Technology University, Noakhali, Chittagong, Bangladesh.
  • Apurba Adhikary
    Department of Information and Communication Engineering, Noakhali Science and Technology University, Noakhali, Chittagong, Bangladesh.
  • Avi Deb Raha
    Computer Science and Engineering Discipline, Khulna University, Khulna, Khulna, Bangladesh.
  • Yu Qiao
    Department of English and American Studies, RWTH Aachen University, Aachen, North Rhine-Westphalia, Germany.
  • Md Mehedi Hassan
    School of Food and Biological Engineering, Jiangsu University, Zhenjiang 212013, PR China.
  • Anupam Kumar Bairagi
    Computer Science and Engineering Discipline, Khulna University, Khulna 9208, Bangladesh.