A practical approach for colorectal cancer diagnosis based on machine learning.

Journal: PloS one
PMID:

Abstract

In this paper, we present the results of applying machine learning models to build a colorectal cancer diagnosis system. The methodology comprises six key steps: collecting raw data from Electronic Medical Records (EMRs), revising feature attributes with expert input, preprocessing the data, adapting the models, training machine learning models (CART, Random Forest, and XGBoost), and evaluating the results. Furthermore, based on an analysis of experimental measurement parameter values, we extract 21 feature attributes that support early diagnosis of colorectal cancer. Among the models implemented in our study, XGBoost is the most suitable for this problem. The system helps clinicians select clinical tests and medical procedures for a colorectal cancer patient, so patients can save waiting time and medical examination costs. In addition, the results of this research can guide further applications of machine learning in medicine.
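
As an illustration of the tree-based models named above, the following minimal sketch shows the Gini-impurity split search on which CART is built. It is not the authors' implementation: the helper names `gini` and `best_split`, the two numeric attributes, and the six data rows are hypothetical and synthetic, and a real system would more likely rely on an established library than on hand-written code.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split.

    If no split improves on the unsplit node, feature_index and threshold
    are None and weighted_gini is the impurity of the whole node.
    """
    n = len(rows)
    best = (None, None, gini(labels))
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= t]
            right = [y for row, y in zip(rows, labels) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best


# Synthetic toy data, NOT taken from the study: each row holds two
# hypothetical numeric attributes; label 1 = positive, 0 = negative.
rows = [(1.0, 5.0), (2.0, 3.0), (3.0, 6.0), (6.0, 4.0), (7.0, 5.5), (8.0, 3.5)]
labels = [0, 0, 0, 1, 1, 1]

feature, threshold, impurity = best_split(rows, labels)
print(feature, threshold, impurity)
```

A full CART learner applies this search recursively to each resulting subset until a stopping rule is met. Random Forest and XGBoost combine many such trees, by bagging and by gradient boosting respectively.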

Authors

  • Nguyen Hai Minh
    Thai Nguyen University, Information and Communication Technology, Thai Nguyen, Vietnam.
  • Tran Quang Quy
    Thai Nguyen University, Information and Communication Technology, Thai Nguyen, Vietnam.
  • Ngo Duc Tam
    Artificial Intelligence Research Center, VNU Information Technology Institute, Vietnam National University, Hanoi, Vietnam.
  • Tran Manh Tuan
    University of Information and Communication Technology, Thai Nguyen, Vietnam.
  • Le Hoang Son
    VNU University of Science, Vietnam National University, Hanoi, Vietnam. sonlh@vnu.edu.vn.