Machine learning for predicting survival of colorectal cancer patients.

Journal: Scientific reports
Published Date:

Abstract

Colorectal cancer is one of the most common types of cancer in the world, with almost 2 million new cases annually. In Brazil, the scenario is similar: around 41 thousand new cases were estimated in the last 3 years. This increase in cases further heightens the interest in and importance of studies on the topic, especially those using new approaches. The use of machine learning algorithms in cancer studies has grown in recent years; besides making predictions from the data, they can provide important information to medicine. In this study, five different classifications were performed, considering patients' survival. Data were extracted from the Hospital Based Cancer Registries of São Paulo, coordinated by Fundação Oncocentro de São Paulo, and cover patients with colorectal cancer from São Paulo state, Brazil, treated between 2000 and 2021. The machine learning models provided predictions and the most important features for each of the algorithms studied. Using part of the dataset to validate the models, the predictors reached an accuracy of around 77%, with AUC close to 0.86, and clinical staging was the most important feature in all of them.

Authors

  • Lucas Buk Cardoso
    Núcleo de Sistemas Eletrônicos Embarcados, Instituto Mauá de Tecnologia, São Paulo, 09580-900, Brazil. lucas.cardoso@maua.br.
  • Vanderlei Cunha Parro
    Núcleo de Sistemas Eletrônicos Embarcados, Instituto Mauá de Tecnologia, São Paulo, 09580-900, Brazil.
  • Stela Verzinhasse Peres
    Information and Epidemiology, Fundação Oncocentro de São Paulo, São Paulo, 05409-012, Brazil.
  • Maria Paula Curado
    Epidemiology and Statistics on Cancer Group, A.C. Camargo Cancer Center, São Paulo, 01525-001, Brazil.
  • Gisele Aparecida Fernandes
    Epidemiology and Statistics on Cancer Group, A.C. Camargo Cancer Center, São Paulo, 01525-001, Brazil.
  • Victor Wünsch Filho
    Information and Epidemiology, Fundação Oncocentro de São Paulo, São Paulo, 05409-012, Brazil.
  • Tatiana Natasha Toporcov
    Epidemiology Department, Faculdade de Saúde Pública da Universidade de São Paulo, São Paulo, 01246-904, Brazil.