BRAX, Brazilian labeled chest x-ray dataset.

Journal: Scientific data
Published Date:

Abstract

Chest radiographs allow for the meticulous examination of a patient's chest but demand specialized training for proper interpretation. Automated analysis of medical imaging has become increasingly accessible with the advent of machine learning (ML) algorithms. Large labeled datasets are key elements for training and validation of these ML solutions. In this paper we describe the Brazilian labeled chest x-ray dataset, BRAX: an automatically labeled dataset designed to assist researchers in the validation of ML models. The dataset contains 24,959 chest radiography studies from patients presenting to a large general Brazilian hospital. A total of 40,967 images are available in the BRAX dataset. All images have been verified by trained radiologists and de-identified to protect patient privacy. Fourteen labels were derived from free-text radiology reports written in Brazilian Portuguese using natural language processing.

Authors

  • Eduardo P Reis
    Hospital Israelita Albert Einstein - Big Data Analytics, São Paulo, Brazil. eduardo.reis@einstein.br.
  • Joselisa P Q de Paiva
    Hospital Israelita Albert Einstein - Imaging Department, São Paulo, Brazil.
  • Maria C B da Silva
    Hospital Israelita Albert Einstein - Imaging Department, São Paulo, Brazil.
  • Guilherme A S Ribeiro
    Hospital Israelita Albert Einstein - Imaging Department, São Paulo, Brazil.
  • Victor F Paiva
    Hospital Israelita Albert Einstein - Big Data Analytics, São Paulo, Brazil.
  • Lucas Bulgarelli
    MIT Critical Data, Laboratory for Computational Physiology, Harvard-MIT Health Sciences & Technology, Massachusetts Institute of Technology, Cambridge, USA; Big Data Analytics Department, Hospital Israelita Albert Einstein, São Paulo, Brazil. Electronic address: lucas1@mit.edu.
  • Henrique M H Lee
    Hospital Israelita Albert Einstein - Imaging Department, São Paulo, Brazil.
  • Paulo V Santos
    Hospital Israelita Albert Einstein - Imaging Department, São Paulo, Brazil.
  • Vanessa M Brito
    Hospital Israelita Albert Einstein - Imaging Department, São Paulo, Brazil.
  • Lucas T W Amaral
    Hospital Israelita Albert Einstein - Imaging Department, São Paulo, Brazil.
  • Gabriel L Beraldo
    Hospital Israelita Albert Einstein - Imaging Department, São Paulo, Brazil.
  • Jorge N Haidar Filho
    Hospital Israelita Albert Einstein - Big Data Analytics, São Paulo, Brazil.
  • Gustavo B S Teles
    Hospital Israelita Albert Einstein - Imaging Department, São Paulo, Brazil.
  • Gilberto Szarf
    Hospital Israelita Albert Einstein - Imaging Department, São Paulo, Brazil.
  • Tom Pollard
    MIT Critical Data, Laboratory for Computational Physiology, Harvard-MIT Health Sciences & Technology, MIT, Cambridge, Massachusetts, United States.
  • Alistair E W Johnson
  • Leo A Celi
    Beth Israel Deaconess Medical Center, Pulmonary Division and Harvard Medical School, Boston, MA 02215, USA.
  • Edson Amaro
    Department of Radiology, School of Medicine, University of Sao Paulo, São Paulo, Brazil.