Application of the performance of machine learning techniques as support in the prediction of school dropout.

Journal: Scientific reports
PMID:

Abstract

This article presents a study aimed at designing a model with 90% reliability to support the prediction of school dropout in higher and secondary education institutions using machine learning techniques. The data were collected from open data sources: the 2015 Intercensal Survey and the 2010 and 2020 Population and Housing censuses carried out by the National Institute of Statistics and Geography, which contain information about the inhabitants and homes in the 32 federal entities of Mexico. The data were harmonized and twenty variables were selected based on their correlation. After data cleaning, the sample comprised 1,080,782 records in total. Supervised learning was used to create the model, with data processing automated through training and testing. The following techniques were applied: Artificial Neural Networks, Support Vector Machines, Linear, Ridge, and Lasso Regression, Bayesian Optimization, and Random Forest. The first two achieved a reliability greater than 99% and the last achieved 91%.
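The correlation-based variable selection and the train/test split mentioned in the abstract could be sketched roughly as follows. This is only an illustrative sketch and not the authors' pipeline: the column names, the toy data, the use of Pearson correlation, the 80/20 split, and the helper functions `pearson`, `select_by_correlation`, and `train_test_split` are all assumptions introduced here.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_by_correlation(columns, target, k):
    """Return the k column names with the largest |correlation| to target."""
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], target)),
        reverse=True,
    )
    return ranked[:k]


def train_test_split(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row indices reproducibly and split them into train and test."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = int(n_rows * (1 - test_fraction))
    return indices[:cut], indices[cut:]


# Hypothetical toy data (NOT from the study): a binary dropout label and
# four candidate variables with different amounts of signal.
rng = random.Random(42)
n = 200
dropout = [rng.randint(0, 1) for _ in range(n)]
columns = {
    "signal": [d + rng.gauss(0, 0.1) for d in dropout],
    "weak": [d + rng.gauss(0, 2.0) for d in dropout],
    "noise": [rng.gauss(0, 1) for _ in range(n)],
    "constant": [1.0] * n,
}

selected = select_by_correlation(columns, dropout, k=2)
train_idx, test_idx = train_test_split(n)
print("selected variables:", selected)
print("train rows:", len(train_idx), "test rows:", len(test_idx))
```

In a setting like the one described, the selected variables and the training indices would then feed a supervised learner, with the held-out indices reserved for estimating its reliability.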

Authors

  • Auria Lucia Jiménez-Gutiérrez
    Centro Universitario de los Lagos, Universidad de Guadalajara, Enrique Díaz de León 1144, Paseos de la Montaña, 47460, Lagos de Moreno, Jalisco, Mexico. auria.jimenez@academicos.udg.mx.
  • Cinthya Ivonne Mota-Hernández
    Universidad del Valle de México, Calz. de Tlalpan 3016 y 3058 Ex Hacienda Coapa, 04910, Alcaldía Coyoacán, CDMX, Mexico.
  • Efrén Mezura-Montes
    Centro de Investigación en Inteligencia Artificial, Universidad Veracruzana, Sebastián Camacho 5, Centro, 91000 Xalapa, VER, Mexico.
  • Rafael Alvarado-Corona
    Centro de Estudios Tecnológicos Industrial y de Servicios N°06 "Ignacio Manuel Altamirano", Cuitláhuac No. 50 Esq, Av. Tlahuac, Los Reyes Culhuacan, 09840, Iztapalapa, Ciudad de México, Mexico.