Comparative analysis of automated foul detection in football using deep learning architectures.

Journal: Scientific Reports
PMID:

Abstract

Automated foul detection in football is challenging because of the dynamic nature of the game, the variability in player movements, and the ambiguity in distinguishing fouls from regular physical contact. This study presents a comparative evaluation of eight state-of-the-art Deep Learning (DL) architectures (EfficientNetV2, ResNet50, VGG16, Xception, InceptionV3, MobileNetV2, InceptionResNetV2, and DenseNet121) applied to this task. The models were trained and evaluated on a curated dataset of 7,000 images, split into 70% for training (4,900 images), 20% for validation (1,400 images), and 10% for testing (700 images). To ensure fair evaluation, the test set was balanced to contain 350 images depicting foul events and 350 images representing non-foul scenarios, although perfect balance was subject to class distribution constraints. Performance was assessed using test accuracy, precision, recall, F1-score, and Area Under the Receiver Operating Characteristic Curve (AUC). InceptionResNetV2 achieved the highest test accuracy of 87.57% and a strong F1-score of 0.8966, closely followed by DenseNet121, which attained the highest precision of 0.9786 and an AUC of 0.9641, indicating superior discriminatory power. Lightweight models such as MobileNetV2 also performed competitively, which suggests potential for real-time deployment. The findings reveal the trade-offs among model complexity, accuracy, and generalizability, and support the viability of integrating DL architectures into existing football officiating systems, such as the Video Assistant Referee (VAR). The study also emphasizes the importance of model explainability through techniques such as Gradient-weighted Class Activation Mapping++ (GradCAM++), so that automated decisions can be accompanied by interpretable visual evidence. This comparative evaluation serves as a foundation for future research aimed at enhancing real-time foul detection through multimodal data fusion, temporal modeling, and improved domain adaptation techniques.
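
The split sizes and the threshold-based metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `split_counts` and `binary_metrics` are hypothetical, and the confusion counts passed in at the end are invented purely to show the arithmetic on a balanced 700-image test set; they are not results from the study. AUC is omitted because it requires per-image scores rather than counts.

```python
def split_counts(total, percents):
    """Return the number of images per subset, using integer arithmetic."""
    return [total * p // 100 for p in percents]


def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion counts.

    tp/fp/fn/tn treat "foul" as the positive class. Assumes at least one
    predicted positive and one actual positive, so no division by zero.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# 70/20/10 split of the 7,000-image dataset described in the abstract.
train, val, test = split_counts(7000, (70, 20, 10))

# Hypothetical confusion counts (NOT from the paper) for a test set with
# 350 foul and 350 non-foul images.
example = binary_metrics(tp=300, fp=20, fn=50, tn=330)

print(train, val, test)
print({name: round(value, 4) for name, value in example.items()})
```

With a balanced test set, accuracy and F1 can diverge noticeably when precision and recall differ, which is one reason the abstract reports several metrics rather than accuracy alone.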

Authors

  • Abdallah Rabee
    Sports Health Sciences Department, Faculty of Physical Education, South Valley University, Qena, 1464091, Egypt.
  • Zakaria Anwar
    Sports Training and Kinesiology Sciences Department, Faculty of Physical Education, Suez University, Suez, 43512, Egypt.
  • Ahmed AbdelMoety
    Department of Electrical Engineering, Faculty of Engineering, South Valley University, Qena, Egypt.
  • Ahmed Abdelsallam
    Sports Health Sciences Department, Faculty of Physical Education, South Valley University, Qena, 1464091, Egypt.
  • Mahmoud Ali
    Department of Pathology, University Hospitals Coventry and Warwickshire NHS Trust, Coventry, UK.