CrimeNet: Neural Structured Learning using Vision Transformer for violence detection.

Journal: Neural networks : the official journal of the International Neural Network Society
PMID:

Abstract

The state of the art in violence detection in videos has improved in recent years thanks to deep learning models, but it still remains below 90% average precision on the most complex datasets, which can lead to frequent false alarms in video-surveillance environments and may cause security guards to disable the artificial intelligence system. In this study, we propose a new neural network based on Vision Transformer (ViT) and Neural Structured Learning (NSL) with adversarial training. This network, called CrimeNet, outperforms previous works by a large margin and reduces false positives to practically zero. Our tests on the four most challenging violence-related datasets (binary and multi-class) show the effectiveness of CrimeNet, improving the state of the art by 9.4 to 22.17 percentage points in ROC AUC, depending on the dataset. In addition, we present a generalisation study in which our model is trained and tested on different datasets. The results show that CrimeNet improves over competing methods with a gain of between 12.39 and 25.22 percentage points, demonstrating remarkable robustness.
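
To illustrate the kind of adversarial training the abstract refers to, the sketch below wraps a classifier with the TensorFlow Neural Structured Learning adversarial regulariser. It is a minimal sketch under stated assumptions, not the authors' implementation: the ViT backbone is stood in for by a small placeholder Keras model (build_backbone is hypothetical), inputs are assumed to be pre-extracted feature vectors, and all hyperparameters are illustrative only.

# Minimal sketch: NSL adversarial regularisation around a stand-in backbone.
# Assumptions (not from the paper): placeholder classifier instead of a ViT,
# pre-extracted feature vectors as input, illustrative hyperparameters.
import numpy as np
import tensorflow as tf
import neural_structured_learning as nsl

NUM_CLASSES = 2      # binary violence / no-violence setting
FEATURE_DIM = 768    # e.g. the size of a ViT embedding

def build_backbone() -> tf.keras.Model:
    """Placeholder classifier standing in for the ViT branch."""
    inputs = tf.keras.Input(shape=(FEATURE_DIM,), name="feature")
    x = tf.keras.layers.Dense(256, activation="relu")(inputs)
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)

# Adversarial-neighbour configuration: bounded input perturbations whose loss
# is added to the supervised loss with the given multiplier.
adv_config = nsl.configs.make_adv_reg_config(
    multiplier=0.2,
    adv_step_size=0.05,
    adv_grad_norm="infinity",
)

base_model = build_backbone()
adv_model = nsl.keras.AdversarialRegularization(
    base_model, label_keys=["label"], adv_config=adv_config
)
adv_model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Dummy data just to show the dict-style input expected by the NSL wrapper.
x_train = np.random.rand(64, FEATURE_DIM).astype("float32")
y_train = np.random.randint(0, NUM_CLASSES, size=(64,))
adv_model.fit({"feature": x_train, "label": y_train}, batch_size=16, epochs=1)

The wrapper generates adversarial neighbours of each batch on the fly and penalises prediction changes on them, which is one way to obtain the robustness to perturbations that the abstract credits for the reduction in false positives.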

Authors

  • Fernando J Rendón-Segador
    Dpto. de Lenguajes y Sistemas Informáticos, Universidad de Sevilla, Spain. Electronic address: frendon@us.es.
  • Juan A Álvarez-García
    Dpto. de Lenguajes y Sistemas Informáticos, Universidad de Sevilla, 41012, Sevilla, Spain. Electronic address: jaalvarez@us.es.
  • Jose L Salazar-González
    Dpto. de Lenguajes y Sistemas Informáticos, Universidad de Sevilla, Spain. Electronic address: jsalazar@us.es.
  • Tatiana Tommasi
    Politecnico di Torino & Italian Institute of Technology, Italy. Electronic address: tatiana.tommasi@polito.it.