An Explainable CNN and Vision Transformer-Based Approach for Real-Time Food Recognition.

Journal: Nutrients

PMID: 39861492

Abstract

BACKGROUND: Food image recognition, a crucial step in computational gastronomy, has diverse applications across nutritional platforms. Convolutional neural networks (CNNs) are widely used for this task because they capture hierarchical features. However, they struggle with long-range dependencies and global feature extraction, both of which are vital for distinguishing visually similar foods and for images in which the context of the whole dish matters. This limitation motivates the use of transformer architectures.

Authors

Kintoh Allen Nfor
Department of Computer Engineering, Inje University, Gimhae 50834, Republic of Korea.

Tagne Poupi Theodore Armand
Institute of Digital Anti-Aging Healthcare, Inje University, Gimhae 50834, Republic of Korea.

Kenesbaeva Periyzat Ismaylovna
Department of Computer Engineering, Inje University, Gimhae 50834, Republic of Korea.

Moon-Il Joo
Institute of Digital Anti-Aging Healthcare, Inje University, Gimhae, Republic of Korea.

Hee-Cheol Kim
Department of Computer Engineering/Institute of Digital Anti-Aging Healthcare, Inje University, Gimhae 50834, Korea. heeki@inje.ac.kr.

Keywords

Food; Humans; Image Processing, Computer-Assisted; Neural Networks, Computer
