Agrast-6: Abridged VGG-Based Reflected Lightweight Architecture for Binary Segmentation of Depth Images Captured by Kinect.

Journal: Sensors (Basel, Switzerland)
Published Date:

Abstract

Binary object segmentation is a sub-area of semantic segmentation with a variety of applications. Semantic segmentation models can be applied to binary segmentation problems by introducing only two classes, but such models are more complex than the task requires. This leads to very long training times, since convolutional neural networks (CNNs) of this category usually have tens of millions of parameters to learn. This article introduces a novel abridged, reflected architecture inspired by VGG-16 and SegNet and adapted for binary segmentation tasks. The architecture has 27 times fewer parameters than SegNet but yields 86% segmentation cross-intersection accuracy and 93% binary accuracy. The proposed architecture is evaluated on a large dataset of depth images collected using the Kinect device, achieving an accuracy of 99.25% in human body shape segmentation and 87% in gender recognition tasks.
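
As a rough illustration of the two kinds of score quoted above, the sketch below computes a per-pixel binary accuracy and an intersection-over-union (IoU) score for a pair of binary masks. It is not the authors' evaluation code: treating "cross-intersection accuracy" as an IoU-style measure is an assumption, and the function names and the toy 3x3 masks are hypothetical.

```python
def binary_accuracy(pred, target):
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    total = 0
    correct = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += 1
            correct += int(p == t)
    return correct / total if total else 0.0


def intersection_over_union(pred, target):
    """Foreground overlap divided by foreground union (assumed stand-in
    for the abstract's "cross-intersection accuracy")."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += int(p == 1 and t == 1)
            union += int(p == 1 or t == 1)
    # Two empty masks agree perfectly, so report 1.0 in that case.
    return intersection / union if union else 1.0


# Hypothetical toy masks: 1 marks foreground (e.g. a human body), 0 background.
predicted_mask = [
    [0, 1, 1],
    [0, 1, 0],
    [0, 0, 0],
]
ground_truth_mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]

accuracy = binary_accuracy(predicted_mask, ground_truth_mask)
iou = intersection_over_union(predicted_mask, ground_truth_mask)
print(f"binary accuracy: {accuracy:.3f}, IoU: {iou:.3f}")
```

In this toy case one foreground pixel is missed, so 8 of 9 pixels match while the IoU is 3/4. Because background pixels usually dominate a depth frame, an overlap-based score tends to come out lower than per-pixel accuracy, which may explain the gap between the two figures in the abstract.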

Authors

  • Karolis Ryselis
    Faculty of Informatics, Kaunas University of Technology, 44249 Kaunas, Lithuania.
  • Tomas Blažauskas
    Faculty of Informatics, Kaunas University of Technology, 44249 Kaunas, Lithuania.
  • Robertas Damaševičius
    Faculty of Applied Mathematics, Silesian University of Technology, Gliwice, Poland.
  • Rytis Maskeliūnas
    Department of Multimedia Engineering, Kaunas University of Technology, Kaunas, Lithuania.