PFENet: Towards precise feature extraction from sparse point cloud for 3D object detection.
Journal: Neural Networks: the official journal of the International Neural Network Society
PMID: 39827838
Abstract
Accurate 3D object detection from point clouds is crucial for autonomous driving. Point clouds of 3D scenes are sparse, and smaller targets such as pedestrians and bicycles contain especially few points, which makes them particularly hard to detect. To address this problem, we propose PFENet, a single-stage voxel-based 3D object detection method. First, we design a robust voxel feature encoding network that incorporates a stacked triple attention mechanism to enhance the extraction of key features and suppress noise. Second, a 3D sparse convolution layer dynamically adjusts feature processing according to the importance of each output location, improving small object recognition. Third, the attentional feature fusion module in the region proposal network merges low-level spatial features with high-level semantic features and broadens the receptive field through atrous spatial pyramid pooling to capture multi-scale features. Finally, we develop multiple detection heads for more refined feature extraction and object classification, as well as more accurate bounding box regression. Experimental results on the KITTI dataset demonstrate the effectiveness of the proposed method.
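
The voxel-based setting the abstract refers to can be illustrated with a minimal sketch of voxelization, the step that turns a raw point cloud into a sparse grid before any feature encoding. This is not PFENet's encoder: the voxel size, the function names `voxelize` and `mean_encode`, and the toy points are all assumptions made for illustration, and the per-voxel centroid is a simple stand-in for the attention-based voxel feature encoding described above.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size=(0.2, 0.2, 0.4)):
    """Group (x, y, z) points by the integer index of the voxel they fall in.

    voxel_size is a hypothetical choice, not a value taken from the paper.
    Only occupied voxels are stored, so the result is naturally sparse.
    """
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size[0]),
            math.floor(y / voxel_size[1]),
            math.floor(z / voxel_size[2]),
        )
        voxels[key].append((x, y, z))
    return dict(voxels)


def mean_encode(voxels):
    """Stand-in encoder: represent each occupied voxel by its point centroid."""
    return {
        key: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for key, pts in voxels.items()
    }


# Toy cloud: two points share one voxel, a third lies alone in another.
points = [(0.05, 0.05, 0.1), (0.15, 0.1, 0.3), (1.1, 1.1, 0.1)]
voxels = voxelize(points)
features = mean_encode(voxels)

for key in sorted(voxels):
    print(key, len(voxels[key]), features[key])
```

A small object far from the sensor ends up occupying only a handful of voxels with one or two points each, which is the sparsity problem the abstract describes; a plain centroid discards most of what little structure those points carry, which is one motivation for a richer encoder.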