Faster than Fast: Accelerating Oriented FAST Feature Detection on Low-end Embedded GPUs
Journal: arXiv
Published Date: Jun 8, 2025
Abstract
Visual SLAM (Simultaneous Localization and Mapping) is a technology widely
used in applications such as robotic navigation and virtual reality. It
detects feature points in camera images to build a map of an unknown
environment while simultaneously determining the device's own location
within it. It usually imposes stringent requirements on hardware power
consumption, processing speed and accuracy. Currently, SLAM systems based on
ORB (Oriented FAST and Rotated BRIEF) features exhibit superior performance in
terms of processing speed and robustness. However, they still fall short of
meeting the demands of real-time processing on mobile platforms. This
limitation arises primarily because the Oriented FAST computation is
time-consuming, accounting for approximately half of the runtime of the
entire SLAM system. This paper
presents two methods to accelerate the Oriented FAST feature detection on
low-end embedded GPUs. These methods optimize the two most time-consuming
steps in Oriented FAST feature detection, namely FAST feature point detection
and Harris corner detection. The first is a binary-level encoding strategy
that determines candidate points quickly; the second is a separable Harris
detection strategy that uses efficient low-level, hardware-specific GPU
instructions. Extensive
experiments on a Jetson TX2 embedded GPU demonstrate an average speedup of over
7.3 times compared to the widely used OpenCV implementation with GPU support.
This significant improvement highlights the effectiveness of the proposed
methods and their potential for real-time applications in mobile and
resource-constrained environments.
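
The two ideas named in the abstract can be illustrated with a small CPU reference sketch. It is not the paper's GPU implementation, and the abstract does not specify the actual encoding or separation scheme, so everything below is an assumption: the 16 circle pixels of the standard FAST segment test are packed into two 16-bit masks and a run of 9 contiguous bits is found with shifts and ANDs, and the Harris window sums are computed as a row pass followed by a column pass. The function names, the threshold `t=20`, the run length `n=9`, the central-difference gradients, the box window, and `k=0.04` are illustrative choices, not values taken from the paper.

```python
# Illustrative CPU reference in pure Python; not the paper's GPU kernels.

# Offsets (dx, dy) of the 16 pixels on the radius-3 circle used by FAST,
# in circular order so that adjacent bits correspond to adjacent pixels.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def has_contiguous_run(mask, n=9):
    """Return True if the circular 16-bit mask has n consecutive set bits."""
    wide = mask | (mask << 16)   # duplicate so wrap-around runs become linear
    for _ in range(n - 1):
        wide &= wide >> 1        # after j steps, bit i means bits i..i+j are set
    return wide != 0


def is_fast_candidate(img, x, y, t=20, n=9):
    """Segment test on bit masks (assumed encoding; t and n are assumptions)."""
    p = img[y][x]
    brighter = darker = 0
    for i, (dx, dy) in enumerate(CIRCLE):
        q = img[y + dy][x + dx]
        if q > p + t:
            brighter |= 1 << i
        elif q < p - t:
            darker |= 1 << i
    return has_contiguous_run(brighter, n) or has_contiguous_run(darker, n)


def box_sum_separable(a, r):
    """(2r+1)x(2r+1) window sums via a row pass, then a column pass."""
    h, w = len(a), len(a[0])
    rows = [[sum(row[x - r:x + r + 1]) for x in range(r, w - r)] for row in a]
    return [[sum(rows[y + d][x] for d in range(-r, r + 1))
             for x in range(w - 2 * r)]
            for y in range(r, h - r)]


def harris_response(img, r=1, k=0.04):
    """Harris score det(M) - k * trace(M)^2 on the valid interior region."""
    h, w = len(img), len(img[0])
    # Central-difference gradients (an assumption; Sobel is also common).
    ix = [[(img[y][x + 1] - img[y][x - 1]) / 2.0 for x in range(1, w - 1)]
          for y in range(1, h - 1)]
    iy = [[(img[y + 1][x] - img[y - 1][x]) / 2.0 for x in range(1, w - 1)]
          for y in range(1, h - 1)]

    def prod(a, b):
        return [[u * v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]

    sxx = box_sum_separable(prod(ix, ix), r)
    syy = box_sum_separable(prod(iy, iy), r)
    sxy = box_sum_separable(prod(ix, iy), r)
    return [[sxx[j][i] * syy[j][i] - sxy[j][i] ** 2
             - k * (sxx[j][i] + syy[j][i]) ** 2
             for i in range(len(sxx[0]))]
            for j in range(len(sxx))]
```

In this sketch the mask test replaces a per-pixel loop over all 16 possible arc starts with a fixed number of bitwise operations, and the separable window sum replaces a (2r+1)^2 accumulation per pixel with two passes of 2r+1 terms each. Both properties are what make such formulations attractive on a GPU, though how the paper maps them to hardware instructions is not stated in the abstract.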