eNCApsulate: NCA for Precision Diagnosis on Capsule Endoscopes
Journal: arXiv
Published Date: Apr 30, 2025
Abstract
Wireless Capsule Endoscopy is a non-invasive imaging method for the entire
gastrointestinal tract, and is a pain-free alternative to traditional
endoscopy. It generates extensive video data that requires significant review
time, and localizing the capsule after ingestion is a challenge. Techniques
like bleeding detection and depth estimation can help with localization of
pathologies, but deep learning models are typically too large to run directly
on the capsule. We train Neural Cellular Automata (NCA) for bleeding
segmentation and depth estimation on capsule endoscopic images. For monocular
depth estimation, we distill a large foundation model into the lean NCA architecture
by treating the outputs of the foundation model as pseudo ground truth. We then
port the trained NCA to the ESP32 microcontroller, enabling efficient image
processing on hardware as small as a camera capsule. NCA achieve higher Dice
scores than other portable segmentation models, while requiring more than 100x
fewer parameters stored in memory than other small-scale models. The visual
results of NCA depth estimation look convincing, and in some cases beat the
realism and detail of the pseudo ground truth. Runtime optimizations on the
ESP32-S3 accelerate average inference by more than a factor of 3. With
several algorithmic adjustments and distillation, it is possible
to eNCApsulate NCA models into microcontrollers that fit into wireless capsule
endoscopes. This is the first work that enables reliable bleeding segmentation
and depth estimation on a miniaturized device, paving the way for precise
diagnosis combined with visual odometry as a means of precise localization of
the capsule -- on the capsule.
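
The abstract names two quantities that are easy to make concrete: the Dice score used to compare segmentation models, and the idea of training against a foundation model's output as pseudo ground truth. The following is a minimal sketch, not the authors' implementation. The function names `dice_score` and `pseudo_label_l1` are hypothetical, masks and depth maps are assumed to be flattened into plain Python sequences, and the L1 distance is an assumed stand-in for whatever distillation loss the paper actually uses.

```python
def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two flat binary masks (sequences of 0/1).

    Dice = 2 * |pred AND target| / (|pred| + |target|).
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # eps avoids division by zero; two empty masks then score 1.0.
    return (2.0 * intersection + eps) / (total + eps)


def pseudo_label_l1(student_depth, teacher_depth):
    """Mean absolute error of a student depth map against a teacher depth map.

    The teacher output plays the role of ground truth (the pseudo label).
    L1 is an assumption made for this sketch, not taken from the paper.
    """
    if len(student_depth) != len(teacher_depth):
        raise ValueError("depth maps must have the same number of pixels")
    diffs = (abs(s - t) for s, t in zip(student_depth, teacher_depth))
    return sum(diffs) / len(student_depth)


# Toy example: one of the two predicted bleeding pixels is correct.
pred_mask = [1, 1, 0, 0]
true_mask = [1, 0, 0, 0]
print(round(dice_score(pred_mask, true_mask), 3))  # 0.667
```

In a training loop, `pseudo_label_l1` would be evaluated on the small model's prediction and the foundation model's prediction for the same frame, so no measured depth is needed.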