MANTIS: A Mixed-Signal Near-Sensor Convolutional Imager SoC Using Charge-Domain 4b-Weighted 5-to-84-TOPS/W MAC Operations for Feature Extraction and Region-of-Interest Detection
Journal: arXiv
Published Date: Nov 12, 2024
Abstract
Recent advances in artificial intelligence have prompted the search for
enhanced algorithms and hardware to support the deployment of machine learning
at the edge. More specifically, in the context of the Internet of Things (IoT),
vision chips must be able to fulfill tasks of low to medium complexity, such as
feature extraction or region-of-interest (RoI) detection, with a sub-mW power
budget imposed by the use of small batteries or energy harvesting. Mixed-signal
vision chips relying on in- or near-sensor processing have emerged as
promising candidates, thanks to their favorable tradeoff between energy
efficiency (EE) and computational accuracy compared to digital systems for
these specific tasks. In this paper, we introduce a mixed-signal convolutional
imager system-on-chip (SoC) codenamed MANTIS, featuring a unique combination of
large 16$\times$16 4b-weighted filters, operation at multiple scales, and
double sampling, well suited to the requirements of medium-complexity tasks.
The main contributions are (i) circuits called DS3 units combining delta-reset
sampling, image downsampling, and voltage downshifting, and (ii) charge-domain
multiply-and-accumulate (MAC) operations based on switched-capacitor amplifiers
and charge sharing in the capacitive DAC of the successive-approximation ADCs.
MANTIS achieves peak EEs normalized to 1b operations of 4.6 and 84.1 TOPS/W at
the accelerator and SoC levels, while computing feature maps with a root mean
square error ranging from 3 to 11.3$\%$. It also demonstrates face RoI
detection with a false negative rate of 11.5$\%$, while discarding 81.3$\%$ of
image patches and reducing the data transmitted off chip by 13$\times$ compared
to the raw image.
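
The processing chain named in the abstract (delta-reset sampling, image downsampling, and multiply-and-accumulate with 4b-weighted filters) can be illustrated with a purely functional software model. The sketch below is not the MANTIS circuit: it ignores the charge-domain implementation, voltage downshifting, and ADC behavior. The function names, the block-averaging downsampler, the signed weight range of [-8, 7], and the stride are all assumptions made for illustration.

```python
def delta_reset_sample(reset, signal):
    """Subtract the signal level from the reset level, pixel by pixel.

    Any offset common to both samples of a pixel cancels in the difference.
    """
    return [[r - s for r, s in zip(r_row, s_row)]
            for r_row, s_row in zip(reset, signal)]


def downsample(img, factor):
    """Average non-overlapping factor x factor blocks (assumed scheme)."""
    rows, cols = len(img) // factor, len(img[0]) // factor
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            block = [img[i * factor + di][j * factor + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def quantize_4b(weights):
    """Scale real weights to signed 4b integers in [-8, 7] (assumed encoding)."""
    peak = max(abs(w) for row in weights for w in row) or 1.0
    return [[max(-8, min(7, round(7 * w / peak))) for w in row]
            for row in weights]


def conv_mac(img, wq, stride):
    """Slide a square filter over the image and accumulate weight * pixel."""
    k = len(wq)
    out = []
    for i in range(0, len(img) - k + 1, stride):
        row = []
        for j in range(0, len(img[0]) - k + 1, stride):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    acc += wq[di][dj] * img[i + di][j + dj]
            row.append(acc)
        out.append(row)
    return out


# Usage with hypothetical data: a 64x64 frame, 2x downsampling, 16x16 filter.
size, k = 64, 16
reset = [[100 + (x % 3) for x in range(size)] for _ in range(size)]
signal = [[reset[y][x] - ((x + y) % 7) for x in range(size)] for y in range(size)]
frame = downsample(delta_reset_sample(reset, signal), 2)
edge_filter = [[1.0 if dj < k // 2 else -1.0 for dj in range(k)] for _ in range(k)]
feature_map = conv_mac(frame, quantize_4b(edge_filter), stride=4)
print(len(feature_map), "x", len(feature_map[0]))
```

With these hypothetical settings, the 64x64 frame becomes 32x32 after downsampling, and a 16x16 filter with a stride of 4 produces a 5x5 feature map. Comparing such an ideal floating-point output against measured feature maps is one way a root mean square error figure could be obtained, although the abstract does not specify the reference used.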