Automating Versatile Time-Series Analysis with Tiny Transformers on Embedded FPGAs
Journal:
arXiv
Published Date:
May 23, 2025
Abstract
Transformer-based models have shown strong performance across diverse
time-series tasks, but their deployment on resource-constrained devices remains
challenging due to high memory and computational demand. While prior work
targeting Microcontroller Units (MCUs) has explored hardware-specific
optimizations, such approaches are often task-specific and limited to 8-bit
fixed-point precision. Field-Programmable Gate Arrays (FPGAs) offer greater
flexibility, enabling fine-grained control over data precision and
architecture. However, existing FPGA-based deployments of Transformers for
time-series analysis typically focus on high-density platforms with manual
configuration. This paper presents a unified and fully automated deployment
framework for Tiny Transformers on embedded FPGAs. Our framework supports a
compact encoder-only Transformer architecture across three representative
time-series tasks (forecasting, classification, and anomaly detection). It
combines quantization-aware training (down to 4 bits), hardware-aware
hyperparameter search using Optuna, and automatic VHDL generation for seamless
deployment. We evaluate our framework on six public datasets across two
embedded FPGA platforms. Results show that our framework produces integer-only,
task-specific Transformer accelerators achieving as low as 0.033 mJ per
inference with millisecond latency on AMD Spartan-7, while also providing
insights into deployment feasibility on Lattice iCE40. All source code will be
released in the GitHub repository
(https://github.com/Edwina1030/TinyTransformer4TS).