Semi-Supervised Echocardiography Video Segmentation via Adaptive Spatio-Temporal Tensor Semantic Awareness and Memory Flow.
Journal: IEEE Transactions on Medical Imaging
PMID: 40031067
Abstract
Accurate segmentation of cardiac structures in echocardiography videos is vital for diagnosing heart disease. However, speckle noise, low spatial resolution, and incomplete video annotations hinder both the accuracy and the efficiency of segmentation. Existing video-based segmentation methods mainly rely on optical flow estimation and cross-frame attention to establish pixel-level correlations between frames; these approaches are usually sensitive to noise and computationally expensive. In this paper, we present an echocardiography video segmentation framework that exploits the inherent spatio-temporal correlation of echocardiography video feature tensors. Specifically, we perform adaptive tensor singular value decomposition (t-SVD) on the video semantic feature tensor within a learnable 3D transform domain. Using learnable thresholds, we preserve the principal singular values to reduce redundancy in the high-dimensional spatio-temporal feature tensor and enforce its potential low-rank property. This process captures the temporal evolution of the target tissue from a limited number of labeled frames, thus overcoming the constraints of sparse annotations. Furthermore, we introduce a memory flow method that propagates relevant information between adjacent frames based on multi-scale affinities to precisely resolve frame-to-frame variations of dynamic tissues, thereby improving the accuracy and continuity of the segmentation results. Extensive experiments on both public and private datasets show that the proposed method outperforms state-of-the-art methods in echocardiography video segmentation.
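The thresholded t-SVD step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a fixed FFT along the temporal mode in place of the learnable 3D transform, and a single hand-set threshold `tau` in place of the learnable thresholds. The function name `tsvd_soft_threshold` and the (channels, pixels, frames) tensor layout are hypothetical choices made for illustration only.

```python
import numpy as np


def tsvd_soft_threshold(feat, tau):
    """Low-rank approximation of a 3-way tensor via thresholded t-SVD.

    feat: real array of shape (n1, n2, n3), e.g. (channels, pixels, frames).
    tau:  non-negative threshold on the singular values (hand-set here;
          an assumption standing in for the paper's learnable thresholds).
    """
    n3 = feat.shape[2]
    # Move to the transform domain. A fixed FFT along the last (temporal)
    # mode is assumed here instead of a learnable transform.
    feat_f = np.fft.fft(feat, axis=2)
    out_f = np.empty_like(feat_f)
    for k in range(n3):
        # Matrix SVD of each frontal slice in the transform domain.
        u, s, vh = np.linalg.svd(feat_f[:, :, k], full_matrices=False)
        # Soft-threshold: small singular values are zeroed, large ones shrunk.
        s = np.maximum(s - tau, 0.0)
        out_f[:, :, k] = (u * s) @ vh
    # Back to the original domain. The imaginary part is rounding error
    # because the input is real and each slice is processed consistently.
    return np.real(np.fft.ifft(out_f, axis=2))


# Usage: a random stand-in for a video feature tensor.
rng = np.random.default_rng(0)
features = rng.standard_normal((6, 5, 4))
denoised = tsvd_soft_threshold(features, tau=1.0)
```

With `tau = 0` the function returns the input unchanged, and a larger `tau` removes more of the weak singular components, which is the redundancy reduction the abstract refers to. In a trainable model the transform and the thresholds would be parameters optimized with the segmentation loss rather than fixed values.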