Time-Lapse Video-Based Embryo Grading via Complementary Spatial-Temporal Pattern Mining
Journal:
arXiv
Published Date:
Jun 5, 2025
Abstract
Artificial intelligence has recently shown promise in automated embryo
selection for in vitro fertilization (IVF). However, current approaches either
assess only part of the embryo without a holistic quality evaluation, or target
clinical outcomes that are inevitably confounded by extra-embryonic factors; both
limit clinical utility. To bridge this gap, we propose a new task called
Video-Based Embryo Grading, the first paradigm that directly uses
full-length time-lapse monitoring (TLM) videos to predict embryologists'
overall quality assessments. To support this task, we curate a real-world
clinical dataset comprising over 2,500 TLM videos, each annotated with a
grading label indicating the overall quality of embryos. Grounded in clinical
decision-making principles, we propose a Complementary Spatial-Temporal Pattern
Mining (CoSTeM) framework that conceptually replicates embryologists'
evaluation process. CoSTeM comprises two branches: (1) a morphological
branch using a Mixture of Cross-Attentive Experts layer and a Temporal
Selection Block to select discriminative local structural features, and (2) a
morphokinetic branch employing a Temporal Transformer to model global
developmental trajectories, synergistically integrating static and dynamic
determinants for grading embryos. Extensive experimental results demonstrate
the superiority of our design. This work provides a valuable methodological
framework for AI-assisted embryo selection. The dataset and source code will be
made publicly available upon acceptance.
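
The two-branch idea described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. It assumes per-frame feature vectors have already been extracted, and it substitutes much simpler components for the ones named above: a hypothetical linear scoring vector with top-k frame selection stands in for the Temporal Selection Block, a single-head self-attention with identity projections stands in for the Temporal Transformer, and the Mixture of Cross-Attentive Experts layer is not modeled at all. The function names, the concatenation-based fusion, the random weights, and the choice of three grades are all assumptions made for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_vectors(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def select_top_k_frames(frames, score_weights, k):
    """Simplified stand-in for temporal selection (hypothetical):
    score each frame linearly and keep the k best, in temporal order."""
    scores = [dot(f, score_weights) for f in frames]
    top = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def self_attention(frames):
    """Single-head scaled dot-product self-attention with identity
    projections; a simplified stand-in for a temporal transformer."""
    dim = len(frames[0])
    scale = math.sqrt(dim)
    out = []
    for query in frames:
        weights = softmax([dot(query, key) / scale for key in frames])
        out.append([sum(w * v[d] for w, v in zip(weights, frames))
                    for d in range(dim)])
    return out


def grade_video(frames, score_weights, classifier, k=4):
    """Fuse a 'morphological' summary (selected frames) with a
    'morphokinetic' summary (all frames) and return grade probabilities."""
    kept = select_top_k_frames(frames, score_weights, k)
    morphological = mean_vectors([frames[i] for i in kept])
    morphokinetic = mean_vectors(self_attention(frames))
    fused = morphological + morphokinetic  # list concatenation, length 2 * dim
    logits = [dot(row, fused) for row in classifier]
    return softmax(logits), kept


# Toy data: 16 frames, 8-dim features, 3 grades (all values hypothetical).
random.seed(0)
T, D, NUM_GRADES = 16, 8, 3
frames = [[random.gauss(0, 1) for _ in range(D)] for _ in range(T)]
score_weights = [random.gauss(0, 1) for _ in range(D)]
classifier = [[random.gauss(0, 1) for _ in range(2 * D)]
              for _ in range(NUM_GRADES)]

probs, kept = grade_video(frames, score_weights, classifier)
print("selected frame indices:", kept)
print("grade probabilities:", [round(p, 3) for p in probs])
```

In this sketch the selected-frame branch sees only a few frames, while the attention branch summarizes the whole sequence, which mirrors the static versus dynamic split described above. In a trained model the scoring vector and classifier would be learned rather than sampled at random.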