Combining Ultrasound Imaging and Molecular Testing in a Multimodal Deep Learning Model for Risk Stratification of Indeterminate Thyroid Nodules.

Journal: Thyroid : official journal of the American Thyroid Association
Published Date:

Abstract

Indeterminate cytology (Bethesda III and IV) represents 15-30% of biopsied thyroid nodules and requires additional diagnostic testing. Molecular testing (MT) is a commonly used diagnostic tool that evaluates malignancy risk through next-generation sequencing of fine-needle aspiration (FNA) samples. While MT achieves high sensitivity (97-100%) in ruling out malignancy, its specificity and positive predictive value (PPV) remain relatively low. This study proposes a multimodal deep learning model that integrates ultrasound (US) imaging with MT to improve risk stratification by enhancing PPV while maintaining high sensitivity. Combining these modalities leverages complementary information from both molecular and imaging data, addressing limitations in current approaches and offering a robust framework for evaluating indeterminate nodules.
We retrospectively analyzed 333 patients with indeterminate thyroid nodules (259 benign, 74 malignant) at UCLA Medical Center between 2016 and 2022. We evaluated four configurations: whole-frame US images, 256 × 256 patches, 128 × 128 patches, and an ensemble model combining the first three configurations. The clinical baseline consisted of Bethesda cytology and MT results. Models were assessed using five-fold cross-validation stratified by surgical outcomes.
The clinical baseline (Bethesda + MT) achieved an AUROC of 0.728 [0.68, 0.78] with a sensitivity of 0.946 [0.88, 1.00], specificity of 0.664 [0.60, 0.73], and PPV of 0.448 [0.41, 0.48]. The proposed ensemble model demonstrated improved performance, achieving an AUROC of 0.831 [0.77, 0.89] with a sensitivity of 0.946 [0.88, 1.00], specificity of 0.703 [0.66, 0.75], and PPV of 0.477 [0.46, 0.50]. These improvements were statistically significant (p = 0.0008). Our multimodal model enhances MT performance by providing statistically significant improvements in PPV and specificity while maintaining high sensitivity.
Our framework could be leveraged to reduce the number of benign thyroid resections in patients with indeterminate nodules. However, this study is limited by its single-center dataset, lack of external validation, and the use of binarized MT outputs rather than granular malignancy risk probabilities. Future work should validate these findings across diverse populations and larger external datasets for more comprehensive risk stratification.
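
As a rough consistency check on the reported operating points, the sketch below back-computes PPV from the reported sensitivity, specificity, and class counts (74 malignant, 259 benign). The function names are illustrative and do not come from the study. Treating the cohort as a single pooled confusion matrix is an assumption; the reported values presumably reflect cross-validated estimates, so the two should agree only approximately.

```python
# Hypothetical helper names; pooled-cohort arithmetic is an assumption,
# not the study's evaluation procedure.

def expected_counts(sensitivity, specificity, n_malignant, n_benign):
    """Return expected (TP, FP) counts implied by the given rates."""
    true_positives = sensitivity * n_malignant
    false_positives = (1.0 - specificity) * n_benign
    return true_positives, false_positives


def ppv(sensitivity, specificity, n_malignant, n_benign):
    """Positive predictive value: TP / (TP + FP)."""
    tp, fp = expected_counts(sensitivity, specificity, n_malignant, n_benign)
    return tp / (tp + fp)


N_MALIGNANT, N_BENIGN = 74, 259  # cohort sizes stated in the abstract

# Operating points reported in the abstract.
baseline_ppv = ppv(0.946, 0.664, N_MALIGNANT, N_BENIGN)
ensemble_ppv = ppv(0.946, 0.703, N_MALIGNANT, N_BENIGN)

# Expected reduction in false positives (benign nodules flagged as
# suspicious) when moving from the baseline to the ensemble.
_, baseline_fp = expected_counts(0.946, 0.664, N_MALIGNANT, N_BENIGN)
_, ensemble_fp = expected_counts(0.946, 0.703, N_MALIGNANT, N_BENIGN)
false_positives_avoided = baseline_fp - ensemble_fp

print(f"baseline PPV ~ {baseline_ppv:.3f}")
print(f"ensemble PPV ~ {ensemble_ppv:.3f}")
print(f"false positives avoided ~ {false_positives_avoided:.1f}")
```

Under these assumptions the pooled PPVs fall close to the reported 0.448 and 0.477, and the specificity gain corresponds to roughly ten fewer false-positive calls in a cohort of this size, which is the mechanism by which benign resections could be reduced.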

Authors

  • Shreeram Athreya
    Department of Electrical and Computer Engineering, UCLA, Los Angeles, California, USA.
  • Andrew Melehy
    Department of Surgery, UCLA, Los Angeles, California, USA.
  • Sujit Silas Armstrong Suthahar
    Department of Bioengineering, UCLA, Los Angeles, California, USA.
  • Vedrana Ivezic
    Biomedical Artificial Intelligence Research Lab, UCLA Department of Bioengineering, Los Angeles, CA 90024, USA.
  • Ashwath Radhachandran
    Dascena, Inc., Houston, TX, United States.
  • Vivek R Sant
    Division of Endocrine Surgery, UT Southwestern Medical Center, Dallas, TX 75390, USA.
  • Chace Moleta
    Department of Pathology and Laboratory Medicine, UCLA, Los Angeles, California, USA.
  • Henry Zheng
    Department of Radiological Sciences, University of California Los Angeles, Los Angeles, California, United States.
  • Maitraya Patel
    Department of Radiological Sciences, UCLA, Los Angeles, California, USA.
  • Rinat Masamed
    Department of Radiology, University of California, Los Angeles, Los Angeles, CA 90095, USA.
  • Masha Livhits
    Department of Surgery, UCLA, Los Angeles, California, USA.
  • Michael Yeh
    Department of Surgery, UCLA, Los Angeles, California, USA.
  • Corey W Arnold
    Department of Bioengineering; University of California, Los Angeles, CA.
  • William Speier
    Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, USA.