Efficient Semantic Similarity Computing with Optimized BERT Models.

Journal: Studies in Health Technology and Informatics

Abstract

By bridging diverse terminologies and enabling precise information retrieval, semantic similarity in medical language is key to improving healthcare outcomes. Semantic similarity measures how closely two pieces of text share the same meaning, a crucial element of Natural Language Processing (NLP) for understanding and interpreting data. While Large Language Models (LLMs) are known for their versatility and text-generation abilities, BERT (Bidirectional Encoder Representations from Transformers) excels at text analysis and at identifying semantic similarities. However, as AI models become more advanced, researchers face challenges related to model size, computational demands, and deployment constraints (e.g., energy, memory, and latency). To address these issues, model optimization techniques can drastically reduce memory usage and speed up inference. In this work, we leverage the open-source Microsoft Olive tool to find the best optimizations, then apply a dynamic quantization process. We evaluate our approach on the DEFT 2020 Text Mining Challenge, slightly improving performance metrics while achieving a 20x average speed-up and reducing memory usage by around 70%.
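The abstract does not include code, but as an illustration of the kind of post-training dynamic quantization it describes (an optimization that Microsoft Olive can orchestrate), the sketch below applies ONNX Runtime's `quantize_dynamic` to an exported BERT model. The file paths and model names are assumptions for illustration, not artifacts from the paper.

```python
# Minimal sketch: post-training dynamic quantization with ONNX Runtime.
# Paths and names below are illustrative assumptions, not the authors' actual files.
from onnxruntime.quantization import quantize_dynamic, QuantType

quantize_dynamic(
    model_input="bert_similarity.onnx",        # FP32 ONNX export of the BERT model (assumed path)
    model_output="bert_similarity_int8.onnx",  # INT8 dynamically quantized model written here
    weight_type=QuantType.QInt8,               # store weights as 8-bit integers
)
```

Dynamic quantization converts weights to INT8 ahead of time and quantizes activations on the fly at inference, so it needs no calibration dataset, which is one reason it is a common choice for shrinking and speeding up transformer encoders such as BERT.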

Authors

  • Natalia Grabar
    Université Lille 3, France.
  • Idriss Jairi
    Univ. Lille, UMR 9189 - CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, F-59000 Lille, France.
  • Hayfa Zgaya-Biau
    Univ. Lille, UMR 9189 - CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, F-59000 Lille, France.