On the effectiveness of compact biomedical transformers.

Journal: Bioinformatics (Oxford, England)

Published Date: Mar 1, 2023

Abstract

MOTIVATION: Language models pre-trained on biomedical corpora, such as BioBERT, have recently shown promising results on downstream biomedical tasks. Many existing pre-trained models, however, are resource-intensive and computationally heavy owing to factors such as embedding size, hidden dimension and number of layers. The natural language processing community has developed numerous strategies to compress these models using techniques such as pruning, quantization and knowledge distillation, resulting in models that are considerably faster, smaller and consequently easier to use in practice. In the same spirit, in this article we introduce six lightweight models, namely BioDistilBERT, BioTinyBERT, BioMobileBERT, DistilBioBERT, TinyBioBERT and CompactBioBERT, which are obtained either by knowledge distillation from a biomedical teacher or by continual learning on the PubMed dataset. We evaluate all of our models on three biomedical tasks and compare them with BioBERT-v1.1, with the aim of creating efficient lightweight models that perform on par with their larger counterparts.

Authors

Omid Rohanian
Department of Engineering Science, University of Oxford, Oxford, UK.

Mohammadmahdi Nouriborji
NLPie Research, Oxford, UK.

Samaneh Kouchaki
Surrey Institute for People-Centred Artificial Intelligence, University of Surrey, Guildford GU2 7XH, Surrey, UK.

David A Clifton

Keywords

Datasets as Topic; Natural Language Processing; PubMed

External Resources

PubMed ID: 36825820
