BACKGROUND: Unstructured text in medical records, such as Electronic Health Records, contain an enormous amount of valuable information for research; however, it is difficult to extract and structure important information because of frequent typograp...
Pre-trained natural language processing models on a large natural language corpus can naturally transfer learned knowledge to protein domains by fine-tuning specific in-domain tasks. However, few studies focused on enriching such protein language mod...
The steady rise of online shopping goes hand in hand with the development of increasingly complex ML and NLP models. While most use cases are cast as specialized supervised learning problems, we argue that practitioners would greatly benefit from gen...
There are currently >1.3 million human -omics samples that are publicly available. This valuable resource remains acutely underused because discovering particular samples from this ever-growing data collection remains a significant challenge. The maj...
International journal of neural systems
Nov 4, 2022
"Fake news" refers to the deliberate dissemination of news with the purpose to deceive and mislead the public. This paper assesses the accuracy of several Machine Learning (ML) algorithms, using a style-based technique that relies on textual informat...
BACKGROUND: Biomedical named entity recognition (BioNER) is a basic and important task for biomedical text mining with the purpose of automatically recognizing and classifying biomedical entities. The performance of BioNER systems directly impacts do...
Computational intelligence and neuroscience
Oct 30, 2022
The Arabic syntactic diacritics restoration problem is often solved using long short-term memory (LSTM) networks. Handcrafted features are used to augment these LSTM networks or taggers to improve performance. A transformer-based machine learning tec...
The Internet of Things is a paradigm that interconnects several smart devices through the internet to provide ubiquitous services to users. This paradigm and Web 2.0 platforms generate countless amounts of textual data. Thus, a significant challenge ...
Product information (PI) is a vital part of any medicinal product approved for use within the European Union and consists of a summary of products characteristics (SmPC) for healthcare professionals and package leaflet (PL) for patients, together wit...
Neural networks : the official journal of the International Neural Network Society
Oct 19, 2022
Distant supervision (DS) can automatically generate annotated data for relation extraction (RE) with knowledge bases and corpora. The existing DS methods that train on bags selected by attention mechanism are susceptible to noisy bags and neglect use...