MOTIVATION: The explosive increase of biomedical literature has made information extraction an increasingly important tool for biomedical research. A fundamental task is the recognition of biomedical named entities in text (BNER) such as genes/protei...
MOTIVATION: Virus phylogeographers rely on DNA sequences of viruses and the locations of the infected hosts found in public sequence databases like GenBank for modeling virus spread. However, the locations in GenBank records are often only at the cou...
MOTIVATION: Most gene prioritization methods model each disease or phenotype individually, but this fails to capture patterns common to several diseases or phenotypes. To overcome this limitation, we formulate the gene prioritization task as the fact...
MOTIVATION: Computational gene prioritization can aid in disease gene identification. Here, we propose pBRIT (prioritization using Bayesian Ridge regression and Information Theoretic model), a novel adaptive and scalable prioritization tool, integrat...
SUMMARY: Secondary use of clinical data commonly involves annotating biomedical text with terminologies and ontologies. The National Center for Biomedical Ontology Annotator is a frequently used annotation service, originally designed for biomedical dat...
Journal of the American Medical Informatics Association : JAMIA
May 1, 2018
OBJECTIVE: Unlocking the data contained within both structured and unstructured components of electronic health records (EHRs) has the potential to provide a step change in data available for secondary research use, generation of actionable medical i...
This study reports proof-of-principle early detection of chemotherapeutic-associated skin adverse drug reactions from social health networks using a deep learning–based signal generation pipeline to capture how patients describe cutaneous eruptions.
Studies in health technology and informatics
Jan 1, 2018
Medical reports often contain a lot of relevant information in the form of free text. To reuse these unstructured texts for biomedical research, it is important to extract structured data from them. In this work, we adapted a previously developed inf...
Studies in health technology and informatics
Jan 1, 2018
We introduce 3000PA, a clinical document corpus composed of 3,000 EPRs from three different clinical sites, which will serve as the backbone of a national reference language resource for German clinical NLP. We outline its design principles, results ...
Journal of the American Medical Informatics Association : JAMIA
Jan 1, 2018
The gap between domain experts and natural language processing expertise is a barrier to extracting understanding from clinical text. We describe a prototype tool for interactive review and revision of natural language processing models of binary con...