AIMC Topic: Data Curation

Showing 91 to 100 of 147 articles

A Study of Neural Word Embeddings for Named Entity Recognition in Clinical Text.

AMIA ... Annual Symposium proceedings. AMIA Symposium
Clinical Named Entity Recognition (NER) is a critical task for extracting important patient information from clinical text to support clinical and translational research. This study explored the neural word embeddings derived from a large unlabeled c...

Finding Cervical Cancer Symptoms in Swedish Clinical Text using a Machine Learning Approach and NegEx.

AMIA ... Annual Symposium proceedings. AMIA Symposium
Detection of early symptoms in cervical cancer is crucial for early treatment and survival. To find symptoms of cervical cancer in clinical text, Named Entity Recognition is needed. In this paper the Clinical Entity Finder, a machine-learning tool tr...

Scaling Out and Evaluation of OBSecAn, an Automated Section Annotator for Semi-Structured Clinical Documents, on a Large VA Clinical Corpus.

AMIA ... Annual Symposium proceedings. AMIA Symposium
"Identifying and labeling" (annotating) sections improves the effectiveness of extracting information stored in the free text of clinical documents. OBSecAn, an automated ontology-based section annotator, was developed to identify and label sections ...

SORTA: a system for ontology-based re-coding and technical annotation of biomedical phenotype data.

Database : the journal of biological databases and curation
There is an urgent need to standardize the semantics of biomedical data values, such as phenotypes, to enable comparative and integrative analyses. However, it is unlikely that all studies will use the same data collection protocols. As a result, ret...

Using distant supervised learning to identify protein subcellular localizations from full-text scientific articles.

Journal of biomedical informatics
Databases of curated biomedical knowledge, such as the protein-locations reflected in the UniProtKB database, provide an accurate and useful resource to researchers and decision makers. Our goal is to augment the manual efforts currently used to cura...

Moving the mountain: analysis of the effort required to transform comparative anatomy into computable anatomy.

Database : the journal of biological databases and curation
The diverse phenotypes of living organisms have been described for centuries, and though they may be digitized, they are not readily available in a computable form. Using over 100 morphological studies, the Phenoscape project has demonstrated that by...

The Confidence Information Ontology: a step towards a standard for asserting confidence in annotations.

Database : the journal of biological databases and curation
Biocuration has become a cornerstone for analyses in biology, and to meet needs, the amount of annotations has considerably grown in recent years. However, the reliability of these annotations varies; it has thus become necessary to be able to assess...

Ontology application and use at the ENCODE DCC.

Database : the journal of biological databases and curation
The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a catalog of genomic annotations. To date, the project has generated over 4000 experiments across more than 350 cell lines and tissues using a wide array o...

Shared resources, shared costs--leveraging biocuration resources.

Database : the journal of biological databases and curation
The manual curation of the information in biomedical resources is an expensive task. This article argues the value of this approach in comparison with other apparently less costly options, such as automated annotation or text-mining, then discusses w...

mycoCLAP, the database for characterized lignocellulose-active proteins of fungal origin: resource and text mining curation support.

Database : the journal of biological databases and curation
Enzymes active on components of lignocellulosic biomass are used for industrial applications ranging from food processing to biofuels production. These include a diverse array of glycoside hydrolases, carbohydrate esterases, polysaccharide lyases and...