AIMC Topic: Data Curation

Showing 71 to 80 of 142 articles

RysannMD: A biomedical semantic annotator balancing speed and accuracy.

Journal of biomedical informatics
Recently, both researchers and practitioners have explored the possibility of semantically annotating large and continuously evolving collections of biomedical texts such as research papers, medical reports, and physician notes in order to enable the...

Semi-supervised medical entity recognition: A study on Spanish and Swedish clinical corpora.

Journal of biomedical informatics
OBJECTIVE: The goal of this study is to investigate entity recognition within Electronic Health Records (EHRs) focusing on Spanish and Swedish. Of particular importance is a robust representation of the entities. In our case, we utilized unsupervised...

EHR-based phenotyping: Bulk learning and evaluation.

Journal of biomedical informatics
In data-driven phenotyping, a core computational task is to identify medical concepts and their variations from sources of electronic health records (EHR) to stratify phenotypic cohorts. A conventional analytic framework for phenotyping largely uses ...

Building a comprehensive syntactic and semantic corpus of Chinese clinical texts.

Journal of biomedical informatics
OBJECTIVE: To build a comprehensive corpus covering syntactic and semantic annotations of Chinese clinical texts with corresponding annotation guidelines and methods as well as to develop tools trained on the annotated corpus, which supplies baseline...

NegAIT: A new parser for medical text simplification using morphological, sentential and double negation.

Journal of biomedical informatics
Many different text features influence text readability and content comprehension. Negation is commonly suggested as one such feature, but few general-purpose tools exist to discover negation and studies of the impact of negation on text readability ...

Structuring Legacy Pathology Reports by openEHR Archetypes to Enable Semantic Querying.

Methods of information in medicine
BACKGROUND: Clinical information is often stored as free text, e.g. in discharge summaries or pathology reports. These documents are semi-structured using section headers, numbered lists, items and classification strings. However, it is still challen...

Automatic query generation using word embeddings for retrieving passages describing experimental methods.

Database : the journal of biological databases and curation
Information regarding the physical interactions among proteins is crucial, since protein-protein interactions (PPIs) are central for many biological processes. The experimental techniques used to verify PPIs are vital for characterizing and assessing...

OntoBrowser: a collaborative tool for curation of ontologies by subject matter experts.

Bioinformatics (Oxford, England)
UNLABELLED: The lack of controlled terminology and ontology usage leads to incomplete search results and poor interoperability between databases. One of the major underlying challenges of data integration is curating data to adhere to controlled term...

Training and evaluation corpora for the extraction of causal relationships encoded in biological expression language (BEL).

Database : the journal of biological databases and curation
Success in extracting biological relationships is mainly dependent on the complexity of the task as well as the availability of high-quality training data. Here, we describe the new corpora in the systems biology modeling language BEL for training an...

Crowdsourcing and curation: perspectives from biology and natural language processing.

Database : the journal of biological databases and curation
Crowdsourcing is increasingly utilized for performing tasks in both natural language processing and biocuration. Although there have been many applications of crowdsourcing in these fields, there have been fewer high-level discussions of the methodol...