AIMC Topic: Data Curation

Showing 31 to 40 of 142 articles

Generalisation Gap of Keyword Spotters in a Cross-Speaker Low-Resource Scenario.

Sensors (Basel, Switzerland)
Models for keyword spotting in continuous recordings can significantly improve the experience of navigating vast libraries of audio recordings. In this paper, we describe the development of such a keyword spotting system detecting regions of interest...

A localization strategy combined with transfer learning for image annotation.

PloS one
This study aims to solve the overfitting problem caused by insufficient labeled images in the automatic image annotation field. We propose a transfer learning model called CNN-2L that incorporates the label localization strategy described in this stu...

Whole-cell segmentation of tissue images with human-level performance using large-scale data annotation and deep learning.

Nature biotechnology
A principal challenge in the analysis of tissue imaging data is cell segmentation, the task of identifying the precise boundary of every cell in an image. To address this problem, we constructed TissueNet, a dataset for training segmentation models tha...

The value of human data annotation for machine learning based anomaly detection in environmental systems.

Water research
Anomaly detection is the process of identifying unexpected data samples in datasets. Automated anomaly detection is either performed using supervised machine learning models, which require a labelled dataset for their calibration, or unsupervised mod...

Harnessing clinical annotations to improve deep learning performance in prostate segmentation.

PloS one
PURPOSE: Developing large-scale datasets with research-quality annotations is challenging due to the high cost of refining clinically generated markup into high precision annotations. We evaluated the direct use of a large dataset with only clinicall...

Biomedical Knowledge Graphs Construction From Conditional Statements.

IEEE/ACM transactions on computational biology and bioinformatics
Conditions play an essential role in biomedical statements. However, existing biomedical knowledge graphs (BioKGs) only focus on factual knowledge, organized as a flat relational network of biomedical concepts. These BioKGs ignore the conditions of t...

Recurrent Neural Networks to Automatically Identify Rare Disease Epidemiologic Studies from PubMed.

AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science
Rare diseases affect between 25 and 30 million people in the United States, and understanding their epidemiology is critical to focusing research efforts. However, little is known about the prevalence of many rare diseases. Given a lack of automated ...

Ontology-driven weak supervision for clinical entity classification in electronic health records.

Nature communications
In the electronic health record, using clinical notes to identify entities such as disorders and their temporality (e.g. the order of an event relative to a time index) can inform many important analyses. However, creating training data for clinical ...

Pneumothorax detection in chest radiographs: optimizing artificial intelligence system for accuracy and confounding bias reduction using in-image annotations in algorithm training.

European radiology
OBJECTIVES: Diagnostic accuracy of artificial intelligence (AI) pneumothorax (PTX) detection in chest radiographs (CXR) is limited by the noisy annotation quality of public training data and confounding thoracic tubes (TT). We hypothesize that in-ima...

Classification aware neural topic model for COVID-19 disinformation categorisation.

PloS one
The explosion of disinformation accompanying the COVID-19 pandemic has overloaded fact-checkers and media worldwide, and brought a new major challenge to government responses worldwide. Not only is disinformation creating confusion about medical scie...